tag:blogger.com,1999:blog-11444168183865530362023-11-15T07:31:20.213-08:00ARM assembly languageAssembly language programming on ARM microprocessors with examples of working code.Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.comBlogger28125tag:blogger.com,1999:blog-1144416818386553036.post-6422613777945277522020-10-12T04:22:00.001-07:002020-10-12T04:22:16.588-07:00I got distracted ...
I now have a Pinebook Pro and a Pine phone.
Both have an ARM64 CPU - I will write something about ARM64 assembler soon!Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com1tag:blogger.com,1999:blog-1144416818386553036.post-53967152796054608752014-03-07T12:22:00.000-08:002014-03-07T12:22:50.694-08:00Triangle numbers<p>The sequence of triangle numbers is generated by adding the natural numbers. So the 7<sup>th</sup> triangle number would be 1 + 2 + 3 + 4 + 5 + 6 + 7 = 28. The first ten terms would be:</p>
<p style="text-align:center;">1, 3, 6, 10, 15, 21, 28, 36, 45, 55, ...</p>
<p>Let us list the factors of the first seven triangle numbers:</p>
<blockquote style="font-family:courier new;"><b> 1</b>: 1<br/><b> 3</b>: 1,3<br/><b> 6</b>: 1,2,3,6<br/><b>10</b>: 1,2,5,10<br/><b>15</b>: 1,3,5,15<br/><b>21</b>: 1,3,7,21<br/><b>28</b>: 1,2,4,7,14,28</blockquote>
<p>We can see that 28 is the first triangle number to have over five divisors.</p>
<p>What is the value of the first triangle number to have over five hundred divisors?</p>
<pre>
.syntax unified
.align 2
.section .rodata
resstring:
.asciz "%d\n"
.equ last, 250 @ we count up to the square root - so half of the divisors will
@ be necessarily less than the square root and half above.
icount .req r4 @ it is possible to write FORTRAN in any language :-)
jcount .req r5
tmp .req r6
num .req r7
.text
.align 2
.global get_num_divisors
.type get_num_divisors, %function
@ we use jcount to hold current divisor and icount to hold the number of divisors seen
@ num is passed as r0 and the number of divisors is returned.
get_num_divisors:
stmfd sp!, {r4-r7, lr}
mov num, r0
mov icount, 0
mov jcount, 1
jstart:
mov r0, num
mov r1, jcount
bl divide
teq r1, 0
beq factor
b nextj
factor:
add icount, icount, 1
nextj:
add jcount, jcount, 1
mul tmp, jcount, jcount
cmp num, tmp
bgt jstart
mov r0, icount
ldmfd sp!, {r4-r7, pc}
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r7, lr}
mov num, 0
mov icount, 0
mov jcount, 1
loop:
cmp num, #last
bge printme
add icount, icount, jcount
mov r0, icount
bl get_num_divisors
mov num, r0
add jcount, jcount, 1
b loop
printme:
mov r1, icount
ldr r0, =resstring @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r7, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<pre>
<p><b>divide.s</b></p>
.syntax unified
# divide takes value in r0, divisor in r1 and returns dividend in r0 and modulus in r1
.global divide
.type divide, %function
divide:
stmfd sp!, {lr}
# see http://infocenter.arm.com/help/topic/com.arm.doc.ihi0043d/IHI0043D_rtabi.pdf
bl __aeabi_uidivmod
ldmfd sp!, {pc}
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-57106905896818530852014-03-07T11:47:00.002-08:002014-03-07T11:47:50.154-08:00Back on track...A month turns into 6 so easily. While away I have replaced my desktop machine with a CubieTruck - does all I need and runs Debian well. Also just got a udroid u3 for evaluation purposes and getting that to work. More code soon.Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-49420783033293750872013-08-22T09:52:00.001-07:002013-08-22T10:06:07.230-07:00Greatest product of adjacent numbers in a grid<p>A note to any readers : This will be the last post for a month.</p>
<p>In the 20<img src="images/symbol_times.gif" width="9" height="9" alt="×" border="0" style="vertical-align:middle;"/>20 grid below, four numbers along a diagonal line have been marked in red.</p>
<p style="font-family:courier new;text-align:center;font-size:10pt;">
08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08<br/>
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00<br/>
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65<br/>
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91<br/>
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80<br/>
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50<br/>
32 98 81 28 64 23 67 10 <span style="color:#ff0000;"><b>26</b></span> 38 40 67 59 54 70 66 18 38 64 70<br/>
67 26 20 68 02 62 12 20 95 <span style="color:#ff0000;"><b>63</b></span> 94 39 63 08 40 91 66 49 94 21<br/>
24 55 58 05 66 73 99 26 97 17 <span style="color:#ff0000;"><b>78</b></span> 78 96 83 14 88 34 89 63 72<br/>
21 36 23 09 75 00 76 44 20 45 35 <span style="color:#ff0000;"><b>14</b></span> 00 61 33 97 34 31 33 95<br/>
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92<br/>
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57<br/>
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58<br/>
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40<br/>
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66<br/>
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69<br/>
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36<br/>
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16<br/>
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54<br/>
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48<br/></p>
<p>The product of these numbers is 26 <img src="images/symbol_times.gif" width="9" height="9" alt="×" border="0" style="vertical-align:middle;"/> 63 <img src="images/symbol_times.gif" width="9" height="9" alt="×" border="0" style="vertical-align:middle;"/> 78 <img src="images/symbol_times.gif" width="9" height="9" alt="×" border="0" style="vertical-align:middle;"/> 14 = 1788696.</p>
<p>What is the greatest product of four adjacent numbers in the same direction (up, down, left, right, or diagonally) in the 20<img src="images/symbol_times.gif" width="9" height="9" alt="×" border="0" style="vertical-align:middle;"/>20 grid?</p>
<p>The solution to this problem is quite interesting because it demonstrates the use of a macro. A macro is a construct available within <em>as</em> which permits the same sort of abstraction as a procedure|subroutine|function would in a high-level language.</p>
<p>Inside the macro, the parameter <em>a</em> is accessed via <em>\a</em> and the construct <em>\@</em> expands out to a unique value each time the macro is invoked; this allows the use in labels which are distinct for each macro invocation.</p>
<pre class="example">
.syntax unified
.align 2
.section .rodata
buffer:
.byte 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
.byte 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
.byte 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
.byte 0, 0, 0, 8, 2, 22, 97, 38, 15, 0, 40, 0, 75, 4, 5, 7, 78, 52, 12, 50, 77, 91, 8, 0, 0, 0
.byte 0, 0, 0, 49, 49, 99, 40, 17, 81, 18, 57, 60, 87, 17, 40, 98, 43, 69, 48, 4, 56, 62, 0, 0, 0, 0
.byte 0, 0, 0, 81, 49, 31, 73, 55, 79, 14, 29, 93, 71, 40, 67, 53, 88, 30, 3, 49, 13, 36, 65, 0, 0, 0
.byte 0, 0, 0, 52, 70, 95, 23, 4, 60, 11, 42, 69, 24, 68, 56, 1, 32, 56, 71, 37, 2, 36, 91, 0, 0, 0
.byte 0, 0, 0, 22, 31, 16, 71, 51, 67, 63, 89, 41, 92, 36, 54, 22, 40, 40, 28, 66, 33, 13, 80, 0, 0, 0
.byte 0, 0, 0, 24, 47, 32, 60, 99, 3, 45, 2, 44, 75, 33, 53, 78, 36, 84, 20, 35, 17, 12, 50, 0, 0, 0
.byte 0, 0, 0, 32, 98, 81, 28, 64, 23, 67, 10, 26, 38, 40, 67, 59, 54, 70, 66, 18, 38, 64, 70, 0, 0, 0
.byte 0, 0, 0, 67, 26, 20, 68, 2, 62, 12, 20, 95, 63, 94, 39, 63, 8, 40, 91, 66, 49, 94, 21, 0, 0, 0
.byte 0, 0, 0, 24, 55, 58, 5, 66, 73, 99, 26, 97, 17, 78, 78, 96, 83, 14, 88, 34, 89, 63, 72, 0, 0, 0
.byte 0, 0, 0, 21, 36, 23, 9, 75, 0, 76, 44, 20, 45, 35, 14, 0, 61, 33, 97, 34, 31, 33, 95, 0, 0, 0
.byte 0, 0, 0, 78, 17, 53, 28, 22, 75, 31, 67, 15, 94, 3, 80, 4, 62, 16, 14, 9, 53, 56, 92, 0, 0, 0
.byte 0, 0, 0, 16, 39, 5, 42, 96, 35, 31, 47, 55, 58, 88, 24, 0, 17, 54, 24, 36, 29, 85, 57, 0, 0, 0
.byte 0, 0, 0, 86, 56, 0, 48, 35, 71, 89, 7, 5, 44, 44, 37, 44, 60, 21, 58, 51, 54, 17, 58, 0, 0, 0
.byte 0, 0, 0, 19, 80, 81, 68, 5, 94, 47, 69, 28, 73, 92, 13, 86, 52, 17, 77, 4, 89, 55, 40, 0, 0, 0
.byte 0, 0, 0, 4, 52, 8, 83, 97, 35, 99, 16, 7, 97, 57, 32, 16, 26, 26, 79, 33, 27, 98, 66, 0, 0, 0
.byte 0, 0, 0, 88, 36, 68, 87, 57, 62, 20, 72, 3, 46, 33, 67, 46, 55, 12, 32, 63, 93, 53, 69, 0, 0, 0
.byte 0, 0, 0, 4, 42, 16, 73, 38, 25, 39, 11, 24, 94, 72, 18, 8, 46, 29, 32, 40, 62, 76, 36, 0, 0, 0
.byte 0, 0, 0, 20, 69, 36, 41, 72, 30, 23, 88, 34, 62, 99, 69, 82, 67, 59, 85, 74, 4, 36, 16, 0, 0, 0
.byte 0, 0, 0, 20, 73, 35, 29, 78, 31, 90, 1, 74, 31, 49, 71, 48, 86, 81, 16, 23, 57, 5, 54, 0, 0, 0
.byte 0, 0, 0, 1, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 1, 89, 19, 67, 48, 0, 0, 0
.byte 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
.byte 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
.byte 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
resstring:
.asciz "%d\n"
.equ ocolumns, 26
.equ icolumns, 20
.equ colstart, 23
.equ border, 3
.equ offset, 2
.equ points, 4
.equ north, -ocolumns
.equ north_east, -ocolumns+1
.equ east, 1
.equ south_east, ocolumns+1
.equ south, ocolumns
.equ south_west, ocolumns - 1
.equ west, -1
.equ north_west, -ocolumns-1
.equ point_set, 1
tmp .req r2
pointv .req r3
data_ptr .req r4
icount .req r5
jcount .req r6
kcount .req r7
maxv .req r8
dpstore .req r9
.macro direction a
mov data_ptr, dpstore
mov kcount, points
mov pointv, point_set
kloop\@:
ldrb tmp, [data_ptr], \a
mul pointv, pointv, tmp
subs kcount, kcount, 1
bne kloop\@
cmp maxv, pointv
movlt maxv, pointv @ set maxv to max of maxv and pointv
.endm
.text
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r9, lr}
mov icount, icolumns
iloop:
mov jcount, icolumns
jloop:
mov tmp, ocolumns
add data_ptr, icount, offset
mul tmp, tmp, data_ptr
add tmp, tmp, jcount
add tmp, tmp, offset
ldr data_ptr, =buffer
add dpstore, data_ptr, tmp @ data[i][j]
direction north
direction north_east
direction east
direction south_east
direction south
direction south_west
direction west
direction north_west
subs jcount, jcount, 1
bne jloop
subs icount, icount, 1
bne iloop
printme:
mov r1, maxv
ldr r0, =resstring @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r9, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-50504469114261666992013-08-15T08:44:00.000-07:002013-08-16T02:41:03.111-07:00Summing prime numbers<p>The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.</p>
<p>Find the sum of all the primes below two million.</p>
<!--
<p class="info">Note: This problem has been changed recently, please check that you are using the right parameters.</p>
-->
<pre class="example">
.syntax unified
.equ word,4
.equ limit,2000000
.equ numprimes4,595732 @ sufficient to hold all the bytes in all the elements of the vector we use to hold the primes
sum_hi .req r4
sum_lo .req r5
numprimes .req r6
primes_ptr .req r7
number .req r8
limit .req r9
.section .bss
.lcomm primes_vector,numprimes4 @ allocate space for vector of primes
.section .rodata
llustring:
.asciz "%llu\n"
.text
.align 8
.global main
.type main, %function
main:
stmfd sp!, {r4-r9, fp, lr}
ldr primes_ptr, =primes_vector
mov numprimes, 1
mov number, 2
str number, [primes_ptr] @ store 2 to primes vector
ldr limit, =limit
mov sum_hi, 0 @ initialise sum
mov sum_lo, 2
mov number, 3 @ start with 3 as first odd prime
loop:
cmp number, limit
bge printme @ leave if we have exceeded limit
mov r0, number
ldr r1, =primes_vector
mov r2, numprimes
bl prime_vector
teq r0, 1 @ check for primality of number
bne nexti
str number, [primes_ptr, numprimes, lsl 2] @ if prime, add to vector
add numprimes, numprimes, 1 @ increment prime count
adds sum_lo, sum_lo, number @ and add to sum
adc sum_hi, sum_hi, 0
nexti:
add number, number, 2 @ increment number by 2
b loop
printme:
mov r2, sum_lo
mov r3, sum_hi
ldr r0, =llustring @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r9, fp, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<pre class="example">
<p><b>prime_vector.s</b></p>
.syntax unified
.equ word, 4
# this subroutine returns 1 if the passed number is prime; 0 if not
#
# inputs
# r0 - integer to test
# r1 - pointer to vector of prime integers smaller than r0
# r2 - length of vector passed in r1
#
# outputs
# r0 - prime boolean
number .req r4
vptr .req r5
tmp .req r6
squared .req r7
vsize .req r8
.global prime_vector
.type prime_vector, %function
.text
prime_vector:
stmfd sp!, {r4-r8, lr}
mov number, r0
mov vptr, r1
mov vsize, r2
nexti:
ldr tmp, [vptr], word @ get vector element to tmp
mul squared, tmp, tmp @ and square it
cmp squared, number @ compare the square to the input number
movgt r0, 1 @ store 1 to r0 if square is greater (i.e. number is primes)
bgt last @ and jump out
mov r0, number
mov r1, tmp
bl divide @ divide number by tmp
teq r1, 0 @ and check if it is a divisor
moveq r0, 0 @ if so, not prime, so set r0 to 0
beq last @ and leave
subs vsize, vsize, 1 @ decrement prime counter
bgt nexti @ and try next vector element
mov r0, 1
last:
ldmfd sp!, {r4-r8, pc}
</pre>
<pre class="example">
<p><b>divide.s</b></p>
.syntax unified
# divide takes value in r0, divisor in r1 and returns dividend in r0 and modulus in r1
.global divide
.type divide, %function
divide:
stmfd sp!, {lr}
# see http://infocenter.arm.com/help/topic/com.arm.doc.ihi0043d/IHI0043D_rtabi.pdf
bl __aeabi_uidivmod
ldmfd sp!, {pc}
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-87437524996811705632013-07-28T05:00:00.001-07:002013-07-28T05:07:06.101-07:00Generating processor-specific code from a source file<p>There are many ARM cores as described at <a href="http://en.wikipedia.org/wiki/Arm_architecture">Wikipedia</a>. The processors get more complex as they evolve and occasionally new instructions are introduced. One instance is the <em>rev</em> instruction which will <a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489i/Cihjgdid.html">reverse the bytes in a word</a> e.g. for converting big-endian numbers to little-endian and vice versa.</p>
<p>In my Sheeva Plug there is a version 5TE processor (which does not support this instruction); my Raspberry Pi has a version 6 processor (which does).</p>
<p>I would like to keep my code in one source file, and dynamically assemble the optimum code for the processor upon which it was built.</p>
<p>We can use the macro assembler of the GNU assembler to achieve this.</p>
<pre class="example">
.if ARCH >= 6
rev r0, r0
.else
and r2, r0, #0xff000000 @ load the top 2 bytes to r2
and r3, r0, #0x00ff0000 @ load the next 2 bytes to r3
and r4, r0, #0x0000ff00 @ load the next 2 bytes to r4
and r5, r0, #0x000000ff @ load the bottom 2 bytes to r5
mov r0, r2, lsr #24 @ shift r2 bytes to bottom and store to r0
orr r0, r3, lsr #8 @ or the remaining shifted data
orr r0, r4, lsl #8
orr r0, r5, lsl #24
.endif
</pre>
<p>We can interrogate the machine at build time by looking inside <em>/proc/cpuinfo</em> via</p>
<p><em>cat /proc/cpuinfo | grep ^Processor</em> which gives : </p>
<ul>
<li>On A SheevaPlug : <em>Processor : Feroceon 88FR131 rev 1 (v5l)</em></li>
<li>On A Raspberry Pi : <em>Processor : ARMv6-compatible processor rev 7 (v6l)</em></li>
</ul>
<p>And then we pull the appropriate number within the Makefile and then we can build via <em>make endian</em> using the following <em>Makefile</em></p>
<pre class="example">
AS := /usr/bin/as
LD := /usr/bin/ld
ASOPTS := -gstabs
ARCH = $(shell sed -ne 's/^Processor\s*:.*(v\([0-9]\)l)/\1/p' /proc/cpuinfo)
endian: endian.s
echo ARCH is $(ARCH)
$(AS) $(ASOPTS) --defsym ARCH=$(ARCH) -march=armv$(ARCH) -o $@.o $^
$(LD) -o $@ $@.o
</pre>
Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-58047840655384459282013-07-23T12:57:00.004-07:002013-07-23T13:41:14.752-07:00Pythagorean triplets<div class="problem_content" role="problem">
<p>A Pythagorean triplet is a set of three natural numbers, <var>a</var> <img src="images/symbol_lt.gif" width="10" height="10" alt="<" border="0" style="vertical-align:middle;"/><var>b</var> <img src="images/symbol_lt.gif" width="10" height="10" alt="<" border="0" style="vertical-align:middle;"/><var>c</var>, for which,</p>
<div style="text-align:center;"> <var>a</var><sup>2</sup> + <var>b</var><sup>2</sup> = <var>c</var><sup>2</sup></div>
<p>For example, 3<sup>2</sup> + 4<sup>2</sup> = 9 + 16 = 25 = 5<sup>2</sup>.</p>
<p>There exists exactly one Pythagorean triplet for which <var>a</var> + <var>b</var> + <var>c</var> = 1000.<br/>Find the product <var>abc</var>.</p></div>
<pre class="example">
.syntax unified
.equ limit,1000
.align 4
icount .req r4 @ use of i, j and k as loop indices is historic :-)
jcount .req r5 @ we use icount for a, jcount for b and kcount for c
kcount .req r6
tmp .req r8
jksum .req r9
.section .rodata
resstring:
.asciz "%d\n"
.text
.global main
.type main, %function
main:
stmfd sp!, {r4-r9, lr}
ldr icount, =limit
istart:
subs jcount, icount, 1 @ i > j so start j at i-1
beq nexti
jstart:
subs kcount, jcount, 1 @ j > k too
beq nextj
add tmp, icount, jcount
kstart:
add tmp, tmp, kcount
cmp tmp, #limit
bne nextk
mul jksum, kcount, kcount @ we re-use the register later as jksum (this is ksquared here)
@ we could also alias ksquared to r9 - but then we have to avoid
@ collision with jksum
mul tmp, jcount, jcount @ could alias jsquared to r8 (or just re-use tmp) as here
add jksum, jksum, tmp
mul tmp, icount, icount @ and also could alias isquared 9but choose to reuse tmp)
cmp jksum, tmp @ this is the important comparison - if i*i == (j*j) + (k*k)
beq printme @ then leave since there is only one solution
nextk:
add tmp, icount, jcount
subs kcount, kcount, 1
bne kstart
nextj:
subs jcount, jcount, 1
bne jstart
nexti:
subs icount, icount, 1
bne istart
printme:
mul tmp, icount, jcount
mul tmp, tmp, kcount
mov r1, tmp
ldr r0, =resstring
bl printf
mov r0, 0
ldmfd sp!, {r4-r9, pc}
mov r7, 1
swi 0
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-45360308385393348602013-07-14T04:12:00.001-07:002013-07-14T04:17:37.628-07:00Find product of digits<p>Find the greatest product of five consecutive digits in the 1000-digit number.</p>
<p style="font-family:courier new;font-size:10pt;text-align:center;">
73167176531330624919225119674426574742355349194934<br/>
96983520312774506326239578318016984801869478851843<br/>
85861560789112949495459501737958331952853208805511<br/>
12540698747158523863050715693290963295227443043557<br/>
66896648950445244523161731856403098711121722383113<br/>
62229893423380308135336276614282806444486645238749<br/>
30358907296290491560440772390713810515859307960866<br/>
70172427121883998797908792274921901699720888093776<br/>
65727333001053367881220235421809751254540594752243<br/>
52584907711670556013604839586446706324415722155397<br/>
53697817977846174064955149290862569321978468622482<br/>
83972241375657056057490261407972968652414535100474<br/>
82166370484403199890008895243450658541227588666881<br/>
16427171479924442928230863465674813919123162824586<br/>
17866458359124566529476545682848912883142607690042<br/>
24219022671055626321111109370544217506941658960408<br/>
07198403850962455444362981230987879927244284909188<br/>
84580156166097919133875499200524063689912560717606<br/>
05886116467109405077541002256983155200055935729725<br/>
71636269561882670428252483600823257530420752963450<br/></p>
<pre class="example">
.syntax unified
.equ limit,10000
.equ outer, 996
.equ inner, 5
.align 4
address .req r4
thisbyte .req r5
icounter .req r6
ocounter .req r7
maxv .req r8
tmp .req r9
.section .data
buffer:
.byte 7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4, 9, 1, 9, 2, 2
.byte 5, 1, 1, 9, 6, 7, 4, 4, 2, 6, 5, 7, 4, 7, 4, 2, 3, 5, 5, 3, 4, 9, 1, 9
.byte 4, 9, 3, 4, 9, 6, 9, 8, 3, 5, 2, 0, 3, 1, 2, 7, 7, 4, 5, 0, 6, 3, 2, 6
.byte 2, 3, 9, 5, 7, 8, 3, 1, 8, 0, 1, 6, 9, 8, 4, 8, 0, 1, 8, 6, 9, 4, 7, 8
.byte 8, 5, 1, 8, 4, 3, 8, 5, 8, 6, 1, 5, 6, 0, 7, 8, 9, 1, 1, 2, 9, 4, 9, 4
.byte 9, 5, 4, 5, 9, 5, 0, 1, 7, 3, 7, 9, 5, 8, 3, 3, 1, 9, 5, 2, 8, 5, 3, 2
.byte 0, 8, 8, 0, 5, 5, 1, 1, 1, 2, 5, 4, 0, 6, 9, 8, 7, 4, 7, 1, 5, 8, 5, 2
.byte 3, 8, 6, 3, 0, 5, 0, 7, 1, 5, 6, 9, 3, 2, 9, 0, 9, 6, 3, 2, 9, 5, 2, 2
.byte 7, 4, 4, 3, 0, 4, 3, 5, 5, 7, 6, 6, 8, 9, 6, 6, 4, 8, 9, 5, 0, 4, 4, 5
.byte 2, 4, 4, 5, 2, 3, 1, 6, 1, 7, 3, 1, 8, 5, 6, 4, 0, 3, 0, 9, 8, 7, 1, 1
.byte 1, 2, 1, 7, 2, 2, 3, 8, 3, 1, 1, 3, 6, 2, 2, 2, 9, 8, 9, 3, 4, 2, 3, 3
.byte 8, 0, 3, 0, 8, 1, 3, 5, 3, 3, 6, 2, 7, 6, 6, 1, 4, 2, 8, 2, 8, 0, 6, 4
.byte 4, 4, 4, 8, 6, 6, 4, 5, 2, 3, 8, 7, 4, 9, 3, 0, 3, 5, 8, 9, 0, 7, 2, 9
.byte 6, 2, 9, 0, 4, 9, 1, 5, 6, 0, 4, 4, 0, 7, 7, 2, 3, 9, 0, 7, 1, 3, 8, 1
.byte 0, 5, 1, 5, 8, 5, 9, 3, 0, 7, 9, 6, 0, 8, 6, 6, 7, 0, 1, 7, 2, 4, 2, 7
.byte 1, 2, 1, 8, 8, 3, 9, 9, 8, 7, 9, 7, 9, 0, 8, 7, 9, 2, 2, 7, 4, 9, 2, 1
.byte 9, 0, 1, 6, 9, 9, 7, 2, 0, 8, 8, 8, 0, 9, 3, 7, 7, 6, 6, 5, 7, 2, 7, 3
.byte 3, 3, 0, 0, 1, 0, 5, 3, 3, 6, 7, 8, 8, 1, 2, 2, 0, 2, 3, 5, 4, 2, 1, 8
.byte 0, 9, 7, 5, 1, 2, 5, 4, 5, 4, 0, 5, 9, 4, 7, 5, 2, 2, 4, 3, 5, 2, 5, 8
.byte 4, 9, 0, 7, 7, 1, 1, 6, 7, 0, 5, 5, 6, 0, 1, 3, 6, 0, 4, 8, 3, 9, 5, 8
.byte 6, 4, 4, 6, 7, 0, 6, 3, 2, 4, 4, 1, 5, 7, 2, 2, 1, 5, 5, 3, 9, 7, 5, 3
.byte 6, 9, 7, 8, 1, 7, 9, 7, 7, 8, 4, 6, 1, 7, 4, 0, 6, 4, 9, 5, 5, 1, 4, 9
.byte 2, 9, 0, 8, 6, 2, 5, 6, 9, 3, 2, 1, 9, 7, 8, 4, 6, 8, 6, 2, 2, 4, 8, 2
.byte 8, 3, 9, 7, 2, 2, 4, 1, 3, 7, 5, 6, 5, 7, 0, 5, 6, 0, 5, 7, 4, 9, 0, 2
.byte 6, 1, 4, 0, 7, 9, 7, 2, 9, 6, 8, 6, 5, 2, 4, 1, 4, 5, 3, 5, 1, 0, 0, 4
.byte 7, 4, 8, 2, 1, 6, 6, 3, 7, 0, 4, 8, 4, 4, 0, 3, 1, 9, 9, 8, 9, 0, 0, 0
.byte 8, 8, 9, 5, 2, 4, 3, 4, 5, 0, 6, 5, 8, 5, 4, 1, 2, 2, 7, 5, 8, 8, 6, 6
.byte 6, 8, 8, 1, 1, 6, 4, 2, 7, 1, 7, 1, 4, 7, 9, 9, 2, 4, 4, 4, 2, 9, 2, 8
.byte 2, 3, 0, 8, 6, 3, 4, 6, 5, 6, 7, 4, 8, 1, 3, 9, 1, 9, 1, 2, 3, 1, 6, 2
.byte 8, 2, 4, 5, 8, 6, 1, 7, 8, 6, 6, 4, 5, 8, 3, 5, 9, 1, 2, 4, 5, 6, 6, 5
.byte 2, 9, 4, 7, 6, 5, 4, 5, 6, 8, 2, 8, 4, 8, 9, 1, 2, 8, 8, 3, 1, 4, 2, 6
.byte 0, 7, 6, 9, 0, 0, 4, 2, 2, 4, 2, 1, 9, 0, 2, 2, 6, 7, 1, 0, 5, 5, 6, 2
.byte 6, 3, 2, 1, 1, 1, 1, 1, 0, 9, 3, 7, 0, 5, 4, 4, 2, 1, 7, 5, 0, 6, 9, 4
.byte 1, 6, 5, 8, 9, 6, 0, 4, 0, 8, 0, 7, 1, 9, 8, 4, 0, 3, 8, 5, 0, 9, 6, 2
.byte 4, 5, 5, 4, 4, 4, 3, 6, 2, 9, 8, 1, 2, 3, 0, 9, 8, 7, 8, 7, 9, 9, 2, 7
.byte 2, 4, 4, 2, 8, 4, 9, 0, 9, 1, 8, 8, 8, 4, 5, 8, 0, 1, 5, 6, 1, 6, 6, 0
.byte 9, 7, 9, 1, 9, 1, 3, 3, 8, 7, 5, 4, 9, 9, 2, 0, 0, 5, 2, 4, 0, 6, 3, 6
.byte 8, 9, 9, 1, 2, 5, 6, 0, 7, 1, 7, 6, 0, 6, 0, 5, 8, 8, 6, 1, 1, 6, 4, 6
.byte 7, 1, 0, 9, 4, 0, 5, 0, 7, 7, 5, 4, 1, 0, 0, 2, 2, 5, 6, 9, 8, 3, 1, 5
.byte 5, 2, 0, 0, 0, 5, 5, 9, 3, 5, 7, 2, 9, 7, 2, 5, 7, 1, 6, 3, 6, 2, 6, 9
.byte 5, 6, 1, 8, 8, 2, 6, 7, 0, 4, 2, 8, 2, 5, 2, 4, 8, 3, 6, 0, 0, 8, 2, 3
.byte 2, 5, 7, 5, 3, 0, 4, 2, 0, 7, 5, 2, 9, 6, 3, 4, 5, 0
.section .rodata
.align 2
resstring:
.asciz "%d\n"
.text
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r9, lr}
ldr address, =buffer
ldr ocounter, =outer
outer_start:
ldr icounter, =inner
mov tmp, #1
inner_start:
ldrb thisbyte, [address], 1
mul tmp, tmp, thisbyte
subs icounter, icounter, 1
bne inner_start
cmp maxv, tmp
movlt maxv, tmp @ overwrite maxv with tmp if tmp is bigger
sub address, address, 4
subs ocounter, ocounter, 1
bne outer_start
printme:
mov r1, maxv
ldr r0, =resstring @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r9, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-46681063302781440512013-07-11T12:08:00.000-07:002013-07-11T13:19:45.196-07:00Finding prime numbers<p>By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.</p>
<p>What is the 10 001st prime number?</p>
<pre class="example">
.syntax unified
.equ limit,10000
.equ limit4,40000
.align 4
number .req r4
count .req r5
numprimes .req r6
primes_ptr .req r7
.section .bss
.lcomm primes_vector,limit4 # sufficient for 10000 32-bit ints
.section .rodata
.align 2
resstring:
.asciz "%d\n"
.text
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r7, lr}
ldr primes_ptr, =primes_vector
mov numprimes, 1
mov number, 2
str number, [primes_ptr]
ldr count, =limit
mov number, 3 @ 2 is the first prime
loop:
mov r0, number
ldr r1, =primes_vector
mov r2, numprimes
bl prime_vector
teq r0, 1
bne nexti
str number, [primes_ptr, numprimes, lsl 2]
add numprimes, numprimes, 1
subs count, count, 1
beq printme
nexti:
add number, number, 2
b loop
printme:
mov r1, number
ldr r0, =resstring @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r7, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre><pre class="example">
<p><b>prime_vector.s</b></p>
.syntax unified
.equ word, 4
# this subroutine returns 1 if the passed number is prime; 0 if not
#
# inputs
# r0 - integer to test
# r1 - pointer to vector of prime integers smaller than r0
# r2 - length of vector passed in r1
#
# outputs
# r0 - prime boolean
number .req r4
vptr .req r5
tmp .req r6
squared .req r7
vsize .req r8
.global prime_vector
.type prime_vector, %function
.text
.align 2
prime_vector:
stmfd sp!, {r4-r8, lr}
mov number, r0
mov vptr, r1
mov vsize, r2
nexti:
ldr tmp, [vptr], word
mul squared, tmp, tmp
cmp squared, number @ check if vector element squared is greater than test number
movgt r0, 1 @ if so test numer is prime so leave
bgt last
mov r0, number
mov r1, tmp
bl divide @ divide number by vector element - if no remainder
teq r1, 0 @ it is not prime so leave
moveq r0, 0
beq last
subs vsize, vsize, 1 @ check next element
bgt nexti
mov r0, 0
last:
ldmfd sp!, {r4-r8, pc}
</pre><pre class="example">
<p><b>divide.s
</b></p>
.syntax unified
# divide takes value in r0, divisor in r1 and returns dividend in r0 and modulus in r1
.global divide
.type divide, %function
divide:
stmfd sp!, {lr}
# see http://infocenter.arm.com/help/topic/com.arm.doc.ihi0043d/IHI0043D_rtabi.pdf
bl __aeabi_uidivmod
ldmfd sp!, {pc}
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-8262791833427025632013-07-08T12:35:00.003-07:002013-07-08T12:36:57.672-07:00Sum squares and square sums<p>The sum of the squares of the first ten natural numbers is,</p>
<div style="text-align:center">1<sup>2</sup> + 2<sup>2</sup> + ... + 10<sup>2</sup> = 385</div>
<p>The square of the sum of the first ten natural numbers is,</p>
<div style="text-align:center">(1 + 2 + ... + 10)<sup>2</sup> = 55<sup>2</sup> = 3025</div>
<p>Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 - 385 = 2640.</p>
<p>Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.</p>
<pre class="example">
.syntax unified
.equ limit,100
number .req r4
sumsq .req r5
sqsum .req r6
tmp .req r7
.section .rodata
.align 2
string:
.asciz "%d\n"
.text
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r7, lr}
mov sqsum, 0
mov sumsq, 0
ldr number, =limit
loop:
mul tmp, number, number
add sqsum, sqsum, tmp
# decrement number and loop or exit
subs number, number, 1
beq end_loop
b loop
end_loop:
ldr number, =limit
add number, number, 1
ldr sumsq, =limit
mul sumsq, sumsq, number
lsr sumsq, sumsq, 1
mul sumsq, sumsq, sumsq
last:
sub tmp, sumsq, sqsum
mov r1, tmp
ldr r0, =string @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r7, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-32997267396621506892013-07-02T11:35:00.002-07:002013-07-02T11:40:41.918-07:00Divisibility<p>2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.</p>
<p>What is the smallest positive number that is <dfn title="divisible with no remainder">evenly divisible</dfn> by all of the numbers from 1 to 20?</p>
<pre class="example">
.syntax unified
.equ limit,20
.align 4
@ algorithm
@ initialise try_products to 1
@ foreach number > 1 and <= limit
@ test if it is prime
@ if try_products is set, then multiply the number by itself
@ while it does not exceed limit, then multiply the total by
@ this product. if the number squared exceeds the limit, then
@ set try_product to 0.
@ if try_products is 0 and the number is prime, then multiply
@ the total by number.
try_product .req r4
number .req r5
last .req r6
total .req r7
tmp .req r8
.section .rodata
.align 2
resstring:
.asciz "%d\n"
.text
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r8, lr}
mov total, 1
mov try_product, 1
mov number, 2
loop:
mov r0, number
bl isprime20
cmp r0, 1
bne nexti
cmp try_product, 1
bne no_product
mul tmp, number, number
cmp tmp, limit
ble prod_start
mov try_product, 0
b no_product
prod_start:
mov tmp, number
mov last, tmp
prod_loop:
cmp tmp, limit
bgt last_mul
mov last, tmp
mul tmp, tmp, number
b prod_loop
last_mul:
mul total, total, last
no_product:
cmp try_product, 0
bne nexti
mul total, total, number
nexti:
cmp number, limit
beq printme
add number, number, 1
b loop
printme:
mov r1, total
ldr r0, =resstring @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r8, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
# this subroutine returns 1 if the passed number (<= 20) is prime; 0 if not
#
# inputs
# r0 - integer to test
#
# outputs
# r0 - prime boolean
.global isprime20
.type isprime20, %function
.text
.align 2
isprime20:
stmfd sp!, {lr}
mov r1, r0
ands r2, r1, 1
bne odd
mov r0, 0
cmp r1, 2 @ 2 is the only prime even r1
bne last
mov r0, 1
b last
odd:
mov r0, 1
cmp r1, 9 @ 9 and 15 are the only non-prime
bne test15 @ odd numbers less than 20
mov r0, 0
b last
test15:
cmp r1, 15
bne last
mov r0, 0
last:
ldmfd sp!, {pc}
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-59234556292772676792013-07-01T11:49:00.002-07:002013-07-02T02:28:58.264-07:00Palindromic numbers<p>A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 <img src="images/symbol_times.gif" width="9" height="9" alt="×" border="0" style="vertical-align:middle;"/> 99.</p>
<p>Find the largest palindrome made from the product of two 3-digit numbers.</p>
<pre>
.syntax unified
.equ max3,999
.equ min3,100
.equ maxdigits,6
i .req r4
j .req r5
product .req r6
maxp .req r7
mini .req r8
minj .req r9
maxj .req r10
.section .rodata
.align 2
sumstring:
.asciz "%d\n"
.text
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r10, lr}
ldr i, =max3
ldr mini, =min3
ldr maxj, =max3
ldr minj, =min3
iloop:
mov j, maxj
jloop:
mul product, i, j
mov r0, product
bl is_palindromic
cmp r0, #1
bne next
cmp product, maxp
ble next
mov maxp, product
mov r0, product
bl divide_by_10 @ divides r0 by 10
bl divide_by_10 @ so 3 consecutive calls
bl divide_by_10 @ will divide by 1000
mov minj, r0
mov minj, r0
next:
subs j, j, 1
cmp j, minj
bgt jloop
subs i, i, 1
mov maxj, i
cmp i, mini
bgt iloop
last:
mov r1, maxp
ldr r0, =sumstring @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r10, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<p>A division of the found product by 1000 to reset the minima is an optimisation to speed termination.</p>
<p>We divide by 10 three times in succession - we could write a routine to divide by 1000, but one to divide by 10 is much more useful.</p>
<pre class="example">
<p><b>ispalindromic.s</b></p>
.syntax unified
.equ datum_size, 1
.equ digits, 6
# this subroutine returns 1 if the passed 6-digit number is palindromic; 0 if not
# the number is a product of 2 3-digit numbers so we assume the product has 6 digits
#
# inputs
# r0 - integer to test
#
# outputs
# r0 - palindromic boolean
#
# local
#
left .req r4
right .req r5
counter .req r6
buffer_address .req r7
running .req r8
tmp .req r9
#
.section .bss
.lcomm buffer, 6 @ use digits here?
#
.global is_palindromic
.type is_palindromic, %function
.global get_digits
.type get_digits, %function
.text
.align 2
#
is_palindromic:
stmfd sp!, {r4-r9, lr}
bl get_digits
mov counter, 3 @ digits >> 1
ldr buffer_address, =buffer
ip_last: @ the next few lines are the nub of the problem
sub left, buffer_address, counter @ on the first pass we compare x[0] and x[5]
ldrb tmp, [left, 3] @ if they are equal we compare x[1] and x[4]
add right, buffer_address, counter @ if they are equal we compare x[2] and x[3]
ldrb running, [right, 2]
teq tmp, running
bne no @ mismatch between tested digits
subs counter, counter, 1
bgt ip_last @ test next pair for equality
mov r0, 1
b last
no:
mov r0, 0
last:
ldmfd sp!, {r4-r9, pc}
#
get_digits: @ this routine writes a number into a string
stmfd sp!, {r7-r8, lr}
ldr running, =digits
ldr buffer_address, =buffer
gd_loop:
subs running, running, 1
bl divide_by_10_remainder
strb r1, [buffer_address], #datum_size
bgt gd_loop
gd_last:
ldmfd sp!, {r7-r8, pc}
</pre>
<pre class="example">
<p><b>divide_by_10.s</b></p>
.syntax unified
# this subroutine divides the passed number by 10 and
# returns the dividend and remainder
#
# The const -0x33333333 is 0xccccccd (2s complement)
# 0xcccccccc is 12/15th (0.8) of 0xffffffff and we use this as
# a multiplier, then shift right by 3 bits (divide by 8) to
# effect a multiplication by 0.1
#
# We multiply this number by 10 (multiply by 4, add 1 then multiply by 2)
# and subtract from the original number to give the remainder on division
# by 10.
#
# inputs
# r0 - integer to divide
#
# outputs
# r0 - the dividend
# r1 - the remainder
.equ const,-0x33333333
#.equ const,0xcccccccd
.text
.align 2
.global divide_by_10_remainder
.type divide_by_10_remainder, %function
divide_by_10_remainder:
stmfd sp!, {lr}
cmp r0, 10
blt rsmall
ldr r1, =const
umull r2, r3, r1, r0
mov r2, r3, lsr #3 @ r2 = r3 / 8 == r0 / 10
mov r3, r2 @ r3 = r2
mov r3, r3, asl #2 @ r3 = 4 * r3
add r3, r3, r2 @ r3 = r3 + r2
mov r3, r3, asl #1 @ r3 = 2 * r3
rsb r3, r3, r0 @ r3 = r0 - r3 = r0 - 10*int(r0/10)
mov r1, r3 @ the remainder
mov r0, r2 @ the dividend
b rlast
rsmall:
mov r1, r0
mov r0, 0
rlast:
ldmfd sp!, {pc}
# this subroutine divides the passed number by 10
# returns the dividend
#
# The const -0x33333333 is 0xccccccd (2s complement)
# 0xcccccccc is 12/15th (0.8) of 0xffffffff and we use this as
# a multiplier, then shift right by 3 bits (divide by 8) to
# effect a multiplication by 0.1
#
# We multiply this number by 10 (multiply by 4, add 1 then multiply by 2)
#
# inputs
# r0 - integer to divide
#
# outputs
# r0 - the dividend
.align 2
.global divide_by_10
.type divide_by_10, %function
divide_by_10:
stmfd sp!, {lr}
cmp r0, 10
blt small
ldr r1, =const
umull r2, r3, r1, r0
mov r2, r3, lsr #3 @ r2 = r3 / 8 == r0 / 10
mov r0, r2 @ the dividend
b last
small:
mov r0, 0
last:
ldmfd sp!, {pc}
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-67103237563128822602013-06-30T06:31:00.002-07:002013-07-10T04:54:57.221-07:00Prime factors<p>The prime factors of 13195 are 5, 7, 13 and 29.</p>
<p>What is the largest prime factor of the number 600851475143 ?</p>
<pre class="example">
.syntax unified
@ 600851475143 is 8BE589EAC7 in hex
.equ numhi,139 @ 0x8b
.equ numlo,3851020999 @ 0xe589eac7
num_hi .req r4
num_lo .req r5
maxdiv .req r6
n .req r7
.section .rodata
.align 2
resstring:
.asciz "%d\n"
.text
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r7, lr}
mov maxdiv, 0
mov n, 1
ldr num_lo, =numlo
ldr num_hi, =numhi
loop:
add n, n, 2 @ start at 3 and increment by 2
loop1:
mov r0, num_lo
mov r1, num_hi
mov r2, n
mov r3, 0
bl long_divide
teq r2, 0
bne loop
teq r3, 0
bne loop
mov num_lo, r0 @ only get here when we have no remainder
mov num_hi, r1 @ save the divisor as the new number
mov r0, n
bl isprime
teq r0, 1
bne loop @ increment n if n is non-prime
cmp maxdiv, n
movlt maxdiv, n @ save n as the largest divisor if it is larger
teq num_lo, 1 @ we know it has prime factors
beq printme
b loop1
printme:
mov r1, maxdiv
ldr r0, =resstring @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r7, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<p>The ARM is a RISC microprocessor and does not provide a divide command
in the instruction set. We use a division routine from <em>libgcc</em>.</p>
<p>Using the <em>-S</em> option to <em>gcc</em> when a c file is compiled
will generate an assembly language file which can be read for enlightenment.</p>
<pre class="example">
<p><b>isprime.s</b></p>
.syntax unified
# this subroutine returns 1 if the passed number is prime; 0 if not
# inputs
# r0 - integer to test
# outputs
# r0 - prime boolean
number .req r4
divisor .req r5
tmp .req r6
.global isprime
.type isprime, %function
.text
.align 2
isprime:
stmfd sp!, {r4-r6, lr}
mov number, r0
ands tmp, number, 1
bne odd
mov r0, 0
cmp number, 2 @ 2 is the only prime even number
bne last
mov r0, 1
b last
odd:
mov divisor, 3
cmp number, 8
bgt big
mov r0, 1
cmp number, 1 @ 1 is the only odd number < 8 not prime
bne last
mov r0, 0
b last
big:
mov r0, number
mov r1, divisor
bl divide
teq r1, 0
beq factor
add divisor, divisor, 2
mul tmp, divisor, divisor
subs tmp, tmp, number
ble big
mov r0, 1
b last
factor:
mov r0, 0
last:
ldmfd sp!, {r4-r6, pc}
</pre>
<p>The isprime function uses a naive, yet straightforward, algorithm to
determine whether a given number is prime.</p>
<pre class="example">
<p><b>divide.s</b></p>
.syntax unified
# divide takes value in r0, divisor in r1 and returns dividend in r0 and modulus
in r1
.global divide
.type divide, %function
divide:
stmfd sp!, {lr}
# see http://infocenter.arm.com/help/topic/com.arm.doc.ihi0043d/IHI0043D_rtabi.pdf
bl __aeabi_uidivmod
ldmfd sp!, {pc}
</pre>
<p>Our divide and long_divide routines are wrappers around the <em>libgcc</em>
functions to allow comments.</p>
<pre class="example">
<p><b>long_divide.s
</b></p>
.syntax unified
# long_divide takes low numerator in r0, high numerator in r1, low denominator in r2 and high denominator in r3
# returns low quotient in r0, high quotient in r1, low remainder in r2 and high remainder in r3
.global long_divide
.type long_divide, %function
long_divide:
stmfd sp!, {lr}
# see https://codereview.chromium.org/5302007/diff/12001/arch/arm/lib/_uldivmod.S
bl __aeabi_uldivmod
ldmfd sp!, {pc}
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-29327528674604098532013-06-28T00:50:00.002-07:002013-06-28T02:31:54.822-07:00The Fibonacci sequence<p>Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:</p>
<p style="text-align:center;">1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...</p>
<p>By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.</p>
<pre>
.syntax unified
.equ maxfib,4000000
previous .req r4
current .req r5
next .req r6
sum .req r7
max .req r8
tmp .req r9
.section .rodata
.align 2
fibstring:
.asciz "fib is %d\n"
sumstring:
.asciz "%d\n"
.text
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r9, lr}
ldr max, =maxfib
mov previous, 1
mov current, 1
mov sum, 0
loop:
cmp current, max
bgt last
add next, current, previous
movs tmp, current, lsr 1 @ set carry flag from lsr - for the odd-valued terms
@ we discard the result of the movs and are only interested
@ in the side effect of the lsr which pushes the lower bit
@ of current (1 for odd; 0 for even) into the carry flag
@ movs will update the status register (c.f. mov which will not)
addcc sum, sum, current @ we add current to the sum ONLY when cc (carry clear) is true
@ these are even-valued fibonacci terms
mov previous, current
mov current, next
b loop
last:
mov r1, sum
ldr r0, =sumstring @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r9, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-40436900671125749922013-06-25T14:54:00.001-07:002013-06-25T15:00:45.447-07:00Find the sum of all the multiples of 3 or 5<p>If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.</p>
<p>Find the sum of all the multiples of 3 or 5 below 1000.</p>
<pre class="example">
.syntax unified
.equ max3start,999
.equ max5start,995
number .req r4
matched .req r5
sum .req r6
max3 .req r7
max5 .req r8
.section .rodata
.align 2
string:
.asciz "%d\n"
.text
.align 2
.global main
.type main, %function
main:
stmfd sp!, {r4-r8, lr}
ldr max5, =max5start
ldr max3, =max3start
ldr number, =max3start @ start at 1000 - 1 ; numbers < 1000
mov matched, 0
mov sum, 0
loop:
cmp number, max3
bne test5
# matched a multiple of 3 - decrement max3
# add to sum and set matched to 1
mov matched, 1
add sum, sum, number
subs max3, max3, 3
test5:
cmp number, max5
bne last
# matched a multiple of 5 - decrement max5
# add to sum and set matched to 1
subs max5, max5, 5
cmp matched, 1 @ have we already added it?
addne sum, sum, number @ if not add it to the total
last:
# decrement number and reset matched and loop
mov matched, 0
subs number, number, 1
bne loop
mov r1, sum
ldr r0, =string @ store address of start of string to r0
bl printf
mov r0, 0
ldmfd sp!, {r4-r8, pc}
mov r7, 1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-90040614443326937402013-06-23T07:23:00.000-07:002013-06-24T11:58:07.952-07:00Armstrong numbers<p>Armstrong (or <a href="http://en.wikipedia.org/wiki/Narcissistic_number">Narcissistic</a>) numbers are a special set of integers.</p>
<p>An n-digit number is an Armstrong number if the sum of the n-th power of the digits is equal to the original number. For example, 371 is an Armstrong number because the sum of 3 cubed, 7 cubed and 1 cubed is equal to 371.</p>
<p>Here we compute all the Armstrong numbers of 2, 3 and 4 digits.</p>
<ul>
<li>Power function</li>
<pre class="example">
# this subroutine returns the passed digit to the passed power
#
# inputs
# r0 - digit
# r1 - power
#
# outputs
# r0 - digit ** power
#
# locals
# r4
.globl _power
.align 2
.text
_power:
nop
stmfd sp!, {r4, lr} @ save variables to stack
subs r1, r1, #1 @ leave unless power > 1
ble _power_end
mov r4, r0 @ copy digit
_power_loop_start:
mul r0, r4, r0 @ raise to next power
subs r1, r1, #1
beq _power_end @ leave when done
b _power_loop_start @ next iteration
_power_end:
ldmfd sp!, {r4, pc} @ restore state from stack and leave subroutime
</pre>
<li>Armstrong function</li>
<pre class="example">
# inputs
# r0 - number
#
# outputs
# r0 - armstrong number
#
# local r4, r5, r6, r7, r8
.equ ten,10
.equ hundred,100
.equ thousand,1000
.equ ten_thousand,10000
number .req r4
width .req r5
digit .req r6
current .req r7
armstrong .req r8
.globl _armstrong
.align 2
.text
_armstrong:
nop
stmfd sp!, {r4, r5, r6, r7, r8, lr} @ save variables to stack
mov number, r0 @ copy passed parameter to working number
cmp number, #ten @ exit unless number > 10
blt _end
ldr current, =ten_thousand @ exit unless number < 10000
cmp number, current
bge _end
mov width, #0 @ initialise
mov digit, #0
mov armstrong, #0
ldr current, =thousand @ handle 1000 digit
_thousand_start:
cmp number, current
blt _thousand_end @ exit thousand code if none left
mov width, #4 @ width must be 4
add current, current, #thousand @ bump thousand counter
add digit, digit, #1 @ and corresponding digit count
b _thousand_start @ and loop
_thousand_end:
add number, number, #thousand @ need number modulo thousand
sub number, number, current
mov r0, digit @ push digit
mov r1, width @ and width
bl _power @ to compute digit **width
add armstrong, r0, armstrong @ and update armstrong number with this value
ldr current, =hundred @ then we do the hundreds as we did the thousands
mov digit, #0
_hundred_start:
cmp number, current
blt _hundred_end
teq width, #0 @ and only set width if it is currently unset
moveq width, #3
_hundred_width:
add current, current, #hundred @ yada yada as thousands above
add digit, digit, #1
b _hundred_start
_hundred_end:
add number, number, #hundred
sub number, number, current
mov r0, digit
mov r1, width
bl _power
add armstrong, r0, armstrong
ldr current, =ten @ then the tens as the hundred and thousands above
mov digit, #0
_ten_start:
cmp number, current
blt _ten_end
teq width, #0
moveq width, #2
_ten_width:
add current, current, #ten
add digit, digit, #1
b _ten_start
_ten_end:
add number, number, #ten
sub number, number, current
mov r0, digit
mov r1, width
bl _power
add armstrong, r0, armstrong
mov r0, number @ then add in the trailing digits
mov r1, width
bl _power
add armstrong, r0, armstrong
mov r0, armstrong @ and copy the armstrong number back to r0 for return
_end:
ldmfd sp!, {r4, r5, r6, r7, r8, pc} @ restore state from stack and leave subroutine
</pre>
<p>We improve the legibility of the code by e.g. <em>number .req r4</em>. This textual substitution does increase the semantic value of the code considerably.</p>
<li>Armstrong function with a macro to abstract repeated code</li>
<p>N.B. This is functionally equivalent but much shorter than the previous function. The variable \@ here is a magic variable, incremented each time the macro is instantiated. This enables the use of distinct labels, which we need here.</p>
<pre class="example">
# inputs
# r0 - number
#
# outputs
# r0 - armstrong number
#
# local r4, r5, r6, r7, r8
.equ ten,10
.equ hundred,100
.equ thousand,1000
.equ ten_thousand,10000
number .req r4
width .req r5
digit .req r6
current .req r7
armstrong .req r8
.macro armstrong_digit a, b
ldr current, =\a
mov digit, #0
_start\@:
cmp number, current
blt _end\@
teq width, #0 @ and only set width if it is currently unset
moveq width, #\b
add current, current, #\a
add digit, digit, #1
b _start\@
_end\@:
add number, number, #\a
sub number, number, current
mov r0, digit
mov r1, width
bl _power
add armstrong, r0, armstrong
.endm
.globl _armstrong
.align 2
.text
_armstrong:
nop
stmfd sp!, {r4, r5, r6, r7, r8, lr} @ save variables to stack
mov number, r0 @ copy passed parameter to working number
cmp number, #ten @ exit unless number > 10
blt _end
ldr current, =ten_thousand @ exit unless number < 10000
cmp number, current
bge _end
mov width, #0 @ initialise
mov armstrong, #0
armstrong_digit thousand 4
armstrong_digit hundred 3
armstrong_digit ten 2
mov r0, number @ then add in the trailing digits
mov r1, width
bl _power
add armstrong, r0, armstrong
mov r0, armstrong @ and copy the armstrong number back to r0 for return
_end:
ldmfd sp!, {r4, r5, r6, r7, r8, pc} @ restore state from stack and leave subroutine
</pre>
<p>Here we introduce the use of the macro assembly functions of <em>as</em> Again this change leads to improved code - we eliminate duplication thus improving maintainability and legibility.</p>
<li>Armstrong_main function</li>
<pre class="example">
.equ ten,10
.equ ten_thousand,10000
.section .rodata
.align 2
string:
.asciz "armstrong number of %d is %d\n"
.text
.align 2
.global main
.type main, %function
main:
ldr r5, =ten
ldr r6, =ten_thousand
mov r4, r5 @ start with n = 10
_main_loop:
cmp r4, r6 @ leave if n = 10_000
beq _main_end
mov r0, r4 @ call the _armstrong function
bl _armstrong
teq r0, r4 @ if the armstong value = n print it
bne _main_next @ else skip
mov r2, r0
mov r1, r4
ldr r0, =string @ store address of start of string to r0
bl printf @ call the c function to display information
_main_next:
add r4, r4, #1
b _main_loop
_main_end:
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<li>Result</li>
<pre>
bob@poland:~/src/problems_for_computer_solution/07_armstrong_numbers$ ./armstrong4macro
armstrong number of 153 is 153
armstrong number of 370 is 370
armstrong number of 371 is 371
armstrong number of 407 is 407
armstrong number of 1634 is 1634
armstrong number of 8208 is 8208
armstrong number of 9474 is 9474
bob@poland:~/src/problems_for_computer_solution/07_armstrong_numbers$
</pre>
</ul>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-89844438232114777692013-06-22T11:52:00.001-07:002013-06-23T06:49:32.167-07:00Use of the floating point unit<p>This demonstrate the simple use of the floating-point unit and printing of floating point vaues - works on the Raspberry Pi</p>
<pre class="example">
.section .rodata
string:
.asciz "pi (%f) times e (%f) is %f\n"
string2:
.asciz "float (%f)\n"
.text
.global main
.type main, %function
main:
stmfd sp!, {fp, lr} @ save the frame pointer (r11) and link register (r14) to stack
sub sp, sp, #16 @ allocate space on stack for 16 bytes (stack grows downwards)
ldr r3, const_pi @ put the address of const_pi in r3
str r3, [r1] @ store this address (in r3) to the address pointed to by r1
flds s14, [r1] @ load a 32 bit word (s14) from the address pointed to by r1 (const_pi)
fcvtds d5, s14 @ convert the single precision (s14) to double precision (d5)
ldr r3, const_e @ same comments as const_pi
str r3, [r2]
flds s15, [r2]
fcvtds d6, s15 @ convert the single precision (s14) to double precision (d5)
fmuls s15, s14, s15 @ multiply s14 (pi) by s15 (e) and store in r15
fcvtds d7, s15 @ convert the result in s15 to a double (d7)
fstd d6, [sp] @ push the double d6 (const_e) on to the stack
fstd d7, [sp, #8] @ push the double d7 (const_pi * const_e) on to the stack
ldr r0, =string @ put the address of the first string in r0
fmrrd r2, r3, d5 @ fmrrd transfers the contents of the low half of d5 into r2,
@ and the contents of the high half of d5 into r3.
bl printf @ we can then print 3 double precision numbers
ldr r0, =string2
fmrrd r2, r3, d5
bl printf @ print const_pi
ldr r0, =string2
fmrrd r2, r3, d6
bl printf @ print const_e
ldr r0, =string2
fmrrd r2, r3, d7
bl printf @ print the product of pi * e
add sp, sp, #16 @ restore the stack pointer (discard the 2 double precision locals)
ldmfd sp!, {fp, pc} @ restore the frame pointer and link register from the stack
mov r0, #0
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
.align 2
const_pi:
.float 3.1415926
const_e:
.float 2.718281
</pre>
<pre>
bob@sweden:~/src/asm$ gcc -gstabs -o fp_mul fp_mul.s
bob@sweden:~/src/asm$ ./fp_mul
pi (3.141593) times e (2.718281) is 8.539731
float (3.141593)
float (2.718281)
float (8.539731)
bob@sweden:~/src/asm$
</pre>
<p><em>printf</em> requires that the first double precision number is stored in the 2 integer registers, r2 and r3, and any subsequent double precision numbers are pushed on to the stack. This program demonstrates this usage.</p>
<p>Using <em>gdb</em> can show more details</p>
<pre>
(gdb) b main
Breakpoint 1 at 0x83d0: file fp_mul.s, line 11.
(gdb) run
Starting program: /home/bob/src/asm/fp_mul
Breakpoint 1, main () at fp_mul.s:11
11 sub sp, sp, #16
(gdb) i r
r0 0x1 1
r1 0xbefffd24 3204447524
r2 0xbefffd2c 3204447532
r3 0x83cc 33740
r4 0x0 0
r5 0x0 0
r6 0x8320 33568
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0xb6fff000 3070226432
r11 0x0 0
r12 0xb6fb5000 3069923328
sp 0xbefffbd0 0xbefffbd0
lr 0xb6ea181c -1226172388
pc 0x83d0 0x83d0 <main+4>
cpsr 0x60000010 1610612752
(gdb) s
13 ldr r3, const_pi
(gdb) i r r3
r3 0x83cc 33740
(gdb) s
14 str r3, [r1]
(gdb) i r r3
r3 0x40490fda 1078530010
(gdb) i r r1
r1 0xbefffd24 3204447524
(gdb) s
15 flds s14, [r1]
(gdb) s
16 fcvtds d5, s14
(gdb) i r s14
s14 3.1415925 (raw 0x40490fda)
(gdb) x/4x const_pi
0x8448 <const_pi>: 0xda 0x0f 0x49 0x40
(gdb) i r r1
r1 0xbefffd24 3204447524
(gdb) i r r3
r3 0x40490fda 1078530010
(gdb) x/4x 0xbefffd24
0xbefffd24: 0xda 0x0f 0x49 0x40
(gdb) s
18 ldr r3, const_e
(gdb)
19 str r3, [r2]
(gdb)
20 flds s15, [r2]
(gdb)
21 fcvtds d6, s15
(gdb)
23 fmuls s15, s14, s15
(gdb) i r s14
s14 3.1415925 (raw 0x40490fda)
(gdb) i r s15
s15 2.71828103 (raw 0x402df851)
(gdb) s
24 fcvtds d7, s15
(gdb) i r s15
s15 8.53973103 (raw 0x4108a2bd)
(gdb) s
26 fstd d6, [sp]
(gdb) i r d7
d7 8.5397310256958008 (raw 0x40211457a0000000)
(gdb) s
27 fstd d7, [sp, #8]
(gdb) s
28 ldr r0, =string
(gdb) s
29 fmrrd r2, r3, d5
(gdb) s
30 bl printf
(gdb) i r r2
r2 0x40000000 1073741824
(gdb) i r r3
r3 0x400921fb 1074340347
(gdb) i r d5
d5 3.1415925025939941 (raw 0x400921fb40000000)
(gdb)
</pre>
<p>I do think that the code may be more robust if we were to define some storage space for the pointers <em>r1</em> and <em>r2</em> - if these were null when we enter the routine we would dereference null pointers which is never good. The registers, <em>r0</em>-<em>r3</em> are specified by as temporary registers by the calling convention so we are at the mercy of the contents of these registers from the previous call?</p>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-28907872219737978932013-06-22T09:45:00.001-07:002013-06-24T11:56:29.693-07:00A bubble sort function<p>A bubble sort is the simplest sorting algorithm. It compares each vector element with it's successor and swaps them if they are out of order. It is inefficient, but sufficient for small vectors.</p>
<pre class="example">
# this subroutine bubble sorts vectors of words
#
# inputs
# r0 - start of vector
# r1 - number of elements to sort
#
# no outputs
#
# locals
# r4 - current pointer
# r5 - inner counter
# r6 - keep_going flag
# r7 - first element
# r8 - second element
.equ datum_size,4 @ works with 4 byte (32 bit) words
.globl _bubble
.text
_bubble:
stmfd sp!, {r4, r5, r6, r7, r8, lr} @ save variables to stack
cmp r1, #1 @ number of elements must be > 1
ble end_outer @ stop if nothing to do
sub r5, r1, #1 @ need n-1 comparisons
mov r4, r0 @ initialise current pointer
mov r6, #0 @ this register set when we swap
loop_start:
ldr r7, [r4], #datum_size @ load one element
ldr r8, [r4] @ and next one
cmp r7, r8 @ compare them
ble no_swap @ branch if second greater
mov r6, #1 @ set keep_going flag
sub r4, r4, #datum_size @ reset pointer to first element
swp r8, r8, [r4] @ exchange value in r8 and address in r4
str r8, [r4, #datum_size]! @ store new r8 to incremented address
no_swap:
subs r5, r5, #1 @ decrement counter
bne loop_start @ and restart loop if more needed
end_inner:
cmp r6, #0 @ check keep_going flag
beq end_outer @ and leave if not set
mov r6, #0 @ clear keep_going flag
mov r4, r0 @ reset pointer
sub r5, r1, #1 @ reset counter
b loop_start @ start another iteration
end_outer:
ldmfd sp!, {r4, r5, r6, r7, r8, pc} @ restore state from stack and leave subroutime
</pre>
<p>A bubble sort test harness to use the subroutine</p>
<pre class="example">
@ An example of moving data from memory to a register
.globl main
.equ items,16 @ maybe would use evalues - values >> 2 if it weren't a demo?
.section .rodata
before:
.asciz "Before sorting, values are : %d"
after:
.asciz "After sorting, values are : %d"
comma:
.asciz ", %d"
new_line:
.asciz "\n"
ok:
.asciz "ok\n"
.section .data
values:
.word 105, -7, 235, 61, 28, 315, 456, 63, 134, 97, 221, 53, 1000899, 145, 117, 5
evalues:
.text
main:
ldr r0, =values @ the start of the vector
mov r1, #items @ how many elements to print
ldr r2, =before @ the start string
ldr r3, =comma @ string for subsequent elements
bl _vprintw @ print the unsorted vector
ldr r0, =new_line
bl printf @ print the newline
ldr r0, =values @ the start of the vector
mov r1, #items @ how many elements to sort
bl _bubble @ call the sort routine
ldr r0, =values
mov r1, #items
ldr r2, =after
ldr r3, =comma
bl _vprintw @ print the sorted vector
ldr r0, =new_line
bl printf
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<pre class="system">
bob@poland:~/www/examples$ make bubble
/usr/bin/gcc -gstabs -o bubble bubble.s bubble_sub.s vprintw.s
bob@poland:~/www/examples$ ./bubble
Before sorting, values are : 105, -7, 235, 61, 28, 315, 456, 63, 134, 97, 221, 53, 1000899, 145, 117, 5
After sorting, values are : -7, 5, 28, 53, 61, 63, 97, 105, 117, 134, 145, 221, 235, 315, 456, 1000899
bob@poland:~/www/examples$
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com3tag:blogger.com,1999:blog-1144416818386553036.post-80601993050261229412013-06-22T04:56:00.000-07:002013-06-22T09:26:00.669-07:00Improving the output format<li>This subroutine to display a vector of words using different strings for first and subsequent elements</li>
<pre class="example">
# this subroutine prints vectors of words
#
# inputs
# r0 - start of vector
# r1 - number of elements to print
# r2 - pointer to start of string used to print first element
# r3 - pointer to start of string used to print subsequent elements
#
# no outputs
#
.globl _vprintw
.equ datum_size,4 @ working with 4 byte (32 bit) words
_vprintw:
stmfd sp!, {r4, r5, r6, r7, lr} @ save registers on the stack
cmp r1, #0 @ exit if no elements
ble last
mov r4, r0 @ copy the parameters to locals
mov r5, r1
mov r6, r2
mov r7, r3
ldr r1, [r4], #datum_size @ load first vector element to r0 and bump pointer
mov r0, r6 @ address of first string to r0
bl printf @ and print it
nop
subs r5, r5, #1 @ decrement counter
beq last @ and fall out if zero
vprintw_loop:
ldr r1, [r4], #datum_size @ load next vector item to r0 and bump pointer
mov r0, r7 @ address of subsequent string to r0
bl printf @ and print it
subs r5, r5, #1 @ decrement counter
bne vprintw_loop @ and loop if non-zero
last:
ldmfd sp!, {r4, r5, r6, r7, pc} @ restore registers from stack and return
</pre>
<p>A simple test harness to use the subroutine is given by</p>
<pre class="example">
.globl main
.section .rodata
first:
.asciz "Vector of words - values : %d"
subsequent:
.asciz ", %d"
final:
.asciz "\n"
values:
.word 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60
endvalues:
.align 2
.text
main:
ldr r0, =values
mov r1, #11
ldr r2, =first
ldr r3, =subsequent
bl _vprintw @ print the vector elements
ldr r0, =final
bl printf @ print the terminating newline
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<pre class="system">
bob@poland:~/www/examples$ make test_vprintw
/usr/bin/gcc -gstabs -o test_vprintw test_vprintw.s vprintw.s
bob@poland:~/www/examples$ ./test_vprintw
Vector of words - values : 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60
bob@poland:~/www/examples$
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-88972976077607572012013-06-22T04:39:00.001-07:002013-06-24T12:00:04.208-07:00Structured Programs<p>A subroutine to print a vector of bytes is:</p>
<pre class="example">
# this subroutine prints vectors of bytes
#
# inputs
# r0 - start of vector
# r1 - number of elements to print
# r2 - pointer to start of string used to print each element
#
# no outputs
#
.globl _vprintb
.equ datum_size,1 @ 1 because we are using bytes
_vprintb:
stmfd sp!, {r4, r5, r6, lr} @ stash the registers we use
mov r4, r0 @ r4 is the pointer to the data
mov r5, r1 @ r5 is the counter
mov r6, r2 @ r6 is the pointer to the string
vprintb_loop:
ldrb r1, [r4], #datum_size @ put the vector element in r1
@ implicitly increment the pointer r4 by datum_size
mov r0, r6
bl printf @ print the string containing the vector element
subs r5, r5, #1 @ decrement the counter
bne vprintb_loop @ loop if the counter is not zero
ldmfd sp!, {r4, r5, r6, pc} @ restore the stashed registers
</pre>
<p>A simple test harness to use the subroutine is given by</p>
<pre class="example">
.equ datum_size,1
.globl main
.section .rodata
output:
.asciz "The value is %d\n"
values:
.byte 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60
endvalues:
.text
main:
ldr r0, =values @ the start of the vector
mov r1, #11 @ the number of elements
ldr r2, =output @ the output string
bl _vprintb @ call the subroutine
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<pre class="system">
bob@poland:~/www/examples$ make test_vprintb
/usr/bin/gcc -gstabs -o test_vprintb test_vprintb.s vprintb.s
bob@poland:~/www/examples$ ./test_vprintb
The value is 10
The value is 15
The value is 20
The value is 25
The value is 30
The value is 35
The value is 40
The value is 45
The value is 50
The value is 55
The value is 60
bob@poland:~/www/examples$
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-34170065559682263212013-06-21T14:55:00.000-07:002013-06-24T12:01:22.401-07:00Converting endianness<p>ARM v6 has a <em>rev</em> instruction to convert big- and little-endian values. A sheevaplug uses ARM v5TE so we have to do the inversion explicitly</p>
<pre class="example">
# swaptest.s – Converting big-endian to little-endian and vice-versa
# In Arm V6 we can use the rev instruction - but not supported on v5TE
.globl _start
.section .data
vstart:
.word 0x12345678
.text
.align 2
_start:
ldr r1, =vstart
ldr r0, [r1] @ load word to r0
and r2, r0, #0xff000000 @ load the top 2 bytes to r2
and r3, r0, #0x00ff0000 @ load the next 2 bytes to r3
and r4, r0, #0x0000ff00 @ load the next 2 bytes to r4
and r5, r0, #0x000000ff @ load the bottom 2 bytes to r5
mov r0, r2, lsr #24 @ shift r2 bytes to bottom and store to r0
orr r0, r3, lsr #8 @ or the remaining shifted data
orr r0, r4, lsl #8
orr r0, r5, lsl #24
_stop:
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<pre class="system">
bob@poland:~/www/examples$ grep architecture /proc/cpuinfo
CPU architecture: 5TE
bob@poland:~/www/examples$
bob@poland:~/www/examples$ gdb endian
GNU gdb (GDB) 7.0.1-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/bob/www/examples/endian...done.
(gdb) b _start
Breakpoint 1 at 0x8078: file endian.s, line 10.
(gdb) b _stop
Breakpoint 2 at 0x80a0: file endian.s, line 21.
(gdb) run
Starting program: /home/bob/www/examples/endian
Breakpoint 1, _start () at endian.s:11
11 ldr r0, [r1] @ load word to r0
Current language: auto
The current source language is "auto; currently asm".
(gdb) x/x &vstart
0x100a8 <vstart>: 0x12345678
(gdb) cont
Continuing.
Breakpoint 2, _stop () at endian.s:22
22 swi 0 @ then invoke the syscall from linux
(gdb) p/x $r0
$2 = 0x78563412
(gdb) quit
A debugging session is active.
Inferior 1 [process 17053] will be killed.
Quit anyway? (y or n) y
bob@poland:~/www/examples$
</pre>
<p>So we see the inverted data in <tt>r0</tt> when the program stops.</p>
Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-2106770477119396332013-06-21T10:07:00.000-07:002013-06-24T12:02:53.860-07:00Storing data to an element of a vector<p>We write a constant to the second element of the vector and then read and return it.</p>
<pre class="example">
# movtest4.s – An example of indirect addressing
.equ datum_size,1
.globl _start
.align 2
.section .data
values:
.byte 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60
.text
.align 2
_start:
ldr r4, =values
mov r5, #100
strb r5, [r4, #datum_size]! @ store 100 to the second element of values
@ we update the pointer r4 via the trailing !
ldrb r0, [r4] @ load that value to r0 for return to OS
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<p>We write a constant to the second element of the vector and then read and return it.</p>
<pre class="system">
bob@poland:~/www/examples$ make movtest4
/usr/bin/as -gstabs -o movtest4.o movtest4.s
/usr/bin/ld -o movtest4 movtest4.o
bob@poland:~/www/examples$ ./movtest4
bob@poland:~/www/examples$ echo $?
100
bob@poland:~/www/examples$
</pre>
<p>Again, it can be instructive to walk through the code with <em>gdb</em>.</p>
<pre>
(gdb) b _start
Breakpoint 1 at 0x8078: file movtest4.s, line 12.
(gdb) run
Starting program: /home/bob/src/professional_assembly_language/arm/chap05/movtest4
Breakpoint 1, _start () at movtest4.s:12
12 mov r5, #100
(gdb) i r r5
r5 0x0 0
(gdb) x/11b &values
0x10090 <values>: 10 15 20 25 30 35 40 45
0x10098 <values+8>: 50 55 60
(gdb) s
13 strb r5, [r4, #datum_size]! @ write back the incremented data pointer
(gdb) i r r5
r5 0x64 100
(gdb) x/11b &values
0x10090 <values>: 10 15 20 25 30 35 40 45
0x10098 <values+8>: 50 55 60
(gdb) s
14 ldrb r0, [r4] @ load that value to r0 for return to OS
(gdb) x/11b &values
0x10090 <values>: 10 100 20 25 30 35 40 45
0x10098 <values+8>: 50 55 60
(gdb) i r r0
r0 0x0 0
(gdb) s
15 mov r7, #1 @ set r7 to 1 - the syscall for exit
(gdb) i r r0
r0 0x64 100
(gdb)
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-74368194385796217502013-06-21T09:49:00.000-07:002013-06-24T12:04:05.439-07:00Using indexed memory locations<pre class="example">
.equ datum_size,1
.globl main
.align 2
.section .rodata
output:
.asciz "The value is %d\n"
values:
.byte 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60
endvalues:
.align 2
.text
main:
stmfd sp!, {r5, r6, lr} @ save the registers we use to the stack
ldr r5, =endvalues
ldr r6, =values
loop:
ldrb r1, [r6], #datum_size @ r6 will be implicitly incremented so r1 will contain each element in turn
ldr r0, =output @ the address of the string is put in r0
bl printf @ we print the value of r1 in the string pointed to by r0
cmp r6, r5 @ have we reached the endvalues label?
bne loop @ if not, read and print the next element
ldmfd sp!, {r4, r5, r6, pc} @ restore registers before exit
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<p>Here the <em>.equ</em> directive is introduced which aliases a value to a name.</p>
<p>We do so because we can easily change from using bytes for storage of the value to using half-words (of 2 bytes) or words (of 4 bytes) in the data section and modification of the <em>ldrb</em> instruction to <em>ldrh</em> or <em>ldr</em> respectively.</p>
<pre class="system">
bob@poland:~/www/examples$ make movtest3
/usr/bin/gcc -gstabs -o movtest3 movtest3.s
bob@poland:~/www/examples$ ./movtest3
The value is 10
The value is 15
The value is 20
The value is 25
The value is 30
The value is 35
The value is 40
The value is 45
The value is 50
The value is 55
The value is 60
bob@poland:~/www/examples$
</pre>
<p>Again we use <em>gcc</em> rather than <em>as</em> to assemble since we use the C function, <em>printf</em>.</p>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-87952066863930917332013-06-21T09:35:00.000-07:002013-06-24T12:05:57.009-07:00Moving registers to the data section
<pre class="example">
.global _start
.section .data
value: .byte 42
.text
_start:
ldr r1, =value
mov r0, #9 @ r0 is now 9
strb r0, [r1], #0 @ store the register r0 to the address in r1
add r0, r0, r0 @ r0 is now 18
ldrb r0, [r1], #0 @ load the byte at address r1 to r0
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<p>We initialise r0, and store it to memory, and then double the value of r0, and then read back from memory to r0.</p>
<p>Our return code (the value of r0) is the same as that stored rather than the doubled version.</p>
<p>By implication, our store and load were executed correctly.</p>
<p>If implication is suspicious, then it is possible to walk through the code using <em>gdb</em> to verify.</p>
<pre class="system">
bob@poland:~/www/examples$ make movtest2
/usr/bin/as -gstabs -o movtest2.o movtest2.s
/usr/bin/ld -o movtest2 movtest2.o
bob@poland:~/www/examples$ ./movtest2
bob@poland:~/www/examples$ echo $?
9
bob@poland:~/www/examples$
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0tag:blogger.com,1999:blog-1144416818386553036.post-5820557550167801982013-06-20T16:19:00.000-07:002013-06-24T12:06:35.179-07:00Moving data from memory to a register<p>We use rodata (for read-only data).</p>
<pre class="example">
# movtest1.s
@ An example of moving data from memory to a register
.global _start
.section .rodata
value: .byte 42
.align 2
.text
_start:
ldr r1, =value
ldrb r0, [r1], #1 @ load the byte at address r1 to r0
@ r0 being the return value when we exit
mov r7, #1 @ set r7 to 1 - the syscall for exit
swi 0 @ then invoke the syscall from linux
</pre>
<pre class="system">
<p>We use a makefile to build the software.</p>
bob@poland:~/www/examples$ make movtest1
/usr/bin/as -gstabs -o movtest1.o movtest1.s
/usr/bin/ld -o movtest1 movtest1.o
bob@poland:~/www/examples$ ./movtest1
bob@poland:~/www/examples$ echo $?
42
bob@poland:~/www/examples$
</pre>Bobhttp://www.blogger.com/profile/02882434663178342942noreply@blogger.com0