July 24, 2013 at 05:53 Tags Assembly

Recently I've been doing some x64 assembly hacking, and one thing I had to Google a bit and piece together from a few places is how to iterate over all the command-line arguments (colloquially known as argv, from C) and do something with them.

I've already discussed in the past how arguments get passed into a program (not to the C main, mind you, but to the real entry point of a program, _start), so all that was left was a small matter of implementation. Here it is, in GNU assembler (gas) syntax for Linux. This is pure assembly code: it does not use the C standard library or runtime at all. It demonstrates a lot of interesting concepts, such as reading command-line arguments, issuing Linux system calls and string processing.

    #---------------- DATA ----------------#
        .data
    # We need buf_for_itoa to be large enough to contain a 64-bit integer.
    # endbuf_for_itoa will point to the end of buf_for_itoa and is useful
    # for passing to itoa.
        .set BUFLEN, 32
    buf_for_itoa:
        .space BUFLEN, 0x0
        .set endbuf_for_itoa, buf_for_itoa + BUFLEN - 1
    newline_str:
        .asciz "\n"
    argc_str:
        .asciz "argc: "

    #---------------- CODE ----------------#
        .globl _start
        .text
    _start:
        # On entry to _start, argc is in (%rsp), argv[0] in 8(%rsp),
        # argv[1] in 16(%rsp) and so on.
        lea argc_str, %rdi
        call print_cstring

        mov (%rsp), %r12                # save argc in r12

        # Convert the argc value to a string and print it out
        mov %r12, %rdi
        lea endbuf_for_itoa, %rsi
        call itoa
        mov %rax, %rdi
        call print_cstring
        lea newline_str, %rdi
        call print_cstring

        # In a loop, pick argv[n] for 0 <= n < argc and print it out,
        # followed by a newline. r13 holds n.
        xor %r13, %r13
    .L_argv_loop:
        mov 8(%rsp, %r13, 8), %rdi      # argv[n] is in (rsp + 8 + 8*n)
        call print_cstring
        lea newline_str, %rdi
        call print_cstring
        inc %r13
        cmp %r12, %r13
        jl .L_argv_loop

        # exit(0)
        mov $60, %rax
        mov $0, %rdi
        syscall

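For readers more at home in C, the stack layout that _start relies on can be modeled with a plain array of 64-bit words. This is only a sketch of the addressing arithmetic, not code that runs at the real entry point; the names arg_count and arg_at are hypothetical, and it assumes a 64-bit target where pointers and uintptr_t are both 8 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sp models the value of %rsp on entry to _start. Each slot is one
 * 8-byte word: slot 0 holds argc, slots 1..argc hold the argv pointers. */

/* Equivalent of: mov (%rsp), %r12 */
static long arg_count(const uintptr_t *sp)
{
    assert(sp != NULL);
    return (long)sp[0];
}

/* Equivalent of: mov 8(%rsp, %r13, 8), %rdi   (address rsp + 8 + 8*n) */
static const char *arg_at(const uintptr_t *sp, long n)
{
    assert(sp != NULL);
    return (const char *)sp[1 + n];
}
```

Indexing slot 1 + n of an array of 8-byte words is the same computation as the scaled addressing mode 8(%rsp, %r13, 8) in the loop above.
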
This code uses a couple of support functions. The first is print_cstring:

    # Function print_cstring
    #   Print a null-terminated string to stdout.
    # Arguments:
    #   rdi     address of string
    # Returns: void
    print_cstring:
        # Find the terminating null
        mov %rdi, %r10
    .L_find_null:
        cmpb $0, (%r10)
        je .L_end_find_null
        inc %r10
        jmp .L_find_null
    .L_end_find_null:
        # r10 points to the terminating null. so r10-rdi is the length
        sub %rdi, %r10

        # Now that we have the length, we can call sys_write
        # sys_write(unsigned fd, char* buf, size_t count)
        mov $1, %rax
        # Populate address of string into rsi first, because the later
        # assignment of fd clobbers rdi.
        mov %rdi, %rsi
        mov $1, %rdi
        mov %r10, %rdx
        syscall
        ret
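
As a point of comparison, roughly the same logic can be sketched in C on top of the POSIX write wrapper. The name print_cstring_c is hypothetical, and unlike the assembly version (which returns nothing) this sketch hands back whatever write returns so that the caller can inspect it.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Print a null-terminated string to stdout with a single write() call.
 * Returns the number of bytes written, or -1 on error. */
static ssize_t print_cstring_c(const char *s)
{
    assert(s != NULL);

    /* Find the terminating null, as the .L_find_null loop does. */
    const char *end = s;
    while (*end != '\0')
        end++;

    /* end - s is the length, matching "sub %rdi, %r10". */
    size_t len = (size_t)(end - s);

    /* fd 1 is stdout; this is the same call as syscall number 1. */
    return write(STDOUT_FILENO, s, len);
}
```

Like the assembly, this sketch issues one write and does not retry if fewer bytes than requested were written.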

More interestingly, here is itoa. It's a bit more general than what I actually use in the main program because it also supports negative numbers. It can convert any number that fits into a 64-bit register. Note the unusual API for receiving and returning the place where the actual string is written: the caller passes the address of the last byte of the buffer, and the function returns the address of the first character it wrote. Since it's very natural for an itoa implementation to emit the digits in reverse, I wanted to avoid actual string reversing by writing the digits into a buffer from the end towards the beginning.

    # Function itoa
    #   Convert an integer to a null-terminated string in memory.
    #   Assumes that there is enough space allocated in the target
    #   buffer for the representation of the integer. Since the number itself
    #   is accepted in the register, its value is bounded.
    # Arguments:
    #   rdi:    the integer
    #   rsi:    address of the *last* byte in the target buffer
    # Returns:
    #   rax:    address of the first byte in the target string that
    #           contains valid information.
    itoa:
        movb $0, (%rsi)         # Write the terminating null.

        # If the input number is negative, we mark it by placing 1 into r9
        # and negate it. In the end we check if r9 is 1 and add a '-' in front.
        mov $0, %r9
        cmp $0, %rdi
        jge .L_input_positive
        neg %rdi
        mov $1, %r9
    .L_input_positive:

        mov %rdi, %rax          # Place the number into rax for the division.
        mov $10, %r8            # The base is in r8

    .L_next_digit:
        # Prepare rdx:rax for division by clearing rdx. rax remains from the
        # previous div. rax will be rax / 10, rdx will be the next digit to
        # write out.
        xor %rdx, %rdx
        div %r8
        # Step back one byte, then write the digit to the buffer, in ascii
        dec %rsi
        add $0x30, %dl
        movb %dl, (%rsi)

        cmp $0, %rax            # We're done when the quotient is 0.
        jne .L_next_digit

        # If we marked in r9 that the input is negative, it's time to add that
        # '-' in front of the output.
        cmp $1, %r9
        jne .L_itoa_done
        dec %rsi
        movb $0x2d, (%rsi)

    .L_itoa_done:
        mov %rsi, %rax          # rsi points to the first byte now; return it.
        ret
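
The fill-from-the-end idea may be easier to follow in C. What follows is a minimal sketch under the same contract (the caller passes the address of the last byte of a large enough buffer and gets back the start of the string); the name itoa_backward is hypothetical and this is not a translation the assembler would produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert value to decimal, writing backwards from end.
 * end must point at the last byte of a buffer with room for at least
 * 21 bytes (sign, 19 digits, terminating null).
 * Returns the address of the first character of the result. */
static char *itoa_backward(int64_t value, char *end)
{
    assert(end != NULL);
    *end = '\0';                        /* terminating null goes last */

    int negative = value < 0;
    /* Negate in unsigned arithmetic so the most negative value is safe. */
    uint64_t n = negative ? 0 - (uint64_t)value : (uint64_t)value;

    do {
        *--end = (char)('0' + n % 10);  /* step back, store one digit */
        n /= 10;
    } while (n != 0);                   /* done when the quotient is 0 */

    if (negative)
        *--end = '-';
    return end;
}
```

For example, with char buf[32], calling itoa_backward(-42, buf + sizeof buf - 1) returns a pointer to the string "-42" located near the end of buf. The assembly's neg followed by an unsigned div has the same effect as the unsigned negation here, which is why the most negative 64-bit value is also converted correctly.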

Some notes about the code: