Tutorials/Assembler Tutorial
Figure: Assembler commands MindMap
Machine language is everywhere. Whether you are playing Call of Duty, surfing the internet or writing a document, it is machine language that is being executed inside your computer. No matter whether you wrote your software in C, BASIC or Ruby, by execution time it has been translated to machine language (by a compiler, or by an interpreter that is itself machine language). Machine language is the godfather of programming languages. Its commands are plain bytes; for example, "do nothing" is the byte 90h on x86. This is why assembler exists: it maps the machine language commands to mnemonics like "jmp" for "jump" or "nop" for "no operation". This is very low-level, and I like low-level topics because I think the deeper it goes, the more interesting it becomes. So here I show you how I deal with machine language and assembler. I am using x86 Linux in the examples.
Your first program
A "hello world" program in assembler is already advanced. So as a first lesson we will take a look at a program that does nothing but loop endlessly. Here it is:
endless.asm
<syntaxhighlight>
global _start
_start:
    nop
    jmp _start
</syntaxhighlight>
This assembler source code contains two commands, "nop" for "no operation" and "jmp" for "jump". The other two lines are a label (_start:) and a directive (global _start, which makes the label _start visible to the linker so that it can use it as the place where the program starts).
compile it
Assemble it for 64-bit Linux (nasm plays the role that a compiler plays for C):
nasm -f elf64 endless.asm
link it
ld -s -o endless endless.o
execute it
Call the program like this:
./endless
Now you will need to press Ctrl+C to stop the program. Note that this is possible only because there is an operating system that gives time slices to the process and keeps watching for keypresses while the loop runs.
disassemble it
Now we want to take a look at the machine language in this program right? Here it is:
# objdump -M intel -d endless

endless:     file format elf64-x86-64

Disassembly of section .text:

0000000000400080 <.text>:
  400080: 90       nop
  400081: eb fd    jmp 0x400080
So, the byte "90" (hexadecimal) is the machine code for "do nothing" or "no operation"; its assembler mnemonic is "nop". "eb" is a short "jump" ("jmp"), and its one-byte parameter says where to jump. It is a relative jump: the parameter is a signed number that is added to the address of the byte following the jump command (here 400083). So "eb ff" (-1) would jump to the parameter byte of the jump itself, "eb fe" (-2) means "jump to the jump command" and "eb fd" (-3) means "jump to the byte before the jump command", which is our nop at 400080.
What have we seen
So assembling translated the assembler commands into machine language; for example, "nop" got translated to the byte 90h. The linker then embedded this machine code into the default file format for a Linux executable, called ELF.
Theory: registers and flags
The x86 processor uses registers to hold the data it is working on. Typically the processor reads a value from RAM into a register, performs an operation on it (like adding a number) and writes it back to RAM. The processor has at least one register that points to the RAM address it is executing code from. You cannot write to this register with a normal mov; it changes only through jumps, calls and returns. The x86 processor also has general-purpose registers called eax, ebx, ecx, edx, ebp, edi and esi. They are 32 bits wide and you can work with them; working means adding, subtracting, comparing and more. The lower 16 bits of eax, ebx, ecx and edx are called ax, bx, cx and dx. The lower byte (8 bits) of each of those is called al, bl, cl and dl, and the higher byte ah, bh, ch and dh. So if you change the value of ah, you at the same time change the value of ax and eax. On 64-bit x86 the registers are extended once more to 64 bits and get an "r" prefix; you will see rbp and rsp in the listings below.
Here is an example of eax having the value FF008844 and the resulting values for ax, ah and al:
value of EAX | upper 16 bit | ax = lower 16 bit | ah = higher 8 bit of ax | al = lower 8 bit of ax |
---|---|---|---|---|
FF008844 | FF00 | 8844 | 88 | 44 |
The processor also contains registers that tell it where the top of the stack is (stack pointer, esp) and where it is executing instructions in memory (instruction pointer, eip). You cannot access eip directly, and you should change esp only with care, because push, pop, call and ret all rely on it.
The processor also contains flags. They are used e.g. for conditional execution. For example, if you want to compare two numbers, you typically load one number into a register, say eax, and compare it with the other number using the command cmp. Imagine you have one variable on the stack at offset 0x4 below the base pointer rbp and the other at offset 0x8. To compare them you first load the first variable into register eax:
mov eax,DWORD PTR [rbp-0x4]
then you will compare eax with the number at position 0x08:
cmp eax,DWORD PTR [rbp-0x8]
Now the processor's zero flag records whether the last comparison yielded equal (flag set) or not equal (flag clear). So you can now "jump if not equal" with the command jne:
jne 40051d
In C, the data type that fits the 32-bit register eax is called int, so this is how you would program it in C:
int main()
{
    int i = 0x23;
    int n = 0x25;
    if (i == n) { n = i; }
}
Hello world
Write a C program
This is the program:
hello.c
#include <stdio.h>
int main()
{
    int i=0x23;
    printf("hello world");
}
Translate the program to machine language
Now we compile it:
gcc hello.c -o hello
and see that it runs:
./hello
hello world
disassemble it
To disassemble it we have two possibilities: disassemble to gcc assembler syntax (also known as AT&T syntax) or to the Intel dialect. Here they are:
gcc Assembler Syntax | Intel Assembler Syntax |
---|---|
# objdump -d hello [...] 000000000040055d <main>: 40055d: 55 push %rbp 40055e: 48 89 e5 mov %rsp,%rbp 400561: 48 83 ec 10 sub $0x10,%rsp 400565: c7 45 fc 23 00 00 00 movl $0x23,-0x4(%rbp) 40056c: bf 04 06 40 00 mov $0x400604,%edi 400571: b8 00 00 00 00 mov $0x0,%eax 400576: e8 c5 fe ff ff callq 400440 <printf@plt> 40057b: c9 leaveq 40057c: c3 retq |
# objdump -Mintel -d hello [...] 000000000040053c <main>: 40053c: 55 push rbp 40053d: 48 89 e5 mov rbp,rsp 400540: 48 83 ec 20 sub rsp,0x20 400544: c7 45 fc 23 00 00 00 mov DWORD PTR [rbp-0x4],0x23 40054b: bf 4c 06 40 00 mov edi,0x40064c 400550: b8 00 00 00 00 mov eax,0x0 400555: e8 d6 fe ff ff call 400430 <printf@plt> 40055a: c9 leave 40055b: c3 ret |
Intel assembler vs GNU assembler
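We see two assembler representations of essentially the same machine code (the two listings come from slightly different builds, which is why the addresses and the reserved stack size differ). The gcc syntax puts the source operand first and the destination last, prefixes registers with % and constants with $, and adds size suffixes to some mnemonics: movl for "move long (32-bit) value", retq instead of ret for "return", callq instead of call for "call a function" (here printf, which is a library function and not a syscall).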
Raspberry Pi vs Intel
Now let's compile hello.c once on an Intel notebook and once on a Raspberry Pi, and see how the machine code differs.
Intel Notebook | Raspberry Pi |
---|---|
# objdump -d hello [...] 0000000000001149 <main>: 1149: f3 0f 1e fa endbr64 114d: 55 push %rbp 114e: 48 89 e5 mov %rsp,%rbp 1151: 48 83 ec 10 sub $0x10,%rsp 1155: c7 45 fc 23 00 00 00 movl $0x23,-0x4(%rbp) 115c: 48 8d 3d a1 0e 00 00 lea 0xea1(%rip),%rdi # 2004 <_IO_stdin_used+0x4> 1163: b8 00 00 00 00 mov $0x0,%eax 1168: e8 e3 fe ff ff callq 1050 <printf@plt> 116d: b8 00 00 00 00 mov $0x0,%eax 1172: c9 leaveq 1173: c3 retq 1174: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 117b: 00 00 00 117e: 66 90 xchg %ax,%ax |
# objdump -d hello [...] 00010408 <main>: 10408: e92d4800 push {fp, lr} 1040c: e28db004 add fp, sp, #4 10410: e24dd008 sub sp, sp, #8 10414: e3a03023 mov r3, #35 ; 0x23 10418: e50b3008 str r3, [fp, #-8] 1041c: e59f0010 ldr r0, [pc, #16] ; 10434 <main+0x2c> 10420: ebffffb0 bl 102e8 <printf@plt> 10424: e3a03000 mov r3, #0 10428: e1a00003 mov r0, r3 1042c: e24bd004 sub sp, fp, #4 10430: e8bd8800 pop {fp, pc} 10434: 000104a8 .word 0x000104a8 |
We see that the Raspberry Pi, based on an ARM chip, has a fixed instruction length: each command is four bytes long, whereas the x86 commands in the left column are between one and ten bytes long.
How it works
The actual "hello world" string is not stored inside the main function but in a read-only data section (.rodata). Note that the "text" section is the "code" section; it is the section that will be executed. The strings command lists the printable strings in the binary:
tweedleburg:~ # strings hello
/lib64/ld-linux-x86-64.so.2
libc.so.6
printf
__libc_start_main
__gmon_start__
GLIBC_2.2.5
UH-@
UH-@
[]A\A]A^A_
hello world
;*3$"
Before the program is executed, the dynamic loader maps the needed libraries into the program's address space. Here is how you find the library functions the program needs (U means "undefined", i.e. to be supplied by a library):
# nm hello | grep U
                 U __libc_start_main@@GLIBC_2.2.5
                 U printf@@GLIBC_2.2.5
Then the program's startup code calls __libc_start_main, which calls the main function. The main function calls printf and hands over its parameters in registers: the address of the string to output goes into edi (see the mov edi,... line in the disassembly above). The return code is handed back in a register as well, namely eax.
translate C to assembler
To learn the syntax of a gcc assembler program, let's write a C program and compile it without assembling it. Here is the C program, hello.c:
#include <stdio.h>
int main()
{
    int i=0x23;
    printf("hello world");
}
Now we compile this without assembling it:
# gcc -o hello.asm -S hello.c
Now we have the program transformed to assembler and take a look at it:
# cat hello.asm
        .file   "hello.c"
        .section        .rodata
.LC0:
        .string "hello world"
        .text
.globl main
        .type   main, @function
main:
.LFB2:
        pushq   %rbp
.LCFI0:
        movq    %rsp, %rbp
.LCFI1:
        subq    $32, %rsp
.LCFI2:
        movl    $35, -4(%rbp)
        movl    $.LC0, %edi
        movl    $0, %eax
        call    printf
[...]
Now we know the syntax of gcc assembler and we can finally write a program that consists of an endless loop:
.text
.globl main
main:
start:
        nop
        jmp start
gcc does not recognize the .asm file extension, so we have to tell it explicitly that the file is assembler:
gcc -x assembler hello.asm -o hello.bin
Create a boot sector
Under program your own OS I show how to create a boot sector for your own operating system. The bad thing about this is that none of the functions of an operating system are available to you and that you have only 512 bytes at your disposal. The good thing is that the processor will execute every one of your commands, whereas under a running kernel your program may lack the necessary privileges and be aborted with a segmentation fault.
Here is our boot sector:
- create a file kernel.s
kernel.s
start:
    ; this should print H
    mov ax, 0xe48
    mov bx, 7
    int 0x10
    ; E
    mov ax, 0xe45
    int 0x10
    ; L
    mov ax, 0xe4C
    int 0x10
    ; L
    mov ax, 0xe4C
    int 0x10
    ; O
    mov ax, 0xe4F
    int 0x10
.ende:
    ; endless loop so the processor does not run into whatever follows
    jmp .ende
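You may note that we say "mov ax,..." here while in the previous examples we have seen "mov eax,...". The reason is that a boot sector starts in 16-bit real mode, where you work with the 16-bit register ax, and nasm produces 16-bit code by default when it writes a flat binary like this one. On top of that there are several assembler dialects, as we have seen.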
- translate this assembler code into machine language:
nasm kernel.s
- the result is the file kernel. Let's look at it:
tweedleburg:~ # ll kernel
-rw-r--r-- 1 root root 30 Nov 27 21:29 kernel
tweedleburg:~ # hexdump -C kernel
00000000  b8 48 0e bb 07 00 cd 10  b8 45 0e cd 10 b8 4c 0e  |.H.......E....L.|
00000010  cd 10 b8 4c 0e cd 10 b8  4f 0e cd 10 eb fe        |...L....O.....|
0000001e
tweedleburg:~ #
You see that mov ax (like mov eax in the earlier listings) is again translated to the byte b8; in this 16-bit code it is followed by a 2-byte value instead of a 4-byte one. All assembler commands are translated and there is nothing but machine language in that file. If you want to use this, see programming your own OS.
x86 assembler
Most of this article is about Intel x86 assembler. Here is one more example of ARM assembler, which I got from http://linuxintro.org/wiki/objDump:
Nokia-N810-43-7:~# objdump -d a.out | head

a.out:     file format elf32-littlearm

Disassembly of section .init:

000084f8 <_init>:
    84f8: e52de004  str lr, [sp, #-4]!
    84fc: e24dd004  sub sp, sp, #4 ; 0x4
    8500: eb000035  bl 85dc <call_gmon_start>
    8504: e28dd004  add sp, sp, #4 ; 0x4
The main difference is, as you can see, that this machine language has a fixed-width command set: every command consists of 4 bytes. That makes it easy to find the command that is one, two or any number of commands away. While the machine language is completely different, many of the assembler mnemonics (like mov, add, sub, push and pop) also exist in x86 assembler.
run vlc as root
Knowing assembler can help you in situations where you need to make changes to already compiled programs. For example, when you start vlc as root you will get an error message:
VLC is not supposed to be run as root. Sorry. If you need to use real-time priorities and/or privileged TCP ports you can use vlc-wrapper (make sure it is Set-UID root and cannot be run by non-trusted users first).
If you - like me - do not see why root should not be allowed to run vlc, you can disassemble the code using objdump -d
objdump -d -M intel /usr/bin/vlc
[...]
  4010f9: e8 32 0a 00 00       call 401b30 <unsetenv>
  4010fe: e8 3d fe ff ff       call 400f40 <geteuid@plt>
  401103: 85 c0                test eax,eax
  401105: 0f 84 04 06 00 00    je 40170f <fflush@plt+0x66f>
  40110b: be ca 1f 40 00       mov esi,0x401fca
  401110: bf 06 00 00 00       mov edi,0x6
[...]
As you can see, the program calls the function geteuid (a thin library wrapper around the syscall of the same name). The return value is stored in register eax. Then eax is checked against 0 (test eax,eax). If it is 0, the zero flag in the processor is set. The next instruction is je ("jump if equal", that is, jump if the zero flag is set), a conditional jump. The solution is to replace the 5-byte call to geteuid with a 5-byte command that sets eax to a value other than 0, for example
b8 00 00 00 01
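and then vlc will always run as if your user ID were not 0. (b8 is mov eax, and the four bytes after it are stored lowest byte first, so this sets eax to 0x01000000 and not to 1; any non-zero value does the job.) Based on this I could write http://www.linuxintro.org/wiki/Run_vlc_as_root which shows how to do it in 3 lines of bash code.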
what's your name in assembler?
We said that everything that is executed in a computer is machine language. Now, can we execute everything as machine language? What does your name mean if you interpret it as machine language? To find out write a program that contains as many NOP operations as your name contains characters. I am showing this for the name Thorsten:
name.asm
global _start _start: nop nop nop nop nop nop nop nop
compile that program:
nasm -f elf64 name.asm
link it:
ld -s -o name name.o
the nop commands will be translated to bytes with the value 90h. Use a hex editor to replace them with your name:
okteta name
Then save and disassemble the file:
objdump -M intel -d name

name:     file format elf64-x86-64

Disassembly of section .text:

0000000000400080 <.text>:
  400080: 54                push rsp
  400081: 68 6f 72 73 74    push 0x7473726f
  400086: 65 6e             outs dx,BYTE PTR gs:[rsi]
You see, "T" in machine language is a command to push the register rsp; "h" is byte 68h, a push command that takes the next four bytes, "orst", as its parameter; "e", byte 65h, is a prefix that selects the gs segment, and "n", byte 6eh, is a command to send something to a computer port (like the serial and parallel port). What a beautiful name I have :)
what's this byte in assembler?
We remember the difference between assembler and machine language - assembler contains human-readable commands like NOP, machine language contains bytes that can be executed by the processor like 90h for NOP.
Now we have translated C into machine language (by compiling it), translated it from machine language to assembler (by disassembling it) and translated from C into assembler (by compiling with the -S option). Now it's time to write some arbitrary bytes and translate them to assembler. I assume that an assembler command takes at most 4 bytes of parameters, so five bytes are enough to get at least one complete command. So let's write a program:
# cat >ass
AAAAA
# objdump -D -b binary -mi386 -Maddr16,data16,intel ass

ass:     file format binary

Disassembly of section .data:

00000000 <.data>:
   0: 41    inc cx
   1: 41    inc cx
   2: 41    inc cx
   3: 41    inc cx
   4: 41    inc cx
   5: 0a    .byte 0xa
In other words, we have written text into a file ("AAAAA" plus enter) and told objdump "this is an executable program in 16-bit machine language, tell us what it does". And the answer is: it increments the register CX five times. "A", which is byte 65 in decimal or 41h, has a meaning in machine language: increment register CX. It does not require an argument.
Going further we can try more bytes and make a table what byte means what in machine language:
J: 4a dec dx
K: 4b dec bx
L: 4c dec sp
M: 4d dec bp
N: 4e dec si
O: 4f dec di
C instructions in assembler
Notable C instructions in assembler:
C code:
int main() { int i=5; }
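assembler code (note the QWORD PTR: this listing stores 8 bytes, which is what you would expect for a 64-bit type such as long; for a 32-bit int you would expect DWORD PTR, as in the hello world listing above):
4004fd: 55                         push rbp
4004fe: 48 89 e5                   mov rbp,rsp
400501: 48 c7 45 f8 05 00 00 00    mov QWORD PTR [rbp-0x8],0x5
400509: 5d                         pop rbp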
C code:
int main() { int *i; *i=(int) 5; }
assembler code:
4004fd: 55                   push rbp
4004fe: 48 89 e5             mov rbp,rsp
400501: 48 8b 45 f8          mov rax,QWORD PTR [rbp-0x8]
400505: c7 00 05 00 00 00    mov DWORD PTR [rax],0x5
40050b: 5d                   pop rbp
commands
Notable assembler commands are
- mov for moving data between RAM and registers, between registers, or for loading a constant
- push for saving the content of a register to RAM (stack)
- pop for restoring the content of a register
- cmp for comparing two values (sets the zero flag if they are equal)
- test for a bitwise AND whose result is thrown away (sets the zero flag if the result is zero; on x86 the mnemonic is test, tst is the ARM spelling)
- jmp for jump operations
- jnz for jump on not zero (jump if the zero flag is not set)
- jne for jump on not equal (the same machine command as jnz under another name)
- call for executing functions aka sub-procedures
- ret for returning from a sub-procedure
- nop for no operation (delay)
- int for triggering a software interrupt
input/output commands
Input/output commands are there to talk with the printer, keyboard, mouse, scanner, all the peripherals and internal devices such as the sound card. Typically this communication is done via a syscall, and the operating system prohibits doing the communication directly: your program should not print while other processes are printing, or the result will be garbage. So your operating system probably has a print scheduler, and your program should communicate with this instead of directly with the printer. The same is true for all other devices. Note that in the case of a printer, data will mainly be sent to it, but there may also be situations where you want to receive data from the printer (is there still ink?) and there are even situations where you want to send data to the mouse (think of force-feedback mice). The commands for I/O operations in assembler are:
- out for sending data to peripherals (e.g. USB devices, serial devices, parallel devices, internal devices like soundcards)
- in for reading data