Tutorials/Assembler Tutorial
Revision as of 20:41, 27 November 2014
Everything that is executed on a computer is executed in machine language. If you develop software in PHP, this software is interpreted by the PHP interpreter, and the interpreter itself is available in machine language. If you write software in C, the C compiler translates your source code into machine language, a process known as compiling. Machine language is the godfather of programming languages, and assembler represents machine language as mnemonics, where one mnemonic stands for one command in machine language. You see this is very low-level, and I like low-level topics. So here I show you how I deal with machine language and assembler. I am using 64-bit x86 Linux in the examples.
Endless loop
A "hello world" program in assembler is already advanced. So as a first lesson we will take a look at a program that does nothing but loop endlessly. Here it is:
endless.asm
global _start
_start:
    nop
    jmp _start
This assembler source code contains two commands, "nop" for "no operation" and "jmp" for "jump". The other two lines are a label (_start:) and meta-information (global _start, which makes the label "_start" visible to the linker so that it can use it as the place where the program starts).
Assemble it:
nasm -f elf64 endless.asm
Link it:
ld -s -o endless endless.o
Run it (the program never exits, so stop it with Ctrl+C):
./endless
Hello world
We now create a hello world program in C. Then we compile and disassemble it. So we have the C compiler translate it into machine language and then we use a disassembler to translate it into assembler. This is the program:
cat hello.c
#include <stdio.h>
int main()
{
    int i=0x23;
    printf("hello world");
}
Now we compile it:
gcc hello.c -o hello
and see that it runs:
./hello
hello world
To disassemble it, run:
objdump -M intel -d hello
The result for the main function is:
000000000040053c <main>:
  40053c:  55                    push   rbp
  40053d:  48 89 e5              mov    rbp,rsp
  400540:  48 83 ec 20           sub    rsp,0x20
  400544:  c7 45 fc 23 00 00 00  mov    DWORD PTR [rbp-0x4],0x23
  40054b:  bf 4c 06 40 00        mov    edi,0x40064c
  400550:  b8 00 00 00 00        mov    eax,0x0
  400555:  e8 d6 fe ff ff        call   400430 <printf@plt>
  40055a:  c9                    leave
  40055b:  c3                    ret
GCC assembler
To learn the syntax of a gcc assembler program, let's write a C program and compile it without assembling it. Here is the C program, hello.c:
#include <stdio.h>
int main()
{
    int i=0x23;
    printf("hello world");
}
Now we compile this without assembling it:
# gcc -o hello.asm -S hello.c
Now we have the program transformed to assembler and take a look at it:
# cat hello.asm
        .file   "hello.c"
        .section        .rodata
.LC0:
        .string "hello world"
        .text
.globl main
        .type   main, @function
main:
.LFB2:
        pushq   %rbp
.LCFI0:
        movq    %rsp, %rbp
.LCFI1:
        subq    $32, %rsp
.LCFI2:
        movl    $35, -4(%rbp)
        movl    $.LC0, %edi
        movl    $0, %eax
        call    printf
[...]
Now we know the syntax of gcc assembler and we can finally write a program that consists of an endless loop:
.text
.globl main
main:
start:
    nop; jmp start