Something about assembly
Paolo Bonzini
April 28th, 2010
Outline
● Basic x86 assembly syntax
– Registers
– Addressing modes (16-bit vs. 32-bit)
– Instructions
● Basic GCC __asm__ syntax
● Resources
Prerequisites
● You are not scared of assembly
– Maybe you even enjoyed some of it in school
● You have overheard some conversations between assembly/compiler junkies
– Real mode/protected mode
– Position-independent code
x86 registers
Bits 31..0 (386+)   Bits 15..0   Bits 15..8   Bits 7..0
%eax                %ax          %ah          %al
%ebx                %bx          %bh          %bl
%ecx                %cx          %ch          %cl
%edx                %dx          %dh          %dl
%esi                %si          --           --
%edi                %di          --           --
%ebp                %bp          --           --
%esp                %sp          --           --
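The nesting in the table above can be observed from C. This is a minimal sketch, assuming GCC or Clang targeting x86 or x86-64; the function names are hypothetical, and the %b/%h/%w operand modifiers and the "Q" constraint are GCC inline-asm features.

```c
#include <stdint.h>

/* Bits 7..0 of x, read through the low-byte alias (e.g. %al). */
uint32_t low_byte(uint32_t x)
{
    uint32_t r;
    /* "Q" restricts x to %eax/%ebx/%ecx/%edx, the registers that
       have both a low-byte and a high-byte alias. */
    __asm__ ("movzbl %b1, %0" : "=r" (r) : "Q" (x));
    return r;
}

/* Bits 15..8 of x, read through the high-byte alias (e.g. %ah). */
uint32_t high_byte(uint32_t x)
{
    uint32_t r;
    __asm__ ("movzbl %h1, %0" : "=Q" (r) : "Q" (x));
    return r;
}

/* Bits 15..0 of x, read through the 16-bit alias (e.g. %ax). */
uint32_t low_word(uint32_t x)
{
    uint32_t r;
    __asm__ ("movzwl %w1, %0" : "=r" (r) : "Q" (x));
    return r;
}
```

For x = 0x12345678 the three aliases see 0x78, 0x56 and 0x5678 respectively.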
x86-64 registers
Bits 63..0   Bits 31..0   Bits 15..0   Bits 7..0
%rax         %eax         %ax          %al
%rbx         %ebx         %bx          %bl
%rcx         %ecx         %cx          %cl
%rdx         %edx         %dx          %dl
%rsi         %esi         %si          %sil
%rdi         %edi         %di          %dil
%rbp         %ebp         %bp          %bpl
%rsp         %esp         %sp          %spl
%r8          %r8d         %r8w         %r8b
⋮            ⋮            ⋮            ⋮
%r15         %r15d        %r15w        %r15b
High-byte registers (bits 15..8): %ah %bh %ch %dh
Other registers
● Flags
● Protected mode
– Control registers (%crN)
– Descriptor registers (ldtr, gdtr, idtr)
– Model-specific registers
● Hardware breakpoints
– Debug registers (%drN)
● Segment registers
x86 Segmentation
Real mode
● 20-bit addressing
● Segment register contributes to bits 4:19 of the address
● Shifted segment register is added to the 16-bit address
● No paging: memory is accessed only by physical address
● %cs, %ds, %es, %fs, %gs, %ss often point to different bases
Protected mode
● 32-bit addressing
● Segment register points into the LDT or GDT
● Base value (from LDT/GDT) is added to the 32-bit address
● Optional paging (virtual addressing)
● %cs, %ds, %es, %ss access the entire address space (but it's just a convention)
● %fs/%gs often don't (used e.g. for thread-local storage)
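The real-mode computation can be written out in plain C. This is a minimal sketch; the function name is hypothetical, and truncating the sum to 20 bits is an assumption that matches the 20-bit addressing stated on the slide.

```c
#include <stdint.h>

/* Real-mode address: the segment register is shifted left by 4 bits
   (so it contributes to bits 4:19) and added to the 16-bit offset.
   Assumption: the result wraps at 20 bits. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t) segment << 4) + offset) & 0xFFFFFu;
}
```

For example, segment 0xB800 with offset 0 gives 0xB8000, and many different segment:offset pairs map to the same address.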
x86 Addressing modes
          16-bit (real mode)               32-/64-bit (protected mode)
Base      %bp or %bx                       Any register!
Index     %si or %di                       Any but %esp
Scale     ---                              Index scaled by 1/2/4/8
Offset    16-bit                           32-bit (sign extended to 64)
Segment   Default: %ds for no base         You don't care about the
          register or %bx, otherwise %ss   default (%ds = %ss)
● All parts are optional
x86 Addressing examples
16-bit: ofs(%base, %index)
● 8(%bp)
● 16(%si)
● (%bx,%si)
● %es:(%di)
● SYMBOL_NAME
32-bit: ofs(%base, %index, scale)
● 8(%ebp)
● (%esi,%eax,4)
● SYMBOL(,%esi,2)
● %fs:24
64-bit: same as 32-bit
● 8(%rbp)
● one addition: 8(%rip)
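The base/index/scale and offset forms above can be tried from C. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64 with 4-byte int; the function names are hypothetical.

```c
#include <stddef.h>

/* Loads table[index] with one (base,index,4) memory operand. */
int load_element(const int *table, size_t index)
{
    int value;
    /* The pointer and the index are both pointer-sized, so the compiler
       prints 32-bit register names on i386 and 64-bit ones on x86-64.
       "memory" tells the compiler that the asm reads memory. */
    __asm__ ("movl (%1,%2,4), %0"
             : "=r" (value)
             : "r" (table), "r" (index)
             : "memory");
    return value;
}

/* Loads table[1] with an offset(base) memory operand. */
int load_second(const int *table)
{
    int value;
    __asm__ ("movl 4(%1), %0"
             : "=r" (value)
             : "r" (table)
             : "memory");
    return value;
}
```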
Instruction syntax
● Three syntaxes
– Intel: looks nicer, actually very quirky. Often used on Windows
– AT&T: ugly as hell, but a bit more orthogonal. Most common on Unices
– nasm: tries to make Intel syntax less quirky. Nice, but not widespread
● I'll cover AT&T syntax only
Instruction syntax
movl 32(%esp), %eax
– mov: instruction
– l: operand size (b, w, l, q)
– 32(%esp): source
– %eax: destination
● Operand size often unnecessary, but double-checked by assembler
● Two-operand arithmetic, destination last
● Immediates (incl. addresses) look like $2
● Up to one memory operand usually allowed
Some mnemonics
● mov
● Arithmetic: add, sub, and, or, xor, cmp, test
– dest = dest op source, sets flags
– cmp and test do not write to dest
● Extension (two operand sizes): movs, movz
– Example: movsbl
● Stack: push, pop
● Flow transfer: jmp, call, ret, jCOND
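The two extension mnemonics differ only in how they fill the upper bits. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; the function names are hypothetical.

```c
/* movsbl: byte source, long destination, sign-extended. */
int sign_extend_byte(signed char c)
{
    int r;
    /* "q" asks for a register that has a byte-sized name. */
    __asm__ ("movsbl %1, %0" : "=r" (r) : "q" (c));
    return r;
}

/* movzbl: byte source, long destination, zero-extended. */
unsigned int zero_extend_byte(unsigned char c)
{
    unsigned int r;
    __asm__ ("movzbl %1, %0" : "=r" (r) : "q" (c));
    return r;
}
```

The byte 0x80 becomes -128 with movsbl and 128 with movzbl.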
Flags
● Usual suspects: sign, overflow, zero, carry
● Combined in useful ways for you by jCOND instructions
Equality je, jne
Signed comparison jl, jle, jg, jge (Less/Greater)
Unsigned comparison jb, jbe, ja, jae (Below/Above)
Flag tests jo, jno, js, jns, (jc, jnc, jz, jnz)
Other synonyms jnge, jng, jnle, jnl, jnae, jna, jnbe, jnb
● cmp computes dest - source (AT&T order: cmp source, dest), so watch out when thinking about less than or greater than!
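The operand order of cmp and the signed/unsigned split between jl and jb can be checked from C. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; the function names are hypothetical, and the "&" (early clobber) modifier is a GCC feature not covered on these slides.

```c
/* Returns 1 if a < b as signed integers, else 0. */
int signed_less(int a, int b)
{
    int r;
    __asm__ ("movl $1, %0\n\t"
             "cmpl %2, %1\n\t"   /* flags from a - b: dest is a, source is b */
             "jl 1f\n\t"         /* taken when dest < source (signed) */
             "movl $0, %0\n"
             "1:"
             : "=&r" (r)         /* "&": r is written before the inputs are read */
             : "r" (a), "r" (b)
             : "cc");
    return r;
}

/* Returns 1 if a < b as unsigned integers, else 0. */
int unsigned_below(unsigned int a, unsigned int b)
{
    int r;
    __asm__ ("movl $1, %0\n\t"
             "cmpl %2, %1\n\t"
             "jb 1f\n\t"         /* taken when dest < source (unsigned) */
             "movl $0, %0\n"
             "1:"
             : "=&r" (r)
             : "r" (a), "r" (b)
             : "cc");
    return r;
}
```

The same bit pattern 0xFFFFFFFF is less than 1 for jl (it is -1) but not below 1 for jb.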
Typical idioms
or %eax, %eax          Instead of cmp $0, %eax. Often followed by
test %eax, %eax        js/jns instead of jl/jge (either would work
                       in this case)
pushf; pop %eax        Move flags to %eax
sbb %eax, %eax         %eax = 0 if carry clear, -1 if set
xor %eax, %eax         %eax = 0
lea 16(%eax), %ebx     “Load effective address”. Often used to do
                       3-address adds and shifts
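Two of these idioms are easy to try from C. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; the function names are hypothetical.

```c
/* lea as a 3-address add and shift: r = a + 4*b + 16,
   leaving a and b untouched and the flags unchanged. */
unsigned long lea_combine(unsigned long a, unsigned long b)
{
    unsigned long r;
    __asm__ ("lea 16(%1,%2,4), %0" : "=r" (r) : "r" (a), "r" (b));
    return r;
}

/* sbb idiom: cmp sets carry when a < b (unsigned), then
   sbb computes r - r - carry, i.e. 0 or -1. */
unsigned int below_mask(unsigned int a, unsigned int b)
{
    unsigned int r;
    __asm__ ("cmpl %2, %1\n\t"
             "sbbl %0, %0"
             : "=r" (r)
             : "r" (a), "r" (b)
             : "cc");
    return r;
}
```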
“Advanced” keywords
● Protected mode
– Protection levels (aka rings)
– Descriptor tables
– Gates
● Paging
● Segment descriptor cache
– “Big real mode”
GCC asm statements
asm [volatile] ("template"
                : outputs
                : inputs
                : clobbers)
● Volatile asm: cannot be scheduled or eliminated
● Four colon-separated sections
● All sections optional, but at least one colon must be there
GCC asm statements
asm [volatile] ("template"
                : outputs
                : inputs
                : clobbers)
● Template: C string, escaped %
– %esi is written %%esi
– %0, %1 replaced by outputs and inputs
● Newlines usually written as \n\t
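The template rules above can be seen together in one statement. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; the function name is hypothetical.

```c
/* Returns 2*x, going through %eax explicitly. */
int double_via_eax(int x)
{
    int r;
    __asm__ ("movl %1, %%eax\n\t"     /* %1 is replaced by the input operand */
             "addl %%eax, %%eax\n\t"  /* %%eax is a literal %eax */
             "movl %%eax, %0"         /* %0 is replaced by the output operand */
             : "=r" (r)
             : "r" (x)
             : "eax", "cc");          /* %eax and the flags are overwritten */
    return r;
}
```

Because %eax is listed as clobbered, the compiler will not place either operand there.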
GCC asm statements
asm [volatile] ("template"
                : outputs
                : inputs
                : clobbers)
● Outputs: comma separated
● Format:
"=r" (C lvalue)
"=m" (C lvalue)
● "=rm" allows more than one choice
● If no outputs, asm is automatically made volatile
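The three output forms can be compared side by side. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; the function names and constants are hypothetical.

```c
/* "=r": the output lives in a register chosen by the compiler. */
int output_in_register(void)
{
    int v;
    __asm__ ("movl $42, %0" : "=r" (v));
    return v;
}

/* "=m": the output is written straight to the lvalue's memory. */
void output_in_memory(int *slot)
{
    __asm__ ("movl $7, %0" : "=m" (*slot));
}

/* "=rm": the compiler picks whichever of the two is cheaper. */
int output_either(void)
{
    int v;
    __asm__ ("movl $99, %0" : "=rm" (v));
    return v;
}
```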
GCC asm statements
asm [volatile] ("template"
                : outputs
                : inputs
                : clobbers)
● Inputs: comma separated
● Format:
"r" (C expression)
"m" (C expression)
"i" (C constant)
● "NN" (C expression): use same place as the NN-th output
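Each input form can feed the same instruction. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; the function names are hypothetical.

```c
/* "0" puts a in the same place as output 0, so addl updates it in place;
   "r" supplies b in a register. */
int add_reg(int a, int b)
{
    int r;
    __asm__ ("addl %2, %0" : "=r" (r) : "0" (a), "r" (b) : "cc");
    return r;
}

/* "m" supplies the second operand directly from memory. */
int add_mem(int a, const int *p)
{
    int r;
    __asm__ ("addl %2, %0" : "=r" (r) : "0" (a), "m" (*p) : "cc");
    return r;
}

/* "i" supplies a compile-time constant, printed as an immediate. */
int add_ten(int a)
{
    int r;
    __asm__ ("addl %2, %0" : "=r" (r) : "0" (a), "i" (10) : "cc");
    return r;
}
```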
GCC asm statements
asm [volatile] ("template"
                : outputs
                : inputs
                : clobbers)
● Clobbers: list of registers destroyed by the asm
● Add "memory" if the asm reads or writes memory
● Used for register allocation and scheduling
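A store through a pointer held in a register is the typical case for the "memory" clobber. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; the function name is hypothetical.

```c
/* Stores value at *dest. The compiler only sees two register inputs,
   so "memory" is what tells it that memory contents have changed. */
void store_word(int *dest, int value)
{
    /* No outputs: the asm is implicitly volatile. */
    __asm__ ("movl %1, (%0)"
             :
             : "r" (dest), "r" (value)
             : "memory");
}
```

Without the clobber, the compiler would be free to keep using a stale copy of *dest that it had cached in a register.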
Why?
● Provides high-level information about the operands
– Very effective inlining and CSE
● Tightly integrated with register allocation
– Fewer moves for inputs and outputs
● Also integrated with instruction selection
– "r" (0) becomes a xor
– "m" (0x12345678) can be taken from .rodata
Simple example
asm volatile ("movq %0, %%cr0"
:: "r" (ctxt->cr0));
● Likely becomes two instructions
– Load ctxt->cr0 in a register
– movq reg, %cr0
● Still, register allocation, inlining etc. can be creative and optimize it to one instruction
Explicit register choices
register long _eax asm("eax") = SYS_read;
register int _ebx asm("ebx") = fd;
register void *_ecx asm("ecx") = buf;
register long _edx asm("edx") = len;
asm volatile ("int $0x80"
              : "=r" (_eax)
              : "0" (_eax), "r" (_ebx),
                "r" (_ecx), "r" (_edx)
              : "memory", "cc");
● Use matching constraint if the same register is used for input and output
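The system call above only runs on a kernel that accepts int $0x80, so here is the same technique on an instruction that works anywhere. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; mull is swapped in as a stand-in because it uses %eax and %edx implicitly, and the function name is hypothetical.

```c
#include <stdint.h>

/* 32x32 -> 64-bit multiply. mull multiplies %eax by its operand and
   leaves the result in %edx:%eax. */
uint64_t mul_32x32(uint32_t a, uint32_t b)
{
    register uint32_t lo __asm__ ("eax") = a;  /* input and low half of result */
    register uint32_t hi __asm__ ("edx");      /* high half of result */
    __asm__ ("mull %3"
             : "=r" (lo), "=r" (hi)
             : "0" (lo), "r" (b)               /* "0": same register in and out */
             : "cc");
    return ((uint64_t) hi << 32) | lo;
}
```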
Single-register constraints
asm volatile ("movl %1,%%eax; int $0x80"
              : "=a" (result)
              : "i" (SYS_read), "b" (fd),
                "c" (buf), "d" (len)
              : "memory", "cc");
● a/b/c/d/S/D for %eax/%ebx/%ecx/%edx/%esi/%edi
● Other registers not available
● Use matching constraint here too
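Single-register constraints are also the natural fit for instructions with fixed operands. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; divl is a stand-in chosen because it runs without a kernel, the function name is hypothetical, and the caller must pass a nonzero divisor.

```c
#include <stdint.h>

/* divl divides %edx:%eax by its operand; the quotient goes to %eax
   and the remainder to %edx. */
uint32_t div_rem(uint32_t dividend, uint32_t divisor, uint32_t *remainder)
{
    uint32_t quot, rem;
    __asm__ ("divl %4"
             : "=a" (quot), "=d" (rem)
             : "0" (dividend),   /* low half of the dividend, in %eax */
               "1" (0u),         /* high half of the dividend, in %edx */
               "r" (divisor)
             : "cc");
    *remainder = rem;
    return quot;
}
```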
Single-register constraints
asm ("xchgl %%ebx, %1 \n\t" \
"cpuid \n\t" \
"xchgl %%ebx, %1 \n\t" \
: "=a"(a),"=r"(b),"=c"(c),"=d"(d) \
: "0" (level))
● %ebx not usable in 32-bit shared libraries (PIC base register)
● "m" might refer to %ebx!
– Workaround: "r" (&var) or similar
Constraints and modifiers
asm volatile ("rolb $4, %b0" // low byte, e.g. %al
              : "=q" (var)
              : "0" (var) : "cc");
asm volatile ("rolb $4, %h0" // high byte, e.g. %ah
              : "=Q" (var)
              : "0" (var) : "cc");
● q limits to %eax...%edx for 32-bit compiles
● Q limits to %eax...%edx always
● Other modifiers: %w0, %k0, %q0 (for 16, 32, 64-bit respectively)
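The two statements above can be wrapped into functions to see which byte each modifier selects. This is a minimal sketch, assuming GCC or Clang on x86 or x86-64; the function names are hypothetical.

```c
#include <stdint.h>

/* Rotates bits 7..0 of var by 4, swapping the two nibbles of the low byte. */
uint32_t swap_nibbles_low(uint32_t var)
{
    __asm__ ("rolb $4, %b0" : "=q" (var) : "0" (var) : "cc");
    return var;
}

/* Rotates bits 15..8 of var by 4, swapping the two nibbles of the second byte. */
uint32_t swap_nibbles_high(uint32_t var)
{
    __asm__ ("rolb $4, %h0" : "=Q" (var) : "0" (var) : "cc");
    return var;
}
```

"Q" is needed for %h0 because only %eax...%edx have a high-byte name, even on x86-64.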
Resources
● http://www.intel.com/products/processor/manuals/
● http://developer.amd.com/documentation/guides/Pages/default.aspx#Manuals
● http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html
That's all
● Q&A