x86 Assembly and GCC __asm__ Guide

This document provides an overview of x86 assembly syntax and GCC inline assembly. It covers basic x86 registers and addressing modes, common instruction syntax in AT&T format, typical assembly idioms, and the syntax for inline assembly statements in GCC. The document provides prerequisites and outlines basic x86 assembly concepts to set up an introduction to assembly programming.

Uploaded by

Paolo Bonzini
Copyright
© Attribution ShareAlike (BY-SA)

Something about assembly

Paolo Bonzini

April 28th, 2010


Outline

Basic x86 assembly syntax
– Registers
– Addressing modes (16-bit vs. 32-bit)
– Instructions

Basic GCC __asm__ syntax

Resources
Prerequisites

You are not scared by assembly
– Maybe you even enjoyed some of it in school

You have overheard some conversations
between assembly/compiler junkies
– Real mode/protected mode
– Position-independent code
x86 registers
Bit layout: %eax (386+) is bits 31:0, %ax is bits 15:0,
%ah is bits 15:8, %al is bits 7:0

32-bit   16-bit   8-bit high   8-bit low
%eax     %ax      %ah          %al
%ebx     %bx      %bh          %bl
%ecx     %cx      %ch          %cl
%edx     %dx      %dh          %dl
%esi     %si      --           --
%edi     %di      --           --
%ebp     %bp      --           --
%esp     %sp      --           --
x86-64 registers

64-bit   32-bit   16-bit   8-bit
(63:0)   (31:0)   (15:0)   (7:0)
%rax     %eax     %ax      %al
%rbx     %ebx     %bx      %bl
%rcx     %ecx     %cx      %cl
%rdx     %edx     %dx      %dl
%rsi     %esi     %si      %sil
%rdi     %edi     %di      %dil
%rbp     %ebp     %bp      %bpl
%rsp     %esp     %sp      %spl
%r8      %r8d     %r8w     %r8b
⋮        ⋮        ⋮        ⋮
%r15     %r15d    %r15w    %r15b

High-byte registers (bits 15:8): %ah %bh %ch %dh
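The aliasing between the register names above can be mimicked in plain C with casts and shifts. This is only a sketch: the helper names are hypothetical and not from the slides, and each one returns what a program would read from one sub-register given the full 64-bit value.

```c
#include <stdint.h>

/* Hypothetical helpers (not from the slides): each returns the value
   visible through one sub-register of a 64-bit register such as %rax. */
uint32_t view_32(uint64_t r)     { return (uint32_t)r; }        /* %eax: bits 31:0 */
uint16_t view_16(uint64_t r)     { return (uint16_t)r; }        /* %ax:  bits 15:0 */
uint8_t  view_8_high(uint64_t r) { return (uint8_t)(r >> 8); }  /* %ah:  bits 15:8 */
uint8_t  view_8_low(uint64_t r)  { return (uint8_t)r; }         /* %al:  bits 7:0  */
```

With %rax = 0x1122334455667788, these give 0x55667788, 0x7788, 0x77 and 0x88 respectively.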
Other registers

Flags

Protected mode
– Control registers (%crN)
– Descriptor registers (ldtr, gdtr, idtr)
– Model-specific registers

Hardware breakpoints
– Debug registers (%drN)

Segment registers
x86 Segmentation
                   Real mode                      Protected mode
Addressing         20-bit                         32-bit
Segment register   contributes to bits 4:19       points into LDT or GDT
                   of address
Address            shifted segment register       base value (from LDT/GDT)
                   summed to 16-bit address       summed to 32-bit address
Paging             none: memory accessed only     optional (virtual
                   by physical address            addressing)
Usage              %cs, %ds, %es, %fs, %gs, %ss   %cs, %ds, %es, %ss access the
                   often point to different       entire address space (but it's
                   bases                          just a convention); %fs/%gs
                                                  often don't (used e.g. for
                                                  thread-local storage)
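The real-mode column boils down to one line of arithmetic. The sketch below uses a hypothetical helper name, and the 20-bit mask is an assumption that models 8086-style wraparound (later CPUs with the A20 line enabled do not wrap).

```c
#include <stdint.h>

/* Hypothetical helper: real-mode address formation.
   The segment is shifted left by 4 and added to the 16-bit offset.
   Assumption: the result is truncated to 20 bits, as on an 8086. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

For example, segment 0xB800 with offset 0 gives 0xB8000, and 0x1234:0x5678 gives 0x179B8.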
x86 Addressing modes
          16-bit (real mode)            32-/64-bit (protected mode)
Base      %bp or %bx                    Any register!
Index     %si or %di                    Any but %esp
Scale     ---                           Index scaled by 1/2/4/8
Offset    16-bit                        32-bit (sign extended to 64)
Segment   Default: %ds for no base      You don't care about the
          register or %bx, otherwise    default (%ds = %ss)
          (base %bp) %ss

All parts are optional
x86 Addressing examples
16-bit: ofs(%base, %index)
● 8(%bp)
● 16(%si)
● (%bx,%si)
● %es:(%di)
● SYMBOL_NAME

32-bit: ofs(%base, %index, scale)
● 8(%ebp)
● (%esi,%eax,4)
● SYMBOL(,%esi,2)
● %fs:24

64-bit: same as 32-bit
● 8(%rbp)
● one addition: 8(%rip)
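The 32-bit form ofs(%base, %index, scale) stands for a simple sum. Here is a minimal sketch in C, with a hypothetical function name, ignoring segment bases, and passing 0 for any omitted part:

```c
#include <stdint.h>

/* Hypothetical helper: the address that ofs(%base, %index, scale)
   refers to in 32-bit mode. Omitted parts are passed as 0 (scale 1).
   Arithmetic wraps modulo 2^32, as it does in the CPU. */
uint32_t effective_address(int32_t ofs, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    return (uint32_t)ofs + base + index * scale;
}
```

So 8(%ebp) with %ebp = 0x1000 is 0x1008, and (%esi,%eax,4) with %esi = 0x2000 and %eax = 3 is 0x200C.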
Instruction syntax

Three syntaxes
– Intel: looks nicer, actually very quirky.
Often used on Windows
– AT&T: ugly as hell, but a bit more orthogonal.
Most common on Unices
– nasm: tries to make Intel syntax less quirky. Nice,
but not widespread

I'll cover AT&T syntax only
Instruction syntax
movl 32(%esp), %eax

mov        Instruction
l          Operand size (b,w,l,q)
32(%esp)   Source
%eax       Destination

Operand size often unnecessary, but double-checked by assembler

Two-operand arithmetic, destination last

Immediates (incl. addresses) look like $2

Up to one memory operand usually allowed
Some mnemonics
mov

Arithmetic: add, sub, and, or, xor, cmp, test
– dest = dest op source, flags are set
– cmp and test do not write to dest

Extension (two operand sizes): movs, movz
– Example: movsbl (sign-extend a byte to a long)

Stack: push, pop

Flow transfer: jmp, call, ret, jCOND
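The two extension mnemonics can be tried from C. This is a sketch with hypothetical function names. It uses GCC inline assembly on x86 only, and falls back to plain C casts elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: movsbl sign-extends a byte (b) to a long (l). */
int32_t sign_extend_byte(int8_t b)
{
#if defined(__i386__) || defined(__x86_64__)
    int32_t r;
    __asm__("movsbl %1, %0" : "=r"(r) : "qm"(b));
    return r;
#else
    return (int32_t)b;          /* portable fallback, same result */
#endif
}

/* Hypothetical helper: movzbl zero-extends a byte to a long. */
uint32_t zero_extend_byte(uint8_t b)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t r;
    __asm__("movzbl %1, %0" : "=r"(r) : "qm"(b));
    return r;
#else
    return (uint32_t)b;         /* portable fallback, same result */
#endif
}
```

The same byte pattern 0xFF comes out as -1 from sign_extend_byte and as 255 from zero_extend_byte.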
Flags

Usual suspects: sign, overflow, zero, carry

Combined in useful ways for you by jCOND instructions

Equality              je, jne
Signed comparison     jl, jle, jg, jge (Less/Greater)
Unsigned comparison   jb, jbe, ja, jae (Below/Above)
Flag tests            jo, jno, js, jns (jc, jnc, jz, jnz)
Other synonyms        jnge, jng, jnle, jnl, jnae, jna, jnbe, jnb

cmp source, dest computes dest - source: watch the operand order when
thinking about less than or greater than! For example, cmp %ebx, %eax
followed by jl jumps when %eax < %ebx.
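The difference between the signed and unsigned condition codes shows up when the same bits are compared both ways. The sketch below uses hypothetical function names, with setl/setb standing in for jl/jb since they test the same flags. It assumes GCC on x86 and falls back to C comparisons elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: 1 if a < b as signed values (the jl condition). */
int less_signed(int32_t a, int32_t b)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char r;
    /* cmpl %2, %1 computes %1 - %2, i.e. a - b (destination last) */
    __asm__("cmpl %2, %1\n\t"
            "setl %0"
            : "=q"(r) : "r"(a), "r"(b) : "cc");
    return r;
#else
    return a < b;
#endif
}

/* Hypothetical helper: 1 if a < b as unsigned values (the jb condition). */
int below_unsigned(uint32_t a, uint32_t b)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char r;
    __asm__("cmpl %2, %1\n\t"
            "setb %0"
            : "=q"(r) : "r"(a), "r"(b) : "cc");
    return r;
#else
    return a < b;
#endif
}
```

The bit pattern 0xFFFFFFFF is less than 1 when read as signed (-1), but not below 1 when read as unsigned.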
Typical idioms
or %eax, %eax        Instead of cmp $0, %eax. Often followed by
test %eax, %eax      js/jns instead of jl/jge (either would work
                     in this case)
pushf; pop %eax      Move flags to %eax
sbb %eax, %eax       %eax = 0 if carry clear, -1 if set
xor %eax, %eax       %eax = 0
lea 16(%eax), %ebx   “Load effective address”
                     Often used to do 3-address adds and shifts
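Two of these idioms, lea as a three-address add-and-shift and sbb as a carry-to-mask trick, can be wrapped for experimentation. This is a sketch with hypothetical function names. It assumes GCC on x86 and uses an equivalent C expression elsewhere.

```c
/* Hypothetical helper: a + 4*b + 16 in a single lea, without
   touching the flags and without overwriting either input. */
unsigned long lea_add_scaled(unsigned long a, unsigned long b)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned long r;
    __asm__("lea 16(%1,%2,4), %0" : "=r"(r) : "r"(a), "r"(b));
    return r;
#else
    return a + 4 * b + 16;
#endif
}

/* Hypothetical helper: -1 if a is below b (unsigned), 0 otherwise.
   cmp sets the carry flag on borrow; sbb reg, reg then yields 0 - CF. */
long borrow_mask(unsigned long a, unsigned long b)
{
#if defined(__i386__) || defined(__x86_64__)
    long r;
    __asm__("cmp %2, %1\n\t"
            "sbb %0, %0"
            : "=r"(r) : "r"(a), "r"(b) : "cc");
    return r;
#else
    return a < b ? -1L : 0L;
#endif
}
```

lea_add_scaled(100, 3) is 128. borrow_mask(1, 2) is -1, while borrow_mask(2, 1) is 0.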
“Advanced” keywords

Protected mode
– Protection levels (aka rings)
– Descriptor tables
– Gates

Paging

Segment descriptor cache
– “Big real mode”
GCC asm statements
asm [volatile]
    ("template"
     : outputs
     : inputs
     : clobbers)

● Volatile asm: cannot be scheduled or eliminated
● Four colon-separated sections
● All sections optional, but at least one colon must be there
GCC asm statements
asm [volatile] ("template" : outputs : inputs : clobbers)

The template:
● C string, escaped %
– %esi is written %%esi
– %0, %1 replaced by outputs and inputs
● Newlines usually written as \n\t
GCC asm statements
asm [volatile] ("template" : outputs : inputs : clobbers)

The outputs:
● Comma separated
● Format:
  "=r" (C lvalue)
  "=m" (C lvalue)
● "=rm" allows more than one choice
● If no outputs, asm is automatically made volatile
GCC asm statements
asm [volatile] ("template" : outputs : inputs : clobbers)

The inputs:
● Comma separated
● Format:
  "r" (C expression)
  "m" (C expression)
  "i" (C constant)
● "NN" (C expression): use same place as the NN-th output
GCC asm statements
asm [volatile] ("template" : outputs : inputs : clobbers)

The clobbers:
● List of registers destroyed by the asm
● Add "memory" if the asm reads or writes memory (other than
  through its operands)
● Used for register allocation and scheduling
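Putting the four sections together, here is a minimal sketch of a complete statement. The function name is hypothetical, the example assumes GCC on x86, and other targets get a plain C fallback.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit addition written as an extended asm. */
uint32_t add_asm(uint32_t a, uint32_t b)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t sum;
    __asm__("addl %2, %0"     /* template: %0 and %2 are operands    */
            : "=r"(sum)       /* outputs:  %0, any register          */
            : "0"(a),         /* inputs:   %1, same place as %0      */
              "rm"(b)         /*           %2, register or memory    */
            : "cc");          /* clobbers: add modifies the flags    */
    return sum;
#else
    return a + b;
#endif
}
```

The matching constraint "0" makes a start out in the output register, which suits a two-operand instruction whose destination is also a source.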
Why?

Provides high-level information about the
operands
– Very effective inlining and CSE

Tightly integrated with register allocation
– Fewer moves for inputs and outputs

Also integrated with instruction selection
– "r" (0) becomes an xor
– "m" (0x12345678) can be taken from .rodata
Simple example
asm volatile ("movq %0, %%cr0"
:: "r" (ctxt->cr0));

Likely becomes two instructions
– Load ctxt->cr0 in a register
– movq reg, %cr0

Still, register allocation, inlining etc. can be
creative and reduce it to one instruction
Explicit register choices
register long _eax asm("eax") = SYS_read;
register int _ebx asm("ebx") = fd;
register void *_ecx asm("ecx") = buf;
register long _edx asm("edx") = len;
asm volatile ("int $0x80"
              : "=r" (_eax)
              : "0" (_eax), "r" (_ebx),
                "r" (_ecx), "r" (_edx)
              : "memory", "cc");

Use matching constraint if the same register is
used for input and output
Single-register constraints
asm volatile
  ("movl %1,%%eax; int $0x80"
   : "=a" (result)
   : "i" (SYS_read), "b" (fd),
     "c" (buf), "d" (len)
   : "memory", "cc");

a/b/c/d/S/D for %eax...%edx, %esi...%edi

Other registers not available

Use matching constraint here too
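A system call with no arguments is enough to try the "a" constraint together with a matching constraint. The sketch below uses a hypothetical function name and assumes Linux. The i386 branch follows the int $0x80 convention from the slides. The x86-64 branch is an adaptation not covered by the slides: it uses the syscall instruction, which overwrites %rcx and %r11, so both are listed as clobbers. Anything else falls back to the C library.

```c
#include <unistd.h>
#if defined(__linux__)
#include <sys/syscall.h>
#endif

/* Hypothetical helper: getpid through a raw system call. */
long raw_getpid(void)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* "=a" pins the result to %rax; "0" loads the call number there */
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "0"((long)SYS_getpid)
                     : "rcx", "r11", "memory");
    return ret;
#elif defined(__linux__) && defined(__i386__)
    long ret;
    __asm__ volatile("int $0x80"
                     : "=a"(ret)
                     : "0"((long)SYS_getpid)
                     : "memory", "cc");
    return ret;
#else
    return (long)getpid();      /* portable fallback */
#endif
}
```

The result should agree with what getpid() from the C library returns in the same process.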
Single-register constraints
asm ("xchgl %%ebx, %1 \n\t"
     "cpuid \n\t"
     "xchgl %%ebx, %1 \n\t"
     : "=a"(a),"=r"(b),"=c"(c),"=d"(d)
     : "0" (level));

%ebx not usable in 32-bit shared libraries (PIC
base register)

“m” might refer to %ebx!
– Workaround: “r” (&var) or similar
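The xchgl trick above can be packaged as a function. This is a sketch with a hypothetical name and return convention. The i386 branch keeps %ebx intact around cpuid as in the slide. The x86-64 branch is an assumption beyond the slides and simply uses the "b" constraint, since %rbx is not the PIC base there. Other targets report that the query is unsupported.

```c
/* Hypothetical helper: run cpuid for the given leaf (subleaf 0).
   regs[0..3] receive %eax, %ebx, %ecx, %edx.
   Returns 1 on x86, 0 on other targets (regs left untouched). */
int cpuid_query(unsigned leaf, unsigned regs[4])
{
#if defined(__i386__)
    /* %ebx may be the PIC base register: swap it out and back */
    __asm__("xchgl %%ebx, %1\n\t"
            "cpuid\n\t"
            "xchgl %%ebx, %1"
            : "=a"(regs[0]), "=&r"(regs[1]), "=c"(regs[2]), "=d"(regs[3])
            : "0"(leaf), "2"(0u));
    return 1;
#elif defined(__x86_64__)
    __asm__("cpuid"
            : "=a"(regs[0]), "=b"(regs[1]), "=c"(regs[2]), "=d"(regs[3])
            : "0"(leaf), "2"(0u));
    return 1;
#else
    (void)leaf;
    (void)regs;
    return 0;
#endif
}
```

Leaf 0 returns the highest supported leaf number in %eax and the vendor string spread over %ebx, %edx and %ecx.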
Constraints and modifiers
asm volatile ("rolb $4, %b0"   // %al
              : "=q" (var)
              : "0" (var) : "cc");
asm volatile ("rolb $4, %h0"   // %ah
              : "=Q" (var)
              : "0" (var) : "cc");

q limits to %eax...%edx for 32-bit compiles

Q limits to %eax...%edx always

Other modifiers: %w0, %k0, %q0 (for 16, 32,
64-bit respectively)
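The effect of the %b0 and %h0 modifiers can be observed by rotating one byte of a 32-bit value. The sketch below uses hypothetical function names. It assumes GCC on x86 and has a plain C fallback that computes the same result.

```c
#include <stdint.h>

/* Hypothetical helper: swap the two nibbles of bits 7:0 (%b0, e.g. %al). */
uint32_t swap_low_byte_nibbles(uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__("rolb $4, %b0" : "=q"(v) : "0"(v) : "cc");
    return v;
#else
    return (v & ~0xFFu) | ((v & 0x0Fu) << 4) | ((v & 0xF0u) >> 4);
#endif
}

/* Hypothetical helper: swap the two nibbles of bits 15:8 (%h0, e.g. %ah).
   "Q" is needed so that the register has a high-byte name. */
uint32_t swap_high_byte_nibbles(uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__("rolb $4, %h0" : "=Q"(v) : "0"(v) : "cc");
    return v;
#else
    return (v & ~0xFF00u) | ((v & 0x0F00u) << 4) | ((v & 0xF000u) >> 4);
#endif
}
```

Starting from 0x12345678, the first function gives 0x12345687 and the second gives 0x12346578.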
Resources

http://www.intel.com/products/processor/manuals/

http://developer.amd.com/documentation/guides/Pages/default.aspx#Manuals

http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html
That's all

Q&A
