COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE
Assembly Language Fundamentals
⦿ Basic Elements of Assembly Language
⦿ Example: Adding and Subtracting Integers
⦿ Assembling, Linking, and Running Programs
⦿ Defining Data
⦿ Symbolic Constants
Basic Elements of Assembly Language
⦿ Integer constants
⦿ Integer expressions
⦿ Character and string constants
⦿ Reserved words and identifiers
⦿ Directives and instructions
⦿ Labels
⦿ Mnemonics and Operands
⦿ Comments
⦿ Examples
Integer Constants
⦿ Optional leading + or - sign
⦿ binary, decimal, hexadecimal, or octal digits
⦿ Common radix characters:
› h – hexadecimal
› d – decimal
› b – binary
› q/o – octal
› r – encoded real
⦿ Examples: 30d, 6Ah, 42, 1101b
⦿ A hexadecimal constant that begins with a letter needs a leading zero: 0A5h
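The radix rules above can be checked with a short Python sketch. `parse_masm_int` is a hypothetical helper written for this illustration, not part of any assembler; it assumes the default radix is decimal, accepts `q`/`o` for octal, and ignores the `r` (encoded real) suffix.

```python
def parse_masm_int(text):
    """Convert a MASM-style integer constant to a Python int (illustrative only)."""
    # Radix suffix characters; q/o for octal is an assumption of this sketch.
    radixes = {"h": 16, "d": 10, "b": 2, "q": 8, "o": 8}
    s = text.strip().lower()
    sign = 1
    if s[0] in "+-":
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    # A trailing radix character selects the base; with none, assume decimal.
    if s[-1] in radixes:
        base, digits = radixes[s[-1]], s[:-1]
    else:
        base, digits = 10, s
    # A constant must begin with a decimal digit, hence the leading 0 in 0A5h.
    if not digits[0].isdigit():
        raise ValueError(f"{text!r} must begin with a digit")
    return sign * int(digits, base)

for literal in ("30d", "6Ah", "42", "1101b", "0A5h"):
    print(literal, "=", parse_masm_int(literal))
```

Without the leading zero, `A5h` would look like an identifier, which is why the sketch rejects it.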
Integer Expressions
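⦿ Operators and precedence levels (level 1 is evaluated first):
› level 1: ( ) parentheses
› level 2: +, - unary plus and minus
› level 3: *, / multiply and divide
› level 3: MOD modulus
› level 4: +, - add and subtract
⦿ Examples:
› 16 / 5 evaluates to 3
› -(3 + 4) * (6 - 1) evaluates to -35
› -3 + 4 * 6 - 1 evaluates to 20
› 25 MOD 3 evaluates to 1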
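To see how precedence changes a result, the sketch below evaluates a few integer expressions in Python. It assumes the assembler's `/` is integer division and `MOD` is the remainder, written here as `//` and `%`. Python's `//` rounds toward negative infinity, so only non-negative operands are used with those two operators.

```python
# Each entry pairs an assembler-style expression (as text) with a Python equivalent.
examples = [
    ("4 + 5 * 2",    4 + 5 * 2),      # multiply before add
    ("(4 + 2) * 6",  (4 + 2) * 6),    # parentheses first
    ("12 - 1 MOD 5", 12 - 1 % 5),     # MOD before subtract
    ("-5 + 2",       -5 + 2),         # unary minus before add
    ("16 / 5",       16 // 5),        # integer division drops the remainder
]

for text, value in examples:
    print(f"{text:14} -> {value}")
```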
Character and String Constants
⦿ Enclose character in single or double quotes
› 'A', "x"
› ASCII character = 1 byte
⦿ Enclose strings in single or double quotes
› "ABC"
› 'xyz'
› Each character occupies a single byte
⦿ Embedded quotes:
› 'Say "Goodnight," Gracie'
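The one-byte-per-character rule can be confirmed with a small Python sketch that encodes the constants above as ASCII. The variable names are chosen for this illustration only.

```python
char_const = "A".encode("ascii")
string_const = "ABC".encode("ascii")
# Double quotes embedded in a single-quoted string are ordinary characters.
quoted = 'Say "Goodnight," Gracie'.encode("ascii")

print(len(char_const), hex(char_const[0]))   # 1 0x41
print(len(string_const))                     # 3
print(len(quoted))                           # 23
```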
Reserved Words and Identifiers
⦿ Reserved words (Appendix D) cannot be used
as identifiers
› Instruction mnemonics, directives, type attributes,
operators, predefined symbols
⦿ Identifiers
› 1-247 characters, including digits
› case insensitive (by default)
› first character must be a letter, _, @, or $
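The identifier rules above translate directly into a regular expression. The following Python sketch is an assumption-laden check, not the assembler's own logic: `is_valid_identifier` and `same_identifier` are hypothetical helpers, and the reserved-word list is not consulted.

```python
import re

# First character: letter, _, @, or $; later characters may also be digits.
# One leading character plus up to 246 more gives the 247-character limit.
IDENTIFIER = re.compile(r"[A-Za-z_@$][A-Za-z0-9_@$]{0,246}")

def is_valid_identifier(name):
    return IDENTIFIER.fullmatch(name) is not None

def same_identifier(a, b):
    # Case-insensitive comparison, matching the assembler's default.
    return a.lower() == b.lower()

for name in ("myArray", "_L1", "$count", "1stValue"):
    print(name, is_valid_identifier(name))
```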
Directives
⦿ Commands that are recognized and acted upon
by the assembler
› Not part of the Intel instruction set
› Used to declare code, data areas, select memory
model, declare procedures, etc.
› Not case sensitive.
⦿ Different assemblers have different directives
› NASM != MASM, for example
Instructions
⦿ Assembled into machine code by assembler
⦿ Executed at runtime by the CPU
⦿ Member of the Intel IA-32 instruction set
⦿ Parts, in source order
› Label (optional)
› Mnemonic (required)
› Operand(s) (depends on the instruction)
› Comment (optional)
Labels
⦿ Act as place markers
› marks the address (offset) of code and data
⦿ Follow identifier rules
⦿ Data label
› must be unique
› example: myArray
⦿ Code label
› target of jump and loop instructions
› example: L1:
Mnemonics and Operands
⦿ Instruction Mnemonics
› a mnemonic is a short name that serves as a "reminder" of what the instruction does
› examples: MOV, ADD, SUB, MUL, INC, DEC
⦿ Operands
› constant (immediate value)
› constant expression
› register
› memory (data label)
Comments
⦿ Comments are good!
› explain the program's purpose
› when it was written, and by whom
› revision information
› tricky coding techniques
› application-specific explanations
⦿ Single-line comments
› begin with semicolon (;)
⦿ Multi-line comments
› begin with COMMENT directive and a
programmer-chosen character
› end with the same programmer-chosen character
Instruction Format Examples
⦿ No operands
› stc ; set Carry flag
⦿ One operand
› inc eax ; register
› inc myByte ; memory
⦿ Two operands
› add ebx,ecx ; register, register
› sub myByte,25 ; memory, constant
› add eax,36 * 25 ; register, expression
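The four parts of an instruction (label, mnemonic, operands, comment) can be pulled apart with a few string operations. This Python sketch uses a hypothetical `split_instruction` helper; it assumes a code label ends with a colon and that no semicolon appears inside a string constant.

```python
def split_instruction(line):
    """Split one source line into (label, mnemonic, operands, comment)."""
    code, _, comment = line.partition(";")
    fields = code.split(None, 1)
    label = None
    if fields and fields[0].endswith(":"):
        # A code label comes first and ends with a colon.
        label = fields[0][:-1]
        fields = fields[1].split(None, 1) if len(fields) > 1 else []
    mnemonic = fields[0] if fields else ""
    rest = fields[1] if len(fields) > 1 else ""
    operands = [op.strip() for op in rest.split(",")] if rest.strip() else []
    return label, mnemonic, operands, comment.strip()

for line in ("stc ; set Carry flag",
             "L1: inc eax ; register",
             "add eax,36 * 25 ; register, expression"):
    print(split_instruction(line))
```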
Example: Adding and Subtracting Integers
TITLE Add and Subtract (AddSub.asm)
; This program adds and subtracts 32-bit integers.
INCLUDE Irvine32.inc
.code
main PROC
mov eax,10000h ; EAX = 10000h
add eax,40000h ; EAX = 50000h
sub eax,20000h ; EAX = 30000h
call DumpRegs ; display registers
exit
main ENDP
END main
Example Output
Program output, showing registers and flags:
EAX=00030000 EBX=7FFDF000 ECX=00000101 EDX=FFFFFFFF
ESI=00000000 EDI=00000000 EBP=0012FFF0 ESP=0012FFC4
EIP=00401024 EFL=00000206 CF=0 SF=0 ZF=0 OF=0
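The EAX value in the output can be reproduced by hand. The Python sketch below repeats the three arithmetic steps, masking to 32 bits to imitate a 32-bit register; it models only EAX, not the other registers or the flags.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like a 32-bit register

eax = 0x10000                      # mov eax,10000h
eax = (eax + 0x40000) & MASK32     # add eax,40000h
eax = (eax - 0x20000) & MASK32     # sub eax,20000h

print(f"EAX={eax:08X}")            # EAX=00030000
```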
Assemble-Link Execute Cycle
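⦿ The cycle takes a program from source file to running code in four steps:
› Step 1: write the source file in a text editor
› Step 2: the assembler translates the source file into an object file (and optionally a listing file)
› Step 3: the linker combines the object file with library code into an executable file (and optionally a map file)
› Step 4: the operating system loader reads the executable file into memory and runs it
⦿ If the source code is modified, Steps 2 through 4 must be repeated.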
Listing File
⦿ Use it to see how your program is assembled
⦿ Contains
› source code
› addresses
› object code (machine language)
› segment names
› symbols (variables, procedures, and constants)
⦿ Example: addSub.lst
Map File
⦿ Information about each program
segment:
› starting address
› ending address
› size
› segment type
⦿ Example: addSub.map
Defining Data
⦿ Intrinsic Data Types
⦿ Data Definition Statement
⦿ Defining BYTE and SBYTE Data
⦿ Defining WORD and SWORD Data
⦿ Defining DWORD and SDWORD Data
⦿ Defining QWORD Data
⦿ Defining TBYTE Data
⦿ Defining Real Number Data
⦿ Little Endian Order
⦿ Adding Variables to the AddSub Program
⦿ Declaring Uninitialized Data
Intrinsic Data Types (1 of 2)
⦿ BYTE, SBYTE
› 8-bit unsigned integer; 8-bit signed integer
⦿ WORD, SWORD
› 16-bit unsigned & signed integer
⦿ DWORD, SDWORD
› 32-bit unsigned & signed integer
⦿ QWORD
› 64-bit integer
⦿ TBYTE
› 80-bit integer
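The value range of each integer type follows from its width and signedness. The Python sketch below derives the ranges for the 8-, 16-, and 32-bit types; `value_range` is a hypothetical helper, and signed types are assumed to use two's complement.

```python
# (bits, signed) for the integer types listed above.
INTEGER_TYPES = {
    "BYTE": (8, False),   "SBYTE": (8, True),
    "WORD": (16, False),  "SWORD": (16, True),
    "DWORD": (32, False), "SDWORD": (32, True),
}

def value_range(type_name):
    """Return (smallest, largest) value the type can hold."""
    bits, signed = INTEGER_TYPES[type_name]
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1

for name in INTEGER_TYPES:
    low, high = value_range(name)
    print(f"{name:7} {low} to {high}")
```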
Intrinsic Data Types (2 of 2)
⦿ REAL4
› 4-byte IEEE short real
⦿ REAL8
› 8-byte IEEE long real
⦿ REAL10
› 10-byte IEEE extended real
Data Definition Statement
⦿ A data definition statement sets aside storage in
memory for a variable.
⦿ May optionally assign a name (label) to the data
⦿ Syntax:
[name] directive initializer [,initializer] . . .
⦿ All initializers become binary data in memory
Defining BYTE and SBYTE Data
Each of the following defines a single byte of storage:
value1 BYTE 'A' ; character constant
value2 BYTE 0 ; smallest unsigned byte
value3 BYTE 255 ; largest unsigned byte
value4 SBYTE -128 ; smallest signed byte
value5 SBYTE +127 ; largest signed byte
value6 BYTE ? ; uninitialized byte
A variable name is a data label that implies an offset (an address).
Defining Bytes
Examples that use multiple initializers:
list1 BYTE 10,20,30,40
list2 BYTE 10,20,30,40
BYTE 50,60,70,80
BYTE 81,82,83,84
list3 BYTE ?,32,41h,00100010b
list4 BYTE 0Ah,20h,'A',22h
Defining Strings (1 of 2)
⦿ A string is implemented as an array of characters
› For convenience, it is usually enclosed in quotation marks
› It usually has a null byte at the end
⦿ Examples:
str1 BYTE "Enter your name",0
str2 BYTE 'Error: halting program',0
str3 BYTE 'A','E','I','O','U'
greeting1 BYTE "Welcome to the Encryption Demo program "
BYTE "created by Kip Irvine.",0
greeting2 \
BYTE "Welcome to the Encryption Demo program "
BYTE "created by Kip Irvine.",0
Defining Strings (2 of 2)
⦿ End-of-line character sequence:
› 0Dh = carriage return
› 0Ah = line feed
str1 BYTE "Enter your name: ",0Dh,0Ah
BYTE "Enter your address: ",0
newLine BYTE 0Dh,0Ah,0
Idea: Define all strings used by your program in the same area of the data
segment.
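To make the byte layout of a two-line, null-terminated string concrete, the Python sketch below builds the same bytes that the str1 and newLine definitions describe. The Python names are illustrative stand-ins for the assembler labels.

```python
CR, LF, NUL = 0x0D, 0x0A, 0x00

# First prompt, end-of-line sequence, second prompt, terminating null byte.
str1 = (b"Enter your name: " + bytes([CR, LF])
        + b"Enter your address: " + bytes([NUL]))
new_line = bytes([CR, LF, NUL])

print(str1)
print(len(str1), "bytes")
print(new_line.hex(" "))
```

Only the final byte is a null, so a routine that prints up to the first null would show both prompts.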
Using the DUP Operator
⦿ Use DUP to allocate (create space for) an array or
string.
⦿ Counter and argument must be constants or
constant expressions
var1 BYTE 20 DUP(0) ; 20 bytes, all equal to zero
var2 BYTE 20 DUP(?) ; 20 bytes, uninitialized
var3 BYTE 4 DUP("STACK") ; 20 bytes: "STACKSTACKSTACKSTACK"
var4 BYTE 10,3 DUP(0),20
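DUP behaves like sequence repetition. The Python sketch below reproduces the initialized examples above as byte strings. It has no counterpart for `DUP(?)`, because Python has no notion of uninitialized storage.

```python
var1 = bytes([0]) * 20                              # 20 DUP(0)
var3 = b"STACK" * 4                                 # 4 DUP("STACK")
var4 = bytes([10]) + bytes([0]) * 3 + bytes([20])   # 10,3 DUP(0),20

print(len(var1), len(var3))   # 20 20
print(var3)                   # b'STACKSTACKSTACKSTACK'
print(list(var4))             # [10, 0, 0, 0, 20]
```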
Defining WORD and SWORD Data
⦿ Define storage for 16-bit integers
› or double characters
› single value or multiple values
word1 WORD 65535 ; largest unsigned value
word2 SWORD -32768 ; smallest signed value
word3 WORD ? ; uninitialized, unsigned
word4 WORD "AB" ; double characters
myList WORD 1,2,3,4,5 ; array of words
array WORD 5 DUP(?) ; uninitialized array
Defining DWORD and SDWORD Data
Storage definitions for signed and unsigned 32-bit integers:
val1 DWORD 12345678h ; unsigned
val2 SDWORD -2147483648 ; signed
val3 DWORD 20 DUP(?) ; unsigned array
val4 SDWORD -3,-2,-1,0,1 ; signed array
Defining QWORD, TBYTE, Real Data
Storage definitions for quadwords, tenbyte values, and real
numbers:
quad1 QWORD 1234567812345678h
val1 TBYTE 1000000000123456789Ah
rVal1 REAL4 -2.1
rVal2 REAL8 3.2E-260
rVal3 REAL10 4.6E+4096
ShortArray REAL4 20 DUP(0.0)
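The storage each definition occupies can be checked with Python's `int.to_bytes` and the `struct` module. This is a sketch under assumptions: the bytes are written in little-endian order, and REAL10 is left out because Python has no standard 80-bit float encoding.

```python
import struct

quad1 = (0x1234567812345678).to_bytes(8, "little")        # QWORD: 8 bytes
val1 = (0x1000000000123456789A).to_bytes(10, "little")    # TBYTE: 10 bytes
r_val1 = struct.pack("<f", -2.1)                          # REAL4: 4 bytes
r_val2 = struct.pack("<d", 3.2e-260)                      # REAL8: 8 bytes
short_array = struct.pack("<20f", *([0.0] * 20))          # 20 DUP(0.0)

for name, data in [("quad1", quad1), ("val1", val1), ("rVal1", r_val1),
                   ("rVal2", r_val2), ("ShortArray", short_array)]:
    print(f"{name:10} {len(data)} bytes")
```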
Little Endian Order
⦿ All data types larger than a byte store their individual
bytes in reverse order. The least significant byte occurs at
the first (lowest) memory address.
⦿ Example:
val1 DWORD 12345678h
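The byte order of the example doubleword can be shown with Python's `struct` module, where `"<I"` requests a little-endian unsigned 32-bit integer. The offsets printed are relative to the start of the variable, not real addresses.

```python
import struct

val1 = 0x12345678
stored = struct.pack("<I", val1)   # little-endian, 32 bits

for offset, byte in enumerate(stored):
    print(f"offset {offset:04X}: {byte:02X}")
# offset 0000: 78
# offset 0001: 56
# offset 0002: 34
# offset 0003: 12
```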
Adding Variables to AddSub
TITLE Add and Subtract, Version 2 (AddSub2.asm)
; This program adds and subtracts 32-bit unsigned
; integers and stores the sum in a variable.
INCLUDE Irvine32.inc
.data
val1 DWORD 10000h
val2 DWORD 40000h
val3 DWORD 20000h
finalVal DWORD ?
.code
main PROC
mov eax,val1 ; start with 10000h
add eax,val2 ; add 40000h
sub eax,val3 ; subtract 20000h
mov finalVal,eax ; store the result (30000h)
call DumpRegs ; display the registers
exit
main ENDP
END main
Declaring Uninitialized Data
⦿ Use the .data? directive to declare an
uninitialized data segment:
.data?
⦿ Within the segment, declare variables with "?"
initializers:
smallArray DWORD 10 DUP(?)
Advantage: the program's EXE file size is reduced.
Symbolic Constants
⦿ Equal-Sign Directive
⦿ Calculating the Sizes of Arrays and
Strings
⦿ EQU Directive
⦿ TEXTEQU Directive
Equal-Sign Directive
⦿ name = expression
› expression is a 32-bit integer (expression or constant)
› may be redefined
› name is called a symbolic constant
⦿ good programming style to use symbols
COUNT = 500
.
.
mov eax,COUNT ; EAX = 500
Calculating the Size of a Byte Array
⦿ current location counter: $
› subtract address of list
› difference is the number of bytes
list BYTE 10,20,30,40
ListSize = ($ - list)
Calculating the Size of a Word Array
⦿ current location counter: $
› subtract address of list
› difference is the number of bytes
› divide by 2 (the size of a word)
list WORD 1000h,2000h,3000h,4000h
ListSize = ($ - list) / 2
Calculating the Size of a Doubleword Array
⦿ current location counter: $
› subtract address of list
› difference is the number of bytes
› divide by 4 (the size of a doubleword)
list DWORD 1,2,3,4
ListSize = ($ - list) / 4
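The three size calculations share one pattern: bytes emitted since the label, divided by the element size. The Python sketch below models `$ - list` as the length of the packed data. `element_count` is a hypothetical helper, and the byte order chosen does not affect the count.

```python
import struct

def element_count(data, element_size):
    """Mimic ($ - list) / element_size: bytes emitted divided by element size."""
    return len(data) // element_size

byte_list = struct.pack("<4B", 10, 20, 30, 40)
word_list = struct.pack("<4H", 0x1000, 0x2000, 0x3000, 0x4000)
dword_list = struct.pack("<4I", 1, 2, 3, 4)

print(len(byte_list), element_count(byte_list, 1))     # 4 4
print(len(word_list), element_count(word_list, 2))     # 8 4
print(len(dword_list), element_count(dword_list, 4))   # 16 4
```

In the assembler, the ListSize line must directly follow the array, because `$` is the location counter at the point where the expression appears.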
EQU Directive
⦿ Define a symbol as either an integer or text expression.
⦿ Cannot be redefined
PI EQU <3.1416>
pressKey EQU <"Press any key to continue...",0>
.data
prompt BYTE pressKey
TEXTEQU Directive
⦿ Define a symbol as either an integer or text expression.
⦿ Called a text macro
⦿ Can be redefined
continueMsg TEXTEQU <"Do you wish to continue (Y/N)?">
rowSize = 5
.data
prompt1 BYTE continueMsg
count TEXTEQU %(rowSize * 2) ; evaluates the expression
move TEXTEQU <mov>
setupAL TEXTEQU <move al,count>
.code
setupAL ; generates: "mov al,10"
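The chain of text macros above can be traced with a small substitution loop. This Python sketch is only a model of the idea: `expand` is a hypothetical helper, names are matched case-sensitively (unlike the assembler's default), and the `%` expression is evaluated once when the table is built.

```python
import re

row_size = 5
macros = {
    "count": str(row_size * 2),    # %(rowSize * 2) is evaluated when defined
    "move": "mov",
    "setupAL": "move al,count",
}

def expand(text, macros):
    """Replace whole identifiers that name a macro until nothing changes."""
    identifier = re.compile(r"[A-Za-z_@$?][A-Za-z0-9_@$?]*")
    while True:
        expanded = identifier.sub(
            lambda m: macros.get(m.group(0), m.group(0)), text)
        if expanded == text:
            return expanded
        text = expanded

print(expand("setupAL", macros))   # mov al,10
```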