Department of Electrical and Computer Engineering
College of Engineering and Technology
Jimma University
Machine Language:
Set of fundamental instructions the machine can
execute
Expressed as a pattern of 1’s and 0’s
Assembly Language:
Alphanumeric equivalent of machine language
Mnemonics more human-oriented than 1’s and 0’s
Assembler:
Computer program that transliterates (one-to-
one mapping) assembly to machine language
Computer’s native language is machine/assembly
language
Learning assembly language programming
helps in understanding the operation of the
microprocessor
Faster and shorter programs.
Compilers do not always generate optimum code.
Small controllers embedded in many
products
have specialized functions and
rely so heavily on input/output functionality that
high-level languages (HLLs) are inappropriate for product development.
Game developers
Assembly language program must be translated to machine language for
the target processor.
The following diagram describes the steps from creating a source program
through executing the compiled program.
If the source code is modified, Steps 2 through 4 must be repeated.
Step 1: text editor -> Source File
Step 2: assembler -> Object File (also produces a Listing File)
Step 3: linker (combines the Object File with the Link Library) -> Executable File (also produces a Map File)
Step 4: OS loader -> Output
MASM: Microsoft Macro Assembler
TASM: Borland Turbo Assembler
NASM: Netwide Assembler (free; formerly distributed under the Library General Public License, LGPL)
Others: Flat Assembler, SpAssembler, etc.
.model small
.stack 100h
.data
message db 'Hello World', 13, 10, '$'
.code
start:
mov ax, @data
mov ds, ax
mov dx, offset message ; copy address of message to dx
mov ah, 9h ; string output
int 21h ; display string
mov ax, 4c00h
int 21h
end start
TITLE PRGM1
.MODEL SMALL
.STACK 100H
.DATA
A DW 2
B DW 5
SUM DW ?
.CODE
MAIN PROC
; initialize DS
MOV AX, @DATA
MOV DS, AX
; add the numbers
MOV AX, A
ADD AX, B
MOV SUM, AX
; exit to DOS
MOV AX, 4C00H
INT 21H
MAIN ENDP
END MAIN
An instruction is a statement that becomes
executable when a program is assembled.
Assembled into machine code by assembler
An instruction contains:
Label (optional)
Mnemonic (required)
Operand (depends on the instruction)
Comment (optional)
Basic syntax
[label:] mnemonic [operands] [ ;comment]
Labels act as place markers
mark the address (offset) of code and data
Follow identifier rules
Data label
must be unique
example: myArray (not followed by colon)
count DWORD 100
Code label
target of jump and loop instructions
example: L1: (followed by colon)
target:
Mov ax, bx
…
jmp target
Instruction Mnemonics
memory aid
examples: MOV, ADD, SUB, MUL, INC, DEC
Operands
constant 96
constant expression 2+4
register ax
memory (data label) count
Constants and constant expressions are
often called immediate values
•STC instruction
•stc ; set Carry flag
•INC instruction
•inc ax ; add 1 to AX
•MOV instruction
•mov count, bx ; move BX to count
Comments are good!
explain the program's purpose
when it was written, and by
whom
revision information
tricky coding techniques
application-specific explanations
Single-line comments
begin with semicolon (;)
.model small
.stack 100h
.data
message db 'Hello World', 13, 10, '$'
.code
start:
Mov ax, @data
Mov ds, ax
Mov dx, offset message ; copy address of message to dx
Mov ah, 9h ; string output
int 21h ; display string
Mov ax, 4c00h
int 21h
end start
Directives are commands that are recognized and acted
upon by the assembler
Not part of the Intel instruction set
Used to declare code, data areas, select
memory model, declare procedures, etc.
not case sensitive
Different assemblers have different
directives
NASM not the same as MASM, for example
myVar DWORD 26 ; DWORD directive, set aside
; enough space for double word
mov eax, myVar ; MOV instruction (EAX matches the 32-bit size of myVar)
EQU pseudo-op used to assign a name to
constant.
Makes assembly language easier to
understand.
No memory allocated for EQU names.
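LF EQU 0AH
MOV DL, 0AH ; these two instructions
MOV DL, LF ; assemble to the same machine code
PROMPT EQU "Type your name"
MSG DB "Type your name" ; this definition and the next one
MSG DB PROMPT ; are equivalent (use one or the other)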
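As an illustration of the points above, the following is a minimal sketch of a complete program that uses EQU names inside a data definition. It assumes MASM/TASM syntax and a 16-bit DOS environment; the names CR, LF and MSG and the message text are chosen here only for the example.

```asm
; Sketch only: assumes MASM/TASM syntax under 16-bit DOS.
.MODEL SMALL
.STACK 100H
CR      EQU 0DH             ; carriage return (no memory allocated)
LF      EQU 0AH             ; line feed (no memory allocated)
.DATA
MSG     DB 'Line one', CR, LF, 'Line two', CR, LF, '$'
.CODE
MAIN PROC
        MOV AX, @DATA       ; initialize DS
        MOV DS, AX
        MOV AH, 9           ; DOS function 9: display '$'-terminated string
        MOV DX, OFFSET MSG
        INT 21H
        MOV AX, 4C00H       ; exit to DOS
        INT 21H
MAIN ENDP
END MAIN
```

The assembler substitutes 0DH and 0AH wherever CR and LF appear, so only the bytes of MSG occupy memory.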
The DUP operator is used to define arrays whose elements share
a common initial value.
It has the form: repeat_count DUP (value)
Numbers DB 100 DUP(0)
Allocates an array of 100 bytes, each initialized to 0.
Names DW 200 DUP(?)
Allocates an array of 200 uninitialized words.
Two equivalent definitions
Line DB 5, 4, 3 DUP(2, 3 DUP(0), 1)
Line DB 5, 4, 2, 0, 0, 0, 1, 2, 0, 0, 0, 1, 2, 0, 0, 0, 1
Use DUP to allocate (create space for) an array
or string. Syntax: counter DUP ( argument )
Counter and argument must be constants or
constant expressions
var1 BYTE 20 DUP(0) ; 20 bytes, all equal to zero
var2 BYTE 20 DUP(?) ; 20 bytes, uninitialized
var3 BYTE 4 DUP("STACK") ; 20 bytes: "STACKSTACKSTACKSTACK"
var4 BYTE 10,3 DUP(0),20 ; 5 bytes
Memory layout of var4: 10, 0, 0, 0, 20
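A short sketch of how a DUP-defined array might be used: the program below sums an array of five words. It assumes MASM/TASM syntax under 16-bit DOS; the names COUNT, VALUES, TOTAL and NEXT are hypothetical, and the LOOP instruction is used only to walk the array.

```asm
; Sketch only: assumes MASM/TASM syntax under 16-bit DOS.
.MODEL SMALL
.STACK 100H
COUNT   EQU 5                   ; number of array elements
.DATA
VALUES  DW COUNT DUP(3)         ; five words, each initialized to 3
TOTAL   DW ?                    ; receives the sum
.CODE
MAIN PROC
        MOV AX, @DATA           ; initialize DS
        MOV DS, AX
        MOV CX, COUNT           ; loop counter
        MOV BX, OFFSET VALUES   ; BX points to the first element
        XOR AX, AX              ; running sum = 0
NEXT:   ADD AX, [BX]            ; add the current word
        ADD BX, 2               ; advance to the next word (2 bytes each)
        LOOP NEXT               ; decrement CX, repeat while CX is not 0
        MOV TOTAL, AX           ; five elements of 3 give TOTAL = 15
        MOV AX, 4C00H           ; exit to DOS
        INT 21H
MAIN ENDP
END MAIN
```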
The PTR operator is used to override the declared type of an address
expression.
Examples:
MOV [BX], 1 ; illegal: the operand size is ambiguous
MOV BYTE PTR [BX], 1 ; legal
MOV WORD PTR [BX], 1 ; legal
Let J be defined as follows
J DW 10
MOV AL, J ; illegal: AL is a byte, J is a word
MOV AL, BYTE PTR J ; legal
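The effect of PTR can be traced with a small sketch. It assumes MASM/TASM syntax under 16-bit DOS, and the initial value 1234H is chosen here only so that the two bytes of the word are easy to tell apart.

```asm
; Sketch only: assumes MASM/TASM syntax under 16-bit DOS.
.MODEL SMALL
.STACK 100H
.DATA
J       DW 1234H                ; stored in memory as 34H, 12H (low byte first)
.CODE
MAIN PROC
        MOV AX, @DATA           ; initialize DS
        MOV DS, AX
        MOV AL, BYTE PTR J      ; AL = 34H, the low-order byte of J
        MOV AH, BYTE PTR [J+1]  ; AH = 12H, the high-order byte of J
        MOV BX, OFFSET J
        MOV BYTE PTR [BX], 1    ; stores one byte:  J becomes 1201H
        MOV WORD PTR [BX], 1    ; stores two bytes: J becomes 0001H
        MOV AX, 4C00H           ; exit to DOS
        INT 21H
MAIN ENDP
END MAIN
```

Without BYTE PTR or WORD PTR, the assembler cannot tell from [BX] alone how many bytes the constant 1 should occupy.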
Identifiers
Programmer-chosen name to identify a variable, constant,
procedure, or code label
1-247 characters, including digits
not case sensitive
first character must be a letter, _, @, ?, or $
Subsequent characters may also be digits
Cannot be the same as a reserved word
@ is used by the assembler as a prefix for predefined symbols,
so avoid it in identifiers
Examples
Var1, Count, $first, _main, MAX, open_file, myFile, xVal,
_12345
Reserved words cannot be used as identifiers
Instruction mnemonics
MOV, ADD, MUL, …
Register names
Directives – tell MASM how to assemble programs
Type attributes – provide size and usage information
BYTE, WORD
Operators – used in constant expressions
predefined symbols – @data
See MASM reference in Appendix A
A data definition statement sets aside storage in memory for a
variable.
May optionally assign a name (label) to the data
Syntax:
[name] directive initializer [,initializer] . . .
value1 BYTE 10
All initializers become binary data in memory
Each variable has a type and is assigned a memory
address.
Data-defining pseudo-ops
DB define byte
DW define word
DD define double word (two consecutive
words)
DQ define quad word (four consecutive words)
DT define ten bytes (five consecutive words)
Each pseudo-op can be used to define one or
more data items of given type.
Assembler directive format for defining a byte
variable
name DB initial_value
A question mark ("?") in place of the initial value leaves
the variable uninitialized
I DB 4 ; define variable I with initial value 4
J DB ? ; define variable J, uninitialized
Name DB "Course" ; allocates 6 bytes for Name
K DB 5, 3, -1 ; allocates 3 bytes; memory at K: 05, 03, FF
Examples that use multiple initializers:
list1 BYTE 10,20,30,40
list2 BYTE 10,20,30,40
      BYTE 50,60,70,80
      BYTE 81,82,83,84
list3 BYTE ?,32,41h,00100010b
list4 BYTE 0Ah,20h,'A',22h
Memory layout:
Offset  Value  Label
0000    10     list1
0001    20
0002    30
0003    40
0004    10     list2
0005    20
0006    30
0007    40
0008    50
0009    60
000A    70
000B    80
000C    81
000D    82
000E    83
000F    84
0010           list3
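Assembler directive format for defining a word
variable
name DW initial_value
Each word is stored with its low-order byte first:
I DW 4 ; memory at I: 04 00
J DW -2 ; memory at J: FE FF
K DW 1ABCH ; memory at K: BC 1A
L DW "01" ; memory at L: 31 30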
Enclose character in single or double quotes
'A', "x"
ASCII character = 1 byte
Enclose strings in single or double quotes
"ABC"
'xyz'
Each character occupies a single byte
Embedded quotes:
'Say "Goodnight," Gracie'
A string is implemented as an array of characters
For convenience, it is usually enclosed in quotation marks
It often will be null-terminated (ending with ,0)
Examples:
str1 BYTE "Enter your name",0
str2 BYTE 'Error: halting program',0
str3 BYTE 'A','E','I','O','U'
greeting BYTE "Welcome to the Encryption Demo program "
End-of-line character sequence:
0Dh = carriage return
0Ah = line feed
str1 BYTE "Enter your name: ",0Dh,0Ah
BYTE "Enter your address: ",0
newLine BYTE 0Dh,0Ah,0
Idea: Define all strings used by your program in the same area of
the data segment.
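A null-terminated string cannot be printed with DOS function 9, which expects a '$' terminator, so one option is to display it one character at a time. The sketch below assumes MASM/TASM syntax under 16-bit DOS and relies on DOS function 2 (single-character output through INT 21H); the labels NEXTCH and DONE are hypothetical.

```asm
; Sketch only: assumes MASM/TASM syntax under 16-bit DOS.
.MODEL SMALL
.STACK 100H
.DATA
STR1    DB 'Enter your name: ', 0   ; null-terminated string
.CODE
MAIN PROC
        MOV AX, @DATA           ; initialize DS
        MOV DS, AX
        MOV SI, OFFSET STR1     ; SI points to the first character
NEXTCH: MOV DL, [SI]            ; fetch the current character
        CMP DL, 0               ; reached the null terminator?
        JE  DONE                ; yes: stop
        MOV AH, 2               ; DOS function 2: display character in DL
        INT 21H
        INC SI                  ; move to the next character
        JMP NEXTCH
DONE:   MOV AX, 4C00H           ; exit to DOS
        INT 21H
MAIN ENDP
END MAIN
```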
CPU communicates with peripherals through I/O
registers called I/O ports.
Two instructions access I/O ports directly: IN and
OUT.
Used when fast I/O is essential, e.g. games.
Most programs do not use IN/OUT instructions
port addresses vary among computer models
much easier to program I/O with service routines
provided by manufacturer
Two categories of I/O service routines
Basic input/output system (BIOS) routines
Disk operating system (DOS) routines
DOS and BIOS routines invoked by INT (interrupt)
instruction.
BIOS: a set of programs always present in the system
BIOS routines are the most primitive in a computer
They talk directly to system hardware
Hardware specific - must know exact port
address and control bit configuration for I/O
devices
BIOS supplied by computer manufacturer and
resides in ROM
Provides services to O.S. or application
Enables O.S. to be written to a standard
interface
System Hardware
Non-standard interface
BIOS
Standard interface
Operating System
Standard interface
Application Program
INT 21H is used to invoke a large number of
DOS functions.
The function to call is specified by putting its
number in the AH register.
AH=1 single-key input with echo
AH=2 single-character output
AH=9 character string output
AH=8 single-key input without echo
AH=0Ah character string input
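Function 0AH is the only entry in this list that needs a specially laid out buffer, so a sketch may help. It assumes MASM/TASM syntax under 16-bit DOS; the names PROMPT and INBUF and the 20-character limit are hypothetical choices for the example.

```asm
; Sketch only: assumes MASM/TASM syntax under 16-bit DOS.
.MODEL SMALL
.STACK 100H
.DATA
PROMPT  DB 'Type your name: $'
; Buffer layout expected by function 0AH:
;   byte 0   = maximum number of characters to read (including the final CR)
;   byte 1   = filled in by DOS with the number of characters actually read
;   byte 2.. = the characters typed, followed by CR (0DH)
INBUF   DB 20, ?, 20 DUP(?)
.CODE
MAIN PROC
        MOV AX, @DATA           ; initialize DS
        MOV DS, AX
        MOV AH, 9               ; display the prompt
        MOV DX, OFFSET PROMPT
        INT 21H
        MOV AH, 0AH             ; character string input
        MOV DX, OFFSET INBUF    ; DS:DX = address of the input buffer
        INT 21H
        MOV AX, 4C00H           ; exit to DOS
        INT 21H
MAIN ENDP
END MAIN
```

After the call, the count in the second byte of INBUF tells the program how many characters were typed, not counting the final carriage return.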
Input: AH=2, DL= ASCII code of character to be
output
Output: AL=ASCII code of character
To display a character
MOV AH, 2
MOV DL, '?' ; displaying character '?'
INT 21H
To read a character and display it
MOV AH, 1
INT 21H
MOV AH, 2
MOV DL, AL
INT 21H
Input: AH=1
Output: AL = ASCII code if a character key is
pressed, otherwise 0.
To input character with echo:
MOV AH, 1
INT 21H ; read character will be in AL register
To input a character without echo:
MOV AH, 8
INT 21H ; read character will be in AL register
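Input without echo is useful when the program wants to decide for itself what appears on the screen. The sketch below reads one key silently and displays '*' in its place, in the style of a password prompt. It assumes MASM/TASM syntax under 16-bit DOS; saving the key in BL is an arbitrary choice made for the example.

```asm
; Sketch only: assumes MASM/TASM syntax under 16-bit DOS.
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
        MOV AH, 8           ; read a key without echo
        INT 21H             ; character is returned in AL
        MOV BL, AL          ; keep the typed character for later use
        MOV AH, 2           ; display a character
        MOV DL, '*'         ; show '*' instead of the key that was typed
        INT 21H
        MOV AX, 4C00H       ; exit to DOS
        INT 21H
MAIN ENDP
END MAIN
```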
.model small
.stack 100h
.data
message db 'Hello World', 13, 10, '$'
.code
start:
Mov ax, @data
Mov ds, ax
Mov dx, offset message ; copy address of message to dx
Mov ah, 9h ; string output
int 21h ; display string
Mov ax, 4c00h
int 21h
end start
Input: AH=9, DX= offset address of a string.
String must end with a ‘$’ character.
To display the message Hello!
MSG DB "Hello!$"
MOV AH, 9
MOV DX, offset MSG
INT 21H
OFFSET operator returns the address of a
variable
The instruction LEA (load effective address)
loads destination with address of source
LEA DX, MSG
Prompt the user to enter a lowercase letter,
and on next line displays another message
with letter in uppercase.
Enter a lowercase letter: a
In upper case it is: A
.MODEL SMALL
.STACK 100H
.DATA
CR EQU 0DH
LF EQU 0AH
MSG1 DB 'Enter a lower case letter: $'
MSG2 DB CR, LF, 'In upper case it is: ' ; continues directly into CHAR
CHAR DB ?, '$'
.CODE
.STARTUP ; initialize data segment
LEA DX, MSG1 ; display first message
MOV AH, 9
INT 21H
MOV AH, 1 ; read character
INT 21H
SUB AL, 20H ; convert it to upper case
MOV CHAR, AL ; and store it
LEA DX, MSG2 ; display second message and
MOV AH, 9 ; uppercase letter
INT 21H
.EXIT ; return to DOS
END
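.DATA
String1 DB "Hello"
String2 DB 5 DUP(?)
.CODE
MOV AX, @DATA
MOV DS, AX ; DS:SI addresses the source string
MOV ES, AX ; ES:DI addresses the destination string
CLD ; clear direction flag: copy forward
MOV CX, 5 ; number of bytes to copy
LEA SI, String1
LEA DI, String2
REP MOVSB ; copy CX bytes from DS:SI to ES:DI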