4.2 Assembly language
WHAT YOU SHOULD ALREADY KNOW
Try these three questions before you start the second part of this chapter.
1 a) Name two types of low-level programming language.
  b) Name the only type of programming language that a CPU recognises.
  c) Why do programmers find writing in this type of programming language difficult?
2 Find at least two different types of CPU and the language they use.
3 Look at your computer and/or laptop and/or phone and list the programming language(s) they use.
Key terms
Machine code – the programming language that the
CPU uses.
Instruction – a single operation performed by a CPU.
Assembly language – a low-level chip/machine specific
programming language that uses mnemonics.
Opcode – short for operation code, the part of a
machine code instruction that identifies the action the
CPU will perform.
Operand – the part of a machine code instruction that
identifies the data to be used by the CPU.
Source code – a computer program before translation
into machine code.
Assembler – a computer program that translates
programming code written in assembly language into
machine code. Assemblers can be one pass or two pass.
Instruction set – the complete set of machine code
instructions used by a CPU.
Object code – a computer program after translation into
machine code.
4.2.1 Assembly language and machine code
The only programming language that a CPU can use is machine code.
Every different type of computer/chip has its own set of machine code
instructions. A computer program stored in main memory is a series of
machine code instructions that the CPU can automatically carry out during
the fetch-execute cycle. Each machine code instruction performs one simple
task, for example, storing a value in a memory location at a specified
address. Machine code is binary, but it is sometimes displayed on a screen in
hexadecimal so that human programmers can read machine code
instructions more easily.
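To illustrate why hexadecimal is easier to read, the short sketch below converts one machine code word between binary and hexadecimal. The 16-bit word size and the function names are assumptions made for this illustration only.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a string of binary digits to four hexadecimal digits (16-bit word assumed)."""
    return format(int(bits, 2), "04X")


def hex_to_binary(hex_digits: str) -> str:
    """Convert hexadecimal digits to a 16-digit binary string (16-bit word assumed)."""
    return format(int(hex_digits, 16), "016b")


print(binary_to_hex("0000000101000000"))  # 0140
print(hex_to_binary("0140"))              # 0000000101000000
```

Four hexadecimal digits carry the same information as sixteen binary digits, which is why the hexadecimal form is quicker for a person to check.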
457591_04_CI_AS & A_Level_CS_107-135.indd 121 25/04/19 9:07 AM
Writing programs in machine code is a specialised task that is very
time-consuming and often error-prone, as the only way to test a program written
in machine code is to run it and see what happens. In order to shorten
the development time for writing computer programs, other programming
languages were developed, where the instructions were easier to learn
and understand. Any program not written in machine code needs to be
translated before the CPU can carry out the instructions, so language
translators were developed.
The first programming language to be developed was assembly language, which
is closely related to machine code and uses mnemonics instead of binary.
Assembly language mnemonics   Machine code hexadecimal   Machine code binary
LDD Total                     0140                       0000000101000000
ADD 20                        0214                       0000001000010100
STO Total                     0340                       0000001101000000
The structure of assembly language and machine code instructions is the same.
Each instruction has an opcode that identifies the operation to be carried out
by the CPU. Most instructions also have an operand that identifies the data to
be used by the opcode.
                              Opcode   Operand
Assembly language mnemonics   LDD      Total
Machine code hexadecimal      01       40
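The split between opcode and operand can be sketched in code. This example assumes, based only on the three example instructions in this section, a 16-bit instruction whose upper 8 bits hold the opcode and whose lower 8 bits hold the operand. The opcode values 01, 02 and 03 are read off those examples, and the names MNEMONICS and decode are hypothetical.

```python
# Opcode values inferred from the example instructions (an assumption,
# not a complete instruction set).
MNEMONICS = {0x01: "LDD", 0x02: "ADD", 0x03: "STO"}


def decode(instruction: int) -> tuple:
    """Split a 16-bit instruction into (mnemonic, operand)."""
    opcode = (instruction >> 8) & 0xFF   # upper 8 bits identify the operation
    operand = instruction & 0xFF         # lower 8 bits identify the data
    return MNEMONICS[opcode], operand


print(decode(0x0140))  # ('LDD', 64)
print(decode(0x0214))  # ('ADD', 20)
```

The operand is printed in denary, so hexadecimal 40 appears as 64 and hexadecimal 14 appears as 20.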
4.2.2 Stages of assembly
Before a program written in assembly language (source code) can be executed,
it needs to be translated into machine code. The translation is performed by a
program called an assembler. An assembler translates each assembly language
instruction into a machine code instruction. An assembler also checks the
syntax of the assembly language program to ensure that only opcodes from
the appropriate machine code instruction set are used. This speeds up the
development time, as some errors are identified during translation before the
program is executed.
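The opcode check described above might look something like the following sketch. The instruction set here is hypothetical and contains only the mnemonics that appear in this section; a real assembler would use the full instruction set for its CPU and would check operands as well.

```python
# Hypothetical instruction set: only the mnemonics used in this section.
INSTRUCTION_SET = {"LDD", "ADD", "STO", "CMP", "JPE", "JPN", "OUT"}


def check_opcodes(source_lines):
    """Return a list of (line_number, message) for opcodes not in the instruction set."""
    errors = []
    for number, line in enumerate(source_lines, start=1):
        fields = line.split()
        if fields and fields[0].endswith(":"):
            fields = fields[1:]          # skip a leading label such as "Found:"
        if fields and fields[0] not in INSTRUCTION_SET:
            errors.append((number, f"unknown opcode {fields[0]}"))
    return errors


program = ["LDD Total", "ADDD 20", "STO Total"]
print(check_opcodes(program))  # [(2, 'unknown opcode ADDD')]
```

The mistyped opcode on line 2 is reported during translation, before any of the program has been run.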
There are two types of assembler: single pass assemblers and two pass
assemblers. A single pass assembler puts the machine code instructions
straight into the computer memory to be executed. A two pass assembler
produces an object program in machine code that can be stored, loaded then
executed at a later stage. This requires the use of another program called a
loader. Two pass assemblers need to scan the source program twice, so they
can replace labels in the assembly program with memory addresses in the
machine code program.
Assembly language mnemonics   LDD Total   (Total is the label)
Machine code hexadecimal      0140        (40 is the memory address)
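Replacing a label with a memory address can be sketched as a simple lookup. The opcode values and the address of Total are inferred from the example instructions in this section; the 8-bit opcode and 8-bit operand layout and the names OPCODES, SYMBOL_TABLE and encode are assumptions for illustration.

```python
OPCODES = {"LDD": 0x01, "ADD": 0x02, "STO": 0x03}  # inferred from the examples
SYMBOL_TABLE = {"Total": 0x40}                      # label -> memory address


def encode(mnemonic: str, operand: str) -> str:
    """Return one instruction as four hexadecimal digits."""
    if operand in SYMBOL_TABLE:
        value = SYMBOL_TABLE[operand]   # replace the label with its address
    else:
        value = int(operand)            # otherwise treat the operand as a denary number
    return f"{OPCODES[mnemonic]:02X}{value:02X}"


print(encode("LDD", "Total"))  # 0140
print(encode("ADD", "20"))     # 0214
print(encode("STO", "Total"))  # 0340
```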
Pass 1
» Read the assembly language program one line at a time.
» Ignore anything not required, such as comments.
» Allocate a memory address for the line of code.
» Check the opcode is in the instruction set.
» Add any new labels to the symbol table with the address, if known.
» Place address of labelled instruction in the symbol table.
Pass 2
» Read the assembly language program one line at a time.
» Generate object code, including opcode and operand, from the symbol table
generated in Pass 1.
» Save or execute the program.
The second pass is required as some labels may be referred to before their
address is known. For example, Found is a forward reference for the JPN
instruction.
Label       Opcode   Operand
Notfound:   LDD      200
            CMP      #0
            JPN      Found
            JPE      Notfound
Found:      OUT
If the program is to be loaded at memory address 100, and each memory
location contains 16 bits, the symbol table for this small section of program
would look like this:
Label      Address
Notfound   100
Found      104
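The two passes can be sketched for this small program as follows. This is a minimal illustration, not a real assembler: it assumes each instruction occupies exactly one memory location, it keeps opcodes as mnemonics because their machine code values are not given here, and the names parse, pass_one and pass_two are hypothetical.

```python
SOURCE = """\
Notfound: LDD 200
          CMP #0
          JPN Found
          JPE Notfound
Found:    OUT
"""

INSTRUCTION_SET = {"LDD", "CMP", "JPN", "JPE", "OUT"}  # only what this program uses


def parse(line):
    """Split a source line into (label, opcode, operand); missing parts are None."""
    fields = line.split()
    label = None
    if fields[0].endswith(":"):
        label = fields.pop(0).rstrip(":")
    opcode = fields[0]
    operand = fields[1] if len(fields) > 1 else None
    return label, opcode, operand


def pass_one(lines, load_address):
    """Allocate one address per line, check opcodes and build the symbol table."""
    symbol_table = {}
    for address, line in enumerate(lines, start=load_address):
        label, opcode, _ = parse(line)
        if opcode not in INSTRUCTION_SET:
            raise ValueError(f"unknown opcode {opcode}")
        if label is not None:
            symbol_table[label] = address
    return symbol_table


def pass_two(lines, symbol_table):
    """Generate (opcode, operand) pairs with label operands replaced by addresses."""
    object_code = []
    for line in lines:
        _, opcode, operand = parse(line)
        object_code.append((opcode, symbol_table.get(operand, operand)))
    return object_code


lines = [line for line in SOURCE.splitlines() if line.strip()]
symbols = pass_one(lines, load_address=100)
print(symbols)                   # {'Notfound': 100, 'Found': 104}
print(pass_two(lines, symbols))
```

When pass 1 reaches JPN Found, the address of Found is not yet known. By the time pass 2 runs, the symbol table is complete, so the forward reference can be replaced with 104.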
123
457591_04_CI_AS & A_Level_CS_107-135.indd 123 25/04/19 9:07 AM