Structure of machine code
Machine code is the binary representation of an instruction.
- This is split into the opcode and the instruction’s data
Assembly code uses text / mnemonics to represent machine
code instructions.
Each processor architecture has a set of instructions that it can run,
known as the instruction set.
The basic structure of an instruction is:
Bits 1-6    Bits 7-16
Opcode      Data
- Instruction size can differ between architectures, and can also
differ between instructions within the same architecture.
- The above instruction is 16 bits in size.
- Every instruction in an instruction set has a unique binary
value called the opcode.
- The opcode is used to determine which instruction to run.
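The 16-bit layout above can be sketched in Python. This is only an illustration: it assumes bit 1 is the leftmost (most significant) bit, and the names encode and decode are made up for this example.

```python
OPCODE_BITS = 6    # bits 1-6 of the instruction
DATA_BITS = 10     # bits 7-16 of the instruction


def encode(opcode, data):
    """Pack an opcode and a data field into one 16-bit instruction."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in 6 bits")
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data does not fit in 10 bits")
    # Assumption: the opcode occupies the most significant bits.
    return (opcode << DATA_BITS) | data


def decode(instruction):
    """Split a 16-bit instruction back into (opcode, data)."""
    opcode = instruction >> DATA_BITS
    data = instruction & ((1 << DATA_BITS) - 1)
    return opcode, data


word = encode(0b000010, 5)
print(f"{word:016b}")   # 0000100000000101
print(decode(word))     # (2, 5)
```

Because only 10 bits are left for the data, the largest value that fits in this single-word layout is 1023.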
An example of an instruction set that uses opcodes is:
Opcode     Assembly mnemonic   Description
000 001    MOV                 Moves a value to a register
000 010    ADD                 Adds a value and stores the result in ACC
000 100    SUB                 Subtracts a value and stores the result in ACC
001 000    MUL                 Multiplies by a value and stores the result in ACC
- Each opcode is assigned a mnemonic in Assembly language, so
programmers can write instructions as readable text instead of
error-prone binary.
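The link between mnemonics and opcodes is essentially a lookup table. A minimal sketch using the four example opcodes (the names OPCODES, opcode_for and mnemonic_for are made up for illustration, not part of any real assembler):

```python
# Opcodes from the example instruction set.
OPCODES = {
    "MOV": 0b000001,
    "ADD": 0b000010,
    "SUB": 0b000100,
    "MUL": 0b001000,
}

# Reverse table, used to turn an opcode back into its mnemonic.
MNEMONICS = {code: name for name, code in OPCODES.items()}


def opcode_for(mnemonic):
    """Look up the opcode for a mnemonic (what an assembler does)."""
    return OPCODES[mnemonic.upper()]


def mnemonic_for(opcode):
    """Look up the mnemonic for an opcode (what a disassembler does)."""
    return MNEMONICS[opcode]


print(f"{opcode_for('ADD'):06b}")   # 000010
print(mnemonic_for(0b001000))       # MUL
```

Since every opcode is unique, the table works in both directions.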
A simple program to calculate (4 + 5) * 2 would be:
- MOV the value 0 into ACC (opcode: 000 001)
- ADD 4 to ACC (opcode: 000 010)
- ADD 5 to ACC (opcode: 000 010)
- MUL ACC by 2 (opcode: 001 000)
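The four steps can be traced with a small simulation. This is a simplified sketch: it assumes a single register (ACC), treats every value as immediate, and the function name run is made up for illustration.

```python
def run(program):
    """Execute a list of (mnemonic, value) pairs against one accumulator."""
    acc = 0
    for mnemonic, value in program:
        if mnemonic == "MOV":
            acc = value          # move the value into ACC
        elif mnemonic == "ADD":
            acc += value         # add the value to ACC
        elif mnemonic == "SUB":
            acc -= value         # subtract the value from ACC
        elif mnemonic == "MUL":
            acc *= value         # multiply ACC by the value
        else:
            raise ValueError(f"unknown instruction: {mnemonic}")
    return acc


program = [("MOV", 0), ("ADD", 4), ("ADD", 5), ("MUL", 2)]
result = run(program)
print(result)   # 18
```

ACC holds 0, then 4, then 9, and finally 18 after the multiplication.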
- Some instructions take more than one word.
- A word is a basic unit of data in computer architecture.
- In the example, the MOV instruction has two parts
(arguments).
- The first part is the register, like ACC (accumulator), which is
register 0.
- The second part stores the value to move into ACC.
- ADD and MUL store the value to add or multiply directly in the
instruction. This is called immediate addressing.
- Alternatively, the data can be stored separately.
- Using two words for one instruction leaves room for a full 16 bits
of data, supporting larger numbers.
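A two-word MOV could be sketched as below. The exact layout is an assumption made for illustration: the first word carries the opcode in its top 6 bits and the register number in the remaining 10 bits, and the second word carries the full 16-bit value. The name encode_mov is hypothetical.

```python
MOV = 0b000001   # opcode from the example instruction set
ACC = 0          # the accumulator is register 0


def encode_mov(register, value):
    """Encode MOV as two 16-bit words: [opcode + register, value]."""
    if not 0 <= register < (1 << 10):
        raise ValueError("register number does not fit in 10 bits")
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    # Assumed layout: opcode in the top 6 bits, register in the low 10 bits.
    first_word = (MOV << 10) | register
    second_word = value
    return [first_word, second_word]


words = encode_mov(ACC, 50000)
print([f"{w:016b}" for w in words])
```

The value 50000 is too large for a 10-bit data field (maximum 1023), but it fits in the second word.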