Microprocessor Systems
Overview
CPU overview
Cortex-M0+ Processor Core
Cortex-M0+ Processor Core Registers
Memory System and Addressing
Thumb Instruction Set
CPU OVERVIEW
Central Processing Unit (CPU)
CPU is the fundamental execution/processing unit
of the computer
CPU consists of ALU, Control Unit, and Registers
CPU is characterized by:
Clock frequency
Speed
Data bus width
Instruction set
Addressing capability
Addressing capacity
Internal Structure of CPU
Intel 4004
Introduced in 1971 as the first commercially available single-chip microprocessor
12 bit address bus
4 bit data bus
CPU Elements
Program Counter (PC)
Instruction Register (IR)
Instruction Decoder
Arithmetic and Logic Unit (ALU)
General Purpose Registers
Special Purpose Registers (SP, BP, IX, CCR, etc.)
Control Unit (CU)
Internal Structure of CPU
Example:
Motorola 6802
Registers in the Fetch Unit
Program Counter: holds the memory location
of the next instruction.
Instruction Register: holds the current
instruction being executed
Instruction Decoder
It decodes the instructions and generates the
control signals
Arithmetic Logic Unit (ALU)
ALU performs all arithmetic and logic
operations in a microprocessor
ALU has two inputs (A, B) for the operands
and one input for a control signal that
selects the operation
Operation and shift control bits determine
which type of operation to perform (F)
Outputs are the result of the operation (R) and
status information (D)
Status information indicates conditions such as:
Zero: set if all result lines have value 0
Overflow: signed integer overflow of add and
subtract functions
For unsigned integers, the overflow flag does not
provide any useful information
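The ALU description above can be sketched as a small function. This is only an illustrative model: the operation names ("ADD", "SUB", "AND", "OR", "SHL"), the function name `alu`, and the 8-bit default width are assumptions, not part of the slides.

```python
def alu(a, b, f, width=8):
    """Model of an ALU: operands a and b, control input f selects the operation.

    Returns (r, status), where r is the result and status holds the
    Zero (Z) and Overflow (V) flags.
    """
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    a &= mask
    b &= mask
    if f == "ADD":
        r = (a + b) & mask
        # Signed overflow: both operands share a sign that the result lacks.
        overflow = bool(~(a ^ b) & (a ^ r) & sign)
    elif f == "SUB":
        r = (a - b) & mask
        # Signed overflow: operands differ in sign and the result's sign differs from a.
        overflow = bool((a ^ b) & (a ^ r) & sign)
    elif f == "AND":
        r, overflow = a & b, False
    elif f == "OR":
        r, overflow = a | b, False
    elif f == "SHL":
        r, overflow = (a << 1) & mask, False
    else:
        raise ValueError(f"unknown operation {f!r}")
    return r, {"Z": r == 0, "V": overflow}


result, status = alu(5, 5, "SUB")
print(result, status)
```

Subtracting equal operands gives a zero result, so the Zero flag is set while Overflow stays clear.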
Registers
A register is a storage location in the CPU
It is used to hold data or a memory address during
the execution of an instruction
Because the register file is small and close to the
ALU, accessing data in registers is much faster than
accessing data in memory outside the CPU
The register file makes program execution more
efficient
The number of registers varies from computer to
computer
Condition Code Register or Flag Register
Depending on the outcomes of Arithmetic or Logical
operations, we can branch and jump
The eight-bit Condition Code Register (CCR) provides a
status report on the ALU's activity
Carry/Borrow
Half carry from bit 3 to bit 4
oVerflow
CCR also provides a status report after loading ACC
Zero
Negative
V Z N H C
Condition Code Register (CCR)
The CCR bits flag certain conditions resulting from the ALU
outcomes
Example:
A = 01001000, B = 01111001
A + B:
A      01001000
B    + 01111001
       --------
       11000001    V=1 Z=0 N=1 H=1 C=0
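The flag values in this example can be checked with a short sketch. The function name `add8_flags` and the dictionary layout are assumptions made for illustration; the flag definitions follow the descriptions given for the CCR above.

```python
def add8_flags(a, b):
    """Add two 8-bit values and derive the CCR flags for the result."""
    total = a + b
    r = total & 0xFF
    flags = {
        "C": int(total > 0xFF),                        # carry out of bit 7
        "H": int((a & 0x0F) + (b & 0x0F) > 0x0F),      # half carry from bit 3 to bit 4
        "V": int(bool(~(a ^ b) & (a ^ r) & 0x80)),     # signed overflow
        "Z": int(r == 0),                              # result is zero
        "N": int(bool(r & 0x80)),                      # result is negative (bit 7 set)
    }
    return r, flags


result, flags = add8_flags(0b01001000, 0b01111001)
print(f"{result:08b}", flags)
```

Two positive operands produce a result with bit 7 set, which is why both V and N are 1 while C stays 0.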
The Stack
A stack is a last-in-first-out data structure
A computer's stack works just like a real
stack, e.g., of books. If you have a stack of
books, you can put another book on top:
Before push         After push
BOOK3 (new)
                    BOOK3
BOOK2               BOOK2
BOOK1               BOOK1
This is called a push
All that happens is the stack gets one book
deeper, and the last book you added is on top
The Stack
You can also take a book off the top of the stack:
Before pop          After pop
                    BOOK3 (removed)
BOOK3
BOOK2               BOOK2
BOOK1               BOOK1
This is called a pop.
The stack gets one book shorter, and the book you get
from the top is the one you added, or pushed, most
recently
Because a pop gives you back the item you most recently
pushed, a stack is called a last-in-first-out, or LIFO,
structure
The Stack
A stack is a last-in-first-out data structure.
Successive states of a 4-location stack (the Data row shows the item pushed or popped between states):
Data           D4      D4      D5      D6      D6
Address
11     D3      D3      D3      D3      D3      D3
10     D2      D2      D2      D2      D2      D2
01     D1      D1      D1      D1      D6      D6
00             D4      D4      D5      D5      D5
SP     11      00      11      00      01      00
Stack Pointer
The stack is a way of using the memory.
All that's needed is some unused memory and an
index register, called the Stack Pointer (SP),
that always points to the next available (empty)
location above the current top of the stack
The stack grows toward lower addresses
Address   Data   SP after push
                 $A000 (initial)
$A000     D0     $9FFF
$9FFF     D1     $9FFE
$9FFE     D2     $9FFD
$9FFD     D3     $9FFC
$9FFC     D4     $9FFB
$9FFB
$9FFA
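The stack pointer behaviour described above can be sketched as follows. The class name `Stack` and the dictionary used as memory are assumptions for illustration; the starting address $A000 is taken from the diagram.

```python
class Stack:
    """Descending stack in which SP points to the next empty location."""

    def __init__(self, base=0xA000):
        self.memory = {}   # address -> stored value
        self.sp = base     # next available (empty) location

    def push(self, value):
        self.memory[self.sp] = value   # store at the empty location
        self.sp -= 1                   # then move SP to the next lower address

    def pop(self):
        self.sp += 1                   # move SP back to the most recent item
        return self.memory[self.sp]    # read it; the old copy stays in memory


stack = Stack()
for item in ["D0", "D1", "D2", "D3", "D4"]:
    stack.push(item)
print(hex(stack.sp))
print(stack.pop(), hex(stack.sp))
```

After five pushes SP has moved from $A000 down to $9FFB, and a pop returns the most recently pushed item, D4.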
Control Unit
The control unit is a synchronous sequential
logic circuit that sends control signals to the data
processing unit, memory and other parts of the
system
The signals from the control unit tell the data
processing unit to manipulate data according to
the algorithm built into the sequential logic circuit
The control unit is instruction controlled;
therefore it can carry out more than one algorithm,
depending on its design (programmable)
Typical control units recognize several hundred
different instruction codes
System Clock
In order to regulate when the control unit issues its
control signals, computers use a system clock
System clock generates regular pulses to synchronize
all system events and determine the speed at which
processing can occur
Each fetch-execute instruction cycle is divided into
states, which are one clock pulse long
Most instructions require multiple steps, and so require
several clock pulses to complete
(multi-cycle processor design)
Some individual steps (e.g. a memory access) take
longer & may require additional clock pulses to
complete – these clock cycles spent waiting are called
wait states
System Clock
The clock speed of a CPU sets the upper limit on how
often a new instruction can be executed, and is
measured in MHz or GHz
For example: 1.7 GHz means that a computer
could execute up to 1,700,000,000 instructions per
second, if it completes one instruction per clock cycle
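The relation between clock frequency and instruction rate can be written as a one-line calculation. The function name and the `cycles_per_instruction` parameter are assumptions for illustration; a multi-cycle design as described earlier would use a value greater than 1.

```python
def instructions_per_second(clock_hz, cycles_per_instruction=1):
    """Instruction rate for a CPU that needs a fixed number of cycles per instruction."""
    return clock_hz / cycles_per_instruction


# One instruction per cycle at 1.7 GHz.
print(instructions_per_second(1_700_000_000))
# The same clock with four cycles per instruction.
print(instructions_per_second(1_700_000_000, cycles_per_instruction=4))
```

With four clock pulses per instruction the same 1.7 GHz clock yields only a quarter of the instruction rate.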
System Clock
However, all recent microprocessors overlap the fetching,
decoding and execution of a number of instructions at the
same time – this is called pipelining
Therefore, clock speed is not necessarily an accurate
measure of performance, and other measurements are
required
Single Word,
1-Address Instruction Format
Single word instruction example for 8-bit words
D7 D6 D5 D4 D3 | D2 D1 D0
Op-code        | Address
32 possible op-codes
8 possible addresses
Plausible for register operations
Example for 16-bit words
D15 ... D5 | D4 ... D0
Op-code    | Address
2048 possible op-codes
32 possible addresses
More operation code and addressing possibilities for
longer memory words
Single Word,
2-Address Instruction Format
2-Address instruction in an 8-bit word
D7 D6   | D5 D4 D3  | D2 D1 D0
Op-code | Address 1 | Address 2
4 op-codes
8 possible values for Address 1
8 possible values for Address 2
Example for 16-bit words
D15 ... D12 | D11 ... D6 | D5 ... D0
Op-code     | Address 1  | Address 2
16 op-codes
64 possible values for Address 1
64 possible values for Address 2
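Packing and unpacking the 16-bit, 2-address format can be sketched with shifts and masks. The function names `encode` and `decode` are assumptions; the field widths (4-bit op-code, two 6-bit addresses) are those shown above.

```python
def encode(opcode, addr1, addr2):
    """Pack op-code (D15-D12), Address 1 (D11-D6) and Address 2 (D5-D0) into one word."""
    if not 0 <= opcode < 16:
        raise ValueError("op-code must fit in 4 bits")
    if not (0 <= addr1 < 64 and 0 <= addr2 < 64):
        raise ValueError("addresses must fit in 6 bits")
    return (opcode << 12) | (addr1 << 6) | addr2


def decode(word):
    """Split a 16-bit instruction word back into its three fields."""
    return (word >> 12) & 0xF, (word >> 6) & 0x3F, word & 0x3F


word = encode(0b1010, 5, 63)
print(f"{word:016b}", decode(word))
```

The range checks make the trade-off visible: every bit given to an address field is a bit taken from the op-code field.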
Multiple Words,
1-Address Instruction Format
1-Address instruction in multiple words
1st byte: Operation code (Op-code)
2nd byte: Upper half of the address
3rd byte: Lower half of the address
256 instructions
65536 addresses
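A three-word instruction of this kind can be assembled and taken apart as below. This is a sketch only: the function names and the op-code value 0x01 are arbitrary placeholders, not taken from any real instruction set.

```python
def encode_instruction(opcode, address):
    """Build a 3-byte instruction: op-code, upper address half, lower address half."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("op-code must fit in 8 bits")
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return bytes([opcode, (address >> 8) & 0xFF, address & 0xFF])


def decode_instruction(data):
    """Recover (op-code, address) from a 3-byte instruction."""
    opcode, high, low = data
    return opcode, (high << 8) | low


encoded = encode_instruction(0x01, 0x0080)   # 0x01 is a placeholder op-code
print(encoded.hex(), decode_instruction(encoded))
```

Spreading the instruction over three words allows 256 op-codes and a full 16-bit address, at the cost of three memory fetches per instruction.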
Architectures and Memory Speed
Load/Store Architecture
Developed to simplify CPU design and improve performance
Memory wall: CPU speeds keep increasing faster than memory speeds
Memory accesses slow down the CPU and limit compiler optimizations
Change instruction set to make most instructions independent of
memory
Data processing instructions can access registers only
Load data into the registers
Process the data
Store results back into memory
More effective when more registers are available
Register/Memory Architecture
Data processing instructions can access memory or registers
Memory wall is not very high at lower CPU speeds (e.g. under
50 MHz)
Instruction Sets
Instruction sets are organized according to the architecture
CISC (Complex Instruction Set Computer): Contains a large
number of instructions
More complex in hardware
Examples:
MC680x, MC68K, Intel 40xx, Intel 80xx,
Intel x86 (32-bit and 64-bit laptop, desktop, server systems),
IBM System-Z Mainframes and many other supercomputers
RISC (Reduced Instruction Set Computer): Contains fewer but
effective instructions
More complex in software
Examples:
ARM (iPad, iPhone, iPod, Blackberry, Android phones)
IBM PowerPC (Wii, Xbox 360, Sony's PS3)
Oracle (Sun) SPARC
Embedded applications
Single board computers
Instruction Set Differences
Consider A = B + C in a high level language
It might be translated into one instruction with
a CISC architecture
add mem(B), mem(C), mem(A)
Or four with a RISC architecture
load R1, B
load R2, C
add R3, R2, R1
store A, R3
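The four-instruction RISC sequence can be traced with a tiny interpreter. This is a hypothetical model: the tuple encoding of instructions, the `run` function, and the initial values of B and C are assumptions chosen for illustration.

```python
def run(program, memory):
    """Execute load/add/store instructions on a register file and a memory dict."""
    regs = {}
    for op, *args in program:
        if op == "load":        # load Rd, addr : register <- memory
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "add":       # add Rd, Rs1, Rs2 : registers only
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "store":     # store addr, Rs : memory <- register
            addr, rs = args
            memory[addr] = regs[rs]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return memory


memory = {"A": 0, "B": 2, "C": 3}
program = [
    ("load", "R1", "B"),
    ("load", "R2", "C"),
    ("add", "R3", "R2", "R1"),
    ("store", "A", "R3"),
]
run(program, memory)
print(memory["A"])
```

Only the load and store steps touch memory; the add works purely on registers, which is the defining property of a load/store architecture.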
Instruction Set Completeness
A computer should have a set of instructions so that
the user can construct machine language programs to
evaluate any function that is known to be computable
A computer design should provide a sufficient number of
instructions in each of four categories
Transfer instructions: Data transfers between registers, or
between registers and main memory
Load, Store, Transfer, Swap…
Arithmetic, logic, and shift instructions:
Add, Complement, Increment, circulate, shift, AND, Clear, Set…
Program control Instructions and instructions to check
status conditions: Program sequencing and control
Compare, Branch, Jump, Go to and Return from Subroutine,
Handle Interrupt service, Allow or Not-Allow interrupt requests
Input/Output Instructions:
Input data, Output data, Control peripherals, Status
Machine and Assembly Language Example:
Assembly language template
{Label} Operation, Operand : {Explanation}
START  LDAA, <$0080> : Load ACCA with the contents of memory address <$0080>
       ADDA, <$0081> : Add the contents of memory address <$0081> to ACCA
       STAA, <$0082> : Store the contents of ACCA to memory address <$0082>
Address  Content (machine language)
0010     00 20 00 80
0014     03 20 00 81
0018     01 20 00 82
CORTEX-M0+ CPU CORE
Microcontroller vs. Microprocessor
Both have a CPU core to execute instructions
Microcontroller has peripherals for embedded
interfacing and control
Analog
Non-logic level signals
Timing
Clock generators
Communications
point to point
network
Reliability and safety
[Figure: microcontroller block diagram with Arm Cortex-M0+ core, system, memory and memory interface, debug, interrupt controller, micro trace buffer, analog interfaces, timers, communication interfaces, clocks, security and integrity, human-machine interface (HMI)]
Cortex-M0+ Core and Processor
NXP Kinetis KL25Z family microcontroller
*NXP KL25Z128VLK4
Development Board
FRDM-KL25Z development board
Simplified structure of the CPU core
Register and ALU data flow, identified with single lines.
Memory addresses and data, identified with double lines.
Control and selection signals, identified with dotted lines.
ARM Processor Core Registers
ARM Processor Core Registers (32 bits each)
R0-R12 - General purpose registers for data
processing
SP - Stack pointer (R13)
Can refer to one of two SPs
Main Stack Pointer (MSP)
Process Stack Pointer (PSP)
Uses MSP initially, and whenever in Handler mode
When in Thread mode, can select either MSP or PSP
using SPSEL flag in CONTROL register.
LR - Link Register (R14)
Holds the return address when a subroutine is called with
the Branch and Link instruction (BL)
PC - program counter (R15)
Operating Modes
[State diagram: Reset enters Thread mode (MSP or PSP); starting exception processing switches to Handler mode (MSP); when exception processing is completed, the core returns to Thread mode]
Which SP is active depends on operating mode, and
SPSEL (CONTROL register bit 1)
SPSEL == 0: MSP
SPSEL == 1: PSP
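The stack pointer selection rule can be expressed as a small function. The function name and the string mode names are assumptions for illustration; the rule itself follows the slides: Handler mode always uses MSP, Thread mode follows SPSEL.

```python
def active_stack_pointer(mode, spsel):
    """Return which stack pointer is active for a given mode and SPSEL bit."""
    if mode == "handler":
        return "MSP"                        # Handler mode always uses the main stack
    if mode == "thread":
        return "PSP" if spsel else "MSP"    # Thread mode follows CONTROL bit 1
    raise ValueError(f"unknown mode {mode!r}")


for mode in ("thread", "handler"):
    for spsel in (0, 1):
        print(mode, spsel, active_stack_pointer(mode, spsel))
```

Only the combination of Thread mode with SPSEL set to 1 selects PSP.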
ARM Processor Core Registers
Program Status Register (PSR) provides three views of the same
register
Application PSR (APSR)
Condition code flag bits Negative, Zero, oVerflow, Carry
Interrupt PSR (IPSR)
Holds exception number of currently executing ISR
Execution PSR (EPSR)
Holds the Thumb state bit
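The three views can be illustrated by decoding a combined 32-bit PSR value. The bit positions used here (N, Z, C, V in bits 31 to 28, the Thumb bit T in bit 24, the exception number in bits 5 to 0) are taken from the ARMv6-M architecture rather than from the slides, and the function name `decode_psr` is an assumption.

```python
def decode_psr(psr):
    """Split a 32-bit PSR value into its APSR, EPSR and IPSR fields."""
    return {
        "N": (psr >> 31) & 1,               # APSR: negative
        "Z": (psr >> 30) & 1,               # APSR: zero
        "C": (psr >> 29) & 1,               # APSR: carry
        "V": (psr >> 28) & 1,               # APSR: overflow
        "T": (psr >> 24) & 1,               # EPSR: Thumb state
        "exception_number": psr & 0x3F,     # IPSR: 0 means Thread mode
    }


print(decode_psr(0x61000000))
```

Each view simply masks a different group of bits out of the same register value.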
ARM Processor Core Registers
PRIMASK - Exception mask register
Bit 0: PM Flag
Set to 1 to prevent activation of all exceptions with configurable
priority
Access using CPS, MSR and MRS instructions
Used to prevent data races in code that needs
atomicity
CONTROL
Bit 1: SPSEL flag
Selects SP when in thread mode: MSP (0) or PSP (1)
Bit 0: nPRIV flag
Defines whether thread mode is privileged (0) or unprivileged (1)
In an OS environment:
Threads use PSP
OS and exception handlers (ISRs) use MSP