ARM7

Uploaded by

sumannelaturi
What Is ARM?

• Advanced RISC Machine

• First RISC microprocessor for commercial use

• Market leader for low-power and cost-sensitive embedded applications
ARM7TDMI
TDMI = (?)
– T: Thumb instruction set
– D: on-chip Debug support (JTAG)
– M: enhanced Multiplier (hardware)
– I: EmbeddedICE logic (ICEBreaker)
ARM7/ARM9 Architecture Feature Highlights
• 32/16-bit RISC architecture (ARM v4T)
• 32-bit ARM instruction set for maximum performance and flexibility
• 16-bit Thumb instruction set for increased code density
• Unified bus interface: a 32-bit data bus carries both instructions and data
• 8-, 16-, and 32-bit data types
• Three-stage pipeline (ARM7)
• 4 GB linear address space
• 32-bit ALU and high-performance multiplier
• 37 registers, each 32 bits wide
• Very small die size and low power consumption
• Fully static operation
• Coprocessor interface
• Extensive debug facilities:
  – EmbeddedICE-RT real-time debug unit
  – On-chip JTAG interface unit
• Interface for direct connection to an Embedded Trace Macrocell (ETM)
• Cached (depending on the implementation)
• Von Neumann-type bus structure (ARM7), Harvard (ARM9)
• 7 modes of operation (usr, fiq, irq, svc, abt, sys, und)
• Simple structure -> reasonably good speed / power consumption ratio
• Very low power consumption: industry leader in MIPS/Watt
Differences between RISC and CISC

CISC | RISC
Variable-size instructions with many formats | Fixed-size instructions (32 bit) with few formats
Multi-clock complex instructions | Single-clock reduced instructions
Memory-to-memory load and store instructions | Register-to-register load and store instructions
Small code size, more cycles per instruction | Large code size, fewer cycles per instruction
Emphasis on hardware | Emphasis on software
Increased hardware cost | Reduced hardware cost
ARM Powered Products

Pipeline Organization
• Increases speed: most instructions execute in a single cycle
• Versions:
– 3-stage (ARM7TDMI and earlier)
– 5-stage (ARM8, ARM9TDMI)
– 6-stage (ARM10TDMI)

ARM7 Pipeline Model

• ARM7 uses a standard 3-stage pipelined architecture: FETCH, DECODE, EXECUTE

FETCH (fetch instruction)
  – Select/increment PC
  – Read next instruction
  – Related blocks: Address Selector, Address Incrementer, Address Register

DECODE (decode instruction)
  – Generate control signals
  – Generate immediate
  – Read from register file
  – Related blocks: Control Logic (Decoder), Register File

EXECUTE (execute instruction)
  – Arithmetic / logic
  – Calculate branch address
  – Load / store
  – Related blocks: Shifter, Multiplier, ALU

*Register write back (WB) is hidden

Pipeline Organization
• 3-stage pipeline: Fetch – Decode – Execute
• Three-cycle latency, one instruction per cycle throughput

instruction | t     | t+1    | t+2     | t+3     | t+4     (cycle)
i           | Fetch | Decode | Execute |         |
i+1         |       | Fetch  | Decode  | Execute |
i+2         |       |        | Fetch   | Decode  | Execute
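
The timing of the 3-stage pipeline can be sketched in a few lines of Python. This is only an illustration, assuming an ideal pipeline with no stalls, branches or multi-cycle instructions; `pipeline_schedule` and `total_cycles` are hypothetical helper names introduced here, not part of any ARM tool.

```python
STAGES = ("Fetch", "Decode", "Execute")


def pipeline_schedule(n_instructions, stages=STAGES):
    """Map each cycle to the stage occupied by each instruction.

    Assumes an ideal pipeline: instruction i enters the first stage in
    cycle i and advances one stage per cycle.
    """
    schedule = {}
    for i in range(n_instructions):
        for offset, stage in enumerate(stages):
            schedule.setdefault(i + offset, {})[i] = stage
    return schedule


def total_cycles(n_instructions, depth=len(STAGES)):
    """Cycles needed to finish n instructions: fill latency plus one per extra instruction."""
    if n_instructions <= 0:
        return 0
    return depth + n_instructions - 1


if __name__ == "__main__":
    sched = pipeline_schedule(3)
    for cycle in sorted(sched):
        row = ", ".join(f"i+{i}: {stage}" for i, stage in sorted(sched[cycle].items()))
        print(f"cycle t+{cycle}: {row}")
    print("total cycles:", total_cycles(3))
```

With three instructions the sketch reports five cycles (t to t+4): three cycles of latency for the first instruction, then one completed instruction per cycle.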
Pipeline Organization
• 5-stage pipeline:
  – Reduces work per cycle => allows higher clock frequency
  – Separates data and instruction memory => reduction of CPI (average number of clock Cycles Per Instruction)
• Stages:
  – Fetch
  – Decode
  – Execute
  – Buffer/data
  – Write-back

Pipeline Organization
• Pipeline flushed and refilled on branch,
causing execution to slow down
• Special features in instruction set
eliminate small jumps in code
to obtain the best flow through pipeline

ARM-7 Architecture
ARM Architecture Version Summary
Core | Version | Feature
ARM1 | v1 | 26-bit address
ARM2, ARM2as, ARM3 | v2 | 32-bit multiply; coprocessor
ARM6, ARM60, ARM610, ARM7, ARM710, ARM7D, ARM7DI | v3 | 32-bit addresses; separate PC and PSRs; Undefined instruction and Abort modes; fully static; big or little endian
StrongARM, SA-110, SA-1100, ARM8, ARM810 | v4 | Halfword and signed halfword/byte support; enhanced multiplier; System mode
ARM7TDMI, ARM710T, ARM720T, ARM740T, ARM9TDMI, ARM920T, ARM940T | v4T | Thumb instruction set

T: Thumb instruction set
D: On-chip Debug
M: enhanced Multiplier
I: Embedded ICE Logic
ARM Architecture Version Summary (cont.)
Core | Version | Feature
ARM1020T | v5T | Improved ARM/Thumb interworking; CLZ instruction for improved division
ARM9E-S, ARM10TDMI, ARM1020E | v5TE | Extended multiplication and saturated maths for DSP-like functionality
ARM7EJ-S, ARM926EJ-S, ARM1026EJ-S | v5TEJ | Jazelle technology for Java acceleration
ARM11, ARM1136J-S | v6 | Low power needed; SIMD (Single Instruction Multiple Data) media processing extensions

J: Jazelle
E: Enhanced DSP instruction
S: Synthesizable
F: integral vector floating point unit
ARM7 Datapath Overview
[Figure: ARM7 datapath, annotated with the pipeline stages. FETCH: address register (driving A[31:0]) and PC incrementer. DECODE: register bank, instruction decode and control. EXECUTE: multiply register, barrel shifter and ALU, connected by the A bus, B bus and ALU bus. (WB): data out register and data in register on D[31:0]. Pipeline registers are omitted.]

ARM7TDMI Interface Signals (1/4)
[Figure: ARM7TDMI core interface signals, grouped by function]
• Clock control: mclk, wait, eclk
• Configuration: bigend
• Interrupts: irq, fiq, isync
• Initialization: reset
• Bus control: enin, enout, enouti, abe, ale, ape, dbe, tbe, busen, highz, busdis, ecapclk
• Debug: dbgrq, breakpt, dbgack, exec, extern1, extern0, dbgen, rangeout0, rangeout1, dbgrqi, commrx, commtx
• Coprocessor interface: opc, cpi, cpa, cpb
• Power: Vdd, Vss
• Memory interface: A[31:0], Din[31:0], Dout[31:0], D[31:0], bl[3:0], r/w, mas[1:0], mreq, seq, lock
• MMU interface: trans, mode[4:0], abort
• State: Tbit
• TAP information: tapsm[3:0], ir[3:0], tdoen, tck1, tck2, screg[3:0]
• Boundary scan extension: drivebs, ecapclkbs, icapclkbs, highz, pclkbs, rstclkbs, sdinbs, sdoutbs, shclkbs, shclk2bs
• JTAG controls: TRST, TCK, TMS, TDI, TDO

Processor Modes

• The ARM has seven basic operating modes:
  1. User: unprivileged mode under which most tasks run
  2. FIQ: entered when a high priority (fast) interrupt is raised
  3. IRQ: entered when a low priority (normal) interrupt is raised
  4. Supervisor: entered on reset and when a Software Interrupt instruction is executed
  5. Abort: used to handle memory access violations
  6. Undef: used to handle undefined instructions
  7. System: privileged mode using the same registers as user mode

39v10 The ARM Architecture 19


The ARM Register Set

Current visible registers (User mode): r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr

Banked-out registers:
Mode | Banked registers
FIQ | r8-r12, r13 (sp), r14 (lr), spsr
IRQ | r13 (sp), r14 (lr), spsr
SVC | r13 (sp), r14 (lr), spsr
Undef | r13 (sp), r14 (lr), spsr
Abort | r13 (sp), r14 (lr), spsr

Register Organization Summary

Mode | Shared with User mode | Own (banked) registers
User | (all) | r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr
FIQ | r0-r7, r15, and cpsr | r8-r12, r13 (sp), r14 (lr), spsr
IRQ | r0-r12, r15, and cpsr | r13 (sp), r14 (lr), spsr
SVC | r0-r12, r15, and cpsr | r13 (sp), r14 (lr), spsr
Undef | r0-r12, r15, and cpsr | r13 (sp), r14 (lr), spsr
Abort | r0-r12, r15, and cpsr | r13 (sp), r14 (lr), spsr

Thumb state low registers: r0-r7
Thumb state high registers: r8-r15

Note: System mode uses the User mode register set

The Registers

• ARM has 37 registers, all of which are 32 bits long.
  – 1 dedicated program counter
  – 1 dedicated current program status register
  – 5 dedicated saved program status registers
  – 30 general purpose registers
• The current processor mode governs which of several banks is accessible. Each mode can access
  – a particular set of r0-r12 registers
  – a particular r13 (the stack pointer, sp) and r14 (the link register, lr)
  – the program counter, r15 (pc)
  – the current program status register, cpsr
• Privileged modes (except System) can also access
  – a particular spsr (saved program status register)

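
The register counts can be checked with a small model of the banks. This is a sketch rather than a description of the hardware: the `_fiq`, `_irq` style suffixes and the helpers `visible_registers` and `register_counts` are naming assumptions made here for illustration.

```python
# User/System bank: r0-r15, where r15 is the program counter.
USER_BANK = [f"r{n}" for n in range(16)]

# Registers that each exception mode swaps in over the user bank.
BANKED = {
    "fiq": [f"r{n}_fiq" for n in range(8, 15)],  # r8-r14
    "irq": ["r13_irq", "r14_irq"],
    "svc": ["r13_svc", "r14_svc"],
    "und": ["r13_und", "r14_und"],
    "abt": ["r13_abt", "r14_abt"],
}


def visible_registers(mode):
    """Return the 16 physical registers seen as r0-r15 in the given mode."""
    regs = list(USER_BANK)
    for name in BANKED.get(mode, []):  # usr and sys have no banked registers
        index = int(name.split("_")[0][1:])
        regs[index] = name
    return regs


def register_counts():
    """Count registers by category, following the breakdown on the slide."""
    general = (len(USER_BANK) - 1) + sum(len(bank) for bank in BANKED.values())
    spsr = len(BANKED)  # one spsr per exception mode
    return {"pc": 1, "cpsr": 1, "spsr": spsr, "general": general,
            "total": 1 + 1 + spsr + general}


if __name__ == "__main__":
    print(register_counts())
    print(visible_registers("fiq"))
```

Counting this way gives 15 user registers plus 7 FIQ and 8 other banked registers, which is where the figure of 30 general purpose registers comes from.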
Current Program Status Registers

Bits | Field
31-28 | N, Z, C, V
27 | Q
24 | J
26-25, 23-8 | Undefined
7 | I
6 | F
5 | T
4-0 | mode

Field masks: f = bits 31-24, s = bits 23-16, x = bits 15-8, c = bits 7-0

• Condition code flags
  – N = Negative result from ALU
  – Z = Zero result from ALU
  – C = ALU operation Carried out
  – V = ALU operation oVerflowed
• Sticky Overflow flag (Q flag)
  – Architecture 5TE/J only
  – Indicates if saturation has occurred
• J bit
  – Architecture 5TEJ only
  – J = 1: Processor in Jazelle state
• Interrupt Disable bits
  – I = 1: Disables the IRQ.
  – F = 1: Disables the FIQ.
• T bit
  – Architecture xT only
  – T = 0: Processor in ARM state
  – T = 1: Processor in Thumb state
• Mode bits
  – Specify the processor mode

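
The CPSR bit layout can be illustrated with a small decoder. This is a sketch under stated assumptions: the 5-bit mode encodings in `MODES` are not given on the slide and are taken from the ARM architecture definition, and `decode_cpsr` is a hypothetical helper name.

```python
# Mode field encodings (bits 4-0). Assumption: these values come from the
# ARM architecture definition, not from the slide itself.
MODES = {
    0b10000: "usr",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10111: "abt",
    0b11011: "und",
    0b11111: "sys",
}


def decode_cpsr(value):
    """Split a 32-bit CPSR value into its flag, control and mode fields."""
    def bit(position):
        return (value >> position) & 1

    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),
        "Q": bit(27),   # sticky overflow, 5TE/J only
        "J": bit(24),   # Jazelle state, 5TEJ only
        "I": bit(7),    # 1 = IRQ disabled
        "F": bit(6),    # 1 = FIQ disabled
        "T": bit(5),    # 1 = Thumb state
        "mode": MODES.get(value & 0x1F, "invalid"),
    }


if __name__ == "__main__":
    # Z and C set, IRQ and FIQ disabled, ARM state, Supervisor mode.
    print(decode_cpsr(0x600000D3))
```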
Program Counter (r15)

• When the processor is executing in ARM state:
  – All instructions are 32 bits wide
  – All instructions must be word aligned
  – Therefore the pc value is stored in bits [31:2] with bits [1:0] undefined (as an instruction cannot be halfword or byte aligned).
• When the processor is executing in Thumb state:
  – All instructions are 16 bits wide
  – All instructions must be halfword aligned
  – Therefore the pc value is stored in bits [31:1] with bit [0] undefined (as an instruction cannot be byte aligned).
• When the processor is executing in Jazelle state:
  – All instructions are 8 bits wide
  – Processor performs a word access to read 4 instructions at once

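
The alignment rules for each state can be expressed as simple bit masks. The following is a minimal sketch; treating the undefined low bits as zero is an assumption made here for illustration, and the helper names are hypothetical.

```python
# Instruction width in bytes for each processor state.
INSTRUCTION_BYTES = {"arm": 4, "thumb": 2, "jazelle": 1}


def is_valid_pc(address, state):
    """True if the address satisfies the alignment rule of the given state."""
    return address % INSTRUCTION_BYTES[state] == 0


def aligned_fetch_address(pc, state):
    """Clear the low bits that carry no information in the given state.

    Assumption: the undefined bits are treated as zero for illustration.
    """
    size = INSTRUCTION_BYTES[state]
    return pc & ~(size - 1) & 0xFFFFFFFF


def jazelle_fetch(pc):
    """Word address read in Jazelle state and the byte offset within that word."""
    return pc & ~0x3 & 0xFFFFFFFF, pc & 0x3


if __name__ == "__main__":
    for state in INSTRUCTION_BYTES:
        print(state, hex(aligned_fetch_address(0x8003, state)))
    print(jazelle_fetch(0x8006))
```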
Saved Program Status Register (SPSR)
• Each privileged mode (except System mode) has an SPSR associated with it
• This SPSR saves the state of the CPSR when the privileged mode is entered, so that the user state can be fully restored when the user process is resumed
• Often the SPSR is left untouched from the time the privileged mode is entered to the time it is used to restore the CPSR
• If privileged supervisor code calls itself (for example, a nested supervisor call), the SPSR must first be copied into a general register and saved

What Are Exceptions?

• Exceptions are usually used to handle unexpected events that arise during the execution of a program, such as interrupts or memory faults. The term also covers software interrupts, undefined instruction traps, and the system reset.
• Three groups:
  – Exceptions generated as the direct effect of executing an instruction
    – Software interrupts, undefined instructions, and prefetch aborts
  – Exceptions generated as a side effect of an instruction
    – Data aborts
  – Exceptions generated externally
    – Reset, IRQ and FIQ

Exception Entry
• When an exception arises
  – ARM completes the current instruction as best it can (except in the case of a reset exception)
  – It then handles the exception, starting from a specific location (the exception vector)
• The processor performs the following sequence:
  – Changes to the operating mode corresponding to the particular exception
  – Stores the return address in LR_<mode>
  – Copies the old CPSR into SPSR_<mode>
  – Sets the appropriate CPSR bits
    – If the core is currently in Thumb state, ARM state is entered
    – Disables IRQs by setting bit 7
    – If the exception is a fast interrupt, disables further fast interrupts by setting bit 6 of the CPSR

Exception Entry
• Force PC to relevant vector address

Priority | Exception | Mode | Vector address
1 | Reset | SVC | 0x00000000
2 | Data abort (data access memory fault) | Abort | 0x00000010
3 | FIQ (fast interrupt) | FIQ | 0x0000001C
4 | IRQ (normal interrupt) | IRQ | 0x00000018
5 | Prefetch abort (instruction fetch memory fault) | Abort | 0x0000000C
6 | Undefined instruction | UND | 0x00000004
6 | Software interrupt (SWI) | SVC | 0x00000008

• Normally the vector address contains a branch to the relevant routine
• Exception handlers use r13_<mode> and r14_<mode> to hold the stack pointer and return address

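
The entry sequence can be modelled as a function on a small dictionary of processor state. This is a sketch only: the dictionary layout, the `enter_exception` name and the mode encodings in `MODE_BITS` are assumptions introduced here, and masking FIQ on reset is architectural behaviour that the slides do not state.

```python
# Vector address and mode entered for each exception, from the table.
VECTORS = {
    "reset": (0x00, "svc"),
    "undef": (0x04, "und"),
    "swi": (0x08, "svc"),
    "prefetch_abort": (0x0C, "abt"),
    "data_abort": (0x10, "abt"),
    "irq": (0x18, "irq"),
    "fiq": (0x1C, "fiq"),
}

# Assumption: CPSR mode field encodings from the ARM architecture definition.
MODE_BITS = {"fiq": 0x11, "irq": 0x12, "svc": 0x13, "abt": 0x17, "und": 0x1B}

I_BIT, F_BIT, T_BIT = 1 << 7, 1 << 6, 1 << 5


def enter_exception(state, kind, return_address):
    """Return a new state dict after taking the given exception."""
    vector, mode = VECTORS[kind]
    new = dict(state)
    new[f"lr_{mode}"] = return_address        # store return address in LR_<mode>
    new[f"spsr_{mode}"] = state["cpsr"]       # copy old CPSR into SPSR_<mode>
    cpsr = (state["cpsr"] & ~0x1F) | MODE_BITS[mode]  # change mode
    cpsr &= ~T_BIT                            # enter ARM state
    cpsr |= I_BIT                             # disable IRQs (bit 7)
    if kind in ("fiq", "reset"):              # reset case is not on the slide
        cpsr |= F_BIT                         # disable FIQs (bit 6)
    new["cpsr"] = cpsr & 0xFFFFFFFF
    new["pc"] = vector                        # force PC to the vector address
    return new


if __name__ == "__main__":
    user = {"cpsr": 0x60000030, "pc": 0x8000}  # User mode, Thumb state
    after = enter_exception(user, "irq", 0x8004)
    print({k: hex(v) for k, v in after.items()})
```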
Exception Return

• Once the exception has been handled, the user task is normally resumed
• The sequence is
  – Any modified user registers must be restored from the handler’s stack
  – CPSR must be restored from the appropriate SPSR
  – PC must be changed back to the relevant instruction address
• The last two steps happen atomically as part of a single instruction

Exceptions of ARM-7
• Mode changes can be made under
  – Software control
  – External interrupts
  – Exception process
• The modes other than user mode are privileged modes
  – Have full access to system resources
  – Can change mode freely
• Exception modes
  – FIQ
  – IRQ
  – Supervisor mode
  – Abort: data abort and instruction prefetch abort
  – Undefined
Exception

[Figure: task flow]

Class | Cause
Interrupt | External stimulus
Fault | Internal cause
Trap | Trap instruction
Exception (cont’d)

ARM7 (ISA v4) Exceptions

Type | Class | Description (Cause)
Reset | - | Power up
Undefined Instruction | FAULT | Invalid / coprocessor instruction
Prefetch Abort | FAULT | Memory fault on instruction fetch (e.g. TLB miss)
Data Abort | FAULT | Memory fault on data access (e.g. TLB miss)
IRQ | INTERRUPT | Normal interrupt
FIQ | INTERRUPT | Fast interrupt (no context switch)
SW Interrupt | TRAP | SWI instruction
Exception (cont’d)

ARM7 (ISA v4) Exception Vectors

Exception | Address | Mode on Entry
Reset | 0x00000000 | Supervisor
Undefined Instruction | 0x00000004 | Undefined
SW Interrupt | 0x00000008 | Supervisor
Prefetch Abort | 0x0000000C | Abort
Data Abort | 0x00000010 | Abort
Reserved | 0x00000014 | Reserved
IRQ | 0x00000018 | IRQ
FIQ | 0x0000001C | FIQ
Exception Handling

• When an exception occurs, the ARM:
  – Copies CPSR into SPSR_<mode>
  – Sets appropriate CPSR bits
    – Change to ARM state
    – Change to exception mode
    – Disable interrupts (if appropriate)
  – Stores the return address in LR_<mode>
  – Sets PC to vector address
• To return, the exception handler needs to:
  – Restore CPSR from SPSR_<mode>
  – Restore PC from LR_<mode>
  This can only be done in ARM state.

Vector Table
0x1C FIQ
0x18 IRQ
0x14 (Reserved)
0x10 Data Abort
0x0C Prefetch Abort
0x08 Software Interrupt
0x04 Undefined Instruction
0x00 Reset

The vector table can be at 0xFFFF0000 on ARM720T and on ARM9/10 family devices.
Exception (cont’d) Process

• Current Program Status Register (CPSR)
• Saved Program Status Register (SPSR)
• On exception, entering mode mod:
  – (PC + 4) → LR_mod
  – CPSR → SPSR_mod
  – PC → IV address
  – R13, R14 replaced by R13_mod, R14_mod
  – In case of FIQ mode, R8 – R12 are also replaced
Exception priorities
When multiple exceptions arise at the same time, a fixed
priority system determines the order in which they are
handled:

Highest priority:
1. Reset
2. Data abort
3. FIQ
4. IRQ
5. Prefetch abort

Lowest priority:
6. Undefined Instruction, Software interrupt.
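
The fixed priority scheme amounts to picking the lowest-numbered pending exception. A minimal sketch, assuming pending exceptions are tracked as a set of names; `RANK` and `next_exception` are hypothetical names used only for this illustration.

```python
# Priority rank per exception, 1 = highest. Undefined instruction and
# software interrupt share the lowest rank, as in the list above.
RANK = {
    "reset": 1,
    "data_abort": 2,
    "fiq": 3,
    "irq": 4,
    "prefetch_abort": 5,
    "undef": 6,
    "swi": 6,
}


def next_exception(pending):
    """Return the pending exception that is handled first, or None if none is pending."""
    if not pending:
        return None
    return min(pending, key=lambda name: RANK[name])


if __name__ == "__main__":
    print(next_exception({"irq", "fiq", "data_abort"}))
    print(next_exception({"irq", "prefetch_abort"}))
```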
Memory Organization

There are two ways to store data in memory

1 Little-Endian
2 Big – Endian
Memory Organization

• Word, half-word alignment (xxxx00 or xxxxx0)
• ARM can be set up to access data in either little-endian or big-endian format, though it defaults to little-endian.

Big-endian
The most significant byte (MSB) value, which is 0Ah in our example, is stored at the memory location with the lowest address; the next byte value in significance, 0Bh, is stored at the following memory location, and so on. This is akin to left-to-right reading in hexadecimal order.

Little-endian
The least significant byte (LSB) value, 0Dh, is at the lowest address. The other bytes follow in increasing order of significance.
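
The two byte orders can be compared directly in Python. This sketch assumes the example word is 0x0A0B0C0D, which is inferred from the byte values 0Ah, 0Bh and 0Dh mentioned in the text; `memory_layout` is a hypothetical helper name.

```python
WORD = 0x0A0B0C0D  # assumed example value, inferred from the bytes named in the text


def memory_layout(word, byteorder):
    """Return the four bytes of a 32-bit word in increasing address order."""
    return list(word.to_bytes(4, byteorder))


def read_word(memory_bytes, byteorder):
    """Reassemble a 32-bit word from four bytes stored in the given order."""
    return int.from_bytes(bytes(memory_bytes), byteorder)


if __name__ == "__main__":
    for order in ("big", "little"):
        layout = memory_layout(WORD, order)
        cells = " ".join(f"{b:02X}h" for b in layout)
        print(f"{order:>6}-endian, lowest address first: {cells}")
```

Reading the same four bytes back with the wrong byte order yields a different word, which is why the endianness setting must match the memory system.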
Advanced Microcontroller Bus Architecture (AMBA)

AHB: The AMBA AHB (Advanced High-performance Bus) is for high-performance, high-clock-frequency system modules. It acts as a high-performance system backbone that is capable of burst transfers, connecting the CPU to on-chip and off-chip memories.

ASB: The AMBA ASB (Advanced System Bus) is an alternative system bus suitable for use where the high-performance features of AHB are not required. ASB also supports the efficient connection of the CPU, on-chip memory and off-chip memories.

APB: The AMBA APB (Advanced Peripheral Bus) is for low-power peripherals. It is optimized for minimal power consumption and reduced interface complexity to support peripheral functions. APB is connected to the CPU via an AHB/ASB-APB bridge.
5-Stage Pipeline ARM Organization
5-Stage Pipeline Organization (1/2)
• Fetch
  – The instruction is fetched from memory and placed in the instruction pipeline
• Decode
  – The instruction is decoded and register operands read from the register file. There are 3 operand read ports in the register file, so most ARM instructions can source all their operands in one cycle
• Execute
  – An operand is shifted and the ALU result generated. If the instruction is a load or store, the memory address is computed in the ALU

[Figure: 5-stage pipeline datapath showing the fetch (I-cache, next pc), decode (instruction decode, register read), execute (shift, ALU, forwarding paths), buffer/data (D-cache, load/store address) and write-back (register write) stages.]
5-Stage Pipeline Organization (2/2)
• Buffer/Data
  – Data memory is accessed if required. Otherwise the ALU result is simply buffered for one cycle
• Write back
  – The results generated by the instruction are written back to the register file, including any data loaded from memory

[Figure: 5-stage pipeline datapath showing the fetch (I-cache, next pc), decode (instruction decode, register read), execute (shift, ALU, forwarding paths), buffer/data (D-cache, load/store address) and write-back (register write) stages.]