Proc Emb - Ch2

Uploaded by

Jihene Zgolli
Copyright
© All Rights Reserved
Chapter 2

The ARM Cortex-M4 Processor Architecture

Chapter Syllabus
 ARM Architectures and Processors
 What is ARM Architecture
 ARM Processor Families
 ARM Cortex-M Series
 Cortex-M4 Processor
 ARM Processor vs. ARM Architectures

 ARM Cortex-M4 Processor
 Cortex-M4 Processor Overview
 Cortex-M4 Block Diagram
 Cortex-M4 Registers

ARM ARCHITECTURES AND PROCESSORS

What is ARM Architecture
 ARM architecture is a family of RISC-based processor architectures
 Well-known for its power efficiency;
 Hence widely used in mobile devices, such as smartphones and tablets
 Designed and licensed to a wide ecosystem by ARM

 ARM Holdings
 The company designs ARM-based processors;
 Does not manufacture chips, but licenses its designs to semiconductor partners, who add their own Intellectual Property (IP) on top of ARM’s IP, then fabricate and sell the chips to customers;
 Also offers other IP apart from processors, such as physical IP, interconnect IP, graphics cores, and development tools.

ARM Processor Families
 Cortex-A series (Application)
 High performance processors capable of full Operating System (OS) support;
 Applications include smartphones, digital TV, smart books, home gateways etc.
 Processors: Cortex-A57, Cortex-A53, Cortex-A15, Cortex-A9, Cortex-A8, Cortex-A7, Cortex-A5

 Cortex-R series (Real-time)
 High performance for real-time applications;
 High reliability
 Applications include automotive braking system, powertrains etc.
 Processors: Cortex-R7, Cortex-R5, Cortex-R4

 Cortex-M series (Microcontroller)
 Cost-sensitive solutions for deterministic microcontroller applications;
 Applications include microcontrollers, mixed signal devices, smart sensors, automotive body electronics and airbags;
 Processors: Cortex-M4, Cortex-M3, Cortex-M1, Cortex-M0+, Cortex-M0

 SecurCore series
 High security applications.
 Processors: SC000, SC100, SC300

 Previous classic processors
 Include ARM7, ARM9, ARM11 families

As of Dec 2013

Design an ARM-based SoC

 Select a set of IP cores from ARM and/or other third-party IP vendors


 Integrate IP cores into a single chip design
 Give design to semiconductor foundries for chip fabrication

[Figure: three-step flow. Licensable IPs (IP libraries: Cortex-A9, Cortex-R5, Cortex-M4, ARM7, ARM9, ARM11 processors; DRAM, FLASH and SRAM controllers; AXI, AHB and APB buses; GPIO, I/O blocks, Timer) feed the SoC Design (ARM processor, ROM, RAM, system bus, peripherals, external interface), which goes to Chip Manufacture to produce the ARM-based SoC.]

ARM Cortex-M Series
 Cortex-M series: Cortex-M0, M0+, M1, M3, M4.
 Energy-efficiency
 Lower energy cost, longer battery life

 Smaller code
 Lower silicon costs

 Ease of use
 Faster software development and reuse

 Embedded applications
 Smart metering, human interface devices, automotive and industrial control systems, white goods,
consumer products and medical instrumentation

As of Dec 2013
ARM Processors vs. ARM Architectures
 ARM architecture
 Describes the details of instruction set, programmer’s model, exception model, and memory map
 Documented in the Architecture Reference Manual

 ARM processor
 Developed using one of the ARM architectures
 More implementation details, such as timing information
 Documented in processor’s Technical Reference Manual

Architecture    Profile     Example processors
ARMv4/v4T                   e.g. ARM7TDMI
ARMv5/v5E                   e.g. ARM926EJ-S
ARMv6                       e.g. ARM1136
                ARMv6-M     e.g. Cortex-M0, M1
ARMv7           ARMv7-A     e.g. Cortex-A9
                ARMv7-R     e.g. Cortex-R4
                ARMv7-M     e.g. Cortex-M4
ARMv8           ARMv8-A     e.g. Cortex-A53, Cortex-A57
                ARMv8-R

ARM Cortex-M Series Family
Processor    ARM Architecture   Core Architecture   Thumb®   Thumb®-2   Hardware Multiply   Hardware Divide   Saturated Math   DSP Extensions   Floating Point
Cortex-M0    ARMv6-M            Von Neumann         Most     Subset     1 or 32 cycle       No                No               Software         No
Cortex-M0+   ARMv6-M            Von Neumann         Most     Subset     1 or 32 cycle       No                No               Software         No
Cortex-M1    ARMv6-M            Von Neumann         Most     Subset     3 or 33 cycle       No                No               Software         No
Cortex-M3    ARMv7-M            Harvard             Entire   Entire     1 cycle             Yes               Yes              Software         No
Cortex-M4    ARMv7E-M           Harvard             Entire   Entire     1 cycle             Yes               Yes              Hardware         Optional

ARM CORTEX-M4 PROCESSOR OVERVIEW

Cortex-M4 Processor Overview
 Cortex-M4 Processor
 Introduced in 2010
 Designed with a large variety of highly efficient signal processing features

 Features extended single-cycle multiply-accumulate (MAC) instructions, optimized SIMD arithmetic, saturating arithmetic and an optional Floating Point Unit (FPU).
 High Performance Efficiency
 1.25 DMIPS/MHz (Dhrystone Million Instructions Per Second per MHz), with power consumption on the order of µW/MHz

 Low Power Consumption


 Longer battery life – especially critical in mobile products

 Enhanced Determinism
 The critical tasks and interrupt routines can be served quickly in a known number of cycles

Cortex-M4 Processor Features
 32-bit Reduced Instruction Set Computing (RISC) processor
 Harvard architecture
 Separated data bus and instruction bus

 Instruction set
 Includes the entire Thumb®-1 (16-bit) and Thumb®-2 (16/32-bit) instruction sets

 3-stage + branch speculation pipeline


 Performance efficiency
 1.25 – 1.95 DMIPS/MHz (Dhrystone Million Instructions Per Second / MHz)

 Supported Interrupts
 Non-maskable Interrupt (NMI) + 1 to 240 physical interrupts
 8 to 256 interrupt priority levels
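As a rough worked example of the performance-efficiency figure above: a DMIPS/MHz rating scales linearly with the core clock, so the expected Dhrystone throughput is simply the rating multiplied by the clock frequency. The sketch below assumes a 100 MHz clock, which is a hypothetical value chosen for illustration and not a figure from these slides.

```python
def dmips(dmips_per_mhz: float, clock_mhz: float) -> float:
    """Estimated Dhrystone MIPS for a given efficiency rating and core clock."""
    return dmips_per_mhz * clock_mhz


# 100 MHz is an assumed clock frequency used only for this example.
CLOCK_MHZ = 100.0

low = dmips(1.25, CLOCK_MHZ)   # lower end of the quoted range
high = dmips(1.95, CLOCK_MHZ)  # upper end of the quoted range
print(f"{low:.0f} to {high:.0f} DMIPS at {CLOCK_MHZ:.0f} MHz")
```

The real figure for a given device also depends on the compiler, optimisation settings and memory wait states, which this arithmetic ignores.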

Cortex-M4 Processor Features
 Supports Sleep Modes

 Up to 240 Wake-up Interrupts

 Integrated WFI (Wait For Interrupt) and WFE (Wait For Event) instructions and Sleep-On-Exit capability (to be covered in more detail later)
 Sleep & Deep Sleep Signals

 Optional Retention Mode with ARM Power Management Kit


 Enhanced Instructions

 Hardware Divide (2-12 Cycles)

 Single-Cycle 16, 32-bit MAC, Single-cycle dual 16-bit MAC

 8, 16-bit SIMD arithmetic

 Debug
 Optional JTAG & Serial-Wire Debug (SWD) Ports

 Up to 8 Breakpoints and 4 Watchpoints

 Memory Protection Unit (MPU)

 Optional 8 region MPU with sub regions and background region
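To make the "single-cycle dual 16-bit MAC" entry under Enhanced Instructions more concrete, the following is a minimal Python model of what such an operation computes: two signed 16-bit products, one from the low halfwords and one from the high halfwords of two 32-bit registers, added to an accumulator. The function names (`to_signed`, `pack16`, `dual_mac16`) are hypothetical, and the model only describes the arithmetic result. It says nothing about timing and does not model the Q flag.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def pack16(high: int, low: int) -> int:
    """Pack two signed 16-bit values into one 32-bit register value."""
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)


def dual_mac16(acc: int, rn: int, rm: int) -> int:
    """Model of a dual signed 16-bit multiply-accumulate.

    Returns acc + low(rn)*low(rm) + high(rn)*high(rm), wrapped to 32 bits.
    """
    low_product = to_signed(rn, 16) * to_signed(rm, 16)
    high_product = to_signed(rn >> 16, 16) * to_signed(rm >> 16, 16)
    return (to_signed(acc, 32) + low_product + high_product) & 0xFFFFFFFF


# 10 + (-2 * 5) + (3 * 4) = 12
print(dual_mac16(10, pack16(3, -2), pack16(4, 5)))
```

Packing two samples per register in this way is what lets a filter or dot-product loop process two 16-bit values per instruction.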


Cortex-M4 Block Diagram
[Figure: ARM Cortex-M4 microprocessor block diagram. Blocks shown: processor core with optional FPU; Nested Vectored Interrupt Controller (NVIC); optional Wakeup Interrupt Controller (WIC); optional Embedded Trace Macrocell; optional memory protection unit; optional Debug Access Port; optional Serial Wire Viewer; optional flash patch; optional data watchpoints; bus matrix; code interface; SRAM and peripheral interface.]

Cortex-M4 Block Diagram
 Processor core
 Contains internal registers, the ALU, data path, and some control logic

 Registers include sixteen 32-bit registers for both general and special usage

 Processor pipeline stages


 Three-stage pipeline: fetch, decode, and execution

 Some instructions may take multiple cycles to execute, in which case the pipeline will be stalled

 The pipeline will be flushed if a branch instruction is executed

 Up to two instructions can be fetched in one transfer (16-bit instructions)

Instruction 1   Fetch    Decode   Execute
Instruction 2            Fetch    Decode   Execute
Instruction 3                     Fetch    Decode   Execute
Instruction 4                              Fetch    Decode   Execute
                                                       Time →

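The staggered fetch/decode/execute diagram above can be summarised with a small model: in an ideal three-stage pipeline with no stalls or flushes, instruction i is fetched in cycle i, decoded in cycle i + 1 and executed in cycle i + 2, so n instructions complete in n + 2 cycles. The sketch below is an idealised illustration only. The `stall_cycles` parameter is a hypothetical stand-in for multi-cycle instructions and branch flushes, which real hardware handles in more detail.

```python
STAGES = ("Fetch", "Decode", "Execute")


def timeline(num_instructions: int) -> dict:
    """Map each instruction (1-based) to {cycle: stage} for a stall-free pipeline."""
    return {
        i: {i + offset: stage for offset, stage in enumerate(STAGES)}
        for i in range(1, num_instructions + 1)
    }


def total_cycles(num_instructions: int, stall_cycles: int = 0) -> int:
    """Cycles needed to finish all instructions, plus any extra stall cycles."""
    if num_instructions == 0:
        return 0
    return len(STAGES) + (num_instructions - 1) + stall_cycles


for instruction, stages in timeline(4).items():
    print(instruction, stages)
print("total cycles:", total_cycles(4))
```

For the four instructions in the diagram this gives six cycles, compared with twelve if each instruction had to finish all three stages before the next one started.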
Cortex-M4 Block Diagram

 Nested Vectored Interrupt Controller (NVIC)


 Up to 240 interrupt request signals and a non-maskable interrupt (NMI)
 Automatically handles nested interrupts, e.g. by comparing the priorities of incoming interrupt requests with the current priority level
 Wakeup Interrupt Controller (WIC)
 For low-power applications, the microcontroller can enter sleep mode by shutting down most of its components.
 When an interrupt request is detected, the WIC can inform the power management unit to power up the system.
 Memory Protection Unit (optional)
 Used to protect memory content, e.g. making some memory regions read-only or preventing user applications from accessing privileged application data
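The priority comparison that the NVIC performs for nested interrupts can be sketched as follows. This is a simplified model under stated assumptions: it uses the Cortex-M convention that a numerically lower priority value means a more urgent interrupt, which these slides do not spell out. The `IDLE` sentinel and the interrupt names are hypothetical, and ties are broken by dictionary order rather than by exception number as in real hardware.

```python
from typing import Dict, Optional

# Hypothetical sentinel: larger than any 8-bit priority, meaning "no handler is active".
IDLE = 256


def preempts(request_priority: int, current_priority: int = IDLE) -> bool:
    """True if a request may interrupt the currently running code.

    Lower value = more urgent. A request of equal priority does not preempt.
    """
    return request_priority < current_priority


def next_to_run(pending: Dict[str, int], current_priority: int = IDLE) -> Optional[str]:
    """Return the most urgent pending request if it can preempt, else None."""
    if not pending:
        return None
    name = min(pending, key=lambda key: pending[key])
    return name if preempts(pending[name], current_priority) else None


# A handler with priority 4 is running; TIMER (2) may nest, UART (5) must wait.
print(next_to_run({"UART": 5, "TIMER": 2}, current_priority=4))
```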

Cortex-M4 Block Diagram

 Bus interconnect
 Allows data transfer to take place on different buses simultaneously
 Provides data transfer management, e.g. a write buffer, bit-oriented operations (bit-band)
 May include bus bridges (e.g. an AHB-to-APB bus bridge) to connect different buses into a network using a single global memory space
 Includes the internal bus system, the data path in the processor core, and the AHB-Lite interface unit
 Debug subsystem
 Handles debug control, program breakpoints, and data watchpoints
 When a debug event occurs, it can put the processor core in a halted state, where developers can analyse the status of the processor at that point, such as register values and flags
ARM CORTEX-M4 PROCESSOR REGISTERS

Cortex-M4 Registers

 Processor registers
 The internal registers are used to store and process temporary data within the processor core
 All registers are inside the processor core, hence they can be accessed quickly
 Load-store architecture
◦ To process data held in memory, it must first be loaded from memory into registers, processed inside the processor core using register data only, and then written back to memory if needed

 Cortex-M4 registers
 Register bank
◦ Sixteen 32-bit registers (thirteen are general-purpose);

 Special registers

Cortex-M4 Registers

Register bank
R0 – R7      General purpose registers (low registers)
R8 – R12     General purpose registers (high registers)
R13          Stack Pointer (SP), banked: Main Stack Pointer (MSP) and Process Stack Pointer (PSP)
R14          Link Register (LR)
R15          Program Counter (PC)

Special registers
xPSR                           Program Status Registers (PSR): APSR (Application PSR), EPSR (Execution PSR), IPSR (Interrupt PSR)
PRIMASK, FAULTMASK, BASEPRI    Interrupt mask registers
CONTROL                        Stack definition

Cortex-M4 Registers
 R0 – R12: general purpose registers
 Low registers (R0 – R7) can be accessed by any instruction
 High registers (R8 – R12) cannot be accessed by some instructions, e.g. some Thumb (16-bit) instructions

 R13: Stack Pointer (SP)
 Records the current address of the stack
 Used for saving the context of a program while switching between tasks
 Cortex-M4 has two SPs: the Main SP, used in applications that require privileged access, e.g. the OS kernel and exception handlers, and the Process SP, used in base-level application code (when not running an exception handler)

 R15: Program Counter (PC)
 Records the address of the current instruction code
 Automatically incremented by 4 at each operation (for 32-bit instruction code), except for branching operations
 A branching operation changes the PC to a specific address; a function call additionally saves the return address to the Link Register (LR)

[Figure: stack and memory layout. Data is PUSHed onto and POPped from the stack at the address held in SP, between the low and high stack addresses; the memory below holds the heap, data and code, with the PC pointing into the code.]

Cortex-M4 Registers

 R14: Link Register (LR)


 The LR is used to store the return address of a subroutine or a function call
 The program counter (PC) will load the value from LR after a function is finished

[Figure: calling and returning from a subroutine. Call a subroutine: 1. save the current PC to LR; 2. load PC with the starting address of the subroutine. Return from a subroutine to the main program: load PC with the address in LR.]

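The call and return sequence described for the Link Register can be modelled in a few lines. This is an illustrative sketch only: the `Core` class and the addresses are hypothetical, the model stores the address of the instruction following the call as the return address, and it ignores details such as the Thumb state bit that real hardware keeps in bit 0 of LR.

```python
class Core:
    """Tiny model of the PC and LR behaviour around a subroutine call."""

    def __init__(self, pc: int = 0) -> None:
        self.pc = pc
        self.lr = 0

    def call(self, target: int, instruction_size: int = 4) -> None:
        """Branch with link: save the return address in LR, then jump."""
        self.lr = self.pc + instruction_size  # address of the next instruction
        self.pc = target

    def ret(self) -> None:
        """Return from the subroutine by loading PC from LR."""
        self.pc = self.lr


# Hypothetical addresses: main program at 0x08000100, subroutine at 0x08000200.
core = Core(pc=0x08000100)
core.call(0x08000200)
print(hex(core.pc), hex(core.lr))
core.ret()
print(hex(core.pc))
```

Because there is only one LR, a subroutine that calls another subroutine has to save LR first, typically by pushing it onto the stack.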
Cortex-M4 Registers
 xPSR, combined Program Status Register
 Provides information about program execution and ALU flags
 Application PSR (APSR)
 Interrupt PSR (IPSR)
 Execution PSR (EPSR)

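Field layout, listed from bit 31 (left) down to bit 0 (right):
APSR:  N Z C V Q | Reserved
IPSR:  Reserved | ISR number
EPSR:  ICI/IT | T | Reserved | ICI/IT
xPSR:  N Z C V Q | ICI/IT | T | Reserved | ICI/IT | ISR number
Bit markers on the figure: bit31, bit24, bit16, bit8, bit0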
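Since the combined xPSR is a single 32-bit value, its fields can be pulled out with shifts and masks. The sketch below is a minimal decoder, assuming the bit positions as I understand them from the ARMv7-M architecture: N, Z, C, V and Q in bits 31 down to 27, T in bit 24, and the ISR number in bits 8 down to 0. The ICI/IT fields are left out for brevity, and the sample values are made up for illustration.

```python
def decode_xpsr(xpsr: int) -> dict:
    """Split a 32-bit xPSR value into its main fields (ICI/IT omitted)."""
    return {
        "N": (xpsr >> 31) & 1,        # negative flag
        "Z": (xpsr >> 30) & 1,        # zero flag
        "C": (xpsr >> 29) & 1,        # carry flag
        "V": (xpsr >> 28) & 1,        # overflow flag
        "Q": (xpsr >> 27) & 1,        # sticky saturation flag
        "T": (xpsr >> 24) & 1,        # Thumb state bit
        "ISR_number": xpsr & 0x1FF,   # currently executing exception number
    }


# Illustrative value: Z and C set, Thumb state, no exception active.
print(decode_xpsr(0x61000000))
```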

Cortex-M4 Registers
 APSR
 N: negative flag – set to one if the result from ALU is negative

 Z: zero flag – set to one if the result from ALU is zero

 C: carry flag – set to one if an unsigned overflow occurs

 V: overflow flag – set to one if a signed overflow occurs

 Q: sticky saturation flag – set to one if saturation has occurred in saturating arithmetic instructions, or overflow has occurred in certain multiply instructions

 IPSR
 ISR number – current executing interrupt service routine number

 EPSR
 T: Thumb state – always one since Cortex-M4 only supports the Thumb state (more on processor states in the next module)

 ICI/IT: Interrupt-Continuable Instruction (ICI) bits and IF-THEN (IT) instruction status bits
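The APSR flag definitions above can be checked with a small model of a 32-bit addition. This is a sketch for illustration, with hypothetical function names: `add_with_flags` derives N, Z, C and V from an ordinary wrapping add, and `saturating_add` shows the kind of signed clamping that would set Q. In hardware Q is sticky and stays set until software clears it, whereas this model only reports whether a single operation saturated.

```python
MASK32 = 0xFFFFFFFF
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def add_with_flags(a: int, b: int) -> tuple:
    """32-bit wrapping add; returns (result, flags) with N, Z, C and V."""
    a &= MASK32
    b &= MASK32
    total = a + b
    result = total & MASK32
    flags = {
        "N": result >> 31,               # bit 31 of the result
        "Z": int(result == 0),           # result is zero
        "C": int(total > MASK32),        # unsigned overflow (carry out)
        # Signed overflow: operands share a sign that differs from the result's.
        "V": ((~(a ^ b) & (a ^ result)) >> 31) & 1,
    }
    return result, flags


def saturating_add(a: int, b: int) -> tuple:
    """Signed 32-bit saturating add; returns (result, q) where q=1 if clamped."""
    total = a + b
    if total > INT32_MAX:
        return INT32_MAX, 1
    if total < INT32_MIN:
        return INT32_MIN, 1
    return total, 0


print(add_with_flags(0x7FFFFFFF, 1))
print(add_with_flags(0xFFFFFFFF, 1))
print(saturating_add(INT32_MAX, 1))
```

The first call overflows as a signed add (V set, C clear) and the second overflows as an unsigned add (C set, V clear), which is the distinction between the two flags.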

ARM CORTEX-M4 PROCESSOR MEMORY MAP

Cortex-M4 Memory Map

 The Cortex-M4 processor has 4 GB of memory address space


 Support for bit-band operation (detailed later)

 The 4GB memory space is architecturally defined as a number of regions


 Each region has a recommended usage
 Makes it easier for software programmers to port code between different devices

 Despite the default memory map, the actual usage of the memory map can be flexibly defined by the user, except for some fixed memory addresses, such as the internal Private Peripheral Bus

Cortex-M4 Memory Map
Address range              Size     Region                          Usage
0xE0100000 – 0xFFFFFFFF    512MB*   Vendor specific memory          Reserved for other purposes
0xE0000000 – 0xE00FFFFF             Private Peripheral Bus (PPB)    Private peripherals, e.g. NVIC, SCS
0xA0000000 – 0xDFFFFFFF    1GB      External device                 Mainly used for external peripherals, e.g. SD card
0x60000000 – 0x9FFFFFFF    1GB      External RAM                    Mainly used for external memories, e.g. external DDR, FLASH, LCD
0x40000000 – 0x5FFFFFFF    512MB    Peripherals                     Mainly used for on-chip peripherals, e.g. AHB, APB peripherals
0x20000000 – 0x3FFFFFFF    512MB    SRAM                            Mainly used for data memory, e.g. on-chip SRAM, SDRAM
0x00000000 – 0x1FFFFFFF    512MB    Code                            Mainly used for program code, e.g. on-chip FLASH
* 512MB in total for the vendor specific memory and the PPB together

Private Peripheral Bus detail, from the top of the range down:
External PPB: ROM table; External PPB; Embedded trace macrocell; Trace port interface unit
Internal PPB: Reserved; System Control Space, including Nested Vectored Interrupt Controller (NVIC); Reserved; Flash patch and breakpoint unit; Data watchpoint and trace unit; Instrumentation trace macrocell

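The memory map above lends itself to a simple lookup: given a 32-bit address, find the architecturally defined region it falls in. The sketch below encodes the region boundaries from the memory map. The function name `region_of` and the sample addresses are hypothetical and chosen only for illustration.

```python
# (start, end, name) for each region, taken from the memory map.
REGIONS = (
    (0x00000000, 0x1FFFFFFF, "Code"),
    (0x20000000, 0x3FFFFFFF, "SRAM"),
    (0x40000000, 0x5FFFFFFF, "Peripherals"),
    (0x60000000, 0x9FFFFFFF, "External RAM"),
    (0xA0000000, 0xDFFFFFFF, "External device"),
    (0xE0000000, 0xE00FFFFF, "Private Peripheral Bus (PPB)"),
    (0xE0100000, 0xFFFFFFFF, "Vendor specific"),
)


def region_of(address: int) -> str:
    """Return the name of the memory-map region containing a 32-bit address."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    for start, end, name in REGIONS:
        if start <= address <= end:
            return name
    raise AssertionError("regions should cover the whole 4 GB space")


# Sample addresses chosen only for illustration.
for sample in (0x08000000, 0x20001000, 0x40020000, 0xE000E100):
    print(hex(sample), "->", region_of(sample))
```

The region sizes add up to exactly 4 GB, which matches the statement that the whole 32-bit address space is divided into these regions.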
Cortex-M4 Memory Map
 Code Region
 Primarily used to store program code
 Can also be used for data memory
 On-chip memory, such as on-chip FLASH

 SRAM Region
 Primarily used to store data, such as heaps and stacks
 Can also be used for program code
 On-chip memory; despite its name “SRAM”, the actual device could be SRAM, SDRAM or other types

 Peripheral Region
 Primarily used for peripherals, such as Advanced High-performance Bus (AHB) or Advanced Peripheral Bus (APB) peripherals
 On-chip peripherals

Cortex-M4 Memory Map

 External RAM Region


 Primarily used to store large data blocks, or memory caches
 Off-chip memory, slower than on-chip SRAM region

 External Device Region


 Primarily used to map to external devices
 Off-chip devices, such as SD card

 Internal Private Peripheral Bus (PPB)


 Used inside the processor core for internal control
 Within PPB, a special range of memory is defined as System Control Space (SCS)
 The Nested Vectored Interrupt Controller (NVIC) is part of SCS
