Module-1
Chapter -1
• Microprocessors versus Microcontrollers,
• ARM Embedded Systems: The ARM Design Philosophy,
• Embedded System Hardware,
• ARM Processor Fundamentals: Registers, Current Program
Status Register, Pipeline, Exceptions, Interrupts, and the
Vector Table
• Core extension
Difference between microcontroller and microprocessor
SL NO | MICROPROCESSOR | MICROCONTROLLER
1 | Acts as the heart of a computer system. | Acts as the heart of an embedded system.
2 | Memory and I/O must be connected externally, so the circuit is more complex. | Memory is available on-chip, so the circuit is more efficient.
3 | Has a zero status flag. | Has no zero status flag.
4 | Has fewer registers, so most operations are memory based. | Has more registers, so a program is easier to write.
5 | Generally does not have RAM, ROM, and I/O pins. | Is an 'all in one' processor, with RAM and I/O ports all on the chip.
6 | Usually uses its pins as a bus to interface to RAM, ROM, and peripheral devices, so the controlling bus is expandable at the board level. | The controlling bus is internal and not available to the board designer.
7 | Generally capable of being built into bigger, general-purpose applications. | Usually used for more dedicated applications.
8 | Generally does not have a power-saving system. | Has power-saving features such as idle mode or power-saving mode, so overall it uses less power.
9 | The overall cost of systems made with microprocessors is high because of the large number of external components required. | Made using complementary metal oxide semiconductor (CMOS) technology, so they are far cheaper than microprocessors.
10 | Processing speed of general microprocessors is above 1 GHz, so they work much faster than microcontrollers. | Processing speed is about 8 MHz to 50 MHz.
11 | Based on the von Neumann model, where program and data are stored in the same memory module. | Based on the Harvard architecture, where program memory and data memory are separate.
12 | Mainly used in personal computers. | Mainly used in washing machines, MP3 players, etc.
ARM Embedded System
• ARM is a key component of many successful 32-bit embedded systems.
• It is a family of central processing units (CPUs) used in music players, smartphones, wearables, tablets, and other consumer electronic devices.
• The architecture of the ARM processor was created by Advanced RISC Machines, hence the name ARM.
• Because it has a small instruction set and few transistors, it is very small in size and a good fit for small devices.
• It has low power consumption along with reduced complexity in its circuits.
• The main features of the ARM processor:
1. Multiprocessing systems: ARMv6K had the ability to support 4 CPUs along with its hardware.
2. Memory management
3. One-cycle execution time
4. Large number of registers
RISC (Reduced Instruction Set Computer)
• ARM core uses RISC architecture
• Features of RISC Processor are
1. It has relatively few instructions and few addressing modes.
2. Memory access is limited to load and store instructions.
3. All operations are done within the registers of the CPU.
4. It has fixed-length, easily decoded instruction format.
5. Single-cycle instruction execution.
6. Provides flexibility and intelligence in software rather than
hardware.
CISC (Complex Instruction Set Computer)
• Features of CISC Processor are
1. A large number of instructions, typically from 100 to 250 instructions.
2. Some instructions that perform specialized tasks and are used infrequently.
3. A large variety of addressing modes, typically from 5 to 20 different modes.
4. It has variable-length instruction formats.
5. It uses instructions that manipulate operands in memory.
Difference between RISC and CISC
SL NO | RISC | CISC
1 | RISC processors have simple instructions taking about one clock cycle. Performance is optimized with more focus on software. | CISC processors have complex instructions that take multiple clock cycles to execute. Performance is optimized with more focus on hardware.
2 | It has no memory unit and uses separate hardware to implement instructions. | It has a memory unit to implement complex instructions.
3 | It has a hardwired control unit. | It has a microprogramming unit.
4 | It has only a few instructions in the instruction set. | The instruction set has a variety of different instructions that can be used for complex operations.
5 | Multiple register sets are present. | Only has a single register set.
6 | RISC processors are highly pipelined. | They are normally not pipelined or less pipelined.
7 | The complexity lies with the compiler that generates the program code. | The complexity lies in the microprogram.
8 | Execution time is very low. | Execution time is very high.
9 | Code expansion can be a problem. | Code expansion is not a problem.
10 | Decoding of instructions is simple. | Decoding of instructions is complex.
11 | It does not require external memory for calculations. | It requires external memory for calculations.
12 | The most common RISC microprocessors are Alpha, ARC, ARM, AVR, MIPS, PA-RISC, PIC, Power Architecture, and SPARC. | Examples of CISC processors are the System/360, VAX, PDP-11, the Motorola 68000 family, and AMD and Intel x86 CPUs.
RISC is implemented with four major design rules
• Instructions –
1. Reduced number of instruction classes
2. Each instruction executes in a single cycle
3. The compiler synthesizes complicated operations by combining
several simple instructions
4. Each instruction is a fixed length
• Pipelines –
1. The processing of instructions is broken down into smaller units
that can be executed in parallel by pipelines.
2. Maximizes throughput
• Registers –
1. Large general-purpose register set
2. Any register can contain either data or an address
• Load-store architecture –
1. The processor operates on data held in registers
2. Separate load store instructions required to transfer
data between register and memory
ARM design philosophy
• Physical features that have driven the ARM processor
design:
1. Small, battery-powered devices: the processor is designed to be small
to reduce power consumption and extend battery operation.
2. High code density: useful for applications that have limited
on-board memory.
3. Price sensitivity: embedded systems use slow, low-cost memory devices.
4. Small die area: reduce the area of the die taken up by the embedded
processor.
Instruction set for Embedded systems
• The ARM instruction set differs from the pure RISC definition in several ways that make the
ARM instruction set suitable for embedded applications:
• Variable cycle execution for certain instructions: Not every ARM instruction executes in a
single cycle. For example, load-store-multiple instructions vary in the number of execution
cycles depending upon the number of registers being transferred.
• Inline barrel shifter leading to more complex instructions: The inline barrel shifter is a
hardware component that pre-processes one of the input registers before it is used by an
instruction.
• Thumb 16-bit instruction set: ARM enhanced the processor core by adding a second 16-bit
instruction set called Thumb that permits the ARM core to execute either 16- or 32-bit
instructions.
• Conditional execution: An instruction is only executed when a specific condition has been
satisfied.
• Enhanced instructions: The enhanced digital signal processor (DSP) instructions were added
to the standard ARM instruction set to support fast 16×16-bit multiplier operations and
saturation.
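The inline barrel shifter described above can be illustrated with a small software model. This is only a sketch in Python, not ARM code: the function names barrel_shift and add_shifted are hypothetical, shift amounts are restricted to 0 to 31 for simplicity, and the carry-out that real hardware produces is ignored.

```python
MASK32 = 0xFFFFFFFF  # registers are modelled as unsigned 32-bit values


def barrel_shift(value, op, amount):
    """Model a shift applied to one operand before it reaches the ALU."""
    if not 0 <= amount <= 31:
        raise ValueError("this sketch only handles shift amounts 0..31")
    value &= MASK32
    if amount == 0:
        return value
    if op == "LSL":  # logical shift left
        return (value << amount) & MASK32
    if op == "LSR":  # logical shift right
        return value >> amount
    if op == "ASR":  # arithmetic shift right: copy the sign bit downwards
        if value & 0x80000000:
            return ((value >> amount) | (MASK32 << (32 - amount))) & MASK32
        return value >> amount
    if op == "ROR":  # rotate right
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    raise ValueError("unknown shift operation: " + op)


def add_shifted(rn, rm, op, amount):
    """Model an add whose second operand passes through the barrel shifter."""
    return (rn + barrel_shift(rm, op, amount)) & MASK32


# r1 + (r1 << 2) multiplies r1 by 5 in a single operation
print(add_shifted(5, 5, "LSL", 2))
```

Combining the shift with the add in one step is what lets a single instruction compute expressions such as a multiply by 5.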
Embedded system hardware
Figure 1.2 shows a typical embedded device based on an ARM core. The device
has four main hardware components:
1. The ARM processor: It controls the embedded device. An ARM processor
comprises a core (the execution engine that processes instructions and
manipulates data) plus the surrounding components that interface with a bus.
2. Controllers: Two commonly found controllers are interrupt and memory
controllers.
❖ Memory Controllers: Memory controllers connect different types of memory to the
processor bus. On power-up a memory controller is configured in hardware to allow
certain memory devices to be active.
❖ An interrupt controller: An interrupt controller determines which peripheral or device
can interrupt the processor at any specific time.
3. Peripherals: Embedded systems that interact with the outside world need
some form of peripheral device. A peripheral device performs input and
output functions for the chip by connecting to other devices
4. Bus: A bus is used to communicate between different parts of the
device.
• Embedded systems use an on-chip bus called AMBA (Advanced
Microcontroller Bus Architecture) that allows different peripheral
devices to be connected to the ARM core.
• The AMBA buses introduced were the ARM System Bus (ASB), the ARM
Peripheral Bus (APB), and the ARM High Performance Bus (AHB).
• AHB-APB bridge: connects the ARM High Performance Bus (AHB) to the ARM
Peripheral Bus (APB).
• AHB-external bridge: connects the ARM High Performance Bus (AHB) to
external devices.
• AHB arbiter: The arbiter is used to ensure that, at any point in time,
only one bus master has control of the bus.
Memory
• An embedded system has some form of memory to store and execute the code.
• Memory has specific characteristics, such as hierarchy, width, and type.
Hierarchy
• Cache is placed between main memory and the core. It is used to speed up data transfer between the
processor and main memory.
• The main memory is around 256 KB to 256 MB, depending on the application.
• Secondary storage is the largest and slowest form of memory (600MB to 60GB).
Width
• The memory width is the number of bits the memory returns on each access. It may be 8, 16, 32, or 64 bits.
Types
• Read only memory (ROM)
• Flash ROM
• Dynamic Random Access Memory (DRAM)
• Static Random Access Memory (SRAM)
• Synchronous dynamic Random Access Memory (SDRAM)
ARM bus technology
• Embedded systems use different bus technologies than those designed for
x86 PCs.
• Embedded devices use an on-chip bus that is internal to the chip and that
allows different peripheral devices to be interconnected with an ARM core.
• There are two different classes of devices attached to the bus.
• The ARM processor core is a bus master—a logical device capable of initiating a
data transfer with another device across the same bus.
• Peripherals tend to be bus slaves—logical devices capable only of responding to a
transfer request from a bus master device.
• A bus has two architecture levels.
• The first is a physical level that covers the electrical characteristics and bus width
(16, 32, or 64 bits).
• The second level deals with protocol—the logical rules that govern the
communication between the processor and a peripheral.
AMBA Bus Protocol
• The Advanced Microcontroller Bus Architecture (AMBA) was introduced in
1996 and has been widely adopted as the on-chip bus architecture used
for ARM processors.
• The AMBA buses introduced were the ARM System Bus (ASB), the ARM Peripheral
Bus (APB), and the ARM High Performance Bus (AHB).
• AHB provides higher data throughput than ASB because it is based on a
centralized multiplexed bus scheme rather than the ASB bidirectional bus
design.
• This change allows the AHB bus to run at higher clock speeds and to be
the first ARM bus to support widths of 64 and 128 bits.
• ARM has introduced two variations on the AHB bus:
• Multi-layer AHB: the Multi-layer AHB bus allows multiple active bus masters.
• AHB-Lite: AHB-Lite is a subset of the AHB bus and it is limited to a single bus
master.
Peripherals
• A peripheral device performs input and output functions for the chip by connecting to other devices that are
off-chip.
• Each peripheral device usually performs a single function.
• All ARM peripherals are memory mapped: the programming interface is a set of memory-addressed registers.
• Specialized peripherals called controllers implement higher levels of functionality.
• Two types of controllers
1. Memory controller
• Connects different types of memory to the processor bus.
• On power-up a memory controller is configured in hardware to allow certain memory devices to
be active.
• Some memory devices must be set up by software.
2. Interrupt controller
• It provides a programmable governing policy that allows software to determine which peripheral or device can
interrupt the processor at any specific time by setting the appropriate bits in the interrupt controller
registers.
• Two types of interrupt controller
1. Standard interrupt controller (SIC)
2. Vector interrupt controller (VIC)
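The idea of enabling interrupt sources by setting bits in an interrupt controller register can be sketched as follows. This is a simplified software model, not a real controller: the class InterruptController, its register names, and the one-bit-per-source layout are assumptions made for illustration.

```python
class InterruptController:
    """Toy model: one enable bit and one raw-request bit per interrupt source."""

    def __init__(self):
        self.enable = 0  # hypothetical enable register, bit n enables source n
        self.raw = 0     # hypothetical raw status register, bit n = source n requesting

    def enable_source(self, n):
        self.enable |= 1 << n

    def disable_source(self, n):
        self.enable &= ~(1 << n)

    def raise_source(self, n):
        self.raw |= 1 << n

    def status(self):
        # Only sources that are both requesting and enabled reach the processor
        return self.raw & self.enable


ic = InterruptController()
ic.raise_source(3)         # peripheral 3 requests service
print(bin(ic.status()))    # still masked by the controller
ic.enable_source(3)        # software sets the enable bit
print(bin(ic.status()))    # request is now visible to the processor
```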
Embedded system software
• An embedded system needs software to drive it. Figure 1.4 shows four typical
software components required to control an embedded device.
Figure 1.4: four typical software components required to control an embedded device.
1. Initialization code:
• Initialization code (or boot code) takes the processor from the
reset state to a Run state.
• It usually configures the memory controller and processor caches
and initializes some devices.
• The initialization code handles a number of administrative tasks
namely initial hardware configuration, diagnostics, and booting.
❖ Initial hardware configuration involves setting up the target platform
so it can boot an image.
❖ Diagnostics are often embedded in the initialization code. Diagnostic
code tests the system by exercising the hardware target to check if the
target is in working order.
❖ Booting involves loading an image and handing control over to that
image.
2. Operating System
• The operating system provides an infrastructure to control applications and
manage hardware system resources.
• An operating system organizes the system resources: the peripherals,
memory, and processing time.
• ARM processors support over 50 operating systems. We can divide operating
systems into two main categories: real-time operating systems (RTOSs) and
platform operating systems.
3. Device drivers
• The device drivers are the third component shown in Figure 1.4. They provide
a consistent software interface to the peripherals on the hardware device.
4. Application
• Finally, an application performs one of the tasks required for a device.
• For example, a mobile phone might have a diary application. There may be
multiple applications running on the same device, controlled by the operating
system.
ARM Processor Fundamentals
• ARM Core Dataflow Model
• Registers and Current Program Status Register
• Pipeline
• Exceptions, Interrupts, and the Vector Table
• Core Extensions
Data flow diagram
• Fig shows the ARM core dataflow model.
• Functional units are connected by data buses,
• The arrows represent the flow of data,
• The lines represent the buses, and the boxes
represent either an operation unit or a storage
area.
• Data enters the processor core through the Data
bus.
• The data may be an instruction to execute or
a data item.
• The instruction decoder translates instructions
before they are executed.
• The ARM processor uses load-store architecture.
• Load instructions copy data from memory to registers in
the core, and conversely the store instructions copy data from
registers to memory.
• Data items are placed in the register file, a storage bank made up of 32-bit registers.
• Since the ARM core is a 32-bit processor, most instructions treat the registers as holding signed or
unsigned 32-bit values.
• The sign extend hardware converts signed 8-bit and 16-bit numbers to 32-bit values as they are read from
memory and placed in a register.
• ARM instructions typically have two source registers, Rn and Rm, and a single result or destination register,
Rd. Source operands are read from the register file using the internal buses A and B, respectively.
• The ALU (arithmetic logic unit) or MAC (multiply-accumulate unit) takes the register values Rn and Rm
from the A and B buses and computes a result. Data processing instructions write the result in Rd directly
to the register file.
• Load and store instructions use the ALU to generate an address to be held in the address register and
broadcast on the Address bus.
• One important feature of the ARM is that register Rm alternatively can be preprocessed in the barrel
shifter before it enters the ALU. Together the barrel shifter and ALU can calculate a wide range of
expressions and addresses.
• After passing through the functional units, the result in Rd is written back to the register file using the
Result bus.
• For load and store instructions the incrementer updates the address register before the core reads or
writes the next register value from or to the next sequential memory location.
• The processor continues executing instructions until an exception or interrupt changes the normal
execution flow.
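The sign extend step in the dataflow above can be shown with a short calculation. This is a sketch in Python rather than a description of the hardware; the function name sign_extend is hypothetical, and results are shown as unsigned 32-bit register contents.

```python
def sign_extend(value, bits):
    """Extend a signed `bits`-wide value to a 32-bit register value."""
    value &= (1 << bits) - 1   # keep only the low `bits` bits read from memory
    sign = 1 << (bits - 1)     # position of the sign bit
    # XOR then subtract converts to a signed integer; the mask maps it to 32 bits
    return ((value ^ sign) - sign) & 0xFFFFFFFF


print(hex(sign_extend(0xFF, 8)))     # signed byte -1 becomes 0xffffffff
print(hex(sign_extend(0x7F, 8)))     # signed byte +127 stays 0x7f
print(hex(sign_extend(0x8000, 16)))  # signed halfword -32768 becomes 0xffff8000
```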
Registers
• General-purpose registers hold either data
or an address.
• They are identified by R prefixed to the register number.
• Up to 18 registers are active at a time: 16 data registers and 2 processor status
registers.
• 3 registers are assigned a special function:
• R13 is used as the stack pointer (sp) and stores the
address of the head of the stack.
• R14 is the link register (lr), where the core puts the return address
whenever it calls a subroutine.
• R15 is the program counter (pc) and contains the address of the next
instruction to be fetched by the processor.
• There are two program status registers:
• CPSR : current program status register
• SPSR : saved program status register
Current program status register (CPSR)
• The ARM core uses the cpsr to monitor and control internal operations.
• It is a 32-bit register divided into four fields, each 8 bits wide: flags, status, extension, and
control.
• In current designs the extension and status fields are reserved for future use.
• The control field contains the processor mode, state, and interrupt mask
bits.
• The flags field contains the condition flags
1. Processor Modes
• The processor mode determines which registers are active and the access
rights to the cpsr register itself.
• Each processor mode is either privileged or nonprivileged:
• A privileged mode allows full read-write access to the cpsr.
• Nonprivileged mode only allows read access to the control field in the cpsr but still
allows read-write access to the condition flags.
• There are seven processor modes in total: six privileged modes (abort, fast
interrupt request, interrupt request, supervisor, system, and undefined) and
one nonprivileged mode (user).
• The processor enters abort mode when there is a failed attempt to access memory.
• Fast interrupt request and interrupt request modes correspond to the two
interrupt levels available on the ARM processor.
• Supervisor mode is the mode that the processor is in after reset and is generally
the mode that an operating system kernel operates in.
Processor Modes (cont…)
• System mode is a special version of user mode that allows full read-write
access to the cpsr.
• Undefined mode is used when the processor encounters an instruction that
is undefined or not supported by the implementation.
• User mode is used for programs and applications.
Interrupt Mask
• Interrupt masks are used to stop specific interrupt requests from interrupting the
processor.
• There are two interrupt request levels available on the ARM processor core—interrupt
request (IRQ) and fast interrupt request (FIQ).
• The I bit masks IRQ when set to binary 1, and similarly the F bit masks FIQ when set to
binary 1.
State and Instruction Sets
• The state of the core determines which instruction set is being executed.
• There are three instruction sets: ARM, Thumb, and Jazelle.
• The ARM instruction set is only active when the processor is in ARM state (32 bit).
• Similarly the Thumb instruction set is only active when the processor is in Thumb state (16 bit). Once
in Thumb state the processor is executing purely Thumb 16-bit instructions.
• The ARM designers introduced a third instruction set called Jazelle. Jazelle executes 8-bit instructions.
Condition Flags
• Condition flags are updated by comparisons and by the result of any arithmetic or logical
operation that specifies the S instruction suffix.
• For example, if a SUBS subtract instruction results in a register value of zero, then
the Z flag in the cpsr is set. This particular subtract instruction specifically
updates the cpsr.
• In the cpsr example shown in the figure, the processor is in supervisor
(SVC) mode since mode[4:0] is equal to binary 10011.
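The cpsr fields described above can be decoded in software as a check on the bit layout. The sketch below assumes the usual layout (N, Z, C, V in bits 31 to 28, I in bit 7, F in bit 6, T in bit 5, mode in bits 4 to 0); the function name decode_cpsr and the dictionary format are hypothetical.

```python
# Mode encodings for mode[4:0]
MODES = {
    0b10000: "user",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "supervisor",
    0b10111: "abort",
    0b11011: "undefined",
    0b11111: "system",
}


def decode_cpsr(cpsr):
    """Split a 32-bit cpsr value into condition flags, mask bits, state and mode."""
    return {
        "N": bool((cpsr >> 31) & 1),  # negative
        "Z": bool((cpsr >> 30) & 1),  # zero
        "C": bool((cpsr >> 29) & 1),  # carry
        "V": bool((cpsr >> 28) & 1),  # overflow
        "I": bool((cpsr >> 7) & 1),   # 1 = IRQ masked
        "F": bool((cpsr >> 6) & 1),   # 1 = FIQ masked
        "T": bool((cpsr >> 5) & 1),   # 1 = Thumb state
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }


# 0xD3 = binary 1101 0011: I and F set, ARM state, mode bits 10011
print(decode_cpsr(0x000000D3))
```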
Banked registers
• There are 37 registers in the register file. Of these, 20 are hidden from a program at different times.
• These hidden registers are called banked registers and are identified by the shading in the diagram.
• They are available only when the processor is in a particular mode; for example, abort mode
has banked registers r13_abt, r14_abt and spsr_abt.
• Every processor mode except user mode can change mode by writing directly to the mode
bits of the cpsr.
• All processor modes except system mode have a set of associated banked registers that are
a subset of the main 16 registers.
• A banked register maps one-to one onto a user mode register. If you change processor
mode, a banked register from the new mode will replace an existing register.
• For example, when the processor is in the interrupt request mode, the instructions you
execute still access registers named r13 and r14.
• However, these registers are the banked registers r13_irq and r14_irq. The user mode
registers r13_usr and r14_usr are not affected by instructions referencing these
registers.
• The processor mode can be changed by a program that writes directly to the cpsr
• When power is applied to the core, it starts in supervisor mode, which is privileged. Starting
in a privileged mode is useful since initialization code can use full access to the cpsr.
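The way a register name maps onto a different physical register in each mode can be sketched with a lookup. This is an illustrative model only: the function physical_register and the short mode names are hypothetical, and user-mode registers are shown without the _usr suffix. It assumes the usual banking scheme in which fast interrupt request mode banks r8 to r14 and the other exception modes bank r13 and r14.

```python
# Register numbers that each mode replaces with its own banked copy
BANKED = {
    "fiq": tuple(range(8, 15)),  # r8..r14
    "irq": (13, 14),
    "svc": (13, 14),
    "abt": (13, 14),
    "und": (13, 14),
}


def physical_register(mode, n):
    """Return the name of the physical register that rN refers to in `mode`."""
    if not 0 <= n <= 15:
        raise ValueError("register number must be 0..15")
    if n in BANKED.get(mode, ()):  # user and system modes have no banked copies
        return "r%d_%s" % (n, mode)
    return "r%d" % n


print(physical_register("irq", 13))   # banked stack pointer for irq mode
print(physical_register("irq", 0))    # shared with user mode
print(physical_register("user", 13))  # user mode stack pointer

# 16 shared registers plus the banked copies gives 31 general registers;
# adding the cpsr and five spsr registers gives the 37 in the register file.
all_regs = {physical_register(m, n)
            for m in ("user", "fiq", "irq", "svc", "abt", "und")
            for n in range(16)}
print(len(all_regs) + 1 + 5)
```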
Banked Registers(cont…)
• Figure illustrates what happens when an interrupt forces a
mode change
• The figure shows the core changing from user mode to
interrupt request mode, which happens when an
interrupt request occurs due to an external device
raising an interrupt to the processor core.
• This change causes user registers r13 and r14 to be banked.
The user registers are replaced with registers r13_irq and
r14_irq, respectively.
• Note r14_irq contains the return address and r13_irq
contains the stack pointer for interrupt request mode.
• Figure 1.7 also shows a new register appearing in interrupt
request mode: the saved program status register (spsr),
which stores the previous mode’s cpsr.
• The saving of the cpsr only occurs when an exception or
interrupt is raised
State and instruction sets
• The state of the core determines which instruction set is being executed
• There are 3 instruction sets :
1. ARM – 32-bit instructions
2. Thumb – 16-bit instructions
3. Jazelle – 8-bit instructions
• When both J and T bits are 0, the processor is in ARM state and executes ARM instructions .
Conditional execution
• Conditional execution controls whether processor has to execute an instruction
or not.
• Most instructions have a condition attribute that determines if the core will
execute it based on the setting of the condition flags.
• Prior to execution, the processor compares the condition attribute with the
condition flags in the cpsr. If they match, then the instruction is executed;
otherwise the instruction is ignored.
• The condition attribute is post fixed to the instruction mnemonic, which is
encoded into the instruction. Table lists the conditional execution code
mnemonics.
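The comparison between a condition attribute and the condition flags can be written out as a small table of tests. This is a sketch in Python of the standard condition mnemonics; the function name condition_passed is hypothetical, and the flags are passed in as booleans rather than read from a real cpsr.

```python
def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with condition `cond` would execute."""
    checks = {
        "EQ": z,                      # equal
        "NE": not z,                  # not equal
        "CS": c,                      # carry set / unsigned higher or same
        "CC": not c,                  # carry clear / unsigned lower
        "MI": n,                      # negative
        "PL": not n,                  # positive or zero
        "VS": v,                      # overflow
        "VC": not v,                  # no overflow
        "HI": c and not z,            # unsigned higher
        "LS": (not c) or z,           # unsigned lower or same
        "GE": n == v,                 # signed greater than or equal
        "LT": n != v,                 # signed less than
        "GT": (not z) and (n == v),   # signed greater than
        "LE": z or (n != v),          # signed less than or equal
        "AL": True,                   # always
    }
    return bool(checks[cond])


# After a comparison of two equal values the Z flag is set:
print(condition_passed("EQ", n=False, z=True, c=True, v=False))  # executes
print(condition_passed("NE", n=False, z=True, c=True, v=False))  # ignored
```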
Pipeline
• A pipeline is the mechanism by which the processor fetches the next instruction
while other instructions are being decoded and executed. It speeds
up execution.
• Figure shows a three-stage pipeline:
1. Fetch: loads an instruction from memory
2. Decode: identifies the instruction to be executed
3. Execute: processes the instruction and writes the result back to a register
• Figure below illustrates the pipeline using a simple example. It shows a sequence of three
instructions being fetched, decoded, and executed by the processor.
• The three instructions are placed into the pipeline sequentially.
• In the first cycle the core fetches the ADD instruction from memory.
• In the second cycle the core fetches the SUB instruction and decodes the ADD instruction.
• In the third cycle, both the SUB and ADD instructions are moved along the pipeline. The ADD instruction is
executed, the SUB instruction is decoded, and the CMP instruction is fetched. This procedure is called filling
the pipeline.
• As the pipeline length increases, the amount of work done at each stage is
reduced, which allows the processor to run at a higher operating frequency and so increases performance.
• Pipeline Executing Characteristics
• An ARM pipeline has not processed an instruction until it passes completely through
the execute stage.
• For example, an ARM7 pipeline (with three stages) has executed an instruction only when
the fourth instruction is fetched.
• The following Figure shows an instruction sequence on an ARM7 pipeline.
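The filling of a three-stage pipeline can be traced cycle by cycle with a short simulation. This is a sketch of an ideal pipeline with no stalls or branches; the function name pipeline_trace and the dictionary output format are hypothetical.

```python
def pipeline_trace(instructions, stages=("fetch", "decode", "execute")):
    """Return, for each cycle, which instruction occupies each pipeline stage."""
    depth = len(stages)
    trace = []
    for cycle in range(len(instructions) + depth - 1):
        row = {}
        for s, name in enumerate(stages):
            idx = cycle - s  # an instruction enters stage s, s cycles after its fetch
            row[name] = instructions[idx] if 0 <= idx < len(instructions) else None
        trace.append(row)
    return trace


for cycle, row in enumerate(pipeline_trace(["ADD", "SUB", "CMP"]), start=1):
    print(cycle, row)
```

In the third cycle of the trace the pipeline is full: CMP is fetched, SUB is decoded, and ADD is executed.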
EXCEPTIONS, INTERRUPTS AND THE VECTOR TABLE:
• When an exception or interrupt occurs, the processor sets the pc to a specific memory address. The
address is within a special address range called the vector table.
• The entries in the vector table are instructions that branch to specific routines designed to handle a
particular exception or interrupt.
• The memory map address 0x00000000 (or in some processors starting at the offset 0xffff0000) is
reserved for the vector table, a set of 32-bit words.
• Reset vector is the location of the first instruction executed by the processor
when power is applied. This instruction branches to the initialization code.
• Undefined instruction vector is used when the processor cannot decode an
instruction.
• Software interrupt vector is called when you execute a SWI instruction.
• Prefetch abort vector occurs when the processor attempts to fetch an
instruction from an address without the correct access permissions.
• Data abort vector is similar to a prefetch abort, but is raised when an instruction
attempts to access data memory without the correct access permissions.
• Interrupt request vector is used by external hardware to interrupt the normal
execution flow of the processor. It can only be raised if IRQs are not masked in
the CPSR.
• Fast interrupt request vector is similar to the interrupt request, but is reserved
for hardware requiring faster response times. It can only be raised if FIQs are not
masked in the CPSR.
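The vector table layout can be summarised as a set of fixed offsets from the table base. The sketch below uses the standard offsets for the seven exceptions listed above (offset 0x14 is reserved); the names VECTOR_OFFSETS and vector_address are hypothetical.

```python
# Offset of each exception vector from the base of the vector table
VECTOR_OFFSETS = {
    "reset": 0x00,
    "undefined_instruction": 0x04,
    "software_interrupt": 0x08,
    "prefetch_abort": 0x0C,
    "data_abort": 0x10,
    # 0x14 is reserved
    "interrupt_request": 0x18,
    "fast_interrupt_request": 0x1C,
}


def vector_address(exception, high_vectors=False):
    """Return the address the pc is set to when `exception` is taken."""
    base = 0xFFFF0000 if high_vectors else 0x00000000
    return base + VECTOR_OFFSETS[exception]


print(hex(vector_address("interrupt_request")))
print(hex(vector_address("fast_interrupt_request", high_vectors=True)))
```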
CORE EXTENSIONS:
• Core extensions are the standard hardware components placed next to the ARM
core.
• They improve performance, manage resources, and provide extra functionality
and are designed to provide flexibility in handling particular applications.
• Each ARM family has different extensions available. There are three hardware
extensions: cache and tightly coupled memory, memory management, and the
coprocessor interface.
Cache and Tightly Coupled Memory:
• The cache is a block of fast memory placed between main memory and the
core.
• It allows for more efficient fetches from some memory types.
• With a cache the processor core can run for the majority of the time without
having to wait for data from slow external memory.
• Most ARM-based embedded systems use a single-level cache internal to the processor.
• ARM has two forms of cache. The first is found attached to the Von Neumann-style
cores. It combines both data and instruction into a single unified cache, as shown in the
following Figure.
Figure: Von Neumann Architecture with Cache
• The second form, attached to the Harvard-style cores, has separate caches for data and
instruction, as shown in the following Figure.
• A cache provides an overall increase in performance, but at the expense of predictable execution.
Real-time systems, however, require code execution to be deterministic: the time taken for loading
and storing instructions or data must be predictable.
• This is achieved using a form of memory called tightly coupled memory (TCM). TCM is fast
SRAM located close to the core and guarantees the clock cycles required to fetch instructions
or data.
• TCMs appear as memory in the address map and can be accessed as fast memory.
• By combining both technologies, ARM processors can have both improved performance and
predictable real-time response. The following Figure shows an example core with a
combination of caches and TCMs.
Figure: Harvard Architecture with Caches and TCMs
Memory Management:
• Embedded systems often use multiple memory devices. It is usually necessary to have a
method to organize these devices and protect the system from applications trying to
make inappropriate accesses to hardware. This is achieved with the assistance of
memory management hardware.
• ARM cores have three different types of memory management hardware: nonprotected memory,
a memory protection unit (MPU), and a memory management unit (MMU).
• Nonprotected memory is fixed and provides very little flexibility. It is normally used for
small, simple embedded systems that require no protection from rogue applications.
• A memory protection unit (MPU) employs a simple system that uses a limited number of
memory regions. These regions are controlled with a set of special coprocessor
registers, and each region is defined with specific access permissions. This type of
memory management is used for systems that require memory protection but don't
have a complex memory map.
• A memory management unit (MMU) uses a set of translation tables to
provide fine-grained control over memory. These tables are stored in main memory
and provide a virtual-to-physical address map as well as access permissions. MMUs are
designed for more sophisticated platform operating systems that support multitasking.
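The role of translation tables in an MMU can be sketched with a simplified lookup. This is a conceptual model, not the ARM table format: it assumes a single-level table, 4 KB pages, and one write-permission bit per page, and the names PAGE_SIZE, AccessFault, and translate are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class AccessFault(Exception):
    """Raised when an access has no mapping or violates the page permissions."""


def translate(table, vaddr, write=False):
    """Map a virtual address to a physical address using a translation table.

    `table` maps a virtual page number to (physical frame number, writable).
    """
    page, offset = divmod(vaddr, PAGE_SIZE)
    entry = table.get(page)
    if entry is None:
        raise AccessFault("no mapping for virtual page %d" % page)
    frame, writable = entry
    if write and not writable:
        raise AccessFault("write to read-only page %d" % page)
    return frame * PAGE_SIZE + offset


table = {0: (5, False), 1: (2, True)}   # page 0 is read-only, page 1 is read-write
print(hex(translate(table, 0x0010)))              # read from page 0
print(hex(translate(table, 0x1004, write=True)))  # write to page 1
```

A failed lookup in this model plays the role of the abort exceptions described earlier: the access does not complete and control passes to a handler.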
Coprocessors:
• Coprocessors can be attached to the ARM processor. A coprocessor extends the processing
features of a core by extending the instruction set or by providing configuration registers.
More than one coprocessor can be added to the ARM core via the coprocessor interface.
• The coprocessor can be accessed through a group of dedicated ARM instructions that
provide a load-store type interface. For example, the ARM processor
uses coprocessor 15 registers to control the cache, TCMs, and memory management.
• The coprocessor can also extend the instruction set by providing a specialized group of new
instructions. For example, there is a set of specialized instructions that can be added to
the standard ARM instruction set to process vector floating-point (VFP) operations.
• These new instructions are processed in the decode stage of the ARM pipeline.
• If the decode stage sees a coprocessor instruction, then it offers it to the relevant
coprocessor.
• If the coprocessor is not present or doesn't recognize the instruction, then the ARM takes
an undefined instruction exception, which allows you to emulate the behaviour of the
coprocessor in software.