Computer Organization and Architecture
Computer Architecture refers to those attributes of a system that have a direct impact
on the logical execution of a program. Examples:
the instruction set
the number of bits used to represent various data types
I/O mechanisms
memory addressing techniques
Computer Organization refers to the operational units and their interconnections that
realize the architectural specifications. Examples are things that are transparent to the
programmer:
control signals
interfaces between computer and peripherals
the memory technology being used
So, for example, the fact that a multiply instruction is available is a computer
architecture issue. How that multiply is implemented is a computer organization issue.
Addressing Modes
The most common addressing techniques are:
• Immediate
• Direct
• Indirect
• Register
• Register indirect
• Displacement
• Stack
Immediate Addressing
The simplest form of addressing is immediate addressing, in which the operand value is
present in the instruction:
Operand = A
This mode can be used to define and use constants or to set initial values of variables.
The advantage of immediate addressing is that no memory reference other than the
instruction fetch is required to obtain the operand, thus saving one memory or cache cycle in
the instruction cycle. The disadvantage is that the size of the number is restricted to the size
of the address field, which, in most instruction sets, is small compared with the word length.
Direct Addressing
A very simple form of addressing is direct addressing, in which the address field contains the
effective address of the operand:
EA = A
The advantage is it requires only one memory reference and no special calculation. The
disadvantage is that it provides only a limited address space.
Indirect Addressing
With direct addressing, the length of the address field is usually less than the word length,
thus limiting the address range. One solution is to have the address field refer to the address
of a word in memory, which in turn contains a full-length address of the operand. This is
known as indirect addressing:
EA = (A)
Register Addressing
Register addressing is similar to direct addressing. The only difference is that the address
field refers to a register rather than a main memory address:
EA = R
To clarify, if the contents of a register address field in an instruction is 5, then register R5 is
the intended address, and the operand value is contained in R5.
The advantages of register addressing are that (1) only a small address field is needed in the
instruction, and (2) no time-consuming memory references are required because the memory
access time for a register internal to the processor is much less than that for a main memory
address. The disadvantage of register addressing is that the address space is very limited.
Register Indirect Addressing
Just as register addressing is analogous to direct addressing, register indirect addressing is
analogous to indirect addressing. In both cases, the only difference is whether the address
field refers to a memory location or a register. Thus, for register indirect address,
EA = (R)
The advantages and limitations of register indirect addressing are basically the same as for
indirect addressing. In both cases, the address space limitation (limited range of addresses) of
the address field is overcome by having that field refer to a word-length location containing
an address. In addition, register indirect addressing uses one less memory reference than
indirect addressing.
Displacement Addressing
A very powerful mode of addressing combines the capabilities of direct addressing and
register indirect addressing. We will refer to this as displacement addressing:
EA = A + (R)
Displacement addressing requires that the instruction have two address fields, at least one of
which is explicit. The value contained in one address field (value = A) is used directly. The
other address field, or an implicit reference based on opcode, refers to a register whose
contents are added to A to produce the effective address.
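The addressing modes described so far can be compared side by side with a small sketch. The
memory contents, register names and the operand() helper below are illustrative assumptions,
not part of any real instruction set; the sketch only mirrors the formulas Operand = A, EA = A,
EA = (A), EA = R, EA = (R) and EA = A + (R).

```python
# Toy machine state. All addresses, register names and values are assumed for illustration.
memory = {100: 200, 200: 42, 105: 7}   # address -> contents
registers = {"R1": 100, "R5": 9}       # register name -> contents


def operand(mode, a=None, r=None):
    """Return the operand selected by an addressing mode.

    a is the value of the address field A; r names the register R.
    """
    if mode == "immediate":          # Operand = A
        return a
    if mode == "direct":             # EA = A
        return memory[a]
    if mode == "indirect":           # EA = (A)
        return memory[memory[a]]
    if mode == "register":           # EA = R
        return registers[r]
    if mode == "register_indirect":  # EA = (R)
        return memory[registers[r]]
    if mode == "displacement":       # EA = A + (R)
        return memory[a + registers[r]]
    raise ValueError(f"unknown addressing mode: {mode}")
```

For example, operand("indirect", a=100) makes two memory references (first location 100, then
location 200), while operand("register_indirect", r="R1") makes only one, which matches the
comparison made above between indirect and register indirect addressing.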
Stack Addressing
A stack is a linear array of locations, sometimes referred to as a pushdown list or
last-in-first-out queue. It occupies a reserved block of locations.
Items are appended to the top of the stack so that, at any given time, the block is partially
filled. Associated with the stack is a pointer whose value is the address of the top of the stack.
Alternatively, the top two elements of the stack may be in processor registers, in which case
the stack pointer references the third element of the stack.
The stack pointer is maintained in a register. Thus, references to stack locations in memory
are in fact register indirect addresses. The stack mode of addressing is a form of implied
addressing. The machine instructions need not include a memory reference but implicitly
operate on the top of the stack.
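A minimal sketch of stack addressing is given below. The class name, the block size and the
choice of a stack that grows toward lower addresses are assumptions made for illustration; the
point is only that push, pop and a zero-address ADD never name a memory address and instead
work through the stack pointer.

```python
class StackMachine:
    """Stack held in a reserved block of memory and reached through a stack pointer."""

    def __init__(self, size=8):
        self.memory = [0] * size   # the reserved block of locations
        self.size = size
        self.sp = size             # empty stack: SP points just past the block

    def push(self, value):
        if self.sp == 0:
            raise OverflowError("reserved stack block is full")
        self.sp -= 1
        self.memory[self.sp] = value      # register indirect store through SP

    def pop(self):
        if self.sp == self.size:
            raise IndexError("stack is empty")
        value = self.memory[self.sp]      # register indirect load through SP
        self.sp += 1
        return value

    def add(self):
        """Zero-address ADD: both operands and the result are implied to be on the stack."""
        self.push(self.pop() + self.pop())


machine = StackMachine()
machine.push(2)
machine.push(3)
machine.add()
print(machine.pop())   # 5
```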
INSTRUCTION PIPELINING
To improve the performance of a CPU we have two options: 1) Improve the hardware by
introducing faster circuits. 2) Arrange the hardware such that more than one operation can be
performed at the same time. Since there is a limit on the speed of hardware and the cost of
faster circuits is quite high, we have to adopt the 2nd option.
As computer systems evolve, greater performance can be achieved by taking advantage of
improvements in technology, such as faster circuitry, use of multiple registers rather than a
single accumulator, and the use of a cache memory. Another organizational approach is
instruction pipelining in which new inputs are accepted at one end before previously accepted
inputs appear as outputs at the other end.
(Two stage instruction pipelining)
The pipeline has two independent stages. The first stage fetches an instruction and buffers it.
When the second stage is free, the first stage passes it the buffered instruction. While the
second stage is executing the instruction, the first stage takes advantage of any unused
memory cycles to fetch and buffer the next instruction. This is called instruction prefetch or
fetch overlap. This process speeds up instruction execution. If the fetch and execute
stages were of equal duration, the instruction cycle time would be halved. However, if we
look more closely at this pipeline, we will see that this doubling of execution rate is unlikely
for three reasons:
1. The execution time will generally be longer than the fetch time. Thus, the fetch stage may
have to wait for some time before it can empty its buffer.
2. A conditional branch instruction makes the address of the next instruction to be fetched
unknown. Thus, the fetch stage must wait until it receives the next instruction address from
the execute stage. The execute stage may then have to wait while the next instruction is
fetched.
3. When a conditional branch instruction is passed on from the fetch to the execute stage, the
fetch stage fetches the next instruction in memory after the branch instruction. Then, if the
branch is not taken, no time is lost. If the branch is taken, the fetched instruction must be
discarded and a new instruction fetched.
To gain further speedup, the pipeline must have more stages.
Let us consider the following decomposition of the instruction processing.
1. Fetch instruction (FI): Read the next expected instruction into a buffer.
2. Decode instruction (DI): Determine the opcode and the operand specifiers.
3. Calculate operands (CO): Calculate the effective address of each source operand. This may
involve displacement, register indirect, indirect, or other forms of address calculation.
4. Fetch operands (FO): Fetch each operand from memory.
5. Execute instruction (EI): Perform the indicated operation and store the result, if any, in the
specified destination operand location.
6. Write operand (WO): Store the result in memory.
A six-stage pipeline can reduce the execution time for 9 instructions from 54 time units to 14
time units.
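The figures of 54 and 14 time units can be reproduced with a short sketch. It assumes an ideal
pipeline in which every stage takes exactly one time unit and there are no stalls or branches;
the function names are illustrative.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def pipeline_schedule(n_instructions, stages=STAGES):
    """Return {instruction number: {stage: time unit}} for an ideal, stall-free pipeline."""
    schedule = {}
    for i in range(n_instructions):
        # Instruction i+1 enters the first stage at time unit i+1 and then
        # advances one stage per time unit.
        schedule[i + 1] = {stage: i + s + 1 for s, stage in enumerate(stages)}
    return schedule


def total_time(schedule):
    """Time unit in which the last stage of the last instruction completes."""
    return max(t for row in schedule.values() for t in row.values())


print(total_time(pipeline_schedule(9)))   # 14 time units with pipelining
print(9 * len(STAGES))                    # 54 time units without pipelining
```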
Performance of a pipelined processor
Consider a ‘k’ segment pipeline with clock cycle time ‘Tp’. Let there be ‘n’ tasks to be
completed in the pipelined processor. The first instruction takes ‘k’ cycles to
come out of the pipeline, but each of the other ‘n – 1’ instructions takes only ‘1’ cycle, i.e., a
total of ‘n – 1’ cycles. So, the time taken to execute ‘n’ instructions in a pipelined processor is:
ETpipeline = (k + n – 1) * Tp
In the same case, for a non-pipelined processor, the execution time of ‘n’ instructions will be:
ETnon-pipeline = n * k * Tp
So, the speedup (S) of the pipelined processor over the non-pipelined processor, when ‘n’ tasks
are executed on the same processor, is:
S = Performance of pipelined processor / Performance of non-pipelined processor
As the performance of a processor is inversely proportional to the execution time, we have:
S = ETnon-pipeline / ETpipeline
S = [n * k * Tp] / [(k + n – 1) * Tp]
S = [n * k] / [k + n – 1]
When the number of tasks ‘n’ is significantly larger than k, that is, n >> k, the denominator
k + n – 1 is approximately n, so:
S ≈ [n * k] / n
S ≈ k, where ‘k’ is the number of stages in the pipeline
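The speedup formula can be checked numerically. The sketch below uses illustrative function
names and takes k = 6 stages as an example; it shows the speedup approaching k as n grows.

```python
def pipeline_time(k, n, tp=1.0):
    """Execution time of n tasks on a k-stage pipeline: (k + n - 1) * Tp."""
    return (k + n - 1) * tp


def non_pipeline_time(k, n, tp=1.0):
    """Execution time of n tasks without pipelining: n * k * Tp."""
    return n * k * tp


def speedup(k, n):
    """S = (n * k) / (k + n - 1); the clock cycle time Tp cancels out."""
    return (n * k) / (k + n - 1)


for n in (9, 100, 10_000):
    print(n, round(speedup(6, n), 3))
# 9 3.857
# 100 5.714
# 10000 5.997
```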
Types of Interrupts and How to Handle Interrupts:
In the early years of computing, the processor had to check every hardware device and software
program in the system to see whether it had a signal to be processed. This method is called
polling. Because the processor spends time checking sources that have nothing to report, a
signal that does arrive has to wait until its source is polled again. As a result, system
performance degrades and the response time of the system increases. To overcome this problem,
engineers introduced a new mechanism in which the processor does not check for signals from
hardware or software; instead, the hardware or software sends a signal to the processor when it
needs processing. Such a signal should have the highest priority, because the
processor must leave the current process and handle the signal.
This mechanism of processing the signal is called the interrupt mechanism of the system.
What is an Interrupt?
An interrupt is a signal from hardware or software that has the highest priority and that the
processor should process immediately.
Types of Interrupts:
Although interrupts have higher priority than other signals, there are many types of interrupts.
The basic types of interrupts are:
1. Hardware Interrupts: If the signal to the processor comes from an external device or
hardware, it is called a hardware interrupt. Example: pressing a key on the keyboard
generates a signal that is given to the processor so that it performs the required
action. Such interrupts are called hardware interrupts.
Hardware interrupts can be classified into two types:
Maskable Interrupt: A hardware interrupt that can be delayed when a
higher-priority interrupt has occurred.
Non Maskable Interrupt: A hardware interrupt that cannot be delayed and must be processed
by the processor immediately.
2. Software Interrupts: Software interrupts can also be divided into two types:
Normal Interrupts: Interrupts that are caused by software instructions are
called normal interrupts.
Exception: An unplanned interrupt that occurs while executing a program is called an
exception. For example, if a program attempts to divide a value by
zero, an exception is raised.
Classification of Interrupts According to Periodicity of Occurrence:
1. Periodic Interrupt: If interrupts occur at fixed intervals on the timeline, they
are called periodic interrupts.
2. Aperiodic Interrupt: If the occurrence of an interrupt cannot be predicted, the
interrupt is called an aperiodic interrupt.
Classification of Interrupts According to the Temporal Relationship with System
Clock:
1. Synchronous Interrupt: If the source of the interrupt is in phase with the system clock, it is
called a synchronous interrupt. In other words, synchronous interrupts are dependent on the
system clock. Example: a timer service that uses the system clock.
2. Asynchronous Interrupt: If the interrupt is independent of, or not in phase with, the
system clock, it is called an asynchronous interrupt.
Interrupt Handling:
We know that the instruction cycle consists of fetch, decode, execute and read/write
functions. After every instruction cycle the processor checks for interrupts to be
processed. If no interrupt is present in the system, it goes on to the next
instruction cycle, whose address is given by the program counter. If an interrupt is
present, it triggers the interrupt handler. The handler stops the instruction stream
that is currently being processed, saves its configuration in a register, and loads the
program counter of the interrupt from a location given by the interrupt vector
table. After the processor has processed the interrupt, the interrupt handler reloads the
instruction and its configuration from the saved register, and the process resumes
where it left off. This saving of the old processing configuration and
loading of the new interrupt configuration is also called context switching. The
interrupt handler is also called the interrupt service routine (ISR). There are different
interrupt handlers for different interrupts. For example, the
system clock has its own interrupt handler, the keyboard has its own interrupt
handler, and so on for every device. The main features of the
ISR are:
Interrupts can occur at any time; they are asynchronous. ISRs can be called for
asynchronous interrupts.
The interrupt service mechanism can call ISRs from multiple sources.
ISRs can handle both maskable and non-maskable interrupts. An instruction in a
program can disable or enable an interrupt handler call.
At the beginning of its execution, an ISR disables the interrupt services of other devices.
After the ISR completes, it re-initializes the interrupt services.
Nested interrupts are allowed in an ISR for diversion to another ISR.
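The check made after every instruction cycle can be sketched as a small simulation. Everything
here is a simplifying assumption: instructions are plain strings, the saved context is only the
program counter, the vector table is a dictionary from interrupt number to a Python function
standing in for the ISR, and nesting and masking are left out.

```python
def run(program, vector_table, pending):
    """Execute a program and service interrupts between instruction cycles.

    program      -- list of instruction names
    vector_table -- maps an interrupt number to its ISR (a function returning a list of steps)
    pending      -- maps "number of instructions completed" to the interrupt raised by then
    """
    pending = dict(pending)          # work on a copy so the caller's dict is untouched
    trace = []
    pc = 0                           # program counter
    while pc < len(program):
        trace.append(program[pc])    # fetch, decode and execute one instruction
        pc += 1
        irq = pending.pop(pc, None)  # check for an interrupt after the instruction cycle
        if irq is not None:
            saved_pc = pc            # save the context (only the PC in this sketch)
            isr = vector_table[irq]  # look up the ISR through the interrupt vector table
            trace.extend(isr())      # run the interrupt service routine
            pc = saved_pc            # restore the context and resume where we left off
    return trace


vector_table = {1: lambda: ["ISR_KEYBOARD"]}
print(run(["LOAD", "ADD", "STORE"], vector_table, {2: 1}))
# ['LOAD', 'ADD', 'ISR_KEYBOARD', 'STORE']
```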
Semiconductor Memories:
RAM, ROM, SRAM, DRAM
Semiconductor memory is the main memory element of a microcomputer-based system and
is used to store programs and data. The main memory elements are
semiconductor devices that store code and information, either temporarily or permanently
depending on the type of memory. The semiconductor
memory is directly accessible by the microprocessor, and the access time of the data
in the primary memory must be compatible with the operating time of the microprocessor.
Thus semiconductor devices are preferred as primary memory. With the rapid growth in the
requirement for semiconductor memories, a number of technologies and
types of memory have emerged, such as ROM, RAM, EPROM, EEPROM, Flash memory,
DRAM, SRAM, and SDRAM.
(i) Random Access Memory (RAM)
As the name suggests, RAM or random access memory is a form of semiconductor
memory technology that is used for reading and writing data in any order - in other words as
it is required by the processor. It is used for such applications as the computer or processor
memory where variables and other storage are required on a random basis. Data is stored and
read many times to and from this type of memory.
Random access memory is used in huge quantities in computer applications as current day
computing and processing technology requires large amounts of memory to enable them to
handle the memory hungry applications used today. Many types of RAM including SDRAM
with its DDR3, DDR4, and soon DDR5 variants are used in huge quantities.
DRAM
Dynamic RAM is a form of random access memory. DRAM uses a capacitor to store each bit
of data, and the level of charge on each capacitor determines whether that bit is a logical 1 or
0. However these capacitors do not hold their charge indefinitely, and therefore the data
needs to be refreshed periodically. As a result of this dynamic refreshing it gains its name of
being a dynamic RAM.
DRAM is the form of semiconductor memory that is often used in equipment including
personal computers and workstations where it forms the main RAM for the computer. The
semiconductor devices are normally available as integrated circuits for use in PCB assembly
in the form of surface mount devices or less frequently now as leaded components.
Disadvantages of DRAM
1. Complex manufacturing process
2. Data requires refreshing
3. More complex external circuitry required (read and refresh periodically)
4. Volatile memory
5. Relatively slow operational speed
6. The capacitor charge must be refreshed about once every two milliseconds.
SRAM
SRAM stands for Static Random Access Memory. This form of semiconductor memory
gains its name from the fact that, unlike DRAM, the data does not need to be refreshed
dynamically. These semiconductor devices are able to support faster read and write times
than DRAM (typically 10 ns against 60 ns for DRAM), and in addition its cycle time is much
shorter because it does not need to pause between accesses.
However they consume more power, they are less dense and more expensive than DRAM. As
a result of this SRAM is normally used for caches, while DRAM is used as the main
semiconductor memory technology.
SDRAM
Synchronous DRAM. This form of semiconductor memory can run at faster speeds than
conventional DRAM. It is synchronized to the clock of the processor and is capable of
keeping two sets of memory addresses open simultaneously. By transferring data alternately
from one set of addresses, and then the other, SDRAM cuts down on the delays associated
with non-synchronous RAM, which must close one address bank before opening the next.
Within the SDRAM family there are several types of memory technologies that are seen.
These are referred to by the letters DDR - Double Data Rate. DDR4 is currently the latest
technology, but this is soon to be followed by DDR5 which will offer some significant
improvements in performance.
The general procedure of static memory interfacing with 8086 is briefly described as follows:
1. Arrange the available memory chips so as to obtain 16-bit data bus width. The upper 8-bit
bank is called ‘odd address memory bank’ and the lower 8-bit bank is called ‘even address
memory bank’.
2. Connect available memory address lines of memory chips with those of the microprocessor
and also connect the memory RD and WR inputs to the corresponding processor control
signals. Connect the 16-bit data bus of the memory bank with that of the microprocessor
8086.
3. The remaining address lines of the microprocessor, together with BHE and A0, are used for
decoding the required chip select signals for the odd and even memory banks. The CS of each
memory chip is derived from the output of the decoding circuit.
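The role of BHE and A0 in step 3 can be sketched as a small truth table. The sketch assumes the
usual 8086 convention that BHE is active low and that the even bank is enabled when A0 = 0; the
function names are illustrative and the higher address lines are ignored.

```python
def selected_banks(bhe_n, a0):
    """Return (even_selected, odd_selected).

    bhe_n -- logic level on the active-low BHE pin (0 means asserted)
    a0    -- address bit 0
    """
    even = (a0 == 0)       # even address bank on D7-D0 is enabled when A0 = 0
    odd = (bhe_n == 0)     # odd address bank on D15-D8 is enabled when BHE is low
    return even, odd


def describe(bhe_n, a0):
    even, odd = selected_banks(bhe_n, a0)
    if even and odd:
        return "16-bit word (both banks)"
    if odd:
        return "upper byte (odd bank)"
    if even:
        return "lower byte (even bank)"
    return "none"


for bhe_n in (0, 1):
    for a0 in (0, 1):
        print(bhe_n, a0, describe(bhe_n, a0))
```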
Example problems on memory interfacing with 8086
RISC (Reduced Instruction Set Computer)
If the control unit contains separate electronic circuitry that produces all the necessary
signals to execute each instruction, this approach to the design of the control section of the
processor is called RISC design. It is also called the hard-wired approach.
Examples of RISC processors: IBM RS6000, MC88100, and DEC’s Alpha 21064, 21164 and
21264 processors.
Features of RISC Processors:
The standard features of RISC processors are listed below:
RISC processors use a small and limited number of instructions.
RISC machines mostly use a hardwired control unit.
RISC processors consume less power and deliver high performance.
Each instruction is very simple and consistent.
RISC processors use simple addressing modes.
RISC instructions are of uniform fixed length.
CISC (Complex Instruction Set Computer)
If the control unit contains a number of micro-electronic circuits that generate a set of control
signals, and each micro-circuit is activated by microcode, this design approach is called
CISC design.
Examples of CISC processors are: Intel 386, 486, Pentium, Pentium Pro, Pentium II,
Pentium III, and Motorola’s 68000, 68020, 68040, etc.
Features of CISC Processors:
The standard features of CISC processors are listed below:
CISC chips have a large number of different and complex instructions.
CISC machines generally make use of complex addressing modes.
Different machine programs can be executed on a CISC machine.
CISC machines use a micro-programmed control unit.
CISC processors have a limited number of registers.
Von Neumann Architecture & Harvard Architecture
Von Neumann Architecture | Harvard Architecture
The same physical memory is used to store instructions and data | Separate physical memory is provided to store instructions and data
It is based on the stored-program concept | It is based on relay-based computer models
Common bus for transferring instructions and data | Separate buses are used to transfer instructions and data
Two clock cycles are used to execute a single instruction | A single clock cycle is used to execute a single instruction
It is cheaper in comparison to Harvard architecture | It is more expensive than Von Neumann’s architecture
Used in personal computers | Used in microcontrollers and signal processing
Embedded Systems: Basic structure, characteristics
The basic structure of an embedded system consists of the following:
Processor: The processor is the heart of an embedded system. The selection of a processor is
based on the following considerations:
Instruction set
Maximum bits of operation in a single arithmetic and logical operation
Speed
Algorithm processing capability
Type of processor (microprocessor, microcontroller, digital signal processor,
application specific processor, general purpose processor)
Power source:
An internal power supply is a must. An embedded system (ES) requires power from power-up to
power-down to carry out its tasks. It may also run continuously, that is, stay ‘On’, in which
case the system consumes power all the time. Hence efficient real-time programming, using
proper ‘wait’ and ‘stop’ instructions or disabling units that are not in use, can save or
limit power consumption.
Clock / oscillator Circuits:
The clock circuit is used for the CPU, system timers, and CPU machine cycles. The clock
controls the time for executing an instruction. The clock oscillator may be internal or
external. It should be highly stable.
Real time clock (RTC):
It is required for scheduling various tasks and for real-time programming. The RTC is also
used for driving the timers and counters needed in the system.
Reset circuits and power-on reset:
The reset process starts executing instructions from the starting address. The address is set
by the processor in the program counter. The reset step resets and runs the program in the
following way:
System program that execute from beginning
System boot up program
System initialization program
Memory:
The system software is embedded either in the internal flash, ROM, or PROM of the
microcontroller, or in an external flash, ROM, or PROM.
Characteristics of an embedded system:
Requires real-time performance
It should have high availability and reliability.
Developed around a real-time operating system
Usually has easy, diskless operation with ROM boot
Designed for one specific task
It must be connected with peripherals to connect input and output devices.
Offers high reliability and stability
Needs only a minimal user interface
Limited memory, low cost, low power consumption
It does not need any secondary memory.
Applications - Sensors and Transducers Interfacing of sensors and actuators with
microcontroller (motor/display/LED/relay, switch)
Interface a Single Pole Single Throw (SPST) Switch to the microcontroller.
A typical switch has an on-resistance of about 0.1 Ω and an off-resistance of about 100 MΩ.
To convert this 100 MΩ/0.1 Ω resistance into a digital signal, we can use a 10 kΩ pull-down
resistor to ground or a 10 kΩ pull-up resistor to +3.3 V, as shown in the figure. Notice that
10 kΩ is 100,000 times larger than the on-resistance of the switch and 10,000 times smaller
than its off-resistance. Another way to choose the pull-down or pull-up resistor is to consider
the input current of the microcontroller input pin. The current into the microcontroller will
be less than 2 µA. So, even if the current into the microcontroller is 2 µA, the voltage drop
across the 10 kΩ resistor will be 0.02 V, which is negligibly small. With the pull-down resistor
shown on the right side of the figure, the digital signal will be low if the switch is not
pressed and high if the switch is pressed. The signal being 3.3 V when the switch is pressed is
defined as positive logic, because the asserted switch state is a logic high. Conversely, with
the pull-up resistor shown on the left side of the figure, the digital signal will be high if
the switch is not pressed and low if the switch is pressed. The signal being 0 V when the
switch is pressed is defined as negative logic, because the asserted switch state is a logic low.
Two ways to interface a Single Pole Single Throw (SPST) Switch to the microcontroller.
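The pin voltages for both circuits follow from a simple voltage divider. The sketch below uses
the resistance values quoted above and, as a simplifying assumption, ignores the small input
current of the microcontroller pin; the function names are illustrative.

```python
VCC = 3.3        # supply voltage in volts
R_PULL = 10e3    # pull-up or pull-down resistor in ohms
R_ON = 0.1       # switch on-resistance in ohms
R_OFF = 100e6    # switch off-resistance in ohms


def pull_down_voltage(pressed):
    """Switch between +3.3 V and the pin, resistor from the pin to ground (positive logic)."""
    r_switch = R_ON if pressed else R_OFF
    return VCC * R_PULL / (R_PULL + r_switch)


def pull_up_voltage(pressed):
    """Resistor between +3.3 V and the pin, switch from the pin to ground (negative logic)."""
    r_switch = R_ON if pressed else R_OFF
    return VCC * r_switch / (R_PULL + r_switch)


for pressed in (False, True):
    print(pressed, pull_down_voltage(pressed), pull_up_voltage(pressed))

# Worst-case drop across the resistor caused by a 2 uA pin current.
print(2e-6 * R_PULL)
```

With these values the pin sits within a millivolt of 0 V or within a few millivolts of 3.3 V in
every case, which is why 10 kΩ is a comfortable choice between the two switch resistances.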
LED Interfacing
Relay Interfacing
DC MOTOR Interfacing
Data Acquisition Systems
Systems used for data acquisition are known as data acquisition systems. These data
acquisition systems perform tasks such as conversion of data, storage of data,
transmission of data and processing of data.
Data acquisition systems can be classified into the following two types.
Analog Data Acquisition Systems
Digital Data Acquisition Systems
Data acquisition systems that operate on analog signals are known as analog
data acquisition systems. Following are the blocks of analog data acquisition systems.
Transducer − It converts physical quantities into electrical signals.
Signal conditioner − It performs the functions like amplification and selection of
desired portion of the signal.
Display device − It displays the input signals for monitoring purpose.
Graphic recording instruments − These can be used to make the record of input data
permanently.
Magnetic tape instrumentation − It is used for acquiring, storing & reproducing of
input data.
Digital Data Acquisition Systems
Data acquisition systems that operate on digital signals are known as digital
data acquisition systems. They use digital components for storing or displaying the
information.
Mainly, the following operations take place in digital data acquisition.
Acquisition of analog signals
Conversion of analog signals into digital signals or digital data
Processing of digital signals or digital data
Following are the blocks of Digital data acquisition systems.
Transducer − It converts physical quantities into electrical signals.
Signal conditioner − It performs the functions like amplification and selection of
desired portion of the signal.
Multiplexer − It connects one of the multiple inputs to the output. So, it acts as a
parallel to serial converter.
Analog to Digital Converter − It converts the analog input into its equivalent digital
output.
Display device − It displays the data in digital format.
Digital Recorder − It is used to record the data in digital format.
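The chain of blocks above can be sketched in a few lines. This is only an idealized model under
stated assumptions: the signal conditioner is a pure gain, the multiplexer is a list index, and
the analog to digital converter is an ideal unipolar converter with an assumed reference voltage
and resolution. All names and default values are illustrative.

```python
def adc_convert(voltage, v_ref=3.3, bits=10):
    """Ideal unipolar ADC: map 0..v_ref onto the codes 0..2**bits - 1."""
    levels = 2 ** bits
    clamped = min(max(voltage, 0.0), v_ref)            # inputs outside the range saturate
    return min(int(clamped / v_ref * levels), levels - 1)


def acquire(transducer_volts, select, gain=1.0, v_ref=3.3, bits=10):
    """Condition every channel, pick one with the multiplexer, then digitize it."""
    conditioned = [v * gain for v in transducer_volts]  # signal conditioner (amplification)
    selected = conditioned[select]                      # multiplexer: one of many inputs
    return adc_convert(selected, v_ref, bits)           # analog to digital converter


# Two transducer outputs in volts; read channel 1 with a gain of 2 on an 8-bit, 4 V converter.
print(acquire([0.5, 1.0], select=1, gain=2.0, v_ref=4.0, bits=8))   # 128
```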
Buses & Protocols – I2C, SPI
I2C
I²C or I2C stands for ‘Inter-Integrated Circuit’ and is a simple ‘two wire’ protocol.
It was developed by Philips in the early 1980s for its TV applications, which
required the connection of a CPU to many ICs. Today, this bus is very widely used in
the embedded field. This is a synchronous, half duplex, serial protocol and is also byte
oriented, which means that one byte is sent at a time, one bit after another in a serial
fashion. After each byte, an acknowledgement is sent by the receiver IC to the sender IC.
The master, usually a microcontroller unit (MCU), can transmit as well as receive, and so can
the slaves, depending on whether they are input or output devices. For example, a slave which
is a ROM can only be read from, an LCD controller can only be written to, while an
external RAM chip can be read and written into. The two signal wires are bidirectional and
carry the signals SCL, the serial clock and SDA the serial data. Each device has its own
unique address, usually fixed by hardware.
First, the master issues a START signal. This signal causes all the slaves to come to
attention and listen. The start condition corresponds to the action of the master pulling
the SDA line low, when the clock (SCL) is high.
The first byte sent by the master is the address. This address (7-bit) is sent serially
on the SDA line (MSB first). Note that the bits on the SDA line are synchronized
by the clock signal on the SCL line, which means that the data on the SDA line
is read during the time that the clock on the SCL line is high (data is valid at the
L to H transition of the clock).
Just after this, the master also sends the R/W signal indicating the direction of data
transfer. Note that all activities are synchronized by the clock.
Only one of the slaves will have the broadcasted address, and on realizing that
its address matches with this address, the particular slave responds by sending an
‘acknowledge’ signal back to the master.
Now a byte can be received from the slave if the R/W bit is set to READ, or be
written to the slave if it is set to WRITE.
Once this data transfer is over, the device (master or slave) that has received the
byte sends an acknowledge signal. Acknowledgement is when the receiver drives
SDA low.
If more bytes are to be transferred, the above steps are repeated.
After this, the master releases SCL high and then takes the SDA line high as well. This
low-to-high transition on SDA while SCL is high amounts to a STOP condition, after which
the bus is idle and available for other transfers.
There are three standard speed modes for the I2C bus:
i) Slow, or standard mode (up to 100 kbps)
ii) Fast (400 kbps)
iii) High-speed (3.4 Mbps)
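The sequence of START, address, R/W, acknowledge, data and STOP can be sketched as an event log.
This is a protocol-level illustration only, not a driver for real hardware: the slaves are
modelled as a dictionary from 7-bit address to a receive buffer, the function names are
hypothetical, and the NACK event for an address that no slave answers is an addition not
covered in the text. The R/W bit uses the usual I2C convention of 1 for read and 0 for write.

```python
def address_byte(address7, read):
    """First byte after START: the 7-bit address followed by the R/W bit (1 = read)."""
    if not 0 <= address7 <= 0x7F:
        raise ValueError("I2C address must fit in 7 bits")
    return (address7 << 1) | (1 if read else 0)


def bits_msb_first(byte):
    """Order in which the bits of one byte appear on SDA."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]


def write_transaction(slaves, address7, payload):
    """Log the bus events of a master write and deliver the payload to the addressed slave."""
    log = ["START", f"ADDR+W 0x{address_byte(address7, read=False):02X}"]
    if address7 not in slaves:          # no slave recognizes the address
        return log + ["NACK", "STOP"]
    log.append("ACK")                   # the addressed slave pulls SDA low
    for byte in payload:
        slaves[address7].append(byte)   # slave receives the byte ...
        log += [f"DATA 0x{byte:02X}", "ACK"]   # ... and acknowledges it
    log.append("STOP")
    return log


slaves = {0x50: []}
print(write_transaction(slaves, 0x50, [0x12]))
# ['START', 'ADDR+W 0xA0', 'ACK', 'DATA 0x12', 'ACK', 'STOP']
```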
SPI
The serial peripheral interface (SPI) is one of the most widely used interfaces
between microcontroller and peripheral ICs such as sensors, ADCs, DACs, shift
registers, SRAM, and others. SPI is a synchronous, full duplex master-slave-based
interface. The data from the master or the slave is synchronized on the rising or
falling clock edge. Both master and slave can transmit data at the same time. The
SPI interface can be either 3-wire or 4-wire.
4-wire SPI devices have four signals:
Clock (SPI CLK, SCLK)
Chip select (CS)/SS
Master out, slave in (MOSI)
Master in, slave out (MISO)
The device that generates the clock signal is called the master. Data transmitted between the
master and the slave is synchronized to the clock generated by the master. SPI devices
support much higher clock frequencies compared to I2C interfaces. SPI interfaces can have
only one master and can have one or multiple slaves.
The chip select signal from the master is used to select the slave. This is normally an active
low signal and is pulled high to disconnect the slave from the SPI bus. When multiple slaves
are used, an individual chip select signal for each slave is required from the master. MOSI
and MISO are the data lines. MOSI transmits data from the master to the slave and MISO
transmits data from the slave to the master.
Data Transmission
To begin SPI communication, the master must send the clock signal and select the slave by
enabling the CS signal. Usually chip select is an active low signal; hence, the master must
send a logic 0 on this signal to select the slave.
SPI is a full-duplex interface; both master and slave can send data at the same time via the
MOSI and MISO lines respectively. During SPI communication, the data is simultaneously
transmitted (shifted out serially onto the MOSI/SDO bus) and received (the data on the bus
(MISO/SDI) is sampled or read in). The serial clock edge synchronizes the shifting and
sampling of the data. The SPI interface provides the user with flexibility to select the rising or
falling edge of the clock to sample and/or shift the data.
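The simultaneous shifting and sampling can be modelled as two 8-bit shift registers connected in
a ring. The sketch below assumes 8-bit transfers sent MSB first, which is a common configuration
but not the only one, and it does not model clock polarity or chip select; the function name is
illustrative.

```python
def spi_transfer(master_byte, slave_byte):
    """Exchange one byte in full duplex; return (byte received by master, byte received by slave)."""
    master_reg = master_byte & 0xFF
    slave_reg = slave_byte & 0xFF
    for _ in range(8):                                    # one bit per clock cycle
        mosi = (master_reg >> 7) & 1                      # master shifts its MSB out on MOSI
        miso = (slave_reg >> 7) & 1                       # slave shifts its MSB out on MISO
        master_reg = ((master_reg << 1) & 0xFF) | miso    # master samples MISO
        slave_reg = ((slave_reg << 1) & 0xFF) | mosi      # slave samples MOSI
    return master_reg, slave_reg


received_by_master, received_by_slave = spi_transfer(0xA5, 0x3C)
print(hex(received_by_master), hex(received_by_slave))   # 0x3c 0xa5
```

After eight clock cycles the two registers have swapped contents, which is why an SPI master
always receives a byte for every byte it sends, even when only one direction carries useful data.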