EC6504 Unit 2 Updated Notes
Prepared By Verified By
Mr.B.Ramesh Asso.Prof/ ECE
Mrs.C.Rajani Assit.Prof/ ECE
UNIT II
8086 SYSTEM BUS STRUCTURE
Syllabus:
8086 signals – Basic configurations – System bus timing –System design
using 8086 – IOprogramming – Introduction to Multiprogramming – System Bus
Structure - Multiprocessor configurations – Coprocessor, Closely coupled and loosely
Coupled configurations – Introduction to advanced processors.
2.1 8086 SIGNALS: PIN OUT SIGNALS AND FUNCTIONS OF 8086
8086 is available in three clock rates, i.e. 5, 8 and 10 MHz, packaged as a 40 pin
chip. The 8086 operates in single processor or multiprocessor configurations to
achieve high performance.
The following signal descriptions are common for both the minimum and maximum
modes.
AD15-AD0 These are the time multiplexed memory I/O address and data lines.
Address remains on the lines during the T1 state, while the data is available on the
data bus during T2, T3, Tw and T4. Here T1, T2, T3, T4 and Tw are the clock states of
a machine cycle. Tw is a wait state. These lines are active high and float to a tristate
during interrupt acknowledge and local bus hold acknowledge cycles.
A19/S6, A18/S5, A17/S4, A16/S3 These are the time multiplexed address and status lines.
During T1, these are the most significant address lines for memory operations.
During I/O operations, these lines are low. During memory or I/O operations, status
information is available on these lines for T2, T3, Tw and T4. The status of the interrupt
enable flag bit (displayed on S5) is updated at the beginning of each clock cycle.
S4 and S3 together indicate which segment register is presently being used for
memory accesses, as shown in Table 1.1. These lines float to tristate during the local
bus hold acknowledge. The status line S6 is always low (logic 0). The address bits are
separated from the status bits using latches controlled by the ALE signal.
Table 1.1
S4 S3 Indications
0 0 Alternate Data
0 1 Stack
1 0 Code or none
1 1 Data
BHE/S7-Bus High Enable/Status The bus high enable signal is used to indicate the
transfer of data over the higher order (D15-D8) data bus, as shown in Table 1.2. It
goes low for data transfers over D15-D8 and is used to derive the chip selects of the odd
address memory bank or peripherals. BHE is low during T1 for read, write and
interrupt acknowledge cycles, whenever a byte is to be transferred on the higher
byte of the data bus. The status information is available during T2, T3 and T4. The
signal is active low and is tristated during 'hold'. It is low during T1 for the first pulse
of the interrupt acknowledge cycle.
Table 1.2 Bus High Enable/Status
BHE A0 Indications
0 0 Whole Word
0 1 Upper byte from or to odd address
1 0 Lower byte from or to even address
1 1 None
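The bank selection in Table 1.2 can be sketched in a few lines of Python. This is only an illustration, not part of the 8086 documentation: the function name bus_enables and its return format are assumptions made for this example. It also relies on the fact, not stated in the table, that a word at an odd address needs two bus cycles.

```python
def bus_enables(address: int, size: int):
    """Return the (BHE, A0, description) tuples for one access.

    size is 1 for a byte and 2 for a word. BHE is active low, so 0 means
    the upper half of the data bus (D15-D8) is in use.
    Hypothetical helper written only to illustrate Table 1.2.
    """
    a0 = address & 1
    if size == 1:
        if a0 == 0:
            return [(1, 0, "lower byte from or to even address")]
        return [(0, 1, "upper byte from or to odd address")]
    if size != 2:
        raise ValueError("size must be 1 or 2")
    if a0 == 0:
        # Aligned word: both banks are selected in a single bus cycle.
        return [(0, 0, "whole word")]
    # Misaligned word: the 8086 runs two bus cycles, first the odd byte
    # on D15-D8, then the even byte at address + 1 on D7-D0.
    return [
        (0, 1, "upper byte from or to odd address"),
        (1, 0, "lower byte from or to even address"),
    ]


if __name__ == "__main__":
    for addr, size in [(0x1000, 2), (0x1001, 1), (0x1000, 1), (0x1001, 2)]:
        print(hex(addr), size, bus_enables(addr, size))
```

The combination BHE = 1, A0 = 1 never appears in the output, which matches the "None" row of the table.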
RD-Read The read signal, when low, indicates to the peripherals that the processor is
performing a memory or I/O read operation. RD is active low and is asserted during
T2, T3 and Tw of any read cycle. The signal remains tristated during the 'hold
acknowledge'.
READY This is the acknowledgement from the slow devices or memory that they
have completed the data transfer. The signal made available by the devices is
synchronized by the 8284A clock generator to provide ready input to the 8086. The
signal is active high.
INTR- Interrupt Request This is a level triggered input. This is sampled during the last
clock cycle of each instruction to determine the availability of the request. If any
interrupt request is pending, the processor enters the interrupt acknowledge cycle.
This can be internally masked by resetting the interrupt enable flag. This signal is
active high and internally synchronized.
TEST This input is examined by a 'WAIT' instruction. If the TEST input goes low,
execution will continue; otherwise, the processor remains in an idle state. The input is
synchronized internally during each clock cycle on the leading edge of the clock.
NMI-Non-maskable Interrupt This is an edge-triggered input which causes a Type 2
interrupt. The NMI is not maskable internally by software. A transition from low to
high initiates the interrupt response at the end of the current instruction. This input
is internally synchronized.
RESET This input causes the processor to terminate the current activity and start
execution from FFFF0H. The signal is active high and must be active for at least four
clock cycles. It restarts execution when the RESET returns low. RESET is also
internally synchronised.
CLK-Clock Input The clock input provides the basic timing for processor operation
and bus control activity. It is an asymmetric square wave with a 33% duty cycle. The
range of frequency for different 8086 versions is from 5 MHz to 10 MHz.
Vcc +5V power supply for the operation of the internal circuit.
GND ground for the internal circuit.
MN/MX The logic level at this pin decides whether the processor is to operate in
either minimum (single processor) or maximum (multiprocessor) mode.
MINIMUM MODE
The following pin functions are for the minimum mode operation of 8086.
DEN-Data Enable This signal indicates the availability of valid data over the
address/data lines. It is used to enable the transceivers (bidirectional buffers) to
separate the data from the multiplexed address/data signal. It is active from the
middle of T2 until the middle of T4. DEN is tristated during the 'hold acknowledge' cycle.
HOLD, HLDA-Hold /Hold Acknowledge When the HOLD line goes high, it indicates to
the processor that another master is requesting the bus access. The processor, after
receiving the HOLD request, issues the hold acknowledge signal on HLDA pin, in the
middle of the next clock cycle after completing the current bus (instruction) cycle. At
the same time, the processor floats the local bus and control lines. When the
processor detects the HOLD line low, it lowers the HLDA signal. HOLD is an
asynchronous input, and it should be externally synchronized.
If the DMA request is made while the CPU is performing a memory or I/O cycle, it will
release the local bus during T4 provided:
1. The request occurs on or before the T2 state of the current cycle.
2. The current cycle is not operating over the lower byte of a word (or operating
on an odd address).
3. The current cycle is not the first acknowledge of an interrupt acknowledge
sequence.
4. A LOCK instruction is not being executed.
MAXIMUM MODE SIGNALS
The following pin functions are applicable for maximum mode operation of 8086.
S2, S1, S0-Status Lines These are the status lines which reflect the type of operation
being carried out by the processor. They become active during T4 of the previous
cycle and remain active during T1 and T2 of the current bus cycle. The status lines
return to the passive state during T3 of the current bus cycle so that they may again
become active for the next bus cycle during T4. Any change in these lines during T4
indicates the start of a new cycle, and the return to the passive state during T3 or Tw
indicates the end of the bus cycle. These status lines are encoded as shown in Table 1.3.
Table 1.3
S2 S1 S0 Indications
0 0 0 Interrupt Acknowledge
0 0 1 Read I/O port
0 1 0 Write I/O port
0 1 1 Halt
1 0 0 Code Access
1 0 1 Read Memory
1 1 0 Write Memory
1 1 1 Passive
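The decoding that a bus controller performs on S2, S1 and S0 amounts to a table lookup. The sketch below mirrors Table 1.3 in Python; the names BUS_STATUS, decode_status and is_passive are illustrative assumptions, not names used by any Intel tool.

```python
# Lookup table copied from Table 1.3; the key is the 3-bit value S2 S1 S0.
BUS_STATUS = {
    0b000: "Interrupt Acknowledge",
    0b001: "Read I/O port",
    0b010: "Write I/O port",
    0b011: "Halt",
    0b100: "Code Access",
    0b101: "Read Memory",
    0b110: "Write Memory",
    0b111: "Passive",
}


def decode_status(s2: int, s1: int, s0: int) -> str:
    """Return the bus cycle type for the given status line levels (0 or 1)."""
    return BUS_STATUS[(s2 << 2) | (s1 << 1) | s0]


def is_passive(s2: int, s1: int, s0: int) -> bool:
    """True when no bus cycle is in progress (all three lines high)."""
    return (s2, s1, s0) == (1, 1, 1)


if __name__ == "__main__":
    print(decode_status(1, 0, 1))   # Read Memory
    print(is_passive(1, 1, 1))      # True
```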
LOCK This output pin indicates that other system bus masters will be prevented
from gaining the system bus while the LOCK signal is low. The LOCK signal is activated
by the LOCK prefix instruction and remains active until the completion of the next
instruction. It floats to tristate during 'hold acknowledge'. When the CPU is executing a
critical instruction which requires the system bus, the LOCK prefix instruction
ensures that other processors connected in the system will not gain control of
the bus. The 8086, while executing the prefixed instruction, asserts the bus lock
signal output, which may be connected to an external bus controller.
QS1, QS0-Queue Status These lines give information about the status of the code
prefetch queue. They are active during the CLK cycle after the one in which the queue
operation is performed. They are encoded as shown in Table 1.4.
Table 1.4
QS1 QS0 Indications
0 0 No operation
0 1 First byte of opcode from the queue
1 0 Empty queue
1 1 Subsequent byte from the queue
RQ/GT0, RQ/GT1-Request/Grant These pins are used by other local bus masters,
in maximum mode, to force the processor to release the local bus at the end of the
processor's current bus cycle. Each of the pins is bidirectional, with RQ/GT0 having
higher priority than RQ/GT1. The request/grant sequence is as follows:
1. A pulse one clock wide from another bus master requests bus access from the 8086.
2. During T4 (current) or T1 (next) clock cycle, a pulse one clock wide from the 8086
to the requesting master indicates that the 8086 has allowed the local bus to
float and that it will enter the 'hold acknowledge' state at the next clock cycle.
The CPU's bus interface unit is likely to be disconnected from the local bus of
the system.
3. A one clock wide pulse from the other master indicates to the 8086 that the 'hold'
request is about to end and the 8086 may regain control of the local bus at the
next clock cycle.
Thus each master to master exchange of the local bus is a sequence of 3 pulses.
There must be at least one dead clock cycle after each bus exchange. The request
and grant pulses are active low. For bus requests that are received while the 8086 is
performing a memory or I/O cycle, the granting of the bus is governed by the rules
discussed for HOLD and HLDA in minimum mode.
The DEN signal indicates that valid data is available on the data bus, while DT/R
indicates the direction of data, i.e. from or to the processor. The system contains
memory for the monitor and the user's program storage. Usually, EPROMs are used for
monitor storage, while RAMs are used for the user's program storage. A system may
contain I/O devices for communication with the processor as well as some special
purpose I/O devices. The clock generator generates the clock from the crystal oscillator
and then shapes and divides it to make it more precise, so that it can be used as an
accurate timing reference for the system. The clock generator also synchronizes some
external signals with the system clock. Since it has 20 address lines and 16 data
lines, the 8086 CPU requires three octal address latches and two octal data buffers
for the complete address and data separation. The system configuration is shown
below.
The read cycle begins in T1 with the assertion of the address latch enable (ALE) signal
and also the M/IO signal. On the negative going edge of ALE, the valid
address is latched from the local bus.
The BHE and A0 signals address the low byte, the high byte or both bytes. From T1 to
T4, the M/IO signal indicates a memory or I/O operation.
At T2, the address is removed from the local bus and is sent to the output. The bus is
then tristated. The read (RD) control signal is also activated in T2. The read (RD)
signal causes the addressed device to enable its data bus drivers. After RD goes low,
the valid data is available on the data bus. When the processor returns the read signal
to the high level, the addressed device will again tristate its bus drivers.
A write cycle also begins with the assertion of ALE and the emission of the address.
The M/IO signal is again asserted to indicate a memory or I/O operation. In T2, after
sending the address in T1, the processor sends the data to be written to the addressed
location. The data remains on the bus until the middle of the T4 state. WR becomes
active at the beginning of T2. BHE and A0 are used to select the proper byte or bytes
of the memory or I/O word to be read or written.
M/ IO RD WR Indications
0 0 1 I/O Read
0 1 0 I/O Write
1 0 1 Memory Read
1 1 0 Memory Write
In the maximum mode, the 8086 is operated by strapping the MN/MX pin to ground. In
this mode, the processor derives the status signals S2, S1 and S0. Another chip, called
the bus controller, derives the control signals using this status information. In the
maximum mode, there may be more than one microprocessor in the system
configuration. The other components in the system are the same as in the minimum
mode system.
The basic function of the 8288 bus controller chip is to derive control signals
like RD and WR (for memory and I/O devices), DEN, DT/R, ALE, etc. using the
information made available by the processor on the status lines. The bus controller
chip has the input lines S2, S1, S0 and CLK. These inputs to the 8288 are driven by the
CPU. It derives the outputs ALE, DEN, DT/R, MRDC, MWTC, AMWC, IORC, IOWC and
AIOWC. The AEN, IOB and CEN pins are especially useful for multiprocessor systems.
AEN and IOB are generally grounded. The CEN pin is usually tied to +5V. The
significance of the MCE/PDEN output depends upon the status of the IOB pin. If IOB is
grounded, it acts as master cascade enable to control cascaded 8259As; otherwise it acts
as peripheral data enable, used in multiple bus configurations.
The INTA pin is used to issue two interrupt acknowledge pulses to the interrupt
controller or to an interrupting device.
IORC and IOWC are the I/O read command and I/O write command signals respectively.
These signals enable an I/O interface to read or write the data from or to the
addressed port. MRDC and MWTC are the memory read command and memory write
command signals respectively and may be used as memory read and write signals. All
these command signals instruct the memory to accept or send data from or to the bus.
For both of the write command signals, the advanced signals AMWC and AIOWC are
available. They serve the same purpose, but are activated one clock cycle earlier than
the MWTC and IOWC signals, respectively.
The maximum mode system is shown in Fig. 1.10. The maximum mode system timing
diagrams are also divided in two portions, as read (input) and write (output) timing
diagrams. The address/data and address/status timings are similar to the minimum
mode. ALE is asserted in T1, just as in minimum mode.
S0, S1 and S2 are set at the beginning of the bus cycle. The 8288 bus controller will
output a pulse on ALE and apply the required signal to its DT/R pin during T1.
In T2, the 8288 will set DEN = 1, thus enabling the transceivers, and for an input it will
activate MRDC or IORC. These signals stay active until T4. For an output,
AMWC or AIOWC is activated from T2 to T4 and MWTC or IOWC is activated from
T3 to T4.
The status bits S0 to S2 remain active until T3 and become passive during
T3 and T4. If the READY input is not activated before T3, wait states will be
inserted between T3 and T4.
Memory Read Timing Diagram for Maximum Mode Operation of 8086
Memory Write Timing Diagram for Maximum Mode Operation of 8086
Fig. 2.12 shows the typical minimum mode 8086 system.
For interfacing a memory module to the 8086, it is necessary to have odd
and even memory banks. This is implemented by using two EPROMs and two
RAMs.
Data lines D15-D8 are connected to the odd bank of EPROM and RAM, and data
lines D7-D0 are connected to the even bank of EPROM and RAM.
Address lines are connected to the EPROMs and RAMs as per their capacities.
The RD signal is connected to the output enable (OE) signals of the EPROMs
and RAMs. The WR signal is connected to the WR signal of the RAMs.
Two separate decoders are used to generate chip select signals for memory
and I/O devices. These chip select signals are logically ORed with either BHE
or A0 to generate the final chip select signals.
The RD and WR signals are connected to the RD and WR signals of the I/O
device. Data lines D15-D0 are connected to the data lines of the I/O
device.
Figure 2.13 Maximum Mode 8086 system
2.5 IO programming
The transfer of data between the keyboard and the microprocessor, and between the
microprocessor and the display device, is called input/output data transfer or I/O
data transfer.
This data transfer is done by using I/O ports.
2.5.1 Input port:
It is used to read data from an input device such as a
keyboard. The simplest form of input port is a buffer.
The input device is connected to the microprocessor through the buffer, as shown
in Fig. 2.14.
This buffer is a tri-state buffer and its output is available only when the
enable signal is active.
When the microprocessor wants to read data from the input device (keyboard), the
control signals from the microprocessor activate the buffer by asserting the
enable input of the buffer.
Once the buffer is enabled, data from the input device is available on
the data bus. The microprocessor reads this data by initiating a read
command.
2.5.2 Output port:
It is used to send data from the microprocessor to an output device such as a
display.
The simplest form of output port is a latch. The output device is
connected to the microprocessor through the latch, as shown in Fig. 2.15.
When the microprocessor wants to send data to the output device, it puts the
data on the data bus and activates the clock signal of the latch.
Figure 2.12 Output Port
I/O system until it finds that the operation is complete. This process is
illustrated in Figure 2.16 below.
Figure 2.13 Flowchart for I/O service routine
2.5.4 Interrupt Driven I/O
The most common method of servicing such devices is the polled
approach. This is where the processor must test each device in
sequence to find out whether it needs communication with the processor.
It is easy to see that a large portion of the main program is spent looping
through this continuous polling cycle.
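The cost of polling can be made concrete with a small simulation. The sketch below is only a model: the Device class, its ready_at field and the poll function are hypothetical names invented for this example, and a "tick" stands for one pass of the polling loop rather than any real 8086 timing.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    ready_at: int          # hypothetical: loop pass at which the device first needs service
    serviced: bool = False


def poll(devices, max_ticks=100):
    """Test each device in sequence until all have been serviced.

    Returns the service order and the number of status checks that found
    the device not ready (work the main program could have done instead).
    """
    order = []
    wasted_checks = 0
    tick = 0
    while tick < max_ticks and not all(d.serviced for d in devices):
        for d in devices:                  # the polling cycle
            if d.serviced:
                continue
            if tick >= d.ready_at:
                d.serviced = True          # run the device's service routine
                order.append(d.name)
            else:
                wasted_checks += 1
        tick += 1
    return order, wasted_checks


if __name__ == "__main__":
    devs = [Device("keyboard", ready_at=2), Device("printer", ready_at=0)]
    print(poll(devs))
```

Every wasted check in the result is a status read that found nothing to do, which is the overhead that interrupt driven I/O removes.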
A more desirable method allows the processor to execute its main program and only
stop to service peripheral devices when it is told to do so by the device itself.
The method would provide an external asynchronous input to the processor, telling
it to complete the instruction that is currently being executed and fetch a new
routine that will service the requesting device.
Once this servicing is completed, the processor would resume exactly where
it left off. This method of servicing I/O requests is called interrupt driven I/O.
When a processor is interrupted, it stops executing its current program and calls
a special routine which services the interrupt. This is illustrated in Fig. 2.17.
The event that causes the interruption is called an interrupt and the special
routine executed to service the interrupt is called the interrupt service routine (ISR).
So this method of data transfer is not suitable for large data transfers.
Figure 2.16 DMA Controller Operating in a Microprocessor System
Figure 2.7 Uniprogramming Approach
2) When the I/O is finished (Point C), the processing is resumed, and the same
description applies to points D, E and F. At the end of P1, P2 can start, which
has the same operation as P1.
Each process has a process control block (PCB). The PCB is the data structure
used by the operating system to keep track of the process. It holds the following fields.
1. Pointer:
The pointer points to another process control block. It is used for maintaining
the scheduling list.
2. Process state:
The process state may be new, ready, running, waiting and so on.
3. Program counter:
It indicates the address of the next instruction to be executed for this process.
4. CPU registers:
These include the general purpose registers, stack pointers, index
registers, accumulators, etc.
5. Memory management information:
This includes the values of the base and limit registers.
The information is useful for deallocating the memory when the process terminates.
6. Accounting information:
This information includes the amount of CPU and real time used, time limits,
job or process numbers, account numbers, etc.
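The PCB fields listed above map naturally onto a record type. The following Python sketch is one possible layout, not the structure of any particular operating system; the field names, the cpu dictionary and the context_switch helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PCB:
    pid: int                                   # process number (accounting information)
    state: str = "new"                         # new, ready, running, waiting, ...
    program_counter: int = 0                   # address of the next instruction
    registers: dict = field(default_factory=dict)  # saved CPU registers
    base: int = 0                              # memory management: base register
    limit: int = 0                             # memory management: limit register
    cpu_time_used: int = 0                     # accounting information
    next: Optional["PCB"] = None               # pointer to the next PCB in the scheduling list


def context_switch(current: PCB, nxt: PCB, cpu: dict) -> None:
    """Save the running process into its PCB and load the next one.

    cpu is a hypothetical stand-in for the processor: {"pc": int, "regs": dict}.
    """
    current.program_counter = cpu["pc"]
    current.registers = dict(cpu["regs"])
    current.state = "ready"
    cpu["pc"] = nxt.program_counter
    cpu["regs"] = dict(nxt.registers)
    nxt.state = "running"


if __name__ == "__main__":
    p1 = PCB(pid=1, state="running")
    p2 = PCB(pid=2, state="ready", program_counter=0x200, registers={"AX": 5})
    cpu = {"pc": 0x104, "regs": {"AX": 1}}
    context_switch(p1, p2, cpu)
    print(p1, cpu)
```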
2.6.3 Semaphore
The software technique used to solve the same problem is mutual exclusion.
The program region where the common resources are used is called the critical
program region.
Semaphore implementation in 8086:
In the 8086, the XCHG instruction along with the LOCK prefix can be used to set or
reset a semaphore. In the sequence below, a semaphore value of 1 means the resource
is free and 0 means it is in use.
Program sequence:
              MOV  AL, 00H
CHECKAGAIN:   LOCK XCHG semaphore, AL
              TEST AL, AL
              JZ   CHECKAGAIN
              ...                  ; Critical region in which the program
                                   ; accesses the shared resources
              MOV  semaphore, 1
The XCHG semaphore, AL instruction exchanges the contents of the AL register with
the contents of the memory location in which the semaphore is stored.
The XCHG instruction requires two bus cycles.
1) The system bus must not be released between these two bus cycles. This is
achieved by the LOCK prefix in the 8086. The LOCK prefix activates the LOCK
output pin during the execution of the instruction that follows the prefix.
2) During the execution of the XCHG instruction, the LOCK output pin is in the active
state, which does not allow another processor to get control of the system bus.
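The behaviour of the LOCK XCHG loop can be imitated in Python with threads. This is a simulation under stated assumptions, not 8086 code: the LockedMemory class is hypothetical, a threading.Lock stands in for the LOCK pin holding the bus during the exchange, and the convention 1 = free, 0 = in use follows the program sequence above.

```python
import threading
import time


class LockedMemory:
    """One memory word holding the semaphore, with an indivisible exchange."""

    def __init__(self):
        self._bus = threading.Lock()   # stands in for the LOCK output pin
        self.semaphore = 1             # 1 = resource free, 0 = resource in use

    def xchg(self, value: int) -> int:
        """Model of LOCK XCHG semaphore, AL: swap and return the old value."""
        with self._bus:                # read and write cannot be separated
            old = self.semaphore
            self.semaphore = value
            return old


def worker(mem: LockedMemory, shared: dict, n_iter: int) -> None:
    for _ in range(n_iter):
        while mem.xchg(0) == 0:        # TEST AL, AL / JZ CHECKAGAIN
            time.sleep(0)              # let another thread run while we spin
        shared["count"] += 1           # critical region
        mem.semaphore = 1              # MOV semaphore, 1


def run(n_threads: int = 4, n_iter: int = 200) -> int:
    mem = LockedMemory()
    shared = {"count": 0}
    threads = [threading.Thread(target=worker, args=(mem, shared, n_iter))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]


if __name__ == "__main__":
    print(run())
```

Because only one thread at a time can see the value 1 come back from the exchange, the increments in the critical region never interleave and the final count equals the number of threads times the number of iterations.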
2.6.4 Swapping
Swapping is a technique of temporarily removing inactive programs from the
memory of a system.
It removes a process from the primary memory when it is blocked
and deallocates the memory. Fig. 2.23 shows the swapping of processes.
For example, when process P1 requests an I/O operation, it becomes
blocked and will not return to the ready state until the I/O completes.
The process manager places process P1 into the blocked state, then the
memory manager swaps process P1 from primary memory to secondary
memory and another process, P2, from secondary memory to primary memory.
Process P2 changes its state after swapping.
Figure 2.23 Swapping Of Processes
When a process is swapped out, its executable image is copied to secondary
memory. When the process is swapped back into available primary memory,
the image that was swapped out is copied into the new block allocated by the
memory manager.
Binding Method:
If the address binding is done at load time, then the process must be moved back
to the same location it occupied previously.
If the address binding is done at execution time, then the process can be
swapped into a different memory space.
2.6.5 Memory Management:
The placement of blocks of information in a memory system is called memory
allocation. The memory management system keeps a table.
The table indicates which parts of memory are available and which are
occupied.
The criterion for selecting which block is to be replaced is given by the
replacement policy.
Nonpreemptive allocation:
First fit:
In this algorithm, searching is started either at the beginning of the memory or
where the previous first-fit search ended.
In this algorithm, the first free memory block which is big enough is allocated
to the block k. The searching process is stopped as soon as a free memory block
with enough space is found.
Best fit:
In this algorithm, all free memory blocks are searched and the smallest free
memory block which is large enough to accommodate the desired block k is used
to allocate k.
This algorithm uses free memory space more efficiently than the first-fit algorithm.
Fig. 2.24 shows the allocation of memory blocks using the first fit and best fit
algorithms.
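The two search rules differ only in when the search stops. The sketch below assumes the free list is a plain Python list of block sizes and always starts the first-fit search at the beginning of memory; the function names and the sample sizes are illustrative and are not taken from Fig. 2.24.

```python
def first_fit(free_blocks, k):
    """Index of the first free block of size >= k, or None if none fits."""
    for i, size in enumerate(free_blocks):
        if size >= k:
            return i            # stop at the first block that is big enough
    return None


def best_fit(free_blocks, k):
    """Index of the smallest free block of size >= k, or None if none fits."""
    best = None
    for i, size in enumerate(free_blocks):   # every free block is examined
        if size >= k and (best is None or size < free_blocks[best]):
            best = i
    return best


if __name__ == "__main__":
    free = [300, 120, 500, 150]   # hypothetical free block sizes
    print(first_fit(free, 140), best_fit(free, 140))
```

With these sizes, first fit places a 140-unit block in the 300-unit hole and leaves 160 units over, while best fit chooses the 150-unit hole and leaves only 10.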
Figure 2.24 NonPreemptive Memory Allocation
Preemptive Allocation:
Nonpreemptive allocation cannot make efficient use of memory in all situations.
Much more efficient use of the available memory space is possible if the
occupied space can be reallocated to make room for incoming blocks.
Reallocation of the blocks can be done by a method called compaction.
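Compaction can be pictured as sliding every occupied block toward address 0 so that the scattered holes merge into one. The sketch below is a simplified model: blocks are (name, start, size) tuples, compact is a hypothetical helper, and the cost of copying the data and fixing up addresses is ignored.

```python
def compact(blocks):
    """Slide occupied blocks to the low end of memory.

    blocks is a list of (name, start, size) tuples. Returns the relocated
    blocks and the address at which the single free region now begins.
    """
    next_free = 0
    moved = []
    for name, _start, size in sorted(blocks, key=lambda b: b[1]):
        moved.append((name, next_free, size))   # new start address
        next_free += size
    return moved, next_free


if __name__ == "__main__":
    # Hypothetical 600-unit memory with holes of 150, 100 and 100 units.
    occupied = [("A", 0, 100), ("B", 250, 50), ("C", 400, 100)]
    print(compact(occupied))
```

Before compaction the largest hole in this example is 150 units, so a 300-unit request fails; afterwards the free region runs from 250 to 600 and the request can be satisfied.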
2.7 System Bus Structure
Figure 2.25 illustrates the fundamental structure of a system
bus and its relationship to the various components of the computer system.
The complexity of the bus control logic depends on the amount of
translation needed between the system bus and the pins on the CPU.
All of the address and data lines and most of the control lines are capable
of being logically disconnected from the CPU or bus control logic.
The timing of the signals within the CPU and bus control logic is controlled by
a clock. The bus cycles and CPU activity are controlled by groups of clock
pulses.
A CPU input transaction would proceed by outputting the address of
the data during the first clock cycle, signaling that a read is to take place
during the second clock cycle, waiting an intermediate number of clock cycles
for the addressed device to put the data on the data lines, and then inputting
the data and signaling the device that the transfer is complete during the last
clock cycle.
Types:
The multiprocessor systems are implemented using one of the two basic
architectures: Loosely coupled architecture and closely coupled architecture.
The systems using these architectures are known as loosely coupled
systems and closely coupled systems respectively.
COPROCESSOR, CLOSELY COUPLED AND LOOSELY COUPLED CONFIGURATIONS
2.8.2 Closely Coupled Multiprocessor Configuration
In the closely coupled system (CCS), the processors or supporting processors
(coprocessor, math processor) share the clock generator, bus control logic, and
the entire memory and I/O subsystem.
Such systems communicate through a shared main memory.
The rate at which data can be communicated from one processor to another is on
the order of the bandwidth of the memory.
Memory contention occurs when two or more processors attempt to access the
same memory unit concurrently. Closely coupled systems (CCS) may be used when
high-speed or real-time processing is desired.
There are two models of a CCS:
1. CCS without private cache
2. CCS with private cache
Figure 2.26 CCS without Private Cache
The PMIN (processor-memory interconnection network) is a switch which is used to
connect every processor to every memory module.
This switch is a P by M crossbar with P x M sets of cross points.
When the crossbar switch is distributed across the memory modules, the
system is known as a multiported memory system.
A memory module can satisfy only one processor's request in a given memory cycle.
Hence, if two or more processors attempt to access the same memory
module, a conflict occurs, which is resolved or arbitrated by the PMIN.
To avoid excessive conflicts, the number of memory modules M is usually as large
as P. Another method to minimize conflicts is to associate a reserved storage
area with each processor. This is the unmapped local memory (ULM).
The ULM is used to store kernel code and operating system tables often used
by the processes running on that processor.
The IOPIN (I/O-processor interconnection network) is used to allow a processor to
communicate with an I/O channel which is connected to peripheral devices.
The ISIN (interrupt-signal interconnection network) is used for two purposes: to allow
a processor to direct an interrupt to any other processor, and to initiate a hardware
alarm in case of processor failure.
Figure 2.27 CCS with Private Cache
In the first model (that is, without private cache), each memory reference goes
through the PMIN, so it encounters delay in the processor-memory switch and
hence the instruction cycle time increases.
The increase in the instruction cycle time reduces the system throughput.
This delay can be reduced by associating a cache with each processor to
capture most of the references made by that processor.
Another advantage of the cache is that the traffic through the crossbar switch
can be reduced, which subsequently reduces the contention at the cross
points.
This multiprocessor organization, however, encounters the cache coherence problem:
more than one inconsistent copy of data may exist in the system.
2.8.2.3 Closely Coupled System using 8086
The CPU (8086) is the master or host and the supporting processor is
the slave. Therefore, two 8086s cannot appear in this configuration.
The CPU provides the bus control logic.
So the bus request signal from the supporting processor is connected to
the CPU. Fig. 2.28 shows the simplest form of closely coupled
configuration.
Figure 2.28 Closely Coupled Configurations
Fig. 2.29 shows the interaction between the CPU and an independent processor in the
closely coupled configuration.
Figure 2.30 Loosely Coupled Configurations
4. It follows IEEE floating point standard.
5. It is multi bus compatible.
2.8.4.2 Pin Diagram of 8087
Fig. 2.31 shows pin diagram of 8087.
1 0 1 - read memory
1 1 0 - write memory
1 1 1 - passive
2.8.4.3 8087 Architecture
Error Flags
1) IE: An invalid operation such as stack overflow, stack underflow, invalid
operand, square root of a negative number, etc.
2) DE: The operand is not normalized.
3) ZE: A divide by zero error.
4) OE: An exponent overflow error.
5) UE: An exponent underflow error.
6) PE: A precision error.
Interrupt Flag: IR indicates the existence of an interrupt request.
Condition Code
C0 - C3 indicate the condition code.
The condition codes are set by the compare and examine instructions.
Stack Bits
ST: S0-S2 indicate the top of stack.
Busy Status:
B: Indicates that the current operation is not completed.
Figure 2.34 Bit definition of control register
Tag Register: The TAG register holds the status of the contents of each data register.
The 2-bit tag values are:
0 0 - Data valid
0 1 - Zero
1 0 - A special value
1 1 - Empty
Figure 2.35 Data Format of NDP 8087
Example 1: Convert 1259.125 (decimal) into short real, long real and temporary real formats.
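A conversion of this kind can be checked with Python. The sketch assumes the example value is 1259.125 (the figure is garbled in the source, so this is a reading, not a certainty). Short real and long real are the IEEE single and double formats, which the standard struct module can pack directly. Temporary real is the 80-bit format with a 15-bit exponent biased by 16383 and an explicit integer bit, so it is assembled by hand; temp_real_hex is a hypothetical helper that handles only nonzero finite values.

```python
import math
import struct


def short_real_hex(x: float) -> str:
    """32-bit short real: 1 sign bit, 8-bit exponent (bias 127), 23-bit fraction."""
    return struct.pack(">f", x).hex()


def long_real_hex(x: float) -> str:
    """64-bit long real: 1 sign bit, 11-bit exponent (bias 1023), 52-bit fraction."""
    return struct.pack(">d", x).hex()


def temp_real_hex(x: float) -> str:
    """80-bit temporary real: sign, 15-bit exponent (bias 16383), 64-bit significand.

    Unlike the other two formats, the leading 1 of the significand is stored.
    Assumes x is nonzero and finite.
    """
    sign = 1 if x < 0 else 0
    m, e = math.frexp(abs(x))              # abs(x) == m * 2**e with 0.5 <= m < 1
    exponent = (e - 1) + 16383             # exponent of the form 1.f * 2**(e-1)
    significand = int(m * (1 << 64))       # explicit integer bit lands in bit 63
    return f"{(sign << 15) | exponent:04x}{significand:016x}"


if __name__ == "__main__":
    x = 1259.125   # 10011101011.001 in binary = 1.0011101011001 x 2^10
    print(short_real_hex(x))
    print(long_real_hex(x))
    print(temp_real_hex(x))
```

In each result the exponent field is 10 plus the bias of the format (137, 1033 and 16393 respectively).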
2.8.4.5 Stacks in 8087
The 8087 has a 3-bit stack pointer which holds the number of the register
which is the current top of stack.
When the 8087 is initialized, the 3-bit stack pointer in the 8087 is loaded with
000, which indicates that register 0 is the top of stack.
Fig. 2.36 shows the stack of the 8087. When the 8087 reads the first number, the
stack pointer is decremented to 111 (7) and the number is stored in register number
111 (7); register 7 is now the top of stack.
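The wrap-around of the 3-bit stack pointer is easy to model. The class below is a toy illustration, not an 8087 emulator: FpuStack and its method names are assumptions, and tag bits and exception flags are left out.

```python
class FpuStack:
    """Eight registers addressed relative to a 3-bit top-of-stack pointer."""

    def __init__(self):
        self.top = 0                    # 000 after initialization
        self.regs = [None] * 8

    def push(self, value):
        """Load a number: decrement the pointer modulo 8, then store."""
        self.top = (self.top - 1) & 0b111
        self.regs[self.top] = value

    def pop(self):
        """Store and pop: read the top, then increment the pointer modulo 8."""
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) & 0b111
        return value

    def st(self, i):
        """Return ST(i), the register i places below the current top."""
        return self.regs[(self.top + i) & 0b111]


if __name__ == "__main__":
    s = FpuStack()
    s.push(1.5)
    print(s.top)        # 7: the first number goes to register 7
    s.push(2.5)
    print(s.top, s.st(0), s.st(1))
```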
2.8.4.6 Instructions in 8087
The 8087 instructions can be divided into six groups:
1) Data transfer instructions
2) Arithmetic instructions
3) Compare instructions
4) Transcendental instructions
5) Load constant instructions
6) Processor control instructions
FST destination:
Copies ST to a specified stack position or to a specified memory location.
Exceptions: I, O, U, P.
Examples:
FST ST(3) ; Copy ST to ST(3)
FST [BX] ; Copy ST to memory pointed to by [BX]
FSTP destination:
Copies ST to a specified stack element or memory location and increments the
stack pointer by one to point to the next element on the stack.
This is a stack POP operation.
FADDP ST(2) ; Add ST(2) to ST.
            ; Increment stack pointer so ST(2) becomes ST.
FIADD source:
Adds an integer from memory to ST and stores the result in ST.
Exceptions: I, D, O, P.
Example:
FIADD CARS_SOLD ; Integer number from memory + ST
b) Subtraction
FSUB destination, source:
Subtracts the real number at the specified source from the real number at the
specified destination and puts the result in the specified destination.
Exceptions: I, D, O, U, P.
Examples:
FSUB ST(3), ST ; ST(3) = ST(3) - ST
FSUB DIFFERENCE ; ST = ST - real from memory
FSUB ; ST = (ST(1) - ST)
FISUB source:
Subtracts the integer number stored in memory from ST and stores the result in ST.
Exceptions: I, D, O, P.
Example:
FISUB DIFFERENCE ; ST = ST - integer from memory
c) Reversed Subtraction
FSUBR destination, source
FSUBRP destination, source
FISUBR source
These instructions operate in the same way as the FSUB instructions, except that
they subtract the contents of the specified destination from the contents of the
specified source and put the difference in the specified destination.
d) Multiplication
FMUL destination, source:
Multiplies the real number from the source by the real number from the specified
destination and puts the result in the specified stack element.
Exceptions: I, D, O, U, P.
FMUL ST(2), ST ; Multiply ST(2) and ST, result in ST(2)
FMUL ST, ST(5) ; Multiply ST(5) and ST, result in ST
Table 1.7 C3, Q and C0 Status word
2.8.4.6.4 Processor Control Instructions
These instructions do not perform computations.
They are used to do tasks such as initializing the 8087, enabling interrupts,
writing the status word to memory, etc.
FINIT / FNINT:
Initializes8087. Disables interrupt output, sets stack pointer to register 7,
sets default
status.
FDIS / FNDISI:
Disables the 8087 interrupt output pin so that it cannot cause an interrupt
when an exception (error) occurs.
FENI / FNENI:
Enables 8087 interrupt output so it can cause an interrupt when an exception
occurs.
FLDCW source:
Loads a status word from a memory location into the 8087 status register.
This instruction should be preceded by the FCLEX instruction to prevent a
possible exception response
FSTCWIFNSTCW destination:
Copies the 8087 control word to a memory location. Determine its current
value with 8086 Instructions.
FSTSW / FNSTW destination:
Copies the 8087 status word to a memory location.
FCLEX/FNCLEX:
Clears all of the 8(187 exception flag bits in the status register. Unasserts BUSY
and INT outputs.
FSAVE / FNSAVE destination:
Copies the 8087 control word, status word, pointers, and entire register stack
to 94-byte area of memory.
FRSTOR source:
Copies a 94-byte area of memory into the 8087 control register, status register.
pointer registers, and stack registers.
FSTENV / FNSTENV destination:
Copies the 6087 control register, status register, tagwords, and exception
pointers to a series of memory locations.
FLDENV source:
55
Loads the 8087 control register, status register, tag word, and exception
pointers from a named area in memory.
56
FINCSTP:
Increment the 8087 stack pointer by one.
Figure 2.38 Internal Block Diagram of 8089
C) DMA operation
1. GA and GB:
GA and GB are used as source and destination pointers.
If GA points to source, GB points to destination, and vice versa.
2. GC:
When a translation operation is performed along with a DMA transfer, GC
stores the base address of the 256-byte translation table.
3. BC:
BC is used as a byte counter.
It is decremented by 1 after each byte transfer and by 2 after each word transfer.
4. MC:
MC is used for mask compare operation.
MC stores the byte to be compared in its lower byte and mask pattern in the
higher byte.
5. IX:
IX register is used as an index register.
6. TP:
TP is a task pointer.
It stores the address of the next instruction to be executed.
It has a TAG bit to indicate whether the next instruction is stored in the system
or I/O space.
7. PP:
PP is a parameter pointer.
It is automatically filled by the 8089 at the time of initialization
of a task. It stores the address of the parameter block.
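The roles of GA, GB and BC during a transfer can be sketched as a simple copy loop. This is a simplified model, not the 8089's actual microsequence: the function `dma_transfer`, the `bytearray` standing in for memory, and the rule that both pointers always advance are assumptions made for the illustration.

```python
def dma_transfer(memory, ga, gb, bc, width=1):
    """Copy bc bytes from address ga to address gb, width bytes at a time.

    ga models the source pointer, gb the destination pointer and bc the
    byte counter, which drops by 1 per byte or by 2 per word transferred.
    """
    if bc % width != 0:
        raise ValueError("byte count must be a multiple of the transfer width")
    while bc > 0:
        memory[gb:gb + width] = memory[ga:ga + width]
        ga += width
        gb += width
        bc -= width
    return ga, gb, bc

# Hypothetical 16-byte memory with four source bytes at address 0.
memory = bytearray(16)
memory[0:4] = b"8089"
final_ga, final_gb, final_bc = dma_transfer(memory, ga=0, gb=8, bc=4, width=2)
print(bytes(memory[8:12]), final_ga, final_gb, final_bc)
```

With `width=2` the counter falls by 2 on each pass, so the four bytes are moved in two word transfers.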
Figure 2.39 Channel control Register
2.9 Introduction to advanced processors
2.9.1 80286 Microprocessor
2.9.1.1 Limitations of 8086:
1. Slow processing speed
2. Less addressing capacity
3. Smaller data paths
4. Not able to do floating point arithmetic on its own
5. Lack of security mechanism required for multiuser and multitasking environment
6. Not able to do parallel processing
7. Lack of enhanced pipelined architecture
8. Lack of powerful instruction set which can support operating system
9. Does not support paging and virtual addressing
10. Does not support branch predictions to improve overall operation speed.
2.9.1.2 Features
1) The 80286 is a 16-bit processor. Its 16-bit ALU processes 16-bit data.
2) It has a 24-bit address bus. It can access up to 16 Mbytes (2^24 bytes) of
physical memory or 1 Gigabyte (2^30 bytes) of virtual memory.
3) The 80286 can be operated at three different clock speeds. These are 4
MHz (80286-4), 6 MHz (80286-6), and 8 MHz (80286).
4) The 80286 includes special instructions to support operating systems.
5) The 80286 is housed in a 68-pin leadless flat package.
6) It contains four separate processing units. These are the Bus Unit (BU), the
Instruction Unit (IU), the Address Unit (AU) and the Execution Unit (EU).
7) The 80286 microprocessor is compatible with the earlier 8086, 8088, 80186
and 80188 chips.
8) It has virtual memory-management circuitry and protection circuitry.
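The address-space figures quoted in the features list follow directly from the bus and segment widths. The short calculation below is a sketch; the split of the virtual space into 2^14 segments of up to 2^16 bytes each is the usual explanation for the 1 Gigabyte figure and is an assumption here, since these notes state only the total.

```python
MB = 2 ** 20
GB = 2 ** 30

# 24 address lines give the physical address space.
physical_bytes = 2 ** 24

# Assumed breakdown of the virtual space: 2^14 segments of 2^16 bytes each.
segments = 2 ** 14
segment_bytes = 2 ** 16
virtual_bytes = segments * segment_bytes

print(physical_bytes // MB, "MB physical")
print(virtual_bytes // GB, "GB virtual")
```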
1. Bus Unit:
It includes address latches and data transceivers, bus interface and control
circuitry, an instruction prefetcher and a 6-byte instruction queue.
The Bus unit does all the memory and I/O read/write operations.
It prefetches instruction bytes and puts them in a 6-byte prefetch queue.
The Bus unit is responsible for the transfer of data to and from the processor
extension devices.
Figure 2.40 Block Diagram of 80286
2. Address Unit:
It includes the segment registers (same as on 8086 and 80186), an offset adder
and a physical address adder.
The 80286 can be operated in two memory addressing modes:
1) Real address mode
2) Protected virtual address mode.
In real address mode, the address unit computes a 20-bit physical address based on the 16-bit contents
of a segment register and a 16-bit offset, just like an 8086.
The CS, DS, SS and ES registers are used to hold the base addresses for the
segments currently in use.
The instruction pointer (IP) and stack pointer (SP) are used to hold the offsets for the code
segment and stack segment respectively.
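The real-mode address calculation performed by the address unit can be written out as a one-line formula: the segment value is shifted left by four bits and the offset is added. The function name `real_mode_address` and the sample register values are illustrative assumptions.

```python
def real_mode_address(segment, offset):
    """Form a real-mode physical address from a 16-bit segment and offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return (segment << 4) + offset

# Hypothetical values: CS = 2000H, IP = 0100H.
code_address = real_mode_address(0x2000, 0x0100)
print(hex(code_address))
```

The same function applies to a stack access when the arguments are the contents of SS and SP.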
3. Execution Unit:
The execution unit includes the ALU, registers (same as on 8086 and 80186) and
the control unit. The registers consist of general purpose registers, index registers,
pointer registers, flag register and the 16-bit machine status word (MSW)
register.
4. Instruction Unit:
It includes an instruction decoder and a queue of three decoded instructions.
The instruction unit decodes up to three prefetched instructions and holds
them in the queue.
5. Flag Register:
The flag register of 80286 contains two new flags: NT and IOPL.
1) NT (Nested Task flag): This flag is set when one system task invokes another task.
2) IOPL (I/O Privilege Level): The two bits in the IOPL are used by the processor
and the operating system to determine your application's access to
I/O facilities.
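The two new flag fields can be extracted from a 16-bit flag word with shifts and masks. The sketch below assumes the standard bit positions (IOPL in bits 12 and 13, NT in bit 14); the function name and the sample flag words are made up for the illustration.

```python
IOPL_SHIFT = 12   # assumed position: IOPL occupies bits 12-13
NT_SHIFT = 14     # assumed position: NT is bit 14

def decode_286_flags(flags):
    """Return the IOPL and NT fields of a 16-bit 80286 flag word."""
    return {
        "IOPL": (flags >> IOPL_SHIFT) & 0b11,
        "NT": (flags >> NT_SHIFT) & 0b1,
    }

print(decode_286_flags(0x7000))  # bits 12, 13 and 14 set
print(decode_286_flags(0x1000))  # only bit 12 set
```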
Figure 2.41 Bit Patterns of 80286 status word and flag register
Table 2.8 shows the 80286 interrupt types and their vector locations.
2.9.1.4.2 80286 Protected Virtual Address Mode (PVAM)
By setting the protection enable bit of the machine status word (MSW), it is possible to switch
the operating mode from REAL to PVAM.
The Protected Virtual Address Mode (PVAM) provides memory management,
protection, task switching and interrupt processing.
The Protected Virtual Address Mode is also called protected mode.
Table 2.8 Interrupt Types
Figure 2.44 Interrupt Descriptor table Definition
Functions of Protection Mechanism:
The main functions of the protection mechanism in the 80286 are:
1. To protect system software from user programs.
2. To protect user tasks from each other.
3. To protect the regions of memory from accidental access.
Privilege Levels
The 80286 has a four-level hierarchical privilege system.
It controls the use of privileged instructions and access to
descriptors. Table 2.9 shows the four privilege levels.
2.9.2 80386 Microprocessor
A feature of the 80386DX is its ability to operate in three different modes:
1. Real Address Mode
2. Virtual 8086 Mode
3. Protected Virtual Address Mode.
2.9.2.1 80386 Features:
1) The 80386 is a 32-bit processor. Its 32-bit ALU processes 32-bit data.
2) It has a 32-bit address bus.
3) The 80386 runs at clock speeds of up to 20 MHz.
4) The pipelined architecture of the 80386 allows simultaneous
instruction fetching, decoding, execution and memory management.
5) It allows programmers to switch between different operating systems.
6) It can operate on 7 different data types:
a. Bit b. Byte c. Word d. Double word e. word f. Quad word g. Ten byte.
7) The 80386 can operate in real mode, protected mode or a variation of
protected mode called virtual 8086 mode.
8) The 80386 microprocessor is compatible with the earlier 8086 and 8088.
Figure 2.45 Block Diagram for 80386
Figure 2.47 Functional units of 80386
The execution unit partially supports pipelining.
It overlaps the execution of any memory reference instruction with the
previous instruction.
v) Segmentation Unit
The segmentation unit translates logical addresses into linear addresses.
The segmentation unit compares the effective address against the segment length limit.
The segmentation unit adds the segment base and the effective address to
generate the linear address.
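The two steps carried out by the segmentation unit, the limit check and the base addition, can be sketched as follows. The function name `to_linear`, the choice of `ValueError` to stand in for a protection fault, and the sample descriptor values are assumptions made for the illustration.

```python
def to_linear(base, limit, effective):
    """Translate an effective address into a 32-bit linear address.

    The effective address is first compared with the segment limit; only
    if it lies inside the segment is the segment base added to it.
    """
    if effective > limit:
        raise ValueError("effective address exceeds segment limit")
    return (base + effective) & 0xFFFFFFFF

# Hypothetical segment: base 00100000H, limit FFFFH.
linear = to_linear(base=0x00100000, limit=0xFFFF, effective=0x1234)
print(hex(linear))
```

An effective address of 10000H with the same limit would be rejected before any addition takes place.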
Figure 2.48 80386 register set (part1)
Figure 2.49(a) General Purpose register
The other four general purpose registers are the two pointer registers, ESP
and EBP, and the two index registers, ESI and EDI.
They are used to store offset addresses of memory locations relative to the
segment registers.
The index registers ESI and EDI are used to store offset values to be
incremented or decremented when stepping through a block of data.
The index registers are also used to hold offset addresses for instructions that
access data stored in the data segment part of memory.
The pointer registers ESP and EBP are used to store offset addresses of
memory locations relative to the stack segment register.
2.9.2.3.2) Segment Registers
The 80386 has a 1M-byte address space in real mode.
The 80386 supports six simultaneously accessible memory blocks
called segments. Each segment consists of 64K consecutive byte-wide
storage locations.
These segments are addressed by 16-bit registers: CS, DS, ES, SS, FS and GS.
Figure 2.50 Segment Registers
1. The CS (Code Segment) register holds the base address of the currently active code
segment.
2. The DS (Data Segment) register is used to hold the address of the currently active data
segment.
3. The ES (Extra Segment), FS, and GS registers are used as general data segment registers.
4. The base address of the currently active stack segment is contained in the
SS (Stack Segment) register.
2.9.2.3.3) Index, Pointer, and Base Registers
The offset used to calculate the physical address is contained in any of the pointer,
base, or index registers.
Figure 2.51 Bit Pattern of Flag Register
CF (Carry flag):
This bit is set by arithmetic instructions that generate either a carry or a borrow.
Carry flag is also used in shift and rotate instructions to contain the bit shifted
or rotated out of the register.
PF (Parity flag):
The parity bit is set by most instructions if the least significant 8 bits of the
result contain an even number of ones.
AF (Auxiliary carry flag):
The programmer can't access this bit directly, but this bit is internally used for
BCD arithmetic.
ZF (Zero flag):
Zero flag is set to 1, if the result of an operation is zero.
The two bits in the IOPL (I/O Privilege Level) field are used by the processor
and the operating system to determine your application's access to I/O
facilities. It holds a privilege level from 0 to 3.
IF (Interrupt Flag) :
When interrupt flag is set, the 80386 recognizes and handles external
hardware interrupts on its INTR pin.
If the interrupt flag is cleared, the 80386 ignores any inputs on this pin.
TF (Trap Flag) :
Trap flag allows user to single-step through programs.
When an 80386 detects that this flag is set, it executes one instruction and then
automatically generates an internal exception 1.
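How the carry, parity and zero flags respond to a result can be illustrated with an 8-bit addition. This is a sketch of the flag rules described above, not of the processor's internal logic; the function `add8_flags` and the operand values are assumptions made for the example.

```python
def add8_flags(a, b):
    """Add two 8-bit values and report the result with CF, PF and ZF."""
    total = a + b
    result = total & 0xFF
    ones = bin(result).count("1")
    flags = {
        "CF": int(total > 0xFF),    # carry out of bit 7
        "PF": int(ones % 2 == 0),   # even number of ones in the low 8 bits
        "ZF": int(result == 0),     # result is zero
    }
    return result, flags

print(add8_flags(0xFF, 0x01))
print(add8_flags(0x01, 0x02))
```

Adding FFH and 01H wraps the 8-bit result to zero, so all three flags are set; adding 01H and 02H gives 03H, which has two one bits, so only PF is set.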
2.9.2.3.8) System Address Registers
There are four system address registers:
1) TR (Task Register),
2) IDTR (Interrupt Descriptor Table Register)
3) GDTR (Global Descriptor Table Register)
4) LDTR (Local Descriptor Table Register).
These registers hold the addresses for the four special descriptor table segments.
1) The TR (Task Register) points to the Task State Segment (TSS).
2) The IDTR (Interrupt Descriptor Table Register) points to the
Interrupt Descriptor Table (IDT).
3) The GDTR (Global Descriptor Table Register) points to the Global
Descriptor Table (GDT).
4) The LDTR (Local Descriptor Table Register) points to the Local Descriptor
Table (LDT).
5) ET (Extension Type),
6) PG (Paging).
PE (Protection Enable) :
This bit is similar to the VM bit in EFLAGS. It controls the 80386's mode of
operation.
MP (Math Present) :
When this bit is set, the 80386 assumes that real floating point hardware
(80287 or 80387) is present in the system.
EM (Emulate Coprocessor):
When this bit is set, the 80386 will generate
exception 7 whenever it attempts to execute a floating
point instruction.
The programmer can use the handler for this exception to emulate floating point hardware in
software.
TS (Task Switched) :
The 80386 sets this bit automatically every time it performs a task
switch. It will never clear this bit on its own.
ET (Extension Type) :
The 80386 detects whether the numeric processor connected is an 80287 or an
80387. It sets ET to logic 1 if the numeric processor is an 80387.
This is necessary because the 80387 uses a slightly different protocol than the
80287.
PG (Paging) :
This bit enables or disables the paging mechanism in the Memory Management Unit
(MMU). If the bit is set, paging is enabled.
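The control bits described above can be read out of a 32-bit control register value with shifts and masks. The sketch assumes the standard bit positions (PE bit 0, MP bit 1, EM bit 2, TS bit 3, ET bit 4, PG bit 31); the table `CR0_BITS`, the function `decode_cr0` and the sample value are names chosen for this illustration.

```python
# Assumed bit positions of the control bits.
CR0_BITS = {"PE": 0, "MP": 1, "EM": 2, "TS": 3, "ET": 4, "PG": 31}

def decode_cr0(value):
    """Return the state (0 or 1) of each named control bit."""
    return {name: (value >> bit) & 1 for name, bit in CR0_BITS.items()}

# Hypothetical value: protected mode with paging, 80387 present.
state = decode_cr0(0x80000011)
print(state)
```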
Control Register 1 (CR1): This is reserved by Intel.
Control Register 2 (CR2):
CR2 is a read-only register.
Control Register 3 (CR3):
Control register 3 holds the physical address of the root of the two-level
paging tables used when paging is enabled.
It is also called Page Directory Base Register (PDBR).
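With two-level paging, a 32-bit linear address is split into three fields that are used in turn: an index into the page directory located through CR3, an index into the selected page table, and a byte offset within the page. The sketch below assumes the standard 10/10/12-bit split for 4 KB pages; the function name and the sample address are illustrative.

```python
def split_linear(address):
    """Split a 32-bit linear address into directory, table and offset fields."""
    directory_index = (address >> 22) & 0x3FF  # bits 31-22
    table_index = (address >> 12) & 0x3FF      # bits 21-12
    offset = address & 0xFFF                   # bits 11-0
    return directory_index, table_index, offset

fields = split_linear(0x00403025)
print(fields)
```

For the address 00403025H this gives directory entry 1, page table entry 3 and offset 25H.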
2.9.2.3.10) Debug Registers
The DR0 to DR7 registers are used to control the debug feature.
The debug registers DR0 to DR3 contain addresses associated with one of four
breakpoints.
Fig. 2.55 shows the debug registers.
The software debugger can load breakpoint addresses in these registers to aid in
debugging.
Figure 2.55 Debug register
Figure 2.56 gives the list of different access types.
There are in all four such fields (L, G, RW and LEN) for four breakpoints (B0-B3).
The DR7 contains three more bits. These are:
LE (Local Exact):
The pipelined architecture of the 80386 fetches and decodes the next instruction before
the current one completes.
This bit applies to all four linear breakpoints.
GE (Global Exact) :
This is similar to the LE bit.
If this bit is set, the 80386 reports a breakpoint at the instant it occurs,
regardless of task.
GD (Global debug access) :
When this bit is set, the 80386 denies the further access to any of the debug
registers, either for reading or writing.
The test registers are used to check the translation lookaside buffer (TLB) of the paging unit.
Test Register 6:
TR6 is the TLB testing command register. It is divided into fields
as follows:
C : This is a command bit. When this bit is cleared, a write to the TLB is
performed.
W# (bit 5) : Not writable
W (bit 6) : Writable
U# (bit 7) : Not user
U (bit 8) : User
D# (bit 9) : Not dirty
D (bit 10) : Dirty
V (bit 11) : Valid
BCD numbers:
The 80386DX has the ability to perform four-function arithmetic on numbers
that are represented in binary-coded decimal (BCD).
The 80386DX can handle two different BCD formats:
1) Unpacked BCD
2) Packed BCD
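The difference between the two BCD formats is easiest to see on a two-digit number: unpacked BCD stores one decimal digit per byte, while packed BCD stores two digits per byte, one in each nibble. The helper names below are assumptions made for this sketch, which handles values from 0 to 99 only.

```python
def to_unpacked_bcd(number):
    """Encode 0..99 as two bytes, one decimal digit per byte."""
    if not 0 <= number <= 99:
        raise ValueError("number must be in the range 0..99")
    return bytes([number // 10, number % 10])

def to_packed_bcd(number):
    """Encode 0..99 as one byte, tens digit in the upper nibble."""
    if not 0 <= number <= 99:
        raise ValueError("number must be in the range 0..99")
    return ((number // 10) << 4) | (number % 10)

print(to_unpacked_bcd(59).hex(), hex(to_packed_bcd(59)))
```

The decimal value 59 becomes the bytes 05H 09H in unpacked form and the single byte 59H in packed form.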
BIT:
The 80386DX also supports the "BIT" data type.
The bit data type allows a program to directly access and modify any selected
bit within a bit string.
The 80386DX assembler supports eight instructions for bit
operations. These are: BT, BTC, BTS, BTR, BSF, BSR, IBTS, and
XBTS.
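The effect of the bit test and bit scan instructions can be modelled with ordinary integer operations. The functions below are a sketch of six of the eight instructions only (IBTS and XBTS are not modelled), and they return values directly rather than updating the carry or zero flag as the real instructions do; the lowercase function names are chosen for this illustration.

```python
def bt(value, n):
    """Bit test: return bit n of value."""
    return (value >> n) & 1

def bts(value, n):
    """Bit test and set: return value with bit n set."""
    return value | (1 << n)

def btr(value, n):
    """Bit test and reset: return value with bit n cleared."""
    return value & ~(1 << n)

def btc(value, n):
    """Bit test and complement: return value with bit n inverted."""
    return value ^ (1 << n)

def bsf(value):
    """Bit scan forward: index of the lowest set bit (value must be nonzero)."""
    return (value & -value).bit_length() - 1

def bsr(value):
    """Bit scan reverse: index of the highest set bit (value must be nonzero)."""
    return value.bit_length() - 1

sample = 0b1010
print(bt(sample, 1), bin(bts(sample, 0)), bin(btr(sample, 1)), bin(btc(sample, 3)))
print(bsf(sample), bsr(sample))
```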
Strings:
The 80386DX supports bit strings, byte strings, word strings, and double word strings.
2.9.2.5 Operating Modes of 80386:
Real Mode
The 80386 microprocessor can operate basically in either Real Mode or
Protected Mode. The 80386 maintains the compatibility of the object code
with 8086 and 80286 running in real mode.
In real mode, programs can still access the 32-bit register set of the 80386DX.
It is also possible to use the 32-bit addressing modes with the override instruction
prefixes.
Protected Mode:
The full capabilities of the 80386DX are unlocked when it operates in Protected Mode.
Features of Protected Mode:
1. Protected Mode vastly increases the linear address space to four gigabytes (2^32
bytes).
2. It allows the running of virtual memory programs of almost unlimited size (64
terabytes or 2^46 bytes).
3. Protected Mode allows the 80386DX to run all of the existing 8086 and 80286
programs.
4. It provides a sophisticated memory management and a hardware-assisted
protection mechanism.
5. It provides special 80386 instructions for multitasking operating systems.
6. It supports paging mechanism.
descriptors (protected mode addressing). The 8086 virtual mode solves this
problem.
2.9.3.1 80486 Processor Features
The Pentium processor has a dual integer
processor. It allows execution of two
instructions per clock.
Superscalar Processor:
Processors capable of executing multiple instructions in parallel are
known as superscalar processors.
The Pentium can execute two integer or two floating point instructions simultaneously, and thus
it supports superscalar architecture.
Figure 2.61 Block Diagram of 80486
Write Buffers:
The Pentium processor provides two write buffers, one for each of the two
internal execution pipelines.
This architecture improves performance when back-to-back writes occur.
Internal Cache Control:
Internal Cache Control logic monitors input signals to determine when to
snoop the address bus and outputs signals
Parity Generation and Control:
It generates even data parity for each of the eight data paths.
It also generates a parity bit for the address during write bus cycles.
Code Cache
It holds copies of the most frequently used instructions.
It is dedicated to supplying instructions to each of the processor‘s execution
pipelines. The cache is organized as a two-way set associative cache.
Prefetcher
The prefetcher requests instructions from the code cache.
Prefetch Buffers
Pentium provides four prefetch buffers.
They work as two independent pairs. When instructions are prefetched from
the cache, they are placed into one set of prefetch buffers.
Instruction Decode Unit
Pentium provides two stage decoding.
The instructions are decoded in two stages known as Decode 1 (D1) and Decode
2 (D2).
Control Unit
It is also referred to as the Microcode Unit. This control unit consists of the
following sub- units:
1) Microcode Sequencer
2) Microcode Control ROM
Arithmetic Logic Units (ALUs)
Pentium provides two ALUs to perform the arithmetic and logical operations.
The ALU for the "U" pipeline can complete an operation prior to the ALU in
the "V" pipeline.
Address Generators
Pentium provides two Address Generators (one for each pipeline).
They generate the address specified by the instructions in their respective
pipeline.
Data Cache
A separate internal Data Cache holds copies of the most frequently used data
requested by the two integer pipelines and the Floating Point Unit.
The internal data cache is an 8KB write-back cache, organized as two-way set
associative with 32-byte lines.
The Data Cache directory is triple ported to allow simultaneous access from
each of the pipelines and to support snooping.
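The cache organization quoted above fixes the number of sets and the way an address is divided into tag, set index and byte offset. The calculation below is a sketch under the stated figures (8 KB, two-way, 32-byte lines); the function `split_address` and the sample address are assumptions made for the illustration.

```python
CACHE_BYTES = 8 * 1024
WAYS = 2
LINE_BYTES = 32

# Each set holds one line per way.
SETS = CACHE_BYTES // (WAYS * LINE_BYTES)

def split_address(address):
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = address % LINE_BYTES
    set_index = (address // LINE_BYTES) % SETS
    tag = address // (LINE_BYTES * SETS)
    return tag, set_index, offset

print(SETS)
print(split_address(0x12345))
```

With 128 sets of 32-byte lines, the low 5 address bits select the byte, the next 7 bits select the set, and the remaining upper bits form the tag.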
Paging Unit
It can handle two linear addresses at the same time to support both pipelines.
Floating-Point Unit
The floating point unit performs floating point operations.
It can accept up to two floating point operations per clock when one of the
instructions is an exchange instruction.
Part-A (2 Marks Questions and Answers)
1. What are tightly coupled systems or closely coupled systems?
In a tightly coupled system, the microprocessors (either a coprocessor
or independent processors) may share a common clock and bus control logic. The
two processors in a closely coupled system may communicate using a common
system bus or common memory.
3. Write some advantages of loosely coupled systems over tightly coupled systems
More CPUs can be added in loosely coupled systems to improve the
system performance. The system structure is modular and hence easy to maintain
and troubleshoot.
A fault in a single module does not lead to a complete system breakdown.
Due to the independent processing modules used in the system, it is more
fault tolerant and, because of its modular organization, more suitable for parallel applications.
Integer: Word, Short, Long
Packed decimal number (BCD)
Floating point / real number: Short, Long, Temporary real
12. What are the three basic Multiprocessor Configurations that the 8086 can support?
i. Coprocessor Configuration
ii. Closely Coupled Configuration
iii. Loosely Coupled Configuration
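18. How does the CPU differentiate the 8087 instructions from its own instructions? (May/June
2013)
The 8087 instructions can be distinguished from 8086 instructions by the letter F, which
stands for floating point. All 8087 mnemonics begin with the letter F.
In machine code, every 8087 instruction begins with the ESC opcode bit pattern 11011,
which the 8086 treats as an escape to the coprocessor.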
19. What are the modes of operation of 8087? (May/June 2013)
1) Real Mode
2) Protected mode
3) Virtual mode
20. How does the 8089 operate in loosely coupled configuration and tightly coupled configuration?
(May/June 2013)
The 8086 is used in its maximum mode. In the tightly coupled configuration, the 8089 and 8086
reside on the same local bus, sharing the same set of system buffers.
21. In what ways do the standard microprocessor and coprocessor differ from each other?
(Nov/Dec 2012)
A coprocessor provides auxiliary functions or features that the main processor does not
have. These might include floating point support or hardware encryption. A
coprocessor is generally not usable without its main processor, whereas a processor
may function in a crippled or less powerful form without a coprocessor. An example
of a processor and a coprocessor pair would be the 80386 and the 80387.
23. Compare closely coupled configuration features with loosely coupled configuration
features. (May /June 2012)
4. Discuss briefly the instructions supported by 8087 Numeric Data Processor.
(Refer Sec 2.8.6) (Nov/Dec 2011)
5. Explain the block diagram of 8089 I/O processor. (Refer Sec 2.8.4.7) (Nov/Dec 2011,
2012 and May/June 2010, 2012, 2014)
6. Explain the salient features of 8087 coprocessor units in architectural diagram.
(Refer Sec 2.8.4.1) (Nov/Dec 2011)
7. Describe maximum mode of operation of 8086. (Refer Sec 2.2.2) (May/June 2013)
8. Draw and discuss a typical minimum mode 8086 system. (Refer Sec 2.2.1) (May/June
2013)
9. Give two examples of 8087 data transfer instructions, arithmetic instructions,
processor control instructions and transcendental instructions. (Refer Sec 2.8.4.6)
(May/June 2012)