EE-808 Fall 2024
Digital Integrated Circuit Design
Lecture # 08
Design of Sequential Logic Circuits
Muhammad Imran
muhammad.imran1@seecs.edu.pk
Acknowledgement
Content from the following resources has been used in these lectures:
Digital Integrated Circuits, Adam Teman, BIU, Israel
Jan M. Rabaey, Digital Integrated Circuits, 2nd Ed.
Contents
Introduction
What is Sequential Logic?
Why Sequential Logic?
Basics of Sequential Logic
Timing Parameters in Sequential Circuits
Other Latch and Flip-Flop Implementations
Basic Timing Constraints
Static Timing Analysis
What is Sequential Logic and why do we need it?
Sequential Logic
Output is a function of both the current inputs and the previous state
Circuit has memory!
Sequential circuits are generally Synchronous
Use a clock to synchronize the logic paths
www.tutorialspoint.com
Why use Sequential Logic?
Accumulator Example
An accumulator is a register that sums a list of numbers
Therefore, it feeds the output back to the input
Without a register, races could occur, causing erroneous outputs
We need to delay the output until the original calculation is finished
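<Figure: accumulator built from an adder (+) with input IN, output OUT and feedback path FB; example values IN = 0, 1, 0; OUT = 0, 1, 2; FB = 0, 1, 2>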
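The role of the register in the accumulator can be sketched behaviorally. This is a minimal Python model rather than a circuit description; the class name `Accumulator` and the method `clock_edge` are assumptions introduced for illustration. The adder result reaches the feedback path only when the register is clocked, so each input is added exactly once per cycle.

```python
class Accumulator:
    """Behavioral sketch of a registered accumulator (illustrative names)."""

    def __init__(self):
        self.reg = 0  # register contents: drives OUT and the feedback path FB

    def clock_edge(self, value_in):
        # Combinational adder: IN + FB, computed from the stored value only.
        adder_out = value_in + self.reg
        # The register captures the adder output on the clock edge, so the
        # new sum cannot race back into the adder within the same cycle.
        self.reg = adder_out
        return self.reg


acc = Accumulator()
outputs = [acc.clock_edge(x) for x in [3, 4, 5]]
print(outputs)  # running sums: [3, 7, 12]
```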
Essential to use sequential logic when
Paths have different delays but need to converge together
We always have to slow our fast paths down so they arrive along with
our slowest path
If we could make all paths have equal delays, we wouldn’t need
sequential logic, but this is really hard (almost impossible) to do
Pipelining Example
A small laundry has one washer, one dryer and one operator. It takes 90 minutes to finish one load:
Washer takes 30 minutes
Dryer takes 40 minutes
“Operator folding” takes 20 minutes
Doing laundry sequentially
Sequential laundry takes 6 hours for 4 loads
Doing laundry in a pipelined manner
Pipelined laundry takes 3.5 hours for 4 loads
Every 40 minutes a new load starts and a new load ends
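The laundry numbers can be reproduced with a small scheduling sketch. It assumes each stage (washer, dryer, folding) handles one load at a time and that a load may wait between stages; the function name `pipelined_finish_time` is hypothetical.

```python
def pipelined_finish_time(stage_minutes, loads):
    """Finish time of the last load when each stage handles one load at a time."""
    stage_free = [0] * len(stage_minutes)  # time at which each stage becomes free
    finish = 0
    for _ in range(loads):
        ready = 0  # time at which this load leaves the previous stage
        for s, duration in enumerate(stage_minutes):
            start = max(ready, stage_free[s])  # wait for the load and for the machine
            ready = start + duration
            stage_free[s] = ready
        finish = ready
    return finish


stages = [30, 40, 20]  # washer, dryer, folding (minutes)
sequential = 4 * sum(stages)
pipelined = pipelined_finish_time(stages, 4)
print(sequential / 60, pipelined / 60)  # 6.0 3.5 (hours)
```

The slowest stage (the 40-minute dryer) sets the rate at which loads complete once the pipeline is full.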
Pipelining Data
If it takes 10 time units to process an instruction, we could perform one
instruction every 10 time units:
<Figure: Instruction -> Delay (10 time units) -> Output>
But if we divide the process into 5 tasks that take 2 time units each:
<Figure: Instruction -> five Delay stages (2 time units each) -> Output>
We can start a new instruction every 2 time units.
And after filling the pipe, we finish an instruction every 2 units.
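As a worked count, assume N instructions, a total processing delay of 10 time units, and an ideal split into 5 equal stages with no register overhead. The symbol N and the two function names are introduced here only for illustration.

```latex
T_{\text{unpipelined}}(N) = 10N
\qquad
T_{\text{pipelined}}(N) = 10 + 2(N - 1)
```

For N = 100 this gives 1000 versus 208 time units. The latency of a single instruction stays at 10 time units, while the throughput approaches one instruction every 2 time units.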
But some stages may be faster than others
So we need to hold the input to each stage constant until the previous
stage is done
We achieve this by adding a register in between the stages
So by using a pipeline, we can make our slowest path shorter and
therefore reduce the delay between actions
All data paths are built using a pipeline of some sort, either to eliminate
races or to increase throughput
Basics of Sequential Logic
Bi-Stability Principle
Back-to-back inverter = Cross-coupled inverter pair = Basic memory element
When the gain of the inverter in the transient region is larger than 1, A and B are the only two stable operation points
C is a metastable operation point
<Figure: cross-coupled inverters with Vi2 = Vo1 and Vi1 = Vo2; the two voltage transfer characteristics intersect at the stable points A and B and at the metastable point C>
Writing new value
Cut the feedback loop: use mux!
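The bi-stability argument can be illustrated numerically. The sketch below assumes an idealized tanh-shaped inverter characteristic with a midpoint gain magnitude of 2.5; this is not a device model, and `VDD`, `GAIN`, `inverter` and `settle` are names chosen for illustration. Going around the loop repeatedly pushes any starting voltage away from the midpoint toward one of the two stable points, while a start exactly at the midpoint stays there.

```python
import math

VDD = 1.0
GAIN = 5.0  # assumed steepness; gives |dVout/dVin| = 2.5 > 1 at the midpoint


def inverter(v_in):
    """Idealized inverter VTC (tanh-shaped; an assumption, not a device model)."""
    return 0.5 * VDD * (1.0 - math.tanh(GAIN * (v_in - 0.5 * VDD)))


def settle(v_start, iterations=50):
    """Go around the cross-coupled loop: Vi1 -> Vo1 = Vi2 -> Vo2 = Vi1."""
    v = v_start
    for _ in range(iterations):
        v = inverter(inverter(v))
    return v


for v0 in (0.4, 0.5, 0.6):
    print(v0, round(settle(v0), 3))
```

Starting slightly below the midpoint settles near 0, slightly above settles near VDD, and only the exact midpoint (the metastable point) remains in place.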
Ratioed vs Non-Ratioed Latch
A static latch can be made by using a feedback inverter.
The TG (with the driver before it) must overcome the feedback inverter to write into the latch
But it is usually more robust to create a mux-based non-ratioed latch
At the expense of size
<Figure: ratioed latch (TG with feedback inverter) and mux-based non-ratioed latch, both with input D and clock CLK>
Latch vs Flip Flop
We refer to registers as either latches or flip flops
Latch
Level sensitive: alternates between transparent and opaque phases
Flip Flop
Edge-triggered: samples at the clock edge and stays locked in between
During high clock phases, a latch is transparent
Latching the input on the falling edge
A Flip Flop only samples the input on the rising edge
<Figure: waveforms of Clock, Input (D), Latch output and Flip Flop output>
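The difference between the two elements can be sketched with sampled waveforms. This is a behavioral Python model under simple assumptions: one list entry per time step, an initial output of 0, and the illustrative function names `latch` and `flip_flop`.

```python
def latch(clk, d, q0=0):
    """Positive level-sensitive latch: transparent while clk is 1."""
    q, out = q0, []
    for c, x in zip(clk, d):
        if c == 1:
            q = x  # transparent: output follows the input
        out.append(q)  # clk low: output holds the last value
    return out


def flip_flop(clk, d, q0=0):
    """Positive edge-triggered flip flop: samples d on a 0 -> 1 clock transition."""
    q, prev, out = q0, 0, []
    for c, x in zip(clk, d):
        if prev == 0 and c == 1:
            q = x  # sample only at the rising edge
        prev = c
        out.append(q)
    return out


clk = [0, 1, 1, 0, 0, 1, 1, 0]
d = [1, 1, 0, 0, 1, 0, 1, 1]
print(latch(clk, d))      # [0, 1, 0, 0, 0, 0, 1, 1]
print(flip_flop(clk, d))  # [0, 1, 1, 1, 1, 0, 0, 0]
```

The latch output follows every change of D during the high phase, whereas the flip flop output changes only at the rising edges.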
Latch Based Design vs Flip Flop Based Design
Latch-based
Flip-Flop-based
Static vs Dynamic Latch
A static latch – Stores its output in a static state
A dynamic latch – Uses temporary capacitance to store its state
As with logic, this provides a trade-off between area, speed and
reliability
<Figure: Static Latch (2:1 MUX with inputs S0 and S1, select CLK, input D, output Q) and Dynamic Latch (input D, output Q, clock CLK)>
Some basic implementations of static and dynamic latches
<Figure: static and dynamic latch schematics, each with input D, output Q and clock CLK>
Multiplexer Based NMOS Latches
Making a Flip Flop
Conceptually, we can create an edge triggered flip-flop by combining
two opposite polarity latches:
<Figure: Input -> latch (D, Q) -> mid -> latch (D, Q) -> out; the two latches are clocked by clk with opposite polarity>
Master-Slave (Edge-Triggered) Register
Two opposite latches trigger on edge
Also called master-slave latch pair
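The master-slave composition can be sketched as two opposite-polarity latches in series. In this behavioral model (illustrative function name `master_slave`, one list entry per time step, both latches initialized to 0), the master is transparent while the clock is low and the slave while it is high, so the output takes the value that was present just before the rising edge.

```python
def master_slave(clk, d):
    """Edge-triggered register built from two opposite-polarity latches."""
    master, slave, out = 0, 0, []
    for c, x in zip(clk, d):
        if c == 0:
            master = x      # master (negative latch) is transparent, slave holds
        else:
            slave = master  # slave (positive latch) is transparent, master holds
        out.append(slave)
    return out


clk = [0, 0, 1, 1, 0, 0, 1, 1]
d = [1, 0, 1, 1, 1, 1, 0, 0]
print(master_slave(clk, d))  # [0, 0, 0, 0, 0, 0, 1, 1]
```

Changes of D during the high phase never reach the output, because the master is holding while the slave is transparent.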
Master-Slave Register
Multiplexer-based latch pair
<Figure: mux-based master-slave register; inverters I1 to I6, transmission gates T1 to T4, input D, internal node QM, output Q, clock CLK>
How many transistors make up this flip-flop?
What is its clock load?
Reduced Clock Load Master-Slave Flip Flop
Writing value by overpowering feedback loop
Feedback inverters (I2, I4) should be weaker than the driving inverters (I1, I3)
Resettable Flip Flops
Asynchronous Set / Reset Flip Flop
Synchronous Reset Flip Flop
Timing Parameters in Sequential Circuits
Timing Definitions
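<Figure: register with data input D, clock input CLK and output Q; timing diagram in which D must be stable for tsetup before and thold after the clock edge, and Q becomes stable tcq after the clock edge>
tcq: propagation delay
tsetup: setup time
thold: hold time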
Clk-Q Delay – tcq
tcq is the time from the clock edge until the data appears at the
output
The tcq for rising and falling outputs is different
<Figure: clk and Q waveforms annotated with tcqLH and tcqHL>
Mux Based Flip Flop – tcq Calculation
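During the low clock phase, data traverses the master latch and “waits” for the clock at the input of pass gate T3
When the clock rises, data has to go through the pass gate (T3) and the inverter (I6)
<Figure: mux-based master-slave register; inverters I1 to I6, transmission gates T1 to T4, input D, internal node QM, output Q, clock CLK>
$t_{cq} = t_{pd}(T_3) + t_{pd}(I_6)$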
Setup Time – tsetup
Setup time is the time the data has to arrive before the clock to
ensure correct sampling
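<Figure: three data transitions relative to the clock edge; the first two arrive at least tsetup before the edge (Good!), the third arrives too late (BAD!)>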
Mux Based Flip Flop – tsetup Calculation
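Before the clock edge, data should have propagated to the latching pass gate
Otherwise the data will be restored to the previous state
<Figure: mux-based master-slave register; inverters I1 to I6, transmission gates T1 to T4, input D, internal node QM, output Q, clock CLK>
$t_{setup} = t_{pd}(I_1) + t_{pd}(T_1) + t_{pd}(I_3) + t_{pd}(I_2) = t_T + 3t_I$ (one transmission gate delay plus three inverter delays)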
Timing Analysis – Setup Time
To obtain the setup time of the register while using SPICE, we
progressively skew the input with respect to the clock edge until the
circuit fails
Hold Time – thold
Hold time is the time the data has to be stable after the clock to
ensure correct sampling
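<Figure: three data transitions relative to the clk edge; the first two stay stable for at least thold after the edge (Good!), the third changes too early (BAD!)>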
Mux Based Flip Flop – thold Calculation
Flop n adder (https://www.youtube.com/watch?v=60SBVO5susA)
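When the clock rises, T1 turns off, latching the data at the output of I1
Therefore, any changes made after the clock edge will take tpd(I1) to traverse I1
The hold time is 0 or -tpd(I1)
<Figure: mux-based master-slave register; inverters I1 to I6, transmission gates T1 to T4, input D, internal node QM, output Q, clock CLK>
$t_{hold} = 0$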
Characterizing Timing
<Figure: timing characterization of a Register (tC-Q) and a Latch (tC-Q and tD-Q)>
Setup Time Violation
Setup time
Required time for input to be stable BEFORE clock edge
Setup time violation
Input switches right before clock switches
Happens when path delay is too long
Fixing setup time violation
Make clock period longer (stretch clock cycle)
Make path delay shorter (accelerate combinational logic)
Hold Time Violation
Hold time
Required time for input to be stable AFTER clock edge
Hold time violation
Input switches right after clock switches
Happens when path delay is too short
Fixing hold time violation
Make path delay longer
Cannot be fixed once the chip is fabricated!!!
<Figure: two registers in series, each with Tcq = 1 ns and Thold = 2 ns, and a buffer with Td_buf = 2 ns inserted between them>
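Using the values annotated in the figure (Tcq = 1 ns, Thold = 2 ns, Td_buf = 2 ns), and assuming no other logic and no clock skew between the two registers, the hold check works out as follows:

```latex
\text{Without buffer: } t_{cq} = 1\,\text{ns} < t_{hold} = 2\,\text{ns} \quad (\text{hold violation})
\\
\text{With buffer: } t_{cq} + t_{d,buf} = 1\,\text{ns} + 2\,\text{ns} = 3\,\text{ns} \ge t_{hold} = 2\,\text{ns} \quad (\text{hold met})
```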
Other Latch and Flip Flop Implementations
SR Latch Circuit
SR latch circuit based on NOR2 gates
Clocked SR Latch
Level sensitive circuit
CMOS D-Latch
CK = 1: Q assumes the value of the input D
CK = 0: Q preserves its state
Constructed from a two-inverter loop plus two CMOS TGs
CK = 1: the TG at the input is activated
CK = 0: the TG in the inverter loop is activated
Static vs Dynamic Latch
Static Latch: stored value remains as long as the supply voltage is applied.
Dynamic Latch (charge-based): if constantly clocked, memory can be refreshed.
Dynamic Edge-Triggered Registers (F/F)
Problem – Overlapping clocks can cause race conditions!
Solution: Enforce some delay and hold time!
Clock Overlap Problem
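<Figure: overlapping CLK and CLKb waveforms applied to the mux-based master-slave register (inverters I1 to I6, transmission gates T1 to T4, input D, internal node QM, output Q)>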
C2MOS – Clocked CMOS
CLK = 0
Master stage = evaluation mode,
Slave stage = hold mode (high impedance)
CLK = 1
Master stage = hold mode
Slave stage = evaluation mode
Insensitive to clock overlap
<Figure: C2MOS register during low-phase overlap of CLK and CLKb>
<Figure: C2MOS register during high-phase overlap of CLK and CLKb>
Tolerant to clock overlapping
Too long clock rise/fall times could be a problem
There exists a time slot where both the NMOS and PMOS transistors
are conducting. This creates a path between input and output that can
destroy the state of the circuit.
Clock rise/fall time should be much smaller than propagation delay
Buffers can be inserted to reduce rise/fall time
TSPC – True Single-Phase Clocked Register
Positive latch (transparent when CLK = 1)
Negative latch (transparent when CLK = 0)
TSPC enables including logic inside the latch!
Example: AND latch
TSPC Flip Flop
<Figure: TSPC flip flop schematic with transistors M1 to M9, input D, internal nodes X and Y, output Q, clock CLK and supply VDD>
TSPC latch malfunctions when the slope of the clock is not sufficiently steep
Pulse-Triggered Latches
Master-Slave Latches
<Figure: Data -> L1 (D, Q) -> L2 (D, Q), clocked by Clk>
Instead of a full set of master-slave latches, we can emulate an edge with a short clock pulse
Pulse-Triggered Latch
<Figure: Data -> L (D, Q), clocked by a pulsed Clk>
Design a clock pulse with a “clock chopper”
Schmitt Trigger Circuits
Shows hysteresis in DC characteristics
The switching threshold depends on the direction of the transition
This property is handy in noisy environments
<Schematic symbol: In, Out>
<Voltage transfer characteristic>
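The effect of hysteresis on a noisy input can be sketched behaviorally. The model below is non-inverting for simplicity, and the threshold values and function names (`schmitt_trigger`, `plain_threshold`) are assumptions chosen for illustration, not values from the lecture.

```python
def schmitt_trigger(samples, v_low=0.4, v_high=0.6, out0=0):
    """Non-inverting comparator with hysteresis (threshold values are illustrative)."""
    out, result = out0, []
    for v in samples:
        if out == 0 and v > v_high:
            out = 1  # a rising input must cross the upper threshold
        elif out == 1 and v < v_low:
            out = 0  # a falling input must cross the lower threshold
        result.append(out)
    return result


def plain_threshold(samples, v_th=0.5):
    """Single-threshold comparator, for comparison."""
    return [1 if v > v_th else 0 for v in samples]


noisy = [0.1, 0.45, 0.55, 0.48, 0.52, 0.7, 0.9, 0.58, 0.45, 0.3]
print(plain_threshold(noisy))  # [0, 0, 1, 0, 1, 1, 1, 1, 0, 0]
print(schmitt_trigger(noisy))  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
```

The single-threshold comparator glitches while the input hovers around 0.5, whereas the output with hysteresis switches once in each direction.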
CMOS Schmitt Trigger
<Hysteresis is useful in noisy environment>
Basic Timing Constraints
Synchronous Timing
Timing Constraints
There are two main problems that can arise in synchronous logic:
Max Delay: The data doesn’t have enough time to pass from one
register to the next before the next clock edge.
Min Delay: The data path is so short that it passes through several
registers during the same clock cycle.
Max delay violations are a result of a slow data path, including the
registers’ tsetup, therefore it is often called the “Setup” path
Min delay violations are a result of a short data path, causing the
data to change before the thold has passed, therefore it is often
called the “Hold” path
Setup (Max) Constraint
Let’s see what makes up our clock cycle:
After the clock rises, it takes tcq for the data to propagate to point A
Then the data goes through the delay of the logic to get to point B
The data has to arrive at point B, tsetup before the next clock
In general, our timing path is a race:
Between the Data Arrival, starting with the launching clock edge
And the Data Capture, one clock period later
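<Figure: two registers clocked by clk with Logic between them; A is the output of the first register and B is the input of the second; the timing diagram marks tcq at A and tsetup at B>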
$T \ge t_{cq} + t_{logic} + t_{setup}$
Hold (Min) Constraint
Hold problems occur due to the logic changing before thold has
passed
This is not a function of cycle time – it is relative to a single clock edge!
Let’s see how this can happen:
The clock rises and the data at A changes after tcq
The data at B changes tpd (logic) later
Since the data at B had to stay stable for thold after the clock (for the
second register), the change at B has to be at least thold after the clock
edge
<Figure: two registers clocked by clk with Logic between them and points A and B; the timing diagram marks tcq at A>
$t_{cq} + t_{logic} \ge t_{hold}$
Timing Constraints – Summary
For Setup constraints, the clock period has to be longer than the data path delay:
$T \ge t_{cq} + t_{logic} + t_{setup}$
This sets our maximum frequency
If we have setup failures, we can always just slow down the clock
For Hold constraints, the data path delay has to be longer than the hold time:
$t_{cq} + t_{logic} \ge t_{hold}$
This is independent of clock period.
If there is a hold failure, you can throw your chip away!
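The two constraints can be turned into a quick check. The sketch below uses hypothetical delay values in picoseconds and illustrative function names; it takes the longest logic delay for the setup constraint and the shortest for the hold constraint.

```python
def min_clock_period(t_cq, t_logic_max, t_setup):
    """Setup (max) constraint: T >= t_cq + t_logic + t_setup."""
    return t_cq + t_logic_max + t_setup


def hold_met(t_cq, t_logic_min, t_hold):
    """Hold (min) constraint: t_cq + t_logic >= t_hold, independent of T."""
    return t_cq + t_logic_min >= t_hold


# Hypothetical values in picoseconds, chosen only for illustration.
t_min_ps = min_clock_period(t_cq=100, t_logic_max=800, t_setup=100)
f_max_mhz = 1e6 / t_min_ps  # 1 / (T in ps) expressed in MHz
print(t_min_ps, f_max_mhz)  # 1000 1000.0
print(hold_met(t_cq=100, t_logic_min=50, t_hold=120))  # True
print(hold_met(t_cq=100, t_logic_min=0, t_hold=120))   # False
```

A failing setup check can be recovered by choosing a longer period, whereas the hold check contains no term that can be adjusted after fabrication.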
Pipelining – Optimizing Sequential Circuits
Example circuit to compute: log(|a + b|)
Clock Nonidealities
Clock skew
Spatial variation in temporally equivalent clock edges; deterministic + random, tskew
Clock jitter
Temporal variations in consecutive edges of the clock signal; modulation + random noise
Cycle-to-cycle (short-term) tJit,S
Long term tJit,L
Variation of the pulse width
Important for level sensitive clocking
Clock uncertainty: jitter + skew
<Figure: Ref_Clock and Received Clock waveforms with period T, annotated with tDRV_CLK, tRCV_CLK, tskew and tjit>
Positive and Negative Skew
<Figure: registers R1, R2 and R3 separated by combinational logic, with clock arrival times tCLK1, tCLK2 and tCLK3 and a delay on each clock segment>
Positive skew: CLK enters at R1 and is routed in the same direction as the data
Negative skew: CLK enters at R3 and is routed in the opposite direction to the data
Setup (Max) Constraint
The Launch path (still) consists of:
tcq + tlogic + tsetup
But if jitter makes the launch clock later, we need to add it to the data path delay
$t_{launch} = t_{cq} + t_{logic} + t_{setup} + t_{jitter}$
The Capture path consists of:
The clock period (T)
Positive skew means the capture clock path is longer
If jitter makes the capture clock earlier, we need to subtract it
$t_{capture} = T + t_{skew} - t_{jitter}$
Our max constraint is: $t_{capture} \ge t_{launch}$
So we get: $T + t_{skew} \ge t_{cq} + t_{logic} + t_{setup} + 2t_{jitter}$
Data has to arrive before next clock edge
$T + t_{skew} \ge t_{cq} + t_{logic} + t_{setup} + 2t_{jitter}$
Hold (Min) Constraint
The Launch path (still) consists of:
tcq + tlogic
But if jitter makes the launch clock earlier, we need to subtract it from the data path delay
$t_{launch} = t_{cq} + t_{logic} - t_{jitter}$
The Capture path consists of:
Skew that makes the clock edge arrive at the capture register later than at the launch register
Actually, since it is a single clock edge, jitter should affect the capture clock the same as the launch clock
But as a worst case, we will add it as spatial jitter
$t_{capture} = t_{skew} + t_{hold} + t_{jitter}$
Our min constraint is: $t_{launch} \ge t_{capture}$
So we get: $t_{cq} + t_{logic} - 2t_{jitter} \ge t_{hold} + t_{skew}$
Data has to arrive after the same clock edge has arrived at the capture register
$t_{cq} + t_{logic} - 2t_{jitter} \ge t_{hold} + t_{skew}$
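The opposite effect of skew on the two constraints can be sketched as slack calculations. The expressions follow the launch and capture terms listed above, with jitter counted once on each side; all numeric values are hypothetical and in picoseconds, and the function names are illustrative.

```python
def setup_slack(T, t_cq, t_logic_max, t_setup, t_skew=0, t_jitter=0):
    """Slack of T + t_skew >= t_cq + t_logic + t_setup + 2*t_jitter."""
    return (T + t_skew) - (t_cq + t_logic_max + t_setup + 2 * t_jitter)


def hold_slack(t_cq, t_logic_min, t_hold, t_skew=0, t_jitter=0):
    """Slack of t_cq + t_logic - 2*t_jitter >= t_hold + t_skew."""
    return (t_cq + t_logic_min - 2 * t_jitter) - (t_hold + t_skew)


# Hypothetical values in picoseconds.
common = dict(t_cq=100, t_jitter=20)
results = {}
for skew in (0, 100):
    s = setup_slack(T=1000, t_logic_max=800, t_setup=100, t_skew=skew, **common)
    h = hold_slack(t_logic_min=50, t_hold=60, t_skew=skew, **common)
    results[skew] = (s, h)
    print(skew, s, h)
```

With these numbers, adding 100 ps of positive skew turns a failing setup check (negative slack) into a passing one while turning a passing hold check into a failing one.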
Adding in Variation
Variations in both fabrication and operating conditions occur and are
taken into account through “corner” simulation
For global variation we have defined three primary simulation corners:
Typical Corner: our gates operate under nominal conditions and variation
Slow Corner: our gates are slower (i.e., high VT, high temperature, low voltage)
Fast Corner: our gates are faster (i.e., low VT, low temperature, high voltage)
To assume worst-case conditions:
Calculate max-delay with the slowest possible transitions: Slow Corner
Calculate min-delay with the fastest possible transitions: Fast Corner
Static Timing Analysis Example
Static Timing Analysis – Example
Given synchronous network with:
In addition:
Static Timing Analysis – Example
Let’s find setup constraints for each path:
Path 1:
Path 2:
Path 3:
Critical path is Path 1, so the maximum frequency is 666MHz
Now let’s find hold time constraints for :
Path 1:
Path 2:
Path 3:
If we can set , we can use them to maximize frequency!
Equally dividing the delay of each path:
To get max. frequency, set all delays to 1.1ns:
Clock Distribution
Clock is distributed in a tree-like fashion
Zero-Skew Network by CAD
An example of the zero-skew clock routing network, generated by a computer-aided design (CAD) tool
Self-Timed and Asynchronous Design
Functions of clock in synchronous design
Act as completion signal
Ensure the correct ordering of events
Need global clock distribution without skew
Asynchronous Design
Completion is ensured by careful timing analysis
Ordering of events is implicit in logic
No global clock = no skew = Potentially high speed and low power
High design complexity due to careful timing requirement
Self-Timed Design
Completion ensured by completion signal
Ordering imposed by handshaking protocol
Synchronous vs Self-Timed Pipelined Datapath
Relevant Reading
Jan M. Rabaey, Digital Integrated Circuits, 2nd Ed., Chapter 7