Timing Closure
Lecture 5
Jignesh Shah
UCSC Extension, Silicon Valley
Spring 2024
Agenda
• Logistics
• Quick review of lecture 4
• Timing Exceptions
• PVT – process, voltage, and temperature
• STA Modes
• Lab #5
Logistics
• Midterm next week
- A take-home quiz, which will be due on Sunday May 18, 2024
Review of lecture 4
• Design constraints
• The intent of the design and its external conditions are specified through constraints.
• Input/output conditions:
• set_input_delay, set_output_delay, set_load, etc.
• Clock description/modeling
• create_clock, set_clock_uncertainty, etc.
Timing Exceptions
• As per RTL design intent, the setup or hold check on some timing paths
might not be required, or can be relaxed to more than one cycle.
• For STA, such design intent can be expressed through
- False Paths: Timing paths on which setup and hold (S&H) checks are not required
even though a logical connection exists.
- Multi Cycle Paths: Timing paths which allow more than one
clock cycle for the signal.
- Case Analysis: Specifies a constant (1 or 0) level for a signal.
- Disable Timing: Segments of paths which can be disabled.
- Clock Groups: Exclusivity among multiple clock domains for clock
domain crossing paths.
MCP for longer than one cycle path due to FSM
• create_clock -name CLK -period 10 [get_ports CLK]
• set_multicycle_path 3 -setup -from [get_pins G1/Q] -to [get_pins G2/D]
• set_multicycle_path 2 -hold -from [get_pins G1/Q] -to [get_pins G2/D]
Fig 1. Timing edges for MCP
Fig 2. Example circuit for MCP
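The edge arithmetic behind the two multicycle constraints above can be sketched in a few lines of Python. This is a minimal illustration, assuming launch and capture flops share a single clock and the launch edge is at t = 0; the function name `mcp_check_edges` and its parameters are hypothetical, not part of any STA tool.

```python
def mcp_check_edges(period, setup_mult=1, hold_mult=0):
    """Return (setup_capture_time, hold_check_time) relative to a launch edge at t=0.

    Assumes the launch and capture flops use the same clock of the given period.
    """
    # Setup is checked at the Nth capture edge after launch.
    setup_capture = setup_mult * period
    # By default hold is checked one edge before the setup capture edge;
    # each unit of hold multiplier moves that check one more edge earlier.
    hold_check = (setup_mult - 1 - hold_mult) * period
    return setup_capture, hold_check


print(mcp_check_edges(10))                              # default single-cycle path
print(mcp_check_edges(10, setup_mult=3))                # setup relaxed, hold not moved
print(mcp_check_edges(10, setup_mult=3, hold_mult=2))   # hold moved back to the launch edge
```

With the 10 ns clock from the slide, a setup multiplier of 3 alone would leave the hold check at the 20 ns edge, which is why the matching hold multiplier of 2 is applied to bring it back to the launch edge.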
MCP between clock domains as per Design
Constraint for CDC through Synchronizer
• Max delay or MCP between clock 1 & clock 2 for data signals
• False path or max delay for control signals
MCP between clock domains with Synchronizer
False path for unsensitized logic
• set_false_path -through [get_pins M1/b] -through [get_pins M2/e]
• set_false_path -from FF1 -to [get_clocks CLKB]
• set_false_path -from FF2 -to [get_clocks CLKA]
False path due to exclusive clocks
• set_false_path -from [get_clocks C1] -to [get_clocks C2]
• set_false_path -from [get_clocks C2] -to [get_clocks C1]
• set_false_path -from [get_clocks GC1] -to [get_clocks GC2]
• set_false_path -from [get_clocks GC2] -to [get_clocks GC1]
The set_clock_groups command can also be used to ignore timing paths
between clock domains
(e.g. set_clock_groups -logically_exclusive -group {C1 GC1} -group {C2 GC2})
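To see which clock-to-clock paths a grouping like the one above stops timing, the groups can be expanded into directed clock pairs. The sketch below is only an illustration for timing-path purposes (it ignores the crosstalk differences between clock groups and false paths); the helper name `cut_clock_pairs` is hypothetical.

```python
from itertools import permutations, product


def cut_clock_pairs(groups):
    """List every directed (launch, capture) clock pair that crosses two different groups."""
    pairs = []
    for src_group, dst_group in permutations(groups, 2):
        pairs.extend(product(src_group, dst_group))
    return pairs


groups = [["C1", "GC1"], ["C2", "GC2"]]
for launch, capture in cut_clock_pairs(groups):
    # Print the false-path command that would cut the same clock pair.
    print(f"set_false_path -from [get_clocks {launch}] -to [get_clocks {capture}]")
```

For two groups of two clocks this expands to eight directed pairs, including cross pairs such as C1 to GC2 that the four false-path commands do not list, so the two forms are not strictly identical.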
False path for bus protocol
• set_false_path -from [get_clocks Clk_Slave1] -to [get_clocks Clk_Slave2]
• set_false_path -from [get_clocks Clk_Slave2] -to [get_clocks Clk_Slave1]
The set_clock_groups command can also be used to ignore timing paths
between clock domains
(e.g. set_clock_groups -asynchronous -group Clk_Slave1 -group Clk_Slave2)
False path for Control Signals
• Async control signals
set_false_path -from rst_n -fall (assuming reset is active low)
• Quasi-static signals
set_false_path -through <clock_select_signal>
Static Signal
• Constraint to specify a constant value on a pin or input port
* set_case_analysis 1 [get_pins {UMUX2/CLK_SEL[2]}]
(for func mode)
* set_case_analysis 0 [get_pins {UMUX2/CLK_SEL[2]}]
(for test mode)
• The STA tool propagates the case value through the downstream logic.
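Case propagation can be pictured as three-valued constant propagation, where a signal is 0, 1, or unknown (toggling). The sketch below is a toy model under that assumption, not how any particular STA tool is implemented; `mux2`, `and2`, and `active_mux_inputs` are hypothetical helper names.

```python
def mux2(sel, d0, d1):
    """2:1 mux over three-valued logic: 0, 1, or None (toggling / unknown)."""
    if sel == 0:
        return d0
    if sel == 1:
        return d1
    # Select unknown: the output is constant only if both data inputs agree.
    return d0 if d0 == d1 and d0 is not None else None


def and2(a, b):
    """AND gate over three-valued logic."""
    if a == 0 or b == 0:
        return 0  # a controlling 0 forces the output
    if a == 1 and b == 1:
        return 1
    return None


def active_mux_inputs(sel):
    """Data inputs whose timing arcs to the mux output remain enabled."""
    return {0: ["d0"], 1: ["d1"]}.get(sel, ["d0", "d1"])


print(active_mux_inputs(1))     # select tied to 1: only the d1 arc is timed
print(active_mux_inputs(None))  # no case value: both arcs are timed
print(and2(0, None))            # a constant 0 propagates through the AND gate
```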
Breaking timing arc of a cell
• set_disable_timing -from S -to Z [get_cells UMUX0]
• Alternative way: set_false_path -through UMUX0/S -through UMUX0/Z
• PHYCLK could be a “quasi static” signal which toggles only when Reset is
asserted or when Clock is not toggling.
• The set_disable_timing exception can also be used to break a combinational loop.
Design Rule Checks
• Rules for all ports and pins in the design to meet transition time and
capacitance limits:
• set_max_transition 0.3 [current_design]
• set_max_capacitance 0.7 [current_design]
Point to Point delay
• Min/max delay constraints specify the delay between two pins,
ports, or clock domains
• set_max_delay 4 -to G1
• set_max_delay 1 -from G1 -to G2
• set_max_delay 3 -from G1 -through G2 -to G3
• set_min_delay 3 -from [get_clocks CLK1] -to [get_clocks CLK2]
• A max/min delay constraint overrides the default setup & hold check on the path.
• A false path overrides a max/min delay constraint.
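The two override rules above amount to a simple priority order. The sketch below encodes only those two rules; where multicycle paths rank is not stated on the slide, so they are left out. The names `PRECEDENCE` and `governing_constraint` are hypothetical.

```python
# Highest priority first, following only the two rules stated above.
PRECEDENCE = ("false_path", "max_min_delay")


def governing_constraint(applied):
    """Return which constraint governs a path, given the exceptions applied to it."""
    for kind in PRECEDENCE:
        if kind in applied:
            return kind
    return "setup_hold"  # default check when no exception applies


print(governing_constraint({"max_min_delay"}))
print(governing_constraint({"max_min_delay", "false_path"}))
print(governing_constraint(set()))
```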
Many IPs in a typical design
• Multiple Hierarchies
Example:
- Full chip
- Sub System 1
- Partition 1
- Sub block 1
Block 1
Block 2
- Sub Block2
Block 1
- Partition 2
- Sub block 2_1
Block 2_1
- Sub System 2
- Glue logic
Constraint management for blocks & Full chip
• Top Down Methodology
• Bottom Up Methodology
• Hybrid of Top Down & Bottom Up
• More generated clocks than the logic designer intended
• Who writes the constraints, and who owns them?
- RTL team?
Or
- Design Synthesis team?
Or
- STA team?
Book recommendation
• Constraining Designs for Synthesis and Timing Analysis
A Practical Guide to Synopsys Design Constraints (SDC)
Authors: Sridhar Gangadharan, Sanjay Churiwala
• https://link.springer.com/book/10.1007/978-1-4614-3269-2
Data to Data check (i.e. set_data_check)
• Setup & hold timing check between two data signals.
• One pin is the constrained pin and the other is the related pin when writing the constraint
set_data_check -from <related_pin> -to <data_pin> -setup <number>
set_data_check -from <related_pin> -to <data_pin> -hold <number>
No Change data signal
• Use set_data_check to restrict the switching window of a signal with respect
to the high or low pulse of another signal
• set_data_check can also be used as a skew constraint among bus signals by using
negative setup and hold numbers
Process Voltage Temperature
• What is PVT and why is it important?
• Delay & current through transistors depend on
- the manufacturing process (how the transistors are made),
- the voltage at the transistor,
- the junction temperature at which the transistor is operating
• This behavior directly impacts the performance/speed of the logic gates
• Timing closure depends on the specific PVT conditions
• Closing timing in one specific PVT corner does not guarantee working silicon for all
parts & all operating conditions of the product.
• Timing must be closed at multiple corners, each with specific design goals, for working silicon
Threshold Voltage
• Vt = threshold voltage, the voltage at which the transistor starts to turn on
• Ids, the transistor’s drain-to-source current, varies with (Vg – Vt)², so it falls as Vt rises
• Thus, transistor speed is inversely related to Vt
• Lower Vt, faster transistor → faster logic gates
• Higher Vt, slower transistor → slower logic gates
• However,
• Lower Vt leads to higher leakage (more power)
• Higher Vt leads to lower leakage (less power)
• Solution:
• Mixed Vt design
• Low Vt cells for critical paths
• Standard Vt cells for regular logic
• High Vt cells for low speed paths
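A quick numeric illustration of the speed side of this trade-off, using only the square-law overdrive term (Vg – Vt)² with Vg = VDD. The supply and the three threshold values are assumed for illustration and are not taken from any real library; leakage is not modeled.

```python
VDD = 0.9  # volts, assumed supply
VT_FLAVORS = {"LVT": 0.2, "SVT": 0.3, "HVT": 0.4}  # volts, assumed thresholds


def overdrive_term(vt, vdd=VDD):
    """Square-law overdrive term (Vg - Vt)**2 with Vg = VDD."""
    return (vdd - vt) ** 2


baseline = overdrive_term(VT_FLAVORS["SVT"])
for name, vt in VT_FLAVORS.items():
    # Drive strength of each flavor relative to the standard-Vt cell.
    print(f"{name}: drive relative to SVT = {overdrive_term(vt) / baseline:.2f}")
```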
Channel Length
• Gate Length (Lg) – length of the transistor gate (typically also
referred to as the feature size)
• Ids is directly proportional to 1/Lg, thus variations in Lg lead to big
changes in transistor performance
• Lg is a tightly controlled process parameter
• Smaller Lg → faster transistor, higher leakage
• Larger Lg → slower transistor, lower leakage
• Poly biasing…
• Increase Lg to reduce leakage; these are sometimes called wimpy cells
• Starting at 5nm and below, this is no longer a workable solution
• Finer grain control of Vt (multiple Vt cells)
Process corner (SS, TT, FF)
• Systematic process variation is characterized in SPICE models for use
in circuit simulators such as HSPICE
• Typically Slow N/Slow P (SS) is considered the worst case for setup and Fast
N/Fast P (FF) is considered the worst case for hold
• There are cases where Slow N/Fast P or Fast N/Slow P can be the worst
case for setup/hold
• Random process variations are characterized in statistical models for
use in Monte Carlo simulations
Voltage
• From the transistor Ids equation, note that Ids increases with
Vds, the drain-to-source voltage of a transistor
• In digital logic gates, this voltage is VDD
• The speed (performance) of the transistor therefore increases with VDD
• Therefore, the speed of logic gates also increases with VDD
• Higher VDD → faster gates
• Lower VDD → slower gates
• Voltage is a variable that can be externally controlled
• Can be easily manipulated in lab experiments
Temperature
• The mobility of the carriers, µn and µp, depends on the temperature
• For NMOS, the carriers are electrons and mobility of electrons (µn) is defined as how
fast the electrons can travel through the silicon
• For PMOS, the carriers are holes and the mobility of holes (µp) is defined as how fast
the holes can travel through the silicon
• Holes are basically the absence of electrons
• As temperature goes up, the lattice structure in silicon vibrates more, hence
impeding the mobility of the electrons/holes
• Higher temperature → slower gates
• As temperature goes down, the lattice structure in silicon vibrates less,
hence allowing the electrons/holes to pass by with fewer impediments
• Lower temperature → faster gates
Temperature Inversion
• The Ids current varies linearly with µ and (Vg – Vt)²
• Both µ and Vt decrease with increasing temperature, and vice versa
• The term with more impact on the final current determines whether the
delay increases or decreases with increasing temperature
• With deep submicron technologies (90nm and below), VDD can scale
as low as 0.9V but Vt has not scaled as aggressively (0.2 – 0.4V)
• Even though µ improves at lower temperature, (Vg – Vt)² has a
greater impact on the current, resulting in less drive current at lower
temperatures than at higher temperatures
Temperature Inversion (cont.)
• Hence, low temperature, not high temperature, becomes the SLOW corner
• This is especially true with high Vt devices
• This phenomenon is known as temperature inversion
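The competition between µ and (Vg – Vt)² can be made concrete with a toy model. Everything numeric here is an assumption chosen for illustration: mobility scaling as (T/T0)^-1.5, Vt falling by 2 mV per kelvin from 0.4 V at 300 K, and Vg = VDD. It is a sketch of the trend, not a device model.

```python
def relative_drive(vdd, temp_k, vt0=0.4, t0=300.0, vt_slope=0.002, mu_exp=-1.5):
    """Relative drive current ~ mu(T) * (Vg - Vt(T))**2 with Vg = VDD.

    Illustrative models only: mobility scales as (T/T0)**mu_exp and
    Vt falls linearly by vt_slope volts per kelvin above T0.
    """
    mobility = (temp_k / t0) ** mu_exp
    vt = vt0 - vt_slope * (temp_k - t0)
    overdrive = max(vdd - vt, 0.0)
    return mobility * overdrive ** 2


COLD, HOT = 233.0, 398.0  # roughly -40 C and 125 C, in kelvin

for vdd in (1.8, 0.9):
    cold, hot = relative_drive(vdd, COLD), relative_drive(vdd, HOT)
    slow = "cold" if cold < hot else "hot"
    print(f"VDD={vdd} V: cold={cold:.3f} hot={hot:.3f} -> slow corner is {slow}")
```

Under these assumptions the 1.8 V case is slowest when hot (mobility dominates), while the 0.9 V case is slowest when cold (the overdrive term dominates), which is the inversion described above.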
STA modes
• A design can operate in multiple modes such as functional (mission),
test, sleep, high speed, low power, etc.
• Timing checks have to be done for each possible mode of a design.
• Each mode can have a unique set of constraints such as clock frequency,
case analysis for static signals, false paths, etc.
Scan chains for Test (aka Design For Test)
• DFT (test) has multiple modes such as scan shift, scan capture, at-speed,
JTAG, boundary scan, etc.
• Each DFT mode requires a separate set of constraints.
Fig 3. Another topology of scan chain
Corner Explosion
Operating modes: functional, scan shift, scan capture, BIST, async
FE corners: SS, TT, FF
(library corner labels in the figure: SSG, SSGNP, TT, FFGNP, FF)
BE corners: C-worst, C-best, Typical, RC-worst, RC-best
Corner     ΔW       ΔT       ΔH
Typical    typical  typical  typical
C-best     min      min      max
C-worst    max      max      min
RC-best    max      max      max
RC-worst   min      min      min
Temp corners: cold, hot
Voltage: Vlow, Vnominal, Vhigh
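Multiplying out the lists on this slide shows why it is called an explosion. The sketch below simply enumerates every combination of the modes and corners named above; in practice a project signs off on a reduced subset, which is not modeled here.

```python
from itertools import product

modes = ["functional", "scan shift", "scan capture", "bist", "async"]
fe_corners = ["SS", "TT", "FF"]
be_corners = ["C-worst", "C-best", "Typical", "RC-worst", "RC-best"]
temps = ["cold", "hot"]
voltages = ["Vlow", "Vnominal", "Vhigh"]

# Every (mode, process, wire, temperature, voltage) combination is one STA scenario.
scenarios = list(product(modes, fe_corners, be_corners, temps, voltages))
print(len(scenarios))  # 5 * 3 * 5 * 2 * 3 = 450
```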
Worst case Corner
• Design for worst case
• Usually for setup, the worst delay corner is SS, low voltage, high and low
temp, with C-worst & RC-worst wire corners (depends on design and process).
• Usually for hold, the minimum delay corner is FF, high voltage, high and low
temp; SS, low voltage, high or low temp applies for clock-skew-dominated paths.
• Hold analysis needs to be done for all wire corners (i.e. more pessimism is required).
• Robust, as it covers the process yield distribution
• Increases cost (larger die) and schedule (more difficult to fix setup/hold
violations across SS to FF)
[Figure: design & process window bounded by the corners FF/Vhigh/hot, FF/Vhigh/cold, SS/Vlow/hot, and SS/Vlow/cold]
Lab #5 –
* Copy files and directories from
/home/jdshah/spring_2024_labs/lab5
Thank You
Backup Slides
Clock Group command in PT
• set_clock_groups:
A command to specify the relationship between groups of clocks, used to compute delay and crosstalk interaction
between the clocks.
• set_clock_groups -asynchronous
For clocks with no timing relationship at all (e.g. coming from different PLLs, phases not aligned, etc.)
Crosstalk with infinite window (i.e. most pessimistic)
• set_clock_groups -logically_exclusive
For clocks with no logical relationship or no valid timing path between them (e.g. muxed clocks)
Crosstalk with overlap window
• set_clock_groups -physically_exclusive
For clocks that do not exist simultaneously in operation (e.g. functional, test, low_power clock & high_speed clock)
No crosstalk among them.
• set_clock_groups -asynchronous -allow_paths
Restores timing of the paths between the groups so that an asynchronous check can be applied (e.g. set_max_delay)
Crosstalk with infinite window
The .lib file
• The general format for a .lib file is:
[general and global attributes]
[cell1
- general attributes for cell1
- input pin characteristics (capacitance etc.)
- output pin characteristics (capacitance, timing
etc.) ]
[cell2
- general attributes for cell2
- input pin characteristics (capacitance etc.)
- output pin characteristics (capacitance, timing
etc.) ]
[cell3 ..
Timing .Lib Details
• pin(pin_name)
• direction : input, output, inout, internal
• clock_pin
• function(expression)
• Used for output or bidirectional pins. The expression defines the value of the
output pin as a function of input pins
• max_capacitance
• The maximum output capacitive load that an output pin can drive
• capacitance
• The capacitive load of an input, inout, output or internal pin. Usually defined
as 0 for output pins
• internal_power()
• Output pins in combinational cells, define the rise_power and fall_power to a
related input pin. Input and clock pins also define this in sequential cells
• timing()
• Output pins in combinational cells, define the rise_delay, fall_delay,
rise_transition and fall_transition to a related input pin
• pulse width definitions, recovery, removal
• Required for clocks, asynchronous set and reset pins
NMOS/PMOS Basics
• NMOS (terminals: gate, source, drain, substrate): good at passing 0 (Low)
• PMOS (terminals: gate, source, drain, substrate): good at passing 1 (High)
Cell delay of cascaded gates
Sequential Check & Delay
Pipelining explanation through Diagram
[Figure: car assembly-line analogy with stages Engine, Chassis, and Paint; two panels plot cars A, B, and C against time, in order of manufacturing (Car A, B, and then C)]
© 2019 Arm Limited
Pipelining
• Without pipelining (register, combinational logic, register): Clock period = T + C
• Now, if we create two pipeline stages: Clock period = T/2 + C
• If C is small, our clock frequency has almost doubled. Hence, our throughput will double, too.
• T is the critical path through our combinational logic
• C is the delay through our register (clk->Q delay)
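The clock period formulas above are easy to check numerically. The sketch below assumes the logic delay T splits evenly across stages and uses made-up values of T = 10 ns and C = 1 ns; `clock_period` is a hypothetical helper.

```python
def clock_period(t_logic, c_reg, stages=1):
    """Minimum clock period when the logic delay is split evenly across the stages."""
    return t_logic / stages + c_reg


T, C = 10.0, 1.0  # assumed delays in ns
single = clock_period(T, C)
double = clock_period(T, C, stages=2)
print(f"one stage: {single} ns, two stages: {double} ns, speedup: {single / double:.2f}x")
```

The speedup only approaches 2x as C shrinks relative to T, because the register delay is paid once per stage.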