Timing Closure
Lecture 4
Jignesh Shah
UCSC Extension, Silicon Valley
Spring 2024
Agenda
• Logistics
• Quick review of lecture 3
• Design Intent through Synopsys Design Constraint (SDC)
• Clock definitions
• Input drive/output loads
• Multi cycle/false paths, etc…
• Lab #4
Logistics
• Midterm on Week 6 : Take Home test
• Lab 3 question on AOCV:
Review of lecture 3
• Process or Manufacturing Variation for STA
- Global variation is accounted for by characterizing libraries at the
foundry-provided process corners, such as SS, TT and FF
- Local variation can be accounted through
• Design for worst case (fixed margin) OR
• Single Derate for On Chip Variation (aka OCV) OR
• Derate based on logical depth and physical distance (i.e. Advanced OCV) OR
• Statistical margin using characterized model of library cells (i.e. Parameterized OCV) OR
• Liberty Variation Format, extended version of POCV in which all arcs including transition
time and setup/hold constraint arcs are characterized for variation.
Setup & Hold checks of digital design
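• Setup: data must be stable at the capture flop at least the setup time before the capture edge (the next clock edge after launch).
• Hold: newly launched data must not change the capture flop's input until at least the hold time after the previous capture edge.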
Fig1 Example of CPU Pipeline
Fig2 Pipeline flops
Fig3 Timing check
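The two checks can be written as inequalities. This is a sketch for a single-clock flop-to-flop path that ignores clock skew and uncertainty; the symbols (clock period, clk->Q delay, longest and shortest combinational delay, setup and hold time) are introduced here for illustration and do not appear on the slides.

```latex
% Setup: the slowest data must arrive one setup time before the next edge
t_{cq} + t_{comb,\max} + t_{setup} \le T_{clk}

% Hold: the fastest data must not arrive before the hold time has elapsed
t_{cq} + t_{comb,\min} \ge t_{hold}
```

Note that the hold inequality does not contain the clock period, which is why a hold violation cannot be fixed by slowing the clock.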
Half cycle (aka Phase) paths
• Designs with opposite-edge flops (one triggered on the rising edge, the other on the falling edge) can have half-cycle paths
• The setup check is tighter than on a one-cycle path, but the hold check is relaxed
• Other setup & hold check types: Clock Gating, Time Borrowing, Multi Cycle (covered later)
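As a rough illustration of why setup tightens and hold relaxes, assume a 50% duty-cycle clock with launch on the rising edge and capture on the falling edge, and ignore skew. The symbols are illustrative, not from the slides.

```latex
% Setup: only half a period is available
t_{cq} + t_{comb,\max} + t_{setup} \le \tfrac{T_{clk}}{2}

% Hold: the previous capture edge is half a period before the launch edge
t_{cq} + t_{comb,\min} \ge t_{hold} - \tfrac{T_{clk}}{2}
```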
SDC (aka Design constraints) do the following
• Specify the design intent for physical implementation and timing closure.
• Create the Root or Source clocks
• Create generated (i.e. derived) clocks
• Define design exceptions (e.g. false paths, multicycle paths, max_delay)
• Specify static signals (e.g. case_analysis, reset)
• Set timing requirements for input and output paths
• Define the driving cell & output load of the external circuit
• Guide the tool on delay modeling (e.g. disable_timing, gating_check)
Clock definition of source clock
• create_clock -name CLK -period 10 [get_ports CLKB]
• create_clock -period 10 -waveform {3 5 7 9} [get_ports CLKB]
• Refer to "man create_clock" in the STA tool for all options of create_clock
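A minimal sketch of how -period and -waveform interact. The port name CLKB is from the slide; the clock names CLK_50 and CLK_SHIFT and the edge values are hypothetical. When -waveform is omitted, the clock rises at 0 and falls at half the period.

```tcl
# Sketch only: CLK_50 and CLK_SHIFT are hypothetical clock names.
# 10-unit period, rising edge at 0 and falling edge at 5 (50% duty cycle).
create_clock -name CLK_50 -period 10 -waveform {0 5} [get_ports CLKB]

# A second clock on the same port, rising at 3 and falling at 5.
# -add keeps CLK_50 instead of overwriting it.
create_clock -name CLK_SHIFT -period 10 -waveform {3 5} [get_ports CLKB] -add
```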
Clock definition of derived clock
• Divided Clock
- create_clock -name CLKP [get_pins UPLL0/CLKOUT]
- create_generated_clock -name CLKPDIV2 -source [get_pins UPLL0/CLKOUT] -master_clock CLKP -divide_by 2 [get_pins UFF0/Q]
• Multiplex Clock
- create_clock -name TCLK [get_ports TCLK]
- create_clock -name TCLKDIV5 [get_ports TCLKDIV5]
- create_generated_clock -name FUNC_CLK -source [get_ports TCLK] [get_pins mux/TCLK_MUX_OUT]
- create_generated_clock -name SCAN_CLK -source [get_ports TCLKDIV5] -master_clock TCLKDIV5 [get_pins mux/TCLK_MUX_OUT] -add
Fig1 : Divided clock
Fig2 : Multiplex clock
Cascaded Clock definition
• create_clock -name SRC_CLK -period 1.2 [get_ports CLK]
• create_generated_clock -name DIV2_LSB -master_clock [get_clocks SRC_CLK] -source [get_ports CLK] -divide_by 2 [get_pins FF1/LSB]
• create_generated_clock -name DIV4_MSB -master_clock [get_clocks SRC_CLK] -source [get_ports CLK] -divide_by 4 [get_pins FF2/MSB]
OR
create_generated_clock -name DIV4_MSB -master_clock [get_clocks DIV2_LSB] -source [get_pins FF1/LSB] -divide_by 2 [get_pins FF2/MSB]
Multiplier Clock definition
• create_clock -name SRC_CLK -period 1.2 [get_ports CLK]
• create_generated_clock -name CLKOUT -master_clock [get_clocks SRC_CLK] -source [get_ports CLK] -multiply_by 2 -duty_cycle 50 [get_pins XOR1/Z]
Pulse Clock definition
• create_clock -name SRC_CLK -period 1.2 [get_ports CLK]
• create_generated_clock -name PULSE_CLK -master_clock [get_clocks SRC_CLK] -source [get_ports CLK] -edges {1 1 3} -edge_shift {0 <FP> 0} [get_pins AN1/Z]
<FP> = floating-point number (the shift applied to the second edge, i.e. the pulse width)
Multiple Clocks on same pin
• create_clock -name C1 -period 10 [get_ports C1]
• create_clock -name C2 -period 15 [get_ports C2]
• create_generated_clock -name C1_CLK -master_clock [get_clocks C1] -source [get_ports C1] -divide_by 1 [get_pins MX/CLK]
• create_generated_clock -name C2_CLK -master_clock [get_clocks C2] -source [get_ports C2] -divide_by 1 [get_pins MX/CLK] -add
• create_generated_clock -name GC1 -master_clock [get_clocks C1_CLK] -source [get_pins MX/CLK] -divide_by 3 [get_pins FF2/Q]
• create_generated_clock -name GC2 -master_clock [get_clocks C2_CLK] -source [get_pins MX/CLK] -divide_by 3 [get_pins FF2/Q] -add
Multiple clock domains
• Clock definition as per design intent:
- create_clock -name clka -period 10
- create_clock -name clkb -period 20
- create_generated_clock -name div1_clk -master_clock clkb -divide_by 2 -source [get_pins FF2/CP] [get_pins FF2/OUT]
- create_generated_clock -name div2_clk_a -master_clock clka -source [get_pins FF3/CP] -edges {2 4 6} [get_pins FF3/out]
- create_generated_clock -name div2_clk_b -master_clock clkb -source [get_pins FF3/CP] -divide_by 4 -add [get_pins FF3/out]
• Clock Domain Relationships: (by default, all clocks are treated as synchronous to each other)
- set_clock_groups -asynchronous -group {clka div2_clk_a} -group {clkb div2_clk_b div1_clk}
- set_clock_groups -logically_exclusive -group {clka} -group {div1_clk}
- set_clock_groups -physically_exclusive -group {div2_clk_a} -group {div2_clk_b}
Generated Clock Gotcha
Clock Propagation
• A clock defined through a constraint propagates through
combinational cells such as
-> Inverter, Buffer,
-> And, Nand, Or, Nor (i.e. discrete clock gater)
-> Integrated Clock Gater (i.e. ICG, clock gating with latch)
• A clock defined through a constraint does not propagate through
the output pin of sequential cells such as flops, latches & memory IP.
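Because a clock stops at a sequential output, a divider flop needs its own generated clock on the Q pin. A minimal sketch, assuming a hypothetical port CLK_IN and a hypothetical divide-by-2 flop UDIV with pins CP and Q:

```tcl
# Hypothetical names: CLK_IN (port), ROOT_CLK, UDIV (divider flop), DIV2_CLK.
create_clock -name ROOT_CLK -period 10 [get_ports CLK_IN]

# ROOT_CLK reaches UDIV/CP through buffers and gates, but does not pass UDIV/Q.
# Without this definition, flops clocked from UDIV/Q would be unclocked in STA.
create_generated_clock -name DIV2_CLK \
    -source [get_pins UDIV/CP] \
    -divide_by 2 \
    [get_pins UDIV/Q]
```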
Clock Gating check
• Clock gating is a technique to save power by turning off the clock of flops or latches
whose data is not being used for the time being.
• The setup & hold timing check between the enable & clock pins of the gating cell
is called the clock gating check.
• Conditions for a clock gating check on any gate
- One of the inputs has to be a clock signal
- One of the inputs has to be a data signal
- The output of the cell fans out to clock pins of sequential cells
- The enable signal may change only during the inactive phase of the clock
(Fig: gated clock fanning out to multiple flops)
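A gating check margin can be requested explicitly with set_clock_gating_check. The sketch below assumes a hypothetical discrete AND-type gater named UCG_AND; the 0.2 and 0.1 values are placeholders in library time units.

```tcl
# Hypothetical cell name and margins.
# The enable must settle 0.2 before the clock edge that opens the gate,
# and stay stable until 0.1 after the edge that closes it.
set_clock_gating_check -setup 0.2 -hold 0.1 [get_cells UCG_AND]
```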
Modeling clock latency
• Two types of clock latency
• Source latency: the delay of the clock from its source to the design’s input
port (useful when clock generation is not part of the design)
• Network latency: propagation delay of clock signal inside the design itself
set_clock_latency 3 -source [get_clocks CLK]
• Latency of generated clock
set_clock_latency [expr {$Tml + $Tgl + $Tnl}] -source [get_clocks CLK]
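The two latency types are set with the same command; only the -source switch differs. A sketch with hypothetical numbers, using the clock name CLK from the slide:

```tcl
# Hypothetical values in library time units.
set_clock_latency 3   -source [get_clocks CLK]   ;# source latency: clock origin to the design's port
set_clock_latency 0.8         [get_clocks CLK]   ;# network latency: estimate used while the clock is ideal

# Once the clock tree exists, the tool computes network latency from real cell and wire delays.
set_propagated_clock [get_clocks CLK]
```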
Virtual Clock
• A clock with no source pin or port inside the design.
• Used to constrain IO ports and to model an off-chip or off-block clock.
• Does not exist physically.
• Defined using the create_clock command with period, waveform and
name only
create_clock -period 10 -name v_clk_pcie -waveform {0 5}
(i.e. duty cycle 50%)
create_clock -period 10 -name v_clk_usb -waveform {0 4}
(i.e. duty cycle 40%)
Other Clock related constraints
• set_clock_uncertainty
• set_clock_transition
• group_path (not the same as set_clock_groups)
• set_clock_gating_check
• set_clock_sense / set_sense
• set_ideal_latency
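As an illustration of the first two commands in the list, the sketch below applies uncertainty and transition to an already-defined clock. The clock name CLK and all values are assumptions.

```tcl
# Hypothetical values; CLK is assumed to be defined by an earlier create_clock.
set_clock_uncertainty -setup 0.15 [get_clocks CLK]   ;# margin for jitter and skew on setup checks
set_clock_uncertainty -hold  0.05 [get_clocks CLK]   ;# margin for skew on hold checks
set_clock_transition 0.08 [get_clocks CLK]           ;# slew assumed at clock pins while the clock is ideal
```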
IO delay
• set_input_delay -clock CLKA -max [expr {$TDFF + $Tc}] [get_ports IN]
• set_output_delay -clock CLK -max [expr {$tc + $tsetup}] [get_ports OUT]
• A virtual clock can be used for the IO delay of a feed-through path.
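A sketch of constraining a feed-through path with a virtual clock. The clock name v_clk and the delay values are hypothetical; IN and OUT are the port names used on the slide.

```tcl
# Hypothetical clock name and numbers.
create_clock -name v_clk -period 10                      ;# virtual clock: no port or pin given
set_input_delay  -clock v_clk -max 3.0 [get_ports IN]    ;# external launch flop + board delay
set_input_delay  -clock v_clk -min 0.5 [get_ports IN]
set_output_delay -clock v_clk -max 2.5 [get_ports OUT]   ;# external delay + setup of receiving flop

# The combinational path IN -> OUT is then left with 10 - 3.0 - 2.5 = 4.5 for setup.
```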
Design externals
• set_driving_cell -lib_cell INV -library test [get_ports IN]
• set_input_transition 0.9 [get_ports IN]
• set_load 20 [all_outputs]
Timing Exceptions
• As per RTL design intent, the setup or hold check on some timing paths
might not be required, or can be relaxed to more than one cycle.
• For STA, such design intent can be expressed through
- False Paths: Timing paths on which setup & hold checks are not required
even though a logical connection exists.
- Multi Cycle Paths: Timing paths which allow more than one
clock cycle for the signal.
- Case Analysis: To specify a constant (1 or 0) level for any signal.
- Disable Timing: Segments of paths which can be disabled.
- Clock Groups: Exclusivity among multiple clock domains for clock
domain crossing paths.
MCP for longer than one cycle path due to FSM
• create_clock -name CLK -period 10 [get_ports CLK]
• set_multicycle_path 3 -setup -from [get_pins G1/Q] -to [get_pins G2/D]
• set_multicycle_path 2 -hold -from [get_pins G1/Q] -to [get_pins G2/D]
Fig1. Timing edge for MCP
Fig 2. Example of circuit of MCP
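The edges picked by the two multicycle commands can be worked out by hand. The sketch below assumes a launch edge at time 0, a setup multiplier N = 3, a hold multiplier M = 2 and the 10-unit period from the slide; the symbols N, M and T are introduced here for illustration.

```latex
% Setup is checked at the N-th capture edge after launch
t_{setup\,edge} = N \cdot T = 3 \times 10 = 30

% By default hold is checked one cycle before the setup edge;
% the hold multiplier moves it M cycles further back
t_{hold\,edge} = (N - 1 - M) \cdot T = (3 - 1 - 2) \times 10 = 0
```

With the hold multiplier set to N - 1, the hold check returns to the launch edge, which is the usual intent for a path that is simply allowed N cycles.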
MCP between clock domains as per Design
Constraint for CDC through Synchronizer
• Max delay or MCP between clock1 & clock2 for data signals
• False path or max delay for control signals
MCP between clock domains with Synchronizer
False path for unsensitized logic
• set_false_path -through [get_pins M1/b] -through [get_pins M2/e]
• set_false_path -from [get_cells FF1] -to [get_clocks CLKB]
• set_false_path -from [get_cells FF2] -to [get_clocks CLKA]
False path due to exclusive clocks
• set_false_path -from [get_clocks C1] -to [get_clocks C2]
• set_false_path -from [get_clocks C2] -to [get_clocks C1]
• set_false_path -from [get_clocks GC1] -to [get_clocks GC2]
• set_false_path -from [get_clocks GC2] -to [get_clocks GC1]
The set_clock_groups command can also be used to ignore timing paths
between clock domains
(i.e. set_clock_groups -logically_exclusive -group {C1 GC1} -group {C2 GC2})
False path for bus protocol
• set_false_path -from [get_clocks Clk_Slave1] -to [get_clocks Clk_Slave2]
• set_false_path -from [get_clocks Clk_Slave2] -to [get_clocks Clk_Slave1]
The set_clock_groups command can also be used to ignore timing paths
between clock domains
(i.e. set_clock_groups -asynchronous -group Clk_Slave1 -group Clk_Slave2)
False path for Control Signals
• Async Control Signals
set_false_path -from rst_n -fall (assuming reset is active low)
• Quasi Static Signals
set_false_path -through <clock_select_signal>
Static Signal
• Constraint to specify a constant value on a pin or input port
* set_case_analysis 1 [get_pins {UMUX2/CLK_SEL[2]}]
(for func mode)
* set_case_analysis 0 [get_pins {UMUX2/CLK_SEL[2]}]
(for test mode)
• The STA tool propagates the case value into downstream logic.
Breaking timing arc of a cell
• set_disable_timing -from S -to Z [get_cells UMUX0]
• Alternative way: set_false_path -through UMUX0/S -through UMUX0/Z
• PHYCLK could be a "quasi static" signal which toggles only when Reset is
asserted or when Clock is not toggling.
• The set_disable_timing exception can also be used to break a combinational loop.
Design Rule Checks
• Rules that all ports and pins in the design must meet for transition time and
capacitance limits:
• set_max_transition 0.3 [current_design]
• set_max_capacitance 0.7 [current_design]
Point to Point delay
• Min/max delay constraints specify the delay between two pins,
ports or clock domains
• set_max_delay 4 -to G1
• set_max_delay 1 -from G1 -to G2
• set_max_delay 3 -from G1 -through G2 -to G3
• set_min_delay 3 -from [get_clocks CLK1] -to [get_clocks CLK2]
• A max/min delay constraint overrides the default setup & hold check on the path.
• A false path overrides a max/min delay constraint.
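The precedence rule in the last two bullets can be seen when both exceptions land on the same path. A small sketch using the slide's G1 and G2 objects:

```tcl
# Both exceptions cover the same path G1 -> G2.
set_max_delay 1 -from G1 -to G2    ;# asks for the path to settle within 1 time unit
set_false_path  -from G1 -to G2    ;# takes precedence: the path is no longer timed at all
```

If the max delay is the real intent, the false path has to be narrowed or removed so that it no longer covers this path.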
Many IPs in a typical design
• Multiple Hierarchies
Example:
- Full chip
- Sub System 1
- Partition 1
- Sub block 1
Block 1
Block 2
- Sub Block2
Block 1
- Partition 2
- Sub block 2_1
Block 2_1
- Sub System 2
- Glue logic
Constraint management for blocks & Full chip
• Top Down Methodology
• Bottom Up Methodology
• Hybrid of Top Down & Bottom Up
• More generated clocks than the logic designer intended
• Who writes and owns the constraints?
- RTL team?
Or
- Design Synthesis team?
Or
- STA team?
Book recommendation
• Constraining Designs for Synthesis and Timing Analysis
A Practical Guide to Synopsys Design Constraints (SDC)
Authors: Sridhar Gangadharan , Sanjay Churiwala
• https://link.springer.com/book/10.1007/978-1-4614-3269-2
Lab #4 –
* Copy files from /home/jdshah/spring_2024_labs/lab4/*
Thank You
Backup Slides
Clock Group command in PT
• set_clock_groups:
A command to specify the relationship between groups of clocks, used for delay calculation and crosstalk analysis
between the clocks.
• set_clock_groups -asynchronous
For clocks with no timing relationship at all (e.g. coming from different PLLs, phases not aligned, etc.)
Crosstalk with infinite window (i.e. most pessimistic)
• set_clock_groups -logically_exclusive
For clocks with no valid timing path between them (e.g. muxed clocks)
Crosstalk with overlap window
• set_clock_groups -physically_exclusive
For clocks that never exist simultaneously in operation (e.g. functional vs. test clock, low-power vs. high-speed clock)
No crosstalk among them.
• set_clock_groups -asynchronous -allow_paths
Restores timing on the paths between the groups so that checks such as set_max_delay can apply
Crosstalk with infinite window
The .lib file
• The general format for a .lib file is:
[general and global attributes]
[cell1
- general attributes for cell1
- input pin characteristics (capacitance etc.)
- output pin characteristics (capacitance, timing
etc.) ]
[cell2
- general attributes for cell2
- input pin characteristics (capacitance etc.)
- output pin characteristics (capacitance, timing
etc.) ]
[cell3 ..
Timing .Lib Details
• pin(pin_name)
• direction : input, output, inout, internal
• clock_pin
• Function(expression)
• Used for output or bidirectional pins. The expression defines the value of the
output pin as a function of input pins
• max_capacitance
• The maximum output capacitive load that an output pin can drive
• capacitance
• The capacitive load of an input, inout, output or internal pin. Usually defined
as 0 for output pins
• internal_power()
• Output pins in combinational cells, define the rise_power and fall_power to a
related input pin. Input and clock pins also define this in sequential cells
• timing()
• Output pins in combinational cells define the cell_rise, cell_fall,
rise_transition and fall_transition tables relative to a related input pin
• pulse width definitions, recovery, removal
• Required for clocks, asynchronous set and reset pins
NMOS/PMOS Basics
• NMOS (terminals: gate, source, drain, substrate): good at passing 0 (Low)
• PMOS (terminals: gate, source, drain, substrate): good at passing 1 (High)
Cell delay of cascaded gates
Sequential Check & Delay
Pipelining explanation through Diagram
(Figure: car assembly line with three stages, Engine, Chassis and Paint, plotted against time for cars A, B and C. Order of manufacturing: car A, B, and then C. © 2019 Arm Limited)
Pipelining
• T is the critical path through our combinational logic
• C is the delay through our register (clk->Q delay)
• Single stage (register, combinational logic, register): Clock period = T + C
• Now, if we create two pipeline stages: Clock period = T/2 + C
• If C is small, our clock frequency has almost doubled. Hence, our throughput will double, too.
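The frequency gain from splitting the logic into two stages follows directly from the two clock periods. The numbers in the example (T = 10, C = 1) are hypothetical and only show how the register delay keeps the gain below 2.

```latex
\frac{f_{2\,stage}}{f_{1\,stage}} = \frac{T + C}{T/2 + C}

% Example with T = 10 and C = 1
\frac{10 + 1}{5 + 1} = \frac{11}{6} \approx 1.83
```

As C shrinks relative to T the ratio approaches 2, which is the "almost doubled" claim on the slide.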