
Module 5
Testing, Debugging, and Verification

Introduction to Testing
• While in real estate the refrain is “Location! Location! Location!” the comparable advice
in IC design should be “Testing! Testing! Testing!”
• For many chips, testing accounts for more effort than does design.
• Tests fall into three main categories.
• The first set of tests verifies that the chip performs its intended function. These tests,
called functionality tests or logic verification, are run before tapeout to verify the
functionality of the circuit.
• The second set of tests are run on the first batch of chips that return from fabrication.
These tests confirm that the chip operates as it was intended and help debug any
discrepancies.
• The third set of tests verifies that every transistor, gate, and storage element in the chip
functions correctly. These tests are conducted on each manufactured chip before
shipping to the customer to verify that the silicon is completely intact. These are called
manufacturing tests.
Testing a die (chip) can occur at the following levels:
• Wafer level
• Packaged chip level
• Board level
• System level
• Field level
The yield of a particular IC is the number of good die divided by the
total number of die per wafer. Because of the complexity of the
manufacturing process, not all die on a wafer function correctly.
Dust particles and small imperfections in the starting material or
photomasking can result in bridged connections or missing features.
These imperfections result in what is termed a fault.
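As a quick illustration of this yield definition, here is a minimal Python sketch (the die counts below are made-up numbers, not from the text):

```python
def wafer_yield(good_die: int, total_die: int) -> float:
    """Yield = number of good die / total number of die on the wafer."""
    return good_die / total_die

# Hypothetical example: 412 of the 500 die on a wafer pass manufacturing test.
print(f"Yield = {wafer_yield(412, 500):.1%}")  # Yield = 82.4%
```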
Logic verification
• Verification tests are usually the first ones a designer might construct as part of the design process.
• The RTL (Register Transfer Level) description is checked for equivalence against the design specification at a higher behavioral or
specification level of abstraction. The behavioral specification might be a verbal description, a plain-language textual specification, a
description in some high-level computer language such as C, a program in a system-modeling language such as
SystemC, a hardware description language such as VHDL or Verilog, or simply a table of inputs and required
outputs.
• Functional equivalence involves running a simulator on the two descriptions of the chip (e.g., one at the gate level
and one at a functional level) and ensuring that the outputs are equivalent at some convenient checkpoints in time
for all inputs applied. This is most conveniently done in an HDL by employing a test bench, i.e., a wrapper that
surrounds a module and provides for stimulus and automated checking.
• Functional equivalence can be checked through simulation at various levels of the design hierarchy. If the description is at
the RTL level, the behavior at a system level may be fully verified.
• At each level, you can write small tests to verify the equivalence between the new higher-level functional model
and the lower-level gate or functional level. At the top level, you can surround the filter functional model with a
software environment that models the real-world use of the filter.
FIGURE 1. Functional equivalence at various levels of abstraction
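To make the idea of functional equivalence concrete, the following is a minimal Python sketch, not from the text: a behavioral model and a gate-level-style model of an assumed half adder are simulated over all inputs and their outputs compared (a real flow would do this with an HDL test bench).

```python
# Behavioral (specification-level) model of a 1-bit half adder.
def half_adder_behavioral(a: int, b: int):
    return a ^ b, a & b          # (sum, carry)

# "Gate-level" model built only from primitive gate functions.
def nand(x: int, y: int) -> int:
    return 1 - (x & y)

def half_adder_gates(a: int, b: int):
    n1 = nand(a, b)
    s = nand(nand(a, n1), nand(b, n1))  # XOR built from NAND gates
    c = 1 - n1                          # AND = NOT(NAND)
    return s, c

# Exhaustive check: the two descriptions must agree for all inputs.
for a in (0, 1):
    for b in (0, 1):
        assert half_adder_behavioral(a, b) == half_adder_gates(a, b)
print("Behavioral and gate-level models are functionally equivalent.")
```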
Manufacturing Test Principle
The purpose of manufacturing test is to screen out most of the defective parts before they are shipped to
customers. Typical commercial products target a defect rate of 350–1000 defects per million (DPM) chips
shipped.

1. Fault Models
(i) Stuck-At Faults
• In the stuck-at model, a faulty gate input is modeled as stuck at zero (Stuck-At-0, S-A-0) or stuck at one (Stuck-At-1, S-A-1).
• This model dates from board-level designs, where it was determined to be adequate for modeling faults.
• Figure 2 illustrates how an S-A-0 or S-A-1 fault might occur. These faults most frequently occur due to gate oxide shorts (the nMOS gate to GND or the pMOS gate to VDD) or metal-to-metal shorts.

FIGURE 2: CMOS stuck-at faults
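A small Python sketch of the stuck-at idea (illustrative only; the NAND gate and the injected fault are assumptions, not from the text). It injects an S-A-0 fault on one input and searches for an input vector on which the faulty gate differs from the good one:

```python
def nand(a: int, b: int) -> int:
    return 1 - (a & b)

def nand_with_fault(a: int, b: int, stuck_input=None, stuck_value=0) -> int:
    """NAND gate with an optional stuck-at fault injected on one input."""
    if stuck_input == "a":
        a = stuck_value
    elif stuck_input == "b":
        b = stuck_value
    return nand(a, b)

# Search for the vectors that detect "input A stuck-at-0" (S-A-0):
for a in (0, 1):
    for b in (0, 1):
        good, faulty = nand(a, b), nand_with_fault(a, b, stuck_input="a")
        if good != faulty:
            print(f"(a={a}, b={b}) detects A S-A-0: good={good}, faulty={faulty}")
# Only (a=1, b=1) exposes this fault, because that is the only input
# combination for which forcing A to 0 changes the NAND output.
```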


(ii) Short-Circuit and Open-Circuit Faults
• Other models include stuck-open or shorted models. Two bridging or shorted faults are shown in Figure 3.
• The short S1 results in an S-A-0 fault at input A, while short S2 modifies the function of the gate.
• It is evident that to ensure the most accurate modeling, faults should be modeled at the transistor level, because it is only at this level that the complete circuit structure is known.

FIGURE 3: CMOS bridging faults
FIGURE 4: A CMOS open fault that causes sequential faults and a defect that causes static IDD current
2. Observability
▪ The observability of a particular circuit node is the degree to which you can observe
that node at the outputs of an integrated circuit (i.e., the pins).
▪ This metric is relevant when you want to measure the output of a gate within a
larger circuit to check that it operates correctly.
▪ Given the limited number of nodes that can be directly observed, it is the aim of
good chip designers to have easily observed gate outputs.

3. Controllability
▪ The controllability of an internal circuit node within a chip is a measure of the ease
of setting the node to a 1 or 0 state.
▪ This metric is of importance when assessing the degree of difficulty of testing a
particular signal within a circuit.
▪ An easily controllable node would be directly settable via an input pad.
▪ A node with little controllability, such as the most significant bit of a counter, might
require many hundreds or thousands of cycles to get it to the right state.
4. Repeatability
• The repeatability of a system is the ability to produce the same outputs given the same inputs.
Combinational logic and synchronous sequential logic are always repeatable when
functioning correctly.
• However, certain asynchronous sequential circuits are nondeterministic. For example, an
arbiter may select either input when both arrive at nearly the same time. Testing is much
easier when the system is repeatable.
• Some systems with asynchronous interfaces have a lock-step mode to facilitate repeatable
testing.

5. Survivability
▪ The survivability of a system is the ability to continue functioning after a fault.
▪ For example, error-correcting codes provide survivability in the event of soft errors.
Redundant rows and columns in memories and spare cores provide survivability in the event
of manufacturing defects.
▪ Adaptive techniques provide survivability in the event of process variation. Some
survivability features are invoked automatically by the hardware, while others are activated
by blowing fuses after manufacturing test.
6. Fault Coverage
▪ A measure of goodness of a set of test vectors is the amount of fault coverage it
achieves.
▪ That is, for the vectors applied, what percentage of the chip’s internal nodes were
checked? Conceptually, the way in which the fault coverage is calculated is as
follows.
▪ Each circuit node is taken in sequence and held to 0 (S-A-0), and the circuit is
simulated with the test vectors, comparing the chip outputs with those of a known good
machine, i.e., a circuit with no nodes artificially set to 0 (or 1).
▪ When a discrepancy is detected between the faulty machine and the good machine,
the fault is marked as detected and the simulation is stopped. This is repeated for
setting the node to 1 (S-A-1).
▪ In turn, every node is stuck (artificially) at 1 and 0 sequentially. The fault coverage
of a set of test vectors is the percentage of the total nodes that can be detected as
faulty when the vectors are applied.
▪ To achieve world-class quality levels, circuits are required to have in excess of
98.5% fault coverage.
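The conceptual fault-coverage procedure above can be sketched in a few lines of Python (illustrative assumptions: a tiny two-gate netlist and an exhaustive vector set; a real flow uses a gate-level fault simulator and ATPG-generated vectors):

```python
from itertools import product

# Tiny example netlist: y = (a AND b) OR c, with internal node n1 = a AND b.
def simulate(a, b, c, fault=None):
    """Evaluate the circuit; 'fault' is (node_name, stuck_value) or None."""
    def apply(name, value):
        return fault[1] if fault and fault[0] == name else value
    a, b, c = apply("a", a), apply("b", b), apply("c", c)
    n1 = apply("n1", a & b)
    return apply("y", n1 | c)

nodes = ["a", "b", "c", "n1", "y"]
vectors = list(product((0, 1), repeat=3))   # test vector set (exhaustive here)

detected = 0
total_faults = 2 * len(nodes)               # S-A-0 and S-A-1 on every node
for node in nodes:
    for stuck_value in (0, 1):
        for v in vectors:
            if simulate(*v) != simulate(*v, fault=(node, stuck_value)):
                detected += 1               # discrepancy found: fault detected
                break                       # stop simulating this fault

print(f"Fault coverage = {100 * detected / total_faults:.1f}%")
```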
7. Automatic Test Pattern Generation (ATPG)

▪ Historically, in the IC industry, logic and circuit designers implemented the functions at
the RTL or schematic level, mask designers completed the layout, and test engineers
wrote the tests.
▪ In many ways, the test engineers were the Sherlock Holmes of the industry, reverse
engineering circuits and devising tests that would test the circuits in an adequate manner.
▪ For the longest time, test engineers implored circuit designers to include extra circuitry to
ease the burden of test generation.
▪ Happily, as processes have increased in density and chips have increased in complexity,
the inclusion of test circuitry has become less of an overhead for both the designer and
the manager worried about the cost of the die.
▪ In addition, as tools have improved, more of the burden for generating tests has fallen on
the designer. To deal with this burden, Automatic Test Pattern Generation (ATPG) methods
have been invented.
▪ The use of some form of ATPG is standard for most digital designs.
▪ Commercial ATPG tools can achieve excellent fault coverage. However, they are
computation-intensive and often must be run on servers or compute farms with many
parallel processors.
8. Delay Fault Testing

▪ The fault models dealt with until this point have neglected timing. Failures that occur in CMOS could leave the functionality of the circuit untouched but affect the timing.
▪ For instance, consider the layout shown in Figure 5 for an inverter gate composed of paralleled nMOS and pMOS transistors.
▪ If an open circuit occurs in one of the nMOS transistor source connections to GND, then the gate would still function, but with increased tpdf.
▪ In addition, the fault now becomes sequential, as the detection of the fault depends on the previous state of the gate.

FIGURE 5: An example of a delay fault
Design for Testability
• The keys to designing circuits that are testable are controllability and observability.
• Restated, controllability is the ability to set (to 1) and reset (to 0) every node internal to the circuit.
• Observability is the ability to observe, either directly or indirectly, the state of any node in the circuit.
• Good observability and controllability reduce the cost of manufacturing testing because they allow
high fault coverage with relatively few test vectors.
• Moreover, they can be essential to silicon debug because physically probing internal signals has
become so difficult.
• The following three main approaches are commonly called Design for Testability (DFT). These may be
categorized as follows:
➢ Ad hoc testing
➢ Scan-based approaches
➢ Built-in self-test (BIST)
Ad Hoc Testing
• Ad hoc test techniques, as their name suggests, are collections of ideas aimed
at reducing the combinational explosion of testing.
• They are only useful for small designs where scan, ATPG (automatic test
pattern generation), and BIST are not available.
• A complete scan-based testing methodology is recommended for all digital
circuits.
• The following are common techniques for ad hoc testing:
➢ Partitioning large sequential circuits
➢ Adding test points
➢ Adding multiplexers
➢ Providing for easy state reset
• In general, ad hoc testing techniques represent a bag of tricks developed over
the years by designers to avoid the overhead of a systematic approach to
testing.
Scan Design
• The scan-design strategy for testing has evolved to provide observability and
controllability at each register.

• In designs with scan, the registers operate in one of two modes. In normal
mode, they behave as expected. In scan mode, they are connected to form a
giant shift register called a scan chain spanning the whole chip.

• By applying N clock pulses in scan mode, all N bits of state in the system can
be shifted out and new N bits of state can be shifted in.

• Therefore, scan mode gives easy observability and controllability of every
register in the system.
Figure 1: Scan-based testing
• Modern scan is based on the use of scan registers, as shown in Figure 1.

• The scan register is a D flip-flop preceded by a multiplexer. When the SCAN signal is deasserted,
the register behaves as a conventional register, storing data on the D input.

• When SCAN is asserted, the data is loaded from the SI pin, which is connected in shift register
fashion to the previous register Q output in the scan chain.

• For the circuit to load the scan chain:
➢ SCAN is asserted and CLK is pulsed eight times to load the first two ranks of 4-bit
registers with data. SCAN is then deasserted and CLK is asserted for one cycle to operate the
circuit normally with predefined inputs.
➢ SCAN is then reasserted and CLK asserted eight times to read the stored data out. At the
same time, the new register contents can be shifted in for the next test.
• Testing proceeds in this manner of serially clocking the data through the scan register to the right
point in the circuit, running a single system clock cycle and serially clocking the data out for
observation.

• In this scheme, every input to the combinational block can be controlled and every output can be
observed. In addition, running a random pattern of 1s and 0s through the scan chain can test the
chain itself.

• Test generation for this type of test architecture can be highly automated. ATPG (automatic test
pattern generation) techniques can be used for the combinational blocks and, as mentioned, the scan
chain is easily tested.

• The prime disadvantage is the area and delay impact of the extra multiplexer in the scan register.
Designers (and managers alike) are in widespread agreement that this cost is more than offset by the
savings in debug time and production test cost.
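A minimal Python sketch of the shift-in, capture, shift-out sequence described above (illustrative assumptions: an 8-bit scan chain and a toy combinational block; real scan registers are hardware, this only models the data movement):

```python
class ScanChain:
    """Toy model of a scan chain wrapped around a combinational block."""

    def __init__(self, length: int):
        self.regs = [0] * length          # register contents (the chain)

    def shift_in(self, bits):
        """Scan mode: one clock per bit; returns the bits shifted out."""
        out = []
        for b in bits:
            out.append(self.regs.pop())   # last register drives scan-out (SO)
            self.regs.insert(0, b)        # scan-in (SI) feeds the first register
        return out

    def capture(self, comb_logic):
        """Normal mode for one cycle: registers capture the combinational outputs."""
        self.regs = comb_logic(self.regs)

# Example combinational block: invert every bit (purely illustrative).
invert_all = lambda bits: [1 - b for b in bits]

chain = ScanChain(8)
chain.shift_in([1, 0, 1, 1, 0, 0, 1, 0])  # load the test stimulus
chain.capture(invert_all)                  # one system clock in normal mode
response = chain.shift_in([0] * 8)         # unload response, load next stimulus
print("Captured response:", response)
```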
Built-In Self-Test (BIST)
• The combination of signature analysis and the scan technique creates a
structure known as BIST for Built-In Self-Test or BILBO for Built-In Logic Block
Observation.
• The 3-bit BIST register shown in Figure 2 is a scannable, resettable register that
also can serve as a pattern generator and signature analyzer.
• C[1:0] specifies the mode of operation.
• In the reset mode (10), all the flip-flops are synchronously initialized to 0. In
normal mode (11), the flip-flops behave normally with their D input and Q output.
• In scan mode (00), the flip-flops are configured as a 3-bit shift register between
SI and SO. Note that there is an inversion between each stage.
• In test mode (01), the register behaves as a pseudo-random sequence generator or signature
analyzer.
➢ If all the D inputs are held low, the Q outputs loop through a pseudo-random bit
sequence, which can serve as the input to the combinational logic.
➢ If the D inputs are taken from the combinational logic output, they are swizzled
with the existing state to produce the syndrome.
• In summary, BIST is performed by first resetting the syndrome in the
output register. Then both registers are placed in the test mode to
produce the pseudo-random inputs and calculate the syndrome.
Finally, the syndrome is shifted out through the scan chain.
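A minimal Python sketch of the test-mode behavior (illustrative assumptions: a 3-bit register with one XOR feedback tap choice; the actual BILBO feedback network differs in detail). With the D inputs held low it cycles through a pseudo-random sequence; with the D inputs driven by the logic under test it compacts the responses into a signature:

```python
def bist_step(state, d):
    """One clock of a 3-bit BIST register in test mode (MISR-style).

    With d = [0, 0, 0] it behaves as a pseudo-random pattern generator;
    with d driven by the logic outputs it compacts them into a signature.
    """
    feedback = state[1] ^ state[2]            # assumed XOR feedback taps
    return [feedback ^ d[0],
            state[0] ^ d[1],
            state[1] ^ d[2]]

# Pattern-generator mode: D inputs held low -> pseudo-random sequence.
state = [1, 0, 0]                             # any nonzero seed
for _ in range(7):
    print(state)
    state = bist_step(state, [0, 0, 0])       # cycles through all 7 nonzero states

# Signature-analyzer mode: D inputs taken from the logic under test.
signature = [0, 0, 0]
for response in ([1, 0, 1], [0, 1, 1], [1, 1, 0]):   # hypothetical output vectors
    signature = bist_step(signature, response)
print("Signature:", signature)                # compared against the known-good value
```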
IDDQ Testing
• A method of testing for bridging faults is called IDDQ test (VDD supply current Quiescent) or supply
current monitoring.
• This relies on the fact that when a CMOS logic gate is not switching, it draws no DC current (except for
leakage).
• When a bridging fault occurs, then for some combination of input conditions, a measurable DC IDD
will flow.
• Testing consists of applying the normal vectors, allowing the signals to settle, and then measuring IDD.
As potentially only one gate is affected, the IDDQ test has to be very sensitive.
• In addition, to be effective, any circuits that draw DC power such as pseudo-nMOS gates or analog
circuits have to be disabled.
• Dynamic gates can also cause problems. As current measuring is slow, the tests must be run slower
(of the order of 1 ms per vector) than normal, which increases the test time.
• IDDQ testing can be completed externally to the chip by measuring the current drawn on the VDD line
or internally using specially constructed test circuits.
• This technique gives a form of indirect massive observability at little circuit overhead. However, as
subthreshold leakage current increases, IDDQ testing ceases to be effective because variations in
subthreshold leakage exceed currents caused by the faults.
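A minimal Python sketch of the supply-current-monitoring decision (illustrative only; the current limit and the measured values are made-up numbers, and a real tester measures IDD after the signals settle):

```python
# Hypothetical quiescent-current measurements (microamps), one per test vector.
iddq_measurements_uA = [0.4, 0.3, 0.5, 210.0, 0.4]   # 4th vector exposes a bridge
IDDQ_LIMIT_uA = 10.0                                  # assumed pass/fail threshold

for vector, idd in enumerate(iddq_measurements_uA):
    status = "PASS" if idd <= IDDQ_LIMIT_uA else "FAIL (possible bridging fault)"
    print(f"vector {vector}: IDDQ = {idd:6.1f} uA -> {status}")

# The die is rejected if any vector draws more than the quiescent-current limit.
print("Chip passes IDDQ test:", all(i <= IDDQ_LIMIT_uA for i in iddq_measurements_uA))
```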
Design for Manufacturability
Circuits can be optimized for manufacturability to increase their yield. This can be done in
several ways.
1. Physical
At the physical level (i.e., mask level), the yield and hence manufacturability can be
improved by reducing the effect of process defects. The design rules for processes will
frequently have guidelines for improving yield. The following list is representative:
• Increase the spacing between wires where possible: this reduces the chance of a defect
causing a short circuit.
• Increase the overlap of layers around contacts and vias: this reduces the chance that a
misalignment will cause an aberration in the contact structure.
• Increase the number of vias at wire intersections beyond one if possible: this reduces the
chance of a defect causing an open circuit.
Increasingly, design tools are dealing with these kinds of optimizations automatically.
2. Redundancy
• Redundant structures can be used to compensate for defective components
on a chip. For example, memory arrays are commonly built with extra rows.
• During manufacturing test, if one of the words is found to be defective, the
memory can be reconfigured to access the spare row instead.
• Laser-cut wires or electrically programmable fuses can be used for
configuration. Similarly, if the memory has many banks and one or more are
found to be defective, they can be disabled, possibly even under software
control.
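A minimal Python sketch of the row-redundancy idea (illustrative only; the memory size, spare count, and interface are assumptions, and real repair is done with fuses or laser-cut wires rather than software):

```python
class RepairableMemory:
    """Toy model of row redundancy: defective rows are remapped to spares."""

    def __init__(self, rows: int, spare_rows: int):
        self.storage = [[0] * 8 for _ in range(rows + spare_rows)]
        self.spares = list(range(rows, rows + spare_rows))  # indices of spare rows
        self.remap = {}                                      # defective row -> spare row

    def repair(self, defective_row: int) -> None:
        """Called after manufacturing test: steer accesses to a spare row."""
        self.remap[defective_row] = self.spares.pop()

    def _row(self, row: int):
        return self.storage[self.remap.get(row, row)]

    def write(self, row: int, data) -> None:
        self._row(row)[:] = data

    def read(self, row: int):
        return self._row(row)

mem = RepairableMemory(rows=4, spare_rows=1)
mem.repair(defective_row=2)            # row 2 failed test; use the spare instead
mem.write(2, [1, 0, 1, 1, 0, 0, 1, 0])
print(mem.read(2))                     # data comes back from the spare row
```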
3. Power
• Elevated power can cause failure due to excess current in wires,
which in turn can cause metal migration failures.
• In addition, high-power devices raise the die temperature, degrading
device performance and, over time, causing device parameter shifts.

4. Process Spread
• We have seen that process simulations can be carried out at different
process corners.

• Monte Carlo analysis can provide better modeling for process spread
and can help with centering a design within the process variations.
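A minimal Python sketch of a Monte Carlo style analysis (all numbers are hypothetical, and a real analysis samples device parameters in a circuit simulator rather than summing gate delays): it estimates how much of the assumed process spread still meets an assumed cycle-time target.

```python
import random

# Hypothetical numbers: nominal gate delay 100 ps with 8 ps sigma of process
# variation, 12 gates on the critical path, and a 1.3 ns cycle-time target.
NOMINAL_PS, SIGMA_PS, GATES, TARGET_PS = 100.0, 8.0, 12, 1300.0

random.seed(0)
trials = 10_000
passing = 0
for _ in range(trials):
    path_delay = sum(random.gauss(NOMINAL_PS, SIGMA_PS) for _ in range(GATES))
    if path_delay <= TARGET_PS:
        passing += 1

print(f"Estimated fraction of dice meeting timing: {100 * passing / trials:.1f}%")
```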
5. Yield Analysis

• When a chip has poor yield or will be manufactured in high volume, dice
that fail manufacturing test can be taken to a laboratory for yield analysis
to locate the root cause of the failure.

• If structures are determined to have caused many of the failures, the layout
of the structures can be redesigned.

• For example, during volume production ramp-up for the Pentium
microprocessor, the silicide over long thin polysilicon lines was found to
crack and raise the wire resistance. This in turn led to slower-than-
expected operation for the cracked chips. The layout was modified to
widen polysilicon wires or strap them with metal wherever possible,
boosting the yield at higher frequencies.
