Vlsi Digital Notes
VLSI DESIGN
Prepared by
I.Swathi Priya,M.Tech.,
K.Aishwarya.,M.E.,
III YEAR B.TECH ECE II SEM
L: 4    T/P/D: -/-/-    C: 4
UNIT II :
VLSI CIRCUIT DESIGN PROCESSES: VLSI Design Flow, MOS layers, Stick Diagrams, Design Rules
and Layout, 2μm CMOS design rules for wires, Contacts and Transistors Layout Diagrams for NMOS and
CMOS Inverters and Gates, Scaling of MOS circuits.
UNIT III:
GATE LEVEL DESIGN : Logic Gates and Other complex gates, Switch logic, Alternate gate circuits, time
delays, Driving large Capacitive Loads, Wiring Capacitances, Fan-in and fan-out, Choice of layers.
UNIT IV :
DATAPATH SUBSYSTEM : Subsystem Design, Shifters, Adders, ALUs, Multipliers, Parity generators,
Comparators, Zero/One Detectors, Counters.
ARRAY SUBSYSTEMS: SRAM, DRAM, ROM, Serial access memories.
UNIT V :
PROGRAMMABLE LOGIC DEVICES :
PLAs, FPGAs, CPLDs, Standard Cells, Programmable Array Logic, Design Approach, parameters influencing
low power design.
CMOS TESTING :
CMOS Testing, Need for testing, Test Principles, Design Strategies for test, Chip level Test Techniques.
TEXT BOOKS:
1. Essentials of VLSI Circuits and Systems – Kamran Eshraghian, Douglas A. Pucknell and Sholeh Eshraghian, PHI,
2005 Edition.
2. VLSI Design - K. Lal Kishore, V.S.V. Prabhakar, I.K. International, 2009.
3. CMOS VLSI Design - Neil H.E. Weste, David Harris, Ayan Banerjee, Pearson Education, 1999.
REFERENCES:
COURSE OBJECTIVES
Explain electrical properties of MOS and BiCMOS devices to analyse the behaviour of
inverters designed with various loads.
Give exposure to the design rules to be followed to draw the layout of any logic circuit.
Provide concepts to design different types of logic gates using CMOS inverter and analyse
their transfer characteristics.
Provide design concepts to design building blocks of data path of any system using gates.
COURSE OUTCOMES
Upon successfully completing the course, the student should be able to:
Acquire qualitative knowledge about the fabrication process of integrated circuit using MOS
transistors.
Choose an appropriate inverter depending on specifications required for a circuit
Draw the layout of any logic circuit which helps to understand and estimate parasitics of any
logic circuit
Design different types of logic gates using CMOS inverter and analyse their transfer
characteristics
Provide design concepts required to design building blocks of data path using gates.
Design simple memories using MOS transistors and understand the design of large memories.
Design simple logic circuit using PLA, PAL, FPGA and CPLD.
Understand different types of faults that can occur in a system and learn the concept of testing
and adding extra hardware to improve testability of system
1.1 INTRODUCTION
The invention of the transistor by William B. Shockley, Walter H. Brattain and John
Bardeen of Bell Telephone Laboratories drastically changed the electronics industry and paved
the way for the development of Integrated Circuit (IC) technology. The first IC was designed
by Jack Kilby at Texas Instruments in 1958, and since that time there have already been four
generations of ICs, viz. SSI (small scale integration), MSI (medium scale integration), LSI
(large scale integration), and VLSI (very large scale integration). Now we are ready to see the
emergence of the fifth generation, ULSI (ultra large scale integration), which is characterized
by complexities in excess of 3 million devices on a single IC chip. Further miniaturization is
still to come, and more revolutionary advances in the application of this technology must
inevitably occur.
Over the past several years, silicon CMOS technology has become the dominant
fabrication process for relatively high performance and cost effective VLSI circuits. The
revolutionary nature of this development is evident from the rapid growth in the number
of transistors integrated on a single chip.
MOS technology is one of the most important and promising technologies in the VLSI
design process. Circuit designs are realized using pMOS, nMOS, CMOS and BiCMOS devices.
The pMOS devices are based on p-channel MOS transistors. Specifically, the pMOS channel
is part of an n-type substrate lying between two heavily doped p+ regions beneath the source
and drain electrodes. Generally speaking, a pMOS transistor is only constructed in conjunction
with an nMOS transistor. The nMOS technology and design processes provide an excellent
background for other technologies. In particular, some familiarity with nMOS allows a
relatively easy transition to CMOS technology and design.
The techniques employed in nMOS technology for logic design are similar to those of GaAs
technology. Therefore, understanding the basics of nMOS design will help in the layout of GaAs
circuits. With the rapid advances in technology, the size of ICs is shrinking and the
integration density is increasing.
The minimum line width of commercial products over the years is shown in Figure 1.1 below.
Figure 1.1 Moore's law
The graph shows a significant decrease in the minimum line width in recent years, which implicitly
indicates the advancements in VLSI technology.
Silicon dioxide is a very good insulator, so a very thin layer, typically only a few hundred
molecules thick, is used. In fact, the transistors in use today do not use metal for their gate
regions, but instead use polycrystalline silicon (poly). Polysilicon gate FETs have replaced
virtually all of the older devices using metal gates in large scale integrated circuits. (Both metal
and polysilicon FETs are sometimes referred to as IGFETs (insulated gate field effect
transistors), since the silicon dioxide under the gate is an insulator.)
MOS transistors are classified as nMOS, pMOS and CMOS transistors based on the
fabrication process. nMOS devices are formed in a p-type substrate of moderate doping level. The
source and drain regions are formed by diffusing n-type impurities through suitable masks into
these areas to give the desired n-impurity concentration; this gives rise to depletion regions which
extend mainly into the more lightly doped p-region. Thus, source and drain are isolated from one
another by two diodes. Connections to the source and drain are made by a deposited metal layer.
In order to make a useful device, there must be the capability for establishing and controlling a
current between source and drain, and this is commonly achieved in one of two ways, giving
rise to the enhancement mode and depletion mode transistors.
Let us now consider the conditions when current flows in the channel because a voltage Vds is
applied between drain and source. There is a corresponding IR drop = Vds along the channel. As a
result, the voltage between gate and channel varies with distance along the channel, being a
maximum of Vgs at the source end. Since the effective gate voltage is Vg = Vgs - Vt (no current
flows when Vgs < Vt), there will be voltage available to invert the channel at the drain end so
long as Vgs - Vt ≥ Vds. The limiting condition comes when Vds = Vgs - Vt. For all voltages Vds
< Vgs - Vt, the device is in the non-saturated region of operation, which is the condition shown in
Fig. 1.4 (b) below.
Figure 1.4 Enhancement mode transistor for particular values of Vds with (Vgs > Vt)
Let us now consider the situation when Vds is increased to a level greater than Vgs - Vt. In this
case, an IR drop equal to Vgs - Vt occurs over less than the whole length of the channel such that,
near the drain, there is insufficient electric field available to give rise to an inversion layer to
create the channel. The channel is, therefore, 'pinched off' as shown in Fig. 1.4 (c). Diffusion
current completes the path from source to drain in this case, causing the channel to exhibit a high
resistance and behave as a constant current source. This region, known as saturation, is
characterized by almost constant current for increase of Vds above Vds = Vgs - Vt. In all cases,
the channel will cease to exist and no current will flow when Vgs < Vt. Typically, for
enhancement mode devices, Vt = 1 volt for VDD = 5 V or, in general terms, Vt = 0.2 VDD.
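The three operating conditions described above (no channel, non-saturated, saturated) can be summarized in a short Python sketch. The function name and the sample bias points are illustrative assumptions; the boundary Vds = Vgs - Vt is treated here as the onset of saturation, following the limiting condition stated in the text.

```python
def mos_region(vgs: float, vds: float, vt: float) -> str:
    """Classify the operating region of an nMOS enhancement mode transistor."""
    if vgs < vt:
        return "cutoff"          # no channel exists, so no current flows
    if vds < vgs - vt:
        return "non-saturated"   # channel is inverted along its whole length
    return "saturated"           # channel is pinched off near the drain


VT = 1.0  # Vt = 0.2 * VDD with VDD = 5 V, the typical value quoted in the text

# Assumed sample bias points (Vgs, Vds) in volts, chosen only for illustration.
for vgs, vds in [(0.5, 2.0), (3.0, 1.0), (3.0, 2.0), (3.0, 4.0)]:
    print(f"Vgs = {vgs} V, Vds = {vds} V -> {mos_region(vgs, vds, VT)}")
```

The same comparison applies to a depletion mode device if its negative threshold Vtd is passed in place of Vt.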
For depletion mode devices, a negative voltage Vtd must be applied between gate and source
to cause the channel to cease to exist. Vtd is typically < -0.8 VDD, depending on the implant and
substrate bias but, threshold voltage differences apart, the action is similar to that of the
enhancement mode transistor.
A large number and variety of basic fabrication steps are used in the production of modern
MOS ICs. The same process can be used for the fabrication of nMOS, pMOS or
CMOS devices. The gate material can be either metal or polysilicon. The most commonly used
substrate is bulk silicon or silicon-on-sapphire (SOS). In order to avoid the presence of parasitic
transistors, variations are introduced in the techniques used to isolate the devices in the wafer.
The fabrication steps and respective diagrams (in Fig. 1.6 below) are as follows:
Step 1: Processing is carried out on single crystal silicon of high purity, into which the required
p-type impurities are introduced as the crystal is grown. Such wafers are about 75 to 150 mm in
diameter and 0.4 mm thick, and they are doped with, say, boron to an impurity concentration of
10^15/cm^3 to 10^16/cm^3.
Step 2: A layer of silicon dioxide (SiO2), typically 1 micrometer thick, is grown all over the surface
of the wafer to protect the surface, act as a barrier to the dopant during processing, and provide a
generally insulating substrate onto which other layers may be deposited and patterned.
Step 3: The surface is now covered with photoresist, which is deposited onto the wafer and spun
to an even distribution of the required thickness.
Step 4: The photoresist layer is then exposed to ultraviolet light through a mask which defines
those regions into which diffusion is to take place, together with transistor channels. Assume, for
example, that those areas exposed to UV radiation are polymerized (hardened), but that the areas
required for diffusion are shielded by the mask and remain unaffected.
Step 5: These unaffected areas are subsequently readily etched away together with the underlying
silicon dioxide, so that the wafer surface is exposed in the window defined by the mask.
Step 6: The remaining photoresist is removed and a thin layer of SiO2 (0.1 micrometer typical) is
grown over the entire chip surface; polysilicon is then deposited on top of this to form the
gate structure. The polysilicon layer consists of heavily doped polysilicon deposited by chemical
vapour deposition (CVD). In the fabrication of fine pattern devices, precise control of thickness,
impurity concentration, and resistivity is necessary.
Step 7: Further photoresist coating and masking allows the polysilicon to be patterned, and then the
thin oxide is removed to expose areas into which n-type impurities are to be diffused to form the
source and drain. Diffusion is achieved by heating the wafer to a high temperature and passing a gas
containing the desired n-type impurity over the surface.
Note: The polysilicon with its underlying thin oxide, together with the thick oxide, acts as a mask
during diffusion, so the process is self-aligning.
Step 8: Thick oxide (SiO2) is grown over all again and is then masked with photoresist and etched
to expose selected areas of the polysilicon gate and the drain and source areas where connections are
to be made (contact cuts).
Step 9: The whole chip then has metal (aluminium) deposited over its surface to a thickness typically
of 1 micrometer. This metal layer is then masked and etched to form the required interconnection
pattern.
This diffusion should be carried out with special care since the p-well doping concentration and
depth will affect the threshold voltages as well as the breakdown voltages of the n-transistors. To
achieve low threshold voltages (0.6 to 1.0 V), either deep-well diffusion or high well resistivity is
required. However, deep wells require larger spacing between the n- and p-type transistors and
wires due to lateral diffusion, and therefore a larger chip area. The p-wells act as substrates for the
n-devices within the parent n-substrate, and the two areas are electrically isolated. Apart from this,
in all other respects, such as masking, patterning, and diffusion, the process is similar to nMOS
fabrication.
Figure 1.7-5 below shows the CMOS p-well inverter with its VDD and VSS
substrate connections.
Figure 1.7-5 CMOS p-well inverter
(ii) The n-well Process: Though the p-well process is widely used in CMOS fabrication, the
n-well process is also very popular because of the lower substrate bias effects on transistor
threshold voltage and the lower parasitic capacitances associated with source and drain regions.
The typical n-well fabrication steps are shown in Figure 1.8 below.
The first mask defines the n-well regions. This is followed by a low dose phosphorus implant
driven in by a high temperature diffusion step to form the n-wells. The well depth is optimized to
ensure against p-substrate to p+ diffusion breakdown without compromising the n-well to n+ mask
separation. The next steps are to define the devices and diffusion paths, grow field oxide, deposit
and pattern the polysilicon, carry out the diffusions, make contact cuts, and finally metalize as before. It will
be seen that an n+ mask and its complement may be used to define the n- and p-diffusion regions
respectively. These same masks also include the VDD and VSS contacts (respectively). It should
be noted that, alternatively, we could have used a p+ mask and its complement since the n+ and
p+ masks are complementary.
Due to the differences in charge carrier mobilities, the n-well process creates non-optimum
p-channel characteristics. However, in many CMOS designs (such as domino logic and dynamic
logic structures), this is relatively unimportant since they contain a preponderance of n-channel
devices. Thus the n-channel transistors are mainly those used to form logic elements, providing
speed and high density of elements.
However, a drawback of the n-well process is that the performance of the already poorly performing
p-transistor is even further degraded. Modern process lines have come to grips with these
problems, and good device performance may be achieved for both p-well and n-well fabrication.
A BiCMOS circuit consists of both bipolar junction transistors and MOS transistors on a
single substrate. The driving capability of MOS transistors is limited because of their limited
current sourcing and sinking capabilities, so BiCMOS technology is used to drive large capacitive
loads. As this technology combines bipolar and CMOS transistors in a single integrated circuit,
it has the advantages of both. BiCMOS is able to achieve VLSI circuits with speed-power-density
performance previously not possible with either technology individually. The diagram given below
shows the cross section of the BiCMOS process, which uses an npn transistor.
The layout view of a BiCMOS transistor is shown in the figure below. The fabrication of BiCMOS
is similar to CMOS but requires certain additional process steps and additional masks.
They are (i) the p+ base region; (ii) the n+ collector area; and (iii) the buried subcollector
(BCCD).
before bending in a parabolic response. Thus the name ohmic or linear for the non-saturated
region.
The drain current in saturation is virtually independent of VDS and the transistor acts as a current
source. This is because there is no carrier inversion at the drain region of the channel. Carriers are
pulled into the high electric field of the drain/substrate pn junction and ejected out of the drain
terminal.
where
Sometimes it is also convenient to use the gate capacitance per unit area, Cg.
So, the drain current is
This is the relation between drain current and drain-source voltage in non-saturated region.
or
The expressions derived above for Ids hold for both enhancement and depletion mode devices.
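For reference, the first-order drain current expressions that a derivation of this kind arrives at are commonly written as below. This is a sketch, not a reproduction of the equations lost from these notes. The symbols K, D (gate oxide thickness), ε_ins, ε_0, μ, W and L are assumptions introduced here, and C_0 denotes the gate capacitance per unit area.

```latex
\begin{align*}
K &= \frac{\varepsilon_{ins}\,\varepsilon_0\,\mu}{D}, \qquad
C_0 = \frac{\varepsilon_{ins}\,\varepsilon_0}{D} \quad\Rightarrow\quad K = C_0\,\mu \\[4pt]
I_{ds} &= K\,\frac{W}{L}\left[(V_{gs}-V_t)\,V_{ds}-\frac{V_{ds}^{2}}{2}\right]
        = C_0\,\mu\,\frac{W}{L}\left[(V_{gs}-V_t)\,V_{ds}-\frac{V_{ds}^{2}}{2}\right]
        && (V_{ds} < V_{gs}-V_t,\ \text{non-saturated}) \\[4pt]
I_{ds} &= K\,\frac{W}{L}\,\frac{(V_{gs}-V_t)^{2}}{2}
        = C_0\,\mu\,\frac{W}{L}\,\frac{(V_{gs}-V_t)^{2}}{2}
        && (V_{ds} \ge V_{gs}-V_t,\ \text{saturated})
\end{align*}
```

The saturated expression follows from the non-saturated one by setting Vds = Vgs - Vt, which is the limiting condition between the two regions.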
Here the threshold voltage for the nMOS depletion mode device (denoted as Vtd) is negative.
where
QD = the charge per unit area in the depletion layer below the oxide
Qss = charge density at the Si:SiO2 interface
Co = capacitance per unit area
Φns = work function difference between gate and Si
ΦfN = Fermi level potential between inverted surface and bulk Si
For a polysilicon gate and silicon substrate, the value of Φns is negative but negligible, and the
magnitude and sign of Vt are thus determined by balancing the other terms in the equation.
To evaluate the Vt the other terms are determined as below.
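The threshold voltage expression to which these symbols belong is not reproduced in these notes. A commonly quoted textbook form, given here as an assumption and written with the symbols defined above, is sketched below. The quantities q (electronic charge), N (substrate impurity concentration), n_i (intrinsic carrier concentration), D (oxide thickness), k, T and the permittivities are additional assumed symbols.

```latex
\begin{align*}
V_t &= \Phi_{ns} + \frac{Q_D - Q_{ss}}{C_o} + 2\,\Phi_{fN} \\[4pt]
Q_D &= \sqrt{2\,\varepsilon_0\,\varepsilon_{Si}\,q\,N\,\left(2\,\Phi_{fN} + V_{sb}\right)} \\[4pt]
C_o &= \frac{\varepsilon_{ins}\,\varepsilon_0}{D} \\[4pt]
\Phi_{fN} &= \frac{kT}{q}\,\ln\!\frac{N}{n_i}
\end{align*}
```

In this form the dependence of Q_D on V_sb is what makes the threshold voltage sensitive to the source-to-substrate bias.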
effect. The potential difference between the source and the body (Vsb) affects the threshold
voltage of the transistor. In many situations, this body effect is relatively insignificant, so we
can (unless otherwise stated) ignore it. But it is not always insignificant; in some
cases it can have a tremendous impact on MOSFET circuit performance.
Increasing Vsb causes the channel to be depleted of charge carriers and thus the threshold
voltage is raised. The change in Vt is given by ΔVt = γ (Vsb)^(1/2), where γ is a constant which
depends on substrate doping, so that the more lightly doped the substrate, the smaller will be the
body effect.
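As a small numerical illustration of the relation ΔVt = γ (Vsb)^(1/2), the Python sketch below tabulates the threshold shift for a few substrate bias values. The value of γ used here is an assumed placeholder and is not taken from these notes.

```python
import math


def delta_vt(gamma: float, vsb: float) -> float:
    """Threshold voltage shift due to body effect: gamma * sqrt(Vsb)."""
    if vsb < 0:
        raise ValueError("this sketch assumes a non-negative Vsb")
    return gamma * math.sqrt(vsb)


GAMMA = 0.4  # assumed illustrative constant, in units of V**0.5

for vsb in (0.0, 1.0, 2.5, 5.0):
    print(f"Vsb = {vsb:.1f} V -> delta Vt = {delta_vt(GAMMA, vsb):.3f} V")
```

Because the shift grows with the square root of Vsb, doubling the substrate bias raises the threshold by a factor of about 1.4 rather than 2.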
The threshold voltage can be written as
A simple inverter circuit can be constructed using a transistor with its source connected to ground
and a load resistor connected from the drain to the positive supply rail VDD. The output is
taken from the drain and the input is applied between gate and ground.
However, resistors are not conveniently produced on the silicon substrate during fabrication, and
even small values of resistance occupy excessively large areas. Hence some other form of load
resistance is used. A more convenient solution is to use a depletion mode
transistor as the load, as shown in the figure below.
For the depletion mode transistor, the gate is connected to the source, so it is always on.
In this configuration the depletion mode device is called the pull-up (P.U) and the
enhancement mode device the pull-down (P.D) transistor.
With no current drawn from the output, the currents Ids for both transistors must be
equal.
1.2.5.1 nMOS Inverter transfer characteristic.
The transfer characteristic is drawn by taking Vds on the x-axis and Ids on the y-axis for both
enhancement and depletion mode transistors. To obtain the inverter transfer characteristic, the
Vgs = 0 depletion mode characteristic curve is superimposed on the family of curves for the
enhancement mode device. From the graph it can be seen that the maximum voltage across the
enhancement mode device corresponds to the minimum voltage across the depletion mode transistor.
From the graph it is clear that as Vin (= Vgs of the pull-down transistor) exceeds the pull-down
threshold voltage, current begins to flow. The output voltage Vout thus decreases, and subsequent
increases in Vin will cause the pull-down transistor to come out of saturation and become
resistive.
1.2.6 CMOS Inverter
The inverter is a very important part of all digital designs. Once its operation and
properties are clearly understood, complex structures like NAND gates, adders, multipliers, and
microprocessors can also be designed more easily. The electrical behavior of these complex
circuits can be almost completely derived by extrapolating the results obtained for inverters. As
shown in Figure 1.17 below, the CMOS inverter is designed using pMOS and nMOS transistors.
In the inverter circuit, if the input is high, the lower nMOS device turns on to discharge the
capacitive load. Similarly, if the input is low, the top pMOS device is turned on to charge the
capacitive load. In steady state the two devices are never on at the same time, which prevents DC
current from flowing from the positive power supply to ground. Qualitatively this circuit acts like
a switching circuit, since the p-channel transistor has exactly the opposite characteristics of the
n-channel transistor. In the transition region both transistors are saturated and the circuit
operates with a large voltage gain.
The CMOS transfer characteristic is shown in Figure 1.18 below. Considering the static
conditions first, it may be seen that in region 1, for which Vin = logic 0, we have the p-transistor
fully turned on while the n-transistor is fully turned off. Thus no current flows through the
inverter and the output is directly connected to VDD through the p-transistor.
Hence the output voltage is logic 1. In region 5, Vin = logic 1 and the n-transistor is fully on
while the p-transistor is fully off. So, no current flows and a logic 0 appears at the output.
In region 2 the input voltage has increased to a level which just exceeds the threshold voltage of
the n-transistor. The n-transistor conducts and has a large voltage between source and drain; so it
is in saturation. The p-transistor is also conducting but with only a small voltage across it, it
operates in the unsaturated resistive region. A small current now flows through the inverter from
VDD to VSS. If we wish to analyze the behavior in this region, we equate the p-device resistive
region current with the n-device saturation current and thus obtain the voltage and current
relationships.
Region 4 is similar to region 2 but with the roles of the p- and n-transistors reversed. However,
the current magnitudes in regions 2 and 4 are small, and most of the energy consumed in
switching from one state to the other is due to the larger current which flows in region 3.
Region 3 is the region in which the inverter exhibits gain and in which both transistors are in
saturation.
The currents in each device must be the same, since the transistors are in series. So, we can write
that
Since both transistors are in saturation, they act as current sources, so that the equivalent circuit in
this region is two current sources in series between VDD and VSS with the output voltage coming
from their common point. The region is inherently unstable in consequence, and the changeover
from one logic level to the other is rapid.
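Equating the two saturation currents in region 3 fixes the input voltage at which the changeover occurs. The sketch below assumes the usual square-law model, with gain factors beta_n and beta_p and thresholds Vtn > 0 and Vtp < 0. These symbols and the function name are assumptions made for illustration, not notation taken from these notes.

```python
import math


def inverter_threshold(vdd: float, vtn: float, vtp: float,
                       beta_n: float, beta_p: float) -> float:
    """Input voltage at which both CMOS inverter transistors are saturated.

    Assumed model: (beta_n / 2) * (Vin - Vtn)**2 for the n-device and
    (beta_p / 2) * (Vin - VDD - Vtp)**2 for the p-device, with Vtp negative.
    Setting the two currents equal and solving for Vin gives the result.
    """
    r = math.sqrt(beta_n / beta_p)
    return (vdd + vtp + vtn * r) / (1.0 + r)


# Matched devices with symmetric thresholds switch at VDD / 2.
print(inverter_threshold(vdd=5.0, vtn=1.0, vtp=-1.0, beta_n=1.0, beta_p=1.0))
# A stronger n-device pulls the switching point below VDD / 2.
print(inverter_threshold(vdd=5.0, vtn=1.0, vtp=-1.0, beta_n=4.0, beta_p=1.0))
```

With equal gain factors and thresholds of equal magnitude, the switching point sits at mid-supply, which is the condition usually aimed for to obtain symmetric noise margins.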
where Wp.d., Lp.d., Wp.u. and Lp.u. are the widths and lengths of the pull-down and pull-up
transistors respectively.
So, we can write that
here
So, we get
This is the required pull-up to pull-down ratio for an inverter directly driven by another inverter.
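The ratio can be checked numerically. The sketch below assumes the standard argument in which both transistors are saturated at Vin = Vout = Vinv, with the depletion mode pull-up at Vgs = 0, and writes Z = L/W for each device. The values Vinv = 0.5 VDD and Vtd = -0.6 VDD are the ones conventionally used in that argument and are assumptions here; Vt = 0.2 VDD is the value quoted earlier in these notes.

```python
def pull_up_to_pull_down_ratio(v_inv: float, vt: float, vtd: float) -> float:
    """Return Zp.u./Zp.d. for an nMOS inverter with a depletion mode pull-up.

    Equating the saturation currents
        (1 / Zp.u.) * (-vtd)**2 / 2  ==  (1 / Zp.d.) * (v_inv - vt)**2 / 2
    gives Zp.u./Zp.d. = (vtd / (v_inv - vt))**2.
    """
    if v_inv <= vt:
        raise ValueError("v_inv must exceed the pull-down threshold vt")
    return (vtd / (v_inv - vt)) ** 2


VDD = 5.0
ratio = pull_up_to_pull_down_ratio(v_inv=0.5 * VDD, vt=0.2 * VDD, vtd=-0.6 * VDD)
print(f"Zp.u./Zp.d. = {ratio:.1f}")
```

Under these assumed values the ratio evaluates to 4, that is, a pull-up to pull-down ratio of 4/1.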
1.2.8 Pull-Up to Pull-Down ratio for an nMOS Inverter driven through one or more Pass Transistors
Let us consider an arrangement in which the input to inverter 2 comes from the output of
inverter 1 but passes through one or more nMOS transistors, as shown in Figure 1.20 below.
(These transistors are called pass transistors.)
The connection of pass transistors in series will degrade the logic 1 level into inverter 2, so that
the output will not be a proper logic 0 level. The critical condition is when point A is at 0 volts
and B is thus at VDD, but the voltage into inverter 2 at point C is now reduced from VDD by the
threshold voltage of the series pass transistor. With all pass transistor gates connected to VDD,
there is a loss of only Vtp, however many are connected in series, since no static current flows
through them and there can be no voltage drop in the channels. Therefore, the input voltage to
inverter 2 is
Vin2 = VDD - Vtp
where Vtp = threshold voltage for a pass transistor.
Let us consider inverter 1, shown in Fig. (a), with input = VDD. If the input is at VDD, then the
pull-down transistor T2 is conducting but with a low voltage across it; therefore, it is in its
resistive region, represented by R1 in Fig. (a) below. Meanwhile, the pull-up transistor T1 is in
saturation and is represented as a current source.
For the pull-down transistor
So,
Let us now consider inverter 2, Fig. (b), when input = VDD - Vtp.
Whence,
If inverter 2 is to have the same output voltage under these conditions, then Vout1 = Vout2. That is,
I1R1 = I2R2, therefore
Therefore
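The intermediate expressions of this argument are not reproduced in these notes. One standard way of filling them in is sketched below, as an assumption, using the first-order MOS model with process constant K and Z = L/W for each device. The values Vt = 0.2 VDD and Vtp = 0.3 VDD in the last step are typical textbook figures and are likewise assumed.

```latex
\begin{align*}
R_1 &\approx \frac{1}{K}\,Z_{p.d.1}\,\frac{1}{V_{DD}-V_t}, &
I_1 &= K\,\frac{1}{Z_{p.u.1}}\,\frac{(-V_{td})^{2}}{2} \\[4pt]
V_{out1} &= I_1 R_1
  = \frac{Z_{p.d.1}}{Z_{p.u.1}}\,\frac{1}{V_{DD}-V_t}\,\frac{(V_{td})^{2}}{2} \\[4pt]
R_2 &\approx \frac{1}{K}\,Z_{p.d.2}\,\frac{1}{(V_{DD}-V_{tp})-V_t}, &
I_2 &= K\,\frac{1}{Z_{p.u.2}}\,\frac{(-V_{td})^{2}}{2} \\[4pt]
V_{out2} &= I_2 R_2
  = \frac{Z_{p.d.2}}{Z_{p.u.2}}\,\frac{1}{V_{DD}-V_{tp}-V_t}\,\frac{(V_{td})^{2}}{2} \\[4pt]
V_{out1} = V_{out2} \;&\Rightarrow\;
\frac{Z_{p.u.2}}{Z_{p.d.2}}
  = \frac{Z_{p.u.1}}{Z_{p.d.1}}\cdot\frac{V_{DD}-V_t}{V_{DD}-V_{tp}-V_t} \\[4pt]
V_t = 0.2\,V_{DD},\ V_{tp} = 0.3\,V_{DD} \;&\Rightarrow\;
\frac{Z_{p.u.2}}{Z_{p.d.2}}
  = \frac{Z_{p.u.1}}{Z_{p.d.1}}\cdot\frac{0.8}{0.5}
  \approx 2\,\frac{Z_{p.u.1}}{Z_{p.d.1}}
\end{align*}
```

If inverter 1 uses the 4/1 ratio of a directly driven inverter, this rounds up to a ratio of about 8/1 for an inverter driven through pass transistors.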
Figure 1.21
2. nMOS depletion mode transistor pull-up: This arrangement uses a depletion mode
transistor as pull-up. The arrangement and the transfer characteristic are shown below in Figure
1.22. In this type of arrangement we observe:
(a) Dissipation is high, since rail to rail current flows when Vin = logical 1.
(b) Switching of output from 1 to 0 begins when Vin exceeds Vt of the pull-down device.
(a) Dissipation is high since current flows when Vin = logical 1 (VGG is returned to VDD).
(b) Vout can never reach VDD (logical 1) if VGG = VDD, as is normally the case.
(c) VGG may be derived from a switching source, for example, one phase of a clock, so that
(d) If VGG is higher than VDD then an extra supply rail is required.
4. Complementary transistor pull-up (CMOS): This arrangement uses a complementary
transistor as pull-up. The arrangement and the transfer characteristic are shown below in
Figure 1.24.
Figure 1.24
bipolar junction transistors (Q2 and Q1), and two impedances which act as loads (Z2 and Z1), as
shown in the circuit below in Figure 1.25.
When the input, Vin, is high (VDD), the NMOS transistor M1 turns on, causing Q1 to
conduct, while M2 and Q2 are off, as shown in Figure 1.26. Hence, a low (GND) voltage is
transferred to the output Vout. On the other hand, when the input is low, M2 and Q2 turn on
while M1 and Q1 turn off, resulting in a high level at the output, as shown in Figure 1.26.
In steady-state operation, Q1 and Q2 are never on simultaneously, resulting in lower
power consumption. This leads to a push-pull bipolar output stage. Transistors M1 and M2, on
the other hand, work as a phase-splitter, which results in a higher input impedance.
Figure 1.26
The impedances Z2 and Z1 are used to bias the base-emitter junctions of the bipolar transistors and
to ensure that base charge is removed when the transistors turn off. For example, when the input
voltage makes a high-to-low transition, M1 turns off first. To turn off Q1, its base charge must
be removed, which is achieved through Z1. This makes the turn-off of the bipolar transistor fast
and reduces the transition time. However, there exists a short time when both Q1 and Q2 are on,
making a direct path from the supply (VDD) to ground. This results in a large current spike that
has a detrimental effect on both noise and power consumption.
The BiCMOS gate performs in the same manner as the CMOS inverter in terms of static power
consumption, because both gates display almost no static power consumption.
When driving small capacitive loads, the performance of BiCMOS and CMOS is comparable,
but BiCMOS consumes more power than CMOS. On the other hand, when driving larger capacitive
loads BiCMOS has the advantage of consuming less power than CMOS, because CMOS needs
inverter chains to drive large capacitive loads, which are not needed in BiCMOS.
The BiCMOS inverter exhibits a substantial speed advantage over CMOS inverters, especially
when driving large capacitive loads. This is due to the bipolar transistor's capability of
effectively multiplying its current.
For very low capacitive loads, the CMOS gate is faster than its BiCMOS counterpart due to
small values of Cint. This makes BiCMOS ineffective when it comes to the implementation of
internal gates for logic structures such as ALUs, where associated load capacitances are small.
BiCMOS devices suffer speed degradation in the low supply voltage region, and BiCMOS also
has greater manufacturing complexity than CMOS.
2.1 INTRODUCTION
Design processes are always associated with certain concepts like stick diagrams and
symbolic diagrams. But the key element is a set of design rules, which forms the communication link
between the designer (who specifies requirements) and the fabricator (who materializes them). Design
rules are used to produce workable mask layouts from which the various layers in silicon will be
formed or patterned. Among the design rules, lambda-based rules are important. They are
straightforward and relatively simple to apply. Nevertheless, they are 'real', and chips can be
fabricated from mask layouts using the lambda-based rule set. Correct and faster designs will be
realized if a fabricator's line is used to its full advantage, and such rule sets are generally
specific not only to the fabricator but also to a particular technology.
2.2 VLSI DESIGN FLOW
The VLSI design cycle starts with a formal specification of a VLSI chip, follows a series of
steps, and eventually produces a packaged chip. A typical design cycle may be represented by the
flow chart shown in Figure 2.1 (VLSI design flow).
Our emphasis is on the physical design step of the VLSI design cycle. However, to gain a global
perspective, we briefly outline all the steps of the VLSI design cycle.
1. System Specification
The first step of any design process is to lay down the specifications of the system. System
specification is a high level representation of the system. The factors to be considered in this
process include: performance, functionality, and physical dimensions (size of the die (chip)).
The fabrication technology and design techniques are also considered.
The specification of a system is a compromise between market requirements, technology and
economical viability. The end results are specifications for the size, speed, power, and
functionality of the VLSI system.
2. Architectural Design
The basic architecture of the system is designed in this step. This includes such decisions as
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer),
number of ALUs and floating point units, number and structure of pipelines, and size of caches,
among others.
The outcome of architectural design is a Micro-Architectural Specification (MAS). While
MAS is a textual (English like) description, architects can accurately predict the performance,
power and die size of the design based on such a description.
3. Functional Design
The outcome of functional design is usually a timing diagram or other relationships between
units. This information leads to improvement of the overall design process and reduction of
the complexity of subsequent phases. Functional or behavioral design provides quick
emulation of the system and allows fast debugging of the full system. Behavioral design is
largely a manual step with little or no automation help available.
4. Logic Design
In this step the control flow, word widths, register allocation, arithmetic operations, and logic
operations of the design that represent the functional design are derived and tested.
This description is called Register Transfer Level (RTL) description. RTL is expressed in a
Hardware Description Language (HDL), such as VHDL or Verilog. This description can be
used in simulation and verification. This description consists of Boolean expressions and
timing information. The Boolean expressions are minimized to achieve the smallest logic
design which conforms to the functional design. This logic design of the system is simulated
and tested to verify its correctness. In some special cases, logic design can be automated
using high level synthesis tools. These tools produce a RTL description from a behavioral
description of the design.
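The idea that a minimized Boolean expression must still conform to the functional design can be illustrated with an exhaustive truth-table comparison. The Python sketch below is a toy check and not a synthesis or verification tool; the two example functions are hypothetical and were chosen only so that the minimization a·b + a·b' = a is visible.

```python
from itertools import product


def equivalent(f, g, n_inputs: int) -> bool:
    """Return True if f and g agree on every combination of n_inputs bits."""
    return all(
        f(*bits) == g(*bits)
        for bits in product((False, True), repeat=n_inputs)
    )


def original(a, b, c):
    # Unminimized form: a.b + a.b' + b.c
    return (a and b) or (a and not b) or (b and c)


def minimized(a, b, c):
    # Minimized form: a + b.c
    return a or (b and c)


def wrong(a, b, c):
    # An incorrect simplification, included to show a failing check.
    return a or b


print(equivalent(original, minimized, 3))  # the minimized form is accepted
print(equivalent(original, wrong, 3))      # the incorrect form is rejected
```

Exhaustive comparison grows as 2^n with the number of inputs, so it is only practical for small blocks.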
5. Circuit Design
The purpose of circuit design is to develop a circuit representation based on the logic design.
The Boolean expressions are converted into a circuit representation by taking into
consideration the speed and power requirements of the original design. Circuit Simulation is
used to verify the correctness and timing of each component.
The circuit design is usually expressed in a detailed circuit diagram. This diagram shows the
circuit elements (cells, macros, gates, transistors) and interconnection between these
elements. This representation is also called a netlist. Tools used to manually enter such
description are called schematic capture tools. In many cases, a netlist can be created
automatically from logic (RTL) description by using logic synthesis tools.
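A netlist, as described above, is simply a list of circuit elements together with the nets that connect their pins. The sketch below uses a hypothetical in-memory representation. The cell names, pin names and the convention that every cell drives a single output pin called Y are assumptions made for illustration, not any tool's netlist format.

```python
from collections import defaultdict

# Hypothetical netlist for y = a AND b, built from a NAND gate and an inverter.
# Each entry is (instance name, cell type, {pin name: net name}).
NETLIST = [
    ("U1", "NAND2", {"A": "a", "B": "b", "Y": "n1"}),
    ("U2", "INV", {"A": "n1", "Y": "y"}),
]

OUTPUT_PINS = {"Y"}  # assumption: every cell has exactly one output pin, Y


def fanout(netlist):
    """Count how many cell input pins are connected to each net."""
    loads = defaultdict(int)
    for _instance, _cell, pins in netlist:
        for pin, net in pins.items():
            if pin not in OUTPUT_PINS:
                loads[net] += 1
    return dict(loads)


print(fanout(NETLIST))
```

Even this small structure is enough to answer connectivity questions such as fan-out per net, which is the kind of information later design steps rely on.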
6. Physical Design
In this step the circuit representation (or netlist) is converted into a geometric representation.
As stated earlier, this geometric representation of a circuit is called a layout. Layout is created
by converting each logic component (cells, macros, gates, transistors) into a geometric
representation (specific shapes in multiple layers), which perform the intended logic function
of the corresponding component. Connections between different components are also
expressed as geometric patterns, typically lines in multiple layers.
The exact details of the layout also depend on design rules, which are guidelines based on the
limitations of the fabrication process and the electrical properties of the fabrication materials.
Physical design is a very complex process and therefore it is usually broken down into
various sub-steps. Various verification and validation checks are performed on the layout
during physical design.
In many cases, physical design can be completely or partially automated and layout can be
generated directly from the netlist by layout synthesis tools. Layout synthesis tools, while fast,
do carry an area and performance penalty, which limits their use to some designs. Manual
layout, while slow and labor intensive, achieves better area and performance than
synthesized layout. However, this advantage may dissipate as larger and larger
designs undermine the human capability to comprehend them and obtain globally optimized
solutions.
7. Fabrication
After layout and verification, the design is ready for fabrication. Since layout data is typically
sent to fabrication on a tape, the event of release of data is called Tape Out. Layout data is
converted (or fractured) into photo-lithographic masks, one for each layer. Masks identify
spaces on the wafer, where certain materials need to be deposited, diffused or even removed.
Silicon crystals are grown and sliced to produce wafers. Extremely small dimensions of VLSI
devices require that the wafers be polished to near perfection. The fabrication process consists
of several steps involving deposition and diffusion of various materials on the wafer. During
each step one mask is used. Several dozen masks may be used to complete the fabrication
process.
A large wafer is 20 cm (8 inch) in diameter and can be used to produce hundreds of chips,
depending on the size of the chip. Before the chip is mass produced, a prototype is made and
tested. Industry is rapidly moving towards a 30 cm (12 inch) wafer allowing even more chips
per wafer leading to lower cost per chip.
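The effect of wafer diameter on chip count can be sketched with a simple area ratio. The following is a rough upper bound only: it divides wafer area by die area and ignores edge loss, scribe lines, and yield. The 1 cm² die area is a hypothetical value chosen for illustration, not a figure from these notes.

```python
import math

def gross_dies(wafer_diameter_cm: float, die_area_cm2: float) -> int:
    """Upper bound on dies per wafer: wafer area divided by die area.

    Ignores edge loss, scribe lines, and defective dies.
    """
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    return int(wafer_area // die_area_cm2)

DIE_AREA_CM2 = 1.0  # hypothetical die size, for illustration only

n20 = gross_dies(20, DIE_AREA_CM2)  # 20 cm (8 inch) wafer
n30 = gross_dies(30, DIE_AREA_CM2)  # 30 cm (12 inch) wafer
print(n20, n30, round(n30 / n20, 2))
```

Because area grows with the square of the diameter, moving from 20 cm to 30 cm gives roughly 2.25 times as many candidate dies in this simplified model.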
MOS design is aimed at turning a specification into masks for processing silicon to meet the
specification. We have seen that MOS circuits are formed on four basic layers: n-diffusion,
p-diffusion, polysilicon, and metal, which are isolated from one another by thick or thin (thinox)
silicon dioxide insulating layers. The thin oxide (thinox) mask region includes n-diffusion,
p-diffusion, and transistor channels. Polysilicon and thinox regions interact so that a transistor is
formed where they cross one another. In some processes, there may be a second metal layer and
also, in some processes, a second polysilicon layer. Layers may be deliberately joined together
where contacts are formed. The basic MOS transistor properties can also be
modified by the use of an implant within the thinox region, and this is used in nMOS circuits to
produce depletion mode transistors. BiCMOS technology is developed by including
bipolar transistors in this design process through the addition of extra layers to a CMOS process.
2.4 Stick Diagrams
The layout of stick diagrams faithfully reflects the topology of the actual layout in silicon. The
color encoding is compatible with color terminals, printers, and plotters having quite simple
color palettes. Using color workstations, the mask areas are usually color filled while pen plotters
produce color outlines only. Figure 2.4 shows the color encodings for a simple single-metal nMOS
process.
The stick diagram for an nMOS transistor is shown in Figure 2.5. The two parallel rails indicate VDD
and GND.
To understand the design rules for the nMOS design style, let us consider a single metal, single
polysilicon nMOS technology.
The layout of nMOS is based on the following important features.
Polysilicon (red) crosses thinox (green) wherever transistors are required. One should consider the implants
(yellow) for depletion mode transistors and also consider the length to width (L:W) ratio
for each transistor. These ratios are particularly important in nMOS and nMOS-like
circuits. This is illustrated in Figure 2.6 (c).
Diffusion paths must not cross the demarcation line and n-diffusion and p-diffusion wires must
not join. The 'n' and 'p' features are normally joined by metal where a connection is needed. Their
geometry will appear when the stick diagram is translated to a mask layout. However, one must
not forget to place crosses on VDD and Vss rails to represent the substrate and p-well connection
respectively.
The design style is explained by taking as an example the design of a single bit shift register. The
design begins with the drawing of the VDD and Vss rails in parallel and in metal and the creation
of an (imaginary) demarcation line in between, as shown in Figure 2.8 (a). The n-transistors are
then placed below this line and thus close to Vss, while p-transistors are placed above the line
and below VDD. In both cases, the transistors are conveniently placed with their diffusion paths
parallel to the rails (horizontal in the diagram) as shown in Figure 2.8 (b). A similar approach
The n-transistors and p-transistors are then connected to the rails using metal, as
shown in Figure 2.8 (c). It must be remembered that only metal and polysilicon can cross the
demarcation line but, within that restriction, wires can also run in diffusion. Finally, the remaining
interconnections are made as appropriate and the control signals and data inputs are added as
shown in Figure 2.8 (d).
Figure 2.9 (a) Design rules for wires (n-MOS and p-MOS)
Figure 2.9 (b) Design rules for transistors (n-MOS , p-MOS, CMOS)
In CMOS designs, poly. to diff. contacts are always made via metal. A simple process is
followed for making connections between metal and either of the other two layers, as shown in Figure
2.10 (a). The 2λ × 2λ contact cut indicates an area in which the oxide is to be removed down to the
underlying polysilicon or diffusion surface. When deposition of the metal layer takes place, the metal
is deposited through the contact cut areas onto the underlying area so that contact is made between
the layers.
The process is more complex for connecting diffusion to polysilicon using the butting contact
approach, as shown in Figure 2.10 (b). In effect, a 2λ × 2λ contact cut is made down to each of the
layers to be joined. The layers are butted together in such a way that these two contact cuts become
contiguous. Since the polysilicon and diffusion outlines overlap and the thin oxide under polysilicon
acts as a mask in the diffusion process, the polysilicon and diffusion layers are also butted together.
The contact between the two butting layers is then made by a metal overlay, as shown in
Figure 2.10.
Usually, second level metal layers are coarser than the first (conventional) layer and the isolation
layer between the layers may also be of relatively greater thickness. To distinguish contacts between
first and second metal layers, they are known as vias rather than contact cuts. The second metal layer
representation is color coded dark blue (or purple).
The important process steps for a two-metal layer process are given below.
The oxide below the first metal layer is deposited by atmospheric chemical vapor deposition (CVD)
and the oxide layer between the metal layers is applied in a similar manner. Depending on the
process, removal of selected areas of the oxide is accomplished by plasma etching, which is designed
to have a high level of vertical ion bombardment to allow for high and uniform etch rates. Similarly,
the bulk of the process steps for a double polysilicon layer process are similar in nature to those
already described, except that a second thin oxide layer is grown after depositing and patterning the
first polysilicon layer (Poly.1) to isolate it from the now to be deposited second poly. layer (Poly.2).
The presence of a second poly. layer gives greater flexibility in interconnections and also allows
Poly.2 transistors to be formed by intersecting Poly. 2 and diffusion.
The important features of double metal process are summarized as follows
Use the second level metal for the global distribution of power buses, that is, V DD and GND
( Vss), and for clock lines.
Use the first level metal for local distribution of power and for signal lines.
Lay out the two metal layers so that the conductors are mutually orthogonal wherever possible.
Each of the above arrangements can be merged into single split contacts
The CMOS rules are based on extensions of the Mead and Conway concepts. The new rules for CMOS
design are formed by also excluding the butting and buried contacts.
2.5.5 2 μm CMOS Design Rules for Wires, Contacts and Transistors
The microscopic dimensions of silicon circuits always cause some problems in the design
process. The major problem is presented by possible deviation in line widths and in interlayer
registration. If the line widths are too small, it is possible for lines to be discontinuous in places. If
separate paths in a layer are placed too close together, it is possible that they will merge in places or
interfere with each other.
For the lambda-based rules, the design rules are formulated in terms of a length unit λ which
is related to the resolution of the process. λ may be viewed as a limit on the width deviation of a
feature from its ideal 'as drawn' size and also as a bound on the maximum misalignment of any one
mask. In the worst case, these effects may combine to cause the relative position of feature edges on
different mask levels to deviate by as much as 2λ in their interrelationship. Inevitably, a
consequence of using the lambda-based concept is that every dimension must be rounded up to
whole λ values and this leads to layouts which do not fully exploit the capabilities of the process.
Similar concepts underlie the establishment of 'micron-based' rule sets, but actual dimensions are
given so that full advantage can be taken of the fabrication line capabilities and tighter layouts
result.
Layout rules, therefore, provide strict guidelines for preparing the geometric layouts which
will be used to configure the actual masks used during fabrication and can be regarded as the main
communication link between circuit/systems designers and the process engineers engaged in
manufacture. The goal of any set of design rules should be to optimize yield while keeping the
geometry as small as possible without compromising the reliability of the finished circuit. On the
questions of yield and reliability, even the conservative nature of the lambda-based rules can stand
reevaluation when these two factors are of paramount importance. In particular, the rules associated
with contacts can be improved upon in the light of experience. Figure 2.12 (a) sets out aspects that
may be observed for high yield and in high reliability situations.
In our proposed scheme of events in creating stick layouts for CMOS, it is assumed that poly. and
metal can both freely cross well boundaries and this is indeed the case, but we should be careful
to try to exclude poly. from areas which lie within p+ mask areas where possible. The reason for
this is that the resistance of the poly. layer is reduced in current processes by n-type doping.
Clearly the p+ doping which takes place inside the p+ mask will also dope the poly., which is
already in place when the p+ doping step takes place. This results in an increase in the resistance
of the n-doped poly., which may be significant in certain parts of a system.
The 3λ metal width rule is a conservative one but is implemented to allow for the fact that the metal
layer is deposited after the others and on top of them and several layers of silicon dioxide, so that
the surface on which it sits is quite 'mountainous' . The metal layer is also light-reflective and these
factors combine to result in poor edge definition. In double metal the second layer of metal has an
even more uneven terrain on which to be deposited and patterned. Hence metal 2 is often wider than
metal 1.
Metal to metal separation is also large and is brought about mainly by difficulties in defining metal
edges accurately during masking operations on the highly reflective metal. All diffusion processes
are such that lateral diffusion occurs as well as impurity penetration from the surface. Hence the
separation rules for diffusion allow for this and relatively large separations are specified. This is
particularly the case for the p-well diffusions which are deep diffusions and thus have considerable
lateral spread. Transitions from thin gate oxide to thick field oxide in the oxidation process also use
up space and this is another reason why the lambda-based rules require a minimum separation
between thinox regions of 3λ. In effect, this implies that the minimum feature size for thick oxide is
3λ.
The simplicity of the lambda-based rules makes this approach to design an appropriate one for
the novice chip designer and also, perhaps, for those applications in which we are not trying to
achieve the absolute minimum area and the absolute maximum performance. Because lambda-based
rules try 'to be all things to all people', they do suffer from least common denominator effects and
from the upward rounding of all process line dimension parameters into integer values of lambda.
The performance of any fabrication line in this respect clearly comes down to a matter of tolerances
and definitions in terms of microns (or some other suitable unit of length). Thus, expanded sets of
rules, often referred to as micron-based rules, are available to the more experienced designer to allow
for the use of the full capability of any process. Also, many processes offer additional layers, which
again adds to the possibilities presented to the designer. In order to properly represent these
important aspects, the next section introduces Orbit Semiconductor's 2 µm feature size double metal,
double poly. n-well CMOS rules, which also offer a BiCMOS capability.
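The upward rounding into integer values of lambda mentioned above can be illustrated with a short sketch. The process limits and the choice of λ = 1 µm below are hypothetical numbers picked for illustration; they are not taken from any real rule set.

```python
import math

def to_lambda(dimension_um: float, lambda_um: float) -> int:
    """Round a micron dimension up to a whole number of lambda."""
    # The small epsilon guards against floating-point noise on exact multiples.
    return math.ceil(dimension_um / lambda_um - 1e-9)

LAMBDA_UM = 1.0  # assumed value of lambda, for illustration only

# Hypothetical process capabilities in microns (not real design rules).
process_limits_um = {
    "metal width": 2.5,
    "poly width": 2.0,
    "diffusion spacing": 2.2,
}

for name, limit in process_limits_um.items():
    n = to_lambda(limit, LAMBDA_UM)
    wasted = n * LAMBDA_UM - limit
    print(f"{name}: {limit} um -> {n} lambda (wasted {wasted:.1f} um)")
```

Any dimension that is not already a whole multiple of λ is drawn larger than the process strictly requires, which is the area penalty that micron-based rules avoid.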
2.6 Transistor layout diagrams for NMOS and CMOS inverters
Figure 2.15 shows the stick diagram and layout diagram of nMOS inverter.
3.1 INTRODUCTION
The module (integrated circuit) is implemented in terms of logic gates and interconnections
between these gates. The designer should know the gate-level diagram of the design. In general, gate-level
modeling is used for implementing the lowest level modules in a design, such as full adders and multiplexers.
Boolean algebra is used to represent the logical (combinational logic) functions of digital circuits. A
combinational logic expression is a mathematical formula which is to be interpreted using the laws of
Boolean algebra. The goal of logic design or optimization is to find a network of logic gates that
together compute the combinational logic function we want.
For example, given the expression a+b, we can compute its truth value for any given values of a and
b, and we can also evaluate relationships such as a+b = c. However, logic design is difficult for many
reasons:
We may not have a logic gate for every possible function, or even for every function of n
inputs.
Not all gate networks that compute a given function are alike-networks may differ greatly in
their area and speed.
Thus combinational logic expressions are the specification, logic gate networks are the
implementation, and area, delay, and power are the costs.
A logic gate is an idealized or physical device implementing a Boolean function, that is, it
performs a logical operation on one or more logic inputs and produces a single logic output.
Logic gates are primarily implemented using diodes or transistors acting as electronic
switches, but can also be constructed using electromagnetic relays (relay logic), fluidic logic,
pneumatic logic, optics, molecules, or even mechanical elements.
With amplification, logic gates can be cascaded in the same way that Boolean functions can
be composed, allowing the construction of a physical model of all of Boolean logic.
The simplest form of electronic logic is diode logic. This allows AND and OR gates to be built,
but not inverters, and so is an incomplete form of logic. Further, without some kind of
amplification it is not possible to have such basic logic operations cascaded as required for
more complex logic functions.
To build a functionally complete logic system, relays, valves (vacuum tubes), or transistors
can be used.
The simplest family of logic gates using bipolar transistors is called resistor-transistor logic
(RTL). Unlike diode logic gates, RTL gates can be cascaded indefinitely to produce more
complex logic functions. These gates were used in early integrated circuits. For higher speed,
the resistors used in RTL were replaced by diodes, leading to diode- transistor logic (DTL).
Transistor-transistor logic (TTL) then supplanted DTL with the observation that one
transistor could do the job of two diodes even more quickly, using only half the space.
In virtually every type of contemporary chip implementation of digital systems, the bipolar
transistors have been replaced by complementary field-effect transistors (MOSFETs) to
reduce size and power consumption still further, thereby resulting in complementary metal–
oxide–semiconductor (CMOS) logic that can be described with Boolean logic.
The pull-up network (PUN) connects the output node to VDD and the pull-down network (PDN) connects
the output node to ground.
The transistor network is related to the Boolean function with a straightforward design procedure:
Design the pull-down network (PDN) by realizing AND (product) terms using series-connected
nMOSFETs and OR (sum) terms using parallel-connected nMOSFETs.
Design the pull-up network (PUN) by realizing AND (product) terms using parallel-connected
pMOSFETs and OR (sum) terms using series-connected pMOSFETs.
Add an inverter to the output to complement the function. Functions that are inherently
negated, such as NAND and NOR, do not need an inverter at the output terminal.
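The series/parallel procedure above can be checked with a small switch-level sketch. It is an idealized model, assuming perfect switches, with the PUN taken as the exact dual of the PDN. The tuple encoding of networks and all function names are illustrative choices, not notation from these notes.

```python
from itertools import product

# A network is a nested tuple describing the PDN:
#   ("var", name), ("series", n1, n2, ...) or ("parallel", n1, n2, ...)
# The PUN is modeled as the dual of the same description.

def pdn_conducts(net, inputs):
    """True if the nMOS pull-down network has a closed path to ground."""
    kind = net[0]
    if kind == "var":
        return inputs[net[1]] == 1  # an nMOS switch closes on a HIGH gate
    branches = [pdn_conducts(sub, inputs) for sub in net[1:]]
    return all(branches) if kind == "series" else any(branches)

def pun_conducts(net, inputs):
    """True if the dual pMOS pull-up network has a closed path to VDD."""
    kind = net[0]
    if kind == "var":
        return inputs[net[1]] == 0  # a pMOS switch closes on a LOW gate
    branches = [pun_conducts(sub, inputs) for sub in net[1:]]
    # Series in the PDN becomes parallel in the PUN, and vice versa.
    return any(branches) if kind == "series" else all(branches)

def gate_output(pdn, inputs):
    down = pdn_conducts(pdn, inputs)
    up = pun_conducts(pdn, inputs)
    assert up != down, "exactly one network should conduct"
    return 0 if down else 1

def truth_table(pdn, names):
    return {bits: gate_output(pdn, dict(zip(names, bits)))
            for bits in product((0, 1), repeat=len(names))}

A, B, C, D = (("var", n) for n in "ABCD")
NAND2 = ("series", A, B)                                 # AND term: series nMOS
NOR2 = ("parallel", A, B)                                # OR term: parallel nMOS
AOI22 = ("parallel", ("series", A, B), ("series", C, D))  # complement of AB + CD

print(truth_table(NAND2, "AB"))
print(truth_table(NOR2, "AB"))
```

The assertion inside `gate_output` reflects the key property of complementary CMOS in this model: for every input combination, exactly one of the two networks conducts, so the output is always driven and never shorted.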
1) When the input Vin is logic HIGH, then the nMOS transistor is ON and the pMOS
transistor is OFF. Thus the output Y is pulled down to ground (logic 0) since it is
connected to ground but not to source VDD as shown in Figure 3.4 (a)
2) When the input Vin is logic LOW, then nMOS transistor is OFF and the pMOS transistor
is ON, Thus the output Y is pulled up to VDD(logic 1) since it is connected to source via
pMOS but not to ground as shown in Figure 3.4 (b).
$\overline{Y} = \overline{\overline{A \cdot B}} = A \cdot B$
In this case, there is only one AND term, so there will be two nMOSFETs in series as shown in Figure
3.5 (a).
Step 3 Design the PUN. In PUN there will be two pMOSFETs in parallel , as shown in Figure 3.5 (b).
Finally, join the PUN and PDN as shown in Figure 3.5 (c), which realizes a two-input NAND gate. Note
that we have realized Y rather than its complement, because the inversion is automatically provided
by the nature of the CMOS circuit operation.
Working operation
1) Whenever at least one of the inputs is LOW, the corresponding pMOS transistor will conduct
while the corresponding nMOS transistor will turn OFF. Consequently, the output voltage
will be HIGH.
2) Conversely, if both inputs are simultaneously HIGH, then both pMOS transistors will turn
OFF, and the output voltage will be pulled LOW by the two conducting nMOS transistors.
$\overline{Y} = \overline{\overline{A + B}} = A + B$
In this case, there is only one OR term, so there will be two nMOSFETs connected in parallel, as
shown in Figure 3.6 (a)
Step 3 Design the PUN
In PUN there will be two pMOSFETs in series , as shown in Figure 3.6 (b)
Finally, join the PUN and PDN as shown in Figure 3.6 (c), which realizes a two-input NOR gate. Note
that we have realized Y rather than its complement, because the inversion is automatically provided
by the nature of the CMOS circuit operation.
2) Conversely, if both inputs are simultaneously HIGH, then both pMOS transistors will turn
OFF, and the output voltage will be pulled LOW by the two conducting nMOS transistors.
Step 1: Draw the A.B (AND) function first by connecting two nMOS transistors in series, as shown in Figure
3.8 (a).
Step 3: Y = A.B + C.D. In this function, A.B and C.D are added; for addition, we have to draw a parallel
connection. So, the series-connected A.B network is placed in parallel with the C.D network, as shown in Figure 3.8 (c).
Step 5: Take the output at the point in between the nMOS and pMOS networks.
3) When using nMOS switch logic no pass transistor gate input may be driven through one
or more pass transistors as shown in Figure 3.12.
Operation
When the gate input to the nMOS transistor is ‘0’ and the complementary ‘1’ is the gate input to
the pMOS, both transistors are turned off.
When the gate input to the nMOS is ‘1’ and its complement ‘0’ is the gate input to the pMOS,
both are turned on and pass any signal, ‘1’ or ‘0’, equally without any degradation.
The use of transmission gates eliminates the undesirable threshold voltage effects which give
rise to loss of logic levels in pass-transistors as shown in Figure 3.13.
Figure 3.13 Transmission gate
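The loss of logic level in a single pass transistor, and its removal by the transmission gate, can be sketched with a first-order model. The model assumes ideal threshold behavior with no body effect. The supply and threshold values (5 V, 1 V, 1 V) are hypothetical, and the function names are illustrative.

```python
def nmos_pass(v_in, v_gate, vtn):
    """nMOS pass transistor: the output cannot rise above v_gate - vtn."""
    return min(v_in, max(v_gate - vtn, 0.0))

def pmos_pass(v_in, v_gate, vtp_abs, vdd):
    """pMOS pass transistor: the output cannot fall below v_gate + |vtp|."""
    return max(v_in, min(v_gate + vtp_abs, vdd))

def transmission_gate(v_in, enable, vdd, vtn, vtp_abs):
    """nMOS and pMOS in parallel with complementary gate signals."""
    if not enable:
        return None  # both devices off: output is floating
    n_out = nmos_pass(v_in, vdd, vtn)           # nMOS gate at VDD
    p_out = pmos_pass(v_in, 0.0, vtp_abs, vdd)  # pMOS gate at 0 V
    # The device that can reach v_in finishes the job for the other one.
    return min((n_out, p_out), key=lambda v: abs(v - v_in))

VDD, VTN, VTP_ABS = 5.0, 1.0, 1.0  # assumed values, for illustration only

print(nmos_pass(VDD, VDD, VTN))                         # degraded '1'
print(pmos_pass(0.0, 0.0, VTP_ABS, VDD))                # degraded '0'
print(transmission_gate(VDD, True, VDD, VTN, VTP_ABS))  # full '1'
print(transmission_gate(0.0, True, VDD, VTN, VTP_ABS))  # full '0'
```

In this model the nMOS device alone passes a weak '1' and the pMOS device alone passes a weak '0', while the parallel pair passes both levels fully.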
Advantages
1) Transmission gates eliminate the signal degradation in the output logic levels.
2) Transmission gate consists of two transistors in parallel and except near the positive and
negative rails.
Disadvantages
Figure 3.14 shows a 2-input multiplexer circuit using CMOS transmission gate.
Figure 3.14 2-input multiplexer circuit using CMOS transmission gate
If the control input S is low, TG0 conducts and the output F is equal to A. On the other hand, if the
control input S is high, TG1 conducts and the output F is equal to B.
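The multiplexer behavior described above amounts to exactly one transmission gate conducting at a time. A minimal behavioral sketch, treating the gates as ideal switches (the function name `tg_mux2` is illustrative):

```python
def tg_mux2(a: int, b: int, s: int) -> int:
    """2-input multiplexer built from two ideal transmission gates."""
    tg0_on = (s == 0)  # TG0 passes A when S is low
    tg1_on = (s == 1)  # TG1 passes B when S is high
    assert tg0_on != tg1_on, "exactly one transmission gate should conduct"
    return a if tg0_on else b

for s in (0, 1):
    for a in (0, 1):
        for b in (0, 1):
            print(f"S={s} A={a} B={b} -> F={tg_mux2(a, b, s)}")
```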
CMOS suffers from increased area and correspondingly increased capacitance and delay, as the logic
gates become more complex. For this reason, designers developed circuits (Alternate gate circuits) that
can be used to supplement the complementary type circuits . These forms are not intended to replace
CMOS but rather to be used in special applications for special purposes.
3.5.1 Pseudo nMOS Logic
Pseudo nMOS logic is one type of alternate gate circuit that is used as a supplement for the
complementary MOS logic circuits. In the pseudo-nMOS logic, the pull up network (PUN) is realized by
a single pMOS transistor. The gate terminal of the pMOS transistor is connected to the ground. It remains
permanently in the ON state. Depending on the input combinations, output goes low through the PDN.
Figure 3.15 shows the general building block of logic circuits that follows pseudo nMOS logic.
Here, only the nMOS logic (Qn) is driven by the input voltage, while the gate of p-transistor(Qp) is
connected to ground or substrate and Qp acts as an active load for Qn. Except for the load device, the
pseudo-nMOS gate circuit is identical to the pull-down network(PDN) of the complementary CMOS gate.
Figure 3.16 shows the realization of logic circuits using pseudo-nMOS logic.
Advantages
Disadvantages
1) The main drawback of using a pseudo-nMOS gate instead of a CMOS gate is that the always-on
pMOS load conducts a steady current when the output voltage is lower than VDD.
2) Layout problems are critical.
The gate (clock ø) defines two phases, evaluation and precharge phase during each clock cycle.
Working
When clock ø = 0 the circuit is in the precharge phase, with the pMOS device Mp ON and the
evaluation nMOS Mn OFF. This establishes a conducting path between VDD and the output,
allowing Cout to charge to a voltage Vout = VDD. Mp is often called the precharge FET.
When clock ø = 1 the circuit is in the evaluation phase, with the pMOS device Mp OFF and the
evaluation nMOS Mn ON. If the logic block acts like a closed switch, Cout can discharge
through the logic array and Mn; this gives a final result of Vout = 0 V, which is logically an output of F
= 0. If the logic block acts like an open switch, Cout holds its charge, so that Vout = VDD and
F = 1. In that case, charge leakage eventually drops the output towards Vout = 0 V, which could be
an incorrect logic value.
A logic block formed by three series-connected FETs (a 3-input NAND gate) is shown in Figure
3.18.
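The precharge and evaluation phases for the 3-input NAND case can be sketched as follows. This is an idealized model with no leakage or charge sharing. VDD = 5 V is an assumed value and the function names are illustrative.

```python
VDD = 5.0  # assumed supply voltage, for illustration only

def precharge() -> float:
    """Clock = 0: Mp is ON, Mn is OFF, so Cout charges to VDD."""
    return VDD

def evaluate(v_out: float, a: int, b: int, c: int) -> float:
    """Clock = 1: Mp is OFF, Mn is ON.

    The three series nFETs act as a closed switch only when A = B = C = 1,
    in which case Cout discharges; otherwise the precharged value is held.
    """
    logic_block_closed = bool(a and b and c)
    return 0.0 if logic_block_closed else v_out

def clock_cycle(a: int, b: int, c: int) -> int:
    v_out = precharge()
    v_out = evaluate(v_out, a, b, c)
    return 1 if v_out > VDD / 2 else 0

for inputs in [(0, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]:
    print(inputs, "->", clock_cycle(*inputs))
```

Note that `evaluate` can only lower the output or leave it unchanged; nothing in the evaluation phase can raise it, which is the property behind the cascading problem of dynamic logic.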
Dynamic CMOS logic circuits have a serious problem when they are cascaded. In the precharge phase
(ø = 0), the outputs of all the stages are precharged to logic high. In the evaluation phase (ø = 1), the outputs
of all stages are evaluated simultaneously. Suppose that in the first stage the inputs are such that the output is
logic low after the evaluation. In the second stage, the output of the first stage is one input and there are
other inputs. If the other inputs of the second stage are such that its output discharges to logic low, then
the evaluated output of the first stage can never make the output of the second stage logic high. This is
because, by the time the first stage has been evaluated, the output of the second stage has already discharged, since
evaluation happens simultaneously. Remember that the output cannot be charged to logic high in the
evaluation phase (ø = 1, pMOSFET in PUN is OFF); it can only be retained at logic high, depending
on the inputs.
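The cascading failure described above can be reproduced with a deliberately pessimistic sketch. It assumes the second stage reacts to the precharged output of the first stage before the first stage has finished evaluating. The two-stage circuit (a dynamic inverter feeding a dynamic 2-input NAND) and all names are hypothetical, chosen only to illustrate the argument.

```python
def cascaded_dynamic(x: int, y: int):
    """Two directly cascaded dynamic stages, worst-case timing.

    Stage 1: dynamic inverter of x.
    Stage 2: dynamic NAND of (out1, y).
    """
    out1, out2 = 1, 1  # precharge: both outputs are HIGH

    # Evaluation starts in both stages at once, so stage 2 still sees the
    # precharged out1 = 1 and may discharge before out1 settles.
    if out1 and y:
        out2 = 0  # cannot be recharged during evaluation
    if x:
        out1 = 0  # stage 1 settles to its correct value
    return out1, out2

def ideal_cascade(x: int, y: int):
    """Reference: the static logic values the cascade should produce."""
    out1 = 1 - x
    out2 = 1 - (out1 & y)
    return out1, out2

for x in (0, 1):
    for y in (0, 1):
        print(x, y, cascaded_dynamic(x, y), ideal_cascade(x, y))
```

In this model the two results differ only for x = 1, y = 1: the first stage correctly evaluates low, but the second stage has already discharged and cannot return to logic high.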
Advantages
1) Low power dissipation.
2) Large noise margin.
3) Small area due to fewer transistors.
3.5.3 CMOS domino logic
Standard CMOS logic gates need a PMOS and an NMOS transistor for each logic input. The pMOS
transistors require a greater area than the nMOS transistors carrying the same current. So, a large chip area
is necessary to perform complex logic operations. The package density in CMOS is improved if a
dynamic logic circuit, called the domino CMOS logic circuit, is used.
Domino CMOS logic is slightly modified version of the dynamic CMOS logic circuit. In this case, a static
inverter is connected at the output of each dynamic CMOS logic block. The addition of the inverter solves
the problem of cascading of dynamic CMOS logic circuits.
The circuit diagram of a domino CMOS logic structure is shown in Figure 3.19.
A domino CMOS AND-OR gate that realizes the function y = AB + CD is depicted in the figure. The
left-hand part of the circuit, containing Mn, Mp, T1, T2, T3, and T4, forms an AND-OR-INVERT (AOI)
gate. It drives the static CMOS inverter formed by N2 and P2 in the right-hand part of the circuit. The
domino gate is activated by the single-phase clock ø applied to the nMOS (Mn) and the pMOS (Mp)
transistors. The load on the AOI part of the circuit is the parasitic load capacitance.
Working
When ø = 0, Mp is ON and Mn is OFF, so that no current flows in the AND-OR paths of the AOI.
The capacitor CL is charged to VDD through Mp since the latter is ON. The input to the inverter
Note : Logic input can change only when ø = 0. No changes of the inputs are permitted when ø = 1 since
a discharge path may occur.
Advantages
1) Smaller areas compared to conventional CMOS logic.
2) Parasitic capacitances are smaller so that higher operating speeds are possible.
3) Operation is free of glitches since each gate can make one transition.
Disadvantages
1) Only non-inverting structures are possible because of the presence of the inverting buffer.
2) Charge distribution may be a problem.
3.5.4 Clocked CMOS logic
The clocked CMOS logic is also referred to as C²MOS logic. Figure 3.20 shows the general
arrangement of clocked CMOS (C²MOS) logic. A pull-up p-block and a complementary pull-down
n-block, built from p- and n-transistors respectively, are used to implement the clocked CMOS logic
shown in the figure. However, the logic in this case is connected to the output only during the ON period of
the clock. The figure also shows a clocked inverter circuit, which belongs to the clocked CMOS logic family.
Slower rise and fall times can be expected owing to the extra transistors in series with the
output.
When ø = 1 the circuit acts as an inverter, because transistors Q3 and Q4 are ‘ON’. It is said to be
in the “evaluation mode”. Therefore the output Z can change from its previous value.
When ø = 0 the circuit is in hold mode, because transistors Q3 and Q4 are ‘OFF’. It is
said to be in the “precharge mode”. Therefore the output Z retains its previous value.
Working
During the precharge phase ø = 0, the outputs of the n-tree gates, OUT1 and OUT3, are charged to
VDD, while the output of the p-tree gate, OUT2, is predischarged to 0 V. Since the n-tree gate
connects pMOS pull-up devices, the PUN of the p-tree is turned off at that time.
During the evaluation phase ø = 1, the outputs (OUT1, OUT3) of the n-tree gates can only make a
1-0 transition, conditionally turning on some transistors in the p-tree. This ensures that no
accidental discharge of OUT2 can occur.
Similarly n-tree blocks can follow p-tree gates without any problems, because the inputs to the
n-gate are pre charged to 0.
Disadvantages
Here, the p-tree blocks are slower than the n-tree modules, due to the lower current drive of the pMOS
transistors in the logic network.
the delay over the pair will be constant irrespective of the sense of the logic level transition of the input to
the first .
In general, the delay through a pair of similar nMOS inverters is
Td = (1 + Zp.u/Zp.d) τ
Assume that τ = 0.3 ns and Zp.u/Zp.d = 4.
Then, Td = (1 + 4) τ
= 5τ = 1.5 ns
Thus, the inverter pair delay for inverters having a 4:1 ratio is 5τ.
Figure 3.23 shows the CMOS inverter pair delay, that is, the theoretical delay associated with a pair of
lambda-based inverters built from both n- and p-transistors. Here the gate capacitance is double that of
the nMOS inverter, since the input to a CMOS inverter is connected to both transistor gates.
NOTE: The asymmetry (unevenness) of the resistance values can be eliminated by increasing the width
of the p-device channel by a factor of two or three, but the gate capacitance of the p-transistor then
also increases by the same factor.
This current charges CL and since its magnitude is approximately constant, we have
Substitute the value of Idsp in above equation and then the rise time is
So that the rise time is slower by a factor of 2.5 when using minimum size devices for both n & p.
In order to achieve symmetrical operation using minimum channel length we need to make Wp
= 2.5 Wn.
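The steps referred to above can be sketched as follows. This is a reconstruction under the usual square-law assumptions, not necessarily the exact equations of these notes: the p-device is assumed to stay in saturation for the whole transition, V_gs = V_DD, and |V_tp| = 0.2 V_DD is an assumed typical value.

```latex
\begin{align}
I_{dsp} &= \frac{\beta_p \left(V_{gs} - |V_{tp}|\right)^2}{2}
  && \text{saturation current of the p-device} \\
V_{out} &= \frac{I_{dsp}\, t}{C_L}
  && \text{constant current charging } C_L \\
\tau_r &= \frac{2\, C_L V_{DD}}{\beta_p \left(V_{DD} - |V_{tp}|\right)^2}
        = \frac{2\, C_L}{0.64\, \beta_p V_{DD}}
        \approx \frac{3\, C_L}{\beta_p V_{DD}}
  && \text{with } |V_{tp}| = 0.2\, V_{DD} \\
\tau_f &\approx \frac{3\, C_L}{\beta_n V_{DD}}
  && \text{by the same argument for the n-device} \\
\frac{\tau_r}{\tau_f} &= \frac{\beta_n}{\beta_p} \approx \frac{\mu_n}{\mu_p} \approx 2.5
  && \text{for equal device geometries}
\end{align}
```

Under these assumptions the factor of 2.5 is simply the electron-to-hole mobility ratio, which is why widening the p-device so that W_p = 2.5 W_n equalizes β_p and β_n and restores symmetrical rise and fall times.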
Where CL denotes the off-chip load. Capacitances of this order must be driven through low
resistances, otherwise excessively long delays will occur. A large capacitance is presented at the input,
which in turn slows down the rate of change of voltage at the input.
In MOS circuits, low resistance values imply low L:W ratios. Since the length L cannot be reduced
below the minimum feature size, the channels must be made very wide to reduce the resistance value.
Consider N cascaded inverters, each one wider than the previous stage by a width factor f, as shown
in the figure.
As the width factor increases, the capacitive load presented at the inverter input increases and the area
occupied also increases. As the width factor increases, the number N of stages needed to
drive a particular value of CL decreases. Thus with large f, N decreases but the delay per stage increases for 4:1
nMOS inverters.
Determine the value of f which will minimize the overall delay for a given value of y, where $y = f^N$.
$\ln(y) = \ln(f^N)$
$\ln(y) = N \ln(f)$
$N = \ln(y) / \ln(f)$
For N even, total delay $= 3.5 N f \tau$ (CMOS)
Delay $\propto N f \tau = \dfrac{\ln(y)}{\ln(f)} \, f \tau$
It can be shown that total delay is minimized if f assumes the value of e for both CMOS and nMOS
inverters.
Assume f = e. Then $N = \ln(y) / \ln(e) = \ln(y)$
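The claim that f = e minimizes the total delay can be checked numerically. The sketch below evaluates the quantity (ln y / ln f)·f, in units of τ and with the constant of proportionality dropped, over a grid of width factors. The value y = 1000 and the grid itself are arbitrary illustrative choices.

```python
import math

def relative_delay(f: float, y: float) -> float:
    """Total delay in units of tau, up to a constant: N * f with N = ln(y)/ln(f)."""
    n_stages = math.log(y) / math.log(f)
    return n_stages * f

Y = 1000.0  # hypothetical overall load ratio, for illustration only

# Scan width factors from 2.00 to 4.00 in steps of 0.01.
candidates = [2.0 + 0.01 * i for i in range(201)]
best_f = min(candidates, key=lambda f: relative_delay(f, Y))

print(f"best f on grid = {best_f:.2f}, e = {math.e:.4f}")
print(f"stages needed at f = e: N = ln(y) = {math.log(Y):.2f}")
```

The minimum does not depend on y, since y only scales the curve f / ln(f). In practice N must be an integer, so a real design would round N and accept a width factor somewhat away from e.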
Consider a positive-going (0 to 1) transition at the input Vin, which turns ON the inverter formed by T1 and
T2.
With a small delay, the gate of T3 is pulled down to 0 volts. Thus, device T3 is cut off. Since
the gate of T4 is connected to Vin, it is turned ON and the output is pulled down very fast.
For the opposite transition of Vin (1 to 0), Vin drops to 0 volts. The gate of transistor T3 is
allowed to rise to VDD quickly.
Simultaneously, the low Vin turns off T4 very fast. This makes T3 conduct with its gate voltage
approximately equal to VDD.
This gate voltage is twice the average voltage that would appear if the gate were connected to the
source as in the conventional nMOS inverter.
Now, as Ids ∝ Vgs, doubling the effective Vgs increases the current and thereby reduces the delay in
charging the load capacitor at the output. The result is a more symmetrical transition.
Figure 3.28 shows the non-inverting nMOS super buffer where the structures fabricated in 5µm
technology are capable of driving capacitance of 2pF with a rise time of 5nsec.
1. In BiCMOS technology we use bipolar transistor drivers as the output stage of inverter and
logic gate circuits.
2. In bipolar transistors, there is an exponential dependence of the collector (output) current on the
base to emitter (input) voltage Vbe .
3. Hence, the bipolar transistors can be operated with much smaller input voltage swings than
MOS transistors and still switch large current.
4. Another consideration in bipolar devices is the effect of temperature on the input voltage Vbe.
5. In a bipolar transistor, Vbe is logarithmically dependent on the collector current IC and also on other
parameters such as base width, doping level and electron mobility.
6. Now, the temperature differences across an IC are not very high. Thus the Vbe values of the
bipolar devices spread over the chip remain nearly the same and do not differ by more than a few
millivolts.
The switching performance of a bipolar transistor driving a capacitive load can be analyzed, to begin with,
with the help of the equivalent circuit shown in Figure 3.29.
The time ∆t required to change the output voltage Vout by an amount equal to the input voltage is
∆t = CL/gm
Where,
CL is the load capacitance
gm is the transconductance of the bipolar transistor
The value of ∆t is small because the transconductance of bipolar transistors is relatively high. The
delay due to the bipolar transistors has two main components, Tin and TL.
Tin is the time required to first charge the base-emitter junction of the bipolar (npn) transistor.
This time is typically 2ns for the BiCMOS transistor base driver.
For the CMOS driver the time required to charge the input gate capacitance is 1ns.
TL is the time required to charge the output load capacitance.
The combined effect of Tin and TL is represented as shown in Figure 3.30.
The delay for a BiCMOS inverter is reduced by a factor of hfe as compared with a CMOS inverter.
The devices thus have high β, high gm, high hfe and low RC. The presence of such efficient and
advantageous devices on chip offers a great deal of scope and freedom to the VLSI designer.
Propagation delays
Propagation delay is the time taken for a change of logic state at the input to produce the
corresponding change at the output.
(i) Cascaded pass transistors
Figure 3.32 shows a chain of four pass transistors driving a capacitive load CL. All the gates are supplied
by VDD so that a signal can propagate to the output. The lumped RC equivalent circuit is shown in the figure,
where each transistor is modeled by a series resistance and a capacitance representing the gate-to-channel
capacitance and stray capacitances. The minimum value of R is the turned-ON resistance of each
enhancement mode pass transistor.
The current through the capacitance at the node with voltage V2 is
C (dV2 / dt ) ≈ C.∆V2/ ∆t
The current entering at this node is I1 = (V1 – V2)/R and
the current leaving from this node is I2 = (V2 – V3)/R.
By applying KCL at this node
IC = I1 – I2
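Written out in full, the KCL step above combines the capacitor current with the two resistor currents into a single node equation. This is only the algebra implied by the three expressions already given, with no new assumptions:

```latex
\[
C\,\frac{dV_2}{dt} \;=\; I_1 - I_2
\;=\; \frac{(V_1 - V_2) - (V_2 - V_3)}{R}
\;=\; \frac{V_1 - 2V_2 + V_3}{R}
\]
```

The right-hand side is a discrete second difference of the node voltages, which is why the delay of a long chain grows faster than linearly with the number of sections.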
As the number of sections in the network increases, the circuit parameters become distributed. Let
R and C denote the resistance per unit length and the capacitance per unit length respectively.
The analysis can be simplified if the sheet resistance RS and the gate-to-channel capacitance □Cg of all
sections are lumped together:
$R_{total} = n\,r\,R_S$
$C_{total} = n\,c\,\Box C_g$
Where r gives the relative resistance per section in terms of RS and c gives the relative capacitance per
section in terms of □Cg. Then the overall delay for n sections is given by
$\tau_p = n^2\, r\, c\,(\tau)$
It can be shown that the signal delay in a section containing N identical pass transistors driving a matched
load (CL = Cg) is
τp = 0.7 * N(N+1)/2 * R CL
For large values of N, the quantity (N + 1) can be replaced by N, so the delay grows roughly as the square
of N. Since the delay increases so quickly with N, the number of pass transistors in a cascade is restricted
to 4. A cascade of more pass transistors will produce a very slow circuit, and the signal needs to be
restored by an inverter after every three or four pass transistors.
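The quadratic growth of the delay can be seen by evaluating the expression above for a few chain lengths. This is a small sketch only; the resistance and capacitance values are assumed for illustration and do not come from the notes.

```python
def pass_chain_delay(n, r, c_load):
    """Delay of n identical cascaded pass transistors: 0.7 * n(n+1)/2 * R * CL."""
    return 0.7 * n * (n + 1) / 2 * r * c_load

R = 10e3      # assumed ON resistance per pass transistor, in ohms
CL = 10e-15   # assumed matched load capacitance, in farads

for n in (1, 2, 4, 8):
    print(n, pass_chain_delay(n, R, CL))
```

Doubling the chain from 4 to 8 transistors multiplies the delay by 3.6 rather than 2, which is the motivation for inserting a restoring inverter after every three or four devices.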
Capacitance due to fringing field effects can be a major component of the overall capacitance of
interconnect wires. For fine line metallization, the value of the fringing field capacitance (Cff) can be of
the same order as that of the area capacitance. Thus, Cff should be taken into account if accurate
prediction of performance is needed.
t = thickness of wire
Cw = Carea + Cff
From the definition of capacitance itself, it can be said that there exists a capacitance between the layers
due to parallel plate effects. This capacitance will depend upon the layout i.e., where the layers cross or
whether one layer underlies another etc., by the knowledge of these capacitances, the accuracy of circuit
modeling and delay calculations will be improved. It can be readily calculated for regular structures.
1. The source and drain p-diffusion regions form junctions with the n-substrate (or n-well) at well
defined and uniform depths.
2. Similarly, the source and drain n-diffusion regions form junctions with the p-substrate (or p-well)
at well defined and uniform depths.
3. Hence, for diffusion regions, each diode thus formed has a peripheral (side wall)
capacitance associated with it.
4. As a whole, the peripheral capacitance Cp will be of the order of pF per unit length, so its value will be
greater than Carea of the diffusion region to substrate.
However, where the n and p active regions are formed by impurity implants at the surface of the silicon, as
in the Orbit processes, they have negligible depth. Hence Cp is quite negligible in them.
An additional input to a CMOS logic gate requires an additional nMOS and pMOS i.e., two
additional transistors, while incase of other MOS logic gates, it requires one additional
transistor.
In CMOS logic gates, these additional transistors increase not only the chip area but also the total
effective capacitance per gate, and hence the propagation delay increases.
Some of the increase in propagation delay time can be compensated by the size-scaling method.
By increasing the size of the device, its current driving capability can be preserved.
Because both the number of inputs and the device sizes increase, the capacitance increases. Hence the
propagation delay will still increase with fan-in.
An increase in the number of outputs of a logic gate directly adds to its load capacitances.
Hence, the propagation delay increases with fan-out
3.10 CHOICE OF LAYERS
The following are the constraints which must be considered for the proper choice of layers.
1. Since the polysilicon layer has relatively high specific resistance (R S), it should not be used for
routing VDD and VSS (GND) except for small distances.
2. VDD and GND (VSS) must be distributed only on metal layers, due to the consideration of Rs
value.
3. The capacitive effects will also impose certain restrictions on the choice of layers as follows
(i) The restrictions matter most where fast signal lines are required, particularly for signals on
wiring which has relatively high values of RS.
(ii) The diffusion areas have higher values of capacitance to substrate and are harder to drive.
4. Over small equipotential regions, the signal on a wire can be treated as being identical at all
points.
5. Within each region the propagation delay of the signal will be comparatively smaller than the gate
delays and the signal delays caused in a system connected by wires.
Thus the wires in a MOS system can be modeled as simple capacitors. This concept leads to the
establishment of electrical rules (guidelines) for communication paths(wires) as given in Table 3.2.
Silicide 2,000λ NA NA
CMOS system design consists of partitioning the system into subsystems of the
types listed above. Many options exist that make trade-offs between speed, density,
programmability, ease of design, and other variables. This chapter addresses design
options for common datapath operators. The next chapter addresses arrays, especially
those used for memory. Control structures are most commonly coded in a hardware
description language and synthesized. Datapath operators benefit from the structured
design principles of hierarchy, regularity, modularity, and locality. They may use N
identical circuits to process N-bit data. Related data operators are placed physically
adjacent to each other to reduce wire length and delay. Generally, data is arranged to
flow in one direction, while control signals are introduced in a direction orthogonal to the
dataflow. Common datapath operators considered in this chapter include adders, one/zero
detectors, comparators, counters, shifters, ALUs, and multipliers.
4.1.2 Shifters
Consider a direct MOS switch implementation of a 4 x 4 crossbar switch as shown in
Figure 4.1. The arrangement is quite general and may be readily expanded to
4.1.3 Adders
Addition is one of the basic operations performed in various processing tasks such as counting,
multiplication and filtering. Adders can be implemented in various forms to suit different
speed and density requirements. The truth table of a binary full adder is shown in Table
4.1, along with some functions that will be of use during the discussion of adders. Adder
inputs: A, B
The direct implementation of the above equations is shown in Figure 4.3 using a gate
schematic, and at the transistor level in Figure 4.4.
The full adder employs 32 transistors (6 for the inverters, 10 for the carry circuit, and 16
for the 3-input XOR). A more compact design is based on the observation that S can be
factored to reuse the CARRY term as follows:
Such a design is shown at the transistor levels in Figure 4.5 and uses only 28 transistors.
Note that the pMOS network is complement to the nMOS network.
Here Cin=C
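The factored sum expression itself is not reproduced above. One well-known factorization that matches the description (S rebuilt from the CARRY term) is S = A·B·C + (A + B + C)·CARRY', with CARRY = A·B + C·(A + B). The sketch below assumes these two forms, which are standard but not taken from the notes, and checks them against the full adder truth table.

```python
from itertools import product

def full_adder(a, b, c):
    """Return (s, carry) using a sum expression that reuses the CARRY term."""
    carry = (a & b) | (c & (a | b))                  # CARRY = AB + C(A + B)
    s = (a & b & c) | ((a | b | c) & (1 - carry))    # S = ABC + (A + B + C).CARRY'
    return s, carry

# Exhaustive check: the two output bits must encode the arithmetic sum.
for a, b, c in product((0, 1), repeat=3):
    s, carry = full_adder(a, b, c)
    assert 2 * carry + s == a + b + c
```

Reusing CARRY means the sum output is produced after the carry, so this form trades some sum delay for fewer transistors.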
The above two equations can be written in terms of two new signals Pi and Gi, which are
shown in Figure 4.6.
ci+1 = Gi + Pi.ci
si = Pi ⊕ ci
Where,
Gi = ai.bi
Pi = (ai ⊕ bi)
Pi and Gi are called the carry propagate and carry generate terms, respectively. Notice that
the generate and propagate terms depend only on the input bits and thus will be valid
after one and two gate delays, respectively. If one uses the above expression to calculate
the carry signals, one does not need to wait for the carry to ripple through all the
previous stages to find its proper value. Let’s apply this to a 4-bit adder to make it clear.
Putting i = 0, 1, 2, 3 in ci+1 = Gi + Pi.ci we get
c1 = G0 + P0.c0
c2 = G1 + P1.G0 + P1.P0.c0
c3 = G2 + P2.G1 + P2.P1.G0 + P2.P1.P0.c0
c4 = G3 + P3.G2 + P3.P2.G1 + P3.P2.P1.G0 + P3.P2.P1.P0.c0
Notice that the carry-out bit, ci+1, of the last stage will be available after four delays: two
gate delays to calculate the propagate signals and two delays as a result of the gates
required to implement Equation c4.
Figure 4.7 shows that a 4-bit CLA is built using gates to generate the Pi and Gi
signals and a logic block to generate the carry-out signals according to Equations c1 to c4.
Logic gate and transistor level implementations of the carry bits are shown in Figure 4.8.
The disadvantage of CLA is that the carry logic block gets very complicated for more
than 4-bits. For that reason, CLAs are usually implemented as 4-bit modules and are used
in a hierarchical structure to realize adders that have multiples of 4-bits.
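The lookahead equations c1 to c4 can be exercised directly in software. The following is a behavioural sketch of one 4-bit CLA module, not a gate-level model; the function name and argument layout are illustrative assumptions.

```python
def cla4(a, b, c0=0):
    """Add two 4-bit numbers with carry lookahead; return (sum, carry_out)."""
    abit = [(a >> i) & 1 for i in range(4)]
    bbit = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(abit, bbit)]   # generate:  Gi = ai.bi
    p = [x ^ y for x, y in zip(abit, bbit)]   # propagate: Pi = ai xor bi

    # Every carry is computed from G, P and c0 only (no rippling).
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    s = sum((p[i] ^ carries[i]) << i for i in range(4))   # si = Pi xor ci
    return s, c4

print(cla4(9, 7))   # 9 + 7 = 16, so the 4-bit sum is 0 with carry-out 1
```

The widest product term in c4 already needs five inputs, which illustrates why the carry logic is normally limited to 4-bit modules.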
transmission gate. If the carry path is precharged to VDD, the transmission gate is then
reduced to a simple NMOS transistor. In the same way the PMOS transistors of the carry
generation is removed. One gets a Manchester cell as shown in the Figure 4.10.
Figure 4.10 Manchester cell
The Manchester cell is very fast, but a large set of such cascaded cells would be slow.
This is due to the distributed RC effect and the body effect making the propagation time
grow with the square of the number of cells. Practically, an inverter is added every four
4.1.4 Multipliers
In many digital signal processing operations - such as correlations, convolution,
filtering, and frequency analysis - one needs to perform multiplication. The most basic
form of multiplication consists of forming the product of two positive binary numbers.
This may be accomplished through the traditional technique of successive additions and
shifts, in which each addition is conditional on one of the multiplier bits. Here is an
example of 4-bit multiplication given in the Figure 4.12.
The multiplication process may be viewed to consist of the following two steps:
It should be noted that binary multiplication is equivalent to a logical AND operation.
Thus evaluation of partial products consists of the logical ANDing of the multiplicand
and the relevant multiplier bit. Each column of partial products must then be added and,
if necessary, any carry values passed to the next column.
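The two ideas above, partial products formed by ANDing and their accumulation with the appropriate shifts, can be sketched as follows. This is a behavioural illustration for unsigned operands only; the function name and the default width of 4 bits are assumptions.

```python
def shift_add_multiply(x, y, n=4):
    """Multiply two unsigned n-bit numbers by summing shifted partial products."""
    product = 0
    for j in range(n):
        yj = (y >> j) & 1
        # Partial product j: every multiplicand bit ANDed with multiplier bit yj.
        partial = sum((((x >> i) & 1) & yj) << i for i in range(n))
        # Shift the partial product into column position j and accumulate.
        product += partial << j
    return product

print(shift_add_multiply(11, 13))   # 143
```

A hardware multiplier performs the same additions with an array or tree of adders instead of a loop.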
There are a number of techniques that may be used to perform multiplication. In general,
the choice is based on factors such as speed, throughput, numerical accuracy, and area.
As a rule, multipliers may be classified by the format in which data words are accessed,
namely:-
1. Serial form
2. serial/parallel form
3. Parallel form
$(n - 1)^2$ full adders,
$n - 1$ half adders, and
$n^2$ AND gates.
The worst-case delay associated with such a multiplier is $(2n + 1)\,t_g$, where $t_g$ is the worst-case
adder delay.
Figure 4.13 shows a cell that may be used to construct a parallel multiplier.
The xi term is propagated diagonally from top right to bottom left, while the yj term is
propagated horizontally. Incoming partial products enter at the top. Incoming CARRY IN
values enter at the top right of the cell. The bit-wise AND is performed in the cell, and
the SUM is passed to the next cell below. The CARRY OUT is passed to the bottom left
of the cell.
Figure 4.14 depicts the multiplier array with the partial products enumerated. The
multiplier can be drawn as a square array, as shown here, which is the most convenient form
for implementation. In this version the degeneration of the first two rows of the multiplier
is shown. The first row of multiplier adders has been replaced with AND gates,
while the second row employs half-adders rather than full adders. This optimization
might not be done if a completely regular multiplier were required (i.e. one array cell). In
this case the appropriate inputs to the first and second rows would be connected to ground,
as shown earlier. An adder with equal carry and sum propagation times is
advantageous, because the worst-case multiply time depends on both paths.
An example of the implementation of a 4x4 (4-bit) multiplier using the Wallace Tree Multiplication
method is given in Figure 4.16.
Figure 4.16 Product terms
Considering the product P3, it may be seen that it requires the summation of four partial
products and a possible column carry from the summation of P2 as shown in Figure 4.17
Figure 4.17 Wallace Tree Multiplication for 4-bits
Consider the 6 x 6 multiplication table. Considering the product P5, it may be seen that it
requires the summation of six partial products and a possible column carry from the
summation of P4. Here we can see the adders required in a multiplier based on this style
of addition.
The adders have been arranged vertically into ranks that indicate the time at which the adder
output becomes available. While this small example shows the general Wallace addition
technique, it does not show the real speed advantage of a Wallace tree. There is an
identifiable “array part”, and a CPA part, which is at the top right. While this has been
shown as a ripple-carry adder, any fast CPA
can be used here. The delay through the array addition (not including the CPA) is
proportional to $\log_{1.5}(n)$, where n is the width of the Wallace tree. Wallace Tree
Multiplication for 6-bits is shown in Figure 4.20.
presence of error in bit information. External noise and loss of signal strength cause loss
of data bit information while transporting data from one device to another,
located inside the computer or externally. To indicate any occurrence of error, an extra bit,
called the parity bit, is included with the message according to the total number of 1s in a
set of data. If the extra bit is 0 when the total number of 1s is even and 1 when the number
of 1s is odd, the scheme is called even parity.
On the other hand, if the extra bit is 1 for an even number of 1s and 0 for an odd
number of 1s, it is called odd parity. A parity generator is a combinational logic
system that generates the parity bit at the transmitting side. Table 4.4 shows the truth table
for generating even and odd parity bits.
Table 4.4 Truth table for generating even and odd parity bit
If the message bit combination is designated as D3D2D1D0, and Pe and Po are the even and
odd parity bits respectively, then it is obvious from the table that the Boolean expressions of
even parity and odd parity are
$P_e = D_3 \oplus D_2 \oplus D_1 \oplus D_0$
$P_o = \overline{D_3 \oplus D_2 \oplus D_1 \oplus D_0}$
The above illustration is given for a message with four bits of information. However, the
logic diagrams can be expanded with more XOR gates for any number of bits. Figure 4.21
shows the Even parity generator using logic gates. Figure 4.22 shows the Odd parity
generator logic gates
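The XOR chain extends to any number of message bits, as noted above. A minimal sketch follows, assuming the even parity bit is the XOR of all data bits and the odd parity bit is its complement; the function names are illustrative.

```python
from functools import reduce
from operator import xor

def even_parity(bits):
    """Parity bit that makes the total number of 1s (data + parity) even."""
    return reduce(xor, bits, 0)

def odd_parity(bits):
    """Parity bit that makes the total number of 1s (data + parity) odd."""
    return 1 - even_parity(bits)

message = [1, 0, 1, 1]          # D3 D2 D1 D0
print(even_parity(message), odd_parity(message))   # 1 0
```

At the receiver the same XOR chain applied to the data bits plus the parity bit acts as the parity checker.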
Detecting all ones or zeros on wide N-bit words requires large fan-in AND or NOR
gates. Recall that by DeMorgan’s law, AND, OR, NAND, and NOR are fundamentally the
same operation except for possible inversions of the inputs and/or outputs. You can build a
tree of AND gates, as shown in Figure 4.23. Here, alternate NAND and NOR gates have
been used. The path has log N stages.
Figure 4.23 One/zero detectors (a) All one detector (b) All zero detector (c) All zero detector
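The logarithmic depth of the tree can be illustrated with a pairwise reduction. This sketch assumes 2-input gates throughout and models only the logic function and the number of levels, not the alternating NAND/NOR implementation in the figure.

```python
def all_ones(bits):
    """Reduce pairwise with 2-input AND gates; return (result, number_of_levels)."""
    level = list(bits)
    depth = 0
    while len(level) > 1:
        nxt = [level[i] & level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # an unpaired bit passes straight to the next level
        level = nxt
        depth += 1
    return level[0], depth

def all_zeros(bits):
    """All-zero detection is all-one detection on the inverted inputs."""
    return all_ones([1 - b for b in bits])

print(all_ones([1] * 8))    # (1, 3): three levels for an 8-bit word
print(all_zeros([0] * 16))  # (1, 4): four levels for a 16-bit word
```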
4.1.7 Comparators
Another common and very useful combinational logic circuit is that of the Digital
Comparator circuit. Digital or Binary Comparators are made up from standard AND,
NOR and NOT gates that compare the digital signals present at their input terminals and
produce an output depending upon the condition of those inputs.
For example, along with being able to add and subtract binary numbers we need to be
able to compare them and determine whether the value of input A is greater than, smaller
than or equal to the value at input B etc. The digital comparator accomplishes this using
several logic gates that operate on the principles of Boolean Algebra. There are two main
types of Digital Comparator available and these are.
Identity Comparator: An Identity Comparator is a digital comparator that has only one output
terminal, for A = B, which is either “HIGH” (A = B is 1) or “LOW” (A = B is 0).
Magnitude Comparator: A Magnitude Comparator is a type of digital comparator that
has three output terminals, one each for equality (A = B), greater than (A > B) and less than
(A < B).
The purpose of a Digital Comparator is to compare a set of variables or unknown numbers,
for example A (A1, A2, A3, ..., An) against that of a constant or unknown value such as B
(B1, B2, B3, ..., Bn) and produce an output condition or flag depending upon the result of
the comparison. For example, a magnitude comparator with two 1-bit inputs (A and B) would
produce the following three output conditions when the inputs are compared to each other.
A > B, A = B, A < B
Inputs    Outputs
B  A      A>B  A=B  A<B
0  0       0    1    0
0  1       1    0    0
1  0       0    0    1
1  1       0    1    0
The logic diagram of 1-bit comparator using basic gates is shown below in Figure 4.24.
*** Draw separate diagrams for greater, equality and less than expressions.
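The three output expressions of the 1-bit comparator can be written directly from the truth table. The sketch below assumes the usual gate forms A·B' for greater than, A'·B for less than, and the NOR of those two (an XNOR of A and B) for equality.

```python
def compare_1bit(a, b):
    """Return (gt, eq, lt) for single-bit inputs a and b."""
    gt = a & (1 - b)        # A > B : A AND (NOT B)
    lt = (1 - a) & b        # A < B : (NOT A) AND B
    eq = 1 - (gt | lt)      # A = B : NOR of the other two outputs
    return gt, eq, lt

for b in (0, 1):
    for a in (0, 1):
        print(b, a, compare_1bit(a, b))
```

Exactly one of the three outputs is high for every input combination.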
4.1.8 Counters
Counters can be implemented using the adder/subtractor circuits and registers (or
equivalently, D flip-flops). The simplest counter circuits can be built using T flip-flops
because the toggle feature is naturally suited for the implementation of the counting
operation.
other two flip-flops have their clock inputs driven by the complemented output Q' of the preceding
flip-flop. Therefore, they toggle their states whenever the preceding flip-flop changes its state
from Q = 1 to Q = 0, which results in a positive edge of the Q' signal.
Note here that the value of the count is indicated by the 3-bit binary number
Q2Q1Q0. Since the second flip-flop is clocked by Q0', the value of Q1 changes shortly
after the change of the Q0 signal. Similarly, the value of Q2 changes shortly after the
change of the Q1 signal. This circuit is a modulo-8 counter. Because it counts in the
upward direction, we call it an up-counter. The way each change ripples from one stage to the
next is similar to the rippling of carries in a ripple-carry adder. The circuit is therefore
called an asynchronous counter, or a ripple counter.
Observing the pattern of bits in each row of Table 4.5, it is apparent that bit Q0
changes on each clock cycle. Bit Q1 changes only when Q0 = 1. Bit Q2 changes only
when both Q1 and Q0 are equal to 1. Bit Q3 changes only when Q2 = Q1 = Q0 = 1. In
general, for an n-bit up-counter, a given flip-flop changes its state only when all the
preceding flip-flops are in the state Q = 1. Therefore, if we use T flip-flops to realize
the 4-bit counter, then the T inputs should be defined as
T0 = 1
T1 = Q0
T2 = Q0Q1
T3 = Q0Q1Q2
In Figure 4.28, instead of using AND gates of increased size for each stage, we use a
factored arrangement. This arrangement does not slow down the response of the
counter, because all flip-flops change their states after a propagation delay from the
positive edge of the clock. Note that a change in the value of Q0 may have to
propagate through several AND gates to reach the flip-flops in the higher stages of the
counter, which requires a certain amount of time. This time must not exceed the clock
period. Actually, it must be less than the clock period minus the setup time of the
flip-flops. The circuit behaves as a modulo-16 up-counter. Because all
changes take place with the same delay after the active edge of the Clock signal, the
circuit is called a synchronous counter.
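The T-input equations can be checked with a short behavioural simulation. This is a sketch only: it models ideal T flip-flops that all update on the same clock edge and ignores gate and setup delays; the function names are illustrative.

```python
def next_state(q):
    """One clock edge of the 4-bit synchronous up-counter; q = [Q0, Q1, Q2, Q3]."""
    t = [1, q[0], q[0] & q[1], q[0] & q[1] & q[2]]   # T0, T1, T2, T3
    return [qi ^ ti for qi, ti in zip(q, t)]          # a T flip-flop toggles when T = 1

def count_sequence(cycles):
    """Return the count value observed before each of the given clock edges."""
    q = [0, 0, 0, 0]
    values = []
    for _ in range(cycles):
        values.append(sum(bit << i for i, bit in enumerate(q)))
        q = next_state(q)
    return values

print(count_sequence(18))   # 0, 1, ..., 15, then wraps to 0, 1
```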
The technological advancement has improved the performance as well as the packing density of these
devices over the years. Gordon Moore made his famous observation in 1965, just four years
after the first planar integrated circuit was invented. He observed an exponential growth in
the number of transistors per integrated circuit, in which the number of transistors nearly
doubled every couple of years as shown in Figure 4.29. This observation, popularly known as
Moore's Law, has been maintained and still holds true today. Keeping up with this law, the
semiconductor memory capacity also increases by a factor of two every year.
4.2.1 Memory Classification
Size: Depending upon the level of abstraction, different means are used to express the size of
the memory unit. A circuit designer usually expresses memory in terms of bits, which are
equivalent to the number of individual cells needed to store the data. Going up one level in the
hierarchy to the chip design level, it is common to express memory in terms of bytes, each of which
is a group of 8 bits. And on a system level, it can be expressed in terms of words or pages, which
are in turn collections of bytes.
Function: Semiconductor memories are most often classified on the basis of access patterns,
memory functionality and the nature of the storage mechanism. Based on the access patterns,
they can be classified into random access and serial access memories. A random access
memory can be accessed for read/write in a random fashion. On the other hand, in serial access
memories, the data can be accessed only in a serial fashion. FIFO (First In First Out) and LIFO
(Last In First Out) are examples of serial memories. Most memories fall under the
random access type. Based on their functionalities, memories can be broadly classified into
Read/Write memories and Read-only memories.
As the name suggests, Read/Write memory offers both read and write operations and
hence is more flexible. SRAM (Static RAM) and DRAM (Dynamic RAM) come under this
category. A Read-only memory on the other hand encodes the information into the circuit
topology. Since the topology is hardwired, the data cannot be modified; it can only be read.
However, ROM structures belong to the class of nonvolatile memories:
removal of the supply voltage does not result in a loss of the stored data. Examples of such
structures include PROMs, ROMs and PLDs. The most recent entries in the field are memory
modules that can be classified as nonvolatile, yet offer both read and write functionality.
Typically, their write operation takes substantially longer than the read operation.
EPROM, EEPROM and Flash memory fall under this category.
Figure 4.30 Classification of memories
Timing Parameters: The timing properties of a memory are illustrated in Figure 4.31. The time
it takes to retrieve data from the memory is called the read-access time. This is equal to the
delay between the read request and the moment the data is available at the output. Similarly,
write-access time is the time elapsed between a write request and the final writing of the input
data into the memory. Finally, there is another important parameter, the cycle time
(read or write), which is the minimum time required between two successive read or write
cycles. This time is normally greater than the access time.
Figure 4.31 Timing properties of a memory
To overcome this problem, the address provided to the memory module is generally encoded
as shown in Figure 4.32. A decoder is used internally to decode this address and make the
appropriate select line high. With k address pins, $2^k$ select lines can be driven, and
hence the number of interface pins needed to select one of N words is reduced from N to $\log_2 N$.
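The behaviour of the internal address decoder can be sketched as a one-hot expansion. This is a functional illustration only; the function name and the list representation of the select lines are assumptions.

```python
def decode(address, k):
    """Return the 2**k select lines for a k-bit address (exactly one line high)."""
    if not 0 <= address < 2 ** k:
        raise ValueError("address out of range for k address bits")
    return [1 if i == address else 0 for i in range(2 ** k)]

print(decode(5, 3))            # [0, 0, 0, 0, 0, 1, 0, 0]
print(len(decode(0, 10)))      # 10 address pins select one of 1024 words
```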
Though this approach resolves the select problem, it does not address the issue of the memory
aspect ratio. For an N-word memory with a word length of M, the aspect ratio will be nearly
N:M, which is very difficult to implement for large values of N. Such a design also
slows down the circuit considerably,
because the vertical wires connecting the storage cells to the inputs/outputs become
excessively long. To address this problem, memory arrays are organized so that the vertical
and horizontal dimensions are of the same order of magnitude, making the aspect ratio close to
unity. To route the correct word to the input/output terminals, an extra circuit called a column
decoder is needed, as shown in Figure 4.33.
The address word is partitioned into a column address (A0 to AK-1) and a row address (AK to
AL-1). The row address enables one row of the memory for read/write, while the column
address picks one particular word from the selected row.
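The partitioning of an L-bit address into row and column fields can be sketched with simple bit operations. This sketch assumes, as in the text, that the K least significant bits form the column address and the remaining L - K bits form the row address; the function and parameter names are illustrative.

```python
def split_address(address, k, l):
    """Split an l-bit address: low k bits -> column, remaining l-k bits -> row."""
    column = address & ((1 << k) - 1)
    row = address >> k
    if row >= 1 << (l - k):
        raise ValueError("address is wider than l bits")
    return row, column

# 8-bit address 10110110 with a 3-bit column field:
print(split_address(0b10110110, 3, 8))   # (22, 6), i.e. row 10110, column 110
```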
we have to take into account the relatively large parasitic column capacitance
and column pull-up transistors as shown in Figure 4.37.
When none of the word lines is selected, the pass transistors M3 and M4 are turned off and the
data is retained in all memory cells. The column capacitances are charged by the pull-up
transistors P1 and P2. The voltages across the column capacitors reach VDD - VT.
READ Operation
Consider a data read operation, shown in Figure 28.41, assuming that logic '0' is stored
in the cell. The transistors M2 and M5 are turned off, while the transistors M1 and M6 operate
in linear mode. Thus the internal node voltages are V1 = 0 and V2 = VDD before the cell access
transistors are turned on. The active transistors at the beginning of the data read operation are
shown in Figure 4.38.
After the pass transistors M3 and M4 are turned on by the row selection circuitry, the voltage
of CBb will not change significantly, since no current flows through M4. On the
other hand, M1 and M3 will conduct a nonzero current and the voltage level of CB will begin to
drop slightly. The node voltage V1 will increase from its initial value of 0 V. The node voltage
V1 may exceed the threshold voltage of M2 during this process, forcing an unintended change
of the stored state. Therefore the voltage V1 must not exceed the threshold voltage of M2, so the
(Eq 1)
The transistor M3 is in saturation whereas M1 is linear, equating the current equations we get
(Eq 2)
Substituting Eq 1 in Eq.2 we get
(Eq .3)
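The formulas behind the three equation labels above did not survive in this copy. A hedged reconstruction is sketched below, assuming the usual long-channel square-law model, a single nMOS threshold $V_{T,n}$, a column voltage that stays close to $V_{DD}$, and $k_{n,i}$ as the transconductance parameter of transistor Mi. These symbols are assumptions introduced here, not notation from the notes.

```latex
% Eq 1: read-stability condition on node 1
\[ V_{1,\max} \le V_{T,2} \]

% Eq 2: M3 in saturation, M1 in the linear region, equal currents
\[ \frac{k_{n,3}}{2}\,\bigl(V_{DD} - V_1 - V_{T,n}\bigr)^2
   = \frac{k_{n,1}}{2}\,\bigl(2\,(V_{DD} - V_{T,n})\,V_1 - V_1^2\bigr) \]

% Eq 3: setting V_1 = V_{T,n} in Eq 2 bounds the access-to-driver ratio
\[ \frac{k_{n,3}}{k_{n,1}} = \frac{(W/L)_3}{(W/L)_1}
   < \frac{2\,(V_{DD} - 1.5\,V_{T,n})\,V_{T,n}}{(V_{DD} - 2\,V_{T,n})^2} \]
```

Under these assumptions the access transistor M3 must be weak enough relative to the driver M1 that node 1 stays below the threshold of M2 during a read.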
WRITE Operation
Consider the write '0' operation, assuming that logic '1' is stored in the SRAM cell
initially. Figure 4.39 shows the voltage levels in the CMOS SRAM cell at the beginning of the
data write operation. The transistors M1 and M6 are turned off, while M2 and M5 are
operating in the linear mode. Thus the internal node voltages are V1 = VDD and V2 = 0 before the
access transistors are turned on. The column voltage Vb is forced to '0' by the write circuitry.
Once M3 and M4 are turned on, we expect the nodal voltage V2 to remain below the threshold
voltage of M1, since M2 and M4 are designed according to Eq. 1.
The voltage at node 2 would therefore not be sufficient to turn on M1. To change the stored information,
i.e., to force V1 = 0 and V2 = VDD, the node voltage V1 must be reduced below the threshold
voltage of M2, so that M2 turns off. At this point the transistor M3 operates in the linear
region while M5 operates in the saturation region. Equating their current equations we get
(Eq 4)
Rearranging this result, we get
(Eq 5)
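The formulas for the two write equations are also missing here. A hedged reconstruction under the same kind of assumptions (square-law devices, $V_{T,p} < 0$ for the pMOS load M5, $V_{T,n}$ for the nMOS access transistor M3, and $k_{p,5}$, $k_{n,3}$ as their transconductance parameters, all symbols introduced here for illustration) would be:

```latex
% Eq 4: M5 (pMOS) in saturation, M3 (nMOS) in the linear region, equal currents
\[ \frac{k_{p,5}}{2}\,\bigl(0 - V_{DD} - V_{T,p}\bigr)^2
   = \frac{k_{n,3}}{2}\,\bigl(2\,(V_{DD} - V_{T,n})\,V_1 - V_1^2\bigr) \]

% Eq 5: requiring V_1 < V_{T,n} bounds the load-to-access ratio
\[ \frac{k_{p,5}}{k_{n,3}}
   < \frac{2\,(V_{DD} - 1.5\,V_{T,n})\,V_{T,n}}{(V_{DD} + V_{T,p})^2} \]
```

In words, the pull-up M5 must be weak enough compared with the access transistor M3 that the write circuitry can pull node 1 below the threshold of M2.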
WRITE Circuit
The principle of write circuit is to assert voltage of one of the columns to a low level. This
column. The transistor M1 is on only in the presence of the write enable signal and
when the data bit to be written is '0'. The transistor M2 is on only in the presence of the write
signal and when the data bit to be written is '1'. The circuit for write operation is
shown in Figure 4.40
The capacitor CS stores the charge for the cell. Transistor M1 gives the R/W access to
the cell. CB is the capacitance of the bit line per unit length.
Memory cells are etched onto a silicon wafer in an array of columns (bit lines) and
rows (word lines). The intersection of a bit line and word line constitutes the address of the
memory cell.
DRAM works by sending a charge through the appropriate column (CAS) to activate the
transistor at each bit in the column. When writing, the row lines contain the state the capacitor
should take on.
When reading, the sense amplifier determines the level of charge in the capacitor. If it is
more than 50%, it reads it as "1"; otherwise it reads it as "0". The counter tracks the refresh
sequence based on which rows have been accessed in what order. The length of time necessary
to do all this is so short that it is expressed in nanoseconds (billionths of a second). For example, a
memory chip rating of 70ns means that it takes 70 nanoseconds to completely read and
recharge each cell.
The capacitor in a dynamic RAM memory cell is like a leaky bucket. Dynamic RAM
has to be dynamically refreshed all of the time or it forgets what it is holding. This refreshing
takes time and slows down the memory.
network, whichever switch is closed, the diodes connected to it will conduct and the corresponding
outputs will be high (logic 1); where there is no diode connected, no current flows and the
output will be low (logic 0). For instance, when switch S5 is closed, the diodes D6 and D7 are
on and therefore both output 1 and output 3 are at logic 1 and both output 2 and output 4 are at
logic 0.
Hence the corresponding binary number is 0101 and its decimal value is 5. The disadvantage of
a diode cell is that it does not isolate the bit line from the word line. For better isolation the
diode can be replaced by the gate-source connection of an NMOS transistor.
Moreover, to achieve programmability, i.e. multiple read/write
capability, a modified transistor known as a Floating Gate (FG) transistor is employed. The
structure is similar to a traditional MOS device, except that an extra gate is inserted between
the gate and the channel. The threshold voltage of the FG transistor is programmable, and its
different values are used to distinguish level 0 from level 1.
PLAs were introduced in the early 1970s, by Philips, but their main drawbacks were that they were
expensive to manufacture and offered somewhat poor speed-performance. Both disadvantages were
due to the two levels of configurable logic, because programmable logic planes were difficult to
manufacture and introduced significant propagation delays. To overcome these problems,
Programmable Array Logic (PAL) devices were developed.
Memory: Memory is used to store, provide access to, and allow modification of data and program
code for use within a processor-based electronic circuit or system. The two basic types of memory
are ROM and RAM.
For example, a 32 × 8 ROM contains 32 words of 8 bits each. This means there are eight output
lines and 32 distinct words stored in the unit, each of which can be applied to the
output lines. The particular word presented on the output lines is determined by
five input variables, as there are five input lines for a 32 × 8 ROM, because 2^5 = 32. Five input
variables can specify 32 addresses or minterms, and for each address input there is a unique
selected word. Thus, if the input address is 00000, word number 0 is selected. For address 00001,
word number 1 is selected, and so on.
A ROM is sometimes specified by the total number of bits it contains, which is 2^n × m. For
example, a 4,096-bit ROM may be organized as 512 words of 8 bits each. That means the device
has 9 input lines (2^9 = 512) and 8 output lines.
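The relation between total bits, word width and address lines can be checked with a short
calculation. The sketch below is illustrative only; the function name rom_organization and its
dictionary keys are assumptions made for this example.

```python
def rom_organization(total_bits, word_width):
    """Return word count, address lines (n) and output lines (m) for a 2^n x m ROM."""
    if total_bits % word_width != 0:
        raise ValueError("total_bits must be a multiple of word_width")
    words = total_bits // word_width
    # Smallest n such that 2**n >= words (integer arithmetic, no floating point).
    address_lines = (words - 1).bit_length()
    return {"words": words, "address_lines": address_lines, "output_lines": word_width}


# The two examples from the text: a 4,096-bit ROM with 8-bit words, and a 32 x 8 ROM.
rom_4096 = rom_organization(4096, 8)
rom_32x8 = rom_organization(32 * 8, 8)
```

Here rom_4096 reports 512 words and 9 address lines, and rom_32x8 reports 32 words and 5 address
lines, matching the figures quoted above.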
In the figure below, the block consisting of an AND array with buffers or inverters is equivalent to a
decoder. The decoder is a combinational circuit that generates 2^n minterms
from n input lines. The 2^n (or p) minterms are realized from n input
variables with the help of n buffers, n inverters, and 2^n AND
gates.
Each of the minterms is applied to the inputs of m OR gates through fusible links. Thus,
m output functions can be produced after blowing some selected fuses. The
equivalent logic diagram of a 2^n × m ROM is shown below.
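The decoder-plus-OR-array view of a ROM can be sketched in a few lines. This is a behavioural
model under simple assumptions, not a circuit description: the names decode, rom_read and
SQUARES are hypothetical, address bits are listed most significant bit first, and a link value
of 1 means the fuse between that minterm and that OR gate is left intact.

```python
def decode(address_bits):
    """AND array: return the 2**n minterm lines, exactly one of which is 1."""
    n = len(address_bits)
    index = int("".join(str(b) for b in address_bits), 2)
    return [1 if i == index else 0 for i in range(2 ** n)]


def rom_read(address_bits, links):
    """OR array: output j is the OR of every minterm still linked to OR gate j."""
    minterms = decode(address_bits)
    m = len(links[0])
    return [
        int(any(minterms[i] and links[i][j] for i in range(len(links))))
        for j in range(m)
    ]


# A 2^2 x 4 ROM holding the squares of 0..3 (0, 1, 4, 9) as 4-bit words.
SQUARES = [
    [0, 0, 0, 0],  # address 00 -> 0
    [0, 0, 0, 1],  # address 01 -> 1
    [0, 1, 0, 0],  # address 10 -> 4
    [1, 0, 0, 1],  # address 11 -> 9
]

word_at_3 = rom_read([1, 1], SQUARES)
```

Each row of SQUARES corresponds to one minterm line, so programming the ROM amounts to choosing
which entries of that row remain 1.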
ROM has many important applications in the design of digital computer systems. Realization of
complex combinational circuits, code conversions, generating bit patterns, performing arithmetic
functions like multipliers, forming look-up tables for arithmetic functions, and bit patterns for
characters are some of its applications. They are particularly useful for the realization of multiple
output combinational circuits with the same set of inputs. As such, they are used to store fixed bit
patterns that represent the sequence of control variables needed to enable the various operations in
the system. They are also used in association with microprocessors and microcontrollers.
5.1.2 Programmable logic device-(PLD)
Logic devices other than the TTL and CMOS families whose logical operation is specified by the
user through a process called programming are called Programmable Logic Devices. So, a
programmable logic device is an IC that contains digital logic cells and programmable
interconnect. The idea of the PLD was first conceived by Ron Cline of Signetics in 1975 with programmable
AND and OR planes. The basic idea with these devices is to enable the designer to configure the
logic cells and interconnect to form a digital electronic circuit within a single IC package. Here, the
hardware resources are configured to implement the required functionality. By changing the
hardware configuration, the PLD will perform a different function. The functioning and basic
working principle of the PLD is explained below through the diagrams.
There are three types of PLD available: the simple programmable logic device (SPLD), the
complex programmable logic device (CPLD), and the field programmable gate array (FPGA).
A Programmable Logic Array (PLA) is a type of LSI device that is conceptually similar to a ROM.
However, a PLA does not contain all the AND gates needed to form a full decoder, and so does not
generate all the minterms as a ROM does. In the PLA, the decoder is replaced by a group of AND gates with
buffers/inverters, each of which can be programmed to generate some product terms of input
variable combinations that are essential to realize the output functions. The AND and OR gates
inside the PLA are initially fabricated with the fusible links among them. The required Boolean
functions are implemented in sum of the products form by opening the appropriate links and
retaining the desired connections.
So, the PLA consists of two programmable planes, the AND plane and the OR plane. The AND plane consists
of programmable interconnect along with AND gates. The OR plane consists of programmable
interconnect along with OR gates. In this view, there are four inputs to the PLA and four outputs
from the PLA. Each of the inputs can be connected to an AND gate with any of the other inputs by
connecting the crossover point of the vertical and horizontal interconnect lines in the AND gate
programmable interconnect. Initially, the crossover points are not electrically connected, but
configuring the PLA will connect particular crossover points together. In this view, the AND gate
is seen with a single line to the input. This view is by convention, but it also means that any of
the inputs (vertical lines) can be connected. Hence, for four PLA inputs, the AND gate also has four
inputs. The single output from each of the AND gates is applied to the OR gate programmable
interconnect.
Again, the crossover points are initially not electrically connected, but configuring the PLA will
connect particular crossover points together. In this view, the OR gate is seen with a single line to
the input. This view is by convention, but it also means that any of the AND gate outputs can be
connected to the OR gate inputs. Hence, for four AND gates, the OR gate also has four inputs.
Therefore, the function is implemented in either AND-OR form when the output link across
the INVERTER is in place, or in AND-OR-INVERT form when the link is blown off. The general
structure of a PLA with internal connections is shown in the figure below.
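The AND plane, OR plane and optional output inversion described above can be modelled
behaviourally. The sketch below is an assumption-laden illustration, not a description of any
particular device: pla_eval and the data layout are hypothetical, each product term is a mapping
from input index to the value that input must have, and invert[j] set to True stands for the
output link being blown (AND-OR-INVERT form).

```python
def pla_eval(inputs, and_plane, or_plane, invert):
    """Evaluate a PLA: programmable AND plane, programmable OR plane, optional inverters."""
    # AND plane: a product term is 1 when every connected literal is satisfied.
    products = [all(inputs[i] == v for i, v in term.items()) for term in and_plane]
    # OR plane: each output sums the product terms connected to it.
    sums = [any(products[k] for k in row) for row in or_plane]
    # Output stage: pass the sum straight through, or complement it.
    return [int(s != inv) for s, inv in zip(sums, invert)]


# Two inputs (index 0 = A, index 1 = B), three product terms, two outputs.
AND_PLANE = [
    {0: 1, 1: 0},  # A B'
    {0: 0, 1: 1},  # A' B
    {0: 1, 1: 1},  # A B
]
OR_PLANE = [
    [0, 1],  # F1 = A B' + A' B  (XOR)
    [2],     # F2 = A B          (AND)
]

and_or = pla_eval([1, 0], AND_PLANE, OR_PLANE, [False, False])
and_or_invert = pla_eval([1, 1], AND_PLANE, OR_PLANE, [True, False])
```

Only three of the four possible minterms are generated here, which is the point of the PLA: the
AND plane produces just the product terms the outputs actually need.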
The size of a PLA is specified by the number of inputs, the number of product terms, and the
number of outputs. The number of sum terms is equal to the number of outputs. The PLA described
in the figure above is specified as an n × p × m PLA. The number of programmable links is 2n × p + p ×
m + m, whereas that of a ROM is 2^n × m. A typical PLA of 16 × 48 × 8 has 16 input variables, 48
product terms, and 8 output lines.
To implement the same combinational circuit, a 2^16 × 8 ROM is needed, which consists of 2^16 =
65536 minterms or product terms. So there is a drastic reduction in the number of AND gates within
the PLA chip, thus reducing the fabrication time and cost.
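The saving can be made concrete by evaluating the two link-count expressions quoted above for
the 16 × 48 × 8 example. The function names pla_links and rom_links are chosen for this sketch
only.

```python
def pla_links(n, p, m):
    """Programmable links in an n x p x m PLA: 2n*p (AND plane) + p*m (OR plane) + m (inverters)."""
    return 2 * n * p + p * m + m


def rom_links(n, m):
    """Programmable links in a 2^n x m ROM."""
    return (2 ** n) * m


pla_count = pla_links(16, 48, 8)  # 1928
rom_count = rom_links(16, 8)      # 524288
```

For this example the PLA needs 1,928 programmable links against 524,288 for the equivalent ROM,
and 48 AND gates against 65,536.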
5.1.4 Programmable Array Logic (PAL)
The first programmable device was the programmable array logic (PAL) developed by Monolithic
Memories Inc (MMI). The Programmable Array Logic or PAL is similar to PLA, but in a PAL
device only AND gates are programmable. The OR array is fixed by the manufacturer. This
makes PAL devices easier to program and less expensive than PLA. On the other hand, since the
OR array is fixed, it is less flexible than a PLA device.
The PAL device has n input lines which are fed to buffers/inverters. Buffers/inverters are
connected to the inputs of AND gates through programmable links. Outputs of the AND gates are then fed
to the OR array with fixed connections. It should be noted that not all outputs of the AND array are
connected to each OR gate. Instead, only some of the AND outputs are connected to
each OR gate, at the manufacturer's discretion. This is clarified by the figure above, which
illustrates the internal connections of a PAL device with four inputs, eight AND gates and three outputs
before programming. Note that while every buffer/inverter is connected to the AND gates through
links, the OR gate for F1 is connected to only three AND outputs, F2 to three AND gates, and
F3 to two AND gates. So this particular device can generate only eight product terms, of
which two of the three OR gates may have three product terms each and the
remaining OR gate only two. Therefore, while designing with a PAL,
particular attention must be given to the fixed OR array.
GAL-Generic Array Logic
PAL and PLA devices are one-time programmable (OTP) based on PROM, so the PAL or PLA
configuration cannot be changed after it has been configured. This limitation means that the
configured device would have to be discarded and a new device configured. The GAL, although
similar to the PAL architecture, uses EEPROM and can be reconfigured.
The Generic Array Logic (GAL) device was invented by Lattice Semiconductor. The GAL was an
improvement on the PAL because one device was able to take the place of many PAL devices or
could even have functionality not covered by the original range. Its primary benefit, however, was
that it was erasable and re-programmable making prototyping and design changes easier for
engineers. The GAL is very useful in the prototyping stage of a design, when any bugs in the logic
can be corrected by reprogramming.
The CPLD consists of a number of logic blocks or functional blocks, each of which contains a
macrocell and either a PLA or PAL circuit arrangement. In this view, eight logic blocks are shown.
The building block of the CPLD is the macrocell, which contains logic implementing disjunctive
normal form expressions and more specialized logic operations. The macrocell provides additional
circuitry to accommodate registered or nonregistered outputs, along with signal polarity control.
Polarity control provides an output that is a true signal or a complement of the true signal. The
actual number of logic blocks within a CPLD varies; the more logic blocks available, the larger the
design that can be configured.
In the center of the design is a global programmable interconnect. This interconnect allows
connections to the logic block macrocells and the I/O cell arrays (the digital I/O cells of the CPLD
connecting to the pins of the CPLD package). The programmable interconnect is usually based on
either array-based interconnect or multiplexer-based interconnect:
• Array-based interconnect allows
any signal within the programmable interconnect to connect to any logic block within the CPLD.
This is achieved by allowing horizontal and vertical routing within the programmable interconnect
and allowing the crossover points to be connected or unconnected (the same idea as with the PLA
and PAL), depending on the CPLD configuration.
of oxide on top of the N+ surface, Low-pressure Chemical Vapor Deposition (LPCVD) nitride and
the reoxidized top oxide. The programming current has an important effect because the higher the
current during programming, the lower the link resistance, resulting in smaller thickness for the
antifuse material. Programming circuits for antifuses need to supply high currents (15 mA for Actel)
to ensure high reliability and performance.
Amorphous silicon antifuse technology is the alternative to dielectric antifuse. It consists of
amorphous silicon between two layers of metal that changes phase when current is applied. When
the antifuse is not programmed, the amorphous silicon has a resistance of about 1 gigaohm. When a high
current (about 20 mA) is applied to the antifuse, the amorphous silicon changes into a conductive
polysilicon link. The QuickLogic pASIC FPGA is an example of amorphous silicon antifuse
technology.
(b). SRAM-based Technology:
SRAM FPGA architecture consists of static RAM cells that control pass gates or multiplexers. The
FPGA speed is determined by the delay introduced by the logic cells and the routing channels.
Multiplexers, look-up tables and output drivers affect the speed of signals through the logic cells.
An FPGA with more PIPs is easier to route but introduces more routing delay. The size of the
look-up table plays an important role depending on the design. Smaller LUTs provide higher
density but larger ones are preferred for high-speed applications.
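A look-up table can be pictured as a small memory whose address lines are the logic inputs. The
sketch below assumes the usual arrangement of a k-input LUT holding 2^k configuration bits in
SRAM cells, which the text above does not spell out; make_lut is a hypothetical helper written
for this illustration.

```python
def make_lut(config_bits):
    """Build a k-input LUT from 2**k configuration bits (index 0 = all inputs low)."""
    size = len(config_bits)
    k = size.bit_length() - 1
    if size == 0 or 2 ** k != size:
        raise ValueError("a k-input LUT needs exactly 2**k configuration bits")

    def lut(*inputs):
        if len(inputs) != k:
            raise ValueError(f"expected {k} inputs")
        index = 0
        for bit in inputs:  # the first input is the most significant address bit
            index = (index << 1) | bit
        return config_bits[index]

    return lut


# The same hardware becomes a different gate when the stored bits change.
xor2 = make_lut([0, 1, 1, 0])
and2 = make_lut([0, 0, 0, 1])
```

Because the stored bits double with every extra input, a larger LUT costs more area per logic
function, which is the density trade-off mentioned above.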
Distinguish between SRAM and Antifuse Technologies: The following points explain the
differences between the two technologies.
1. Antifuse-based FPGAs are faster than SRAM-based FPGAs because of the RC
delays introduced by the SRAM interconnect structure.
2. Antifuse technology has more silicon area per gate and is easier to route than SRAM technology.
3. A disadvantage of antifuse FPGAs is that they require more process layers and mask steps and
also contain high-voltage programming transistors.
4. SRAM-based technology offers higher capacity than antifuse technologies.
5. SRAM-based technology is very flexible, with in-system programmability and the ability to
reconfigure the design during the debugging stage, while antifuse technology is one-time
programmable (OTP). This ability reduces design and development time, which reduces the overall cost
of the design. Another advantage is that SRAM-based devices can be programmed at the
factory for complete verification testing, whereas antifuse devices are tested as "blanks" and require
programming by the user to verify design requirements and operation.
6. A disadvantage of SRAM technology is that it is volatile, meaning it has to be reprogrammed
every time power is turned off and on again. SRAM-based devices usually require an extra memory
element to program the chip, which occupies board space.
A standard cell is a group of transistor and interconnect structures that provides a Boolean
logic function or a storage function (flip-flop or latch). The simplest cells are direct representations
of the elemental NAND, NOR, and XOR Boolean functions, although cells of much greater
complexity are commonly used (such as a 2-bit full adder, or a muxed D-input flip-flop). The cell's
Boolean logic function is called its logical view: functional behavior is captured in the form of
a truth table or Boolean algebra equation (for combinational logic), or a state transition
table (for sequential logic).
Usually, the initial design of a standard cell is developed at the transistor level, in the form of
a transistor netlist or schematic view. The netlist is a nodal description of transistors, of their
connections to each other, and of their terminals (ports) to the external environment. A schematic
view may be generated with a number of different Computer Aided Design (CAD) or Electronic
Design Automation (EDA) programs that provide a Graphical User Interface (GUI) for this netlist
generation process. Designers use additional CAD programs such as SPICE or Spectre to simulate
the electronic behavior of the netlist, by declaring input stimulus (voltage or current waveforms)
and then calculating the circuit's time domain (analogue) response. The simulations verify whether
the netlist implements the desired function and predict other pertinent parameters, such as power
consumption or signal propagation delay.
Since the logical and netlist views are only useful for abstract (algebraic) simulation, and not device
fabrication, the physical representation of the standard cell must be designed too. Also called
the layout view, this is the lowest level of design abstraction in common design practice. From a
manufacturing perspective, the standard cell's VLSI layout is the most important view, as it is
closest to an actual "manufacturing blueprint" of the standard cell. The layout is organized into base
layers, which correspond to the different structures of the transistor devices, and interconnect
wiring layers and via layers, which join together the terminals of the transistor formations.
The interconnect wiring layers are usually numbered and have specific via layers representing
specific connections between each sequential layer. Non-manufacturing layers may also be
present in a layout for purposes of design automation, but many layers used explicitly for place
and route (PNR) CAD programs are often included in a separate but similar abstract view. The
abstract view often contains much less information than the layout and may be recognizable as
a Library Exchange Format (LEF) file or an equivalent.
After a layout is created, additional CAD tools are often used to perform a number of common
validations. A Design Rule Check (DRC) is done to verify that the design meets foundry and other
layout requirements. A Parasitic Extraction (PEX) then is performed to generate a PEX-net list with
parasitic properties from the layout. The nodal connections of that net list are then compared to
those of the schematic net list with a Layout Vs Schematic (LVS) procedure to verify that the
connectivity models are equivalent.
The PAL device is a special case of the PLA which has a programmable AND array and a fixed OR
array. The basic structure of a ROM is the same as that of a PLA. The PAL is cheap compared to a PLA as only the AND
array is programmable. It is also easier to program a PAL than a PLA, as only the AND array must be
programmed.
Figure 1 below shows a segment of an unprogrammed PAL. An input buffer with non-inverted
and inverted outputs is used, since each PAL input must drive many AND gate inputs. When the PAL is
programmed, the fusible links (F1, F2, F3…F8) are selectively blown to leave the desired
connections to the AND gate inputs. Connections to the AND gate inputs in a PAL are represented
by Xs, as shown here:
As an example, we will use the PAL segment of figure 1 to realize the function I1I2' + I1'I2. The Xs
indicate that the I1 and I2' lines are connected to the first AND gate, and the I1' and I2 lines are
connected to the other gate. Typical combinational PALs have 10 to 20 inputs and from 2 to 10
outputs, with 2 to 8 AND gates driving each OR gate. PALs are also available which contain D
flip-flops with inputs driven from the programmable array logic. Such PALs provide a convenient way
of realizing sequential networks. Figure 2 below shows a segment of a sequential PAL. The D
flip-flop is driven from the OR gate, which is fed by two AND gates. The flip-flop output is fed back to
the programmable AND array through a buffer. Thus the AND gate inputs can be connected to A,
A', B, B', Q, or Q'. The Xs on the diagram show the realization of the next-state equation
Q+ = D = A'BQ' + AB'Q
The flip-flop output is connected to an inverting tristate buffer, which is enabled when
EN = 1.
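The behaviour of this sequential PAL segment can be traced with a small simulation. This is a
sketch under simple assumptions: signals are the integers 0 and 1, one (A, B) pair is applied
per clock edge, the flip-flop starts at Q = 0, and a disabled tristate buffer is represented by
None. The function names are chosen for illustration only.

```python
def next_state(a, b, q):
    """Next-state equation Q+ = D = A'BQ' + AB'Q."""
    return ((1 - a) & b & (1 - q)) | (a & (1 - b) & q)


def tristate_output(q, en):
    """Inverting tristate buffer: drives Q' when EN = 1, otherwise high impedance (None)."""
    return (1 - q) if en == 1 else None


def run(inputs, q=0):
    """Apply a sequence of (A, B) pairs, one per clock edge, and return the state after each."""
    states = []
    for a, b in inputs:
        q = next_state(a, b, q)
        states.append(q)
    return states


trace = run([(0, 1), (1, 0), (1, 0), (1, 1)])
```

With these inputs the flip-flop is set by A'BQ', held by AB'Q for two clock edges, and then
cleared because neither product term is true when A = B = 1.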
Figure 3 below shows a logic diagram for a typical sequential PAL, the 16R4. This PAL has an
AND gate array with 16 input variables, and it has 4 D flip-flops. Each flip-flop output goes
through an inverting tristate buffer (output pins 14-17). One input (pin 11) is used to enable these
buffers. The rising edge of a common clock (pin 1) causes the flip-flops to change state. Each D
flip-flop input is driven from an OR gate, and each OR gate is fed from 8 AND gates. The AND
gate inputs can come from the external PAL inputs (pins 2-9) or from the flip-flop outputs, which
are fed back internally. In addition there are four input/output (I/O) terminals (pins 12, 13, 18 and
19), which can be used as either network outputs or as inputs to the AND gates. Thus each AND
gate can have a maximum of 16 inputs (8 external inputs, 4 inputs fed back from the flip-flop
outputs, and 4 inputs from the I/O terminals). When used as an output, each I/O terminal is driven
from an inverting tristate buffer. Each of these buffers is fed from an OR gate and each OR gate is
fed from 7 AND gates. An eighth AND gate is used to enable the output.
In CMOS integrated circuit design there is a trade-off between static power consumption and
technology scaling. Recently, power density has increased due to a combination of higher clock
speeds, greater functional integration, and smaller process geometries. As a result, static power
consumption is becoming more dominant. This is a challenge for circuit designers. However,
designers do have a few methods which they can use to reduce this static power consumption.
But all of these methods have some drawbacks: in order to achieve lower static power
consumption, one has to sacrifice design area and circuit performance. One proposed
method reduces static power in CMOS VLSI circuits using a variable body biasing
technique without a penalty in area requirement or circuit performance.
5.2 CMOS TESTING
Testing is one of the most expensive parts of chips
1) Logic verification accounts for > 50% of design effort for many chips
2) Debug time after fabrication has enormous opportunity cost
3) Shipping defective parts can sink a company
Observability: ease of observing a node by watching external output pins of the chip
Controllability: ease of forcing a node to 0 or 1 by driving input pins of the chip
Combinational logic is usually easy to observe and control. Finite state machines can be very
difficult, requiring many cycles to enter the desired state, especially if the state transition diagram is not
known to the test engineer.
TUTORIAL SHEETS
21 Design a stick diagram for the CMOS logic for AB CD
22 Design a layout diagram for the pMOS logic Y A(B C)
23 Design a layout diagram for two input nMOS NAND gate.
24 Design a stick diagram and layout for two input CMOS NAND gate indicating all the
regions and layers.
25 Draw the stick diagram and mask layout for a CMOS two input NOR gate.
26 Calculate the gate capacitance value of a 5 µm technology minimum-size transistor, given a
gate-to-channel capacitance of 0.0004 pF/µm².
27 What is the problem of driving large capacitive loads? Explain a method to drive such
load.
28 Calculate the rise time and fall time of the CMOS inverter with (W/L)n = 6 and
(W/L)p = 8, k'n = 150 µA/V², Vtn = 0.7 V, k'p = 62 µA/V²,
47 A sequential circuit with n inputs and m storage devices. To test this circuit how many
test vectors are required?
48 How IDDQ testing is used to test the bridge faults?
49 What is ATPG? Explain a method of generation of test vector.
50 Draw the stick and layout diagram of nMOS inverter
MODEL QUESTIONS
DESCRIPTIVE QUESTIONS
UNIT- I(A)
PART-A
Part-B
1. Describe the NMOS fabrication process with neat sketch.
2. Write about the CMOS fabrication using N-well process with neat diagrams.
3. Explicate the NMOS Enhancement & depletion mode fabrication process with neat sketch.
4. Describe the CMOS fabrication using P-well process with neat sketch.
5. Draw the flow diagram of CMOS fabrication using Berkeley N-well process.
6. Write notes on CMOS fabrication using Silicon-On-Insulator(SOI) process with neat
diagrams.
7. Write the comparisons between CMOS, Bipolar and GaAs technology.
8. Draw and write about the CMOS fabrication using Twin-Tub process.
9. Write in detail about dry and wet oxidation with neat diagrams.
10. What is BiCMOS Technology & give its advantages over CMOS technology.
UNIT – I(B)
PART-A
1. What is saturation region and write its expression for IDS.
2. What is non-saturation region and write its expression for IDS.
3. Draw the IDS and VDS graph for depletion mode.
4. Draw the IDS and VDS graph for Enhancement mode.
5. Define Transconductance (gm) & Output conductance(gds) with expressions.
6. What is pass transistor?
7. List out the advantages of Bi-CMOS Inverter.
8. Draw the transfer characteristics of Bi-CMOS Inverter.
9. What is NMOS and CMOS Inverter?
10. Define Figure of merit of MOS transistor(ω0).
PART-B
1. Derive the relationship between drain to source current versus drain to source
UNIT -2
PART-A
1. What is full custom design?
2. What is semi custom design?
3. What is top-down & bottom-up approach?
4. What is layout?
5. List out the MOS layers.
6. List out the models of scaling.
7. What is a stick diagram?
8. Mention any few λ based design rules for transistors?
9. List out any six scaling parameters.
10. List out the limitations of scaling.
PART-B
1. Elucidate the VLSI Design flow & write clearly about design processes.
2. Explicate the guidelines & design rules for stick diagram and layout diagram.
3. Design the stick diagram for two input NMOS NAND and NOR gates.
4. Draw the stick diagram & layout for two input NMOS Ex-OR gates.
5. Design the stick diagram for two input PMOS NAND and NOR Gates.
6. Draw the layout diagram for NMOS inverter.
7. Write the merits & demerits of scaling.
8. Write about design rules for wires in detail (orbit 2µm CMOS).
9. Write the scaling factors for different types of device parameters.
10. Design the circuit diagram, stick diagram & layout for the given Boolean function using
CMOS. F = (A+B+C)'
UNIT-3
PART-A
1. What is switch logic?
2. What is sheet resistance (RS)?
3. Define pass transistor?
4. What are transmission gates?
5. List out the types of alternative gate circuits.
6. What is propagation delay?
7. What is wiring capacitance?
8. Define Fan-in and Fan-out.
9. What is Rise time(tr) & Fall time (tf).
10. What is gate capacitance (Cg).
PART-B
UNIT-4(A)
PART-A
UNIT-4(B)
PART-A
PART-B
1. Elucidate basic memory chip architecture?
2. Explicate the principle of SRAM with neat sketch?
3. What is the principle of DRAM using 1- transistor with neat diagram?
4. Illustrate the following in detail.
(i) Flash memory
(ii) LIFO, FIFO
5. Write about shift registers & their types.
6. Explicate Synchronous DRAM (SDRAM) with neat diagram.
7. Write in detail about CAM.
8. Write the different types of sequential memory access.
9. Explicate ROM cell organization & implement with an example.
10. Write about memory element parameters.
UNIT-5(A)
PART-A
PART-B
1. Explain the following in detail with neat diagram
i. Channeled Gate array
ii. Channel less Gate array
iii. Structured Gate array
2. a) Draw the schematic structure of PLA & explain its principle.
b) Using PLA implement JK flip flop circuit.
3. Sketch a diagram for two input XOR gate using PLA & explain its operation with the help
of truth tables.
4. Draw the structure of PAL & explain its principle.
5. Implement JK flip flop using PROM.
6. a) Describe the principle & operation of FPGA.
b) List out the applications of FPGA.
7. a) Write about Complex Programmable Logic Device & draw its basic structure.
b) Give the applications and advantages of CPLD.
8. Compare PLA’s, PAL’s, CPLD’s, FPGA’s, & standard cells.
9. Clearly write each step of high level design flow of an ASIC.
UNIT-5(B)
PART-A
PART-B
1. Write in detail about need for testing and the two groups of testing.
2. Explain what is meant by short and open circuit faults in detail.
3. Illustrate the following
i. Observability
ii. Fault Coverage
iii. Fault Simulation
iv. Controllability
4. Illustrate the following
i. Stuck- at faults
ii. Stuck- open & stuck- short faults
iii. Stuck- open fault.
5. a) What are the design strategies for test (DST)?
b) Explicate the scan-based test techniques.
6. With neat diagram explain about the internal structure of BIST
7. Draw the architecture and the state diagram of TAP controller.
8. What are the chip level & system level test techniques.
9. How layout design can be done for improved testability? Explain.
OBJECTIVE QUESTIONS
UNIT – 1A
1. Which of the following materials is used as the gate? [ ]
a. Photo resistive b. polysilicon c. metal d. glass
2. The ---------- of the VLSI chip ranges from pre-assembly wafer preparation to fabrication techniques
for the packages that provide electrical connections and mechanical and environmental protection [ ]
a. Placement b. Floor planning c. Packaging d. None
3. ------------------ is the electrical consideration for VLSI packages [ ]
a. low ground resistance b. short signal leads c. minimum power supply spiking d.All
4. A MOS transistor which has conducting channel region at zero gate bias is called [ ]
a. Depletion mode b. Enhancement mode c. Saturated mode d. Non- saturated mode
5. If packing density, area and performance are the constraints and power dissipation is not a constraint, the
technology you prefer is [ ]
a. BJT b. CMOS c. NMOS d. PMOS
6. The speed of the CMOS logic is less, when compared to other technologies due to [ ]
a. High noise immunity b. High input capacitance c. High driven current d. All
7. A sink current is obtained under which condition? [ ]
a. Logic high input b. Logic low input c. Logic high output d. Logic low output
8. Which logic family has highest speed of operation [ ]
a. TTL b. DTL c. ECL d. All the above
9. The partial current flowing through p and n channels is called [ ]
a. Voltage spikes b. current spikes c. both a and b d. none of the above
10. -------------- is the non-saturated digital logic family [ ]
a. TTL b. ECL c. MOS d. RTL
11. Which logic family has very good noise immunity? [ ]
a. RTC b. DTL c. TTL d. ECL
12. The different integrator resistance is
13. The different integrator capacitance is
14. The fabrication of CMOS is done by
15. Diffusion process is carried out in a
16. Ion implantation is performed at temperature
17. The ions in the ion implantation process are accelerated _
18. Ion implantation is other technique is
19. Abbreviation of CMOS is
20. Aluminum is used for metallization of most IC because
21. The body effect will be if the substrate is DC
22. The equation of Gm is
23. The output conductance Gds is
24. The polarities of voltage and current in PMOS circuits
25. The polarities of voltage and current in NMOS circuits
26. Charge moves from to when Vds is applied
UNIT – 1B
UNIT - 2
UNIT – 3
UNIT – 4A
1. For the 4X4 bit barrel shifter, the regularity factor is given by
a. 8 b . 4 c.2 d 16
2. The level of any particular design can be measured by
a. SNR b .Ratio of amplitudes c. regularity d. quality
3. In tackling the design of system the more significant property is
a. logical operations b . test ability c . topological properties d .nature of architecture
4. Any bit shifted out at one end of data word will be shifted in at the other end of the word is called
a. end-around b. end-off c. end-less d. end-on
5. In the VLSI design the data and control signals of a shift register flow in
a. horizontally and vertically b. vertically and horizontally c. both horizontally d .both vertically
6. The subsystem design is classified as
a. first level b. top level c. bottom level d. leaf-cell level
7. A larger system design must be partitioned into subsystem designs such that there is
a. minimum interdependence and interconnection
b. complexity of interconnection
c. maximum interdependence
d. arbitrarily chosen
8. To simplify the subsystem design, we generally used the
a. interdependence b. complex interconnections c. regular structures d. standard cells
9. System design is generally in the manner of
a. down-top b. top-down c. bottom level only d. top level only
10. Structured design begins with the concept of
a. hierarchy b. down-top design c. bottom level design d. complex function design
11. Any general-purpose n-bit shifter should be able to shift incoming data by up to how many
places?
a. n b. 2n c. n-1 d. 2n-1
12. For a four bit word, a one-bit shift right is equivalent to a
a. two bit shift left b. three-bit shift left c. one bit shift left d. four-bit shift left
UNIT – 4B
6. DRAM has a
a. smaller layout and uses large power
b. smaller layout and uses less power
c. more power and slower
d. more power and faster
7. SRAM has a
a. faster, more power and larger
b. slower, more power and larger
c. faster, less power and smaller
d. faster less power and larger
8. On-chip memory comes under the category of
a. high density memory
b. medium density memory
c. low density memory
d. large density memory
9. On-chip memory is usually in the order of
a. 10k bytes
b. 50k bytes
c. 1k bytes
d. 100 k bytes
10. The simplest and safest way to use memory in a system is to treat it as a
a. sequential component
b. combinational component
c. decoders
d. NOR gates
11. Serial access memory at the chip level is classed as memory that has
a. shift registers
b. counters
c. access time is independent of location of data
d. internally stored data is used
UNIT – 5A
1. The PLA provides a systematic and regular way of implementing multiple output functions of n
variables in
a. POS form
b. SOP form
c. complex form
d. simple form
2. If a V (input variables) × P (product terms) PLA is to maintain generality within the constraints of its
dimensions, then
a. each AND gate must have V inputs and each output OR gate must have P inputs
b. each AND gate must have P inputs and each output OR gate must have V inputs
c. both AND and OR gates must have V inputs
d. both AND and OR gates must have P inputs
3. A MOS PLA is realized using which type of gate?
a. AND
b. OR
c. AND-OR
d. NOR
4. A CMOS PLA is realized using a
a. pseudo-nMOS NOR gate
b. CMOS NOR gate
c. pseudo-nMOS NAND gate
d. CMOS NAND gate
5. The mapping of irregular combinational logic functions into regular structures is provided by the
a. FPGA
b. CPLD
c. standard cells
d. PLA
6. The general arrangement of PLA is
a. AND/OR structure
b. OR/AND structure
c. NAND/NOR structure
d. EX-OR/OR structure
7. In a V × P × Z PLA, the symbols represent
a. V: no. of input variables, P: no. of output functions, Z: no. of gates
b. V: no. of gates, P: no. of OR gates, Z: no. of AND gates
c. V: no. of input variables, P: no. of product terms, Z: no. of output functions
d. V: no. of gates, P: no. of AND gates, Z: no. of output functions
8. To realize finite state machine requirements, the PLA is used along with
a. a NOR gate
b. feedback links
c. a NAND gate
d. a NOT gate
9. To reduce the PLA dimensions, the simplification must be done on a
a. individual output basis
b. multi-output basis
c. individual product term
d. individual input basis
10. The advantage of pre-charge evaluate logic is ________.
11. Standard cells can be placed ________ on a silicon chip.
12. DRAM is widely used because ________.
13. Pre-designed logic cells are known as ________.
14. Standard cell areas in a CBIC are ________.
15. Power buses are also known as ________.
16. Interconnections are ________ in an FPGA.
17. Device sizes in a gate array are ________.
18. The small squares on the edge of the cell are raised for ________.
19. Connecting datapath elements to form a datapath results in a ________ and ________
layout than using standard cells.
20. Crosstalk results from ________.
21. Silicon circuitry is connected to the outside world by ________.
22. A LUT is used in ________.
23. In full-custom ASIC design, all the layers are ________.
24. An FPGA is a ________.
25. PAL and PLA are known as ________.
26. The output of a physical design is ________.
27. In a PLA, ________ are programmable.
28. The size of an IC is generally measured by ________.
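The V × P × Z description of a PLA used in the questions above (V input variables, P product terms, Z output functions, arranged as an AND plane feeding an OR plane) can be illustrated with a small behavioural sketch. The function pla_eval, the table encoding (1 = true literal, 0 = complemented literal, None = not connected) and the half-adder example are assumptions chosen for illustration; they do not describe any particular device or the transistor-level NOR-based realization.

```python
def pla_eval(inputs, and_plane, or_plane):
    """Evaluate a V x P x Z PLA described by two connection tables.

    and_plane: P rows of V entries; 1 = true literal, 0 = complemented
               literal, None = input not connected to this product term.
    or_plane:  Z rows of P entries; 1 = product term feeds this output.
    """
    # AND plane: each product term is the AND of its connected literals.
    products = [
        all(lit is None or bit == lit for bit, lit in zip(inputs, row))
        for row in and_plane
    ]
    # OR plane: each output is the OR of the product terms connected to it.
    return [
        int(any(p and used for p, used in zip(products, row)))
        for row in or_plane
    ]


# Hypothetical 2 x 3 x 2 PLA implementing a half adder on inputs (a, b).
AND_PLANE = [
    [1, 0],  # a AND (NOT b)
    [0, 1],  # (NOT a) AND b
    [1, 1],  # a AND b
]
OR_PLANE = [
    [1, 1, 0],  # sum   = a.b' + a'.b
    [0, 0, 1],  # carry = a.b
]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, pla_eval((a, b), AND_PLANE, OR_PLANE))
```

In this encoding each AND-plane row has V entries and each OR-plane row has P entries, so every output is a sum of products of the inputs.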
UNIT – 5B
REFERENCES: