Computer Vision – Introduction
1. What is Computer Vision?
Computer Vision is a field of Artificial Intelligence (AI) that enables computers to
understand and interpret images and videos in a way similar to how humans do.
2. What are the goals of Computer Vision?
Recognition – Identifying objects, people, or scenes.
Measurement – Extracting properties like distance, size, and shape.
Search – Locating visual information in databases.
Interaction – Enabling computers to interact with the environment through visual
input.
3. Give definitions of Computer Vision from different researchers.
Ballard & Brown (1982): Building explicit, meaningful descriptions of physical
objects from images.
Trucco & Verri (1998): Computing 3D properties from one or more images.
Stockman & Shapiro (2001): Making useful decisions about objects or scenes from
images.
4. What is the scope of Computer Vision?
Perception & Interpretation – Identifying objects, scenes, and activities.
Measurement – Extracting 3D properties from images.
Search & Organization – Managing and retrieving visual content efficiently.
5. How does human vision differ from computer vision?
Computers: Excel in simple, repetitive tasks when trained with large datasets.
Humans: Better at complex, context-based tasks.
Trend: AI is rapidly improving, reducing the gap.
6. Why is computer vision hard?
Incomplete information – Lighting, camera settings, object shape are often
unknown.
Inverse problem – Inferring a 3D world from 2D images.
High-dimensional data – Requires heavy computation.
No complete model of human vision – We still don’t fully understand how humans
see.
Context dependency – Importance of elements in an image varies by situation.
7. Why is computer vision important?
Biological relevance: ~50% of the macaque brain and ~1/3 of the human brain are dedicated
to vision.
Data availability: Billions of images and videos uploaded daily (Flickr, Instagram,
YouTube).
Practical impact: Applications across healthcare, industry, security, entertainment,
and transportation.
8. List major applications of computer vision.
Consumer Electronics: QR codes, panorama, night mode, FaceID.
Healthcare: Medical imaging, diagnosis, surgery planning.
Security: Biometrics, CCTV monitoring.
Industry: Robotics, inspection, OCR, number plate recognition.
Transportation: Self-driving cars, driver monitoring.
Space: Mars rovers, telescopes.
Entertainment: AR/VR, VFX, sports replay.
Retail: Cashier-less checkout, theft detection.
Creative: GAN-based image/video generation, photo editing.
9. What are key research areas in computer vision?
Reconstruction: 3D/4D scene recovery.
Recognition: Object detection, face recognition, classification.
Generation: Image translation, GANs.
Visual Search: Content-based retrieval.
10. Why study computer vision?
Massive data availability from internet and sensors.
Broad applications across domains like healthcare, robotics, security, entertainment,
etc.
Critical role in automation, AI, and human-computer interaction.
A. Short Answer / Definitions
1. Define spatial resolution and intensity resolution.
Spatial Resolution: The number of pixels used to represent an image — determines
how much detail can be captured.
Example: A 1920×1080 image has higher spatial resolution than a 640×480 image.
Intensity Resolution: The number of possible intensity levels (shades of gray or
color) a pixel can have.
Example: 8-bit intensity → 256 levels (0–255).
2. What is brightness adaptation?
The ability of the human eye to adjust its sensitivity based on the brightness level of the
environment, allowing vision in both dark and bright conditions.
3. Give examples of reflectance values.
Reflectance is the ratio of light reflected by an object to the incident light.
Fresh snow: ~0.9 (high reflectance)
Green grass: ~0.25
Asphalt: ~0.05 (low reflectance)
4. Define Bayer filter and demosaicing.
Bayer Filter: A color filter array on an image sensor with a pattern of red, green, and
blue filters (50% green, 25% red, 25% blue) to mimic human vision sensitivity.
Demosaicing: The process of reconstructing a full RGB image from the incomplete
color samples captured through the Bayer filter.
B. Formula-based
5. Use f(x, y) = i(x, y) × r(x, y) with given values.
If i(x, y) = 0.7 and r(x, y) = 0.4, then:
f(x, y) = 0.7 × 0.4 = 0.28
(Interpretation: the pixel intensity is 28% of the maximum possible brightness.)
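A minimal Python sketch of this image-formation model (NumPy assumed; the array names and values are illustrative, not taken from the notes):

import numpy as np

# Illumination and reflectance as small 2x2 arrays (illustrative values in [0, 1])
i = np.array([[0.7, 0.7],
              [0.5, 0.5]])   # illumination i(x, y)
r = np.array([[0.4, 0.9],
              [0.4, 0.9]])   # reflectance r(x, y)

f = i * r                    # image formation: f(x, y) = i(x, y) * r(x, y)
print(f)                     # f[0, 0] = 0.28, i.e. 28% of maximum brightness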
6. Calculate image storage size:
Formula:
b = M × N × k
Where:
M = image width in pixels
N = image height in pixels
k = bits per pixel
Example: For a 1920×1080, 24-bit RGB image:
b = 1920 × 1080 × 24 = 49,766,400 bits = 6,220,800 bytes ≈ 6.22 MB (≈ 5.93 MiB).
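A small Python helper for this calculation (the function name and arguments are my own, shown only to illustrate the formula):

def storage_size_bits(width, height, bits_per_pixel):
    """b = M * N * k, in bits."""
    return width * height * bits_per_pixel

bits = storage_size_bits(1920, 1080, 24)    # 24-bit RGB
print(bits)                                 # 49,766,400 bits
print(bits / 8)                             # 6,220,800 bytes
print(bits / 8 / 1e6)                       # ~6.22 MB (decimal)
print(bits / 8 / 2**20)                     # ~5.93 MiB (binary)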
C. Conceptual
7. Why is RGB used instead of VIBGYOR?
RGB matches the three types of cone cells in the human eye (sensitive to red, green, and
blue wavelengths), making image capture and display more efficient. VIBGYOR is a
continuous spectrum and would require more sensors and higher processing cost.
8. Explain BRDF with an example.
BRDF (Bidirectional Reflectance Distribution Function): Describes how light is
reflected from a surface depending on the incident and viewing directions.
Example: Matte paper scatters light in many directions (diffuse reflection), while a
mirror reflects light mostly in one direction (specular reflection).
9. Difference between CCD and CMOS.
CCD (Charge-Coupled Device) vs CMOS (Complementary Metal-Oxide Semiconductor):
Power usage: CCD higher, CMOS lower.
Image quality: CCD generally better in low light, CMOS slightly noisier.
Cost: CCD more expensive, CMOS cheaper.
Speed: CCD slower, CMOS faster.
D. Application-oriented
10. Suitable spectrum for detecting bone fractures.
X-ray spectrum (high-energy electromagnetic radiation) is ideal because it penetrates soft
tissue but is absorbed by bones, making fractures visible.
11. Why photographers prefer DSLRs despite good smartphones.
Larger sensors → better low-light performance
Interchangeable lenses for varied shooting styles
Faster autofocus & burst shooting
Higher dynamic range and control over settings
E. Diagram-based
12. Label human eye diagram for image formation:
Should include: Cornea, Pupil, Lens, Retina, Optic Nerve, Fovea.
13. Draw image sensing pipeline:
Typical blocks:
Scene → Lens → Image Sensor → ADC (Analog to Digital Conversion) → Image
Processing → Storage/Display.
14. Show Bayer grid arrangement:
G R G R G R
B G B G B G
G R G R G R
B G B G B G
(G = Green, R = Red, B = Blue)
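One way to sanity-check this pattern in Python (NumPy assumed; the snippet only tiles the 2×2 GR/BG unit and counts colours, it is not a demosaicing routine):

import numpy as np

# 2x2 Bayer tile (GRBG ordering, matching the grid above)
tile = np.array([['G', 'R'],
                 ['B', 'G']])

bayer = np.tile(tile, (3, 3))          # 6x6 mosaic, as in the grid above
for colour in ('G', 'R', 'B'):
    frac = np.mean(bayer == colour)
    print(colour, frac)                # G -> 0.5, R -> 0.25, B -> 0.25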
A. Short Answer / Definitions
1. Define spatial resolution and intensity resolution.
Spatial Resolution: Refers to the number of pixels per unit area in an image. Higher
spatial resolution means finer detail.
Intensity Resolution: The number of different intensity levels (gray levels) a pixel
can have, determined by bit depth.
2. What is brightness adaptation?
Brightness adaptation is the human eye’s ability to adjust to different levels of
illumination, allowing vision in both bright and dark conditions.
3. Give examples of reflectance values.
Fresh snow: ~90%
Asphalt road: ~10%
Green grass: ~25%
White paper: ~80%
4. Define Bayer filter and demosaicing.
Bayer Filter: A color filter array with a pattern of red, green, and blue filters used in
most digital cameras.
Demosaicing: The process of interpolating missing color information from the Bayer
pattern to produce a full-color image.
B. Formula-based
5. Use f(x, y) = i(x, y) × r(x, y) with given values.
If i(x, y) = 200 (illumination) and r(x, y) = 0.5 (reflectance), then:
f(x, y) = 200 × 0.5 = 100 (image intensity).
6. Calculate image storage size:
Formula: b = M × N × k
Example: For an image of 1024 × 768 pixels and k = 8 bits per pixel:
b = 1024 × 768 × 8 = 6,291,456 bits = 786,432 bytes (≈ 0.75 MiB).
C. Conceptual
7. Why is RGB used instead of VIBGYOR?
RGB corresponds to the three primary colors detected by human cone cells. Mixing
RGB can produce all visible colors, making it efficient for digital imaging, unlike
VIBGYOR which has seven distinct wavelengths.
8. Explain BRDF with an example.
BRDF (Bidirectional Reflectance Distribution Function) describes how light
reflects off a surface depending on the incident and viewing directions.
Example: A matte wall scatters light in many directions (diffuse), while a mirror
reflects light in one direction (specular).
9. Difference between CCD and CMOS sensors:
Power consumption: CCD higher, CMOS lower.
Image quality: CCD better low-light performance, CMOS slightly more noise.
Cost: CCD more expensive, CMOS cheaper.
Speed: CCD slower, CMOS faster.
D. Application-oriented
10. Suitable spectrum for detecting bone fractures:
X-ray spectrum is used as it penetrates soft tissue but is absorbed by bones,
highlighting fractures.
11. Why photographers prefer DSLRs despite good smartphones:
Larger sensors, better low-light performance, interchangeable lenses, faster focusing,
and higher dynamic range.
E. Diagram-based
12. Human Eye Diagram for Image Formation
📌 (Insert labeled diagram: cornea, pupil, iris, lens, retina, optic nerve)
13. Image Sensing Pipeline
📌 (Insert flowchart: Scene → Lens → Sensor → ADC → Image Processing → Storage)
14. Bayer Grid Arrangement
📌 (Insert diagram showing 2×2 pattern: G R / B G repeating)
A. Short Answer / Definitions
1. Define pixel and megapixel.
2. What is the difference between grayscale and binary images?
3. Define histogram equalization.
4. What is aliasing in digital images?
5. Give an example of an active vision system.
B. Formula-based
6. Calculate the total number of bits required to store a 1920 × 1080 RGB image with 8
bits per channel.
7. Using the convolution formula, compute the output for a given 3×3 image patch and
kernel.
8. Find the aspect ratio of an image given M = 400 pixels, N = 300 pixels.
9. Calculate spatial resolution given image dimensions and sensor size.
10. Apply the sampling theorem to determine the minimum sampling rate for a given
spatial frequency.
C. Conceptual
11. Why is the Fourier Transform important in image processing?
12. Explain the difference between point processing and neighborhood processing.
13. What are the advantages of using the HSV color model over RGB for object
detection?
14. Describe the working of a pinhole camera model.
15. How does gamma correction improve image quality?
D. Application-oriented
16. Which part of the electromagnetic spectrum is used in night vision cameras and why?
17. Explain how LiDAR assists in self-driving cars.
18. Give an example of computer vision in agriculture.
19. Why is thermal imaging preferred for fire detection?
20. Suggest an application of stereo vision.
E. Diagram-based
21. Draw and label the electromagnetic spectrum with visible light highlighted.
22. Sketch the pinhole camera model.
23. Draw a block diagram of an OCR (Optical Character Recognition) system.
24. Show the working steps of Canny Edge Detection as a flowchart.
25. Draw the RGB color cube.
Answers to the Question Set Above
A. Short Answer / Definitions
1. Pixel and Megapixel
o Pixel: The smallest unit of an image that stores color/intensity information.
o Megapixel: One million pixels (used to describe camera resolution).
2. Grayscale vs Binary Images
o Grayscale: Each pixel has an intensity value (0–255 for 8-bit images).
o Binary: Each pixel is either black (0) or white (1).
3. Histogram Equalization
A method to improve image contrast by redistributing intensity values so they are
more evenly spread across the histogram.
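A minimal NumPy sketch of the idea (mapping intensities through the normalized cumulative histogram; the function and variable names are illustrative):

import numpy as np

def equalize(gray):
    """Histogram equalization for an 8-bit grayscale image (2D uint8 array)."""
    hist = np.bincount(gray.ravel(), minlength=256)    # intensity histogram
    cdf = hist.cumsum()                                # cumulative distribution
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize to [0, 1]
    lut = np.round(cdf * 255).astype(np.uint8)         # lookup table
    return lut[gray]                                   # remap every pixel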
4. Aliasing in Digital Images
Distortion or artifacts in an image due to insufficient sampling (e.g., jagged edges in
diagonal lines).
5. Example of Active Vision System
LiDAR scanners in self-driving cars (emit signals and measure reflections).
B. Formula-based
6. Image Storage Calculation
Formula:
Size in bits = M × N × k
Where M = 1920, N = 1080, k = 24 bits (8 bits × 3 channels).
Size = 1920 × 1080 × 24 = 49,766,400 bits = 6,220,800 bytes ≈ 6.22 MB.
7. Convolution Example
Apply kernel element-by-element, multiply and sum values — standard convolution
process (show working in exam).
8. Aspect Ratio
Formula:
Aspect Ratio = M / N = 400 / 300 = 4 : 3
9. Spatial Resolution
Formula:
Resolution = (number of pixels) / (sensor dimension in mm)
Example: 4000 pixels over a 36 mm sensor → 111.1 pixels/mm.
10. Nyquist Sampling Theorem
Minimum sampling rate = 2 × max spatial frequency.
C. Conceptual
11. Importance of Fourier Transform
Converts an image from spatial domain to frequency domain — useful for filtering
and compression.
12. Point vs Neighborhood Processing
Point: Pixel processed independently (e.g., brightness adjustment).
Neighborhood: Pixel value depends on surrounding pixels (e.g., blurring, edge
detection).
13. HSV vs RGB for Object Detection
HSV separates color (Hue) from brightness (Value), making it less sensitive to
lighting changes than RGB.
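A hedged OpenCV sketch of colour thresholding in HSV (the hue band below is an illustrative "red-ish" range, not a value from the notes; OpenCV stores hue in 0–179):

import cv2
import numpy as np

img = cv2.imread('scene.jpg')                  # path is a placeholder
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)     # OpenCV loads images as BGR

lower = np.array([0, 100, 100])                # illustrative lower bound (H, S, V)
upper = np.array([10, 255, 255])               # illustrative upper bound
mask = cv2.inRange(hsv, lower, upper)          # binary mask of pixels in range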
14. Pinhole Camera Model
Light passes through a tiny hole and forms an inverted image on the opposite side of a
closed box — basic camera principle.
15. Gamma Correction
Adjusts brightness using a power-law curve to match human perception.
D. Application-oriented
16. Night Vision Spectrum
Infrared spectrum — detects heat even in low light.
17. LiDAR in Self-driving Cars
Measures distances to objects by emitting laser pulses — creates a 3D map of
surroundings.
18. Computer Vision in Agriculture
Detecting crop diseases from leaf images using deep learning.
19. Thermal Imaging for Fire Detection
Captures heat signatures — works through smoke and darkness.
20. Stereo Vision Application
Depth measurement for robotic navigation.
E. Diagram-based (Describe for you to draw)
21. Electromagnetic Spectrum — Draw a horizontal band:
Left: Gamma, X-ray, UV
Middle: Visible light (VIBGYOR)
Right: Infrared, Microwave, Radio.
22. Pinhole Camera — Box with a small hole and inverted image projection.
23. OCR Block Diagram — Image capture → Preprocessing → Character segmentation
→ Feature extraction → Classification → Output text.
24. Canny Edge Detection Flowchart — Input → Gaussian smoothing → Gradient
calculation → Non-maximum suppression → Double threshold → Edge tracking.
25. RGB Color Cube — 3D cube with RGB axes (0–255).
A. Short Answer / Definitions
1) Spatial resolution vs. intensity resolution
Spatial resolution: Number of pixels used to sample the scene (e.g., 1920×1080).
Higher → finer detail.
Intensity resolution (radiometric resolution/bit depth): Number of discrete
intensity levels per pixel (e.g., 8-bit → 256 levels, 0–255).
2) Brightness adaptation
The human visual system’s ability to adjust sensitivity across wide illumination levels so we
can see in bright daylight and at night.
3) Examples of reflectance values
Approximate diffuse reflectance: fresh snow ≈ 0.9, white paper ≈ 0.8, green grass ≈ 0.25,
asphalt/charcoal ≈ 0.05.
4) Bayer filter & demosaicing
Bayer filter: Mosaic CFA with 50% G, 25% R, 25% B arranged in 2×2 tiles (GR/
BG).
Demosaicing: Interpolating missing color samples to reconstruct a full-RGB image.
B. Formula-Based
5) Image formation model
f(x, y) = i(x, y) × r(x, y)
Example: If i = 0.7 and r = 0.4 ⇒ f = 0.28 (28% of the maximum).
6) Image storage size
b = M × N × k (bits). For bytes, divide by 8.
Example (1920×1080, 24-bit RGB):
b = 1920 × 1080 × 24 = 49,766,400 bits
= 6,220,800 bytes ≈ 6.22 MB (decimal) ≈ 5.93 MiB (binary).
C. Conceptual
7) Why RGB instead of VIBGYOR?
Human color vision is trichromatic (three cone types ≈ R/G/B). RGB efficiently reproduces
colors with three channels; VIBGYOR (7 bands) would be redundant and costly.
8) BRDF (Bidirectional Reflectance Distribution Function)
Gives reflected radiance as a function of incoming/outgoing directions.
Example:
Lambertian (matte paper): uniform reflection → looks same from many angles.
Specular (mirror): reflection concentrated around mirror direction.
9) CCD vs. CMOS
Power: CCD higher, CMOS lower.
Readout: CCD shifts charge across the chip; CMOS reads out pixel-wise, in parallel.
Image noise (typical): CCD lower; CMOS slightly higher (modern CMOS is very good).
Speed / on-chip features: CCD slower, fewer; CMOS faster, many.
Cost: CCD higher, CMOS lower.
D. Application-Oriented
10) Spectrum for bone fractures
X-ray (penetrates soft tissue, absorbed by bone).
11) Why many photographers still prefer DSLRs/ILCs
Bigger sensors, better low-light and dynamic range, interchangeable lenses, reliable AF,
faster bursts, robust ergonomics and controls.
E. Diagram-Based (ready to redraw/insert)
12) Human eye – label: Cornea, Iris, Pupil, Lens, Retina, Fovea (macula), Optic Nerve.
13) Image sensing pipeline: Scene → Lens → Image Sensor → ADC → Demosaicing/ISP
→ Compression/Storage/Display.
14) Bayer grid (2×2 tile repeats):
G R G R G R
B G B G B G
G R G R G R
B G B G B G
New Set – Answers
A. Short Answer / Definitions
1) Pixel & megapixel
Pixel = smallest image element; 1 MP = 10^6 pixels.
2) Grayscale vs. binary
Grayscale: 0–255 (8-bit) intensities; Binary: two levels (0/1).
3) Histogram equalization
Contrast enhancement by redistributing intensities to approximate a uniform histogram.
4) Aliasing
Artifacts (jagged edges, moiré) from undersampling high spatial frequencies.
5) Active vision example
LiDAR, structured light (e.g., Kinect), time-of-flight cameras.
B. Formula-Based
6) Size of a 1920×1080 RGB, 8-bit/channel
Bits = 1920 × 1080 × 3 × 8 = 49,766,400 bits
= 6,220,800 bytes ≈ 6.22 MB (decimal).
7) 2D convolution (worked example)
Image patch I and kernel K:
I =
1 2 1
0 1 0
2 1 2
K =
0  1 0
1 -4 1
0  1 0
Because K is symmetric, flipping it for convolution changes nothing, so the output at the centre is the sum of the element-wise products:
(2 × 1) + (1 × (-4)) + (1 × 1) = 2 - 4 + 1 = -1. (Show the full grid of products if space allows.)
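The same sum can be checked with a couple of lines of NumPy (correlation and convolution agree here because this Laplacian-style kernel is symmetric):

import numpy as np

I = np.array([[1, 2, 1],
              [0, 1, 0],
              [2, 1, 2]])
K = np.array([[0,  1, 0],
              [1, -4, 1],
              [0,  1, 0]])

print(np.sum(I * K))   # -1, the output value at the centre pixel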
8) Aspect ratio (M=400, N=300)
400 : 300 = 4 : 3.
9) Spatial resolution (pixels/mm)
pixels/mm = (number of pixels along width) / (sensor width in mm)
Example: 6000 px over 36 mm ⇒ 166.7 px/mm.
10) Sampling theorem
Sample ≥ 2× highest spatial frequency (Nyquist) to avoid aliasing.
C. Conceptual
11) Why Fourier Transform?
Reveals frequency content; simplifies filtering (low/high-pass), deconvolution, compression.
12) Point vs neighborhood processing
Point: operate per pixel independently (e.g., gamma).
Neighborhood: use neighbors (e.g., blur, Sobel).
13) HSV advantages for detection
Hue is less sensitive to illumination; easier color thresholding than RGB.
14) Pinhole camera model
Perspective projection: x = f·X/Z, y = f·Y/Z on the image plane; forms an inverted image.
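A tiny NumPy-style Python sketch of this projection (the focal length and 3-D point are made-up numbers for illustration):

f = 0.05                     # focal length in metres (illustrative)
X, Y, Z = 1.0, 0.5, 4.0      # a 3-D point in camera coordinates (metres)

x = f * X / Z                # x = f * X / Z
y = f * Y / Z                # y = f * Y / Z
print(x, y)                  # projected image-plane coordinates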
15) Gamma correction
Intensity mapping I_out = I_in^(1/γ) to match human perception and display characteristics.
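A NumPy sketch of gamma correction on an 8-bit grayscale image (the gamma value and function name are illustrative):

import numpy as np

def gamma_correct(gray, gamma=2.2):
    """Apply I_out = I_in^(1/gamma) to an 8-bit grayscale image."""
    norm = gray.astype(np.float64) / 255.0        # scale to [0, 1]
    corrected = np.power(norm, 1.0 / gamma)       # power-law mapping
    return np.round(corrected * 255).astype(np.uint8)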
D. Application-Oriented
16) Night vision
Infrared (near-IR for active IR cameras; long-wave IR/thermal for heat).
17) LiDAR role
Emits laser pulses, measures time-of-flight → precise 3D point cloud for detection, tracking,
localization.
18) Agriculture CV example
Leaf disease detection, weed/crop segmentation, yield estimation via drone imagery.
19) Thermal imaging for fire
Captures heat signatures through smoke/darkness; early detection and localization.
20) Stereo vision application
Depth estimation for obstacle avoidance, 3D mapping, pose estimation.
E. Diagram-Based (what to draw)
21) Electromagnetic spectrum
Left→Right: Gamma | X-ray | UV | Visible (VIBGYOR) | IR | Microwave | Radio. Mark
wavelengths.
22) Pinhole camera sketch
Box with tiny hole; rays from top of object go through hole to bottom of image plane →
inverted image.
23) OCR block diagram
Capture → Preprocess (denoise, binarize, deskew) → Segmentation → Feature
extraction/Deep features → Classifier (CNN/RNN) → Post-processing (spell/LM) → Text.
24) Canny edge flow
Input → Gaussian blur → Gradient (Sobel) → Non-max suppression → Double thresholds
→ Hysteresis linking → Edges.
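A short OpenCV sketch of the same pipeline (the thresholds 100/200 are illustrative values, not from the notes; cv2.Canny performs the gradient, non-maximum suppression and hysteresis steps internally):

import cv2

gray = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)   # path is a placeholder
blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)          # Gaussian smoothing
edges = cv2.Canny(blurred, 100, 200)                   # double thresholds + hysteresis
cv2.imwrite('edges.png', edges)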
25) RGB color cube
3D axes R,G,B from 0–255; corners: (0,0,0)=black, (255,255,255)=white, axis endpoints are
primaries.
Bonus quick formulas & facts (last-minute glance)
Bits to MB: MB (decimal) = bits / (8×10^6); MiB (binary) = bits / (8×2^20).
SNR improves by √N for averaging N identical noisy frames.
8-bit gray levels: 256; 10-bit: 1024.
PSNR: 10 · log10(MAX² / MSE) dB (MAX = 255 for 8-bit images).
Common kernels:
o Box blur: (1/9) × (3×3 matrix of ones).
o Gaussian blur (normalized).
o Sobel X:
-1 0 1
-2 0 2
-1 0 1
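A small NumPy implementation of the PSNR formula above (for 8-bit images; the function name is my own):

import numpy as np

def psnr(original, distorted, max_val=255.0):
    """PSNR = 10 * log10(MAX^2 / MSE), in dB."""
    mse = np.mean((original.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')       # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)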
1. Define Image Histogram.
→ An image histogram is a graphical representation of the distribution of pixel
intensity values in an image.
2. What is Gamma Correction?
→ Gamma correction adjusts the brightness of an image by applying a nonlinear
transformation to pixel values.
3. Define Point Spread Function (PSF).
→ The PSF describes how a single point of light is recorded by an imaging system,
determining the level of blur.
4. What is an Edge in Image Processing?
→ An edge is a boundary between two regions with distinct intensity values.
B. Formula-Based
5. Calculate the number of pixels in a 1920 × 1080 image.
→ Pixels = 1920 × 1080 = 2,073,600 pixels.
6. Find storage size for a grayscale image of 512 × 512 pixels, 8 bits per pixel.
→ Size = 512 × 512 × 1 byte = 262,144 bytes (~256 KB).
7. If an RGB image is 1024 × 768 pixels, calculate its storage size (24 bits/pixel).
→ Size = 1024 × 768 × 3 bytes = 2,359,296 bytes (~2.25 MB).
C. Conceptual
8. Why is image normalization used?
→ To scale pixel values to a specific range (e.g., 0–1) for better contrast and
algorithm performance.
9. Difference between global and local thresholding.
o Global: One threshold value for the entire image.
o Local: Threshold varies across regions based on local intensity.
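An OpenCV sketch contrasting the two (Otsu picks one global threshold; the adaptive call computes a local threshold per neighbourhood; block size 11 and offset 2 are illustrative):

import cv2

gray = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)    # path is a placeholder

# Global: one threshold (chosen by Otsu's method) for the whole image
_, global_bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Local (adaptive): threshold computed from each pixel's 11x11 neighbourhood
local_bw = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 11, 2)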
10. Explain convolution in image processing.
→ Convolution is a mathematical operation used to apply filters to images for
blurring, sharpening, or edge detection.
D. Application-Oriented
11. Which imaging technique is suitable for detecting tumors?
→ MRI (Magnetic Resonance Imaging) — provides detailed soft tissue visualization.
12. Why is infrared imaging used in night vision?
→ Infrared can detect heat signatures, allowing visibility in low-light or dark
environments.
E. Diagram-Based
13. Draw a basic convolution operation diagram.
14. Show an image histogram for a low-contrast image.
15. Sketch the electromagnetic spectrum highlighting visible light and IR.
Short Answer / Definitions
Define illumination and reflectance in the context of image formation.
What is the typical reflectance of black velvet and snow?
Define lumen and lumen/m².
What is colour constancy and why is it important in vision systems?
What does spatial resolution measure?
What is intensity resolution?
Define specular reflection and diffuse reflection with examples.
What is demosaicing?
2. Formula-Based & Numerical
Using f(x, y) = i(x, y) · r(x, y), calculate f(2, 3) given i(2, 3) = 100 lm/m² and r(2, 3) = 0.8.
An image is 1024 × 768 pixels, 8-bit depth. Calculate the storage in bits and bytes.
If spatial resolution is 300 dpi, how many pixels are in a 4 × 6 inch image?
Given a Bayer filter arrangement, estimate how many green pixels are in a 12 MP
image.
3. Conceptual
Why are green pixels more abundant than red or blue in the Bayer filter?
Explain why shorter wavelengths have higher energy.
Differentiate between CCD and CMOS in terms of quality, cost, and power usage.
Why might CCDs produce less noise than CMOS sensors?
Describe how human cone cells determine the RGB model.
How does surface geometry affect image brightness?
4. Application-Oriented
Which part of the EM spectrum would you use for:
a) Detecting hidden cracks in metal objects
b) Night vision cameras
c) MRI scanning
Why is colour processing sometimes skipped in image analysis?
In what scenarios is monochrome imaging better than colour imaging?
Explain why camera sensors in smartphones can still produce high-quality images
despite small size.
5. Diagram / Labeling
Label the main parts of the human eye and indicate where the image forms.
Draw and label a Bayer grid.
Sketch a basic image sensing pipeline.
Draw the image sampling and quantization process.
Answers to Extra Possible Exam Questions
1. Short Answer / Definitions
Illumination: Amount of light incident on a surface, i(x, y), measured in lm/m².
Reflectance: Proportion of incident light reflected by a surface, r(x, y), dimensionless (0–1).
Reflectance values: Black velvet = 0.01, Snow = 0.93.
Lumen: Unit of luminous flux; total amount of visible light emitted by a source.
Lumen/m² (Lux): Illuminance — amount of light per unit area.
Colour constancy: Ability to perceive colours consistently under varying
illumination.
Spatial resolution: Measure of the smallest discernible detail in an image; usually in
dpi (dots per inch).
Intensity resolution: Smallest detectable change in brightness; measured in bits per
pixel.
Specular reflection: Mirror-like reflection where light is reflected in a single
direction (e.g., polished metal).
Diffuse reflection: Scattered reflection in many directions (e.g., matte paper).
Demosaicing: Process of reconstructing a full RGB image from incomplete colour
samples from a Bayer filter.
2. Formula-Based & Numerical
1. f(x, y) = i(x, y) · r(x, y)
Given i(2, 3) = 100 and r(2, 3) = 0.8:
f(2, 3) = 100 × 0.8 = 80 (brightness units).
2. Image storage:
M × N × k bits = 1024 × 768 × 8 = 6,291,456 bits = 786,432 bytes (≈ 0.75 MiB).
3. Pixels in a 4 × 6 inch image at 300 dpi:
(4 × 300) × (6 × 300) = 1200 × 1800 = 2,160,000 pixels.
4. Green pixels in a 12 MP image:
The Bayer filter is 50% green.
12 MP × 0.5 = 6 MP green pixels.
3. Conceptual
More green pixels: Human eyes are most sensitive to green wavelengths (M-cones),
improving perceived sharpness.
Shorter wavelength = higher energy: From E = hν and ν = c/λ, decreasing λ increases ν and hence the energy.
CCD vs CMOS:
o CCD: Better quality, less noise, high power usage, expensive.
o CMOS: Cheaper, low power, more noise.
Why CCD less noisy: Charge transfer in CCDs happens in a controlled way before
amplification, reducing random noise.
Human cone cells & RGB: Long (red), Medium (green), Short (blue) cones → RGB
model.
Surface geometry & brightness: Flat perpendicular surfaces receive more light;
oblique surfaces get less illumination.
4. Application-Oriented
EM spectrum uses:
a) Hidden cracks in metal: X-rays or ultrasound.
b) Night vision: Infrared.
c) MRI scanning: Radio waves (in magnetic fields).
Skipping colour processing: For tasks where texture, shape, or intensity is more
important (e.g., edge detection).
Monochrome better: In low light, scientific imaging, or when needing higher spatial
resolution.
Smartphone image quality: Computational photography + AI enhancement + good
lens coatings compensate for smaller sensors.
5. Diagram / Labeling
(Diagrams not shown here but would be drawn in exam)
Human eye: Cornea, lens, retina, fovea, optic nerve.
Bayer grid: 2×2 block → G R / B G pattern.
Image sensing pipeline: Light → Lens → Sensor (CCD/CMOS) → ADC → Image
processor → Storage.
Sampling & quantization: Continuous image → sampled grid → discrete pixel
values.