Image processing with OpenCV and Python
Kripasindhu Sarkar
kripasindhu.sarkar@dfki.de
Kaiserslautern University,
DFKI – Deutsches Forschungszentrum für Künstliche Intelligenz
http://av.dfki.de
Some of the contents are taken from
Slides from Didier Stricker, SS16
Slides from Rahul Sukthankar, CMU
Images from OpenCV website
Example from Stanford CS231n
Outline
● Motivation - What is an Image?
● Tools required for Image processing
● Introduction to OpenCV
○ Intro
○ Installation + usage
○ Modules
○ …
● Introduction to Python
○ Numpy
○ Broadcasting
○ Simple maths
● Example applications
○ Edge detector
○ Thresholding, histogram normalization, etc
○ Filters
● Filters/Convolution
○ Background
○ maths
What is an image?
Image from Rahul Sukthankar, CMU
What is an image?
• 2D array of pixels
• Binary image (bitmap)
• Pixels are bits
• Grayscale image
• Pixels are scalars
• Typically 8 bits (0..255)
• Color images
• Pixels are vectors
• Order can vary: RGB, BGR
• Sometimes includes Alpha
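The image types listed above map directly onto numpy arrays. The following is a minimal sketch, assuming 8-bit pixels and the BGR channel order that OpenCV uses by default; the array sizes and pixel values are arbitrary illustrations.

```python
import numpy as np

# Binary image: one boolean per pixel.
binary = np.zeros((4, 6), dtype=bool)
binary[1:3, 2:5] = True

# Grayscale image: one 8-bit scalar per pixel, 0 (black) to 255 (white).
gray = np.zeros((4, 6), dtype=np.uint8)
gray[1:3, 2:5] = 200

# Color image: one 3-vector per pixel. The channel order is a convention;
# BGR is assumed here.
bgr = np.zeros((4, 6, 3), dtype=np.uint8)
bgr[1:3, 2:5] = (255, 0, 0)          # pure blue in BGR order

# Reversing the last axis converts BGR to RGB (and back).
rgb = bgr[..., ::-1]

# An optional fourth channel stores alpha (opacity).
alpha = np.full((4, 6, 1), 255, dtype=np.uint8)
bgra = np.concatenate([bgr, alpha], axis=2)

print(binary.shape, gray.shape, bgr.shape, bgra.shape)
print(bgr[1, 2], rgb[1, 2])
```

The same pixel reads as [255 0 0] in BGR and [0 0 255] in RGB, which is why mixing libraries with different channel conventions swaps red and blue.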
Slide from Rahul Sukthankar, CMU
Tools required for Image processing
● Good data structure for representing Grids/Matrix
● Efficient Operations on Matrix
○ Matrix multiplications
○ Broadcasting
○ Inverse… etc
○ Eigen analysis
○ SVD
● IO
○ Reading/writing of images and videos
● GUI
● Machine learning support
○ Optimization algorithms
○ Gradient descent etc
● Image processing algorithms
○ Filters/Convolutions
○ Transforms
○ Histogram Equalization
○ Specific CV algorithms - Canny Edge detectors, Flood filling, Scale space, Contours, Features, etc.
Introduction to OpenCV
● OpenCV stands for the Open Source Computer Vision Library.
● Founded at Intel in 1999
● OpenCV is free for commercial and research use under a BSD license.
● The library runs on many platforms and actively supports Linux, Windows and Mac OS.
● OpenCV was founded to advance the field of computer vision.
● It gives everyone a reliable, real-time infrastructure to build on, and collects the most useful algorithms.
OpenCV Algorithm Modules Overview
● HighGUI: I/O, Interface
● Algorithm modules: Image Processing, Transforms, Fitting, Optical Flow, Tracking, Segmentation, Calibration, Features, VSLAM, Depth/Pose/Normals/Planes/3D Features, Object recognition, Machine learning, Computational Photography
● Robot support
● CORE: Data structures, Matrix math, Exceptions etc
Slide from G. Bradski
OpenCV Overview: > 500 functions
opencv.willowgarage.com
● General Image Processing Functions
● Image Pyramids
● Geometric Descriptors
● Segmentation
● Camera Calibration, Stereo, 3D
● Features
● Transforms
● Utilities and Data Structures
● Tracking
● Machine Learning: Detection, Recognition
● Fitting
● Matrix Math
Slide Courtesy OpenCV Tutorial Gary Bradski
OpenCV Conceptual Structure
(Layer diagram; labels recovered from the figure, some entries marked TBD)
● Modules: CORE, HighGUI, imgproc, Machine learning, Calib3d, Stitching, Objdetect, Features2d, VO, SLAM
● User Contrib
● Languages: C, C++, Python, Java (TBD), Other Languages
● SSE, TBB, GPU, MPU
● Operating system
Slide from D. Stricker
OpenCV – Getting Started
▶ Download OpenCV
http://opencv.org
▶ Online Reference:
http://docs.opencv.org
▶ Books?
OpenCV – Installation
● Installation instruction
● https://docs.opencv.org/3.0.0/d7/d9f/tutorial_linux_install.html
● Build from Source
○ Install the dependencies
○ Download the code (OpenCV)
○ Configure and Compile the code
○ Install
CMake Introduction
● “CMake is an open-source, cross-platform family of tools designed to
build, test and package software”
○ Cross-platform project generator: CPP/C code → CMake → Makefile (Unix) or Visual Studio solution (Windows)
● Resources -
● Contents in the official website:
● https://cmake.org/cmake/help/v3.11/manual/cmake-buildsystem.7.html
OpenCV – Installation
● Installation instruction
● https://docs.opencv.org/3.0.0/d7/d9f/tutorial_linux_install.html
● Build from Source
○ Install the dependencies
○ Download the code (OpenCV)
○ Configure and Compile the code
○ Install
(make sure the installed libraries and executables are on your default paths)
○ (make sure cv2.so is in default PYTHONPATH)
OpenCV – Usage
DisplayImage.cpp:
#include <stdio.h>
#include <opencv2/opencv.hpp>
using namespace cv;
int main(int argc, char** argv )
{
    if ( argc != 2 )
    {
        printf("usage: DisplayImage.out <Image_Path>\n");
        return -1;
    }
    Mat image;
    image = imread( argv[1], 1 );
    if ( !image.data )
    {
        printf("No image data \n");
        return -1;
    }
    namedWindow("Display Image", WINDOW_AUTOSIZE );
    imshow("Display Image", image);
    waitKey(0);
    return 0;
}
CMakeLists.txt:
cmake_minimum_required(VERSION 2.8)
project( DisplayImage )
find_package( OpenCV REQUIRED )
include_directories( ${OpenCV_INCLUDE_DIRS} )
add_executable( DisplayImage DisplayImage.cpp )
target_link_libraries( DisplayImage ${OpenCV_LIBS} )
OpenCV – Core
● All the OpenCV classes and functions are placed into the
cv namespace
● core - compact module defining basic data structures
and basic functions used by all other modules
● Basic image class cv::Mat
○ (https://docs.opencv.org/3.1.0/d3/d63/classcv_1_1Mat.html)
Mat A, C; // creates just the header parts
A = imread(argv[1], IMREAD_COLOR); // here we'll know the method used (allocate matrix)
Mat B(A); // Use the copy constructor
C = A;
//Initializers….
Mat E = Mat::eye(4, 4, CV_64F);
Mat O = Mat::ones(2, 2, CV_32F);
Mat Z = Mat::zeros(3,3, CV_8UC1);
The Mat class
● Important things to know:
● Shallow copy: Mat A = B; does not copy data.
● Deep copy: clone() and/or B.copyTo(A); (for ROIs, etc).
● Most OpenCV functions can resize matrices if needed
● Lots of convenient functionality (Matrix expressions):
● s is a cv::Scalar, α scalar (double)
● Addition, scaling, ...: A±B, A±s, s±A, αA
● Per-element multiplication, division...: A.mul(B), A/B, α/A
● Matrix multiplication, dot, cross product: A*B, A.dot(B), A.cross(B)
● Transposition, inversion: A.t(), A.inv([method])
● And a few more.
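The shallow/deep copy distinction carries over to Python, where the OpenCV bindings represent a cv::Mat as a numpy array. The sketch below uses plain numpy to show the same behaviour; the variable names are illustrative.

```python
import numpy as np

a = np.zeros((3, 3), dtype=np.uint8)

b = a                # plain assignment: b is another name for the same array
roi = a[0:2, 0:2]    # slicing: a view (comparable to a Mat ROI), no data copy
c = a.copy()         # deep copy: c owns its own pixel buffer

b[0, 0] = 10         # visible through a, because a and b share data
roi[1, 1] = 20       # also visible through a
c[2, 2] = 30         # not visible through a

print(a)
print(np.shares_memory(a, roi), np.shares_memory(a, c))
```

After running this, a holds 10 and 20 but not 30, so any function that must not modify its input should work on a copy.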
HighGUI
HighGUI
Image I/O, rendering
Processing keyboard and other events, timeouts
Trackbars
Mouse callbacks
Video I/O
HighGUI
Example functions
• void cv::namedWindow(const string& winname, int flags=WINDOW_AUTOSIZE);
– Creates window accessed by its name. Window handles repaint, resize events.
Its position is remembered in registry.
• void cv::destroyWindow(const string& winname);
• void cv::imshow(const string& winname, cv::Mat& mat);
– Copies the image to window buffer, then repaints it when necessary.
{8u|16s|32s|32f}{C1|3|4} are supported.
HighGUI
• Mat imread(const string& filename, int flags=1);
– loads image from file, converts to color or grayscale if needed,
and returns it (or returns an empty cv::Mat() on failure).
– image format is determined by the file contents.
• bool imwrite(const string& filename, Mat& image);
– saves image to file, image format is determined from
extension.
• Example: convert JPEG to PNG
– cv::Mat img = cv::imread("picture.jpeg");
– if(!img.empty()) cv::imwrite("picture.png", img);
HighGUI: Creating Interfaces I
● Start off by creating a program that continuously reads frames from a video source (here a video file; a camera can be opened by passing a device index such as 0)
#include <opencv2/opencv.hpp>
int main( int argc, char* argv[] ) {
    cv::VideoCapture capture("filename.avi");
    if (!capture.isOpened()) return 1;
    cv::Mat frame;
    while (true) {
        capture >> frame;
        if (frame.empty()) break;   // end of stream or read failure
        // process the frame here
    }
    capture.release();
    return 0;
}
Python and Numpy
• Python is a high-level, dynamically typed multiparadigm programming language.
• Python code is often said to be almost like pseudocode, since it allows you to express very powerful
ideas in very few lines of code while being very readable.
Example:
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3,6,8,10,1,2,1]))
# Prints "[1, 1, 2, 3, 6, 8, 10]"
Python examples in this section are taken from Stanford CS231n
Python basic types and containers
• Basic types - integers, floats, booleans, and strings...
x=3
print(type(x)) # Prints "<class 'int'>"
print(x) # Prints "3"
print(x + 1) # Addition; prints "4"
• Containers - lists, dictionaries, sets, and tuples.
xs = [3, 1, 2] # Create a list
print(xs, xs[2]) # Prints "[3, 1, 2] 2"
print(xs[-1]) # Negative indices count from the end of the list; prints "2"
List comprehension
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]
print(squares) # Prints [0, 1, 4, 9, 16]
Python basic types and containers
• Dictionaries
d = {'cat': 'cute', 'dog': 'furry'} # Create a new dictionary with some data
print(d['cat']) # Get an entry from a dictionary; prints "cute"
d['fish'] = 'wet' # Set an entry in a dictionary
print(d['fish']) # Prints "wet"
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal in d:
    legs = d[animal]
    print('A %s has %d legs' % (animal, legs))
• Tuples
– an (immutable) ordered list of values
Python - Function
• Functions
def hello(name, loud=False):
    if loud:
        print('HELLO, %s!' % name.upper())
    else:
        print('Hello, %s' % name)
hello('Bob') # Prints "Hello, Bob"
hello('Fred', loud=True) # Prints "HELLO, FRED!"
Python - Numpy
• Arrays
– A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative
integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of
integers giving the size of the array along each dimension.
import numpy as np
a = np.array([1, 2, 3]) # Create a rank 1 array
print(type(a)) # Prints "<class 'numpy.ndarray'>"
print(a.shape) # Prints "(3,)"
print(a[0], a[1], a[2]) # Prints "1 2 3"
a[0] = 5 # Change an element of the array
print(a) # Prints "[5 2 3]"
b = np.array([[1,2,3],[4,5,6]]) # Create a rank 2 array
print(b.shape) # Prints "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0]) # Prints "1 2 4"
Python - Numpy
• Arrays - Slicing
import numpy as np
# Create the following rank 2 array with shape (3, 4)
# [[ 1 2 3 4]
# [ 5 6 7 8]
# [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
# [6 7]]
b = a[:2, 1:3]
# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
print(a[0, 1]) # Prints "2"
b[0, 0] = 77 # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1]) # Prints "77"
Python - Numpy
• Boolean array indexing
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
bool_idx = (a > 2) # Find the elements of a that are bigger than 2;
# this returns a numpy array of Booleans of the same
# shape as a, where each slot of bool_idx tells
# whether that element of a is > 2.
print(bool_idx) # Prints "[[False False]
# [ True True]
# [ True True]]"
# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx]) # Prints "[3 4 5 6]"
# We can do all of the above in a single concise statement:
print(a[a > 2]) # Prints "[3 4 5 6]"
Python - Numpy
• Array operations
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)
# Elementwise product; both produce the array
# [[ 5.0 12.0]
# [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))
# Elementwise square root; produces the array
# [[ 1. 1.41421356]
# [ 1.73205081 2. ]]
print(np.sqrt(x))
• Matrix multiplication - dot
x = np.array([[1,2],[3,4]])
v = np.array([9,10])
# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))
Python - Numpy
• Broadcasting
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v # Add v to each row of x using broadcasting
print(y) # Prints "[[ 2 2 4]
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]"
• Rules
• If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both
shapes have the same length.
• The two arrays are said to be compatible in a dimension if they have the same size in the dimension, or
if one of the arrays has size 1 in that dimension.
• The arrays can be broadcast together if they are compatible in all dimensions.
• After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of
shapes of the two input arrays.
• In any dimension where one array had size 1 and the other array had size greater than 1, the first array
behaves as if it were copied along that dimension
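The rules above can be checked on small arrays. This sketch shows one compatible pair of shapes and one incompatible pair; the values are arbitrary.

```python
import numpy as np

col = np.arange(4).reshape(4, 1)   # shape (4, 1)
row = np.array([10, 20, 30])       # shape (3,), treated as (1, 3)

# (4, 1) and (1, 3) are compatible in both dimensions; the result is (4, 3).
table = col + row
print(table.shape)                 # (4, 3)
print(table[2])                    # [12 22 32]

# Shapes (4, 3) and (4,) are not compatible: after prepending a 1 the
# second shape is (1, 4), and 3 != 4 in the last dimension.
try:
    np.zeros((4, 3)) + np.zeros(4)
    compatible = True
except ValueError:
    compatible = False
print(compatible)                  # False
```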
Python - Image operations
• Scipy library
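# Note: imread, imsave and imresize were deprecated in SciPy 1.0 and removed
# from scipy.misc in later releases, so this example needs an old SciPy.
from scipy.misc import imread, imsave, imresize
# Read a JPEG image into a numpy array
img = imread('assets/cat.jpg')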
print(img.dtype, img.shape) # Prints "uint8 (400, 248, 3)"
# We can tint the image by scaling each of the color channels
# by a different scalar constant. The image has shape (400, 248, 3);
# we multiply it by the array [1, 0.95, 0.9] of shape (3,);
# numpy broadcasting means that this leaves the red channel unchanged,
# and multiplies the green and blue channels by 0.95 and 0.9
# respectively.
img_tinted = img * [1, 0.95, 0.9]
# Resize the tinted image to be 300 by 300 pixels.
img_tinted = imresize(img_tinted, (300, 300))
# Write the tinted image back to disk
imsave('assets/cat_tinted.jpg', img_tinted)
Python - LA
• PCA/Eigen analysis
cov = np.cov((X - X.mean(axis=0)).transpose())
eigenvalues,eigenvectors = np.linalg.eig(cov)
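The two lines above assume a data matrix X with one sample per row. A self-contained sketch on synthetic data follows; it swaps in np.linalg.eigh, which is suited to symmetric matrices such as a covariance matrix, and sorts the axes by variance. The data and random seed are arbitrary.

```python
import numpy as np

rng = np.random.RandomState(0)
# Synthetic 2-D data (one sample per row), stretched along the x axis.
X = rng.randn(500, 2) * np.array([5.0, 0.5])

centered = X - X.mean(axis=0)
cov = np.cov(centered.transpose())

# eigh returns real eigenvalues in ascending order for symmetric input.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]         # largest variance first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]         # columns are the principal axes

# Project the centred data onto the first principal axis.
projected = centered.dot(eigenvectors[:, 0])
print(eigenvalues)
print(np.abs(eigenvectors[:, 0]))
```

The variance of the projected data equals the largest eigenvalue, and the first principal axis is close to the x axis along which the data was stretched.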
Image processing Examples
• Image filtering/convolution operations
• Edge detection algorithms
• Object detection
• Segmentation
• ...
Filtering - Theory
Slide from D. Stricker
Normalized Box Filter
Slide from D. Stricker
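As a rough illustration of linear filtering: each output pixel is a weighted sum of the input pixels in a neighbourhood, with the weights given by a kernel. A normalized box filter uses equal weights that sum to 1. The sketch below implements this directly in numpy; the function name filter2d and the replicated-border handling are choices made for this example, and the kernel is assumed to have odd size.

```python
import numpy as np

def filter2d(image, kernel):
    """Correlate a 2-D image with an odd-sized kernel; borders are replicated."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(np.float64), ((ph, ph), (pw, pw)), mode="edge")
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=np.float64)
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out

# Normalized box kernel: all weights equal, summing to 1.
box = np.ones((3, 3)) / 9.0

image = np.zeros((5, 5))
image[2, 2] = 9.0                 # a single bright pixel
smoothed = filter2d(image, box)
print(smoothed)
```

The single bright pixel is spread evenly over its 3 x 3 neighbourhood while the total brightness is preserved. For symmetric kernels such as this one, correlation and convolution give the same result.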
Gaussian Filter I
•Probably the most useful filter (although not the fastest).
Gaussian filtering is done by convolving each point in the
input array with a Gaussian kernel.
•1D Gaussian kernel
Slide from D. Stricker
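A 1D Gaussian kernel samples exp(-x^2 / (2 sigma^2)) at integer offsets x from the centre and is normalized so the weights sum to 1. The sketch below builds such a kernel; the function name, the kernel size of 5 and sigma of 1.0 are illustrative choices.

```python
import numpy as np

def gaussian_kernel_1d(size, sigma):
    """Sampled 1-D Gaussian with `size` taps (odd), normalized to sum to 1."""
    x = np.arange(size) - (size - 1) / 2.0     # offsets from the centre
    g = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

k1 = gaussian_kernel_1d(5, 1.0)
print(np.round(k1, 4))

# The 2-D Gaussian is separable: it is the outer product of two 1-D kernels.
k2 = np.outer(k1, k1)
print(k2.shape)
```

Separability means a 2D Gaussian blur can be applied as a horizontal 1D pass followed by a vertical 1D pass, which is cheaper than a full 2D kernel.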
Gaussian Filter II
Slide from D. Stricker
Median filter
•The median filter runs through each element of the signal
(in this case the image) and replaces each pixel with
the median of its neighboring pixels (located in a square
neighborhood around the evaluated pixel).
•The median of a finite list of numbers can be found by
arranging all the observations from lowest value to
highest value and picking the middle one.
Slide from D. Stricker
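The description above can be sketched in a few lines of numpy. The function name and the replicated-border handling are assumptions made for this example.

```python
import numpy as np

def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    rows, cols = image.shape
    # Stack every shifted copy of the image, then take the median per pixel.
    windows = [padded[i:i + rows, j:j + cols]
               for i in range(size) for j in range(size)]
    return np.median(np.stack(windows, axis=0), axis=0).astype(image.dtype)

image = np.full((5, 5), 100, dtype=np.uint8)
image[2, 2] = 255                     # one "salt" noise pixel
filtered = median_filter(image)
print(filtered[2, 2])                 # 100: the outlier is discarded
```

Unlike the box or Gaussian filter, the median does not average the outlier into its neighbours, which is why it works well against salt-and-pepper noise.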
Usage examples
• Box filter
blur(src, dst, Size( filt_size_x, filt_size_y), Point(-1,-1));
–src: Source image
–dst: Destination image
–Size( w,h ): Defines the size of the kernel to be used ( of width w pixels and height h pixels)
–Point(-1, -1): Indicates where the anchor point (the pixel evaluated) is located with respect to the
neighborhood. If there is a negative value, then the center of the kernel is considered the anchor
point.
• Gaussian blur
GaussianBlur( src, dst, Size( filt_size_x, filt_size_y ), 0, 0 );
–Size(w, h): The size of the kernel to be used (the neighbors to be considered). w and h have to be odd and positive numbers; otherwise the size will be calculated using the sigma_x and sigma_y arguments.
–sigma_x: The standard deviation in x. Writing 0 implies that sigma_x is calculated using the kernel size.
–sigma_y: The standard deviation in y. Writing 0 implies that sigma_y is calculated using the kernel size.
Exercise
•Face detection
•Masking
Canny Edge Detector
Hough Transform
Gary Bradski, Adrian Kaehler, 2008
Scale Space
Chart by Gary Bradski, 2005
void cvPyrDown(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);
void cvPyrUp(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);
Thresholds
Screen shots by Gary Bradski, 2005
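The simplest threshold type, binary thresholding, sets every pixel above a threshold to a maximum value and all others to 0. The sketch below writes this in numpy; the function name and sample values are illustrative. In OpenCV the counterpart is cv2.threshold with the cv2.THRESH_BINARY flag.

```python
import numpy as np

def threshold_binary(image, thresh, maxval=255):
    """Pixels above `thresh` become `maxval`, all others become 0."""
    return np.where(image > thresh, maxval, 0).astype(image.dtype)

gray = np.array([[ 10, 120, 130],
                 [200, 127, 128]], dtype=np.uint8)
mask = threshold_binary(gray, 127)
print(mask)
# [[  0   0 255]
#  [255   0 255]]
```

Note that a pixel exactly equal to the threshold (127 here) falls on the "0" side.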
Histogram Equalization
Screen shots by Gary Bradski, 2005
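Histogram equalization remaps gray levels through the cumulative histogram so that the output uses the full intensity range. The following numpy sketch assumes an 8-bit grayscale image; the function name and the synthetic low-contrast input are illustrative. OpenCV offers cv2.equalizeHist for the same purpose.

```python
import numpy as np

def equalize_hist(gray):
    """Histogram-equalize an 8-bit grayscale image via its cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]              # CDF value at the darkest occurring level
    total = gray.size
    if total == cdf_min:                   # constant image: nothing to stretch
        return gray.copy()
    # Lookup table mapping each input level to its equalized output level.
    lut = np.round((cdf - cdf_min) / float(total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

rng = np.random.RandomState(0)
low_contrast = rng.randint(100, 140, size=(64, 64)).astype(np.uint8)
equalized = equalize_hist(low_contrast)
print(low_contrast.min(), low_contrast.max())
print(equalized.min(), equalized.max())
```

The input occupies only a narrow band of gray levels, while the equalized output spans 0 to 255.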
Contours
• Inpainting: Image textures
• Removes damage to images, in this case, it removes the text.
Segmentation
• Pyramid, mean-shift, graph-cut
• Here: Watershed
Screen shots by Gary Bradski, 2005
Projections
Screen shots by Gary Bradski, 2005
Stereo Rectification
•Algorithm steps are shown at right:
•Goal:
–Each row of the image contains the same world points
–“Epipolar constraint”
Result: Epipolar alignment of features:
All: Gary Bradski and Adrian Kaehler: Learning OpenCV
Features2d contents
Pipeline: Detection, Description, Matching
Detectors available
• SIFT
• SURF
• FAST
• STAR
• MSER
• HARRIS
• GFTT (Good Features To Track)
Descriptors available
• SIFT
• SURF
• Calonder
• Ferns
• One way
Matchers available
• BruteForce
• FlannBased
• BOW
Matches filters (under construction)
• Cross check
• Ratio check
Slide from D. Stricker
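To illustrate the two match filters named above, the following numpy sketch performs brute-force matching between two descriptor sets and keeps a match only if it passes both a ratio check and a cross check. It does not use the Features2d API; the function name, the toy 2-D descriptors and the ratio of 0.75 are assumptions made for this example.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Brute-force matching with ratio check and cross check.

    Returns a list of (index_in_a, index_in_b) pairs.
    desc_b must contain at least two descriptors.
    """
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    best_b_for_a = dists.argmin(axis=1)
    best_a_for_b = dists.argmin(axis=0)
    matches = []
    for i, j in enumerate(best_b_for_a):
        nearest, second = np.sort(dists[i])[:2]
        passes_ratio = nearest < ratio * second    # best is clearly better than runner-up
        passes_cross = best_a_for_b[j] == i        # the match holds in both directions
        if passes_ratio and passes_cross:
            matches.append((i, int(j)))
    return matches

desc_a = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 5.0]])
desc_b = np.array([[10.1, 0.0], [0.0, 0.2], [5.0, 4.0], [5.0, 6.0]])
matches = match_descriptors(desc_a, desc_b)
print(matches)     # [(0, 1), (1, 0)]
```

The third descriptor in desc_a has two equally close candidates in desc_b, so the ratio check rejects it as ambiguous.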
OpenCV summary
• The following slides group together important
methods/classes in OpenCV and summarise them.
– Use them as a rough outline; the main source of information
should still be the official documentation.
Matrix Manipulation
Slide from D. Stricker
Simple Matrix Operations
Slide from D. Stricker
Simple Image Processing
Slide from D. Stricker
Image Conversions
Slide from D. Stricker
Histogram
Slide from D. Stricker
Input/Output
Slide from D. Stricker
Serialization I/O
Slide from D. Stricker
Serialization I/O
Slide from D. Stricker
GUI (“HighGUI”)
Slide from D. Stricker
Camera Calibration, Pose, Stereo
Slide from D. Stricker
Object Recognition
Slide from D. Stricker
Thank you!