Practical Guide to OpenCV


Computer Vision


Computer vision lets a computer see and process visual data much as humans do. It involves analyzing images to extract useful information from them.

Consider the use case of detecting whether a two-wheeler rider is wearing a helmet. If the rider is not wearing one, he or she is fined using the number plate on the vehicle.

The number plate image is processed and the registration number (for example, APXXBJXXXX) is extracted, and the rider is fined accordingly.

Computer vision lets us extract many pieces of information from visual data at the same time.

OpenCV

OpenCV is an open-source computer vision and image processing library that is available on Windows, Mac, and Linux and works with C++, C, Java, and Python. OpenCV was created by Intel, later supported by Willow Garage, and then maintained by Itseez.

To solve the helmet use case, we also need an object detection model to detect the helmet and the number plate, OCR to extract the vehicle registration number from the number plate, and so on.


Image processing is the analysis and manipulation of a digitized image, especially in order to extract useful information.


Digital Image

Digital images are stored in the form of a matrix. When a computer sees an image, it sees a matrix of pixels.

There are two types of digital images:

Grayscale Images

  • Each pixel of a grayscale image is formed by 8 bits (1 byte).
  • The value of these 8 bits determines the pixel intensity, from 0 (dark) to 255 (bright).
  • In other words, grayscale images have only one channel.

Colored Images

  • Each pixel of a colored image is formed by 24 bits (3 bytes).
  • Each group of 8 bits determines the intensity of one color.
  • Colored images have three channels (R-Red, G-Green, B-Blue).

Note: For 8-bit images, the computer reads every pixel value as a number between 0 and 255.

Note: OpenCV stores the 3 bytes of each colored pixel in BGR order (Blue, Green, Red).
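
As a minimal sketch of the two layouts described above, the following builds a tiny grayscale image and a tiny colored image by hand. It uses NumPy, which is introduced in the next section; the array names `gray` and `color` are made up for illustration.

```python
import numpy as np

# A 2x3 grayscale image: one 8-bit value (0-255) per pixel, one channel.
gray = np.array([[0, 128, 255],
                 [64, 192, 32]], dtype=np.uint8)

# A 2x3 colored image: three 8-bit values per pixel (24 bits in total).
# The channel order here follows OpenCV's convention: Blue, Green, Red.
color = np.zeros((2, 3, 3), dtype=np.uint8)
color[0, 0] = (255, 0, 0)   # pure blue in BGR order
color[0, 1] = (0, 0, 255)   # pure red in BGR order

print(gray.shape, gray.dtype)    # (2, 3) uint8
print(color.shape, color.dtype)  # (2, 3, 3) uint8
print(color.nbytes // (2 * 3))   # bytes per pixel: 3
```

The grayscale array has two dimensions (height, width), while the colored array has a third dimension for the channels.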


To get started with OpenCV, one should have a working knowledge of NumPy.

NumPy

NumPy is a Python library designed for numeric computation.

Its array structure is important because digital images are 2D arrays of pixels.

In Python, all OpenCV array structures are converted to and from NumPy arrays.

NumPy arrays are much faster than nested Python lists for multi-dimensional data because the core operations are implemented in C.

Slicing and indexing can often replace explicit for loops.
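
To make the last point concrete, here is a small sketch that brightens a rectangular region twice: once with nested for loops and once with a single slice. The image is a made-up 4x5 grayscale ramp, and the brightening amount of 50 is arbitrary.

```python
import numpy as np

# Hypothetical 4x5 grayscale image filled with a ramp of intensities.
img = np.arange(20, dtype=np.uint8).reshape(4, 5) * 10

# Loop version: brighten every pixel in rows 1-2, columns 1-3 by 50.
looped = img.copy()
for r in range(1, 3):
    for c in range(1, 4):
        looped[r, c] = min(int(looped[r, c]) + 50, 255)

# Indexed version: the same region is selected with one slice.
sliced = img.copy()
region = sliced[1:3, 1:4].astype(np.int32) + 50
sliced[1:3, 1:4] = np.clip(region, 0, 255).astype(np.uint8)

print(np.array_equal(looped, sliced))  # True
```

The region is widened to a larger integer type before adding, because adding directly to `uint8` values would wrap around past 255.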

Working with Basics of OpenCV

In [10]:
import cv2
import matplotlib.pyplot as plt
In [21]:
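# Load an image with imread, giving the path to the file.
# The second parameter selects the mode: 0 loads the image in
# grayscale, 1 loads it in colour.
# A raw string (r'...') keeps the backslash in the Windows path literal.
# imread returns None if the file cannot be read.
img = cv2.imread(r'I:\Tomb_raider.jpg', 1)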
In [22]:
# To display the image variable, we use imshow.
# The first parameter is the title shown on the image window.
# The second parameter is the image variable.
cv2.imshow('tomb', img)
# In this notebook, however, the image is displayed with pyplot (plt).
In [23]:
plt.imshow(img[:,:,::-1])
Out[23]:
<matplotlib.image.AxesImage at 0x7bfe090>
In [ ]:
# waitKey waits for a key press while an image window is open.
# With no argument (or 0) it waits indefinitely for any key.
# With a positive number it waits at most that many milliseconds.
cv2.waitKey()
In [ ]:
# This closes all open windows.
# Failing to call it can cause your program to hang.
cv2.destroyAllWindows()
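
The `img[:,:,::-1]` slice passed to `plt.imshow` above deserves a short explanation: OpenCV keeps channels in BGR order while matplotlib expects RGB, and reversing the last axis swaps the two. A minimal sketch on a made-up one-pixel image:

```python
import numpy as np

# One-pixel image holding pure blue in OpenCV's BGR order.
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channels,
# giving the RGB order that matplotlib expects.
rgb = bgr[:, :, ::-1]

print(bgr[0, 0].tolist())  # [255, 0, 0]
print(rgb[0, 0].tolist())  # [0, 0, 255]
```

`cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` should give the same channel order if you prefer an explicit conversion.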
In [5]:
img.shape
# The shape of the image is returned.
# (183, 275, 3) represents height, width, and channels respectively.
# We can slice (crop) the image within this height and width.
Out[5]:
(183, 275, 3)

For the image above, (183, 275) sets the limits: valid row indices run from 0 to 182 (height) and valid column indices from 0 to 274 (width). There are no pixels beyond these limits.
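
A quick way to see these limits is to index a stand-in array of the same shape. The all-black array below is an assumption that replaces the real image so the sketch runs without the file.

```python
import numpy as np

# Stand-in for the loaded image: same shape as in the text, all black.
img = np.zeros((183, 275, 3), dtype=np.uint8)

height, width, channels = img.shape
last_pixel = img[height - 1, width - 1]   # row 182, column 274: valid

try:
    img[height, 0]                        # row 183 does not exist
    out_of_range = False
except IndexError:
    out_of_range = True

print(height, width, channels)  # 183 275 3
print(out_of_range)             # True
```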

In [24]:
# Slicing the image variable gives a particular part of the image (region of interest).
# The first slice selects rows (height), the second selects columns (width).
ROI = img[10:80,50:100]
In [ ]:
cv2.imshow('ROI',ROI)
cv2.waitKey()
cv2.destroyAllWindows()
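
One detail worth knowing about this kind of slicing: a NumPy slice is a view of the original array, not a copy. The sketch below uses a stand-in black image of the same shape (an assumption, so it runs without the file) to show the effect.

```python
import numpy as np

# Stand-in image with the same shape as the one in the text.
img = np.zeros((183, 275, 3), dtype=np.uint8)

# Rows 10-79 and columns 50-99, the same slice as above.
roi = img[10:80, 50:100]
print(roi.shape)              # (70, 50, 3)

# A slice is a view: writing to it also changes the original image.
roi[:] = 255
print(img[10, 50].tolist())   # [255, 255, 255]

# An independent crop needs an explicit copy.
crop = img[10:80, 50:100].copy()
crop[:] = 0
print(img[10, 50].tolist())   # [255, 255, 255] (unchanged by crop)
```

If you plan to draw on a region without touching the full image, take a `.copy()` of the slice first.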
In [25]:
plt.imshow(ROI[:,:,::-1])
Out[25]:
<matplotlib.image.AxesImage at 0x9809410>
In [26]:
# Geometric shapes on an image
# These functions draw directly on img (they modify it in place).
# Drawing a line on the image.
# The 1st to 5th parameters are the image variable, the starting
# coordinate, the ending coordinate, the colour in BGR format,
# and the thickness of the line, respectively.
cv2.line(img,(0,90),(140,90),(0,0,255),4)
# Drawing a rectangle on the image
cv2.rectangle(img,(50,20),(100,80),(255,0,0),4)
# Drawing an arrow on the image
cv2.arrowedLine(img,(0,182),(130,100),(0,255,0),4)
# Adding text to the image
cv2.putText(img,'Lara_Croft', (50,15) , cv2.FONT_HERSHEY_PLAIN , 0.5 , (255,255,255) , 1)
# Note: to list the font constants available in cv2:
# fonts = [i for i in dir(cv2) if 'FONT' in i ]
Out[26]:
array([[[125, 164, 149],
        [123, 162, 147],
        [120, 159, 144],
        ...,
        [119, 133, 129],
        [ 76,  89,  87],
        [ 63,  76,  74]],

       [[113, 149, 135],
        [113, 152, 137],
        [117, 153, 139],
        ...,
        [109, 123, 119],
        [ 72,  85,  83],
        [ 63,  76,  74]],

       [[ 98, 133, 119],
        [102, 138, 124],
        [109, 144, 130],
        ...,
        [ 94, 108, 104],
        [ 66,  80,  76],
        [ 65,  79,  75]],

       ...,

       [[  0, 255,   0],
        [  0, 255,   0],
        [  0, 255,   0],
        ...,
        [ 41,  27,  21],
        [ 48,  34,  28],
        [ 55,  41,  35]],

       [[  0, 255,   0],
        [  0, 255,   0],
        [  0, 255,   0],
        ...,
        [ 40,  26,  20],
        [ 47,  33,  27],
        [ 54,  40,  34]],

       [[  0, 255,   0],
        [  0, 255,   0],
        [  0, 255,   0],
        ...,
        [ 38,  24,  18],
        [ 44,  30,  24],
        [ 50,  36,  30]]], dtype=uint8)
In [143]:
cv2.imshow('line',img)
cv2.waitKey()
# Note: waitKey returns the code (ASCII value) of the key pressed
cv2.destroyAllWindows()
In [27]:
plt.imshow(img[:,:,::-1])
Out[27]:
<matplotlib.image.AxesImage at 0x9873210>

Image Segmentation

Image segmentation means dividing an image into parts so that it can be used efficiently in applications such as object recognition. Processing the full image is inefficient; processing only the relevant parts reduces the computational work.

There are various image segmentation methods, such as threshold, edge-based, region-based, clustering-based, watershed-based, PDE-based, and ANN-based methods.

Thresholding Method

Thresholding is a simple, popular, and effective way of segmenting an image into foreground and background. Each pixel value is compared with the threshold value: if the pixel value is less than or equal to the threshold, it is set to 0; otherwise, it is set to a maximum value, here 255. In this way, the threshold method isolates objects by converting grayscale images into binary images. This technique is most effective on images with high contrast.

In [8]:
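# Image comes from IPython.display and renders the picture in the notebook.
from IPython.display import Image
PATH = "formula.png"
Image(filename = PATH , width=400, height=1000)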
Out[8]:

The algorithm above converts an image into a simple binary image.
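
Written out, and assuming the figure shows the standard rule that OpenCV documents for `cv2.THRESH_BINARY`, the binary threshold is as follows, where src(x, y) is the input pixel, thresh is the threshold value, and maxval is the maximum value:

```latex
\[
\mathrm{dst}(x, y) =
\begin{cases}
\mathrm{maxval} & \text{if } \mathrm{src}(x, y) > \mathrm{thresh} \\
0 & \text{otherwise}
\end{cases}
\]
```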

Four more thresholding algorithms are discussed below.

In [27]:
img = cv2.imread('flower.jpg')  
  
# cv2.cvtColor converts an image from one colour space to another.
# The image was loaded in colour mode, so it is converted
# to grayscale here.
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
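
For intuition about what the grayscale conversion does, OpenCV documents it as the weighted sum Y = 0.299 R + 0.587 G + 0.114 B. The function below is a NumPy sketch of that formula, not OpenCV's own implementation; the name `bgr_to_gray` is hypothetical, and results from `cv2.cvtColor` may differ slightly in rounding.

```python
import numpy as np

def bgr_to_gray(bgr):
    """Weighted sum of the B, G, R channels, rounded to 8-bit values."""
    b = bgr[:, :, 0].astype(np.float64)
    g = bgr[:, :, 1].astype(np.float64)
    r = bgr[:, :, 2].astype(np.float64)
    gray = 0.114 * b + 0.587 * g + 0.299 * r
    return np.rint(gray).astype(np.uint8)

pixels = np.array([[[255, 0, 0],        # pure blue
                    [0, 255, 0],        # pure green
                    [0, 0, 255],        # pure red
                    [255, 255, 255]]],  # white
                  dtype=np.uint8)

print(bgr_to_gray(pixels).tolist())  # [[29, 150, 76, 255]]
```

Green contributes the most to the result and blue the least, which is why a pure blue pixel comes out quite dark in grayscale.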
In [ ]:
cv2.imshow('tomb',img)
In [49]:
#Using plt to display image
plt.imshow(img,cmap='gray')
Out[49]:
<matplotlib.image.AxesImage at 0x89bc270>
In [38]:
#Assigning Maximum value & Threshold value
max_val = 255
Thres_value = 110
In [39]:
# Applying different thresholding techniques to the image variable.
# With THRESH_BINARY, all pixel values greater than 110 are set to 255;
# the other flags apply the rules described below.
ret, o1 = cv2.threshold(img, Thres_value, max_val, cv2.THRESH_BINARY) 
ret, o2 = cv2.threshold(img, Thres_value, max_val, cv2.THRESH_BINARY_INV)  
ret, o3 = cv2.threshold(img, Thres_value, max_val, cv2.THRESH_TOZERO) 
ret, o4 = cv2.threshold(img, Thres_value, max_val, cv2.THRESH_TOZERO_INV) 
ret, o5 = cv2.threshold(img, Thres_value, max_val, cv2.THRESH_TRUNC)

Where,

     o1  is the binary threshold.

     o2  is the inverted binary threshold.

     o3  is set-to-zero (if the pixel value is less than or equal to the threshold value, it is set to zero; if it is greater than the threshold value, it is retained).

     o4  is the inverted set-to-zero (if the pixel value is greater than the threshold value, it is set to zero; otherwise it is retained).

     o5  is the truncated threshold (if the pixel value is greater than the threshold value, it is set to the threshold value; other pixel values are retained).
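
The five rules can be checked by hand on a few pixel values. The function below is a NumPy re-creation written for illustration only; the name `threshold` and the string labels are hypothetical and are not part of the cv2 API.

```python
import numpy as np

def threshold(src, thresh, maxval, kind):
    """NumPy re-creation of the five basic threshold rules."""
    above = src > thresh
    if kind == "binary":
        out = np.where(above, maxval, 0)
    elif kind == "binary_inv":
        out = np.where(above, 0, maxval)
    elif kind == "tozero":
        out = np.where(above, src, 0)
    elif kind == "tozero_inv":
        out = np.where(above, 0, src)
    elif kind == "trunc":
        out = np.where(above, thresh, src)
    else:
        raise ValueError(f"unknown threshold kind: {kind}")
    return out.astype(np.uint8)

row = np.array([[50, 110, 111, 200]], dtype=np.uint8)
for kind in ("binary", "binary_inv", "tozero", "tozero_inv", "trunc"):
    print(kind, threshold(row, 110, 255, kind).tolist())
# binary [[0, 0, 255, 255]]
# binary_inv [[255, 255, 0, 0]]
# tozero [[0, 0, 111, 200]]
# tozero_inv [[50, 110, 0, 0]]
# trunc [[50, 110, 110, 110]]
```

Note that a pixel exactly equal to the threshold (110 here) counts as "not greater", so it falls on the low side in every rule.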
In [17]:
# Displaying output images with the corresponding thresholding  
# techniques applied to the input image 
cv2.imshow('Binary Threshold', o1) 
cv2.imshow('Binary Threshold Inverted', o2)  
cv2.imshow('Set to 0', o3) 
cv2.imshow('Set to 0 Inverted', o4)
cv2.imshow('Truncated Threshold', o5)
cv2.waitKey()
cv2.destroyAllWindows()
In [65]:
plt.xlabel('Binary Threshold')
plt.imshow(o1,cmap='gray')
Out[65]:
<matplotlib.image.AxesImage at 0xa61bbf0>
In [66]:
plt.xlabel('Binary Threshold Inverted')
plt.imshow(o2,cmap='gray')
Out[66]:
<matplotlib.image.AxesImage at 0xa63b530>
In [67]:
plt.xlabel('Set to 0')
plt.imshow(o3,cmap='gray')
Out[67]:
<matplotlib.image.AxesImage at 0xa678af0>
In [68]:
plt.xlabel('Set to 0 Inverted')
plt.imshow(o4,cmap='gray')
Out[68]:
<matplotlib.image.AxesImage at 0xa6b7bb0>
In [69]:
plt.xlabel('Truncated Threshold')
plt.imshow(o5,cmap='gist_gray')
Out[69]:
<matplotlib.image.AxesImage at 0xa6f4ef0>

In this way, the desired part of an image is obtained.

I hope this gave you a strong foundation in the basics of OpenCV.

Thank you.
