This OpenCV tutorial series covers basic and intermediate concepts of OpenCV. It is designed for beginners, provided you have good hands-on knowledge of Python. If not, you can learn Python from Angela Yu's Python course for free.
OpenCV is an open-source library for computer vision. It enables machines to recognize faces, objects, and more. In this tutorial, we will learn about the history of OpenCV, its basic concepts, how it works (for example, how a computer interprets an image), and its benefits.
Knowing our beloved OpenCV
Computer Vision is a field of study that helps computers understand the content of digital images such as photographs and videos. Its main aim is to help machines extract and understand the content of images. It extracts a description from the picture, which may be an object, a text description, a three-dimensional model, etc. For example, if implemented in a car, it would be able to identify different objects around the road, such as traffic lights, pedestrians, and traffic signs, and act accordingly.
OpenCV is short for Open Source Computer Vision Library, and it is widely used for image recognition and identification. It was developed and launched by Intel in 1999. It was originally written in C/C++, but it is now commonly used from Python to train robots on image/video data so they can process it and act accordingly.
It was first introduced as an alpha version for common use at the IEEE Conference on Computer Vision and Pattern Recognition in 2000. Five beta versions were released in the short span from 2001 to 2005, and finally, in 2006, version 1.0 was released as the first official version of OpenCV. Then, in 2009, a second version was released that aimed to make the library easier to use, more type-safe, and better implemented. As of now, an independent Russian team is responsible for its development and releases a newer version every 6 months.
- OpenCV is available free of cost.
- Since the OpenCV library is written in C/C++, it is very fast, and it can also be used from Python.
- It requires relatively little RAM, around 60-70 MB.
- It is portable and can run on any device that can run C.
How does OpenCV interpret images?
Image processing begins with the computer recognizing the data. First, a matrix is created for the image data, which simply means that the image is represented as a matrix. Each pixel's value is stored in this matrix based on that pixel's colour intensity.
For example, a picture of size 500×500 is converted to a matrix of size 500×500. If the image is coloured (RGB), this dimension becomes 500×500×3 (we'll get to that later). In fact, every manipulation in image processing is a matrix operation. Suppose you want to blur a part of the image: a particular filter moves over the matrix and changes the pixel values of the part you want to blur. As a result, the required part of the image is blurred.
So, basically, there are two ways a computer can represent an image:
1. Grayscale

An image consisting only of shades between black and white is referred to as a grayscale image. In terms of contrast, black is treated as the weakest intensity and white as the strongest. When we use a grayscale image, the computer assigns each pixel a value based on its level of darkness.
It is represented as a 2D matrix, where each element represents the brightness intensity of that particular pixel. Remember, 0 means black and 255 means white, which means we use 8-bit values to represent a pixel (2^8 = 256 possible values). You can get a better idea by looking at the example motorcycle image below, i.e. Figure 1.
The motorcycle image above has a size of 32×16. This means that it is 32 pixels wide and 16 pixels tall: the X-axis ranges from 0 to 31 and the Y-axis from 0 to 15. Overall, the image has 32×16 = 512 pixels. In this grayscale image, each pixel contains a value from 0 to 255 that represents the intensity of light at that particular pixel: 0 represents a dark pixel (black), 255 a bright one (white), and the values in between represent intermediate shades of gray.
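A grayscale image of this kind is easy to build by hand. The sketch below uses a tiny made-up 4×4 matrix (not the motorcycle image from Figure 1) to show that a grayscale image is nothing more than a 2D array of 8-bit values, indexed as row × column.

```python
import numpy as np

# A tiny 4x4 grayscale image: one 8-bit value (0-255) per pixel.
gray = np.array([
    [  0,  64, 128, 255],
    [ 32,  96, 160, 224],
    [  0,   0, 255, 255],
    [128, 128, 128, 128],
], dtype=np.uint8)

print(gray.shape)   # (4, 4): height x width, a plain 2D matrix
print(gray[0, 0])   # 0   -> black, the weakest intensity
print(gray[0, 3])   # 255 -> white, the strongest intensity
```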
2. RGB (Red – Green – Blue)
An RGB image is a combination of red, green, and blue, which together can make up any colour. Colour (RGB) images are represented by a 3D matrix, or equivalently by three 2D matrices: one representing the intensity of red in each pixel, one for green, and one for blue. The computer retrieves these values from each pixel and puts the results in an array for further manipulation.
Let’s take another image of a motorcycle, and suppose it’s coloured (which it’s not). Since it is an RGB image, the computer will interpret it as a 3D matrix (three matrices stacked on top of each other). Say the first matrix (channel 1) represents the red channel; then each element of that matrix represents the intensity of red in that particular pixel. The same holds for the green channel (channel 2) and the blue channel (channel 3). Each pixel in a colour image therefore has three numbers (0 to 255) associated with it, one each for red, green, and blue, showing the intensity of that colour at that particular pixel.
Suppose we take the pixel (0,0): it represents the top-left pixel of the image, which here is green grass. When we view this pixel in the colour image, it looks like this:
That’s all for now. Keep an eye out for part 2 of our OpenCV tutorial.