Image Retrieval: Global and Local Color Histogram
In the previous post we talked about image retrieval and image descriptors. Now we will introduce one of the most common and important descriptors, the color histogram, which does not include information about the spatial distribution of color.
A color histogram is a representation of the distribution of colors in an image (from Wikipedia).
A color histogram represents the image from another perspective: it counts similar pixels and stores the counts in bins, so that each bin holds the number of pixels in one range of colors, independently of the other bins.
Note: The color histogram is a color descriptor and, as we saw in the previous post, each descriptor consists of a feature extraction algorithm and a matching function.
Color histograms are divided into:
- Global Color Histogram (GCH).
- Local Color Histogram (LCH).
Global Color Histogram
GCH is the best-known color histogram used to detect similar images.
Feature extraction algorithm:
- Discretize the color space (the image's colors) into n colors (for example, just 8*8*8 = 512 colors instead of 256*256*256 = 16,777,216).
- Create a bin for each color.
- Count the number of pixels of each color and store the counts in the histogram's bins.
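The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `global_color_histogram`, the `levels` parameter, and the tiny test image are all assumptions made for the example, and the image is assumed to be an 8-bit RGB array.

```python
import numpy as np

def global_color_histogram(image, levels=8):
    """Count pixels per discretized (R, G, B) color.

    image  : array of shape (height, width, 3) with values in 0..255 (assumed)
    levels : discretization levels per channel; 8 gives 8*8*8 = 512 bins
    """
    pixels = np.asarray(image).astype(np.intp)
    # Step 1: discretize each channel from 0..255 down to 0..levels-1.
    quantized = (pixels * levels) // 256
    # Step 2: one bin per discretized color, indexed as (R*levels + G)*levels + B.
    bin_index = (quantized[..., 0] * levels + quantized[..., 1]) * levels + quantized[..., 2]
    # Step 3: count how many pixels fall into each bin.
    return np.bincount(bin_index.ravel(), minlength=levels ** 3)

# Hypothetical 2x2 image: one pure-red pixel and three black pixels.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = (255, 0, 0)
hist = global_color_histogram(image)
print(hist.shape, hist.sum())  # number of bins and total number of pixels counted
```

The histogram here stores raw pixel counts, as in the description above. Dividing by the number of pixels is a common variant when the images being compared differ in size.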
The most common matching function for this method is Euclidean distance.
To compare two images A and B:
A(R,G,B): the number of pixels in A with color (R,G,B). For example, A(6,2,4) is the number of pixels whose discretized color is R=6, G=2 and B=4.
D(A,B): the Euclidean distance between the histograms of A and B, computed over all colors.
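Assuming the standard Euclidean distance between the two histograms, with the sum running over every discretized color (R,G,B), the comparison can be written as:

```latex
D(A,B) = \sqrt{\sum_{R,G,B} \bigl( A(R,G,B) - B(R,G,B) \bigr)^{2}}
```

With this form, D(A,B) = 0 exactly when the two images have the same number of pixels in every bin.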
Remember: the larger the distance, the less similar the images are.
Look at this example:
Here C has the same color histogram as B, but A is different from both.
Using Euclidean distance on these color histograms, we find that D(A,C) = D(A,B) and D(B,C) = 0. But there is a problem: B and C are not similar at all, so D(B,C) shouldn't be zero, and D(A,C) should be smaller than D(A,B), because A and C have the same pixels except for only two.
That's why we say that GCH doesn't include information about the spatial distribution of color.
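A small experiment makes this blindness to layout concrete. The sketch below uses two made-up 4x4 images (they are not the A, B and C of the figure): one is split into a black half and a white half, the other is a checkerboard. The helper names `gch` and `gch_distance` are assumptions for the example.

```python
import numpy as np

def gch(image, levels=8):
    """Global color histogram with levels**3 bins (raw pixel counts)."""
    q = (np.asarray(image).astype(np.intp) * levels) // 256
    idx = (q[..., 0] * levels + q[..., 1]) * levels + q[..., 2]
    return np.bincount(idx.ravel(), minlength=levels ** 3)

def gch_distance(image_a, image_b):
    """Euclidean distance between the two global histograms."""
    diff = (gch(image_a) - gch(image_b)).astype(float)
    return float(np.sqrt(np.sum(diff ** 2)))

WHITE = (255, 255, 255)

# Hypothetical image 1: left half black, right half white.
half = np.zeros((4, 4, 3), dtype=np.uint8)
half[:, 2:] = WHITE

# Hypothetical image 2: checkerboard with the same 8 black and 8 white pixels.
checker = np.zeros((4, 4, 3), dtype=np.uint8)
checker[::2, 1::2] = WHITE
checker[1::2, ::2] = WHITE

# Same pixel counts per color, so the histograms are identical
# even though the two images look nothing alike.
print(gch_distance(half, checker))
```

Because both images contain eight black and eight white pixels, their histograms match bin for bin and the distance is zero.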
The next part of this post covers an attempt to solve this problem.
Local Color Histogram
LCH includes information about the distribution of color in different regions. It's the same as GCH, except that we first divide the image into blocks. Each pair of corresponding blocks (one from the first image and one from the second) is compared separately using GCH. The total distance between the two images is then the sum of all the GCH distances between the block pairs.
Feature extraction algorithm:
- Split the image into m blocks.
- Compute the GCH of each block; blocks at the same position in the two images are then paired, as shown in the figure.
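As a rough sketch of this extraction step, the image can be cut into a grid and one histogram computed per block. The grid shape `(2, 2)`, the function names, and the test image are assumptions chosen for illustration; the post itself only says "m blocks".

```python
import numpy as np

def gch(block, levels=8):
    """Color histogram of one block with levels**3 bins (raw pixel counts)."""
    q = (np.asarray(block).astype(np.intp) * levels) // 256
    idx = (q[..., 0] * levels + q[..., 1]) * levels + q[..., 2]
    return np.bincount(idx.ravel(), minlength=levels ** 3)

def local_color_histograms(image, grid=(2, 2), levels=8):
    """Return one histogram per block, in row-major block order."""
    rows, cols = grid
    histograms = []
    # Cut the image into horizontal bands, then each band into blocks.
    for band in np.array_split(np.asarray(image), rows, axis=0):
        for block in np.array_split(band, cols, axis=1):
            histograms.append(gch(block, levels))
    return np.stack(histograms)  # shape: (rows * cols, levels**3)

# Hypothetical 4x4 image: left half black, right half white.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, 2:] = (255, 255, 255)

hists = local_color_histograms(image)
print(hists.shape)
```

Keeping the blocks in a fixed order matters: block k of one image is later compared with block k of the other.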
To compare two images a and b:
All we need to do is sum up the distances computed by GCH for each pair of corresponding blocks.
D: the sum of the Euclidean distances over all block pairs.
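Written out, and assuming the notation a_k and b_k for the histograms of the k-th block of each image (these symbols are introduced here for illustration), the sum over the m block pairs takes this form:

```latex
D(a,b) = \sum_{k=1}^{m} \sqrt{\sum_{R,G,B} \bigl( a_k(R,G,B) - b_k(R,G,B) \bigr)^{2}}
```

With m = 1 this reduces to the single Euclidean distance used by GCH.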
Using LCH, the distances are now more reasonable: D(A,B) = 1.768, D(A,C) = 0.707, D(B,C) = 1.768.
So LCH sometimes separates images better than GCH. But when an image is rotated, we may get a very different result.
Look at this example:
In this example, the distance between the two images using LCH is 0.
Here the distance between the two images using LCH is 4, although they are the same image and the second one is only rotated. This sensitivity to rotation is the main disadvantage of the local color histogram.
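The rotation problem is easy to reproduce. The sketch below uses a made-up 4x4 image with one white quadrant (not the image from the figure, so the numbers differ from those above) and compares it with itself and with a copy rotated by 90 degrees. The helper names and the 2x2 grid are assumptions for the example.

```python
import numpy as np

def gch(block, levels=8):
    """Color histogram with levels**3 bins (raw pixel counts)."""
    q = (np.asarray(block).astype(np.intp) * levels) // 256
    idx = (q[..., 0] * levels + q[..., 1]) * levels + q[..., 2]
    return np.bincount(idx.ravel(), minlength=levels ** 3)

def euclidean(hist_a, hist_b):
    """Euclidean distance between two histograms."""
    return float(np.sqrt(np.sum((hist_a - hist_b).astype(float) ** 2)))

def blocks(image, grid):
    """Yield the blocks of an image in row-major order."""
    for band in np.array_split(np.asarray(image), grid[0], axis=0):
        yield from np.array_split(band, grid[1], axis=1)

def lch_distance(image_a, image_b, grid=(2, 2)):
    """Sum of per-block Euclidean distances between corresponding blocks."""
    return sum(euclidean(gch(a), gch(b))
               for a, b in zip(blocks(image_a, grid), blocks(image_b, grid)))

# Hypothetical image: white top-left quadrant on a black background.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:2, :2] = (255, 255, 255)
rotated = np.rot90(image)  # same content, rotated by 90 degrees

print(lch_distance(image, image))            # identical images
print(lch_distance(image, rotated))          # nonzero: the white quadrant moved
print(euclidean(gch(image), gch(rotated)))   # GCH ignores the rotation
```

The rotated copy moves the white quadrant into a different block, so two block pairs no longer match and the LCH distance becomes nonzero, while the global histogram stays exactly the same.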