Dataset
As an example, we will use some bicycle and human images from Graz dataset. After downloading the images from that website, 200 images from bicycle set and 200 images from human set are kept in a folder called 'training'. The rest are kept in an another folder 'test' to be used as test images. Those images are in bmp format, and we need to convert them to png format. To do batch processing for conversion, we have written a MatLab program as follows.%File name: bmp2png.m %------------------------------------------------------------------------- sDir=[pwd,'/test/']; %source folder sFile=[sDir, '*.bmp']; %source files dFile=[pwd,'/test_png/test_']; %destination %------------------------------------------------------------------------- sList=dir(sFile); %Get file list nFiles=size(sList,1) %number of files %------------------------------------------------------------------------- for i=1:nFiles Img=imread([sDir,sList(i).name]); imwrite(Img,[dFile,sprintf('%03d',i),'.png']); i %output progress end
As an alternative to MatLab software, we have used a free software, Octave on an old Ubuntu 14.04 LTS Linux machine. If your Octave doesn't have image processing package installed, you can enter the following commands in terminal. The second command is for statistics package.
sudo apt-get install octave-image sudo apt-get install octave-statistics
Thereafter, load the packages using the following commands.
pkg load image pkg load statistics
Converted png images are to be put in training_png folder and test_png folder respectively. Use 'chdir' command to change working folder in octave command window. Then, enter the name of the MatLab program 'bmp2png;' to run it. The images will be converted to png and stored in the destination folder.
Feature Extraction
To find SIFT features, we use Harris & Hessian feature extractor from a website. You can use other feature extractor also. After downloading a file extract_features.tar.gz, extract and put extract_features.ln and harhessift.basis files in the working folder. To do batch extraction, we have written a script file, xall.sh as follows.#!/bin/bash FILES=./*.png for f in $FILES do echo "Processing - $f ..." ./extract_features.ln -harhes -i "$f" -sift -pca harhessift.basis done
Change the mode of the script file as follows and run it by entering './xall.sh' in the terminal. The resulting feature files will be in the same folder.
chmod 755 xall.sh ./xall.sh
To display the extracted features on its image, you can use a MatLab program display_features.m that is at the same website. Then, as an example, you can see bike_002.png and its features by entering the following command in the Octave command terminal.
display_features('bike_002.png.harhes.sift','bike_002.png',0,0);
The extension for feature files is .sift. The first line in a feature file is size (dimension) of a feature vector and it is 128. The number in second line indicates the number of features which is typically a few thousands. Starting from third line, each row has 133 values and column 6 to column 133 represent a feature vector.
Classification
Firstly, the extracted features will be classified into about 200 groups. We will use k-means clustering algorithm for it. To measure a distance between two feature vectors, you can use any suitable distance measuring method. We use χ2 distance measuring method because it is also suitable to measure similarity of histograms in a later step. A MatLab program that measures χ2 distance between two vectors is shown below.function d=ChiDist(v1,v2) %Find Chi Square distance between two vectors %Input: 2 vectors %Output: a scalar distance %File name: ChiDist.m dv=(v1-v2).^2; sv=abs(v1)+abs(v2); sv(sv==0)=1e-9; %eliminate zero denominator d=sum(dv./sv)./2; end
MatLab has a built in function for k-mean clustering called kmeans but statistics package is required for it. The following is the k-means clustering function developed myself so that you can use any distance measuring method that you want.
function [Idx,C]=KMeansCustom(X,k) %KMeansCustom partitions the points in the n-by-d data matrix X into k clusters. %[Idx,C]= KMeansCustom(X,k) returns %n-by-1 vector IDX containing the cluster indices of each point and %k-by-d matrix C containing the k cluster centroid locations. %For n sample points with d dimensions in each point, X has n rows and d columns. %File name: KMeansCustom.m %Author: Yan Naing Aye %Website: http://cool-emerald.blogspot.com/ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %Define maximum number of iterations MaxIter=100; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% [n,d]=size(X); k=round(k); %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %step1 :arbitrarily choose k samples as the initial cluster centers p=randperm(n); Mu=X(p(1:k),:); D=zeros(k,n); for t=1:MaxIter %step2:distribute the samples X to the clusters for j=1:k for i=1:n D(j,i)=ChiDist(X(i,:),Mu(j,:));%Use custom distance end end [ValMin,IndexMin]=min(D); %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %step 3: update the cluster centers OldMu=Mu; for i=1:k Mu(i,:)=mean(X(IndexMin==i,:),1); end %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %step4 :check convergence if sum(sum(abs(OldMu-Mu))) == 0 break end t %output progress end Idx=IndexMin'; C=Mu;
It will need a lot of calculations to use all the features from all 400 training images each with thousands of features. That is why, we just randomly pick 100 features from each image and group them into 200 clusters. The MatLab program to find the cluster centers is as follows. Even with only 100 features from each image, it took several hours to get the feature centers also known as visual token vocabulary which were finally saved to a file, FeatureCtrs.dat.
%File name: IF1_Find_VisualTokens.m %Author: Yan Naing Aye %Website: http://cool-emerald.blogspot.com/ %------------------------------------------------------------------------- %parameter settings NRF=100; %number of random features taken from each training image NVT=200; %number of visual tokens to find FV_Size=128; %feature vector size %------------------------------------------------------------------------- sDir=[pwd,'/training_png/']; %source directory sFile=[sDir,'*.sift']; %source files fList=dir(sFile); %get file list nFiles=size(fList,1); %number of files %------------------------------------------------------------------------- allFeatures=zeros(NRF*nFiles,FV_Size); %allocate and initialize for all features %(e.g. 100 random features each from 400 files and each feature has 128 dimensions) selFeatures=zeros(NRF,FV_Size); %allocate and initialize for random features si=1; ei=100; %starting index and ending index for each file for i=1:nFiles i %output progress sFile=[sDir,fList(i).name] %read all features from each file readFeatures=textread(sFile,'','headerlines',2); %each row has 133 elements and first 5 elements are u,v,a,b,c; feature vectors starts from 6th column readFeatures=readFeatures(:,6:end); %select random features nFeatures=size(readFeatures,1); p=randperm(nFeatures,NRF); selFeatures=readFeatures(p,:); %put the selected features to all features pool ei=i*NRF; si=ei-NRF+1; allFeatures(si:ei,:)=selFeatures; end %find feature centers by k-means clustering [idx,FeatureCtrs] = KMeansCustom(allFeatures,NVT); %save the feature centers to a file save -ascii -double -tabs FeatureCtrs.dat FeatureCtrs;
It can be run by entering 'IF1_Find_VisualTokens;' in the Octave terminal.
Histogram of Visual Tokens
Secondly, all the features from each training image are read and histograms of the visual tokens are built. These histograms are also called token frequency features.function h=GetHistOfVT(sFile) %Get histogram of visual tokens %Input: feature file path %Output: a row (1D array) of histogram %File name: GetHistOfVT.m %Author: Yan Naing Aye %Website: http://cool-emerald.blogspot.com/ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %read all features from each file readFeatures=textread(sFile,'','headerlines',2); %each row has 133 elements and first 5 elements are u,v,a,b,c; feature vectors starts from 6th column readFeatures=readFeatures(:,6:end); %load feature centers (visual tokens) load('FeatureCtrs.dat'); nf=size(readFeatures,1); nc=size(FeatureCtrs,1); %---------------------------------------------------------------- %initialize histogram h=zeros(1,nc); for i=1:nf minD=ChiDist(readFeatures(i,:),FeatureCtrs(1,:)); minC=1; for j=2:nc d=ChiDist(readFeatures(i,:),FeatureCtrs(j,:)); if (d<minD) minD=d; minC=j; end end h(minC)=h(minC)+1; end end
The resulting histograms are saved in a file, AllHistVT.dat. The program is run by entering 'IF2_Build_Histogram_of_VisualTokens; ' in the Octave command window. It also took a few hours in our case.
%File name: IF2_Build_Histogram_of_VisualTokens.m %Author: Yan Naing Aye %Website: http://cool-emerald.blogspot.com/ %------------------------------------------------------------------------- NVT=200; %number of visual tokens sDir=[pwd,'/training_png/']; %source directory sFile=[sDir, '*.sift']; %source files sList=dir(sFile); %source file list nFiles=size(sList,1); %number of files AllHistVT=zeros(nFiles,NVT); %histograms of visual tokens of all files %------------------------------------------------------------------------- for i=1:nFiles AllHistVT(i,:)=GetHistOfVT([sDir,sList(i).name]); i%output progress end save -ascii -double -tabs AllHistVT.dat AllHistVT;
Image Retrieval
Finally, a test image called query image is given and the most similar 9 training images are displayed. The program for this task is shown below and it can be run by entering 'IF3_Retrieve;' in Octave terminal.%File name: IF3_Retrieve.m %Author: Yan Naing Aye %Website: http://cool-emerald.blogspot.com/ %------------------------------------------------------------------------- %Get test file featureDir=[pwd,'/test_png/']; [uFileName,uPathName] = uigetfile('*.png','Select a test file',featureDir); [pathstr, fileNameWithoutExt, ext, versn] = fileparts(uFileName); featurefile=[featureDir, fileNameWithoutExt,'.png.harhes.sift']; %------------------------------------------------------------------------- %show test image figure; imshow([pwd,'/test_png/', fileNameWithoutExt,'.png']); title('Test image'); %------------------------------------------------------------------------- %Get histogram for test image figure; h=GetHistOfVT(featurefile); bar(h); title('Test image histogram'); %------------------------------------------------------------------------- %Load trained data load('AllHistVT.dat'); n=size(AllHistVT,1); cDist=zeros(n,1); for i=1:n; cDist(i)=ChiDist(h,AllHistVT(i,:)); end [B, IX] = sort(cDist); %------------------------------------------------------------------------- %Get folder path and wildcard imgDir=[pwd,'/training_png/']; imgFile=[imgDir, '*.png']; %Get file list and number of files fileList=dir(imgFile); %------------------------------------------------------------------------- %Show results nr=9;%number of results to show %settings cs=3; rs=ceil(nr./cs); figure; for i=1:nr imgFileName=fileList(IX(i)).name; subplot(rs,cs,i); imshow([imgDir,imgFileName]); title(['Rank ',num2str(i)]); end %-------------------------------------------------------------------------
When bike and human test images are given, it can take out the very similar images from the training images within a few seconds.
Here is the source zip file..
No comments:
Post a Comment
Comments are moderated and don't be surprised if your comment does not appear promptly.