Introduction

With the advent of neural networks and advances in deep learning for computer vision, we have stopped thinking about the features these black-box models extract and how they affect the model's accuracy. With the recent advances in transfer learning, we have also neglected the model-building process: we take a pre-trained model and change its last layers, adding our own based on the number of classes to be predicted for our data.

Eventually, this process of adding or removing layers becomes a trial-and-error hunt for good "accuracy", and we will probably get outstanding "accuracy" depending on how many changes we make to the model. But in chasing accuracy alone (which, mind you, is important) we have forgotten the essence of extracting different types of features from images: features that describe the texture and shape of different regions in the image.

One may ask: why go through the hassle of extracting features manually and then training a machine learning model? I have developed a Python package that takes care of extracting features from each image and storing them sequentially in a data frame, so that traditional machine learning models can be applied later. Note that the package only works for single-channel grayscale images.

Installing the Package

Installing OpenCV dependencies

sudo apt-get update # Opencv-Deps

sudo apt-get install build-essential checkinstall cmake pkg-config yasm

sudo apt-get install git gfortran

sudo apt-get install libjpeg8-dev libjasper-dev libpng12-dev

sudo apt-get install libtiff5-dev

sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libdc1394-22-dev

sudo apt-get install libxine2-dev libv4l-dev

sudo apt-get install libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev

sudo apt-get install qt5-default libgtk2.0-dev libtbb-dev

sudo apt-get install libatlas-base-dev

sudo apt-get install libfaac-dev libmp3lame-dev libtheora-dev

sudo apt-get install libvorbis-dev libxvidcore-dev

sudo apt-get install libopencore-amrnb-dev libopencore-amrwb-dev

sudo apt-get install x264 v4l-utils

sudo apt-get install libprotobuf-dev protobuf-compiler

sudo apt-get install libgoogle-glog-dev libgflags-dev

sudo apt-get install libgphoto2-dev libeigen3-dev libhdf5-dev doxygen

sudo apt-get install python3-dev python3-pip

A. Install using PIP



# package installation
pip3 install git+https://github.com/vatsalsaglani/xrayimage_extractfeatures.git

B. Clone and Install



git clone https://github.com/vatsalsaglani/xrayimage_extractfeatures.git

cd xrayimage_extractfeatures

python3 setup.py install

Introduction to the Package

The package includes the following:

1. GLCM Features

Correlation

Homogeneity

Energy

Contrast

But, what is GLCM?

Statistically, a gray-level co-occurrence matrix (GLCM) is a way of examining texture that considers the spatial relationship of pixels. It characterizes the texture of an image by how often pairs of pixels with specific gray values, in a specified spatial relationship, occur in the image.

Extracting GLCM Features:

from xtract_features.glcms import *

feats = glcm(img)

# energy
energy = feats.energy()

# correlation
corr = feats.correlation()

# contrast
cont = feats.contrast()

# homogeneity
homogeneity = feats.homogeneity()

# all the features at once
_all = feats.glcm_all()

2. Moments

24 variant image moment values

Hu Moments

How are moments used for images?

Image moments are used to describe objects after segmentation and play an essential role in object recognition and shape analysis. They can be employed for pattern recognition in images. Simple image properties derived from the raw moments include the area, or the sum of gray levels.
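As a quick illustration independent of the package, raw moments can be computed directly from the pixel grid; for a binary image the zeroth moment is the area, and the first-order moments give the centroid:

```python
import numpy as np

def raw_moment(img, p, q):
    """Raw image moment M_pq = sum over pixels of x^p * y^q * I(y, x)."""
    h, w = img.shape
    y, x = np.mgrid[:h, :w]
    return np.sum((x ** p) * (y ** q) * img.astype(np.float64))

# A simple binary blob: for a binary image, M00 is the object's area.
img = np.zeros((5, 5))
img[1:4, 1:4] = 1  # 3x3 square of ones

area = raw_moment(img, 0, 0)        # number of "on" pixels
cx = raw_moment(img, 1, 0) / area   # centroid x
cy = raw_moment(img, 0, 1) / area   # centroid y
```

The Hu moments are specific combinations of such moments chosen to be invariant to translation, scale, and rotation.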

Extracting Moments from an Image

from xtract_features.moments import *

_moments = moment(img).get_moments()
_hu_moments = moment(img).get_HuMoments()

_moments is a list of 24 variant moments, and _hu_moments is a list of the 7 Hu moments, which are invariant.

3. Region Properties

Extracting Region Properties

from xtract_features.region_props import *

_rp = region_props(img)

# maximum area region
max_area = _rp.max_area()

# plot regions
_rp.plot_image()

# plot black and white
_rp.plot_show_bw()

# plot with labels
_rp.plot_image_with_label()

# mean of areas of all the regions
_rp.mean_area()

# eccentricity of the highest area region
_rp.eccentricity()

Apart from the functions given above, the package contains 20 more functions to extract various other features from an image:

euler_number()

solidity()

perimeter()

# standard deviation of all the areas of the regions of the given image

std_area()

# otsu’s Threshold

thresh_image()

bb()

bb_area()

centroid_r()

convex_area_r()

coordinates_r()

eq_diameter()

extent_r()

filled_area_r()

inertia_tensor_area()

label_r()

inertia_tensor_eigvals_r()

local_centroid_r()

maj_ax_len()

min_ax_len()

orient()
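To see what these region properties measure, here is a minimal sketch using scikit-image's measure module directly, independent of the package:

```python
import numpy as np
from skimage.measure import label, regionprops

# A binary image containing two separate blobs.
img = np.zeros((10, 10), dtype=np.uint8)
img[1:4, 1:4] = 1   # 3x3 blob
img[6:9, 6:8] = 1   # 3x2 blob

# Label connected regions, then query per-region properties.
regions = regionprops(label(img))
areas = sorted(int(r.area) for r in regions)
max_area = max(areas)
```

Properties such as eccentricity, solidity, perimeter, and the bounding box are attributes of the same region objects.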

4. Helpers

This module provides basic functionality such as extracting numpy image arrays given the path to a folder of DICOM files, saving the resulting list of arrays to a pickle file, and loading any previously saved pickle file.

a. Extract list of numpy-image-arrays: extract_img_array()

from xtract_features.helpers import extract_img_array

# get a list of numpy image arrays and a list of filenames/ids
numpy_list, ids = extract_img_array('path-to-image-folder', getID = True)

# only get a list of numpy image arrays
# (here the ids list will be an empty list)
numpy_list, ids = extract_img_array('path-to-image-folder')

b. Save Pickle: save_pickle()

Given the extracted list of numpy-image-arrays or list-of-image-ids/names we can save it using the save_pickle() function.

from xtract_features.helpers import save_pickle

save_pickle(numpy_list, "numpy-list")

save_pickle(ids, “ids-list”)

c. Load Pickle: load_pickle()

Given any saved .pkl file we can load it using the load_pickle() function.

from xtract_features.helpers import load_pickle

np_list = load_pickle("numpy-list")

ids = load_pickle(“ids-list”)

d. Show Image: show()

Display the image stored in the form of numpy-image-array using the show() function.

from xtract_features.helpers import show

# show with title
show(np_list[1], title = ids[1])

# show without title
show(np_list[1])

e. Plots: plots()

Using the plots() function display a list of images stored in the form of numpy-image-array.

from xtract_features.helpers import plots

# plots with titles
plots(np_list[:8], titles = ids[:8])

# plots without titles
plots(np_list[:8])

5. Feature Extraction

By now you may be waiting for the fulfillment of the promise made at the start: extracting a bunch of features from every image inside a folder and saving them into a data frame. The time has finally come to unveil the functionality that adds the most value to this package.

But before jumping into that let’s see some other functionalities of this module.

a. Entropy

from xtract_features.extract import s_entropy, entropy_simple

# shannon's entropy
s_entr = s_entropy(img)

# simple entropy
entr_simp = entropy_simple(img)
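For reference, Shannon entropy can be sketched directly from an image's gray-level histogram (a generic illustration; the package's exact definitions may differ):

```python
import numpy as np

def shannon_entropy(img, levels=256):
    """Shannon entropy (in bits) of the image's gray-level distribution."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins; log of zero probability is undefined
    return -np.sum(p * np.log2(p))

# A flat image carries no information: entropy 0.
flat = np.full((8, 8), 7, dtype=np.uint8)

# Two equally likely gray levels: entropy 1 bit.
two_tone = np.zeros((8, 8), dtype=np.uint8)
two_tone[:, 4:] = 255
```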

b. Feature Dictionary from Image Path: feature_dict_from_imgpath()

getId = True: The keys of the generated feature dictionary are the image-names/image-ids extracted while generating the numpy-image-array list.

getId = False (default): The keys of the generated feature dictionary are integers which correspond to the lists of features.

from xtract_features.extract import feature_dict_from_imgpath

# getId = True
data_d = feature_dict_from_imgpath('path-of-image-folder', ids, getId = True)

# getId = False (default)
data_d = feature_dict_from_imgpath('path-of-image-folder', [])

c. Feature Dictionary from Image Array List: feature_dict_from_imgarray()

getId = True: The keys of the generated feature dictionary are the image-names/image-ids extracted while generating the numpy-image-array list.

getId = False (default): The keys of the generated feature dictionary are integers which correspond to the lists of features.

from xtract_features.extract import feature_dict_from_imgarray

# getId = True
data_d = feature_dict_from_imgarray(numpy_list, ids, getId = True)

# getId = False (default)
data_d = feature_dict_from_imgarray(numpy_list, [])

d. Get Data frame from Image Path: get_df_from_path()

getId = True: if you need the features .csv file to have the corresponding image-names/ids for the feature values.

getId = False (default): the output file will be numbered and will not contain any image-name/id corresponding to the feature values.

from xtract_features.extract import get_df_from_path

# getId = True
df = get_df_from_path('path-to-image-folder', ids, getId = True)

# getId = False (default)
df = get_df_from_path('path-to-image-folder', [])

e. Get Data frame from Image Array list: get_df_from_img_array()

getId = True: if you need the features .csv file to have the corresponding image-names/ids for the feature values.

getId = False (default): the output file will be numbered and will not contain any image-name/id corresponding to the feature values.

from xtract_features.extract import get_df_from_img_array

# getId = True
df = get_df_from_img_array(numpy_list, ids, getId = True)

# getId = False (default)
df = get_df_from_img_array(numpy_list, [])

6. Extras

2D Convolutions

Segmentation

2D Convolutions: conv2d()

For edge detection, sharpening, and blurring an image, we use 2D convolutions.

from xtract_features.twodconv import conv2d

conv2d(image, "kernel-name")

There are 14 convolution kernels/matrices available inside the package:

identity

edge-all

edge-H

edge-V

sharp

gauss-3

gauss-5

boxblur

unsharp

gradient-H

gradient-V

sobel-H

sobel-V

emboss
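To illustrate what a 2D convolution computes, here is a naive "valid" convolution sketch with one edge-detection kernel. This is a generic illustration; the exact kernel matrices used by the package are not shown here, so edge_h below is only an assumed analogue of its edge-H kernel:

```python
import numpy as np

def conv2d_simple(image, kernel):
    """Naive 'valid' 2D convolution (kernel flipped, as in true convolution)."""
    k = np.flipud(np.fliplr(kernel))
    kh, kw = k.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out

# A horizontal-edge kernel: responds where brightness changes vertically.
edge_h = np.array([[ 1,  1,  1],
                   [ 0,  0,  0],
                   [-1, -1, -1]], dtype=float)

# An image whose bottom half is bright: the response peaks at the boundary.
img = np.vstack([np.zeros((3, 5)), np.ones((3, 5))])
resp = conv2d_simple(img, edge_h)
```

Swapping the kernel for a Gaussian or box matrix turns the same loop into a blur instead of an edge detector.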

Segmentation: water_seg()

For now, the package includes only one segmentation technique: Watershed Segmentation.
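Watershed segmentation floods a topographic surface, often the negated distance transform, outward from seed markers. Here is a minimal sketch with scikit-image and SciPy, independent of the package's water_seg() implementation:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

# Two touching bright squares that should come out as separate regions.
img = np.zeros((20, 20), dtype=bool)
img[4:16, 2:10] = True
img[4:16, 10:18] = True

# Distance to the background; each square's interior forms a "peak".
distance = ndi.distance_transform_edt(img)

# Hand-placed seed markers, one per expected region (a real pipeline
# would locate them automatically, e.g. with peak_local_max).
markers = np.zeros(img.shape, dtype=int)
markers[10, 5] = 1
markers[10, 14] = 2

# Flood the negated distance map from the markers, restricted to the mask.
labels = watershed(-distance, markers, mask=img)
```

Each pixel of the foreground ends up labeled 1 or 2 depending on which basin it drains into; the background stays 0.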