![](https://crypto4nerd.com/wp-content/uploads/2023/06/1kzMvd_PeFtEjD7fJY4LSsQ.png)
The Essence of Machine Learning in Image Processing
Machine learning has revolutionized the field of image processing by enabling computers to learn patterns and make predictions from visual data. In the context of leaf classification, machine learning algorithms can be trained on a dataset of labeled leaf images to recognize and categorize different species based on their unique features. By leveraging the power of machine learning, we can automate the process of leaf identification and contribute to botanical research and conservation efforts.
To begin, let us import all necessary libraries.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from math import isclose
from fractions import Fraction
from skimage import data, io, filters, util, color
from skimage.morphology import (disk, square, rectangle, skeletonize,
erosion, dilation, opening, closing,
binary_erosion, binary_dilation,
binary_opening, binary_closing)
from skimage.measure import label, regionprops
from skimage.io import imread, imshow
from skimage.color import rgb2gray
from tqdm.notebook import tqdm, trange
from imblearn.over_sampling import SMOTE
from sklearn.preprocessing import (StandardScaler, MinMaxScaler,
OneHotEncoder, OrdinalEncoder)
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split, GridSearchCV, KFold
from sklearn.pipeline import make_pipeline, Pipeline
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
import cv2
import torch
from PIL import Image
from transformers import YolosFeatureExtractor, YolosForObjectDetection
from yoloface import face_analysis
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision import models, transforms, datasets
from PIL import Image, ImageDraw, ImageFont
import time
import math
import os
import shutil
import copy
from pathlib import Path
from torchsummary import summary
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=UserWarning)
# Empty the GPU memory cache
torch.cuda.empty_cache()
Now, let us see a sample image!
image_raw = io.imread('leaves/plantA_1.jpg')
fig, ax = plt.subplots()
ax.imshow(image_raw, cmap='gray');
gray_leaves = rgb2gray(image_raw[:,:,:3])
binary_leaves = util.invert(gray_leaves > 0.5)
plt.figure()
plt.imshow(binary_leaves, cmap='gray')
plt.show()
Segmentation
label_leaves = label(binary_leaves)
plt.figure()
plt.imshow(label_leaves);
raw_props = regionprops(label_leaves)[1:] # remove the background class
clean_props = [prop for prop in raw_props if prop.area > 10]
I will use the following regionprops properties to differentiate the leaves for the ML model:
- area
- perimeter
- eccentricity
- solidity
- extent
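As a quick sanity check of what these properties measure, they can be read off a tiny synthetic blob. The 10x10 mask below is a made-up example for illustration, not one of the leaf images:

```python
import numpy as np
from skimage.measure import label, regionprops

# Hypothetical binary mask with a single square "leaf" blob
mask = np.zeros((10, 10), dtype=bool)
mask[2:8, 2:8] = True

props = regionprops(label(mask))[0]
features = {
    'area': props.area,          # pixel count of the region
    'perim': props.perimeter,    # approximate boundary length
    'ecc': props.eccentricity,   # 0 for a circle, approaching 1 for a line
    'solid': props.solidity,     # area / convex hull area
    'extent': props.extent,      # area / bounding-box area
}
print(features)
```

For a filled square, solidity and extent are both exactly 1.0; elongated or ragged leaves push eccentricity up and solidity down, which is what makes these numbers useful as features.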
Object Feature Extraction
def get_class(fpath):
    '''
    Extracts the class of the leaves from the filepath.
    '''
    return fpath.split('/')[1].split('.')[0].split('_')[0]

leaves_data = []
folder_path = 'leaves'
for filename in tqdm(os.listdir(folder_path)):
    file_path = os.path.join(folder_path, filename)
    if os.path.isfile(file_path):
        image_raw = io.imread(file_path)
        gray_leaves = rgb2gray(image_raw[:, :, :3])
        binary_leaves = util.invert(gray_leaves > 0.5)
        label_leaves = label(binary_leaves)
        raw_props = regionprops(label_leaves)[1:]
        clean_props = [prop for prop in raw_props if prop.area > 10]
        for prop in clean_props:
            leaves_data.append({'area': prop.area,
                                'perim': prop.perimeter,
                                'ecc': prop.eccentricity,
                                'solid': prop.solidity,
                                'extent': prop.extent,
                                'label': get_class(file_path)})
df_leaves = pd.DataFrame(data=leaves_data)
display(df_leaves)
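The grid searches below reference X_trainval, y_trainval and a holdout pair X_hold, y_hold, whose creation is not shown above. A minimal sketch of how such a split could be produced; the stand-in data, the 75/25 split ratio, and the random seed here are assumptions for illustration:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for the df_leaves frame built above (synthetic values)
df_leaves = pd.DataFrame({
    'area': [120, 340, 95, 410, 130, 360, 100, 420],
    'perim': [40, 75, 35, 82, 42, 78, 36, 85],
    'ecc': [0.50, 0.80, 0.40, 0.90, 0.55, 0.82, 0.45, 0.88],
    'solid': [0.90, 0.95, 0.88, 0.97, 0.91, 0.94, 0.89, 0.96],
    'extent': [0.60, 0.70, 0.55, 0.75, 0.62, 0.71, 0.57, 0.74],
    'label': ['plantA', 'plantB'] * 4,
})

X = df_leaves.drop(columns='label')
y = df_leaves['label']

# Hold out 25% of the rows for final evaluation, stratified by class
X_trainval, X_hold, y_trainval, y_hold = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=69)
```

Stratifying keeps the class proportions of the holdout set close to those of the training pool, which matters when some plant species have far fewer leaves than others.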
KNN
pipeline = Pipeline(steps=[('scl', StandardScaler()),
                           ('model', KNeighborsClassifier())])

param_grid = {'model__n_neighbors': list(range(5, 31, 5))}
scoring = 'accuracy'
cv = 3

grid_search = GridSearchCV(estimator=pipeline,
                           param_grid=param_grid,
                           scoring=scoring,
                           cv=cv,
                           n_jobs=-1,
                           verbose=1,
                           return_train_score=True)
grid_search.fit(X_trainval, y_trainval)
# grid_search.fit(X_trainval_res, y_trainval_res)

val_acc = grid_search.best_score_
train_acc = grid_search.cv_results_[
    'mean_train_score'][grid_search.best_index_]
hold_acc = grid_search.score(X_hold, y_hold)
print(f'\nKNN Classifier\n\nTrain score: {train_acc:.3f}\n'
      f'Val score: {val_acc:.3f}\n\nTest score: {hold_acc:.3f}')
Fitting 3 folds for each of 6 candidates, totalling 18 fits

KNN Classifier
Train score: 0.825
Val score: 0.803
Test score: 0.714
Logistic Regression
pipeline = Pipeline(steps=[('scl', StandardScaler()),
                           ('model', LogisticRegression())])

param_grid = {'model__C': [0.1, 1, 5, 10, 100, 1000],
              'model__penalty': ['l2'],
              'model__solver': ['liblinear'],
              'model__random_state': [69]}
scoring = 'accuracy'
cv = 3

grid_search = GridSearchCV(estimator=pipeline,
                           param_grid=param_grid,
                           scoring=scoring,
                           cv=cv,
                           n_jobs=-1,
                           verbose=1,
                           return_train_score=True)
grid_search.fit(X_trainval, y_trainval)
# grid_search.fit(X_trainval_res, y_trainval_res)

val_acc = grid_search.best_score_
train_acc = grid_search.cv_results_[
    'mean_train_score'][grid_search.best_index_]
hold_acc = grid_search.score(X_hold, y_hold)
print(f'\nLogistic Regression\n\nTrain score: '
      f'{train_acc:.3f}\nVal score: {val_acc:.3f}'
      f'\n\nTest score: {hold_acc:.3f}')
Fitting 3 folds for each of 6 candidates, totalling 18 fits

Logistic Regression
Train score: 0.865
Val score: 0.831
Test score: 0.714
GBM
pipeline = Pipeline(steps=[('scl', StandardScaler()),
                           ('model', GradientBoostingClassifier())])

param_grid = {'model__learning_rate': [0.001],
              'model__max_features': [3, 4, 5],
              'model__max_depth': [10, 20],
              'model__random_state': [69]}
scoring = 'accuracy'
cv = 3

grid_search = GridSearchCV(estimator=pipeline,
                           param_grid=param_grid,
                           scoring=scoring,
                           cv=cv,
                           n_jobs=-1,
                           verbose=1,
                           return_train_score=True)
grid_search.fit(X_trainval, y_trainval)
# grid_search.fit(X_trainval_res, y_trainval_res)

val_acc = grid_search.best_score_
train_acc = grid_search.cv_results_[
    'mean_train_score'][grid_search.best_index_]
hold_acc = grid_search.score(X_hold, y_hold)
print(f'GBM\n\nTrain score: '
      f'{train_acc:.3f}\nVal score: {val_acc:.3f}'
      f'\n\nTest score: {hold_acc:.3f}')
Fitting 3 folds for each of 6 candidates, totalling 18 fits

GBM

Train score: 0.998
Val score: 0.798
Test score: 0.821
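To compare the three classical models side by side, the scores reported above can be collected into a small table (the numbers are copied from the outputs, not recomputed):

```python
import pandas as pd

# Holdout results reported above for each classical model
results = pd.DataFrame({
    'model': ['KNN', 'Logistic Regression', 'GBM'],
    'train': [0.825, 0.865, 0.998],
    'val': [0.803, 0.831, 0.798],
    'test': [0.714, 0.714, 0.821],
})
print(results.sort_values('test', ascending=False))
```

GBM scores best on the holdout set, though the large gap between its train and validation accuracy suggests it is overfitting more than the other two models.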
Deep Learning
I will also implement a deep learning model to classify the leaves.
output_folder = 'cropped_leaves'

# Delete the 'cropped_leaves' folder if it exists
if os.path.exists(output_folder):
    shutil.rmtree(output_folder)
if not os.path.exists(output_folder):
    os.makedirs(output_folder)

class_count = {}
leaf_count = 0
classes = []
for filename in tqdm(os.listdir(folder_path)):
    file_path = os.path.join(folder_path, filename)
    if os.path.isfile(file_path):
        leaf_class = get_class(file_path)
        if leaf_class not in classes:
            classes.append(leaf_class)
        if leaf_class not in class_count:
            class_count[leaf_class] = 0
        image_raw = io.imread(file_path)
        gray_leaves = rgb2gray(image_raw[:, :, :3])
        binary_leaves = util.invert(gray_leaves > 0.5)
        label_leaves = label(binary_leaves)
        raw_props = regionprops(label_leaves)[1:]  # remove the background class
        clean_props = [prop for prop in raw_props if prop.area > 10]  # just the leaves, remove specks
        image = Image.open(file_path)
        for prop in clean_props:
            class_count[leaf_class] += 1
            leaf_count += 1
            cropped_image = image.crop((prop.bbox[1],   # left
                                        prop.bbox[0],   # top
                                        prop.bbox[3],   # right
                                        prop.bbox[2]))  # bottom
            class_folder = os.path.join(output_folder, leaf_class)
            if not os.path.exists(class_folder):
                os.makedirs(class_folder)
            new_filename = f"leaf_{leaf_count}.jpg"
            new_file_path = os.path.join(class_folder, new_filename)
            cropped_image.save(new_file_path)

# Rename files to ensure monotonically labeled filenames
for leaf_class, count in class_count.items():
    class_folder = os.path.join(output_folder, leaf_class)
    for i, filename in enumerate(os.listdir(class_folder), start=1):
        file_path = os.path.join(class_folder, filename)
        new_filename = f"leaf_{i}.jpg"
        new_file_path = os.path.join(class_folder, new_filename)
        os.rename(file_path, new_file_path)
def create_dataset(src, dst, range_, class_):
    """Copy images of class class_ within range_ from src to dst.

    Parameters
    ----------
    src : str
        source directory
    dst : str
        destination directory
    range_ : tuple
        tuple of min and max image index to copy
    class_ : str
        image class ('plantA', 'plantB', ...)
    """
    if os.path.exists(dst):
        shutil.rmtree(dst)
    os.makedirs(dst)
    fnames = [f'leaf_{i}.jpg' for i in range(*range_)]
    for fname in fnames:
        src_file = os.path.join(src, fname)
        dst_file = os.path.join(dst, fname)
        shutil.copyfile(src_file, dst_file)
# Looping through create_dataset for each class
class_counts = [len(os.listdir(os.path.join(output_folder, class_)))
                for class_ in classes]  # number of pictures for each class
for class_, count in zip(classes, class_counts):
    src = output_folder
    # Define the custom ranges for each split based on the available count
    train_range = (1, int(0.7 * count))                          # 70% for training
    validation_range = (int(0.7 * count) + 1, int(0.9 * count))  # 20% for validation
    test_range = (int(0.9 * count) + 1, count)                   # 10% for testing
    dst = f'cropped_leaves/train/{class_}'
    create_dataset(src + '/' + class_, dst, range_=train_range, class_=class_)
    dst = f'cropped_leaves/validation/{class_}'
    create_dataset(src + '/' + class_, dst, range_=validation_range, class_=class_)
    dst = f'cropped_leaves/test/{class_}'
    create_dataset(src + '/' + class_, dst, range_=test_range, class_=class_)
data_path = Path(output_folder)
data_path_list = list(data_path.glob("*/*.jpg"))
train_dir = data_path / "train"
val_dir = data_path / "validation"
test_dir = data_path / "test"
train_data = datasets.ImageFolder(root=train_dir,
transform=transforms.Compose(
[transforms.Resize((224, 224)),
transforms.ToTensor()]))
all_images = torch.stack([img_t for img_t, _ in train_data], dim=3)
means = all_images.view(3, -1).mean(dim=1).numpy()
stds = all_images.view(3, -1).std(dim=1).numpy()
data_transforms = {
'train': transforms.Compose([
transforms.Resize((224, 224)),
transforms.ToTensor(),
transforms.Normalize(mean=means, std=stds)
]),
'validation': transforms.Compose([
transforms.Resize((224, 224)),
transforms.ToTensor(),
transforms.Normalize(mean=means, std=stds)
]),
'test': transforms.Compose([
transforms.Resize((224, 224)),
transforms.ToTensor(),
transforms.Normalize(mean=means, std=stds)
])
}
data_dir = data_path
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
data_transforms[x])
for x in ['train', 'validation', 'test']}
dataloaders = {x: DataLoader(image_datasets[x], batch_size=1,
shuffle=True)
for x in ['train', 'validation', 'test']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'validation', 'test']}
class_names = image_datasets['train'].classes
class_names
['plantA', 'plantB', 'plantC', 'plantD', 'plantE']
Fine-tuned VGG-19
# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load train, validation, and test datasets
train_data = image_datasets['train']
val_data = image_datasets['validation']
test_data = image_datasets['test']

# Define data loaders
train_loader = dataloaders['train']
valid_loader = dataloaders['validation']
test_loader = dataloaders['test']
# Load the pretrained VGG19 model
model = models.vgg19(pretrained=True)
# Freeze the parameters of the pretrained layers
for param in model.parameters():
    param.requires_grad = False
# Modify the last fully connected layer to match the number of classes
num_classes = len(class_names)
model.classifier[6] = nn.Linear(4096, num_classes)
# Move the model to the appropriate device
model = model.to(device)
# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Train the model
num_epochs = 30
best_valid_loss = float('inf')
for epoch in range(num_epochs):
    train_loss = 0.0
    valid_loss = 0.0

    # Training
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        train_loss += loss.item() * images.size(0)

    # Validation
    model.eval()
    with torch.no_grad():
        for images, labels in valid_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)
            valid_loss += loss.item() * images.size(0)

    train_loss = train_loss / len(train_loader.dataset)
    valid_loss = valid_loss / len(valid_loader.dataset)
    print(f"Epoch: {epoch+1}/{num_epochs}, Train Loss: {train_loss:.4f}, "
          f"Valid Loss: {valid_loss:.4f}")

    # Save the best model based on validation loss
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(model.state_dict(), 'best_leaves_model.pt')
# Test the model
model.load_state_dict(torch.load('best_leaves_model.pt'))
model.eval()

correct = 0
total = 0
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = 100 * correct / total
print(f"Test Accuracy: {accuracy:.2f}%")
Epoch: 1/30, Train Loss: 0.4749, Valid Loss: 0.1140
Epoch: 2/30, Train Loss: 0.1253, Valid Loss: 0.1190
Epoch: 3/30, Train Loss: 0.1363, Valid Loss: 0.1185
Epoch: 4/30, Train Loss: 0.0705, Valid Loss: 0.0598
Epoch: 5/30, Train Loss: 0.0796, Valid Loss: 0.0407
Epoch: 6/30, Train Loss: 0.0894, Valid Loss: 0.0853
Epoch: 7/30, Train Loss: 0.0718, Valid Loss: 0.0495
Epoch: 8/30, Train Loss: 0.0468, Valid Loss: 0.0574
Epoch: 9/30, Train Loss: 0.0561, Valid Loss: 0.0269
Epoch: 10/30, Train Loss: 0.0543, Valid Loss: 0.0370
Epoch: 11/30, Train Loss: 0.0639, Valid Loss: 0.1421
Epoch: 12/30, Train Loss: 0.0657, Valid Loss: 0.0481
Epoch: 13/30, Train Loss: 0.0876, Valid Loss: 0.0563
Epoch: 14/30, Train Loss: 0.0704, Valid Loss: 0.0673
Epoch: 15/30, Train Loss: 0.0948, Valid Loss: 0.0335
Epoch: 16/30, Train Loss: 0.0389, Valid Loss: 0.0402
Epoch: 17/30, Train Loss: 0.0598, Valid Loss: 0.0882
Epoch: 18/30, Train Loss: 0.0962, Valid Loss: 0.0865
Epoch: 19/30, Train Loss: 0.0872, Valid Loss: 0.0295
Epoch: 20/30, Train Loss: 0.0600, Valid Loss: 0.0713
Epoch: 21/30, Train Loss: 0.0445, Valid Loss: 0.1138
Epoch: 22/30, Train Loss: 0.0997, Valid Loss: 0.2957
Epoch: 23/30, Train Loss: 0.0483, Valid Loss: 0.0287
Epoch: 24/30, Train Loss: 0.0539, Valid Loss: 0.0495
Epoch: 25/30, Train Loss: 0.0744, Valid Loss: 0.0827
Epoch: 26/30, Train Loss: 0.1011, Valid Loss: 0.1187
Epoch: 27/30, Train Loss: 0.0960, Valid Loss: 0.2000
Epoch: 28/30, Train Loss: 0.0857, Valid Loss: 0.1095
Epoch: 29/30, Train Loss: 0.0709, Valid Loss: 0.0857
Epoch: 30/30, Train Loss: 0.0584, Valid Loss: 0.1196
Test Accuracy: 96.00%
model.load_state_dict(torch.load('best_leaves_model.pt'))
model.eval()

label_map = {k: v for k, v in enumerate(image_datasets['train'].classes)}

fig, ax = plt.subplots(5, 5, figsize=(25, 25))
ax = ax.flatten()
plt.suptitle('Test set predictions vs ground truth', fontsize=24)
plt.subplots_adjust(wspace=0.1, hspace=0.3)
for idx, (images, labels) in enumerate(test_loader):
    images = images.to(device)
    labels = labels.to(device)
    outputs = model(images)
    _, predicted = torch.max(outputs.data, 1)

    # Map the predicted class index to the corresponding label
    predicted_labels = [label_map[p.item()] for p in predicted]
    # Convert the labels to a list of strings
    actual_labels = [label_map[l.item()] for l in labels]

    # Iterate over the images, actual labels, and predicted labels
    for image, actual_label, predicted_label in zip(images,
                                                    actual_labels,
                                                    predicted_labels):
        # Move the image to the CPU and convert it to a NumPy array
        image = image.cpu().numpy()
        image = np.transpose(image, (1, 2, 0))
        # Clip the image data to the valid range [0, 1]
        image = np.clip(image, 0, 1)
        # Display the image
        ax[idx].imshow(image)
        ax[idx].axis('off')
        # Add the actual and predicted labels as title
        if actual_label == predicted_label:
            ax[idx].set_title(f"Actual: {actual_label}\nPredicted: "
                              f"{predicted_label}\nCORRECT", fontsize=16)
        else:
            ax[idx].set_title(f"Actual: {actual_label}\nPredicted: "
                              f"{predicted_label}\nWRONG", fontsize=16)
fig.show()
Significance and Applications of Machine Learning in Leaf Classification
The application of machine learning in leaf classification has significant implications in various domains. From botanical research and species identification to ecological studies and plant disease detection, machine learning-based leaf classification techniques offer invaluable insights and automation capabilities. By leveraging large-scale datasets and powerful algorithms, we can enhance our understanding of plant biodiversity and contribute to sustainable environmental practices.
Closing Thoughts
Through the implementation of machine learning techniques in leaf image processing, we have witnessed the transformative capabilities of this approach. By training models to classify leaf images accurately, we can automate and streamline the process of leaf identification, opening up avenues for botanical research and conservation efforts. The power of machine learning lies in its ability to learn from vast amounts of visual data, making it an invaluable tool in the field of leaf classification and beyond. So, let’s embrace the potential of machine learning in image processing and embark on a journey to uncover the wonders of our botanical world through automated leaf classification.
References
Benjur Emmanuel L. Borja. 2023. Image Processing(MSDS2023). Asian Institute of Management.