![](https://crypto4nerd.com/wp-content/uploads/2023/05/1sE6Qk3mK-cK3-0mpwxKFDA.jpeg)
Machine learning (ML) has revolutionized the field of computer science and is being used in a myriad of applications from personalized recommendation systems to autonomous vehicles. For those interested in ML, creating a number classifier can be an excellent starting point. This article provides a high-level overview of how to approach such a project using supervised learning methods.
The objective of a number classifier is to correctly identify handwritten digits. The MNIST (Modified National Institute of Standards and Technology) database is often used for this purpose. This dataset contains 70,000 grayscale images of handwritten digits, each of which is 28×28 pixels.
The approach to this project will be based on supervised learning, a type of machine learning where the model is trained on a labeled dataset. In the case of the MNIST dataset, the labels are the actual digits that the images represent (0–9).
Before training a machine learning model, the data must be prepared. This often involves several steps including splitting the dataset into a training set and a testing set. The training set is used to train the model, while the testing set is used to evaluate its performance.
Additionally, it’s important to normalize the image data. Image data is typically composed of pixel intensities ranging from 0 (black) to 255 (white). Scaling these values to the range 0–1 makes it easier for the model to learn from the data.
```python
# Import necessary libraries
from keras.datasets import mnist

# Load dataset (MNIST comes pre-split into training and test sets)
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalize pixel values to be between 0 and 1
X_train, X_test = X_train / 255.0, X_test / 255.0

# Reshape input data from (28, 28) to (28, 28, 1) to add a channel dimension
w, h = 28, 28
X_train = X_train.reshape(X_train.shape[0], w, h, 1)
X_test = X_test.reshape(X_test.shape[0], w, h, 1)
```
Choosing the right model for the task is crucial. For a number classifier, a classic choice is the Support Vector Machine (SVM), a model that’s often used for classification tasks. However, Convolutional Neural Networks (CNNs) tend to perform better on image data, so a small CNN is used here.
```python
# Import necessary libraries
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten

# Create a sequential model
model = Sequential()

# Add layers
model.add(Conv2D(64, kernel_size=3, activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(32, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))  # Output layer: one unit per digit
```
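For comparison, the SVM approach mentioned above can be sketched with scikit-learn. This illustrative example uses scikit-learn’s built-in 8×8 digits dataset as a smaller stand-in for MNIST so that it trains in seconds; the same pattern applies to the full dataset, just with flattened 784-pixel vectors:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load the small 8x8 digits dataset (a stand-in for MNIST in this sketch)
digits = load_digits()
X = digits.data / 16.0  # scale pixel values to the range 0-1
y = digits.target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# An RBF-kernel SVM is a solid baseline for digit classification
clf = SVC(kernel='rbf', gamma='scale')
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print('SVM test accuracy:', acc)
```

Note that an SVM expects flat feature vectors, so images must be flattened rather than reshaped to (28, 28, 1) as for the CNN.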
After selecting a model, the next step is to train it using the training data. This involves feeding the data into the model and allowing it to adjust its internal parameters to better fit the data. This is done through a process known as gradient descent, where the model incrementally adjusts its parameters to minimize the difference between its predictions and the actual labels.
```python
# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3)
```
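The idea behind gradient descent can be seen in miniature on a one-dimensional toy problem: repeatedly step a parameter in the direction that reduces the loss. This is only an illustration of the update rule; in practice Keras computes the gradients of the network’s loss automatically.

```python
# Minimize f(w) = (w - 3)^2 by gradient descent; its gradient is 2 * (w - 3)
w = 0.0
learning_rate = 0.1

for step in range(100):
    grad = 2 * (w - 3)       # gradient of the loss at the current w
    w -= learning_rate * grad  # step against the gradient

print('w after descent:', w)  # converges toward the minimum at w = 3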
Once the model is trained, it’s important to evaluate its performance. This is done using the testing set. The model’s predictions on this set are compared to the actual labels to compute various performance metrics. One of the most common metrics for classification tasks is accuracy, which measures the proportion of correct predictions. However, other metrics such as precision, recall, and the F1 score may also be relevant, particularly in cases where the classes are imbalanced.
```python
# Evaluate the model
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test accuracy:', test_acc)
```
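The additional metrics mentioned above are easy to compute with scikit-learn. The labels below are made up purely for illustration; with the trained model, `y_pred` would come from `model.predict(X_test).argmax(axis=1)`.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Hypothetical true vs. predicted labels, for illustration only
y_true = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 1, 2, 1, 1, 0])

# 'macro' averaging treats every class equally, which highlights
# poor performance on rare classes in imbalanced datasets
print('precision:', precision_score(y_true, y_pred, average='macro'))
print('recall:   ', recall_score(y_true, y_pred, average='macro'))
print('f1:       ', f1_score(y_true, y_pred, average='macro'))
```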
Even after training a model, there’s usually room for improvement. By tuning the model’s hyperparameters (parameters that are not learned from the data), it’s often possible to increase its performance. Grid search and random search are common methods for hyperparameter tuning.
```python
# Import necessary libraries
# Note: keras.wrappers.scikit_learn has been removed from recent Keras
# releases; the scikeras package (pip install scikeras) is its replacement.
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import GridSearchCV

# Function to create the model for KerasClassifier
def create_model():
    model = Sequential()
    model.add(Conv2D(64, kernel_size=3, activation='relu', input_shape=(28, 28, 1)))
    model.add(Conv2D(32, kernel_size=3, activation='relu'))
    model.add(Flatten())
    model.add(Dense(10, activation='softmax'))
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

# Create the KerasClassifier
model = KerasClassifier(model=create_model, verbose=0)

# Define the grid search parameters
batch_size = [10, 20, 40, 60, 80, 100]
epochs = [10, 50, 100]
param_grid = dict(batch_size=batch_size, epochs=epochs)

# Conduct the grid search (this trains one model per parameter combination
# and cross-validation fold, so it can take a long time)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X_train, y_train)

# Print results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
```
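Random search, the other tuning method mentioned above, samples a fixed number of parameter combinations instead of trying them all, which is often much cheaper for large grids. The sketch below demonstrates scikit-learn’s `RandomizedSearchCV` on a random forest rather than the Keras model, purely so it runs quickly; the same estimator/parameter pattern applies to a `KerasClassifier`.

```python
from scipy.stats import randint
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small built-in digits dataset, standing in for MNIST in this sketch
X, y = load_digits(return_X_y=True)

# Distributions to sample from, instead of an exhaustive grid
param_dist = {
    'n_estimators': randint(10, 100),
    'max_depth': randint(2, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=5,       # only 5 sampled combinations are evaluated
    cv=3,
    random_state=0,
    n_jobs=-1,
)
search.fit(X, y)

print('Best score: ', search.best_score_)
print('Best params:', search.best_params_)
```

With a large grid, `n_iter` directly caps the compute budget, which is the main practical advantage over an exhaustive grid search.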
Creating a number classifier provides a comprehensive introduction to the workflow of a machine learning project, covering key aspects from data preparation and model selection to model training and evaluation. By understanding and applying these concepts, one can lay a solid foundation for more complex machine learning projects.
Here’s a copy of the code for you to play with on your own: https://colab.research.google.com/drive/1CkHY3XBmlZ6_TDF60oJxEzskDHDBbbNa?usp=sharing