Navigating the Journey of Model Deployment and User Interaction for Seamless Machine Learning Experiences
Introduction
In today’s data-driven world, machine learning models are valuable not only because they can make accurate predictions, but because those predictions can drive real-world impact. Moving from building a good machine learning model to actually using it is a critical step: it turns complicated math and logic into tools that can inform business decisions, improve user experiences, and automate difficult tasks.
Deploying machine learning models is both an art and a science, requiring a good mix of technical know-how and domain expertise. In this article, we’ll explore this important step in depth, focusing on some of the key technologies involved: Flask, Docker, and Kubernetes, all managed within the robust framework of the Google Cloud Platform.
From Development to Production: A Synergistic Approach
Deploying machine learning models involves much more than mere technical execution. It bridges the gap between development and production environments, ensuring that models function seamlessly and efficiently for their intended users. With this in mind, we’re set to delve into the intricacies of creating a scalable, interactive, and user-friendly ecosystem that revolves around our models.
Let’s take a quick look at the essential tools that form the foundation of our deployment process. These tools work together seamlessly to transform our machine learning model into a functional application hosted on Google Cloud Platform.
Scikit-learn
Scikit-learn is a powerful and user-friendly machine learning library that’s written in Python. It offers a wide range of tools for tasks such as classification, regression, clustering, and more. For our purpose, we’ll utilize scikit-learn to create a simple yet effective machine learning model.
Flask
Flask is a lightweight web framework also built with Python. It’s designed to make web development quick and straightforward. We’ll leverage Flask to build a web service that exposes our machine learning model’s predictions as an API. This will enable other applications to interact with our model easily.
Docker
Docker simplifies the process of packaging an application and its dependencies into a standardized container. Containers ensure that our application runs consistently across various environments, making deployment smoother. With Docker, we’ll encapsulate our Flask application, along with its dependencies, into a portable container.
Kubernetes
Kubernetes is a robust open-source platform for orchestrating and managing containerized applications. It handles tasks like scaling, load balancing, and automated deployment. We’ll deploy our Dockerized Flask application on a Kubernetes cluster hosted on the Google Cloud Platform. This allows us to efficiently manage our application’s lifecycle and ensure its availability.
Google Cloud Platform (GCP)
Google Cloud Platform offers a suite of cloud computing services that empower us to deploy, manage, and scale applications with ease. We’ll take advantage of GCP’s resources to set up a Kubernetes cluster using Google Kubernetes Engine (GKE) and to deploy our machine learning application. GCP provides the infrastructure needed to make our application accessible to the world.
Why Google Cloud Platform?
Embracing the Google Cloud Platform for our deployment venture is an intentional choice. Google Cloud Platform provides a powerful, scalable infrastructure that seamlessly integrates with Kubernetes, allowing us to focus more on our applications and less on managing the underlying infrastructure. This synergy between Kubernetes and Google Cloud Platform lays a solid foundation for deploying, managing, and scaling our machine learning applications with confidence and ease.
Now that we’ve briefly covered the tools we’ll be using, let’s dive into the technical part.
Problem Statement
Stroke is a leading cause of death and disability worldwide. According to the World Health Organization, stroke is the second leading cause of death and the third leading cause of disability. Early detection and treatment of stroke can improve outcomes, but many strokes are not detected early enough.
Objective
Develop and deploy a machine learning model for predicting strokes. This model will be trained on relevant medical data to accurately identify individuals who might be at risk of experiencing a stroke. The goal is to create a robust and accurate prediction tool that can assist medical professionals in assessing stroke risk factors and making informed decisions about patient care. The deployed model should be integrated into existing healthcare systems to provide timely insights and support preventive measures, ultimately contributing to improved patient outcomes and healthcare management.
🔧Preparing the Machine Learning Model with scikit-learn
We’ll prepare a minimalistic machine learning model using the scikit-learn library, trained on the `healthcare-dataset-stroke-data` dataset from Kaggle. For simplicity, we’ll use a decision tree classifier. Our minimalistic approach boils down to four steps:
- Read the data and do the necessary preprocessing
- Choose and train the model
- Evaluate the model
- Save the model weights
```python
import pandas as pd
from sklearn.model_selection import train_test_split

# prepare_data, create_pipeline, evaluate_model, and save_pipeline
# are defined in ml_pipeline.py (full code linked below)

def main():
    # Load the dataset
    data = pd.read_csv("data/healthcare-dataset-stroke-data.csv")
    # Prepare data
    X, y = prepare_data(data)
    # Split the data into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    # Create and train the pipeline
    pipeline = create_pipeline()
    pipeline.fit(X_train, y_train)
    # Evaluate the model
    evaluate_model(pipeline, X_test, y_test)
    # Save the trained pipeline
    save_pipeline(pipeline)
```
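The helper functions referenced above (prepare_data, create_pipeline, evaluate_model, save_pipeline) are defined in ml_pipeline.py in the repository. As an illustration only, here is a minimal sketch of what create_pipeline might look like; the column lists and preprocessing choices are assumptions based on the Kaggle dataset, and the repository’s actual implementation may differ.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Assumed column lists, based on the Kaggle stroke dataset's schema
NUMERIC_COLS = ['age', 'avg_glucose_level', 'bmi']
CATEGORICAL_COLS = ['gender', 'ever_married', 'work_type',
                    'Residence_type', 'smoking_status']

def create_pipeline():
    """Illustrative sketch: preprocessing plus a decision tree classifier."""
    preprocessor = ColumnTransformer(transformers=[
        # Fill missing numeric values (bmi has NaNs in this dataset)
        ('num', SimpleImputer(strategy='median'), NUMERIC_COLS),
        # One-hot encode categoricals; ignore unseen categories at inference
        ('cat', OneHotEncoder(handle_unknown='ignore'), CATEGORICAL_COLS),
    ])
    return Pipeline(steps=[
        ('preprocess', preprocessor),
        ('classifier', DecisionTreeClassifier(random_state=42)),
    ])
```

Bundling the preprocessing into the pipeline matters for deployment: the saved .pkl file then carries everything needed to go from raw input to a prediction, so the Flask service never has to re-implement the feature engineering.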
The full pipeline is given on the GitHub page: https://github.com/Shahrullo/stroke-prediction-kubernetes/blob/main/ml_pipeline.py
🔧Building a Flask Web Service with Swagger
With our machine learning model ready, we’ll create a Flask web service that provides API endpoints for making predictions. We’ll use Flasgger, an extension for Flask, to create interactive API documentation.
```python
import os

import joblib
import pandas as pd
from flasgger import Swagger
from flask import Flask, request

app = Flask(__name__)
Swagger(app)

# Load the trained pipeline
pipeline_filename = os.path.join('trained_model', 'stroke_prediction_pipeline.pkl')
pipeline = joblib.load(pipeline_filename)

@app.route('/predict', methods=['POST'])
def predict():
    """
    Predict stroke probability for new data
    ---
    ....
    """
    data = request.get_json()
    # Convert data to a DataFrame
    new_data = pd.DataFrame(data, index=[0])
    # Make predictions using the pipeline
    prediction = pipeline.predict(new_data)
    pred_mapping = {
        0: 'No-Stroke',
        1: 'Stroke'
    }
    predicted_class = pred_mapping[prediction[0]]
    return f"Model prediction is {predicted_class}"
```
Again, full code for the web service can be found on the GitHub page: https://github.com/Shahrullo/stroke-prediction-kubernetes/blob/main/app.py
🚢Containerizing with Docker
To ensure consistent deployment across environments, we’ll containerize our Flask application using Docker. Here’s how we’ll do it with a Dockerfile: specify the base image, install the Python dependencies, copy the Flask app, and expose a port.
```dockerfile
# Base image
FROM continuumio/miniconda3:23.3.1-0

# Set the working directory inside the container
WORKDIR /app

# Copy the requirements file first for better caching
COPY requirements.txt /app/

# Install dependencies using pip and clean up after
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code into the container
COPY . /app/

# Expose the port the app runs on
EXPOSE 5000

# Command to run the application with gunicorn
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]
```
If you haven’t been keeping up with the previous steps, there’s no need to worry: you can simply clone this repository from GitHub. The repository also provides a concise guide on how to train the model and run the web service locally. This is how your project folder should appear by this stage:
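A rough sketch, reconstructed from the files referenced in this article (the repository itself is the source of truth):

```text
stroke-prediction-kubernetes/
├── data/
│   └── healthcare-dataset-stroke-data.csv
├── trained_model/
│   └── stroke_prediction_pipeline.pkl
├── app.py
├── ml_pipeline.py
├── Dockerfile
├── requirements.txt
└── README.md
```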
With our web application now working perfectly, we can move forward with the task of packaging it into a container and then launching the application on Google Kubernetes Engine.
The prerequisite to using GCP is to create a Google account and log in to the Google Cloud Platform by visiting https://console.cloud.google.com/.
Note: Please keep in mind that using tools and services in Google Cloud Platform (GCP) comes with charges, so it’s a good idea to review the pricing details in the Google Cloud console to understand the costs. However, for this particular deployment, Google’s free credits should cover the expenses.
Sign in to your GCP console and go to Manage Resources on the IAM & Admin page, then click Create New Project. Once the new project is created, it will be reflected on the home page under the Project Info tab. To open the Cloud Shell, just click the Activate Cloud Shell button at the top of the console window.
After you’ve opened the terminal, the first step is to clone the Git repository where the source code and data are stored. In this case, we’re cloning the stroke-prediction-kubernetes repository.

```bash
git clone https://github.com/Shahrullo/stroke-prediction-kubernetes.git
```
To make sure everything is set up correctly, go into the newly cloned folder and run a quick `ls` to check that the files and code are present in the Google Cloud Shell.
Following that, we define the project ID variable, create the Docker image for the application, and assign a tag to it for version tracking.
```bash
export PROJECT_ID=stroke-kubernetes
docker build -t gcr.io/${PROJECT_ID}/stroke-prediction:v1 .
```
We can verify the successful build by listing the Docker images with the `docker images` command; the newly built image should appear in the list.
Next, we need to push the Docker image we created to the Google Container Registry. To do this, we first authenticate with Google by running the `gcloud auth configure-docker` command in the Cloud Shell.
Once we have authenticated with Google, we can push the Docker image to the registry using the `docker push` command:

```bash
docker push gcr.io/${PROJECT_ID}/stroke-prediction:v1
```
Pushing a Docker image to the Google Container Registry can take some time, depending on the size of the image. Once the push is complete, you can open the images in the Container Registry and verify that the Docker image you uploaded is present in the `stroke-prediction` folder.
Create Cluster and Deploy the App
Once the Docker image has been pushed to the Container Registry, we can set up the remaining configuration: set the project to the project ID, set the compute zone to `asia-northeast3-b`, and create a small two-node cluster called `stroke-cluster`.
```bash
gcloud config set project $PROJECT_ID
gcloud config set compute/zone asia-northeast3-b
gcloud container clusters create stroke-cluster --num-nodes=2
```
Once the Kubernetes cluster is up and running, we can deploy our stroke prediction app using the Docker image we built earlier. To do this, we use the `kubectl create deployment` command to create a deployment that specifies the image location and the number of replicas to run. We then use the `kubectl expose deployment` command to expose the app behind a load balancer, mapping port 80 to the container’s port 5000. Once the deployments are complete, we can use the `kubectl get service` command to verify that the services are running.

```bash
kubectl create deployment stroke-prediction --image=gcr.io/${PROJECT_ID}/stroke-prediction:v1
kubectl expose deployment stroke-prediction --type=LoadBalancer --port 80 --target-port 5000
```
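As described above, we can check the service and find the external IP assigned by the load balancer with `kubectl get service` (the EXTERNAL-IP column may show `pending` for a minute or two while GCP provisions the load balancer):

```bash
kubectl get service stroke-prediction
```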
Once the app is deployed, Kubernetes will show the external IP address through which it can be accessed. Simply append `/apidocs` to the external IP address in your browser to open the Swagger UI.
To test whether the app is functioning properly, we can pass some dummy values as input to the model and click the Try it out and Execute buttons. The Swagger UI will return the output of the model.
Finally! We have successfully deployed our stroke prediction model using Kubernetes on Google Cloud Platform. This is a major accomplishment, and we should be proud of our work.
There are many other components that come into play when deploying an app, such as model management, load balancing, and security. We will cover them in future blogs. However, the core idea today is to present a framework so that you understand the process and can add more levers to this approach based on the complexity of your app.
Consider removing the project and associated files, including images and data, once the task is finished. This helps prevent incurring additional charges for the ongoing use of GCP resources.
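For example, the cleanup can be done from the Cloud Shell, assuming the names used in this walkthrough:

```bash
# Delete the Kubernetes cluster
gcloud container clusters delete stroke-cluster --zone asia-northeast3-b

# Delete the pushed image from the Container Registry
gcloud container images delete gcr.io/${PROJECT_ID}/stroke-prediction:v1

# Or, if the project was created just for this exercise, delete it entirely
gcloud projects delete $PROJECT_ID
```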
We built a stroke prediction model using scikit-learn and the healthcare-dataset-stroke-data dataset from Kaggle. We then created a user-friendly API with Flask, which we documented with Flasgger. To ensure our application could run seamlessly across diverse environments, we containerized it with Docker. Finally, we deployed our Dockerized app on Kubernetes on the Google Cloud Platform (GCP).
This journey is a gateway to deploying machine learning models in the real world. By following these steps, you can build a model that is accurate, scalable, and easy to use. The skills you learn along the way can be applied to a myriad of applications, from healthcare to finance to e-commerce.
So, what are you waiting for? Start your own machine learning deployment journey today!