![](https://crypto4nerd.com/wp-content/uploads/2023/06/1l6W6IPfc5dZ2MN6s5Fcwew-1024x683.jpeg)
In this modern day and age, most of us are familiar with the term machine learning. Some say that this distinguished piece of technology is our closest glimpse of the future. But you may ask: what is machine learning, and how does it work? Simply put, as the name "Machine Learning" suggests, we make machines learn. That may not sound compelling, but it is literally what it is.
What does a machine learn anyways?
If you haven’t noticed, a computer, or more generally a machine, can only recognize numbers, specifically signals consisting of 0s and 1s. We call these bits. Bits grouped together form something called bytes; a byte is typically 8 bits strung together, and strings of bytes form the countless combinations that eventually become what we all know as data.
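As a tiny illustration of data becoming numbers, here is a sketch in Swift of how a piece of everyday data, a short text string, breaks down into the bytes and bits a machine actually sees:

```swift
// Every character of the string is stored as one or more bytes (numbers 0–255).
let bytes = Array("ML".utf8)
print(bytes)                  // [77, 76] — the ASCII codes for "M" and "L"

// Each byte is itself just 8 bits; 77 in binary is 1001101.
print(String(77, radix: 2))   // "1001101"
```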
In a machine learning context, the machine learns our data’s patterns. Whatever form our data takes, as long as it can be converted into numbers and signals, there will always be a pattern in it. In my experience, I processed images of chili plant leaves: along the way, the machine learns to identify a plant, then to locate its leaves, and finally to classify whether my chili plant is healthy or potentially ill.
How does it learn?
At a surface level, there are three ways a machine learns, namely:
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
Supervised learning is where we guide the machine to learn what is what, who is who, and so on. We call these representations labels. In my case, I tell the machine that imageA is a healthy chili and imageB is a potentially ill chili, so that after its learning process it can classify a chili plant’s health condition when we give it an image. This can be done with absolute ease in Apple’s CreateML software; we’ll come back to this part later.
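In CreateML, labeling is done through folder structure: each class gets its own directory of images, and the directory names become the labels. As a rough sketch using CreateML’s programmatic API (the folder names and output path here are hypothetical, and this only runs on macOS, where the CreateML framework lives):

```swift
import CreateML
import Foundation

// Hypothetical dataset layout:
//   ChiliData/
//     healthy/          <- images of healthy chili plants  (label = "healthy")
//     potentially-ill/  <- images of ill-looking plants    (label = "potentially-ill")
let dataURL = URL(fileURLWithPath: "ChiliData")

// CreateML reads the subdirectory names as the labels — this is the
// "telling the machine which is which" part of supervised learning.
let classifier = try MLImageClassifier(trainingData: .labeledDirectories(at: dataURL))

// Save the trained model so it can be dragged into an Xcode project.
try classifier.write(to: URL(fileURLWithPath: "ChiliClassifier.mlmodel"))
```

The CreateML app shown later in this article does the same thing behind a graphical interface.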
Unsupervised learning is, in concept, the opposite of supervised learning. We don’t tell the machine which is which; instead, its learning algorithm figures everything out on its own. Because of this, the result will not be a particular label or object representation as in supervised learning, but something more qualitative, such as item clusters that identify patterns in, say, a scatter plot of monthly sales.
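To make the clustering idea concrete, here is a minimal sketch of 1-D k-means with two clusters, run on hypothetical monthly-sales figures (the numbers are made up for illustration):

```swift
// Minimal 1-D k-means sketch, k = 2, on hypothetical monthly sales.
let sales: [Double] = [10, 12, 11, 50, 52, 49]
var centroids: [Double] = [sales.first!, sales.last!]   // naive initialization

for _ in 0..<10 {
    // Assignment step: each point joins its nearest centroid.
    var groups: [[Double]] = [[], []]
    for x in sales {
        let i = abs(x - centroids[0]) <= abs(x - centroids[1]) ? 0 : 1
        groups[i].append(x)
    }
    // Update step: each centroid moves to the mean of its group.
    for i in 0..<2 where !groups[i].isEmpty {
        centroids[i] = groups[i].reduce(0, +) / Double(groups[i].count)
    }
}
print(centroids)   // roughly [11.0, 50.33] — two sales "regimes" found without labels
```

Nobody told the algorithm which points belong together; the two groups emerge from the data itself.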
Reinforcement learning is a whole other world: it lets the machine, or what we call a software agent, learn and identify patterns on its own without initial training data. Typically it uses a system of error and reward scores. Simply put, it punishes the agent with error scores when it makes mistakes and rewards it when it makes the right decision, based on the context of the problem. This type of learning is the backbone of now-everyday technologies such as self-driving cars.
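The reward-and-error idea can be sketched with tabular Q-learning, one classic reinforcement learning algorithm, on a toy world I made up for illustration: a line of five cells where the agent starts at cell 0, reaching cell 4 earns a reward, and every other step costs a small penalty.

```swift
// Minimal tabular Q-learning sketch on a 5-cell line world.
let alpha = 0.5, gamma = 0.9
var q = Array(repeating: [0.0, 0.0], count: 5)   // actions: 0 = left, 1 = right

for _ in 0..<1000 {                              // training episodes
    var s = 0
    while s != 4 {
        let a = Int.random(in: 0...1)            // explore randomly
        let next = max(0, min(4, s + (a == 1 ? 1 : -1)))
        let reward = next == 4 ? 1.0 : -0.04     // reward at the goal, small penalty otherwise
        // Q-learning update: nudge the score toward reward + discounted best future score.
        q[s][a] += alpha * (reward + gamma * q[next].max()! - q[s][a])
        s = next
    }
}
// After training, the greedy action in every non-terminal cell should be "right".
```

The agent is never shown labeled examples; the reward and penalty scores alone shape its behavior, which is the essence described above.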
What Problem Can Machine Learning Solve?
Based on these learning capabilities, there are a few domains of problems that machine learning can solve. Supervised learning handles what are typically called classification and regression problems. For example, my own app, chiliwise, classifies whether a chili is healthy or potentially ill; this is a classification problem. Regression problems have one thing in common: they use quantitative data to predict an outcome that is also quantitative, for example predicting next year’s birth rate from previous years’ birth-rate data. Unsupervised learning, as I’ve mentioned before, solves clustering problems: clustering monthly sales, or other scatter-plot data, can be used to identify patterns and trends in that data. As for reinforcement learning, it solves environment-based problems, for example self-training a chess AI, or familiar games such as Flappy Bird, where in the end the agent automatically flaps the bird through the obstacles.
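The birth-rate example above is the simplest possible regression: fit a line through past quantities and extend it. A minimal least-squares sketch, with made-up numbers:

```swift
// Least-squares linear fit: y = slope * x + intercept.
func linearFit(xs: [Double], ys: [Double]) -> (slope: Double, intercept: Double) {
    let n = Double(xs.count)
    let meanX = xs.reduce(0, +) / n
    let meanY = ys.reduce(0, +) / n
    var num = 0.0, den = 0.0
    for (x, y) in zip(xs, ys) {
        num += (x - meanX) * (y - meanY)
        den += (x - meanX) * (x - meanX)
    }
    let slope = num / den
    return (slope, meanY - slope * meanX)
}

// Hypothetical birth counts for years 1–4; predict year 5.
let fit = linearFit(xs: [1, 2, 3, 4], ys: [100, 110, 120, 130])
let prediction = fit.slope * 5 + fit.intercept
print(prediction)   // 140.0 — quantity in, quantity out, the mark of regression
```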
Types of Machine Learning Applications
Machine learning has come to a state where almost all of our data can be processed. Some honorable mentions among its applications would be:
- Image Classification
- Object Detection
- Text Classification
In Apple’s CreateML however, you will find a whole array of machine learning applications.
For my application, chiliwise, as I’ve mentioned before, I’ll be using the image classification option. Once an option has been selected, our adventure begins!
Some of us might be unfamiliar with the interface above, but don’t worry: CreateML has made it understandable with relative ease. Here are some short explanations of the interface:
- Training data, as the name suggests, is the data we provide to be trained on and analyzed by the machine learning model we’re about to create. Adding the prepared data is as simple as the good ol’ drag and drop, or alternatively, clicking the plus icon and adding it there.
- Validation data is also used in the training phase. The sole purpose of this dataset is to check whether the machine has learned properly, which can be inferred from the training loss and accuracy.
- Testing data, as the name suggests, exists solely to test the model’s performance at the end, after the training process is done. This test serves as our model’s success parameter: we can evaluate whether our model is trained well using measurements such as F1-score, precision, recall, and ultimately accuracy.
- Iterations: how many times we want the model to iterate during its training process.
- Augmentations: this option is available exclusively for image classification. With the options shown above, we can manipulate our images (adding noise, blurring, cropping, etc.), thereby enlarging our dataset, which helps the model learn a variety of cases in our data.
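The evaluation measurements mentioned above (precision, recall, F1-score) are simple ratios over the model’s hits and misses. A sketch with hypothetical counts for the “healthy” class:

```swift
// Hypothetical confusion counts for the "healthy" label on a test set.
let truePositives = 8.0    // healthy chilis correctly labeled healthy
let falsePositives = 2.0   // ill chilis wrongly labeled healthy
let falseNegatives = 2.0   // healthy chilis wrongly labeled ill

// Precision: of everything we called "healthy", how much really was?
let precision = truePositives / (truePositives + falsePositives)
// Recall: of all the truly healthy chilis, how many did we find?
let recall = truePositives / (truePositives + falseNegatives)
// F1: the harmonic mean, balancing the two.
let f1 = 2 * precision * recall / (precision + recall)

print(precision, recall, f1)   // 0.8 0.8 0.8
```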
Once the training process is done, we will encounter an interface like the one shown below:
In the top right corner, we can see our model’s performance metrics. Ideally we want accuracy to be as high as possible, meaning our model is highly confident in the task we ask of it. One thing worth mentioning: try to keep the three accuracy scores similar to one another. This helps ensure our model doesn’t overfit, a case where the model ends up memorizing our data rather than learning its patterns and intricacies.
Using the trained model
After the whole training process is done and we’ve got the result we wanted, the time has come to use it in our project. We will be using Vision, a framework that handles image classification requests programmatically. Below is the code snippet I used to load the model and process an image classification request:
```swift
import CoreML
import Vision

// Loads the trained Core ML model and wraps it for use with Vision.
static func createImageClassifier() -> VNCoreMLModel {
    let defaultConfig = MLModelConfiguration()
    let imageClassifierWrapper = try? LeoChiliV4(configuration: defaultConfig)
    guard let imageClassifier = imageClassifierWrapper else {
        fatalError("Failed to create an ML Model instance")
    }
    let imageClassifierModel = imageClassifier.model
    guard let imageClassifierVisionModel = try? VNCoreMLModel(for: imageClassifierModel) else {
        fatalError("Failed to create VNCoreMLModel instance")
    }
    return imageClassifierVisionModel
}

// Runs a classification request on the given image and stores
// the top label and its confidence.
func processImage(for image: CIImage) {
    let imageClassificationRequest = VNCoreMLRequest(model: shared) // `shared` holds the VNCoreMLModel
    let handler = VNImageRequestHandler(ciImage: image, orientation: .up)
    let requests: [VNRequest] = [imageClassificationRequest]
    try? handler.perform(requests)
    guard let observations = imageClassificationRequest.results as? [VNClassificationObservation] else {
        print("VNRequest produced the wrong result type: \(type(of: imageClassificationRequest.results))")
        return
    }
    if let confidence = observations.first?.confidence {
        self.confidenceLevels = confidence
    }
    if let firstResult = observations.first {
        self.result = firstResult.identifier
    }
}
```
The first function is used to load our trained model. Since Vision only accepts models in Core ML form, we need to convert our trained model into a format Vision will understand; in the code above, we convert it into a VNCoreMLModel instance.
The second function, named processImage, is where we process an input image and output whether it is considered a healthy chili or a potentially ill one. We create an image classification request for Vision (in the code above, an array of VNRequest instances) and let the request handler, a VNImageRequestHandler instance, handle those requests. Finally, we can access the requests’ return values, which in the code above is the observations variable, an array of VNClassificationObservation instances. From there, in my case, I wanted the result label and also its confidence level; in other cases, what you need may differ.
This is the final look of the working chiliwise application: