![](https://crypto4nerd.com/wp-content/uploads/2023/07/1kucG-EbMDk7S4-AbkHY6sw-1024x574.png)
The final objective in the OP was to make sure the margin around the hyperplane was as large as possible. Is this guaranteed in the solution to the MF? Before we can answer this question we need to be more specific: what is the margin around the hyperplane?
There are multiple ways to define the margin, but for the purpose of training an SVM it suffices to define it as the distance from the plane to the nearest datapoint (regardless of the datapoint’s class). By finding a plane that maximizes this distance we are attempting to reduce the generalization error of our classifier.
Does a solution to the MF maximize the distance from the plane to the nearest datapoint? Let’s see. Suppose w and b are a solution to the MF, that x is a point on the plane, and that y is one of our training datapoints. The distance from y to the plane defined by x • w + b = 0 is the absolute value of the scalar projection of y - x onto w:
|(y - x) • w| / ||w||.
Adding and subtracting b inside the absolute value leaves the expression unchanged:
|(y - x) • w + b - b| / ||w|| = |y • w + b - (x • w + b)| / ||w||.
Using the fact that x is a point on the plane so that x • w + b = 0 yields:
distance from y to plane = |y • w + b| / ||w||.
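As a quick numerical sanity check of the derivation, here is a minimal sketch in Python with NumPy. The function names and the specific values of w, b, x, and y are made up for illustration; the point is only that the projection form and the final form of the distance agree when x lies on the plane.

```python
import numpy as np

def distance_to_plane(y, w, b):
    """Distance from y to the plane {x : x . w + b = 0}, using |y . w + b| / ||w||."""
    return abs(np.dot(y, w) + b) / np.linalg.norm(w)

def distance_via_projection(y, x, w):
    """Same distance, as |(y - x) . w| / ||w|| for any point x on the plane."""
    return abs(np.dot(y - x, w)) / np.linalg.norm(w)

# Illustrative values (assumptions, not taken from the article).
w = np.array([3.0, 4.0])
b = -5.0
x = np.array([3.0, -1.0])   # on the plane: 3*3 + 4*(-1) - 5 = 0
y = np.array([4.0, 7.0])    # an arbitrary datapoint

print(distance_to_plane(y, w, b))        # 7.0
print(distance_via_projection(y, x, w))  # 7.0
```

Both calls return the same value because x • w + b = 0, which is exactly the substitution made in the last step above.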
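Thanks to the MF’s constraints, we know that |y • w + b| ≥ 1 for every training datapoint y, so the distance from y to the plane is at least 1 / ||w||, regardless of y’s class label. Minimizing ||w|| raises this lower bound. Moreover, at a solution to the MF at least one datapoint satisfies |y • w + b| = 1 (otherwise w and b could both be scaled down without violating any constraint, contradicting the minimality of ||w||), so the bound is attained and the margin is exactly 1 / ||w||. Minimizing ||w|| therefore maximizes the margin around the plane.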
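The bound can also be checked on a toy dataset. The sketch below assumes the MF’s constraints take the usual hard-margin form, label × (datapoint • w + b) ≥ 1; the dataset, the helper names, and the two candidate pairs (w, b) are hypothetical. Labels are called t here because y already denotes a datapoint in the text.

```python
import numpy as np

# Toy, linearly separable dataset (illustrative assumption).
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
t = np.array([1.0, 1.0, -1.0, -1.0])  # class labels

def is_feasible(w, b):
    """True if every datapoint satisfies t * (x . w + b) >= 1."""
    return bool(np.all(t * (X @ w + b) >= 1 - 1e-12))

def min_distance(w, b):
    """Distance from the plane to the nearest datapoint."""
    return float(np.min(np.abs(X @ w + b)) / np.linalg.norm(w))

def lower_bound(w):
    """The guaranteed lower bound 1 / ||w|| on that distance."""
    return float(1.0 / np.linalg.norm(w))

# Two feasible pairs describing the same plane, with different norms.
w_small, b_small = np.array([0.5, 0.5]), -1.0
w_large, b_large = np.array([1.0, 1.0]), -2.0

for w, b in [(w_small, b_small), (w_large, b_large)]:
    print(is_feasible(w, b), lower_bound(w), min_distance(w, b))
```

Both pairs satisfy the constraints and describe the same plane, so the nearest datapoint is equally far away in both cases. The pair with the smaller ||w|| gives the larger guaranteed bound, and for it the bound coincides with the actual distance to the nearest datapoint.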
In conclusion, any solution to the MF will ensure that Objective #3 in the OP is met.