Why Minimizing ||w|| Maximizes Margin Width In Linear SVMs | by Karla Hernandez | Random Noise


The Tech Guy July 18, 2023 2 min read

The final objective in the OP was to make sure the margin around the hyperplane was as large as possible. Is this guaranteed in the solution to the MF? Before we can answer this question we need to be more specific: what is the margin around the hyperplane?

There are multiple ways to define the margin, but for the purpose of training an SVM it suffices to define it as the distance from the plane to the nearest datapoint (regardless of the datapoint’s class). By finding a plane that maximizes this distance we are attempting to reduce the generalization error of our classifier.

Does a solution to the MF maximize the distance from the plane to the nearest datapoint? Let’s see. Suppose that w and b form a solution to the MF, that x is a point on the plane, and that y is one of our training datapoints. Then the distance from y to the plane defined by x • w + b = 0 is the absolute value of the scalar projection of y - x onto w, the plane’s normal vector.

When calculating the distance from a point to the plane we care about the magnitude (absolute value) of the scalar projection, not just its sign.

Therefore, y’s distance to the plane is equal to:

|(y - x) • w| / ||w||.

By adding and subtracting b inside the absolute value of the previous expression we see that the distance from y to the plane is simply:

|(y - x) • w + b - b| / ||w|| = |y • w + b - (x • w + b)| / ||w||.

Using the fact that x is a point on the plane so that x • w + b = 0 yields:

distance from y to plane = |y • w + b| / ||w||.
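A quick numerical check can make the derivation concrete. The sketch below is illustrative only: the plane (w, b), the on-plane point x, and the datapoint y are made-up values, and the helper names are hypothetical. It computes the distance once through the projection of y - x onto w and once through the closed form |y • w + b| / ||w||, and the two agree.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm ||u||."""
    return math.sqrt(dot(u, u))

def distance_by_projection(y, x, w):
    """|(y - x) . w| / ||w||, where x is any point on the plane."""
    diff = [yi - xi for yi, xi in zip(y, x)]
    return abs(dot(diff, w)) / norm(w)

def distance_closed_form(y, w, b):
    """|y . w + b| / ||w||; no point on the plane is needed."""
    return abs(dot(y, w) + b) / norm(w)

# Made-up plane and datapoint, chosen only for illustration.
w = [3.0, 4.0]
b = -5.0
x = [3.0, -1.0]  # lies on the plane: 3*3 + 4*(-1) - 5 = 0
y = [4.0, 2.0]

d_proj = distance_by_projection(y, x, w)
d_closed = distance_closed_form(y, w, b)
print(d_proj, d_closed)  # 3.0 3.0
```

Any other point x on the same plane would give the same result, which is why x drops out of the final formula.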

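Thanks to the MF’s constraints, we know that |y • w + b| ≥ 1 for every training datapoint y, which implies that the distance from y to the plane is at least 1 / ||w||, regardless of y (or its class label). Shrinking ||w|| therefore raises this lower bound. The bound is also attained at a solution: if every datapoint satisfied |y • w + b| > 1, we could divide w and b by a common factor greater than 1, keep all the constraints satisfied, and obtain a smaller ||w||, contradicting optimality. So at a solution to the MF the margin is exactly 1 / ||w||, and minimizing ||w|| maximizes the margin around the plane.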
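To see the lower bound at work, here is a small sketch on a made-up, linearly separable dataset. It assumes the MF’s constraints take the usual hard-margin form label * (y • w + b) ≥ 1 with labels in {+1, -1}; the data, the two candidate planes, and the helper names are all hypothetical. Both candidates describe the same plane, but the second is rescaled so that the nearest datapoints meet the constraint with equality.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm ||u||."""
    return math.sqrt(dot(u, u))

def is_feasible(w, b, points, labels):
    """Check the assumed constraints: label * (point . w + b) >= 1 for all points."""
    return all(lab * (dot(p, w) + b) >= 1 for p, lab in zip(points, labels))

def margin(w, b, points):
    """Distance from the plane to the nearest datapoint."""
    return min(abs(dot(p, w) + b) / norm(w) for p in points)

# Made-up training data: two points per class.
points = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]]
labels = [1, 1, -1, -1]

# Two feasible (w, b) pairs that define the same plane (the vertical axis).
candidates = {
    "larger ||w||": ([1.0, 0.0], 0.0),
    "smaller ||w||": ([0.5, 0.0], 0.0),
}

results = {}
for name, (w, b) in candidates.items():
    feasible = is_feasible(w, b, points, labels)
    lower_bound = 1.0 / norm(w)          # guaranteed by the constraints
    actual = margin(w, b, points)        # true distance to the nearest point
    results[name] = (feasible, lower_bound, actual)
    print(name, feasible, lower_bound, actual)
```

With the larger ||w|| the constraints only guarantee a margin of 1, although the true margin is 2. With the smaller ||w|| the guarantee and the true margin coincide at 2, which is the situation at a solution to the MF.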

In conclusion, any solution to the MF will ensure that Objective #3 in the OP is met.
