
What are the support vectors in a soft-margin SVM?


I know what Support Vector Machines (SVMs) are and how they work, but I regularly get confused by what exactly the support vectors are.

In the case of linearly separable data, the support vectors are those data points that lie (exactly) on the borders of the margin. After all, these are the only points that are necessary to compute the margin (through the bias term $b$) and therefore support the solution.

For soft-margin $C$-SVMs, however, I find the concept of support vectors less obvious. Of course, the data points on the border of the margin are still support vectors, but I always get confused about whether the points that are in the margin are support vectors or not. After all, only the points on the borders are used to compute the bias term $b$ (in the same way as for linearly separable data). Therefore, it could be argued that the margin is only supported by these points.

However, there are multiple sources that mention 1000+ support vectors, which would be impossible if only those on the border count. My question is thus: What exactly are the support vectors for a soft-margin SVM?


1 answer


The goal of a soft-margin or C-SVM is to solve the following minimisation problem:

$$\min_{\boldsymbol{w}, b, \boldsymbol{\xi}} \frac{1}{2} \|\boldsymbol{w}\|^2 + C \sum_i \xi_i$$

subject to $$\forall i : \begin{aligned}y_i (\boldsymbol{w} \cdot \boldsymbol{x}_i + b) - 1 + \xi_i &\geq 0 \\ \xi_i &\geq 0\end{aligned},$$

where $y_i \in \{-1, 1\}$.

The solution can be found by means of Lagrange multipliers $\alpha_i \geq 0$ and $\lambda_i = C - \alpha_i \geq 0$, one for each of the two constraints on sample $i$. Setting the derivative of the Lagrangian with respect to $\boldsymbol{w}$ to zero gives

$$\boldsymbol{w} = \sum_i \alpha_i y_i \boldsymbol{x}_i.$$

Note that $\boldsymbol{w}$ depends only on those samples for which $\alpha_i \neq 0$.
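
For reference, here is a sketch of where these multipliers come from, assuming the standard Lagrangian for the problem above (the answer does not spell it out). The multiplier $\alpha_i$ belongs to the margin constraint and $\lambda_i$ to the constraint $\xi_i \geq 0$:

$$\mathcal{L} = \frac{1}{2} \|\boldsymbol{w}\|^2 + C \sum_i \xi_i - \sum_i \alpha_i \big(y_i (\boldsymbol{w} \cdot \boldsymbol{x}_i + b) - 1 + \xi_i\big) - \sum_i \lambda_i \xi_i$$

Setting the partial derivatives to zero then gives

$$\begin{aligned} \frac{\partial \mathcal{L}}{\partial \boldsymbol{w}} &= \boldsymbol{w} - \sum_i \alpha_i y_i \boldsymbol{x}_i = 0 \\ \frac{\partial \mathcal{L}}{\partial b} &= -\sum_i \alpha_i y_i = 0 \\ \frac{\partial \mathcal{L}}{\partial \xi_i} &= C - \alpha_i - \lambda_i = 0. \end{aligned}$$

The last line is the reason $\lambda_i$ can be written as $C - \alpha_i$, and because both multipliers are non-negative it also bounds $0 \leq \alpha_i \leq C$.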

Additionally, the Karush-Kuhn-Tucker conditions require that the solution satisfies

$$\begin{aligned} \alpha_i \big(y_i (\boldsymbol{w} \cdot \boldsymbol{x}_i + b) - 1 + \xi_i\big) & = 0 \\ (C - \alpha_i) \, \xi_i & = 0. \end{aligned}$$

Just as with hard-margin SVMs (the linearly separable case), the bias can be computed from samples on the borders of the margin, i.e. samples with $\xi_i = 0$ and $\alpha_i > 0$, for which $y_i (\boldsymbol{w} \cdot \boldsymbol{x}_i + b) - 1 = 0$. However, unlike hard-margin SVMs, there might be samples for which $\alpha_i > 0$ and $\xi_i > 0$, which implies $\alpha_i = C$ by the second condition. These are samples that lie inside the margin. Although these samples do not define $b$, they do affect $\boldsymbol{w}$ and therefore the solution is also supported by these samples.

In order to compute $b$ from sample $i$, the first constraint must be tight, i.e. $\alpha_i > 0$, and $\xi_i$ must be zero, so that $y_i (\boldsymbol{w} \cdot \boldsymbol{x}_i + b) - 1 = 0$ can be solved for $b$. If additionally $\alpha_i < C$, the multiplier $\lambda_i = C - \alpha_i$ is positive, so the second constraint is tight as well and $\xi_i$ must be zero. Therefore, $b$ depends on those samples for which $0 < \alpha_i < C$.
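
A minimal sketch of this computation in Python, assuming the multipliers $\alpha_i$ have already been found by some solver. The toy 1-D data set, the value $C = 0.1$ and the helper names are illustrative assumptions and not part of the derivation above. The multipliers were worked out by hand so that they satisfy the KKT conditions.

```python
# Toy 1-D data set (illustrative assumption, not output from a real solver).
C = 0.1
X = [(3.0,), (2.0,), (1.0,), (-1.0,), (-2.0,), (-3.0,)]
y = [1, 1, 1, -1, -1, -1]
# Hand-derived multipliers that satisfy the KKT conditions for C = 0.1.
alpha = [0.0, 0.075, 0.1, 0.1, 0.075, 0.0]


def dot(u, v):
    """Plain dot product of two equally long sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def weight_vector(X, y, alpha):
    """w = sum_i alpha_i * y_i * x_i; samples with alpha_i = 0 contribute nothing."""
    dim = len(X[0])
    return [
        sum(a * yi * xi[d] for a, yi, xi in zip(alpha, y, X))
        for d in range(dim)
    ]


def bias(X, y, alpha, C, tol=1e-9):
    """Average b over the samples with 0 < alpha_i < C (on the margin border)."""
    w = weight_vector(X, y, alpha)
    free = [i for i, a in enumerate(alpha) if tol < a < C - tol]
    # For y_i in {-1, 1}, y_i * (w . x_i + b) = 1 is equivalent to b = y_i - w . x_i.
    return sum(y[i] - dot(w, X[i]) for i in free) / len(free)


w = weight_vector(X, y, alpha)
b = bias(X, y, alpha, C)
```

Averaging over all samples with $0 < \alpha_i < C$ is a common choice for numerical stability; in exact arithmetic any single one of them gives the same $b$.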

Therefore, we can conclude that the solution depends on all samples for which $\alpha_i > 0$. After all, $\boldsymbol{w}$ still depends on those samples for which $\alpha_i = C$.
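
The resulting three-way split can be sketched in Python as follows. The function name, the tolerance and the example multipliers are hypothetical and chosen for illustration only; a real solver would supply the values of $\alpha_i$.

```python
def categorise(alpha, C, tol=1e-9):
    """Label each sample by where its multiplier lies in the interval [0, C]."""
    labels = []
    for a in alpha:
        if a <= tol:
            # alpha_i = 0: the sample does not influence w or b.
            labels.append("not a support vector")
        elif a < C - tol:
            # 0 < alpha_i < C: xi_i = 0, the sample lies on the margin border.
            labels.append("support vector, on the border")
        else:
            # alpha_i = C: xi_i may be positive, the sample may lie inside the margin.
            labels.append("support vector, at the bound C")
    return labels


# Illustrative multipliers for six samples with C = 0.1 (assumed values).
C = 0.1
alpha = [0.0, 0.075, 0.1, 0.1, 0.075, 0.0]

labels = categorise(alpha, C)
support_indices = [i for i, a in enumerate(alpha) if a > 1e-9]
```

Here `support_indices` collects every sample with $\alpha_i > 0$, so it contains both the border samples and the samples at the bound.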

TL;DR: The support vectors are all points that lie on the border of the margin, inside it, or on the wrong side of it, because these are the points for which the Lagrange multipliers are positive, i.e. $\alpha_i > 0$.
