The goal of a [soft-margin](https://en.wikipedia.org/wiki/Support_vector_machine#Soft-margin) or C-SVM is to solve the following minimisation problem: $$\min_{\boldsymbol{w}, b} \frac{1}{2} \|\boldsymbol{w}\|^2 + C \sum_i \xi_i$$ subject to $$\forall i : \begin{aligned}y_i (\boldsymbol{w} \cdot \boldsymbol{x}_i + b) - 1 + \xi_i &\geq 0 \\ \xi_i &\geq 0\end{aligned},$$ where $y_i \in \{-1, 1\}$.

The solution can be found by means of [Lagrange multipliers](https://en.wikipedia.org/wiki/Lagrange_multiplier) $\alpha_i$ and $\lambda_i = (C - \alpha_i)$. Setting the derivative of the Lagrangian with respect to $\boldsymbol{w}$ to zero gives $$\boldsymbol{w} = \sum_i \alpha_i y_i \boldsymbol{x}_i.$$ Note that $\boldsymbol{w}$ depends only on those samples for which $\alpha_i \neq 0$. Additionally, the [Karush-Kuhn-Tucker conditions](https://en.wikipedia.org/wiki/Karush%E2%80%93Kuhn%E2%80%93Tucker_conditions) require that the solution satisfies $$\begin{align} \alpha_i \big(y_i (\boldsymbol{w} \cdot \boldsymbol{x}_i + b) - 1 + \xi_i\big) & = 0 \\ (C - \alpha_i) \, \xi_i & = 0. \end{align}$$

Just as with hard-margin SVMs (the linearly separable case), the bias can be computed from samples on the border of the margin, i.e. samples with $\xi_i = 0$ and $\alpha_i > 0$, for which $y_i (\boldsymbol{w} \cdot \boldsymbol{x}_i + b) - 1 = 0$. However, unlike with hard-margin SVMs, there might be samples for which $\alpha_i > 0$ but $\xi_i > 0$; the second KKT condition then implies $\alpha_i = C$. These are samples that lie inside the margin. Although these samples do not define $b$, they do affect $\boldsymbol{w}$, and therefore the solution is supported by these samples as well.

To compute $b$, the first constraint for sample $i$ must be tight with $\xi_i = 0$, so that $y_i (\boldsymbol{w} \cdot \boldsymbol{x}_i + b) - 1 = 0$ can be solved for $b$. The first KKT condition makes this constraint tight whenever $\alpha_i > 0$, and the second KKT condition forces $\xi_i = 0$ whenever $\alpha_i < C$. Therefore, $b$ depends on those samples for which $0 < \alpha_i < C$.

We can conclude that the solution depends on all samples for which $\alpha_i > 0$: even though the samples with $\alpha_i = C$ do not determine $b$, $\boldsymbol{w}$ still depends on them.

TL;DR: The support vectors are all points that lie inside or on the border of the margin, because these are the points for which the Lagrange multipliers are positive, i.e. $\alpha_i > 0$.
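The above can be checked numerically. A minimal sketch, solving the standard dual formulation of the soft-margin problem with `scipy.optimize.minimize` (the toy data set, the value of $C$, and the numerical tolerances are arbitrary choices for illustration):

```python
import numpy as np
from scipy.optimize import minimize

# Toy 2-D data: two overlapping classes, labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.5, 1.0, (20, 2)), rng.normal(+1.5, 1.0, (20, 2))])
y = np.array([-1.0] * 20 + [1.0] * 20)
C = 1.0

# Dual problem: maximise sum_i a_i - 1/2 a^T Q a with Q_ij = y_i y_j x_i . x_j,
# subject to 0 <= a_i <= C and sum_i a_i y_i = 0.
Yx = y[:, None] * X
Q = Yx @ Yx.T

res = minimize(
    lambda a: 0.5 * a @ Q @ a - a.sum(),   # negated dual objective
    np.zeros(len(y)),
    jac=lambda a: Q @ a - 1.0,
    bounds=[(0.0, C)] * len(y),
    constraints={"type": "eq", "fun": lambda a: a @ y},
    method="SLSQP",
)
alpha = res.x

# w depends on every sample with alpha_i > 0 (all support vectors) ...
w = (alpha * y) @ X

# ... while b is recovered from samples strictly between the bounds
# (0 < alpha_i < C), where xi_i = 0 and the margin constraint is tight,
# so that y_i (w . x_i + b) = 1, i.e. b = y_i - w . x_i.
on_margin = (alpha > 1e-4) & (alpha < C - 1e-4)
b = np.mean(y[on_margin] - X[on_margin] @ w)
```

Samples with $\alpha_i = C$ (inside the margin) contribute to `w` but are excluded from the computation of `b`, exactly as in the derivation above.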