
Objective function in SMO SVM #1

Open · kilianFatras opened this issue Feb 13, 2019 · 4 comments

@kilianFatras

Hello jonchar,

I was reading your notebook on solving an SVM with the SMO algorithm. Unfortunately, there is a part that I might be misunderstanding, because it looks like an error to me. When you define your objective function, you return:

np.sum(alphas) - 0.5 * np.sum(target * target * kernel(X_train, X_train) * alphas * alphas)

However, I think the term (target * target) is target^2 (an element-wise product) rather than the matrix with entries target_i * target_j. To agree with the theory, I would have written target[:, None] * target[None, :] to get such a matrix.
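To make the difference concrete, here is a minimal sketch with made-up data (the linear `kernel`, random labels, and dense `alphas` are stand-ins for the notebook's actual variables, not its code):

```python
import numpy as np

# Made-up data; `target`, `alphas`, and `kernel` stand in for the notebook's variables
rng = np.random.RandomState(0)
X_train = rng.randn(5, 2)
target = rng.choice([-1.0, 1.0], size=5)  # labels y_i
alphas = rng.rand(5)                      # dense dual variables (no zeros)

def kernel(X, Y):
    # Linear kernel stand-in: K[i, j] = <x_i, y_j>
    return X @ Y.T

K = kernel(X_train, X_train)

# Element-wise version from the notebook: broadcasting scales column j of K
# by target_j**2 * alphas_j**2, which is not the dual objective
obj_elementwise = np.sum(alphas) - 0.5 * np.sum(target * target * K * alphas * alphas)

# Outer-product version matching the dual objective:
# sum_i alpha_i - 0.5 * sum_{i,j} y_i y_j alpha_i alpha_j K_ij
obj_outer = np.sum(alphas) - 0.5 * np.sum(
    (target[:, None] * target[None, :]) * K * (alphas[:, None] * alphas[None, :])
)

print(obj_elementwise, obj_outer)  # the two values generally differ for dense alphas
```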

Once again, I might be wrong, and I would really appreciate your feedback on this.

Best,

Kilian

@jonchar
Owner

jonchar commented Feb 14, 2019

Hi Kilian! Thanks for your comment. I see what you're saying: it's unclear whether the objective function should compute an element-wise product or the full matrix. However, if I have the objective function return the following, as you suggest, I don't see a difference in the resulting fits:

np.sum(alphas) - 0.5 * np.sum((target[:, None] * target[None, :]) * kernel(X_train, X_train) * (alphas[:, None] * alphas[None, :]))

Is this what you mean?

@kilianFatras
Author

Hi jonchar! Thank you for your quick answer.

That is what I mean. Since this is a quadratic problem, I think you should build the full matrix before doing the sum. If you don't see any difference, it might be because of the sparsity of the alphas (summing zeros won't change anything ;)).

I also noted two small mismatches between the text and the code. The first is in the computation of the Gaussian kernel, where the code takes the L2 norm instead of the squared L2 norm (as stated in the text). It should be the squared L2 norm in the code (https://en.wikipedia.org/wiki/Radial_basis_function_kernel).

The second is a typo in the text: when you define f(x), you have a '+b', while your code uses '-b'. I think the convention is '-b', so it is just a typo :).
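For reference, a minimal sketch of both points as I understand them (the function names and signatures here are my own, not the notebook's exact code):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # RBF kernel uses the *squared* L2 norm in the exponent:
    # K(x, y) = exp(-||x - y||^2 / (2 * sigma**2))
    sq_dist = np.sum((x - y) ** 2)  # squared L2 norm, not np.linalg.norm(x - y)
    return np.exp(-sq_dist / (2 * sigma ** 2))

def decision_function(alphas, target, X_train, x, b, sigma=1.0):
    # f(x) = sum_i alpha_i * y_i * K(x_i, x) - b, matching the '-b' in the code
    k = np.array([gaussian_kernel(x_i, x, sigma) for x_i in X_train])
    return np.sum(alphas * target * k) - b
```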

What do you think?

Best,

Kilian

@jonchar
Owner

jonchar commented Feb 22, 2019

Hey Kilian, thanks for your points. I didn't see much change in the result from changing the objective function. However, I did see a change in the result when I modified the Gaussian kernel to use the squared L2 norm (i.e., squaring the result of np.linalg.norm(..., ord=2)). Good catch!

I also found a mistake in the function that plots the decision boundary; it looks like the boundary was rotated or reflected through the origin. I'll have a new version up soon.
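For context, a common cause of a rotated or mirrored boundary is pairing the meshgrid axes inconsistently when evaluating the grid. A minimal sketch of the usual pattern (the `predict` function here is a toy stand-in, not the notebook's model):

```python
import numpy as np
import matplotlib.pyplot as plt

def predict(points):
    # Toy stand-in returning one decision value per row of `points`
    return points[:, 0] - points[:, 1]

xx, yy = np.meshgrid(np.linspace(-2, 2, 100), np.linspace(-2, 2, 100))
grid = np.c_[xx.ravel(), yy.ravel()]  # keep (x, y) pairs in matching order
Z = predict(grid).reshape(xx.shape)   # reshape back to the grid's own shape

plt.contour(xx, yy, Z, levels=[0])    # axes and Z now line up; no rotation/flip
plt.show()
```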

Thanks again!

@jonchar
Owner

jonchar commented Mar 23, 2019

I've made some changes to the SVM notebook (3fc041d) that fix the Gaussian kernel and how the decision boundary is displayed. Thanks!
