Lecture 2: Learning to Answer Yes/No
1. Perceptron Hypothesis Set -- hyperplanes/linear classifiers in $\mathbb{R}^d$
2. Perceptron Learning Algorithm (PLA) -- correct mistakes and improve iteratively
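Start from some $w_0$ (say, $0$), and 'correct' its mistakes on $D$: whenever $\mathrm{sign}(w_t^T x_{n(t)}) \ne y_{n(t)}$, update $w_{t+1} = w_t + y_{n(t)} x_{n(t)}$, and repeat until no mistakes remain.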
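The update loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the names `pla`, `dot`, `sign` and the `max_updates` safety cap are assumptions introduced here, as is the choice to always correct the first mistake found and to treat a score of exactly 0 as a mistake.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def sign(v):
    """Return +1 for positive scores and -1 otherwise."""
    return 1 if v > 0 else -1


def pla(points, labels, max_updates=10_000):
    """Run PLA and return (w, updates); w[0] plays the role of the threshold.

    max_updates is a hypothetical safety cap: if it is reached, the
    returned w may still make mistakes.
    """
    # Prepend the constant coordinate x_0 = 1 to every input.
    xs = [[1.0] + list(p) for p in points]
    w = [0.0] * len(xs[0])  # start from w_0 = 0
    updates = 0
    while updates < max_updates:
        # Find the first example the current w gets wrong.
        mistake = next(
            ((x, y) for x, y in zip(xs, labels) if y * dot(w, x) <= 0), None
        )
        if mistake is None:
            break  # no mistakes left on D, so PLA halts
        x, y = mistake
        # Correct the mistake: w_{t+1} = w_t + y * x
        w = [wi + y * xi for wi, xi in zip(w, x)]
        updates += 1
    return w, updates


# Toy linearly separable data (made up for illustration).
points = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
labels = [1, 1, -1, -1]
w, updates = pla(points, labels)
print(w, updates)
```

Scanning for the first mistake keeps the sketch deterministic; visiting the examples in a random order is an equally valid choice.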
3. Guarantee of PLA -- no mistake eventually if linearly separable
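There are two cases: linearly separable and not linearly separable.
In the separable case some perfect $w_f$ satisfies $y_n = \mathrm{sign}(w_f^T x_n)$ for every $n$, and as the number of updates $t$ increases, $w_t$ gradually moves closer to $w_f$.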
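Next we need two formulas. Each update is made on a mistake $(x_{n(t)}, y_{n(t)})$, for which $y_{n(t)} w_t^T x_{n(t)} \le 0$, and $w_f$ denotes a weight vector that separates the data perfectly:
(1) $w_f^T w_{t+1} = w_f^T (w_t + y_{n(t)} x_{n(t)}) \ge w_f^T w_t + \min_n y_n w_f^T x_n$
(2) $\|w_{t+1}\|^2 = \|w_t\|^2 + 2 y_{n(t)} w_t^T x_{n(t)} + \|x_{n(t)}\|^2 \le \|w_t\|^2 + \max_n \|x_n\|^2$
Prove that the number of iterations is bounded.
From (1), after $T$ updates: $w_f^T w_T \ge w_f^T w_0 + T \min_n y_n w_f^T x_n$
From (2), after $T$ updates: $\|w_T\|^2 \le \|w_0\|^2 + T \max_n \|x_n\|^2$
So the following inequality can be obtained when $w_0 = 0$:
$\frac{w_f^T w_T}{\|w_f\| \, \|w_T\|} \ge \frac{T \min_n y_n w_f^T x_n}{\|w_f\| \sqrt{T} \max_n \|x_n\|} = \sqrt{T} \cdot \frac{\rho}{R}$
where:
$R^2 = \max_n \|x_n\|^2$, $\rho = \min_n y_n \frac{w_f^T}{\|w_f\|} x_n$
This completes the proof: the left-hand side is the cosine of the angle between $w_f$ and $w_T$, so it is at most 1, which gives $T \le R^2 / \rho^2$.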
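As a sanity check on the bound $T \le R^2 / \rho^2$, with $R^2 = \max_n \|x_n\|^2$ and $\rho = \min_n y_n w_f^T x_n / \|w_f\|$, the sketch below runs PLA from $w_0 = 0$ on a small made-up dataset and compares the number of updates with the bound. The dataset, the separating vector `w_f`, and the helper names are all assumptions chosen for illustration. In practice $w_f$ is unknown, which is why the bound cannot be evaluated in advance.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def pla_updates(xs, labels, max_updates=10_000):
    """Run PLA from w_0 = 0 and return (w, number of updates)."""
    w = [0.0] * len(xs[0])
    updates = 0
    while updates < max_updates:
        mistake = next(
            ((x, y) for x, y in zip(xs, labels) if y * dot(w, x) <= 0), None
        )
        if mistake is None:
            break
        x, y = mistake
        w = [wi + y * xi for wi, xi in zip(w, x)]
        updates += 1
    return w, updates


# Inputs already include the constant coordinate x_0 = 1.
xs = [[1.0, 2.0, 2.0], [1.0, 0.5, 0.0], [1.0, 0.0, 1.0], [1.0, 3.0, 0.0]]
labels = [1, -1, -1, 1]

# A hypothetical perfect separator for this toy data.
w_f = [-2.0, 1.0, 1.0]
margins = [y * dot(w_f, x) for x, y in zip(xs, labels)]
assert min(margins) > 0  # w_f really separates the data

R2 = max(dot(x, x) for x in xs)              # R^2 = max_n ||x_n||^2
rho2 = min(margins) ** 2 / dot(w_f, w_f)     # rho^2 = (min margin)^2 / ||w_f||^2
bound = R2 / rho2

w, updates = pla_updates(xs, labels)
print(f"updates = {updates}, bound R^2/rho^2 = {bound:.1f}")
```

The bound is usually loose: it depends only on the margin of the chosen $w_f$ and the largest input norm, not on the order in which mistakes are corrected.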
4. Non-Separable Data -- hold somewhat 'best' weights in pocket
Guarantee: as long as the data are linearly separable and PLA corrects by mistake,
the inner product of $w_f$ and $w_t$ grows fast, while the length of $w_t$ grows slowly
PLA 'lines' are more and more aligned with $w_f$ => halts
Pros: simple to implement, fast, works in any dimension $d$
Cons: 'assumes' linearly separable $D$ to halt
not fully sure how long halting takes (the bound depends on the unknown $w_f$)
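Pocket algorithm: modify PLA by keeping the best weights found so far in the pocket.
It is much slower than PLA, because every new candidate $w_{t+1}$ must be checked against all of $D$ to compare its mistake count with that of the pocket weights.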
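The pocket idea can be sketched as follows. This is an illustrative version under stated assumptions, not a definitive implementation: the names `pocket` and `count_mistakes`, the fixed `max_updates` budget, and the seeded random choice of which mistake to correct are all introduced here. The XOR-style data is made up to be non-separable.

```python
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def count_mistakes(w, xs, labels):
    """Number of examples that w misclassifies (a score of 0 counts as wrong)."""
    return sum(1 for x, y in zip(xs, labels) if y * dot(w, x) <= 0)


def pocket(xs, labels, max_updates=100, seed=0):
    """Run PLA updates, keeping the best weights seen so far in the pocket."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    best_w, best_err = list(w), count_mistakes(w, xs, labels)
    for _ in range(max_updates):
        wrong = [i for i, (x, y) in enumerate(zip(xs, labels)) if y * dot(w, x) <= 0]
        if not wrong:
            break  # current w is perfect; it is already in the pocket
        i = rng.choice(wrong)  # correct a randomly chosen mistake
        w = [wi + labels[i] * xi for wi, xi in zip(w, xs[i])]
        # The expensive step: a full pass over the data for every candidate.
        err = count_mistakes(w, xs, labels)
        if err < best_err:
            best_w, best_err = list(w), err
    return best_w, best_err


# XOR-style data (with x_0 = 1 prepended), which no hyperplane separates.
xs = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
labels = [-1, -1, 1, 1]
best_w, best_err = pocket(xs, labels)
print(best_w, best_err)
```

The call to `count_mistakes` after every update is what makes this slower than plain PLA, which only needs to find one mistake per iteration.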