Lecture 2: Learning to Answer Yes/No
1. Perceptron Hypothesis Set -- hyperplanes/linear classifiers in $\mathbb{R}^d$: $h(x) = \text{sign}(w^T x)$
2. Perceptron Learning Algorithm (PLA) -- correct mistakes and improve iteratively
Start from some $w_0$ (say, $0$), and 'correct' its mistakes on $\mathcal{D}$: whenever the current $w_t$ misclassifies some example $(x_{n(t)}, y_{n(t)})$, update $w_{t+1} = w_t + y_{n(t)} x_{n(t)}$.
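As a concrete illustration, here is a minimal NumPy sketch of the cyclic PLA described above (the function name `pla`, and the choice to treat $\text{sign}(0)$ as $-1$, are my assumptions; inputs are assumed to already include the bias coordinate $x_0 = 1$):

```python
import numpy as np

def pla(X, y):
    """Perceptron Learning Algorithm sketch: cycle through the examples,
    correct any mistake found, and repeat until a full pass is mistake-free.
    Halts only if the data are linearly separable."""
    n, d = X.shape
    w = np.zeros(d)                      # start from w_0 = 0
    while True:
        mistake = False
        for i in range(n):
            # treat sign(0) as -1, so the all-zero initial w can be corrected
            h = 1.0 if w @ X[i] > 0 else -1.0
            if h != y[i]:
                w += y[i] * X[i]         # correct the mistake: w <- w + y_n x_n
                mistake = True
        if not mistake:
            return w
```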
3. Guarantee of PLA -- no mistake eventually if linearly separable
There are two cases: linearly separable and not linearly separable.
As the number of updates $t$ increases, $w_t$ gradually moves closer to $w_f$, a target vector of 'perfect' weights that separates all the data.
First we need formula (1): on every update, the inner product with $w_f$ grows, because the corrected example has positive margin under $w_f$:

$w_f^T w_{t+1} = w_f^T (w_t + y_{n(t)} x_{n(t)}) \geq w_f^T w_t + \min_n y_n w_f^T x_n > w_f^T w_t \quad (1)$

Next we need formula (2): since updates happen only on mistakes (i.e., $y_{n(t)} w_t^T x_{n(t)} \leq 0$), the squared length of $w_t$ grows slowly:

$\|w_{t+1}\|^2 = \|w_t + y_{n(t)} x_{n(t)}\|^2 \leq \|w_t\|^2 + \max_n \|x_n\|^2 \quad (2)$
We now prove that the number of updates $T$ is bounded.
From (1), starting from $w_0 = 0$, after $T$ updates we get $w_f^T w_T \geq T \cdot \min_n y_n w_f^T x_n$.

From (2), likewise, $\|w_T\|^2 \leq T \cdot \max_n \|x_n\|^2$.

So the following inequality can be obtained for the cosine of the angle between $w_f$ and $w_T$:

$\frac{w_f^T w_T}{\|w_f\|\,\|w_T\|} \geq \frac{T \cdot \min_n y_n w_f^T x_n}{\|w_f\| \cdot \sqrt{T} \cdot \max_n \|x_n\|} = \sqrt{T} \cdot \frac{\rho}{R}$

where $\rho = \min_n \frac{y_n w_f^T x_n}{\|w_f\|}$ and $R^2 = \max_n \|x_n\|^2$.

Since the left-hand side is a cosine and therefore at most $1$, we conclude $T \leq R^2 / \rho^2$, which completes the proof.
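To make the bound tangible, here is a small NumPy sanity check (the toy dataset, seed, and sizes are hypothetical choices of mine): it builds a separable dataset from a known $w_f$, computes $R^2$ and $\rho$, runs PLA while counting updates $T$, and confirms $T \leq R^2/\rho^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data, linearly separable by construction via a known w_f.
w_f = rng.normal(size=3)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])  # bias x_0 = 1
y = np.where(X @ w_f > 0, 1.0, -1.0)

R2 = np.max(np.sum(X**2, axis=1))                  # R^2 = max_n ||x_n||^2
rho = np.min(y * (X @ w_f)) / np.linalg.norm(w_f)  # rho = min_n y_n w_f^T x_n / ||w_f||

# Run PLA and count updates T; the proof says T <= R^2 / rho^2.
w, T = np.zeros(3), 0
changed = True
while changed:
    changed = False
    for xi, yi in zip(X, y):
        if np.sign(w @ xi) != yi:      # sign(0) counts as a mistake here
            w += yi * xi
            T += 1
            changed = True
print(f"updates T = {T}, bound R^2/rho^2 = {R2 / rho**2:.1f}")
```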
4. Non-Separable Data -- hold somewhat 'best' weights in pocket
Guarantee: as long as the data are linearly separable and PLA corrects by mistake,
the inner product of $w_f$ and $w_t$ grows fast while the length of $w_t$ grows slowly,
so the PLA 'lines' are more and more aligned with $w_f$ => PLA halts.
Pros: simple to implement, fast, works in any dimension $d$.
Cons:
'assumes' linearly separable $\mathcal{D}$ to halt
not fully sure how long halting takes ($\rho$ depends on the unknown $w_f$), though practically fast
Modify the PLA algorithm by keeping the best weights seen so far 'in the pocket'.
Pocket is much slower than PLA, because after each update it must evaluate the error of the new weights on the whole dataset to decide whether they go into the pocket.
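For reference, a minimal NumPy sketch of the pocket idea (the function names `pocket` and `error_rate` and the `max_updates` budget are my choices, not the lecture's exact specification). Note the full-data error evaluation after each update, which is exactly what makes pocket slower than plain PLA:

```python
import numpy as np

def error_rate(w, X, y):
    # Fraction of examples misclassified by w (sign(0) counted as -1).
    pred = np.where(X @ w > 0, 1.0, -1.0)
    return np.mean(pred != y)

def pocket(X, y, max_updates=1000, rng=None):
    """Pocket algorithm sketch: run PLA-style random corrections, but keep
    the best-so-far weights ('in the pocket') for non-separable data."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    w = np.zeros(d)
    w_pocket, best_err = w.copy(), error_rate(w, X, y)
    for _ in range(max_updates):
        # Pick a random misclassified example, if any remain.
        pred = np.where(X @ w > 0, 1.0, -1.0)
        mistakes = np.where(pred != y)[0]
        if len(mistakes) == 0:
            return w                     # separable case: w is already perfect
        i = rng.choice(mistakes)
        w = w + y[i] * X[i]              # PLA-style correction
        err = error_rate(w, X, y)        # this extra full-data pass is why
        if err < best_err:               # pocket is slower than plain PLA
            w_pocket, best_err = w.copy(), err
    return w_pocket
```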