Lecture 2: Learning to Answer Yes/No
1. Perceptron Hypothesis Set -- hyperplanes/linear classifiers in $\mathbb{R}^d$: $h(x) = \text{sign}(w^T x)$
2. Perceptron Learning Algorithm (PLA) -- correct mistakes and improve iteratively
Start from some $w_0$ (say, $0$), and 'correct' its mistakes on $\mathcal{D}$: whenever the current $w_t$ misclassifies some example $(x_{n(t)}, y_{n(t)})$, update $w_{t+1} = w_t + y_{n(t)} x_{n(t)}$.
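As a concrete illustration, here is a minimal NumPy sketch of the cyclic PLA described above (the function name `pla`, and the choice to treat $\text{sign}(0)$ as $-1$, are my assumptions; inputs are assumed to already include the bias coordinate $x_0 = 1$):

```python
import numpy as np

def pla(X, y):
    """Perceptron Learning Algorithm sketch: cycle through the examples,
    correct any mistake found, and repeat until a full pass is mistake-free.
    Halts only if the data are linearly separable."""
    n, d = X.shape
    w = np.zeros(d)                      # start from w_0 = 0
    while True:
        mistake = False
        for i in range(n):
            # treat sign(0) as -1, so the all-zero initial w can be corrected
            h = 1.0 if w @ X[i] > 0 else -1.0
            if h != y[i]:
                w += y[i] * X[i]         # correct the mistake: w <- w + y_n x_n
                mistake = True
        if not mistake:
            return w
```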
3. Guarantee of PLA -- no mistake eventually if linearly separable
There are two cases: linearly separable and not linearly separable.
As the number of updates $t$ increases, $w_t$ gradually moves closer to $w_f$, a target vector of 'perfect' weights that separates all the data.
First we need formula (1): on every update, the inner product with $w_f$ grows, because the corrected example has positive margin under $w_f$:

$w_f^T w_{t+1} = w_f^T (w_t + y_{n(t)} x_{n(t)}) \geq w_f^T w_t + \min_n y_n w_f^T x_n > w_f^T w_t \quad (1)$

Next we need formula (2): since updates happen only on mistakes (i.e., $y_{n(t)} w_t^T x_{n(t)} \leq 0$), the squared length of $w_t$ grows slowly:

$\|w_{t+1}\|^2 = \|w_t + y_{n(t)} x_{n(t)}\|^2 \leq \|w_t\|^2 + \max_n \|x_n\|^2 \quad (2)$
We now prove that the number of updates $T$ is bounded.
From (1), starting from $w_0 = 0$, after $T$ updates we get $w_f^T w_T \geq T \cdot \min_n y_n w_f^T x_n$.

From (2), likewise, $\|w_T\|^2 \leq T \cdot \max_n \|x_n\|^2$.

So the following inequality can be obtained for the cosine of the angle between $w_f$ and $w_T$:

$\frac{w_f^T w_T}{\|w_f\|\,\|w_T\|} \geq \frac{T \cdot \min_n y_n w_f^T x_n}{\|w_f\| \cdot \sqrt{T} \cdot \max_n \|x_n\|} = \sqrt{T} \cdot \frac{\rho}{R}$

where $\rho = \min_n \frac{y_n w_f^T x_n}{\|w_f\|}$ and $R^2 = \max_n \|x_n\|^2$.

Since the left-hand side is a cosine and therefore at most $1$, we conclude $T \leq R^2 / \rho^2$, which completes the proof.
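To make the bound tangible, here is a small NumPy sanity check (the toy dataset, seed, and sizes are hypothetical choices of mine): it builds a separable dataset from a known $w_f$, computes $R^2$ and $\rho$, runs PLA while counting updates $T$, and confirms $T \leq R^2/\rho^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data, linearly separable by construction via a known w_f.
w_f = rng.normal(size=3)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])  # bias x_0 = 1
y = np.where(X @ w_f > 0, 1.0, -1.0)

R2 = np.max(np.sum(X**2, axis=1))                  # R^2 = max_n ||x_n||^2
rho = np.min(y * (X @ w_f)) / np.linalg.norm(w_f)  # rho = min_n y_n w_f^T x_n / ||w_f||

# Run PLA and count updates T; the proof says T <= R^2 / rho^2.
w, T = np.zeros(3), 0
changed = True
while changed:
    changed = False
    for xi, yi in zip(X, y):
        if np.sign(w @ xi) != yi:      # sign(0) counts as a mistake here
            w += yi * xi
            T += 1
            changed = True
print(f"updates T = {T}, bound R^2/rho^2 = {R2 / rho**2:.1f}")
```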
4. Non-Separable Data -- hold somewhat 'best' weights in pocket
Guarantee: as long as the data are linearly separable and PLA corrects by mistake,
the inner product of $w_f$ and $w_t$ grows fast while the length of $w_t$ grows slowly,
so the PLA 'lines' are more and more aligned with $w_f$ => PLA halts.
Pros: simple to implement, fast, works in any dimension $d$.
Cons:
'assumes' linearly separable $\mathcal{D}$ to halt
not fully sure how long halting takes ($\rho$ depends on the unknown $w_f$), though practically fast
Modify the PLA algorithm by keeping the best weights seen so far 'in the pocket'.
Pocket is much slower than PLA, because after each update it must evaluate the error of the new weights on the whole dataset to decide whether they go into the pocket.
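For reference, a minimal NumPy sketch of the pocket idea (the function names `pocket` and `error_rate` and the `max_updates` budget are my choices, not the lecture's exact specification). Note the full-data error evaluation after each update, which is exactly what makes pocket slower than plain PLA:

```python
import numpy as np

def error_rate(w, X, y):
    # Fraction of examples misclassified by w (sign(0) counted as -1).
    pred = np.where(X @ w > 0, 1.0, -1.0)
    return np.mean(pred != y)

def pocket(X, y, max_updates=1000, rng=None):
    """Pocket algorithm sketch: run PLA-style random corrections, but keep
    the best-so-far weights ('in the pocket') for non-separable data."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    w = np.zeros(d)
    w_pocket, best_err = w.copy(), error_rate(w, X, y)
    for _ in range(max_updates):
        # Pick a random misclassified example, if any remain.
        pred = np.where(X @ w > 0, 1.0, -1.0)
        mistakes = np.where(pred != y)[0]
        if len(mistakes) == 0:
            return w                     # separable case: w is already perfect
        i = rng.choice(mistakes)
        w = w + y[i] * X[i]              # PLA-style correction
        err = error_rate(w, X, y)        # this extra full-data pass is why
        if err < best_err:               # pocket is slower than plain PLA
            w_pocket, best_err = w.copy(), err
    return w_pocket
```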