Book
- Understanding Machine Learning: From Theory to Algorithms, by Shai Shalev-Shwartz and Shai Ben-David
Complete Handwritten Notes
Intro
- Machine Learning is the study of algorithms that can learn from data, gradually improving their performance.
- Mathematical Statistics is the science of making decisions in the face of uncertainty.
Realizable Case
- Binary classification: (X,Y)=(instance,label)
- Example: X is an image, Y∈{+1,−1} (e.g. "cat" vs. "dog"); X∈S, the set of all possible instances
Statistical Learning
- We will assume that (X,Y) is random, in other words, it has a probability distribution P, so we use language of probability theory.
Supervised Learning
- (X,Y)∈S×{+1,−1}
- P is the distribution of (X,Y) i.e. P(A)=Probability((X,Y)∈A)
- Π is the distribution of X
- Imposing a probabilistic model on (X,Y) takes us into the realm of Statistical Learning Theory
- Goal: predict the label Y based on the observation X
- The prediction rule is a function g:S→{−1,+1}
- The quality of a prediction rule g is measured by the classification/generalization error L(g)=P(Y≠g(X))
- The training data is a sequence (X1,Y1),(X2,Y2),⋯,(Xn,Yn) of i.i.d. pairs with distribution P
- An algorithm takes the training data as input and outputs a rule ĝn = ĝn((X1,Y1),⋯,(Xn,Yn))
- In general, we will consider 2 scenarios:
- "Realizable" learning: there exists g∈Gs.t.Y=g∗(x) with probability 1.
- "Agnostic" learning: there is no g∈Gs.t.Y=g∗(x) with probability 1.
Realizable Scenario
- Assume that the set G of all possible classification rules is finite. By assumption, ∃ g∗∈G with Y=g∗(X) with prob 1
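For a finite class G in the realizable case, a natural learning rule is to output any g∈G that is consistent with (makes zero errors on) the training sample. A rough sketch, with an assumed toy class of threshold rules on [0,1] (the class and all names are illustrative, not from the book):

```python
import numpy as np

def consistent_rule(G, X, Y):
    """Return the first g in G with zero empirical error.

    Under the realizability assumption g* is in G, so such a g always exists."""
    for g in G:
        if all(g(x) == y for x, y in zip(X, Y)):
            return g
    return None  # unreachable in the realizable case

# Assumed finite class: 11 threshold rules g_t(x) = +1 if x >= t else -1.
thresholds = [i / 10 for i in range(11)]
G = [lambda x, t=t: 1 if x >= t else -1 for t in thresholds]

rng = np.random.default_rng(1)
X = rng.uniform(size=50)
g_star = G[5]                        # true rule: threshold at 0.5
Y = [g_star(x) for x in X]           # realizable: labels come from g* in G
g_hat = consistent_rule(G, X, Y)
print(all(g_hat(x) == g_star(x) for x in X))  # True: ĝn is consistent on the sample
```

Note ĝn need not equal g∗ exactly (another threshold may happen to fit the sample), but by construction it agrees with g∗ on every training point; bounding its error off the sample is what the analysis of the realizable case is about.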