Introduction to Machine Learning (NYU Paris, Fall 2018)

Machine learning is becoming increasingly important, with applications ranging from autonomous driving and computer-assisted medicine to weather and financial forecasting. In this class we will study the mathematical foundations of current machine learning algorithms.

We will cover the main models from both supervised learning (linear and nonlinear regression and classification, including kernel methods, support vector machines, and neural networks) and unsupervised learning (clustering, Gaussian mixtures, self-organizing maps, principal and independent component analysis, and nonlinear dimensionality reduction).

We will review basic concepts in probability and statistics. We will discuss Bayesian vs frequentist statistics and model/parameter inference, as well as sampling methods.

Finally, we will also discuss the important question of model assessment and selection.

The class will follow this structure:

1. Lectures (introduction of the new material that will be needed during the lab sessions and for the assignments)

2. Programming (lab) sessions (you will have the opportunity to apply what you learned during the lectures and to ask any questions you have, to make sure you understand everything before the assignment)

3. Assignments (you are given a new problem and are evaluated on your ability to use the course material to solve it)

Assignments policy

Unless explicitly stated otherwise, assignments are due at the beginning of each class.

The lecture notes will be available here soon: ML2018 Lecture notes

Questions for each exam can be found by clicking on those exams below

Exams: 60% of the grade (30% midterm, 30% final)

The midterm exam will take place on Thursday, October 25th, at 3pm, during class

Assignments: 30% of the grade (tentative schedule below)

Final project: 10% of the grade (tentative schedule below)

The GitHub page for the class is hosted at https://acosse.github.io/IntroMLFall2018/ and will be used for the labs and the assignments. You can also click on each “Lab” in the schedule below, which will redirect you to the GitHub page.

Tentative schedule:

Legend: lab sessions are in green, homework due dates are in red, and dates related to the project are in orange.

Week # | Date | Topic | Assignments |

Week 1 | 09/04, 09/06 | General Intro + reminders on proba and inference. Part I, Part II | |

Part I : Supervised Learning | |||

Week 2 | 09/11, 09/13 | Linear and logistic regression, regularization and compressed sensing, linear classification. Part I, Part II, Note on the bias-variance trade-off | HW1 |

Week 3 | 09/18, 09/20 | Lab 1: Intro to Python + linear classification and regression | |

Week 4 | 09/25, 09/27 | Nonlinear classification, kernel methods, SVM. Parts I & II | HW2 |

Week 5 | 10/02, 10/04 | Neural networks, optimization, stochastic optimization, deep learning. Part I | HW1 due |

Week 6 | 10/09, 10/11 | Lab 2: Nonlinear regression and classification, neural nets | |

Part II : Unsupervised Learning | |||

Week 7 | 10/16, 10/18 | Clustering, Linear Latent variable models (Part I) | HW2 due, HW3 |

Week 8 | 10/23, 10/25 | Linear latent variable models (Part II), PCA, ICA, GMM, EM algorithm, nonlinear LVM (Part I) | Project choice |

Week 9 | 10/30, 11/01 | Nonlinear LVM (Part II) and manifold learning | |

Week 10 | 11/06, 11/08 | Lab 3: Unsupervised learning | HW3 due |

Week 11 | 11/13, 11/15 | Generalization, complexity and VC Theory | |

Week 12 | 11/20, 11/22 | Probabilistic models, HMM, Bayesian nets + advanced topics, adversarial learning | HW4 |

Week 13 | 11/27, 11/29 | Lab 4: Exam review, wrap up | Project due date |

Week 14 | 12/04, 12/06 | Final Exam | |

- The Elements of Statistical Learning, Hastie, Tibshirani, Friedman

- Pattern Recognition and Machine Learning, Bishop

- Machine Learning: A Probabilistic Perspective, Murphy

- Nonlinear Dimensionality Reduction, Lee, Verleysen

Lab Sessions and programming policy

The lab sessions will require you to do some programming. It is strongly recommended to use Python: it is flexible and will be useful when you move on to PyTorch for more advanced machine learning methods that require GPU processing.

Downloading and getting started with Python.

- Start by downloading anaconda: https://www.anaconda.com/download/#macos
- If you don’t have a text editor yet, you can download Sublime Text (see useful keyboard shortcuts here)
- We will use multiple libraries during the class. Among the most important ones, see the documentation for scikit-learn, NumPy, and pandas
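
Once Anaconda is installed, a quick way to check that the libraries listed above are available is to ask Python whether it can locate them. The following is only a minimal sketch: the helper name `missing_libraries` and the `REQUIRED` list are illustrative choices, not part of the course material. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names of the libraries mentioned above (scikit-learn is imported
# as "sklearn"). This list is an assumption; extend it as the labs require.
REQUIRED = ["numpy", "pandas", "sklearn"]


def missing_libraries(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_libraries(REQUIRED)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries were found.")
```

If a library is reported missing, installing it from the Anaconda prompt (for example `conda install scikit-learn`) should normally be enough.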

Data sets can be downloaded from the following websites: