Statistical and Computational Foundations of Machine Learning
Spring 2022
Sundays 5:00pm–7:30pm in ???

Contact

Course description

In this course you will learn the theoretical foundations of machine learning. We will explain how a machine learning algorithm can make accurate predictions on unseen data. This property is called generalization. We develop a statistical theory of generalization for prediction problems. Along the way we analyze some popular machine learning methods (support vector machines, boosting, perceptron, stochastic gradient descent, decision trees, neural networks, k-nearest neighbor). Besides the generalization properties of machine learning algorithms, we will also look at their computational complexity and at the computational hardness of the underlying optimization problems.
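For a taste of the formalism (a sketch in one common notation, which may differ from the lecture notes): for a distribution D over labeled examples (x, y) and a classifier h, the generalization error and its empirical counterpart on a sample S = ((x_1, y_1), ..., (x_n, y_n)) are

```latex
L_{\mathcal{D}}(h) = \Pr_{(x,y)\sim\mathcal{D}}\bigl[h(x) \neq y\bigr],
\qquad
L_{S}(h) = \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\bigl[h(x_i) \neq y_i\bigr].
```

Empirical Risk Minimization picks a hypothesis minimizing L_S over a hypothesis class; much of the theory in this course bounds the gap between L_D and L_S for such a hypothesis.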

The class will be completely theoretical. In lectures, I will give definitions and prove theorems. In homeworks and exams, I will ask you to solve mathematical problems and prove theorems. You will NOT do any coding, you will NOT do any data analysis, and you will NOT build any machine learning models.

Course syllabus

Course Prerequisites

You must have taken courses in probability theory, linear algebra, and algorithms. You will benefit from prior exposure to machine learning (ML), e.g. an introductory ML course, a neural network course, a statistical inference course, or practical experience. However, knowledge of ML is not a prerequisite; I will give a brief introduction to ML in the first lecture.

Probability theory.
Random variables, discrete and continuous probability distributions, expectation, variance, correlation, covariance, independence, Bernoulli, binomial and Gaussian distributions, law of large numbers.
Linear algebra.
Vector spaces, subspaces, inner products, Cauchy-Schwarz inequality, norms, triangle inequality, linear functions, matrices, vector-matrix multiplication, matrix multiplication, orthogonal matrices, eigenvalues, eigenvectors.
Algorithms.
Arrays, dictionaries, sorting, searching, time and memory complexity, big O and big Omega notation, basics of NP-completeness.

I expect you to know one-dimensional calculus (limits, infimum, supremum, continuity, derivatives and integrals), mathematical notation (sets, functions, logical formulas), and elementary combinatorics (combinations, permutations, binomial coefficients). You must be comfortable doing mathematical proofs.
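As an example of the probabilistic tools we will rely on from the first lecture (stated here as a sketch; the exact form proved in class may differ): Hoeffding's inequality, a quantitative law of large numbers for bounded random variables. For independent random variables X_1, ..., X_n taking values in [0, 1] with common mean μ,

```latex
\Pr\!\left[\,\left|\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right| \geq \varepsilon\,\right]
\leq 2\exp\!\left(-2n\varepsilon^2\right).
```

If this statement and its use feel unfamiliar, review your probability course before the semester starts.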

Course structure and grading

Homeworks (40% of grade)
There will be 4 homework sets. You should spend a significant amount of time working on the homeworks. You can work in small groups and/or consult other materials or people. However, in your solution you must list the members of your group, all materials used, and all people you consulted. Type your solutions in LaTeX, print them out, and hand them in in class.
Midterm exam (30% of grade)
The midterm exam will take place in place of one of the lectures, roughly in the middle of the semester. The questions will have the same format as the homework problems. You can bring and use any amount of printed materials (notes, books, research papers). However, you cannot use any device connected to the internet (phones, tablets, computers).
Final exam (40% of grade)
The final exam will take place during the final exam period, at the same time as the class. The format of the final exam will be exactly the same as that of the midterm exam.

Note that the percentages add up to 110%; that means you can make up lost points. If you want to do well in the course, make sure to submit solutions to all homework problems.

Lectures

There is one 2.5-hour class meeting per week, divided by a 15-minute break. In accordance with NYU guidelines, you must wear a mask covering your nose and mouth whenever you are attending lecture.

Homeworks

Your solutions must be turned in both via Gradescope on NYU Brightspace and as a printed hard copy in class by the due date. Use LaTeX and submit your solution as a PDF file. Use the following template.

                                Handout date   Due date
Homework #0                     January 30     February 6
Homework #1 (not posted yet)    February 6     February 20
Homework #2 (not posted yet)    February 20    March 6
Homework #3 (not posted yet)    March 27       April 10
Homework #4 (not posted yet)    April 10       April 24

Tentative course schedule

I am hoping to cover the topics listed below. Additional topics (decision trees, k-nearest neighbor, neural networks) might be added if time permits.

Week 1 January 30 Generalization error, Test error, Bayes classifier, Hoeffding Bound, Empirical Risk Minimization (ERM), Overfitting, PAC model with finite hypothesis class
Week 2 February 6 PAC model, No-free Lunch Theorem, Vapnik-Chervonenkis dimension, Sauer's lemma, epsilon-nets, epsilon-samples
Week 3 February 13 Upper and lower bounds on the sample complexity for PAC model
Week 4 February 20 Agnostic PAC model, Uniform convergence, McDiarmid's inequality, Rademacher complexity
Week 5 February 27 Upper and lower bounds on the sample complexity for Agnostic PAC model
Week 6 March 6 Online learning model, Halving algorithm, Mistake bound, Perceptron, Winnow, Online-to-batch conversion
Week 7 March 13 Midterm exam
Week 8 March 20 Spring break (no lectures)
Week 9 March 27 Hardness of agnostically learning halfspaces, Logistic regression, Online convex optimization, Online gradient descent, Stochastic convex optimization, Stochastic gradient descent
Week 10 April 3 Support Vector Machines (SVM), Primal and dual formulation, Kernels, Hilbert spaces
Week 11 April 10 Margin theory for Support Vector Machines
Week 12 April 17 Weak-learning, Strong-learning, Boosting
Week 13 April 24 TBD
Week 14 May 1 TBD
Week 15 May 8 TBD
Week 16 May 15 Final exam

Course Policies

Homeworks


Collaboration is allowed on homework, but solutions must be written independently. If you use external sources (books, online documents), list them in your solutions.

Grade changes

Fair grading is important to me. If you feel that you were incorrectly docked points on an assignment or exam, please let me know; it is very possible that I or the TAs misunderstood your work. Do not wait until the end of the semester to bring up grading issues! If you notice something off, please ask ASAP.

Questions about performance

If you are struggling in the class, contact me as soon as possible so that we can discuss what’s not working and how we can work together to ensure you are mastering the material. It is difficult to address questions about performance at the end of the semester, or after final grades are submitted: by Tandon policy, no extra credit or makeup work can be used to improve a student's grade once the semester closes. So if you are worried about your grade, seek help from me early.

Exams

The midterm and final will be open-book written exams. Each will be 2.5 hours long.

You can bring and use any amount of printed or hand-written material you want (notes, cheat sheets, textbooks, lecture notes, research papers, etc.). All materials you use must be in paper form. Use of electronic devices is not allowed; this includes computers, laptops, tablets, e-readers, smartphones, smartwatches, headphones, calculators, etc.

Any communication with anybody inside or outside the classroom during the exam is strictly prohibited. Passing materials or objects from one student to another is not allowed. If you would like to have a cheat sheet or notes from a classmate, make sure to get a copy before the exam.

Bring pens, pencils, and extra paper for writing. You can bring a ruler and compass if you want, but they will not be necessary or useful. I will bring extra writing paper as well. Calculators of any kind are not allowed; you will have to do all your calculations on paper.

You can bring drinks and food to the exam.

Learning materials

Books

Other materials