Schedule


Date

Assignments

Lecture Topics and Slides

Readings, Notes, and Demos

Recitation Topics & Materials

Wed 01/16

out: HW0

out: survey0

Course Overview [slides]

  • Why take IntroML?
  • Course Goals
  • Logistics

Demo: Intro to Jupyter Notebooks and Data Analysis with NumPy

Mon 01/21

due: survey0

NO CLASS (MLK Holiday)

 

Intro to Python

Wed 01/23

due: HW0

out: HW1

Regression Overview [slides]

  • k-NN Regression
  • Linear Regression
  • Decision Tree Regression
  • Evaluation methods

Read: ISL textbook

  • Ch 1
  • Sec 2.1 - 2.2

Demo [notebook]: Using sklearn’s fit and predict functions for KNeighborsRegressor and DecisionTreeRegressor

Mon 01/28

 

Linear Regr. Algorithms [slides]

  • Exact Solutions [notes]
  • Gradient Descent [demo]

Read: ISL textbook

  • Sec 3.1 - 3.3 for simpler non-matrix formulation

Read: ESL textbook

  • Sec 3.1 - 3.2 for matrix math formulation

Read: DL textbook

Review of Matrix Math & Derivatives behind Linear Regression

[Math primer from Harvard cs181]

[External demo of gradient descent]

Wed 01/30

due: HW1

out: HW2

Regression: Regularization [slides]

  • Penalization Methods
  • L2 (“Ridge”)
  • L1 (“Lasso”)
  • Cross Validation

Read: ISL textbook

  • Sec 5.1 (CV)
  • Sec 6.2 (Ridge & Lasso)

Mon 02/04

Statistical Decision Theory [slides]

  • Intro to Probability
  • Decision Theory
  • Curse of Dimensionality
  • Bias-Variance Tradeoff

 Read: ESL textbook

  • Sec 2.4 - 2.6
  • Sec 2.9
  • Sec. 7.3

Pipelines and Cross Validation [Demo Notebook]

Wed 02/06

due: HW2

out: HW3

Classification Overview [slides]

  • Binary vs Multi-class
  • Logistic Regression
  • Decision Tree Classifiers
  • Evaluation metrics
  • Confusion Matrices
  • ROC curves
  • PR curves

Read: ISL textbook

  • Sec. 4.1 - 4.2

Read: EMLM book

Mon 02/11

Logistic Regression [slides]

  • Logistic sigmoid function
  • LR optimization problem

Read: ISL textbook

  • Sec 4.3 - 4.3.4

Logistic Regression numerical implementation issues [notebook] [useful blog post]

Wed 02/13

due: HW3


out: Project1

Logistic Regression  [slides]

  • Step sizes
  • Optimization via first-order stochastic gradient descent
  • Optimization via second-order methods

 Read: ESL textbook

  • Sec 4.4.1 - 4.4.4

Read: DL textbook

Mon 02/18

 

NO CLASS (Presidents' Day Holiday)

 

No Recitation (Holiday)

Wed 02/20

Feature Processing & Selection [slides]

  • Feature transformations
  • Subset selection
  • Brief: Missing data

 Read: ISL textbook

  • Sec. 6.1

Thu 02/21

(Monday schedule on Thursday at Tufts)

Neural Networks 1/2 [slides]

  • Feed-forward NNs
  • (aka Multi-Layer Perceptrons)

 Read: DL textbook Ch 6

  • Sec 6.0 (Intro)
  • Sec 6.1 (xor)
  • Sec 6.2 (Learning)
  • Sec 6.5 (Backprop)

Mon 02/25

 

Neural Networks 2/2 [slides]

  • Multi-class classification
  • Backpropagation [demo]
  • Automatic differentiation
  • Brief: NNs for Images
  • Brief: NNs for Sequences

Skim: M. Nielsen textbook

Skim: DL textbook

Neural Nets demo with automatic differentiation [notebook]

[HIPS/autograd Python package tutorial]

Wed 02/27

out: HW4

Classifiers Using Bayes Theorem [slides]

  • Naive Bayes for discrete features
  • Naive Bayes for continuous features
  • Linear Discriminant Analysis / Quadratic Discriminant Analysis for continuous features

 Read: ISL textbook

  • Sec. 4.4.1
  • Sec. 4.4.2
  • Sec 4.4.3

Skim: Naive Bayes article by Jake VanderPlas

Mon 03/04

Decision Trees [slides]

  • Construction
  • Pruning

Proj 1 work time!

 Read: ISL textbook

  • Sec. 8.1

Office hours for Project 1 and/or review for midterm

Wed 03/06

due: Project1

Review Session for Midterm

 

Mon 03/11

 

Midterm Exam

 

No Recitation

Wed 03/13

due: HW4

Improving Classifier Performance [slides]

  • Data Augmentation
  • Early Stopping
  • Dropout

Hyperparameter Optimization

  • Grid Search
  • Random Search
  • Bayesian Optimization

 Read: DL textbook

  • Sec 11.4

Mon 03/18

NO CLASS (Spring Break)

 

No Recitation

Wed 03/20

NO CLASS (Spring Break)

 

Mon 03/25

out: Project2

Bagging & Boosting  [slides]

  • Random Forests
  • XGBoost

 Read: ISL textbook

  • Sec 8.2

Skim: ESL textbook

  • Ch. 10

No Recitation

Wed 03/27

Kernels [slides]

  • What is a kernel?
  • Kernelized Linear Regr.
  • Kernelized Logistic Regr.
  • [demo] (inspired by D. Sheldon at UMass)

 

Mon 04/01

Support Vector Machines [slides]

  • Margins of Linear Classifiers
  • Hinge Loss vs. Logistic Loss
  • Kernels and SVMs

 Read: ISL textbook

  • Sec. 9.1 - 9.5

Bag-of-words and Naive Bayes

Wed 04/03

out: HW5

Recommendation Systems [slides]

  • Content-based (supervised per-user classifier)
  • Collaborative filtering (unsupervised matrix factorization)
  • Evaluation

Koren et al. 2009. “Matrix Factorization Techniques for Recommender Systems.” [IEEE link]

Mon 04/08

 Dim. Reduction: Overview [slides]

  • Task Overview
  • PCA Intro
  • Examples

 Read: ISL textbook

  • Sec 6.4 (What goes wrong in high dims?)
  • Sec 10.1 (Intro)
  • Sec 10.2 (PCA)

Kernelized SVMs for multi-class classification

Wed 04/10

due: Project2

due: HW5

Principal Components Analysis [slides]

  • PCA practical: Choosing K
  • PCA Math Derivation [notes]
  • t-SNE
  • NN embeddings

Skim: DL textbook

Applications:

Mon 04/15

NO CLASS (Patriots' Day)

 

Practical SVD and PCA

Wed 04/17

out: Project3

Clustering: Overview [slides]

 Read: Jake VanderPlas’ Data Science Handbook:

Mon 04/22

out: HW6

Fairness/Ethics in ML [slides]

Examples of ML in the Wild:

K-means

Wed 04/24

ML Frontiers

  • Finish clustering (Gaussian mixtures)
  • Reinforcement Learning

Mon 04/29

 due: HW6

Final Exam Review

 

Final Exam Review

Fri 05/03

Final Exam

  • 3:30-5:30pm Fri May 3
  • Halligan 111A

 

Wed 05/08

due: Project3