CS 137 Deep Neural Networks

Instructor

Class times and location

TuTh 12:00-1:15pm, JCC160

Office hours:

TBD

Description & Objective:

Deep neural networks are tremendously successful in numerous applications, especially when the data is complex and large in scale. In this course, we will talk about typical deep network architectures and techniques for training these models. We will focus on the following topics:

Feedforward neural network
Convolutional neural network: convolutional, non-linear, pooling, and batch normalization layers; CV applications
Recurrent neural network: vanilla RNN, LSTM, GRU; NLP applications
Attention networks: attention mechanism, self-attention
Optimization: stochastic optimization, practical issues like vanishing and exploding gradients
Regularization: regularization with norms, dropout, data augmentation
Computation: back-propagation, packages like TensorFlow and Keras

After this course, a successful student should be able to do the following for a learning problem: 1) decide whether deep learning is appropriate, 2) identify the appropriate type of neural network, 3) implement neural networks with existing packages, and 4) train neural networks correctly.

Materials:

Book: Deep Learning. Goodfellow et al. MIT Press. 2016.

Similar courses:

Convolutional Neural Networks for Visual Recognition at Stanford.
Natural Language Processing with Deep Learning at Stanford.

Reading: The first part of this course will be based on "Deep Learning" by Goodfellow et al. The book is easy to read. We do not have time to cover all the material in this book, but the following sections will be very useful in this course.

Chapter 4.1 through 4.3: a basis for numerical optimization.
Chapter 5: a good review of machine learning basics.
Chapter 6: feedforward neural networks and backpropagation.
Chapter 7.1, 7.4, 7.8, and 7.12: ingredients of regularizing deep models.
Chapter 8.1 through 8.5: optimization methods for training deep models.
Chapter 9.1 through 9.3: convolutional neural networks.
Chapter 10.1, 10.2.1, 10.2.2, 10.5, 10.7, and 10.10: recurrent neural networks.

Course Work and Grading Policy

In-class quizzes (8%): AT LEAST FIVE in-class quizzes will be given on random dates. These quizzes are used to check your reading. They are also used to encourage attendance and collect feedback. Your grade for this part will be calculated based on your best FIVE scores.
Participation (4%):
- Participating in class discussion (2%): the instructor will take note of students' questions and monitor class discussions.
- Participating in Piazza discussion (2%): the top 10 Piazza contributors get 1%. Other students get credit in proportion to the 10th person's contribution.
Assignments (50%):
- Assignment 1 (6%): setting up the programming environment; implementing a very simple regression model
- Assignment 2 (12%): implementing and training a feedforward neural network
- Assignment 3 (16%): implementing a convolutional neural network
- Assignment 4 (16%): implementing a recurrent neural network
Final project (38%):
- Project proposal (8%): Students are encouraged to form teams to work on a problem as the final project. A team can have at most three students. A team needs to first write a proposal, which includes a problem description, the dataset, the plan, and a review of current methods.
- Project implementation and report (20%): The team needs to execute the plan for the proposed problem and write a report. The report should take the format of a research paper.
- Project presentation (10%): The team needs to present the project to the entire class.
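
As a rough illustration of how the percentages above combine, the sketch below computes a course grade from the four components. The weights come from the breakdown above. Everything else is an assumption made for illustration: the function names are hypothetical, each score is taken to be a fraction between 0 and 1, the best five quiz scores are averaged with equal weight, and a student with fewer than five quizzes is treated as scoring zero on the missing ones. The instructor's actual calculation may differ.

```python
# Component weights in percent, taken from the grading breakdown above.
WEIGHTS = {
    "quizzes": 8,
    "participation": 4,
    "assignments": 50,
    "project": 38,
}


def quiz_component(quiz_scores, keep=5):
    """Average of the best `keep` quiz scores (each a fraction in [0, 1]).

    Assumption: quizzes are equally weighted, and missing quizzes count as 0.
    """
    best = sorted(quiz_scores, reverse=True)[:keep]
    return sum(best) / keep


def course_grade(quiz_scores, participation, assignments, project):
    """Weighted course grade in percent; inputs other than quizzes are fractions."""
    fractions = {
        "quizzes": quiz_component(quiz_scores),
        "participation": participation,
        "assignments": assignments,
        "project": project,
    }
    return sum(WEIGHTS[name] * fractions[name] for name in WEIGHTS)


# Six quizzes taken; the lowest score (0.0) is dropped.
print(course_grade([1.0, 0.5, 0.0, 1.0, 1.0, 1.0], 1.0, 1.0, 1.0))
```

With six quizzes taken, only the best five enter the average, so one missed or poor quiz does not lower the quiz component.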

Collaboration: Discussions are highly encouraged, but all work needs to be completed by individuals or teams independently. You can communicate your ideas verbally or by handwritten notes, but you cannot share your code or report with each other. If you need to use code from online resources, you need to download the corresponding packages or files and import the functions or classes you want to use. You need to clearly acknowledge the usage of these resources.

Late submissions: Every student gets 3 free tickets, representing 3 extra days you can spend on your projects. Once all tickets are used up, a late submission has its points discounted by 50% if it arrives within 24 hours after the deadline and receives zero points if it arrives later. If a group project is late, the rule applies to all group members, and everyone’s share is calculated separately.
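
The penalty that applies once all free tickets are gone can be sketched as a small function. This is only an illustration of the rule stated above, not an official grading tool. The function name is hypothetical, a submission exactly 24 hours late is assumed to still count as "within 24 hours", and the handling of free tickets is deliberately not modeled because the policy does not spell out how partial days are counted.

```python
def late_credit_fraction(hours_late):
    """Fraction of points kept for a late submission with no free tickets left.

    Assumption: exactly 24 hours late still counts as "within 24 hours".
    """
    if hours_late <= 0:
        return 1.0  # on time: full credit
    if hours_late <= 24:
        return 0.5  # within 24 hours of the deadline: 50% discount
    return 0.0      # more than 24 hours late: zero points


# A submission worth 16 points, turned in 3 hours late with no tickets left.
print(16 * late_credit_fraction(3))
```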

Prerequisites:

Comp 135 Introduction to Machine Learning.

Academic Integrity Policy:

On assignments: you must work out the details of each solution and code/write it out on your own. You may verbally discuss the problems and general ideas about their solutions with other students, but you CANNOT show or copy written or typed solutions. You may consult other textbooks or existing content on the web, but you CANNOT ask for answers through any question-answering websites such as (but not limited to) Quora or Stack Overflow. If you find material that poses the same problem and provides a solution, you CANNOT check or copy the solution provided.

On the final project: each team needs to work out the project on its own. The team members should try their best to balance the work among themselves. If any code is from a third party, the code needs to be wrapped in a function or package and labeled as third-party.

This course will strictly follow the Academic Integrity Policy of Tufts University. For any issues not covered above, please refer to the Academic Integrity Policy at Tufts.

Accessibility:

Tufts and the instructor of CS 137 in Fall 2022 strive to create a learning environment that is welcoming to students of all backgrounds. Please see the detailed accessibility policy at Tufts.