COMP150: Machine Learning for Graph Data Analytics

Instructor

Liping Liu

Class times and location

TR 10:30-11:45 @ Collaborative Learning and Innovation Complex 316

Office hours:

W 2-3pm, F 2-3pm at Halligan 234

Overview

Graph and network data are ubiquitous and often large in scale. Graph data are generally characterized by the graph structure and by data attached to graph nodes or edges. Machine learning is an important approach to automated information extraction from graph data. However, graph data need special model designs, as most learning models (e.g. neural networks) only accept vectors as input. Models for graph data generally fall into two categories: 1) models that learn vector representations of graph data, and 2) models that take graphs directly as input.
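
As a rough illustration of the two ingredients named above (graph structure and data attached to nodes), the sketch below stores a toy graph as an adjacency matrix plus per-node feature vectors, then collapses it into one fixed-length vector by averaging the node features. This is not course material: the toy graph, the feature values, and the helper names `adjacency_matrix` and `mean_pool` are illustrative assumptions, and averaging is only a hand-crafted stand-in for the learned representations the course covers.

```python
# Toy undirected graph with 4 nodes (hypothetical data for illustration).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
num_nodes = 4

# Data attached to nodes: one 2-dimensional feature vector per node.
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]


def adjacency_matrix(edges, num_nodes):
    """Encode the graph structure as a symmetric 0/1 matrix."""
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        A[u][v] = 1
        A[v][u] = 1  # undirected: mirror each edge
    return A


def mean_pool(features):
    """Average node feature vectors into one fixed-length graph vector."""
    n = len(features)
    dim = len(features[0])
    return [sum(row[j] for row in features) / n for j in range(dim)]


A = adjacency_matrix(edges, num_nodes)
degrees = [sum(row) for row in A]        # number of neighbors per node
graph_vector = mean_pool(node_features)  # same length for any graph size

print(degrees)
print(graph_vector)
```

The pooled vector has the same length regardless of how many nodes the graph has, which is what lets a vector-input model consume it; the cost is that averaging node features alone ignores the edge structure entirely.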

In this course, we will start with an introduction to graph theory, linear algebra, and machine learning. We will then cover the following topics in depth:

  1. Node representation
  2. Graph representation and generative models
  3. Graph convolutional neural networks
  4. Learning to solve hard graph problems
  5. Graphs in chemistry
  6. Knowledge graphs

The course work consists of 3 projects and a final project.

Objectives

After this course, a successful student should acquire the following abilities when solving a learning problem involving graph data:

  1. identifying the type of learning problem (e.g. whether node embeddings or graph embeddings are needed; what target to fit)
  2. choosing an appropriate type of learning model (e.g. preparing the correct inputs and fitting targets for the model)
  3. training related learning models with existing packages to solve the problem

Schedule

Week | Content | Assignment
week 1 (Sep 2) | Graph theory; Linear algebra |
week 2 (Sep 9) | Graph Laplacian; Graph signal [tutorial] | Proj 1.1 out
week 3 (Sep 16) | Node2vec [paper]; Discussion: embedding propagation | Proj 1.1 due
week 4 (Sep 23) | Visualization [tSNE]; Other variants [papers] | Proj 1.2 out
week 5 (Sep 30) | GCN [paper]; Discussion: GAT [paper] | Proj 1.2 due; Proj 2 out
week 6 (Oct 7) | Other variants |
week 7 (Oct 14, R only) | Graph classification and graph kernel | Proj 2 due
week 8 (Oct 21) | Graph classification (cont.); Autoencoding intro [paper] | Final proj proposal due; Proj 3 out
week 9 (Oct 28) | Graph encoding [paper] |
week 10 (Nov 4) | Graph generation [papers] | Proj 3 due
week 11 (Nov 11) | Chemical graph tutorial |
week 12 (Nov 18) | Molecule graph; reaction graph |
week 13 (Nov 25, T only) | Knowledge graph |
week 14 (Dec 2) | Knowledge graph embedding; knowledge graph applications | Final proj due at exam

Course Work and Grading Policy

Prerequisites:

COMP 135 Introduction to Machine Learning, and background knowledge in linear algebra.

Academic Integrity Policy:

On assignments: you must work out the details of each solution and code/write it out on your own. You may verbally discuss the problems and general ideas about their solutions with other students, but you CANNOT show your written or typed solutions to others or copy theirs. You may consult other textbooks or existing content on the web, but you CANNOT ask for answers through any question-answering websites such as (but not limited to) Quora or StackOverflow. If you find material that poses the same problem and provides a solution, you CANNOT check or copy the solution provided.

On the final project: each team needs to work out the project on its own. The team members should try their best to balance the work between the two team members. If any code is from a third party, the code needs to be wrapped in a function or package and labeled as third-party.

This course will strictly follow the Academic Integrity Policy of Tufts University. For any issues not covered above, please refer to the Academic Integrity Policy at Tufts.

Accessibility:

Tufts and the instructor of COMP 135 in 2018 Spring strive to create a learning environment that is welcoming to students of all backgrounds. Please see the detailed accessibility policy at Tufts.