This subject qualifies as a Computer Systems concentration subject.
Graph analytics have applications in a variety of domains, such as social network and Web analysis, computational biology, machine learning, and computer networking. This course will cover research topics in graph analytics including algorithms, optimizations, frameworks, and applications. Students will learn about both the theory and practice of designing efficient graph algorithms (parallel, cache-efficient, external-memory, etc.). We will also study design choices in high-level graph processing frameworks. Students will read and present research papers, and also complete a research project. This course is suitable for graduate students or advanced undergraduates who have taken 6.046 and 6.172/6.871.
This is a graduate-level course where we will cover the latest
research on graph analytics. Advanced undergraduates may enroll if
they have taken 6.046 and 6.172. The course units are 3-0-9.
Lectures
For each lecture we will study several research papers; for each paper
one student will give a presentation, which will be followed by a
discussion.
Grading
The assignments consist of writing weekly paper
reviews, doing several paper
presentations, class participation, and completing an open-ended
research project. The grading breakdown is as follows.
Grading Breakdown |
Paper Questions and Reviews | 20% |
Paper Presentations | 25% |
Research Project | 50% |
Class Participation | 5% |
Paper Questions and Reviews
Prior to each lecture, you are
expected to read the papers under "Required Reading" in the schedule
for that lecture. Students are required to answer a posted question about each paper, due at 12:00pm on the day of the lecture.
In addition, students are required to submit one paper review
every week, due 11:59pm on Tuesdays. The review should be on a paper
chosen from any of the starred papers (*) under "Required Reading" for
the two lectures the week (i.e., the Wednesday and Friday immediately
after the due date).
The review should first describe the problem the paper is trying to
solve, why it is important, the main ideas proposed, and the results
obtained. The review should then describe how the ideas are novel
compared to existing work at the time, the strengths and weaknesses of
the paper, and any ideas you may have for improving the techniques
and/or evaluation criteria, and any open problems or directions for
further work that you can think of.
The answers to the paper questions as well as the paper reviews will be
submitted
on Learning Modules. The paper reviews will be made visible
after each submission deadline, and you are encouraged to read other
reviews to improve your understanding and to prepare for the class discussion.
Paper Presentations
Students will give several presentations on assigned papers throughout
the semester. These presentations should be 20 minutes long with
slides, followed by a discussion. The presentation should discuss the
motivation for the problem being solved, any definitions needed to
understand the paper, key technical ideas in the paper, theoretical
and/or experimental results, and existing work at the time (doing a
literature search and including a discussion of recent related work
would be a bonus). The presenter should also present his or her own
thoughts on the paper (perceived strengths and weaknesses, directions
for future work, etc.), and pose several questions for discussion. The
presenter will be expected to lead the discussion on the paper.
Research Project
A large part of the work in this course is
in proposing and completing an open-ended research project. This will
give you a taste of doing research in the area of graph analytics.
The project may involve (but is not limited to) any of the following
tasks: implementation of non-trivial and theoretically-efficient graph
algorithms; analyzing and optimizing the performance of existing graph
codes; designing new graph algorithms that are theoretically and/or
practically efficient; applying graph algorithms in the context of
larger applications, e.g., in social network and Web analysis, machine
learning, or computational biology; exploring new problems for
analyzing the structure of graphs; improving or designing new graph
processing frameworks/systems; and conducting a survey of a topic. The
project may explore parallelism, cache-efficiency, I/O-efficiency, and
memory-efficiency. The project can be related to research that you
are currently doing, subject to instructor approval.
The project will be done in groups of 2-3 people and consist of a proposal,
mid-term report, poster presentation, and final report.
The timeline for the
project is as follows.
Assignment | Due Date |
Pre-proposal meeting | 3/14 |
Proposal | 3/16 |
Mid-term Report | 4/13 |
Poster Session | 5/14 |
Final Report | 5/17 |
- Pre-proposal meeting: You will schedule a 15 minute meeting with the instructor to discuss what you would like to propose. Feedback will be given to be incorporated into the proposal.
- Proposal: The proposal should be about 2 pages long (excluding figures and references) and will describe the
project that you are proposing to work on, the main components of the
project, as well as a projected weekly schedule of what you plan to
accomplish throughout the semester. The deadline is 3/16, but you may turn it in on 3/19 if you have an appointment at the Communication Lab to improve your proposal.
- Mid-term report: The mid-term report will talk about what you have
accomplished so far, a breakdown of the contribution among group
members so far, any obstacles you encountered, any changes to the proposed tasks, and a schedule of the
remaining work to be done. This should be about 6 pages long (excluding figures and references).
The deadline is 4/13, you may turn it in on 4/16 if you have an appointment at the Communication Lab to improve your report.
- Poster session: We will have a poster session on Monday 5/14 (time TBD), where you will describe your research to the instructor and classmates, and learn about other projects.
- Final report: The final report will be in the style of a research paper describing your project. It should include an abstract summarizing the project, an introduction describing and motivating the problem, a brief discussion of related work, a brief overview of any background knowledge needed to understand the paper, followed by your contributions. It should also discuss any open problems or directions for further work, and include a breakdown of work among group members. The report should be about 10 pages long (excluding figures and references).
Piazza
We will be
using
Piazza for answering questions. You may publicly
post questions about the papers or any cool ideas that you have about
graph analytics. Discussions among students is encouraged. If possible, please use Piazza instead of emailing the course staff. You may use a private note if you want your post to be visible only to the course staff.
Collaboration Policy
You are welcome to read the papers with
other students, but all of your written work should be your own. You
should carefully acknowledge any collaborators and any outside sources
you use to complete the paper questions, paper reviews, and the
project reports. Please do not look at anyone else's answers before
submitting your answers to the paper questions and your paper reviews.
Computing Resources
We will be using Amazon Web Services.
Textbooks
Some of the readings will be from the following textbooks.
Introduction to Algorithms, 3rd Edition by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein (CLRS)
Networks, Crowds, and Markets by David Easley and Jon Kleinberg
Some Other Courses on Graph Analytics
Networks (Daron Acemoglu and Asu Ozdaglar, MIT)
Analysis of Networks (Jure Leskovec, Stanford)
Networks (David Easley and Jon Kleinberg, Cornell)
Topics in Social Data (Johan Ugander, Stanford)
Network Theory (Mark Newman, University of Michigan)
Graphs and Networks (Dan Spielman, Yale)
Statistical Network Analysis (Jennifer Neville, Purdue)
Network Analysis and Modeling (Aaron Clauset, Sante Fe Institute)
Parallel Graph Analysis (George Slota, RPI)
Large-Scale Graph Mining (A. Erdem Sariyuce, University of Buffalo)
Mining Large-scale Graph Data (Danai Koutra, University of Michigan)
Data Mining meets Graph Mining (Leman Akoglu, Stony Brook)
Graphs and Networks (Charalampos Tsourakakis, Aalto University)
Large-Scale Graph Processing (Keval Vora, Simon Fraser University)