CS 294-131: Special Topics in Deep Learning

Instructors

Trevor Darrell

Dawn Song

Teaching Assistants

Lisa Anne Hendricks

Office Hours

Lisa Anne: Monday 5–6:00 pm, Soda-Alcove-283H

Lectures

Time: Monday 1–2:30 pm

Location: 306 Soda

Room Limit: Soda 306 is designed for smaller courses. We increased course enrollment so more students could benefit from this course. However, if the room becomes too full (and thus poses a fire hazard), students who arrive after the room has reached capacity will be directed to watch the lecture remotely. The link for the live webcast (and recorded lectures) can be found on Piazza.

You may see the intro slides from the first day of class here.

Mailing list and Piazza

To receive announcements about the class, including guest speakers, and more generally about deep learning talks at Berkeley, please sign up for the talk announcement mailing list.

If you are in the class, you may sign up on Piazza. Additionally, you should sign up for the class Slack channel and the class Google group (this is different from the talk announcement mailing list).

Arxiv Summaries

This semester we started summarizing interesting papers from Arxiv each week. Check out the papers we have chosen and summarized here!

Syllabus

Date	Speaker	Readings	Talk	Deadlines
08/28	Anima Anandkumar	Main Readings: Tensor Regression Networks by Jean Kossaifi, Zachary Lipton, Aran Khanna, Tommaso Furlanello, and Anima Anandkumar. Tensor Contractions with Extended BLAS Kernels on CPU and GPU by Yang Shi, U.N. Niranjan, Anima Anandkumar, and Cris Cecka. Also see a blog post and poster. Background Reading: Tensor Decompositions for Learning Latent Variable Models by A. Anandkumar, R. Ge, D. Hsu, S.M. Kakade, and M. Telgarsky. Also see a blog post. Fast and Guaranteed Tensor Decomposition via Sketching by Y. Wang, H.-Y. Tung, A. Smola, and A. Anandkumar. Also see a poster. Jupyter notebooks (credits will be provided on AWS to run them): Tensors on tensorly package (with mxnet backend); Gluon tutorials for deep learning; Visual question & answering using sketches.	Role of Tensors in Machine Learning
09/05	Labor Day - No Class
09/11	Vladlen Koltun	Main Readings: Learning to Act by Predicting the Future by A. Dosovitskiy and V. Koltun. Playing for Benchmarks by S. Richter, Z. Hayder, and V. Koltun. Background Reading: A Critique of Pure Vision by P. Churchland, V.S. Ramachandran, and T. Sejnowski. Playing for Data: Ground Truth from Computer Games by S. Richter, V. Vineet, S. Roth, and V. Koltun.	Learning to Act with Natural Supervision
09/18	Jianfeng Gao	Main Readings: ReasoNet: Learning to Stop Reading in Machine Comprehension by Y. Shen, P. Huang, J. Gao, and W. Chen. Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access by B. Dhingra, L. Li, X. Li, J. Gao, Y. Chen, F. Ahmed, and L. Deng. Background Reading: SQuAD: 100,000+ Questions for Machine Comprehension of Text by P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang. POMDP-Based Statistical Spoken Dialog Systems: A Review by S. Young, M. Gasic, B. Thomson, and J. Williams.	Neural approaches to Machine Reading Comprehension and Dialogue	Project Proposal Due
09/25	Quoc Le and Barret Zoph	Main Reading: Neural Architecture Search with Reinforcement Learning by B. Zoph and Q. Le. Learning Transferable Architectures for Scalable Image Recognition by B. Zoph, V. Vasudevan, J. Shlens, and Q. Le.	Learning Transferable Architectures for ImageNet
10/02	Ross Girshick	Main Reading: Mask R-CNN by K. He, G. Gkioxari, P. Dollar, and R. Girshick. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering by P. Anderson, X. He, C. Buehler, D. Teney, M. Johnson, S. Gould, and L. Zhang. Background Reading: Feature Pyramid Networks for Object Detection by T. Lin, P. Dollar, R. Girshick, K. He, B. Hariharan, and S. Belongie. Focal Loss for Dense Object Detection by T. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar.	The Past, Present, and Future of Object Detection
10/09	Igor Mordatch	Main Reading: Emergence of Grounded Compositional Language in Multi-Agent Populations by Mordatch and Abbeel. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments by Lowe, Wu, Tamar, Harb, Abbeel, and Mordatch. Background Reading: A Paradigm for Situated and Goal-Driven Language Learning by Gauthier and Mordatch. Predicting Pragmatic Reasoning in Language Games by Frank and Goodman.	Emergence of Grounded Compositional Language in Multi-Agent Populations
10/16	David Patterson	Main Reading: In-Datacenter Performance Analysis of a Tensor Processing Unit by Jouppi et al. A Cloud-Scale Acceleration Architecture by Caulfield et al.	Evaluation of a Domain-Specific Architecture for Deep Neural Networks in the Datacenter: The Google TPU
10/23	Matthew Johnson	Main Reading: Composing graphical models with neural networks for structured representations and fast inference by Johnson et al. Linear dynamical neural population models through nonlinear embeddings by Gao et al. Background Reading: Structured Inference Networks for Nonlinear State Space Models by Krishnan et al. Conjugate-Computation Variational Inference: Converting Variational Inference in Non-Conjugate Models to Inferences in Conjugate Models by Khan and Lin.	Composing graphical models and neural networks for structured representations and fast inference
10/30	Percy Liang	Main Reading: Understanding Black-box Predictions via Influence Functions by P.W. Koh and P. Liang. Developing Bug-Free Machine Learning Systems With Formal Mathematics by D. Selsam, P. Liang, and D. Dill. Background Reading: A Roadmap for a Rigorous Science of Interpretability by F. Doshi-Velez and B. Kim. Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods by N. Carlini and D. Wagner. Gradient Estimation Using Stochastic Computation Graphs by J. Schulman, N. Heess, T. Weber, and P. Abbeel.	Fighting Black Boxes, Adversaries, and Bugs in Deep Learning	Project Milestone Due
11/06	Li Deng	Main Reading: Deep neural networks for acoustic modeling in speech recognition by G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury. An Unsupervised Learning Method Exploiting Sequential Output Statistics by Y. Liu, J. Chen, and L. Deng. Background Reading: Unsupervised transcription of historical documents by T. Berg-Kirkpatrick, G. Durrett, and D. Klein. Deep Learning: Methods and Applications (Ch. 2, 7, 11) by L. Deng and D. Yu. Deep Learning (Ch. 8, 10, 20) by I. Goodfellow, Y. Bengio, and A. Courville.	From Supervised to Unsupervised Deep Learning: Successes and Challenges
11/13	Rob Fergus	Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play by S. Sukhbaatar et al. Unsupervised Learning of Disentangled Representations from Video by E. Denton and V. Birodkar.	Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play and Unsupervised Learning of Disentangled Representations from Video
11/20	Rishabh Singh	Main Reading: RobustFill: Neural Program Learning under Noisy I/O by J. Devlin, J. Uesato, S. Bhupatiraju, R. Singh, A. Mohamed, and P. Kohli. Neuro-symbolic Program Synthesis by E. Parisotto, A. Mohamed, R. Singh, L. Li, D. Zhou, and P. Kohli. Background Reading: AP: Artificial Programming by R. Singh and P. Kohli. Neural Turing Machines by A. Graves, G. Wayne, and I. Danihelka.	Neural Program Synthesis
11/27	Danny Tarlow	Main Reading: TerpreT: A Probabilistic Programming Language for Program Induction by Alex Gaunt, Marc Brockschmidt, Rishabh Singh, Nate Kushman, Pushmeet Kohli, Jonathan Taylor, and Daniel Tarlow. Differentiable Programs with Neural Libraries by Alex Gaunt, Marc Brockschmidt, Nate Kushman, and Daniel Tarlow. Background Reading: Neural Random-Access Machines by Karol Kurach, Marcin Andrychowicz, and Ilya Sutskever. Programming with a Differentiable Forth Interpreter by Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, and Sebastian Riedel.	Differential Interpreters
11/27 DATE CHANGED!	Poster Session	3:00-5:00 SDH Atrium
12/09				Final Report Due

Course description

In recent years, deep learning has enabled huge progress in many domains including computer vision, speech, NLP, and robotics. It has become the leading solution for many tasks, from winning the ImageNet competition to winning at Go against a world champion. This class is designed to help students develop a deeper understanding of deep learning and explore new research directions and applications of deep learning. It assumes that students already have a basic understanding of deep learning. In particular, we will explore a selected list of new, cutting-edge topics in deep learning, including new techniques and architectures in deep learning, security and privacy issues in deep learning, recent advances in the theoretical and systems aspects of deep learning, and new application domains of deep learning such as autonomous driving.

Class format and project

This is a lecture-, discussion-, and project-oriented class. Each lecture will focus on one of the topics, including a survey of the state of the art in the area and an in-depth discussion of the topic. Each week, students are expected to complete reading assignments before class and participate actively in class discussion.

Students will also form project groups (two to three people per group) and complete a research-quality class project.

Enrollment information

For undergraduates: Please note that this is a graduate-level class. However, with the instructors’ permission, qualified undergraduate students may take the class. If you are an undergraduate student and would like to enroll, please fill out this form and come to the first lecture. After the first lecture, qualified undergraduates will be given instructor codes that allow them to register, subject to space availability.

Students may enroll in this class for variable units.

1 unit: Participate in reading assignments (including serving as discussion lead once and Arxiv lead once).
2 units: Complete a project. Projects may fall into one of four categories:
- Traditional Literature Review of a deep learning topic (e.g., literature review of deep dialogue systems)
- Distill-like Literature Review of a deep learning topic (e.g., a Distill-like blog post illustrating different optimization techniques used in deep learning)
- Reimplementation of research code, released as open source
- Conference-level research project
3 units: Both reading assignments and a project.
You may not take this class for 4 units.

Deadlines

Reading assignment deadlines:
- For students,
  - Submit questions by Friday noon
  - Vote on the poll of discussion questions by Saturday 11:59 pm
- For discussion leads,
  - Send form to collect questions from students by Wednesday 11:59 pm
  - Summarize questions proposed by students to form the poll and send it by Friday 11:59 pm
  - Summarize the poll to generate a ranked & categorized discussion question list and send the list to teaching staff by Sunday 7 pm
- For Arxiv leads (new this semester!),
  - Throughout the week, discuss on Slack the papers that appeared on Arxiv during the prior week. All Arxiv leads are expected to take part in the Slack discussion. Other students may participate as well, but are not required to.
    - Arxiv is an archive of scientific papers covering a broad set of fields. Researchers frequently post their papers on Arxiv when they want to share results. Arxiv leads should follow these Arxiv pages: Computation and Language, Artificial Intelligence, Computer Vision and Pattern Recognition, Learning, Robotics, and Neural and Evolutionary Computing. Not all papers are deep learning papers, but most (if not all) deep learning papers will fall into one of these categories.
  - Choose (approximately) five exciting papers, write a short summary (0.5 pages) of each, and send the summaries to the TA by Monday morning.
  - Give a five-minute presentation at the beginning of the next class (five minutes total, not per paper).

Grading

20% class participation
25% weekly reading assignment
- 7.5% discussion leads
- 7.5% Arxiv leads (new this semester!)
- 10% individual reading assignments
55% project
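
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the final score is a plain weighted sum; neither the scale nor the example scores come from this page, and the names `WEIGHTS`, `final_score`, and `example` are hypothetical.

```python
import math

# Weights copied from the grading breakdown above (fractions of the final grade).
WEIGHTS = {
    "participation": 0.20,
    "discussion_lead": 0.075,
    "arxiv_lead": 0.075,
    "individual_reading": 0.10,
    "project": 0.55,
}

# Sanity check: the listed components should cover the whole grade.
assert math.isclose(sum(WEIGHTS.values()), 1.0)


def final_score(scores):
    """Return the weighted total for per-component scores on a 0-100 scale.

    `scores` is a hypothetical dict keyed like WEIGHTS. How each component
    is actually scored is an assumption, not something this page specifies.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores, for illustration only.
example = {
    "participation": 90,
    "discussion_lead": 100,
    "arxiv_lead": 80,
    "individual_reading": 85,
    "project": 88,
}
print(round(final_score(example), 2))
```

The sketch covers the case where all components apply (reading assignments plus a project); how the weights are adjusted for other unit options is not stated here.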

Additional Notes

For students who need computing resources for the class project, we recommend looking into the AWS Educate program for students. You’ll get 100 dollars’ worth of sign-up credit. Here’s the link.