Abstract: Deep learning has recently reshaped the landscape of computer vision research and application. The depth of neural networks is of central importance for recognition accuracy, but deeper neural networks are more difficult to train. In this talk, I will discuss the challenges of learning deeper networks as well as recent advances. I will introduce my recent work, Deep Residual Learning, which enables ultra-deep networks with 150+ layers. This method is the foundation of our 1st-place winning entries in all five main tracks of the ImageNet and COCO 2015 competitions, which cover image classification, object detection, and semantic segmentation. This talk also covers advances in object detection systems and the intuitions behind them, and highlights the importance of learning visual features for recognition.
Bio: Dr. Kaiming He is a Research Scientist at Facebook AI Research (FAIR) as of August 2016. Before that, he was a Lead Researcher at Microsoft Research Asia (MSRA), which he joined in 2011. His research interests are in computer vision and deep learning. He has received two CVPR Best Paper Awards as first author, in 2009 and 2016. His work on Deep Residual Networks (ResNets) won 1st place in all five major tracks of the ImageNet and MS COCO 2015 competitions, which covered image classification, object detection, and semantic segmentation. He received his PhD in 2011 from the Chinese University of Hong Kong, and his BS in 2007 from Tsinghua University.
Abstract: Contemporary neural network models perform a fixed amount of computation for a given amount of data. For instance, a feed-forward network accepts an input of a predefined size, and performs a constant number of computation steps. However, many algorithmic problems require super-linear computation time. For example, multi-digit multiplication needs O(n log n) computation steps; therefore it cannot be fully learnt by a feed-forward network, a convolutional network, or a recurrent network. We present two families of Turing-complete models: the neural GPU, and one based on discrete decisions. Our models learn to rearrange long sequences and to perform arithmetic. They learn multi-digit decimal multiplication, addition, and combinations of such operations.
Bio: Wojciech Zaremba is a researcher and founder at OpenAI, where he leads the robotics team. He was a student of Prof. Rob Fergus and Prof. Yann LeCun at New York University, where he graduated in less than 3 years. He holds a Master's degree summa cum laude from Ecole Polytechnique in Paris and another from the University of Warsaw. Wojciech has worked at Facebook AI Research and at Google Brain. His interests are in robotics, meta-learning, and Turing-complete neural-network-based models.
Abstract: Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy. Our implementation and experiments demonstrate that we can train deep neural networks with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
Bio: Kunal Talwar is a Research Scientist at Google, working in the areas of Differential Privacy, Algorithms and Machine Learning. He graduated from UC Berkeley in 2004, where he worked with Christos Papadimitriou and Satish Rao. Before joining Google, he was a Senior Researcher at Microsoft Research in Silicon Valley.
Abstract: Many machine learning models are vulnerable to "adversarial examples"—inputs that are intentionally designed to cause the model to produce the wrong output. These inputs are often so subtle that a human observer cannot see that anything has been altered. Because adversarial examples that fool one machine learning model often fool another, an attacker can construct them without access to the target model. Explicitly training models to defend against adversarial attack is a partially effective defense strategy, and can also improve the performance of the model on naturally occurring data.
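As a rough illustration of how such an input can be constructed, the sketch below applies a fast-gradient-sign-style perturbation to a tiny logistic-regression classifier. The technique, the weights, the input, and the step size eps are all assumptions chosen for illustration; the abstract does not say which attack or model the talk uses.

```python
import math

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm_perturb(w, b, x, y, eps):
    """Move each input coordinate by eps in the direction that increases the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input x is (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad_x = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]

# Hypothetical toy model and input (not taken from the talk).
w = [1.0, -2.0, 0.5]
b = 0.0
x = [0.2, -0.1, 0.4]   # classified as class 1
y = 1.0                # true label

x_adv = fgsm_perturb(w, b, x, y, eps=0.25)
p_clean = predict_proba(w, b, x)
p_adv = predict_proba(w, b, x_adv)
```

In this toy setting no coordinate moves by more than eps, yet the predicted class flips. With only three input dimensions the step has to be fairly large; with many dimensions, a far smaller per-coordinate step can add up to the same effect, which is one way to see why perturbations of images can be too subtle to notice.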
Abstract: In machine learning, the two dominant approaches to learning generative models of data have been based on either directed graphical models or undirected graphical models. In this talk, I'll discuss a third approach, which has become more popular only recently: autoregressive generative models. Thanks to neural networks, this family of models has been shown to be very competitive, both in terms of the realism of the data they can generate and the data representation they can learn. I'll discuss a variety of such neural autoregressive models and dissect the advantages and disadvantages of this approach.
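For readers unfamiliar with the term, an autoregressive model is usually understood as one that factorizes the joint distribution with the chain rule of probability and models each conditional in turn. The notation below (a D-dimensional observation x, parameters \theta) is an assumption for illustration and is not taken from the talk.

```latex
p(\mathbf{x}) = \prod_{d=1}^{D} p(x_d \mid x_1, \dots, x_{d-1})
\qquad\text{with each factor modeled as}\qquad
p_\theta(x_d \mid \mathbf{x}_{<d})
```

In a neural autoregressive model, each conditional p_\theta(x_d \mid \mathbf{x}_{<d}) is computed by a neural network, typically with parameters shared across the D factors.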