Date |
Topic |
Instructor |
Scriber |
09/03/2018, Mon |
Lecture 01: Overview I [ slides ]
[Reference]:
- Tomaso Poggio, Hrushikesh Mhaskar, Lorenzo Rosasco, Brando Miranda, Qianli Liao,
Why and When Can Deep - but Not Shallow - Networks Avoid the Curse of Dimensionality: A Review.
- Hrushikesh Mhaskar, Qianli Liao, Tomaso Poggio, Learning Functions: When is Deep Better Than Shallow, 2016.
- Michael Kohler, Adam Krzyzak, Nonparametric Regression Based on Hierarchical Interaction Models. IEEE Transactions on Information Theory, 63(3):1620-1630, 2016.
|
Y.Y. |
|
09/05/2018, Wed |
Lecture 02: Overview II [ slides ]
|
Y.Y. |
|
09/10/2018, Mon |
Lecture 03: Overview III [ slides ]
[Reference]:
- Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals,
Understanding deep learning requires rethinking generalization.
ICLR 2017.
[Chiyuan Zhang's codes]
- Peter L. Bartlett, Dylan J. Foster, Matus Telgarsky. Spectrally-normalized margin bounds for neural networks.
[ arXiv:1706.08498 ]. NIPS 2017.
- Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro. The Implicit Bias of Gradient Descent on Separable Data. [ arXiv:1710.10345 ]. ICLR 2018.
- Matus Telgarsky. Margins, Shrinkage, and Boosting. [ arXiv:1303.4172 ]. ICML 2013.
|
Y.Y. |
|
09/12/2018, Wed |
Lecture 04: Overview IV [ slides ] and Project 1 [ project1.pdf ]
[Reference]:
- C. Daniel Freeman and Joan Bruna. Topology and Geometry of Half-Rectified Network Optimization, ICLR 2017. [ arXiv:1611.01540 ]
- Luca Venturi, Afonso Bandeira, and Joan Bruna. Neural Networks with Finite Intrinsic Dimension Have no Spurious Valleys. [ arXiv:1802.06384 ]
- Stephane Mallat, Group Invariant Scattering, Communications on Pure and Applied Mathematics, Vol. LXV, 1331–1398 (2012)
- Joan Bruna and Stephane Mallat, Invariant Scattering Convolution Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012
- Haixia Liu, Raymond Chan, and Yuan Yao, Geometric Tight Frame based Stylometry for Art Authentication of van Gogh Paintings, Applied and Computational Harmonic Analysis, 41(2): 590-602, 2016.
- Roberto Leonarduzzi, Haixia Liu, and Yang Wang, Scattering transform and sparse linear classifiers for art authentication. Signal Processing 150: 11-19, 2018.
|
GU, Hanlin and Y.Y. |
|
09/19/2018, Wed |
Lecture 05: Harmonic Analysis of Convolutional Networks: Wavelet Scattering Net [ slides ]
|
Y.Y. |
|
09/24/2018, Mon |
Lecture 06: Harmonic Analysis of Convolutional Networks: Extension of Scattering Nets [ slides ]
|
Y.Y. |
|
09/26/2018, Wed |
Lecture 07: Convolutional Neural Network with Structured Filters [ slides ]
[Abstract]:
- In this lecture I'll introduce recent work by Prof. Xiuyuan CHENG et al. at Duke University.
- Filters in a Convolutional Neural Network (CNN) contain model parameters learned from enormous amounts of data.
The properties of the convolutional filters in a trained network directly affect the quality of the data representation
being produced. In this talk, we introduce a framework for decomposing convolutional filters over a truncated expansion
under pre-fixed bases, where the expansion coefficients are learned from data. Such a structure not only reduces the number
of trainable parameters and the computational load but also explicitly imposes filter regularity through basis truncation. Apart from
maintaining prediction accuracy across image classification datasets, the decomposed-filter CNN also produces a representation
that is stable with respect to input variations, which is proved under generic assumptions on the basis expansion.
Joint work with Qiang Qiu, Robert Calderbank, and Guillermo Sapiro.
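For concreteness, here is a minimal PyTorch-style sketch (my own illustration, not the authors' code) of a convolutional layer whose filters are truncated expansions over pre-fixed bases, with only the expansion coefficients trainable; the unit-norm random bases, layer sizes, and initialization are assumptions made purely for the example.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecomposedConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=5, num_bases=3):
        super().__init__()
        # Pre-fixed bases: unit-norm random patches here, just to keep the sketch
        # self-contained; the actual work uses structured (regular) bases.
        bases = torch.randn(num_bases, kernel_size, kernel_size)
        bases = bases / bases.flatten(1).norm(dim=1, keepdim=True).unsqueeze(-1)
        self.register_buffer("bases", bases)  # fixed, not trained
        # Only the expansion coefficients are learned: out_ch x in_ch x num_bases.
        self.coeff = nn.Parameter(0.1 * torch.randn(out_ch, in_ch, num_bases))

    def forward(self, x):
        # Each filter is a linear combination of the fixed bases.
        filters = torch.einsum("oik,khw->oihw", self.coeff, self.bases)
        return F.conv2d(x, filters, padding=filters.shape[-1] // 2)

layer = DecomposedConv2d(3, 16)
print(layer(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```
Note how the parameter count per filter drops from kernel_size^2 to num_bases, while truncating the basis expansion controls the regularity of the learned filters.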
|
Y.Y. |
|
10/03/2018, Wed |
Lecture 08: Student Seminars on Project 1
[Team]: DENG Yizhe, HUANG Yifei, SUN Jiaze, TAN Haiyi
- Title: Real or fake? A Comparison Between Scattering Network & Resnet-18 [ slides ].
[Team]: YIN, Kejing (Jake) and QIAN, Dong
- Title: Feature Extraction and Transfer Learning [ slides ].
|
|
|
10/08/2018, Mon |
Lecture 09: Student Seminars on Project 1
[Team]: Bhutta, Zheng, Lan (Group 6)
- Title: Raphael painting analysis: Random cropping leading to high variance [ slides ].
|
|
|
10/10/2018, Wed |
Lecture 10: Sparsity in Convolutional Neural Networks [ slides ]
[Reference]:
- Jeremias Sulam, Vardan Papyan, Yaniv Romano, and Michael Elad. Multi-Layer Convolutional Sparse Modeling:
Pursuit and Dictionary Learning, IEEE Transactions on Signal Processing, vol. 66, no. 15, pp. 4090-4104, 2018. arXiv:1708.08705.
- Vardan Papyan, Yaniv Romano, and Michael Elad. Working Locally Thinking Globally: Theoretical Guarantees for Convolutional Sparse Coding, IEEE Transactions on Signal Processing, vol. 65, no. 21, pp. 5687-5701, 2018. arXiv:1707.06066.
- Vardan Papyan, Yaniv Romano, and Michael Elad. Convolutional Neural Networks Analyzed via Convolutional Sparse Coding, Journal of Machine Learning Research, 18:1-52, 2017. arXiv:1607.08194.
|
Y.Y. |
|
10/15/2018, Mon |
Lecture 11: Seminar: Exponentially Weighted Imitation Learning for Batched Historical Data.
[ slides ]
[Speaker]: WANG, Qing, Tencent AI Lab.
[Abstract]:
- We consider deep policy learning with only batched historical trajectories.
The main challenge of this problem is that the learner no longer has a simulator or “environment
oracle” as in most reinforcement learning settings. To solve this problem, we propose a monotonic
advantage reweighted imitation learning strategy that is applicable to problems with complex
nonlinear function approximation and works well with hybrid (discrete and continuous) action space.
The method does not rely on knowledge of the behavior policy, and thus can be used to learn from
data generated by an unknown policy. Under mild conditions, our algorithm, though surprisingly
simple, has a policy improvement bound and outperforms most competing methods empirically. Thorough
numerical results are also provided to demonstrate the efficacy of the proposed methodology.
This is a joint work with Jiechao Xiong, Lei Han, Peng Sun, Han Liu, and Tong Zhang.
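As a concrete (and heavily simplified) illustration of the idea in the abstract above, the following sketch implements an exponentially advantage-weighted imitation loss on a toy batch of logged transitions; the network sizes, the weight clipping, and the estimated advantages are assumptions for the example, not the speaker's exact algorithm.
```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))  # toy discrete policy
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
beta = 1.0  # temperature of the exponential advantage weighting

def weighted_imitation_loss(states, actions, advantages):
    # Log-probability of the logged actions under the current policy.
    log_probs = torch.log_softmax(policy(states), dim=-1)
    log_pi_a = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    # Exponential advantage weights (clipped for numerical stability):
    # actions with higher estimated advantage are imitated more strongly.
    weights = torch.exp(beta * advantages).clamp(max=20.0)
    return -(weights.detach() * log_pi_a).mean()

# One update on a fake logged batch: 32 transitions with estimated advantages.
states = torch.randn(32, 4)
actions = torch.randint(0, 2, (32,))
advantages = torch.randn(32)
loss = weighted_imitation_loss(states, actions, advantages)
optimizer.zero_grad(); loss.backward(); optimizer.step()
```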
[Team]: Huangshi Tian, Beijing Fang, Yunfei Yang (Group 3)
- Title: An In-Depth Look at Feature Transformation Ability of CNN
[ slides ].
|
|
|
10/22/2018, Mon |
Lecture 12: Implicit Regularization in Gradient Descent [ slides ]
[Reference]:
- Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals,
Understanding deep learning requires rethinking generalization. ICLR 2017. [ arXiv:1611.03530 ]
[Chiyuan Zhang's codes]
- Peter L. Bartlett, Dylan J. Foster, Matus Telgarsky. Spectrally-normalized margin bounds for neural networks.
[ arXiv:1706.08498 ].
- Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro. The Implicit Bias of Gradient Descent on Separable Data.
[ arXiv:1710.10345 ]
- Poggio, T, Liao, Q, Miranda, B, Rosasco, L, Boix, X, Hidary, J, Mhaskar, H. Theory of Deep Learning III: explaining the non-overfitting puzzle.
[ MIT CBMM Memo-73, 1/30/2018 ].
- Liao, Q., Miranda, B., Hidary, J., and Poggio, T. Classical generalization bounds are surprisingly tight for Deep Networks. MIT CBMM Memo-91.
[arXiv:1807.09659]
- Zhu, Weizhi, Yifei Huang, and Yuan YAO. On Breiman's Dilemma in Neural Networks: Phase Transitions of Margin Dynamics.
[arXiv:1810.03389]
- Yuan Yao, Lorenzo Rosasco and Andrea Caponnetto,
On Early Stopping in Gradient Descent Learning, Constructive Approximation, 2007, 26 (2): 289-315.
- Tong Zhang and Bin Yu. Boosting with Early Stopping: Convergence and Consistency. Annals of Statistics, 2005, 33(4): 1538-1579.
[ arXiv:math/0508276 ].
|
Y.Y. |
|
10/24/2018, Wed |
Lecture 13: Seminar
[Speaker]: Baoyuan WU, Tencent AI Lab
[Abstract]: In this talk, I will introduce three topics, time permitting.
- Topic 1: Tencent ML-Images: large-scale visual representation learning. [ slides (.pptx) ]
The success of deep learning strongly depends on large-scale, high-quality training data. Tencent ML-Images is an important open-source project: it publishes a large-scale multi-label image database (18M images across 11K categories), checkpoints with strong visual representation capability (80.73% top-1 accuracy on the ImageNet validation set), and the complete code. In this talk, I will introduce the construction of ML-Images and its main characteristics, the training of deep neural networks on a large-scale image database, transfer learning to single-label image classification on ImageNet, and feature extraction and image classification using the trained checkpoint. This project aims to give a clear picture of the complete process of visual representation learning based on deep neural networks.
Project address: https://github.com/Tencent/tencent-ml-images
- Topic 2: Lp-Box ADMM: a versatile framework for integer programming. [ slides (.pptx) ]
In this talk, we revisit the integer programming (IP) problem, which plays a fundamental role in many computer vision and machine learning applications. We propose a novel and versatile framework called Lp-Box ADMM, based on two main ideas. (1) The discrete constraint is equivalently replaced by the intersection of a box and an Lp-sphere. (2) This equivalence is infused into the ADMM (Alternating Direction Method of Multipliers) framework, which handles the two continuous constraints separately and harnesses its attractive properties. The proposed algorithm is theoretically guaranteed to converge to an epsilon-stationary point. We demonstrate the applicability of Lp-Box ADMM on four important applications: MRF energy minimization, graph matching, clustering, and model compression of convolutional neural networks. Results show that it outperforms generic IP solvers in both runtime and objective value, and achieves very competitive performance compared to state-of-the-art methods specifically designed for these applications. A small numerical check of the constraint reformulation is sketched below.
[ preprint ]
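As a small numerical illustration of the reformulation described above (my own sketch, using p = 2): binary vectors are exactly the points of the box [0,1]^n that lie on the sphere of radius sqrt(n)/2 centered at (0.5, ..., 0.5), and the ADMM splitting alternates between projections onto these two sets.
```python
import numpy as np

def project_box(x):
    # Projection onto the box [0, 1]^n.
    return np.clip(x, 0.0, 1.0)

def project_l2_sphere(x, n):
    # Projection onto the sphere centered at 0.5 with radius sqrt(n)/2.
    c = x - 0.5
    norm = np.linalg.norm(c)
    direction = c / norm if norm > 0 else np.ones_like(c) / np.sqrt(len(c))
    return 0.5 + (np.sqrt(n) / 2) * direction

n = 4
binary = np.array([1.0, 0.0, 1.0, 1.0])
print(np.allclose(np.sum((binary - 0.5) ** 2), n / 4))   # True: binary points lie on the sphere
relaxed = np.array([0.9, 0.2, 0.7, 0.4])                 # a box point off the sphere
print(project_box(project_l2_sphere(relaxed, n)))        # entries pushed toward {0,1}
```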
- Topic 3: Multimedia AI: A brief introduction of researches and applications of Tencent AI Lab.
Tencent AI Lab was established in Shenzhen in 2016 as a company-level strategic initiative and focuses on advancing fundamental and applied AI research. Its research fields include computer vision, speech recognition, natural language processing, and machine learning. The lab's technologies have been applied in more than 100 Tencent products, including WeChat, QQ, and the news app Tian Tian Kuai Bao. In this talk, I will give a brief introduction to the lab's research on multimedia AI,
covering image, video, audio, and text, and ranging from modeling and analysis to understanding and generation.
https://ai.tencent.com/ailab/
[Bio]: Baoyuan Wu is currently a Senior Research Scientist at Tencent AI Lab. He was a postdoctoral researcher in the IVUL lab at KAUST, working with Prof. Bernard Ghanem, from August 2014 to November 2016. He received his PhD from the National Laboratory of Pattern Recognition, Chinese Academy of Sciences (CASIA) in 2014, supervised by Prof. Baogang Hu. His research interests are machine learning and computer vision, including probabilistic graphical models, structured output learning, multi-label learning, and integer programming. His work has been published in venues including TPAMI, IJCV, CVPR, ICCV, ECCV, and AAAI.
|
Y.Y. |
|
10/29/2018, Mon |
Lecture 14: Variational Inference and Deep Learning. [ slides ]
|
Prof. Can YANG |
|
10/31/2018, Wed |
Lecture 15: Phase Transitions of Margin Dynamics [ slides ] and Project 2 [ Assignment ]
[Reference]:
- ZHU, Weizhi, Yifei HUANG, and Yuan YAO.
On Breiman's Dilemma in Neural Networks: Phase Transitions of Margin Dynamics. [ arXiv:1810.03389 ]
|
ZHU, Weizhi |
|
11/05/2018, Mon |
Lecture 16: Generative Models and Variational Autoencoders. [ slides ]
|
Y.Y. |
|
11/07/2018, Wed |
Lecture 17: Generative Adversarial Networks. [ pdf ].
[Reference]
- Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio. Generative Adversarial Networks.
[ arXiv:1406.2661 ]
- Martin Arjovsky, Soumith Chintala, Léon Bottou. Wasserstein GAN.
[ arXiv:1701.07875 ]
- Rie Johnson, Tong Zhang, Composite Functional Gradient Learning of Generative Adversarial Models. [ arXiv:1801.06309 ]
|
Y.Y. |
|
11/12/2018, Mon |
Lecture 18: A Walk Through Non-Convex Optimization Methods: [ A: Online PCA ]
[ B: SPIDER ]
[ Speaker ] Dr. Junchi Li, Tencent AI Lab and Princeton University
[ Abstract ] In this talk, I will briefly discuss theoretical advances in non-convex optimization methods stemming from machine learning practice.
I will begin with (perhaps the simplest) PCA model and show that scalable algorithms can achieve a rate that matches the minimax information lower bound.
Then, I will discuss scalable algorithms that escape from saddle points, the importance of noise therein, and how to achieve an $\mathcal{O}(\varepsilon^{-3})$ convergence rate for finding an $(\varepsilon, \mathcal{O}(\varepsilon^{0.5}))$-approximate second-order stationary point.
If time permits, I will further introduce a very recent "Lifted Neural Networks" method that is non-gradient-based and serves as a powerful alternative for training feed-forward deep neural networks.
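As a toy illustration of the online PCA part of the talk, here is an Oja-style stochastic iteration (my own sketch, not the speaker's near-optimal algorithm); the data distribution, step size, and sample count are assumptions made for the example.
```python
import numpy as np

rng = np.random.default_rng(0)
d, T, eta = 10, 5000, 0.01
# Synthetic streaming data whose top principal direction is e_1.
cov_sqrt = np.diag([3.0] + [1.0] * (d - 1))
w = rng.standard_normal(d)
w /= np.linalg.norm(w)

for t in range(T):
    x = cov_sqrt @ rng.standard_normal(d)  # one streaming sample
    w += eta * (x @ w) * x                 # stochastic (Oja) gradient step
    w /= np.linalg.norm(w)                 # project back to the unit sphere

print(abs(w[0]))  # should be close to 1: aligned with the true top eigenvector
```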
[ Bio ] Dr. Junchi Li obtained his B.S. in Mathematics and Applied Mathematics at Peking University in 2009, and his Ph.D. in Mathematics at Duke University in 2014. He has since held several research positions, including visiting postdoctoral research associate at the Department of Operations Research and Financial Engineering, Princeton University. His research interests include statistical machine learning and optimization, scalable online algorithms for big data analytics, and stochastic dynamics on graphs and social networks. He has published original research articles in both top optimization journals and top machine learning conferences, including an oral presentation paper (1.23%) at NIPS 2017 and a spotlight paper (4.08%) at NIPS 2018.
[ Reference ]
- Junchi Li, Mengdi Wang, Han Liu, and Tong Zhang.
Near-Optimal Stochastic Approximation for Online Principal Component Estimation.
Mathematical Programming 2018. [ arXiv:1603.05305 ]
- Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, and Praneeth Netrapalli.
Faster Eigenvector Computation via Shift-and-Invert Preconditioning.
ICML 2016.
- Rong Ge, Furong Huang, Chi Jin, and Yuan Yang.
Escaping from Saddle Points.
COLT 2015.
- Jason Lee, Max Simchowitz, Michael Jordan, and Ben Recht.
Gradient Descent Only Converges to Minimizers.
COLT 2016.
- Zeyuan Allen-Zhu and Yuanzhi Li.
NEON2.
NIPS 2018.
- Cong Fang, Junchi Li, Zhouchen Lin, and Tong Zhang.
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator.
NIPS 2018. [ arXiv:1807.01695 ]
- Jia Li, Cong Fang, and Zhouchen Lin.
Lifted Proximal Operator Machines.
AAAI 2018.
- Armin Askari, Geoffrey Negiar, Rajiv Sambharya, and Laurent El Ghaoui.
Lifted Neural Networks.
arXiv:1805.01532.
- Fangda Gu, Armin Askari, and Laurent El Ghaoui.
Fenchel Lifted Networks: A Lagrange Relaxation of Neural Network Training.
arXiv:1811.08039.
|
Y.Y. |
|
11/14/2018, Wed |
Lecture 19: Robust Estimation and Generative Adversarial Networks. [ part A ] [ part B ].
[Reference]
- GAO, Chao, Jiyu LIU, Yuan YAO, and Weizhi ZHU.
Robust Estimation and Generative Adversarial Nets.
[ arXiv:1810.02030 ]
|
Y.Y. |
|
11/19/2018, Mon |
Lecture 20: Seminars
|
Y.Y. |
|
11/21/2018, Wed |
Lecture 21: Machine (Deep) Learning Problems in Cryo-EM. [ slides ].
[Reference]
- Yin Xian, Hanlin Gu, Wei Wang, Xuhui Huang, Yuan Yao, Yang Wang, Jian-Feng Cai. Data-Driven Tight Frame for Cryo-EM Image Denoising and Conformational Classification.
The 6th IEEE Global Conference on Signal and Information Processing, Anaheim, California, Nov 26-29, 2018.
[ arXiv:1810.08829 ] .
- Min Su, Hantian Zhang, Kevin Schawinski, Ce Zhang, Michael A. Cianfrocco.
Generative adversarial networks as a tool to recover structural information from cryo-electron microscopy data.
[ pdf ]
|
Hanlin GU |
|
11/26/2018, Mon |
Lecture 22: An Introduction to Adversarials in Deep Learning. [ slides ]
|
Zhichao HUANG |
|
11/28/2018, Wed |
Lecture 23: Final Project. [ project3.pdf ].
[Reference]
- Introduction to Reinforcement Learning.
[ slides ]
- Recurrent Attention Models.
[ slides ]
|
Y.Y. |
|