This class provides an example-based introduction to deep learning using the Keras library. Monday lectures will focus on the general background of different machine learning techniques, including convolutional neural networks, recurrent neural networks, and long short-term memory, as well as applications in image recognition, control, and natural language processing. Wednesday lectures will provide an overview of relevant tools for data acquisition and processing, followed by student-driven presentations of selected research papers and homeworks.
Each homework and each research paper reviewed in class must be summarized in an interactive Jupyter notebook, hosted online, that lets the reader experience the content by example. The final deliverable for the class is a report on an independent research project, consisting of a Jupyter notebook that demonstrates a technique learned in class on a real-world data set.
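To give a sense of what such an example-based notebook might contain, here is a minimal sketch of a complete Keras training example. It assumes TensorFlow 2.x and its bundled Keras API; the dataset (MNIST) and hyperparameters are illustrative choices, not course requirements.

```python
# Minimal sketch of a self-contained Keras example (assumes TensorFlow 2.x).
import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# A small fully connected classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),  # regularization (cf. Srivastava et al., 2014)
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",  # cf. Kingma and Ba, 2014
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```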
Lectures
MW 4:30-5:45 in ECES 114
Textbook
- Required: Deep Learning with Keras by Antonio Gulli and Sujit Pal
- Recommended: Pattern Recognition and Machine Learning by Christopher Bishop
Reading
- Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. and Salakhutdinov, R., 2014. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), pp. 1929-1958. http://jmlr.org/papers/v15/srivastava14a.html (presented by Annelise Lynch, February 12)
- Kingma, D.P. and Ba, J., 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. https://arxiv.org/abs/1412.6980 (presented by Divya Athoopallil, February 19)
- Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X. and Metaxas, D.N., 2017. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks. In Proceedings of the IEEE international conference on computer vision (pp. 5907-5915) (presented by Trevor Grant, February 26)
- Redmon, J., Divvala, S., Girshick, R. and Farhadi, A., 2016. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779-788) (presented by Ashwin Vasan, March 11)
- Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K. and Zettlemoyer, L., 2018. Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237 (presented by Joewie Koh, April 1)
- Devlin, J., Chang, M.W., Lee, K. and Toutanova, K., 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (presented by Soumyajyoti Bhattacharya, April 6)
- He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask R-CNN. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969) (presented by Mutian Yan, April 8)
- Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D. and Riedmiller, M., 2013. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (presented by Arturo Freydig Avila, April 13)
- Vinyals, O., Toshev, A., Bengio, S. and Erhan, D., 2015. Show and tell: A neural image caption generator. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3156-3164) (presented by Tetsumichi Umada, April 13)
- Ribeiro, M.T., Singh, S. and Guestrin, C., 2016. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135-1144) (presented by Adam Resnick, April 15)
- Sun, R., 2019. Modern Statistical/Machine Learning Techniques for Bio/Neuro-imaging Applications (Doctoral dissertation, Columbia University) (presented by Salil Rabade, April 15)
- Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D. and Wierstra, D., 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (presented by Caleb Escobado, April 20)
- Ronneberger, O., Fischer, P. and Brox, T., 2015. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 234-241). Springer, Cham (presented by Galen Pogoncheff, April 20)
Projects
Your final project should show innovation in at least one of data preprocessing or network architecture. Straightforward applications of standard processing pipelines to standard network architectures are not acceptable. Possible projects include:
- Train a convolutional neural network on depth data, e.g. from an Intel RealSense, to augment object recognition (see the sketch after this list)
- Combine the output of a YOLO or MASK R-CNN classifier with a word embedding to understand scene context
- Classify force/torque data from an assembly task to estimate whether the assembly is succeeding or failing
- Build a pipeline for gaze detection that can fit on an embedded computer
- Participate in a competition on Kaggle or the OpenAI Gym
- Train a detector for holes, pegs and pulleys for assembly
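For the first idea above, a minimal, hypothetical sketch might treat a depth map as a fourth input channel alongside RGB in a small Keras CNN. The input shape, class count, and architecture below are placeholder assumptions, not provided starter code.

```python
# Hypothetical depth-augmented CNN: RGB + depth stacked as a 4-channel input.
import tensorflow as tf

NUM_CLASSES = 10  # placeholder; depends on your object categories

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 4)),  # 3 color channels + 1 depth channel
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```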
Project Presentations
Project presentations will be made using Jupyter notebooks and will be allotted 12 minutes for the presentation and 6 minutes for questions. Project deliverables should contain text, equations, code, and auxiliary figures, as well as references at the end. The goal is to create a self-contained document that describes the methods used in sufficient detail for someone who has taken the class. (You can link to relevant material from the class's GitHub page.) In addition to text, your presentation will require sufficient graphical material to allow the class to follow the concepts during the presentation: for example, use drawings to illustrate your network architecture, use matplotlib panels to illustrate your data, and label the axes of all plots.
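As an illustration of the plotting requirement, a labeled matplotlib figure might look like the following minimal sketch; the training-curve data here is synthetic.

```python
# Synthetic training curves, plotted with labeled axes and a legend.
import matplotlib.pyplot as plt
import numpy as np

epochs = np.arange(1, 21)
train_loss = np.exp(-epochs / 5) + 0.05 * np.random.rand(20)
val_loss = np.exp(-epochs / 6) + 0.08 * np.random.rand(20)

plt.plot(epochs, train_loss, label="training loss")
plt.plot(epochs, val_loss, label="validation loss")
plt.xlabel("epoch")   # always label axes
plt.ylabel("loss")
plt.title("Example training curves")
plt.legend()
plt.show()
```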
- April 22 (Wednesday): Adam, Annelise, Arturo, Ashwin
- April 27 (Monday): Caleb, Divya, Galen, Joewie
- April 29 (Wednesday): Mutian, Salil, Soumyajyoti, Telly, Trevor
Grading
10% In-class participation
12% Homework 1: Implementing a simple classification/regression problem
12% Homework 2: Classification/regression on time series data
26% Jupyter notebook summary of a selected paper
40% Final project
Extra credit: narrated YouTube video
Late policy
As homeworks and projects are submitted to a public repository, late submissions will incur a reduction of one letter grade (A->B, B->C, etc.).
Syllabus
Week 1: Perceptron algorithm
Week 2: MLK Day – Multi-layer networks and back-propagation
Week 3: Deep convolutional neural networks
Week 4: Very deep convolutional networks
Week 5: Generative Adversarial Networks (GAN)
Week 6: Other applications for GANs (WaveNet)
Week 7: Word embeddings
Week 8: Other NLP applications
Week 9: Recurrent Neural Networks (RNN)
Week 10: Long short-term memory (LSTM)
Week 11: Regression networks
Week 12: Autoencoders
Week 13: Reinforcement learning
Week 14: Project
Week 15: Project
Week 16: Project
Please follow this link for additional policies regarding accommodations, classroom behavior, preferred student names and pronouns, the honor code, sexual misconduct, and religious holidays.