Lecturer:	Dr.-Ing. Christopher Mutschler, Christoffer Löffler
Workload:	2 SWS (5 ECTS)
Requirements:	Registration by e-mail to christopher.mutschler@fau.de. Requirements for passing: a 40-minute presentation; a short written report on the essential points of the talk (ca. 6-8 pages; copies of slides are not permitted); attendance at the talks of the other participants. Slides must be prepared one week before the presentation, and the report must be completed by the end of the semester.
Comment:	Register with your chosen topic by e-mail before the start of the lectures. Topics are assigned on a first-come, first-served (FCFS) basis.
Date & Location:	Summer Semester 2019. First meeting on April 16th at 14:15 in room 00.010, Carl-Thiersch-Str. 2b, 91052 Erlangen. Seminars on August 3rd and 10th at 10:00 in room 00.010, Carl-Thiersch-Str. 2b, 91052 Erlangen.
Target audience:	WPF INF-MA (> 2.Semester) WPF CE-MA-SEM (> 2.Semester)
Literature:	Giusti, A., Guzzi, J., Ciresan, D. C., He, F. L., Rodríguez, J. P., Fontana, F., … & Scaramuzza, D. (2016). A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots. IEEE Robotics and Automation Letters, 1(2), 661-667. Deisenroth, M. P., Neumann, G., & Peters, J. (2013). A survey on policy search for robotics. Foundations and Trends® in Robotics, 2(1–2), 1-142. Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., … & Kavukcuoglu, K. (2016). Asynchronous methods for deep reinforcement learning. In International conference on machine learning (pp. 1928-1937). Van Hasselt, H., Guez, A., & Silver, D. (2016). Deep Reinforcement Learning with Double Q-Learning. In AAAI (Vol. 2, p. 5). Chua, K., Calandra, R., McAllister, R., & Levine, S. (2018). Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models. arXiv preprint arXiv:1805.12114. Bastani, O., Pu, Y., & Solar-Lezama, A. (2018). Verifiable Reinforcement Learning via Policy Extraction. arXiv preprint arXiv:1805.08328. Lange, S., Gabel, T., & Riedmiller, M. (2012). Batch reinforcement learning. In Reinforcement learning (pp. 45-73). Springer, Berlin, Heidelberg. Recht, B. (2018). A Tour of Reinforcement Learning: The View from Continuous Control. arXiv preprint arXiv:1806.09460. Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., & Abbeel, P. (2017). Domain randomization for transferring deep neural networks from simulation to the real world. In Intelligent Robots and Systems (IROS), 2017 IEEE/RSJ International Conference on (pp. 23-30).
Content:	Reinforcement Learning (RL) is a learning paradigm in which an autonomous agent learns in an environment through trial and error. The agent takes actions and observes the feedback from the environment. If an action leads to a better situation, the agent tends to repeat that behavior; otherwise, it tends to avoid that behavior in the future. The central problem is therefore to select, in any situation, the actions that best lead toward a given goal. In this seminar, students will investigate the key aspects and methods used in today's deep reinforcement learning algorithms.
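
The trial-and-error loop described above can be illustrated with a minimal sketch of tabular Q-learning, one classic RL method. Everything here is an assumption made for illustration and is not part of the seminar material: the five-state chain environment, the function names `step`, `train`, and `greedy_policy`, and the hyperparameter values. The agent starts at the left end of the chain, receives a reward only on reaching the right end, and gradually raises its value estimates for actions that led to better situations.

```python
import random

N_STATES = 5        # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Toy environment (assumed): returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # q[s][i] estimates the return of taking ACTIONS[i] in state s.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial: explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Feedback: move the estimate toward reward + discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-terminal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned greedy policy is expected to choose `+1` (move right) in every non-terminal state. Deep RL methods replace the table `q` with a neural network, but the loop of acting, observing feedback, and updating stays the same.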

We encourage every student to become familiar with basic RL concepts before digging into their topic.
If you are not yet familiar with RL, we recommend the video lectures and course material available online, e.g. at a minimum the first 6 lectures of David Silver’s class on RL:
- http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html
- https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PLzuuYNsE1EZAXYR4FJ75jcJseBmo4KQ9-

The introduction will take place on April 16th, 2019 at 14:15 in room 00.010 (Carl-Thiersch-Str. 2b, 91052 Erlangen).

The seminars start at 10:00 in room 00.010 (Carl-Thiersch-Str. 2b, 91052 Erlangen) on August 3rd and 10th, 2019.

Topics

	Topic	Author	Download Material
0.	Introduction	Christopher Mutschler	DeepRL_Seminar_compressed
August 3rd, 2019
1.	Reinforcement Learning and Continuous Control	Sacha Medaer
	~~Policy Search~~	~~Karthik Shetty~~
2.	Actor-Critic	Sujit Sahoo
3.	Explainable RL (and AI)	Daniel Luge
August 10th, 2019
4.	Advanced Q-Learning	Christian Klose
	~~Batch Reinforcement Learning~~	~~Vishal Sukumar~~
5.	Imitation Learning	Elgiz Bagcilar
	~~Model-based RL~~	~~Srikrishna Jaganathan~~