Deep Reinforcement Learning



Time and place:

Register by e-mail, including a topic request, before the start of the class. Presentation topics are assigned first come, first served (FCFS).

  • Time and place by appointment

Fields of study

  • WPF INF-MA from SEM 2
  • WPF CE-MA-SEM from SEM 2

Prerequisites / Organizational information

Registration via e-mail to

  • Presentation (30-40 minutes)

  • Preparation of a report that covers the main points of the talk (not simply a copy of the slides)

  • Attending the presentations of other students

  • Completion of the slides one week before the talk; completion of the report by the end of the semester


Reinforcement Learning (RL) is a form of learning in which an autonomous agent learns from its environment through trial and error. The agent takes actions and observes the feedback from the environment. If an action leads to a better situation, the agent tends to repeat that behavior; otherwise, it tends to avoid it in the future. The central problem is therefore to select the best action in each situation in order to reach a given goal. In this seminar, students will investigate the key aspects and methods of current deep reinforcement learning algorithms.
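
As an illustration of the trial-and-error loop described above, the following is a minimal sketch of tabular Q-learning on a hypothetical five-state corridor. The environment, the function names (`step`, `train`, `greedy_policy`), and all parameter values are assumptions made for this example and are not part of the seminar material. Deep RL methods typically replace the table used here with a neural network.

```python
import random

# Hypothetical toy environment: states 0..4 on a line, state 4 is the goal.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Environment feedback: next state, reward, and whether the goal was reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values q[state][action_index] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: try a random action.
                a = rng.randrange(len(ACTIONS))
            else:
                # Exploit: pick the best-known action, breaking ties at random.
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Actions that led to better situations get a higher value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

After training, the greedy policy should choose +1 (move right) in every non-terminal state, since only moves toward the goal are eventually rewarded.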

Recommended Literature

- Giusti, A., Guzzi, J., Ciresan, D. C., He, F. L., Rodríguez, J. P., Fontana, F., ... & Scaramuzza, D. (2016). A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots. IEEE Robotics and Automation Letters, 1(2), 661-667.
- Deisenroth, M. P., Neumann, G., & Peters, J. (2013). A survey on policy search for robotics. Foundations and Trends® in Robotics, 2(1–2), 1-142.
- Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., ... & Kavukcuoglu, K. (2016). Asynchronous methods for deep reinforcement learning. In International conference on machine learning (pp. 1928-1937).
- Van Hasselt, H., Guez, A., & Silver, D. (2016). Deep Reinforcement Learning with Double Q-Learning. In AAAI (Vol. 2, p. 5).
- Chua, K., Calandra, R., McAllister, R., & Levine, S. (2018). Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models. arXiv preprint arXiv:1805.12114.
- Bastani, O., Pu, Y., & Solar-Lezama, A. (2018). Verifiable Reinforcement Learning via Policy Extraction. arXiv preprint arXiv:1805.08328.
- Lange, S., Gabel, T., & Riedmiller, M. (2012). Batch reinforcement learning. In Reinforcement learning (pp. 45-73). Springer, Berlin, Heidelberg.
- Recht, B. (2018). A Tour of Reinforcement Learning: The View from Continuous Control. arXiv preprint arXiv:1806.09460.
- Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., & Abbeel, P. (2017). Domain randomization for transferring deep neural networks from simulation to the real world. In Intelligent Robots and Systems (IROS), 2017 IEEE/RSJ International Conference on (pp. 23-30).

Additional information

Expected participants: 10