Seminar on Neuro-Symbolic Reinforcement Learning

Saarland University — Summer Semester 2022

## Organizers

- Adish Singla: Instructor
- George Tzannetos: Teaching assistant

## Timeline and updates

- **Until 12 April 2022**: Register for the seminar course at https://seminars.cs.uni-saarland.de.
- **15 April 2022**: We have a new mailing list that includes all organizers/tutors. To reach out to us, you should send an email to neurosymbolicrl-s22-tutors@mpi-sws.org (instead of contacting individuals).
- **25 April 2022**: Paper assignments for reading and writing reports are available to students. The list of papers is provided below and split into two batches (Batch-A and Batch-B). Each student will have to write reports for 6 papers.
- **Until 15 May 2022**: After you have been allocated a slot in the seminar, you need to register for the seminar course examination at Saarland University. You should check when the examination registration starts. The registration deadline is 15 May 2022; this is also the deadline to withdraw by letting us know via email.
- **20 May 2022**: Reports for Batch-A papers are due. You can pick any three out of the four papers in this batch for writing reports.
- **10 June 2022**: Reports for Batch-B papers are due. You can pick any three out of the four papers in this batch for writing reports.
- **17 June 2022**: Project details will be finalized by this date. Your presentation will be based on the project along with the paper related to your project.
- **12 Aug 2022**: Report and executable code for the project are due.
- **19 Aug 2022**: Presentation slides are due.
- **Between 22 Aug and 16 Sep 2022**: Final presentations will take place. The exact dates will be finalized in discussion with enrolled students.

## Course structure

The course consists of three main components: (i) research papers, (ii) project, and (iii) final presentation. Each component carries one-third of the final score. There will be no weekly classes. You can reach out to us anytime by sending an email to neurosymbolicrl-s22-tutors@mpi-sws.org. If needed, the tutors will arrange specific meeting times during the semester; further information will be communicated to students via email as the semester progresses.

#### Reading research papers and writing reports

- Each student has to write reports for a total of 6 papers. The list of papers is provided below and split into two batches (Batch-A and Batch-B). You can pick any three out of the four papers from each batch.
- For each of the picked papers, you will have to write a two-page report. The timeline for report submissions is listed above.
- Each report should be submitted as a PDF file by email to neurosymbolicrl-s22-tutors@mpi-sws.org. Name your PDF files lastname_paper#.pdf, where # is the paper number (e.g., lastname_paper1.pdf, lastname_paper2.pdf, lastname_paper4.pdf, and so on).
- Reports should be written in LaTeX using the NeurIPS style files:
  - https://neurips.cc/Conferences/2021/PaperInformation/StyleFiles.
  - You should use the option __\usepackage[preprint]{neurips_2021}__ (non-anonymous preprints).
- The report should include the paper title, the paper number (indexed from 1 to 8) and your full name at the top.

- Structure the report into three sections as follows:
  - Write down a review of the paper, including: (a) a short summary of the paper, (b) a discussion on how the paper extends the state of the art, and (c) the main strengths of the paper.
  - Write down the main weaknesses of the paper and discuss how this paper could be improved.
  - Write down your ideas on how you would like to extend the techniques and results in the paper.

- These reports will correspond to one-third of the final score.
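
The report requirements above can be sketched as a minimal LaTeX skeleton. This is only an illustrative starting point, not an official template: the title text, author name, and section headings are placeholders chosen for this example, and it assumes that `neurips_2021.sty` from the NeurIPS 2021 style files sits in the same folder as the `.tex` file.

```latex
% Minimal report skeleton (a sketch; title, name, and headings are placeholders).
\documentclass{article}

% Non-anonymous preprint option, as requested in the list above.
% Assumes neurips_2021.sty (from the NeurIPS 2021 style files) is in this folder.
\usepackage[preprint]{neurips_2021}
\usepackage[utf8]{inputenc}

% Paper title and paper number (1 to 8) go at the top of the report.
\title{Report on Paper 1: Title of the Reviewed Paper}

% Your full name.
\author{Firstname Lastname}

\begin{document}
\maketitle

\section{Review}
% (a) Short summary, (b) how the paper extends the state of the art, (c) main strengths.

\section{Weaknesses and possible improvements}
% Main weaknesses and how the paper could be improved.

\section{Ideas for extensions}
% How you would like to extend the techniques and results.

\end{document}
```

Compiling this file, for example with `pdflatex`, produces the PDF that is then renamed according to the lastname_paper#.pdf convention before submission.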

#### Project

- The project will involve the implementation of neuro-symbolic reinforcement learning techniques. Project details will be finalized by 17 June 2022.
- You will have to submit a report and executable code for the project. Each student will work on the project separately (no teams).
- The project will correspond to one-third of the final score.

#### Presentations

- You will have to prepare a 25-minute presentation. Your presentation will be based on the project along with the paper related to your project.
- At the end of the semester, you will give a final presentation. We will block about 8 hours of time for the presentations. The exact dates will be finalized in discussion with enrolled students.
- The slides and presentation will correspond to one-third of the final score.

## List of research papers

#### Batch-A

Reports for Batch-A papers are due on 20 May 2022. You can pick any three out of the four papers in this batch for writing reports. Please download the PDF files from the specific links provided above to avoid confusion about different versions.

- Neural Combinatorial Optimization with Reinforcement Learning
  by I. Bello, H. Pham, Q. V. Le, M. Norouzi, S. Bengio (arXiv 2017).
- Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis
  by R. Bunel, M. J. Hausknecht, J. Devlin, R. Singh, P. Kohli (ICLR 2018).
- Reinforcement Learning for Integer Programming: Learning to Cut
  by Y. Tang, S. Agrawal, Y. Faenza (ICML 2020).
- Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization
  by Q. Cappart, T. Moisan, L.-M. Rousseau, I. Premont-Schwarz, A. A. Cire (AAAI 2021).

#### Batch-B

Reports for Batch-B papers are due on 10 June 2022. You can pick any three out of the four papers in this batch for writing reports. Please download the PDF files from the specific links provided above to avoid confusion about different versions.

- Deep Reinforcement Learning with Relational Inductive Biases
  by V. Zambaldi et al. (ICLR 2019).
- Decision Transformer: Reinforcement Learning via Sequence Modeling
  by L. Chen et al. (NeurIPS 2021).
- Verifiable Reinforcement Learning via Policy Extraction
  by O. Bastani, Y. Pu, A. Solar-Lezama (NeurIPS 2018).
- Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning
  by R. T. Icarte, T. Q. Klassen, R. A. Valenzano, S. A. McIlraith (JAIR 2022).
