Seminar on Neuro-Symbolic Reinforcement Learning
Saarland University — Summer Semester 2022
The course provides an overview of research in Neuro-Symbolic Reinforcement Learning. It consists of three main components: (i) research papers, (ii) a project, and (iii) a final presentation. Each component carries one-third of the final score. Further details about the course structure and logistics are provided below.

Organizers

Timeline and updates

  • Until 12 April 2022: Register for the seminar course at https://seminars.cs.uni-saarland.de.
  • 15 April 2022: A mailing list including all organizers/tutors has been set up. To reach us, please send an email to neurosymbolicrl-s22-tutors@mpi-sws.org instead of contacting individuals.
  • 25 April 2022: Paper assignments for reading and report writing are available to students. The list of papers is provided below, split into two batches (Batch-A and Batch-B). Each student will have to write reports for 6 papers.
  • Until 15 May 2022: Once you have been allocated a slot in the seminar, you need to register for the seminar course examination at Saarland University; check when examination registration opens. You must register for the examination by 15 May 2022, which is also the deadline to withdraw by letting us know via email.
  • 20 May 2022: Reports for Batch-A papers are due. You can pick any three out of the four papers in this batch for writing reports.
  • 10 June 2022: Reports for Batch-B papers are due. You can pick any three out of the four papers in this batch for writing reports.
  • 15 June 2022: Project details will be announced by this date.
  • 31 July 2022: Report and executable code for the project are due.
  • 31 July 2022: The assignment of presentations will be finalized. Before the assignment, you will be asked to choose whether to present (a) the project or (b) a paper. If you choose (b), one of the papers will be assigned to you.
  • 15 Aug 2022: Presentation slides are due.
  • Between 16 August and 15 September 2022: Final presentations will take place. The exact dates will be finalized in discussion with enrolled students.

Course structure

The course consists of three main components: (i) research papers, (ii) project, and (iii) final presentation. Each component carries one-third of the final score. There will be no weekly classes. You can reach out to us anytime by sending an email to neurosymbolicrl-s22-tutors@mpi-sws.org. If needed, the tutors will arrange specific meeting times during the semester; further information will be communicated to students via email as the semester progresses.

Reading research papers and writing reports

  • Each student has to write reports for a total of 6 papers. The list of papers is provided below, split into two batches (Batch-A and Batch-B). From each batch, a student can pick any three of the four papers.
  • For each of the picked papers, you will have to write a two-page report. The timeline for report submissions is listed above.
  • Each report should be submitted as a PDF file by email to neurosymbolicrl-s22-tutors@mpi-sws.org. Name your PDF files lastname_paper#.pdf (e.g., lastname_paper1.pdf, lastname_paper2.pdf, lastname_paper4.pdf, and so on).
  • Reports should be written in LaTeX using the NeurIPS style files.
  • Structure the report into three sections as follows:
    • A review of the paper, including: (a) a short summary, (b) a discussion of how the paper extends the state of the art, and (c) the paper's main strengths.
    • The main weaknesses of the paper and a discussion of how it could be improved.
    • Your ideas on how you would like to extend the paper's techniques and results.
  • These reports will correspond to one-third of the final score.

Project

  • The project will involve the implementation of neuro-symbolic reinforcement learning techniques. Project details will be announced by 15 June 2022.
  • You will have to submit a report and executable code for the project. Each student will work on the project separately (no teams).
  • The project will correspond to one-third of the final score.

Presentations

  • You will have to prepare a 25-minute presentation. You will be asked to choose whether to present (a) the project or (b) a paper. If you choose (b), one of the papers will be assigned to you.
  • At the end of the semester, you will give a final presentation. We will block about 8 hours for the presentations. The exact dates will be finalized in discussion with enrolled students.
  • The slides and presentation will correspond to one-third of the final score.

List of research papers

Batch-A

Reports for Batch-A papers are due on 20 May 2022. You can pick any three out of the four papers in this batch for writing reports. Please download the PDF files from the specific links provided above to avoid confusion about different versions.
  1. Neural Combinatorial Optimization with Reinforcement Learning
    by I. Bello, H. Pham, Q. V. Le, M. Norouzi, S. Bengio (arXiv 2017).
  2. Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis
    by R. Bunel, M. J. Hausknecht, J. Devlin, R. Singh, P. Kohli (ICLR 2018).
  3. Reinforcement Learning for Integer Programming: Learning to Cut
    by Y. Tang, S. Agrawal, Y. Faenza (ICML 2020).
  4. Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization
    by Q. Cappart, T. Moisan, L.-M. Rousseau, I. Premont-Schwarz, A. A. Cire (AAAI 2021).

Batch-B

Reports for Batch-B papers are due on 10 June 2022. You can pick any three out of the four papers in this batch for writing reports. Please download the PDF files from the specific links provided above to avoid confusion about different versions.
  1. Deep Reinforcement Learning with Relational Inductive Biases
    by V. Zambaldi et al. (ICLR 2019).
  2. Decision Transformer: Reinforcement Learning via Sequence Modeling
    by L. Chen et al. (NeurIPS 2021).
  3. Verifiable Reinforcement Learning via Policy Extraction
    by O. Bastani, Y. Pu, A. Solar-Lezama (NeurIPS 2018).
  4. Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning
    by R. T. Icarte, T. Q. Klassen, R. A. Valenzano, S. A. McIlraith (JAIR 2022).
