CSCI2951-F: Learning and Sequential Decision Making

Brown University
Fall 2017
Prof. Michael L. Littman (mlittman@cs.brown.edu, CIT 301)

Time: TTh 10:30-11:50
Place: Brown CIT 316 (subject to change)
Semester: Fall 2017
Web page: http://cs.brown.edu/courses/cs2951f/

Office hours: By appointment.


General Orientation to the Course: Through a combination of classic papers and more recent work, the course explores automated decision making from a computer-science perspective. It examines efficient algorithms, where they exist, for single-agent and multiagent planning as well as approaches to learning near-optimal decisions from experience. Topics include Markov decision processes, stochastic and repeated games, partially observable Markov decision processes, and reinforcement learning. Of particular interest will be issues of generalization, exploration, and representation. Students will replicate a result from a published paper in the area. Participants should have taken a machine-learning course and should have had some exposure to reinforcement learning from a previous computer-science class or seminar; check with the instructor if unsure. Students should already know how to program. No particular language is required, but Python is gaining interest in the community.

Course goals: Students should understand the main concepts of interest to the field of reinforcement learning, be able to implement standard algorithms and apply them to relevant problems, and be prepared to contribute new results to the field.

Prerequisites: CSCI 1950F or CSCI 1420 or permission of the instructor.

Flipped structure: Students need to watch the lectures recorded by the professor and available online. Class time will be used for problem solving and discussion.

Result replication presentation: Students will form small groups of two to four and select a relevant paper from the literature. Each group will choose a graph in the paper and create an independent implementation that replicates the result. Students often find that important parameters needed to replicate the result are not stated in the paper and that obtaining the same pattern of results is sometimes not possible. Groups will present their work at the end of the semester. Grades are based on the fidelity of the replication (25%), how well the group shows that it understands the original paper (25%), the quality of the presentation itself in terms of clarity and creativity (25%), and the short written report (25%). The grade on this project will represent 50% of the final grade in the class.

Grading: The final grade is derived from per-class quizzes (50%) and the result replication presentation (50%).
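
As a rough illustration of how these weights combine, the Python sketch below computes a final grade from a quiz average and the four project components. The weights come from the text above; the function and key names are hypothetical, and the sketch assumes that all scores are on a 0-100 scale and that the quiz component is summarized as a single average, which the syllabus does not specify.

```python
# Weights taken from the syllabus; all names here are hypothetical.
PROJECT_WEIGHTS = {
    "replication_fidelity": 0.25,
    "paper_understanding": 0.25,
    "presentation_quality": 0.25,
    "written_report": 0.25,
}
FINAL_WEIGHTS = {"quizzes": 0.50, "project": 0.50}


def weighted_score(scores, weights):
    """Return the weighted sum of scores; both dicts must share the same keys."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must have the same components")
    return sum(weights[name] * scores[name] for name in weights)


def final_grade(quiz_average, project_scores):
    """Combine a quiz average (assumed 0-100) with the four project scores."""
    project = weighted_score(project_scores, PROJECT_WEIGHTS)
    return weighted_score(
        {"quizzes": quiz_average, "project": project}, FINAL_WEIGHTS
    )


if __name__ == "__main__":
    example_project = {
        "replication_fidelity": 90,
        "paper_understanding": 80,
        "presentation_quality": 100,
        "written_report": 70,
    }
    # Project score is 85.0, so the final grade is 0.5 * 80 + 0.5 * 85.
    print(final_grade(80, example_project))  # prints 82.5
```

Because the project counts for half of the final grade and each project component is a quarter of the project, each component contributes 12.5% of the final grade under these assumptions.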

Schedule

Topics and Paper Links