Introduction to Computer Vision

Fall 2007

CS 143, M.,W.,F. 11:00-11:50 AM (D Hour), Lubrano Conference Room, CIT

Professor: Michael J. Black

Grad TA: Payman Yadollahpour

Ugrad TA: Teodor Mihai Moldovan

One-stop shopping: link to syllabus, lecture slides, homework, etc.

Other Quick Links: Contact Information - Background Material - Vision Talks


News and Announcements

Solutions to assignment 3 are now available at: '/course/cs143/asgn/asgn3/soln'.

Solutions to assignment 1, all problems, have been placed in the directory '/course/cs143/asgn/asgn1/all_soln'.

Assignment 2, problems 1 & 2 are out. Check the syllabus page. Problem 3 will be posted shortly.

Solutions to assignment 1, problems 1 & 2 have been placed in the directory '/course/cs143/asgn/asgn1/p1_p2_soln'.

Payman's office hours have moved from Wednesdays to Thursdays 3-5pm.

If you are not on the cs143 mailing list and should be (you should be if you are in the class), then send mail to Payman.

We will hold the Matlab tutorial on Friday, September 14, 5 PM to 6 PM, in CIT 165 (Motorola).

Assignment 0 is on the web.

Class will be held during reading week on Dec 10 and 12. The week before (Dec 3-7), however, will be free for working on projects.

Welcome to CS143.   The syllabus is now available.

Short Course Description

How can computers understand the visual world of humans?


This course treats vision as a process of inference from noisy and uncertain data and emphasizes probabilistic and statistical approaches. Topics include perception of 3D scene structure from stereo, motion, and shading; image filtering, smoothing, edge detection; segmentation and grouping; texture analysis;  learning, recognition, and search; tracking and motion estimation.

Prerequisites

Required: CS032 or equivalent programming experience, basic knowledge of linear algebra (e.g. MA52), basic calculus (e.g. MA10), or permission of the instructor. 

Desirable: knowledge of probability and statistics (e.g. CS155, AM0040, AM165, AM169, or AM264).

Who should take the course

This course is designed for undergraduate and graduate students interested in vision, artificial intelligence, or machine learning.  Many of the ideas and techniques used here are also used in other areas of AI (e.g. robotics, natural language understanding, learning).  The course offers a broad introduction to the field, the current problems and theories, the basic mathematics, and some interesting algorithms.  

This course is also a good choice for GRAPHICS students. Vision and graphics share many techniques, and current graphics research uses more and more tools from vision.

It is not an image or signal processing course.  It is also not a robot vision course; we will not be using real-time vision hardware.  While we may touch on human vision, the course is about machine vision.

Students from any department are welcome provided they have the required programming and suggested mathematical background.

If you are unsure whether this is the course for you, please come and talk with me.

Office Hours and Contact information

Professor: Michael Black (please call me Michael)
Email: b l a c k <at> c s . b r o w n . e d u
Office hours: Wednesday 3:00-4:00 and Thursday 3:00-4:00pm, CIT 521.

Grad TA: Payman Yadollahpour
Email: p y a d o l l a  <at> c s . b r o w n . e d u
TA hours: Thursday 3:00-5:00pm, CIT 425.

Ugrad TA: Teodor Mihai Moldovan
Email: m o l d o v a n <at> c s . b r o w n . e d u
TA hours: Monday 3:00-5:00pm, CIT 271.

Course staff email alias: c s 1 4 3 s t a f f <at> c s . b r o w n . e d u

Class Goals

Computers today are limited in their ability to interact with the world and with their human users because they lack the ability to "see". The study of computer vision requires that we understand something about the physics of the world, how light is reflected off surfaces, how objects move, and how all of this information gets projected onto an image by the optics of a camera. It also requires that we devise algorithms to recover, or reconstruct, some of these physical properties from one or more images. This "inverse" problem is a great puzzle. Information is lost when the three-dimensional world is projected onto a two-dimensional image; how can we recover this information from a picture? We will study the mathematics behind this and develop algorithms for solving various inverse problems. But vision is about more than simply reconstructing the 3D world from 2D images; it is about "understanding". We will explore various machine learning techniques and probabilistic inference methods that begin to address this problem.
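To make the loss of information concrete, here is a minimal sketch of perspective projection under an idealized pinhole camera model. The function name `project`, the unit focal length, and the sample points are illustrative assumptions, not course material. The sketch shows that two different 3D points on the same viewing ray land on the same image point, so depth cannot be read off a single image.

```python
def project(point, focal_length=1.0):
    """Pinhole (perspective) projection of a 3D point (X, Y, Z) onto the image plane.

    Assumes an idealized camera at the origin looking down the Z axis;
    focal_length is a hypothetical parameter chosen for illustration.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    # Dividing by depth Z is the step that discards depth information.
    return (focal_length * X / Z, focal_length * Y / Z)


near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same viewing ray as `near`, twice as far away

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Both points produce the same image coordinates. Recovering which one was actually seen requires extra information, such as a second view (stereo), motion, or prior assumptions about the scene.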

In this class you will

  • be exposed to many areas of current computer vision research
  • implement a number of programming assignments to get hands-on experience working with images and image sequences
  • implement a programming project that fits your interests
  • find out that all that linear algebra and calculus you learned is actually useful for something real

Even if you do not go on to study computer vision, the basic tools and techniques we use here will be useful in many other areas.

Syllabus

We will cover

  • Images, cameras, and image formation
  • Image statistics, edges, and texture
  • Regularization, diffusion, and Markov Random Fields
  • Optical flow (image motion): affine flow, regression, dense flow
  • Stereo
  • Tracking
  • Robust statistics
  • Segmentation and grouping
  • Bayesian inference
  • Principal component analysis and eigen-models of objects

Reading

Suggested (but not required) text: Computer Vision: A Modern Approach. David Forsyth and Jean Ponce. Prentice Hall. Three copies will be on short-term loan at the library. The lecture slides and other assigned readings should be sufficient, but the text is a useful resource, and if you want a book this is not a bad choice.

Make sure you get the version that was revised Jan 2003 or later. 

Readings will be from the text and additional material that will be handed out or made available on the web page.   Again, these are not required.

All lecture slides will be available on the web after class.

Materials

The Matlab programming language will be used. It is fairly intuitive and well documented. Students who are unfamiliar with Matlab should go through the on-line tutorial material.  Other resources will be provided in class.

See the Background Material page for useful materials.

Assignments

There will be 4 programming assignments (in Matlab). The goal is for these to help build your understanding of the class material by working with real images. In each assignment you will implement one of the algorithms discussed in class and experiment with it on pre-recorded image data.

You will also have to do one project (4 weeks) of your own choosing. More details, suggestions, and guidelines will be available later in the term.

You will also be expected to read all the assigned material and come prepared to talk about it in class.

Grading

Please read and understand the Collaboration Policy.

Subject to change:

10% Class participation.

15% Homework assignment 1

15% Homework assignment 2

15% Homework assignment 3

15% Homework assignment 4

30% Project
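As a rough sketch of how these weights combine, the snippet below computes a weighted total from the percentages listed above. The weights come from the list, which is subject to change. The 0-100 score scale, the dictionary keys, and the sample scores are assumptions made for illustration, not the official grading procedure.

```python
# Percent weights taken from the list above (subject to change).
WEIGHTS = {
    "participation": 10,
    "assignment1": 15,
    "assignment2": 15,
    "assignment3": 15,
    "assignment4": 15,
    "project": 30,
}


def final_score(scores):
    """Weighted total on a 0-100 scale; `scores` maps each component to 0-100 (assumed)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {
    "participation": 90,
    "assignment1": 80,
    "assignment2": 85,
    "assignment3": 70,
    "assignment4": 95,
    "project": 88,
}
print(final_score(example))
```

The project alone carries as much weight as two homework assignments.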

Graduate Credit

You may receive graduate credit for CS143 but it requires a more complete writeup of the final project and presentation at the end of term.  I will expect the writeup to be in the style of a scientific conference paper (note that it does not need to be publishable).   One thing this will require is a good review of the literature.

Late Policy

Late assignments will not be accepted without prior approval. Get prior approval. No exceptions. I am ruthless.

Brown Vision List

There is an email list for vision-related announcements at Brown (mostly talks). This list is open to anyone at Brown interested in vision. You can subscribe/unsubscribe and browse the archives at http://listserv.brown.edu/archives/cgi-bin/wa?SUBED1=brown-vision&A=1.

There is also a Vision and Learning Seminar series with talks throughout the year.  Announcements go to the Brown Vision List.