DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
UNIVERSITY OF CALIFORNIA, SAN DIEGO


CSE 250B: Principles of Artificial Intelligence: Learning

Fall 2007

Please ask questions on this message board.

OVERVIEW

CSE 250B is a graduate course devoted to the basic concepts and algorithms of supervised and unsupervised learning from data.  250B is open to Ph.D. and MS students in computer science, cognitive science, and all related areas.  Other prospective participants, including undergraduates, should contact the instructor at elkan@cs.ucsd.edu.  For registration, the section id of CSE 250B is 602195.  In Fall 2007, both 250A (taught by Prof. Lawrence Saul) and 250B will be offered.  Students may take one or both courses: neither is a prerequisite for the other, and there will be little overlap.  

The specific topics discussed in CSE 250B will include, not necessarily in this order,

Two important topics that will likely not be covered are graphical models and reinforcement learning.  The instructor is Charles Elkan, Professor.  Office hours, at times to be announced, will be held in room 4134 of the CSE building.  If you are unable to attend office hours, feel free to send email to arrange an appointment.  

Some topics discussed in class will not be in any textbook, and many will be explained differently, so coming to lectures and taking notes carefully is important.  Examinations will be based mainly on the online lecture notes.

LECTURES

Lectures will be on Tuesdays and Thursdays from 2pm to 3:20pm in the Warren Lecture Hall building, room 2208.  For lecture notes from the Fall 2006 version of 250B, see http://www.cs.ucsd.edu/users/elkan/250Bfall2006.

September 27 Geometry of hyperplanes, perceptron algorithm, biological plausibility. Project 1
October 2 (see above) Perceptron convergence theorem, multilayer perceptron, voted perceptron.
October 4 k-nearest neighbor classification.  Bayes error rate definition.
October 9 Nearest-neighbor-based Bayes error bounds.  Triangle inequality, LAESA algorithm.
October 11 Kernel trick.  Kernelized perceptron algorithm, support vectors.  Polynomial and string kernels.
October 16 Scores versus calibrated probabilities, measuring classifier performance, cross-validation. Project 2
October 18 Supervised learning based on Bayes' rule, the naive Bayes assumption.  Time and space complexity of naive Bayes training.
October 23 Lecture canceled due to fires in San Diego.
October 25 Lecture canceled due to fires in San Diego.
October 30 Principle of maximum likelihood (ML).  ML estimator for a Bernoulli parameter.
November 1 Guidelines for doing projects and writing reports.  ML estimates for Gaussian mean and variance.
November 6 Mixture distributions.  Expectation-maximization (EM) algorithm to train a mixture model. Project 3
November 8 Deterministic annealing.  The general EM algorithm.
November 13 Derivation of EM based on Jensen's inequality.  Conditional likelihood and logistic regression.
November 15 Logistic regression.
November 20 Stochastic gradient ascent/descent.
November 22 No lecture due to Thanksgiving.
November 27 Log-linear models, feature functions, sequence labeling. Project 4
November 29 Midterm review.  Gradient following for training log-linear models.


TEXTBOOKS

The course will not be based on any single book.  The following textbooks are recommended as references:
For a price comparison among web booksellers, use addall.com with the ISBNs.


ASSIGNMENTS AND GRADING

There will be one in-class midterm exam (10% of your overall grade), a final examination (30%), and four project assignments (15% each).  You should do each project with one partner, so individual work will count for 40% of your grade and joint work for 60%.  You are free to change partners, or not, between projects. 
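As a rough illustration of how these weights combine, the following Python sketch computes an overall numerical score.  The function name and the convention that every score is a fraction between 0 and 1 are assumptions made for this example only.  As explained below, there is no fixed correspondence between the resulting number and a letter grade.

```python
# Weights as stated above: midterm 10%, final 30%, four projects at 15% each.
MIDTERM_WEIGHT = 0.10
FINAL_WEIGHT = 0.30
PROJECT_WEIGHT = 0.15
NUM_PROJECTS = 4


def overall_score(midterm, final, projects):
    """Combine component scores into one overall score.

    Each score is assumed to be a fraction in [0, 1] (an assumption of
    this sketch, not a rule of the course).
    """
    if len(projects) != NUM_PROJECTS:
        raise ValueError("expected exactly four project scores")
    individual = MIDTERM_WEIGHT * midterm + FINAL_WEIGHT * final  # up to 40%
    joint = sum(PROJECT_WEIGHT * p for p in projects)             # up to 60%
    return individual + joint


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(overall_score(0.8, 0.7, [0.9, 0.9, 0.9, 0.9]))
```

The two partial sums correspond to the 40% of the grade that comes from individual work and the 60% that comes from joint work.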

Each project will last between two and three weeks and will require coding, experimenting with data, and writing a report.  Using a high-level environment such as Matlab or R is encouraged.  Projects will be graded based exclusively on the written report.  Each pair of partners should hand in their joint report at the start of class on the day that the report is due.  Each day that a report is late will cost 20% of the maximum score available for the project.  Reports will be evaluated using grading criteria similar to those in this form.  Complete academic honesty is always required. 
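The late penalty stated above can be illustrated with a short Python sketch.  The function name is hypothetical, and two details are assumptions rather than course policy: lateness is counted in whole days, and a score cannot fall below zero.

```python
def late_adjusted_score(raw_score, max_score, days_late):
    """Apply a penalty of 20% of the maximum score per day late.

    Assumptions of this sketch (not stated in the course policy):
    days_late is a whole number of days, and the result is floored at zero.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    penalty = max_score * 20 * days_late / 100
    return max(0.0, raw_score - penalty)


if __name__ == "__main__":
    # Hypothetical report: raw score 85 out of 100, handed in two days late.
    print(late_adjusted_score(85, 100, 2))
```

Under these assumptions, a report that is five or more days late receives no credit regardless of its quality.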

The due dates for the four projects will be Thursday October 11, Tuesday November 6 (original deadline extended by one week because of the fires in San Diego), Tuesday November 27, and Thursday December 6 (unofficially, December 13).  The midterm will be in class on Tuesday November 20 and the final exam will be on Thursday December 13 from 3pm to 6pm, room to be announced.  (The last lecture will be on Thursday December 6.)

There is no a priori correspondence between letter grades and numerical scores on the assignments or on the exams.  You can evaluate your performance in the class by comparing your scores with the means and standard deviations, which will be announced.  However, there is also no fixed correspondence between letter grades and standard deviations above or below the mean.  If all students do well in absolute terms, then all students will get a good grade.  

You should not drop CSE 250B just because you are unhappy with the score that you receive on a project.  Instead, you should make an appointment with the instructor to discuss how you can do better on subsequent projects.




Most recently updated on November 30, 2007 by Charles Elkan, elkan@cs.ucsd.edu.