DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
UNIVERSITY OF CALIFORNIA, SAN DIEGO

CSE 291: Statistical Learning

Winter 2004

 
Here is the extra credit assignment, due Thursday March 18 at 10am.

The final exam will be in SSB 106 (our usual classroom) on Monday March 15, 2004, from 11:50am to 2:30pm.

OVERVIEW

CSE 291 is a graduate lecture course devoted to learning methods based on statistics.  The course will cover mathematical concepts and results as well as algorithms and their analysis.

CSE 291 is open to M.S. and Ph.D. students in computer science, bioinformatics, cognitive science, and related fields.  The course is complementary to other UCSD courses such as Cognitive Science 260 and Math 283 (Statistical Methods in Bioinformatics).  Students are welcome to take any or all of these courses. Unlike CSE 254, CSE 291 is a lecture course.

The prerequisite for CSE 291 is an upper-division undergraduate course on probability and statistics, such as Math 183 or 186 at UCSD, or any graduate course on statistics, pattern recognition, or machine learning.

Students should take CSE 291 for four units, for a letter grade.  For registration, use section id 487957.   Although the section is currently listed as "full," additional students are welcome.

This class meets on Tuesdays and Thursdays, from 11am to 12:20pm, in room 106 in the Social Sciences Building; see this map.  The first meeting was on Tuesday January 6, 2004.  The last lecture will be on Thursday March 11, 2004.

TEXTS AND TOPICS

The main books to be used are Statistical Inference by S. D. Silvey and The Elements of Statistical Learning: Data Mining, Inference, and Prediction by T. Hastie, R. Tibshirani, and J. H. Friedman.  Other books that are recommended include:

Some specific topics that may be covered in CSE 291 include:

The instructor is Charles Elkan, Associate Professor.  Office hours, to be announced, will be held in AP&M room 4856.  If you are unable to attend office hours, feel free to send email to arrange an appointment, or telephone (858) 534-8897.

 

LECTURES

Lectures will be on Tuesdays and Thursdays in room 106 of the Social Sciences Building (SSB 106).  The first lecture will be on Tuesday January 6, and the final lecture will be on Thursday March 11.

Lecture notes for each class meeting will be published here on the class web page, which is found at http://www-cse.ucsd.edu/users/elkan/291.  Lecture notes from Fall 2002 are available. 
 

date          topics
January 6     Reasoning vs. learning, point estimators and their properties
January 8     Mean squared error, unbiasedness, sufficient statistics, statement of Rao-Blackwell theorem
January 13    Proof of Rao-Blackwell theorem, completeness
January 15    Nested expectations lemma, Jensen's inequality, algorithm to find MVUEs
January 20    Discussion of MVUEs, score function, Cramer-Rao lower bound for MVUE variance
January 22    Achieving the CR bound.  Fisher information.  Large-sample maximum likelihood (ML).
January 27    Proof of consistency and efficiency for large-sample MLEs.  Likelihood ratio hypothesis testing (LRT).
January 29    LRT version of the t-test.  Proof that LRT statistic has asymptotic chi-squared distribution.
February 3    LRT origin of standard chi-squared tests for goodness of fit.
February 5    Linear regression: matrix formulation.
...           see http://www-cse.ucsd.edu/users/elkan/291
March 4       Testing multiple hypotheses: the Westfall-Young procedure
March 9       Bootstrap methods: the empirical distribution, confidence interval estimation, hypothesis testing
March 11      Logistic regression, KL distance, Gaussian linear discriminant analysis


ASSIGNMENTS

There will likely be five homework assignments, worth 2/3 of the final grade, and a final examination, worth 1/3.

Each assignment will involve mathematical reasoning and programming in Matlab.  You are encouraged to collaborate on solving the problems posed, and to use any books and other resources you wish, but each student must write up his or her solutions independently.

Your solutions should be written in good, concise English with all necessary diagrams, plots, and explanations.  You must use LaTeX or similar high-quality software for text processing.  On the due date, you should submit a stapled 8.5x11 printout in class.


FINAL EXAMINATION

The final exam will be on Monday March 15, from 11:30am to 2:30pm.

Exam questions will be similar to assignment questions, but easier.  Here are the instructions that will be on the exam. In particular, a calculator will be useful.

"Look through the whole exam and answer the questions that you find easiest first. Answer each question in the space below the question, using the backs of the pages for extra space as necessary. If necessary, you may make assumptions that are reasonable, and that do not make a question trivial. If you do make an assumption, state it clearly.

You may bring and use the following materials:

  • the books recommended or required for this course,
  • one other textbook on probability and statistics,
  • the published lecture and section notes,
  • documents linked to the class web site,
  • your own personal hand-written notes, and
  • a calculator.

You may not use any other materials. Be prepared to share books with other students."


    Most recently updated on March 12, 2004 by Charles Elkan, elkan@cs.ucsd.edu