CHARLES ELKAN

Professor
Department of Computer Science and Engineering 0404
University of California, San Diego

elkan@cs.ucsd.edu

(858) 534-8897  office
(858) 534-7029  fax


 
 

RECENT TEACHING

quarter year number title
Spring 2008 291 Web-Scale Information Retrieval and Data Mining
Fall 2007 250B Principles of Artificial Intelligence: Learning
Fall 2007 151 Introduction to Artificial Intelligence: Learning
Spring 2007 92 Reading and Writing in Computer Science    (Spring 2005 version)
Spring 2007 254 Seminar on Learning Algorithms: Log-Linear Models and Conditional Random Fields    (Spring 2005 version)  (Spring 2003 version)  (Spring 2002 version)  (Spring 2001 version)
Spring 2007 259 AI seminar
Winter 2005 291 Statistical Learning    (Winter 2004 version)  (Fall 2002 version)
Fall 2004 150 Introduction to Artificial Intelligence    (Winter 2004 version)
Fall 2002 134A Web Service Design and Programming    (Fall 2001 version)   (Spring 2001 version)
Spring 2002 130 Programming Languages: Principles and Paradigms      (Fall 1999 version)
Fall 2001 Cog Sci 200 Historical and Conceptual Foundations of Cognitive Science
Winter 2001 250A Principles of Artificial Intelligence: Search and Reasoning
Spring 2001 190 Seminar on Computers in Society


AWARDS AND HONORS

Award for first place out of 43 entries in the CoIL Challenge 2000 data mining competition.  For a discussion of the contest, see the paper Magical Thinking in Data Mining: Lessons From CoIL Challenge 2000.

Award for first place out of 45 entries in the data mining competition at the International Conference on Knowledge Discovery in Databases (KDD'97), August 1997.  For a description of the method used, see the paper Boosting and Naive Bayesian Learning.

Honorable mention, best-written paper competition, National Conference on Artificial Intelligence (AAAI'93), July 1993, for The Paradoxical Success of Fuzzy Logic.

Best paper award, IEEE Conference on Artificial Intelligence for Applications (CAIA'93), March 1993, for Categorization-Based Diagnostic Problem Solving in the VLSI Design Domain, with A. Hekmatpour.
 
 

THESES SUPERVISED

Amir Hekmatpour, Ph.D. 1993.
Timothy L. Bailey, Ph.D. 1995.
Karan Bhatia, M.S. 1995.
Charles Chu, M.S. 1996.
William Riordan, M.S. 1996.
David Martinez, M.S. 1996.
Michael Sussna, Ph.D. 1997.
Alvaro Monge, Ph.D. 1997.
Timothy Leek, M.S. 1997. Thesis: Information Extraction Using Hidden Markov Models.
William S. Noble, Ph.D. 1998 (name changed from Bill Grundy).
Brian J. Chan, B.S. magna cum laude, Harvard College, 1999.  Thesis: Comparing and Enhancing Computational Methods for Predicting Splicing Enhancer Locations.
Greg Hamerly, Ph.D. 2003.
Bianca Zadrozny, Ph.D. 2003.
David Kauchak, Ph.D. 2006.
Doug Turnbull, Ph.D. 2008.


 

PROFESSIONAL SERVICE

Organizer, UCSD data mining contest sponsored by Fair Isaac, 2004, 2005, 2006.

Keynote talk "Clustering with k-means: faster, smarter, cheaper" at the Workshop on Clustering High-Dimensional Data, SIAM International Conference on Data Mining (SDM 2004).

Invited talk "What are the real challenges in data mining?" at the Workshop on Learning from Imbalanced Data Sets, ICML, August 21, 2003.

Organizer, 1999 KDD conference data mining competitions on knowledge discovery and classifier learning.

Program committee member, 1999, 1998 and 1997 International Conferences on Knowledge Discovery in Databases (KDD'97, '98, '99), 1998 and 1996 National Conference on Artificial Intelligence (AAAI'96, '98), 1995 International Conference on Machine Learning (ML'95), and many other conferences.

Organizer and program co-chair, AAAI Spring Symposium on Knowledge Assimilation, March 1992.
 
 

SELECTED PAPERS

C. Elkan and K. Noto.  Learning Classifiers from Only Positive and Unlabeled Data.  To appear in Proceedings of the Fourteenth International Conference on Knowledge Discovery and Data Mining (KDD'08).  Data available here.

C. Elkan.  Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution (pdf).  In Proceedings of the Twenty-Third International Conference on Machine Learning (ICML'06).

C. Elkan.  Deriving TF-IDF as a Fisher kernel (pdf).  In Proceedings of the International Symposium on String Processing and Information Retrieval (SPIRE'05), Buenos Aires, Argentina, November 2005, pp. 296-301.

R. Madsen, D. Kauchak, and C. Elkan.  Modeling Word Burstiness Using the Dirichlet Distribution (pdf).  In Proceedings of the Twenty-Second International Conference on Machine Learning (ICML'05).

C. Elkan.  Using the Triangle Inequality to Accelerate k-Means (pdf).  In Proceedings of the Twentieth International Conference on Machine Learning (ICML'03), pp. 147-153.  Software available here.

E. Wiewiora, G. Cottrell, and C. Elkan.  Principled Methods for Advising Reinforcement Learning Agents (pdf).  In Proceedings of the Twentieth International Conference on Machine Learning (ICML'03), pp. 792-799.

G. Hamerly and C. Elkan.   Alternatives to the k-Means Algorithm That Find Better Clusterings (pdf).  In Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM'02), pp. 600-607, November 2002.

G. F. Hughes, J. F. Murray, K. Kreutz-Delgado, and C. Elkan.  Improved Disk-Drive Failure Warnings (pdf).  IEEE Transactions on Reliability, vol. 51, no. 3, pp. 350-357, September 2002.

D. Kauchak, J. Smarr, and C. Elkan.  Sources of Success for Information Extraction Methods (pdf, 36 pages) (postscript). Technical Report No. CS2002-0696, January 2002, UCSD.

C. Elkan.  Magical Thinking in Data Mining: Lessons From CoIL Challenge 2000 (postscript) (pdf).  In Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining (KDD'01), pp. 426-431.

B. Zadrozny and C. Elkan.  Learning and Making Decisions When Costs and Probabilities are Both Unknown (postscript) (pdf).  In Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining (KDD'01), pp. 204-213.

B. Zadrozny and C. Elkan.  Obtaining Calibrated Probability Estimates from Decision Trees and Naive Bayesian Classifiers (pdf).  In Proceedings of the Eighteenth International Conference on Machine Learning (ICML'01), pp. 609-616.

G. Hamerly and C. Elkan.  Bayesian Approaches to Failure Prediction for Disk Drives (postscript) (pdf).  In Proceedings of the Eighteenth International Conference on Machine Learning (ICML'01), pp. 202-209.

C. Elkan.  The Foundations of Cost-Sensitive Learning (postscript) (pdf). In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI'01), pp. 973-978.

C. Elkan.  Paradoxes of Fuzzy Logic, Revisited (postscript) (pdf). International Journal of Approximate Reasoning, vol. 26, no. 2, pp. 153-155, 2001.

B. Zadrozny and C. Elkan.  Learning and Making Decisions When Costs and Probabilities are Both Unknown (pdf) (postscript, 3 megabytes) (gzip postscript). Technical Report No. CS2001-0664, January 2001, UCSD. Note: If you have difficulty downloading this paper, please try right-clicking and choosing Save Link As.  Also try downloading the postscript version instead of the PDF version, or vice versa.

F. Farnstrom, J. Lewis, and C. Elkan.  Scalability for Clustering Algorithms Revisited (postscript).  ACM SIGKDD Explorations, vol. 2, no. 1, pp. 51-57, August 2000.  Software available here.

C. Elkan. Cost-Sensitive Learning and Decision-Making When Costs Are Unknown.  Presented at the Workshop on Cost-Sensitive Learning of the International Conference on Machine Learning (ML'2000), Stanford University, California, June 2000.

M. E. Baker, W. N. Grundy, and C. Elkan.  A common ancestor for a subunit in the mitochondrial proton-translocating NADH:ubiquinone oxidoreductase (complex I) and short-chain dehydrogenases/reductases.  Cellular and Molecular Life Sciences, vol. 55, no. 3, pp. 450-455, 1999.

C. Elkan. Boosting and Naive Bayesian Learning.  Technical Report No. CS97-557, September 1997, UCSD. First version May 1997.

A. E. Monge and C. Elkan. An efficient domain-independent algorithm for detecting approximately duplicate database records (pdf) (ps).  SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD'97), May 1997, Tucson, Arizona.

A. M. Segre and C. Elkan. Exploratory analysis of speedup learning data using expectation maximization.  Artificial Intelligence, vol. 85, no. 1-2, pp. 301-319, August 1996.

T. L. Bailey and C. Elkan. The Value of Prior Knowledge in Finding Motifs with MEME.  In Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology (ISMB'95), pp. 21-29.  Cambridge, England, July 1995.  Software available here.

T. L. Bailey and C. Elkan. Unsupervised Learning of Multiple Motifs in Biopolymers using Expectation Maximization.  In Machine Learning, vol. 21, no. 1-2, pp. 51-80, 1995.

A. M. Segre and C. Elkan. A High-Performance Explanation-Based Learning Algorithm.  Artificial Intelligence, vol. 69, no. 1, pp. 1-50, September 1994.

C. Elkan. The Paradoxical Success of Fuzzy Logic. IEEE Expert, pp. 3-8, August 1994. With fifteen responses on pp. 9-46. First version in AAAI'93 proceedings, pp. 698-703.

C. Elkan. The Paradoxical Controversy over Fuzzy Logic. IEEE Expert, pp. 47-49, August 1994.

T. L. Bailey and C. Elkan. Fitting a Mixture Model by Expectation Maximization to Discover Motifs in Biopolymers.  In Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology (ISMB'94), pp. 28-36. Stanford, California, August 1994.

T. L. Bailey and C. Elkan. Estimating the Accuracy of Learned Concepts.  In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI'93), pp. 895-900. Chambéry, France, September 1993.

A. Hekmatpour and C. Elkan. Categorization-Based Diagnostic Problem Solving in the VLSI Design Domain.  In Proceedings of the Ninth IEEE Conference on Artificial Intelligence for Applications (CAIA'93), pp. 121-127. Orlando, Florida, March 1993. IEEE Press.

C. Elkan. Reasoning about Action in First-Order Logic.  In Proceedings of the Ninth Biennial Conference of the Canadian Society for Computational Studies of Intelligence (CSCSI'92). Vancouver, Canada, May 1992. Morgan Kaufmann Publishers, Inc.

For a complete list of refereed papers, see this curriculum vitae (pdf).


Most recently updated on June 18, 2008 by Charles Elkan.