Class schedule
(Warning: the class schedule will evolve during the course)
IEEE Electronic Library
ACM Digital Library: http://www.acm.org/dl/
UTexas @ Austin Large-scale Data Mining course by I. Dhillon
UTexas @ Austin Machine Learning course by R. Mooney
UofA Machine Learning course by Russ Greiner
UBC Machine Learning course by Kevin Murphy
MIT Machine Learning course (from OpenCourseWare)
CMU Machine Learning course by T. Mitchell and A. Moore
Tom Mitchell's Machine Learning textbook (incl. lecture slides and chapters of the 2nd edition)
Statistical NLP resources
Machine Learning for Information Retrieval
MLnet Online Information Service
Machine Learning Summer School 2005
Pascal (Pattern Analysis, Statistics, Modeling and Computational Learning) Project Workshops
Chris Bishop's tutorials
Andrew Moore's tutorials (excellent introductions to basic concepts), home page
Andrew Ng's course at Stanford
Class  Date  Topic  Reading  Assignments / Comments 
1  10/9  I. Background Probability, Entropy 
Bishop ch. 1, 2 ITILA, ch. 2, 8, 23, WP 

2  15/9  Dependent random variables, prob. distributions 
Bishop ch. 1 ITILA, ch. 1, 4, 5, 6, WP, WP, WP 

3  17/9  Information theory  Entropy Probabilistic inference 
Bishop ch. 1 ITILA, ch. 3, 4 
ass. 1 out 
4  22/9  II. Clustering Agglomerative, hierarchical 
Nilsson ch. 9, [ACD] ch. 3, [IML] ch. 7.7, [ESL] ch. 14 WP 

5  24/9  K-means, K-medoids, X-means  Bishop ch. 9 [ITILA] ch. 20, 22, [IML] ch. 7.3 WP Pelleg & Moore, X-means: Extending K-means with Efficient Estimation of the Number of Clusters, ICML 2000 

6  29/9  Probabilistic methods  Mixtures of Gaussians 
Bishop ch. 9 [ITILA] ch. 22, [IML] ch. 7.2, 7.4 WP 

7  1/10  Probabilistic methods  EM algorithm 
Bishop ch. 9 [ITILA] ch. 22, [IML] ch. 7.2, 7.4 WP 

8  6/10  Principal Component Analysis (PCA), Probabilistic PCA 
Bishop ch. 12 [IML] ch. 6, [ESL] ch. 14.5, WP WP 

9  8/10  Curve fitting, model selection, curse of dimensionality  ass. 1 due ass. 2 out 

10  13/10  III. Supervised learning Definition, issues, accuracy Linear models for classification 
Bishop ch. 4.1 [IML] ch. 2 

11  15/10  Nearest-neighbour learners Evaluating classifiers 
Bishop ch. 2.5.2 [IML] ch. 8 [IML] ch. 4, 14 
project proposal due (max. 1 page) 
12  20/10  Probabilistic classification models Model comparison Feature selection 
Bishop ch. 4.2, 4.3, 4.4 [IML] ch. 6.6 [IML] ch. 3 

13  22/10  Kernel methods SVMs 
Bishop ch. 6, 7.1   Scaling large margin classifiers for spoken language understanding, 2006, Haffner  Support vector clustering, JMLR 2001, Ben-Hur et al. 
14  27/10  Reading Presentations (CiteULike) Ensemble methods in classification 
1. Ensemble Learning, Dietterich 2002 2. Ensemble Learning, 2009 (Scholarpedia) 
1. gugle 2. gugle 
15  29/10  Reading Presentations Ensemble methods in classification 
3. Boosting and Rocchio applied to text filtering, SIGIR 1998 4. BoosTexter: A boosting-based system for text categorization, Machine Learning 2000 
3. cairns 4. narayan 
16  3/11  no class  
17  5/11  no class 
ass. 2 due  
7/11  Reading Presentations Ensemble methods in classification 
5. Boosting for Text Classification with Semantic Features, 2004 
5. meikuan 6. jrahman 

18  10/11  Reading Presentations Active Learning 
7. Less is More: Active Learning with Support Vector Machines, ICML 2000 8. Employing EM and Pool-Based Active Learning for Text Classification, ICML 1998 
7. chapman 8. denil 
19  12/11  Reading Presentations Active Learning 
9. Combining active learning and relevance vector machines for text classification, 2007 10. Effective Multi-Label Active Learning for Text Classification, KDD 2009 
9. reza 10. yphilip 
14/11  Reading Presentations Active Learning 
11. A literature survey of active machine learning in the context of natural language processing, 2009 12. Efficient Multiclass Boosting Classification with Active Learning, SIAM DM 2007 
11. fabbas 12. alfaro 

20  17/11  Reading Presentations Ensemble methods in clustering 
13. Cluster Ensembles - A Knowledge Reuse Framework for Combining Multiple Partitions, JMLR 2002 
13. reza 14. alfaro 
21  19/11  Reading Presentations Ensemble methods in clustering 
15. A New Efficient Approach in Clustering Ensembles, IDEAL 2007 16. Cumulative Voting Consensus Method for Partitions with a Variable Number of Clusters, PAMI 2008 
15. cairns 16. narayan 
22  24/11  Reading Presentations Ensemble methods in clustering 
17. A Survey: Clustering Ensembles Techniques, 2009 18. Semi-supervised Clustering by Seeding, ICML 2002 
17. fabbas 18. chapman 
23  26/11  Reading Presentations Semi-supervised clustering 
19. Semi-supervised clustering with user feedback, draft, 2003 20. User-Interest-Based Document Filtering via Semi-supervised Clustering, ISMIS 2005 
19. meikuan 20. yphilip 
24  1/12  Reading Presentations Semi-supervised clustering 
21. A Semi-Supervised Document Clustering Algorithm Based on EM, 2005 22. Text Clustering with Extended User Feedback, SIGIR 2006 

25  3/12  Reading Presentations Semi-supervised clustering 
23. Enhancing semi-supervised clustering: a feature projection perspective, KDD 2007 24. Constrained Locally Weighted Clustering, VLDB 2008 
23. zolaktaf 24. zolaktaf 
Project Presentations  
Project due date  
OTHER TOPICS  
Maximum entropy learners  Nigam, Lafferty, McCallum: Using Maximum Entropy for Text Classification, IJCAI 1999. WP 

Latent Semantic Indexing  WP WP  
Co-clustering  Dhillon, Mallela, Modha: Information-theoretic co-clustering, SIGKDD 2003.  
Rule Learning 
[ESL] ch. 14.2 

Vector spaces, projections, eigenvalues/eigenvectors, SVD  WP WP WP WP  
Independent Component Analysis  Hyvarinen, Oja: Independent Component Analysis: Algorithms and Applications, Neural Networks 13(4-5), 2000.
Isbell, Viola: Restructuring Sparse High Dimensional Data for Effective Retrieval, NIPS, 1998.
[ITILA] ch. 34, [ESL] ch. 14.6 

Maximum a posteriori method 
[ITILA] ch. 28, [IML] ch. 4.4, 4.8 WP WP  
Ensemble methods  Dietterich, T. G. (2000). Ensemble Methods in Machine Learning. In J. Kittler and F. Roli (Eds.), First International Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science (pp. 1-15). New York: Springer Verlag.
Dietterich, T. G. (2000). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40(2), 139-158. 

Decision trees  [IML] ch. 9  
Constrained optimization review Linear discriminant classifiers Support Vector Machines 
WP
Convex optimization text
[IML] ch. 10, WP
[IML] ch. 10
Support Vector Machine resources
Kernel machine resources
Cawley, G. C. Matlab Support Vector Machine Toolbox
Tutorial on Support Vector Machines and Kernel Methods, presented at ICML 2001 by Nello Cristianini
C. J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(2), 1998. 

Ensemble methods  Bishop ch. 14  
Decision trees  Bishop ch. 14 [IML] ch. 9 

Graphical models  Bishop ch. 8 