CSCI 6505.03 Machine Learning
Instructor: Evangelos E. Milios
FLU PANDEMIC ADVISORY
Weekly schedule -- lectures
Virtual class attendance
Machine Learning is the area of Artificial Intelligence concerned with the problem of building computer programs that automatically improve with experience. The intent of this course is to present a broad introduction to the principles and paradigms underlying machine learning, including discussions of each of the major approaches currently being investigated. Main topics covered in the course include a review of information theory, unsupervised learning or clustering (the K-means family, co-clustering, mixture models and the EM algorithm), supervised learning or classification (support vector machines, decision trees, rule learning, Bayesian learners, maximum entropy, ensemble methods), feature selection and feature transformations.
The applications discussed will focus on text classification and clustering.
|Assignments (a1, a2 equally weighted)||30%|
|Class Participation - Reading short reports||10%|
|Course project (Discussion paper guidelines)||30%|
Students with disabilities are encouraged to register with Student Accessibility Services as soon as possible if they want to receive academic accommodations. To do so, please phone 494-2836, e-mail access@dal.ca, drop in at the Killam, G28, or visit our website at www.studentaccessibility.dal.ca.
A course on Probability and Statistics is essential. General mathematical aptitude would also help.
[CB] is recommended, but it is not required.
[CB] Christopher Bishop: Pattern Recognition and Machine Learning
Springer, 2006, ISBN: 0387310738
[ITILA] David J. C. MacKay: Information Theory, Inference and Learning Algorithms
Cambridge University Press, 2003, ISBN: 0521642981 (pdf file available on the Web for on-screen reading only)
[TK] S. Theodoridis, K. Koutroumbas
Academic Press, 3rd edition, 2006, 0123695317
[IML] Ethem Alpaydin
Introduction to Machine Learning
The MIT Press, October 2004, ISBN 0-262-01211-1
[ESL] Trevor Hastie, Robert Tibshirani, Jerome Friedman
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Springer-Verlag, (August 2001), ISBN: 0387952845
[LV] John Lee, Michel Verleysen: Nonlinear
Springer Verlag, 2007, ISBN: 978-0-387-39350-6 (print), online (978-0-387-39351-3) (pdf file available on SpringerLink)
[ACD] A. Jain, R. Dubes: Algorithms for Clustering Data
Prentice Hall, 1988 (out of print, pdf file available on the Web)
[BL] Bing Liu: Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data
2007, 532 p., 177 illus., Hardcover, ISBN-10: 3-540-37881-2, ISBN-13: 978-3-540-37881-5
[SC] Soumen Chakrabarti:
Mining the Web: Discovering Knowledge from Hypertext Data
Hardcover - 344 pages - 9 October 2002
Morgan Kaufmann Publishers; ISBN: 1 55860 754 4
[SK] Scott Spangler and Jeffrey Kreulen
Mining the Talk: Unlocking the business value in unstructured information
ISBN-10: 0-13-233953-6; ISBN-13: 978-0-13-233953-7;
Published: Jul 19, 2007; Copyright 2008; Dimensions 7x9-1/4; Pages: 240; Edition: 1st.
(freely readable on Safari)
[TS] T. Segaran:
Programming Collective Intelligence: Building smart Web 2.0 applications
O'Reilly, Aug. 2007, 1st ed., ISBN 0-596-52932-5
(freely readable on Safari)
[FS] Ronen Feldman and James Sanger
The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data
Cambridge University Press, 2007
(ISBN-13: 9780521836579 | ISBN-10: 0521836573)
[WF] Ian H. Witten, Eibe Frank:
Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations (2nd edition)
Paperback - 525 pages (Jun. 2005)
Morgan Kaufmann Publishers; ISBN: 0-12-088407-0
Introduction to Machine Learning, Nils J. Nilsson (Draft of a Proposed New Textbook available on the Web)
Machine Learning, Tom Mitchell, McGraw Hill, 1997.
Manning & Schuetze: Foundations of Statistical NLP, MIT Press, 1998, ch. 14,
pp. 495-527 (tot. pages: 680, ISBN: 0 262 13360 1)
Manning, Raghavan and Schuetze: Introduction to Information Retrieval, Cambridge University Press, 2008.
R. Sutton, A. Barto: "Reinforcement Learning, an Introduction", MIT
Online version: http://www-anw.cs.umass.edu/~rich/book/the-book.html
Stuart Russell and Peter Norvig: Artificial Intelligence: A Modern Approach
Publisher: Prentice Hall; 2nd edition (December 20, 2002) ISBN: 0137903952
Hardcover: 1132 pages
Nils J. Nilsson: "Artificial Intelligence : A New Synthesis", Morgan Kaufmann Publishers (April 1998); ISBN: 1558604677
AAAI and IJCAI conference proceedings
European Conference on Artificial Intelligence proceedings
Communications of the ACM
ACM SIGMOD Int. Conf. on Management of Data
Knowledge Engineering Review
Artificial Intelligence (Elsevier Publishers)
IEEE Trans. on Systems, Man and Cybernetics
IEEE Internet Computing
IEEE Transactions on Knowledge and Data Engineering
IEEE Transactions on Software Engineering
Journal of Artificial Intelligence Research
International Conference on Machine Learning
International Journal of Applied Artificial Intelligence
Int. Journal of Cooperative Information Systems
Note: The majority of the articles required for the course are available from the authors' web sites for downloading.
10 challenging problems and 10 most influential algorithms in data
IEEE International Conference on Data Mining (ICDM)
Related Courses at Dalhousie U.
CSCI 4144 & CSCI 6405: Data Mining and Data Warehousing. Focus on link to databases, pre/post processing.
MATH 5500. Statistical Data Mining: Statistical treatment of supervised and unsupervised learning.
CSCI 6501. Intelligent Systems.
CSCI 6504. Software Agents: Reinforcement learning, grammar induction (not offered)
CSCI 6506. Genetic Algorithms and Programming. In-depth treatment of this topic.
CSCI 6507. Artificial Neural Networks (not offered)
CSCI 6508. Computational Neuroscience.
CSCI 6509. Advanced Topics in Natural Language Processing. Uses machine learning techniques (Hidden Markov Models, Naive Bayes, Clustering).
Machine Learning Resources on the Web:
UCI Machine Learning Repository
GMD Machine Learning Archive
Reinforcement Learning Repository at UMass
Scilab: a free scientific software package (a free equivalent of Matlab)
Octave (a free GNU project alternative to Matlab)
WEKA: Data Mining Software in Java
RapidMiner (formerly YALE: Yet Another Learning Environment; incorporates WEKA)
The R project for statistical computing (free comprehensive statistical programming environment)
---- TM: Text Mining in R
---- R/Weka: an R interface to WEKA
Dragon toolkit for Language Modeling, Text Retrieval, and Text Data Mining
LingPipe (NLP applications of ML)
Topic Modelling Toolbox (Matlab) using Latent Dirichlet Allocation (LDA)