[ http://web.cs.dal.ca/~vlado/csci6509/coursecalendar.html ]
Fall 2024 (Sep 3-Dec 4), Faculty of Computer Science, Dalhousie University |
# | Date | Title | |
---|---|---|---|
Part I: Introduction | |||
1 | We Sep 4 | Course Introduction
Course introduction: logistics, administrivia, references, evaluation, policies, schedule; Introduction to NLP: natural language and other languages, NLP applications, NLP as a research area, NLP Research Links and NLP Anthology http://aclweb.org/anthology/. Short history of NLP. NLP methodology overview. Levels of NLP. Why NLP is generally hard. Files: Syllabus (PDF), slides, lecture notes. Reading: [JM] Ch.1 | |
2 | Mo Sep 9 | Ambiguities in NLP; Course Project
Ambiguities at different levels of NLP. About Course Project: topics and teams, deliverables, P0, P1, P, R; project types, choosing topic, resources, themes and previous topics. Files: slides, lecture notes. | |
Part II: Stream-based Text Processing | |||
3 | We Sep 11 | Finite Automata Review
Deterministic and Non-deterministic Automata. Review of Deterministic Finite Automata (DFA). Review of Non-deterministic Finite Automata (NFA), and their use in NLP. NFA-to-DFA conversion. Files: slides, lecture notes. Reading: [JM] Ch.2 | A0 out |
L1 | Fr Sep 13 | Lab 1: FCS Computing Environment, Perl Tutorial 1
Logging in using CSID, timberlea environment; Introduction to Perl programming language: basic syntax, variables, string literals, subroutines. Files: lab notes, slides. | |
4 | Mo Sep 16 | Regular Expressions and Perl
Files: slides, lecture notes. Reading: on the timberlea server, `man perlretut` and `man perlre`, or perlretut and perlre | |
Tu Sep 17 | Last day to add/drop courses | ||
5 | We Sep 18 | Basic NLP in Perl
Regular expressions in Perl and basic text processing; Text processing examples: tokenization, counting letters. Elements of Morphology: morphemes, stems, affixes, tokenization, stemming. Files: slides, lecture notes. Reading: Section 3.1 [JM] | A0 due |
L2 | Fr Sep 20 | Lab 2: Perl Tutorial 2
Regular expressions in Perl; basic I/O in Perl. Files: lab notes, slides. | |
6 | Mo Sep 23 | Counting N-grams
Elements of Morphology (continued): lemmatization, morphological processes; Characters, Words, and N-grams: counting words, Zipf's law. Perl examples with n-gram collection. Elements of Information Retrieval: Vector Space Model. Files: slides, lecture notes. Reading: [JM] 23.1 (Information Retrieval), [MS] Ch.15 (Topics in Information Retrieval) | A1 out |
7 | We Sep 25 | Elements of Information Retrieval and Text Mining
Some interesting links: Lucene, IR book by Manning, Raghavan, and Schütze. IR Evaluation: precision, recall, F-measure, precision-recall curve. Interpolated Precision-Recall curve. Text mining. Text Classification: classifier evaluation; precision, recall, and F-measure in classification. Evaluation methods for classification: training error, train-and-test, and n-fold cross-validation. Similarity-based text classification. Files: slides, lecture notes. | |
L3 | Fr Sep 27 | Lab 3: Perl Tutorial 3
Perl: Arrays or lists; associative arrays or hashes; references. Files: lab notes, slides. | |
Fr Sep 27 | P0 Project Topic Proposal due | P0 due | |
Mo Sep 30 | National Day for Truth and Reconciliation, University closed | ||
We Oct 2 | Last day to drop classes without "W", change audit to credit or vice versa. | ||
8 | We Oct 2 | Similarity-based Classification
CNG classification method for authorship attribution. Edit distance: introduction, dynamic programming approach, example, algorithm. Files: slides, lecture notes. | |
L4 | Fr Oct 4 | Lab 4: Git and GitLab Tutorial
Introduction to GitLab and Git; adding and modifying files, setting up SSH key, add, commit, and push commands, checkout; creating branches and working collaboratively, pull, merge, resolving conflicts. Files: lab notes, slides. | |
Part III: Probabilistic and Machine Learning Approach to NLP | |||
9 | Mo Oct 7 | P0 Topics Discussion; Introduction to Probabilistic Modeling
Projects discussion: P-01, P-03, P-04, P-05, P-06. Probabilistic approach to NLP: logical vs. plausible reasoning in AI and NLP; Brief review of elements of probability theory. Bayesian inference, generative models. Probabilistic modeling: random variables, configurations, and models; computational tasks. Files: P0 slides, slides, lecture notes. | |
10 | We Oct 9 | Basic Probabilistic Models; P0 Topics Discussion (2)
Joint distribution model; simulation task; other tasks in joint distribution model; spam example. Fully independent model. Note on efficient sum-product computation and max-product computation. Project discussion: P-02. Files: slides, lecture notes, P0 slides #12. | |
Th Oct 10 | A1 due | A1 due | |
L5 | Fr Oct 11 | Lab 5: Python NLTK Tutorial 1
Introduction to Python: basics, lists, tuples, dictionaries; Introduction to NLTK: tokenization, stop-words, stemming, n-grams, frequency distribution, classification. Files: lab notes, slides. | |
Mo Oct 14 | Thanksgiving Day, University closed | ||
11 | We Oct 16 | Naive Bayes Model; P0 Topics Discussion (3)
Fully-independent model finished. Naive Bayes model: definition, assumption, graphical model, computational tasks with example, number of parameters, pros and cons, variations and practical issues. Project discussions: P-07, P-08, P-09, P-11, P-12, P-13. Files: slides, lecture notes, P0 slides #14. | |
L6 | Fr Oct 18 | Lab 6: Python NLTK Tutorial 2
Part-of-speech taggers in NLTK: HMM and CRF, Brill tagger; Named entity chunking; Jupyter and using JupyterHub. Files: lab notes, slides. | |
12 | Mo Oct 21 | P0 Topics Discussion (4); N-gram Model
Project discussions: P-14, P-15, P-16, P-17, P-18, P-19, P-20, P-21, P-23, P-24. N-gram model: language modeling, N-gram model assumption, graphical representation. Files: P0 slides, slides, lecture notes. Reading: [JM] Ch.4 N-Grams | |
13 | We Oct 23 | N-gram Model Smoothing; POS Tags and Hidden Markov Model
Files: slides, lecture notes. | |
L7 | Fr Oct 25 | Lab 7: P0 Submission help; TeX/LaTeX Tutorial (not marked)
Files: slides. | |
14 | Mo Oct 28 | Neural Networks and NLP | P1 due |
15 | We Oct 30 | Deep Learning Approaches to NLP | |
Th Oct 31 | Last day to drop classes with "W" | ||
L7 | Fr Nov 1 | Lab 7: Python Tutorial with PyTorch | |
Part IV: Parsing (Syntactic Processing) | |||
16 | Mo Nov 4 | Parsing NLP | |
17 | We Nov 6 | Natural Language Syntax | |
Mo Nov 11 | Remembrance Day, University closed | ||
Mo Nov 11 | Fall Study Break Nov 11-15, no classes, University open except Mon | ||
18 | Mo Nov 18 | DCG and PCFG | |
19 | We Nov 20 | Typical Phrase Structure of English | |
L8 | Fr Nov 22 | Lab 8: Prolog Tutorial 1 | |
20 | Mo Nov 25 | Heads and Dependency, NL Phenomena | |
21 | We Nov 27 | Typical Phrase Structure Rules in English | |
L9 | Fr Nov 29 | Lab 9: Prolog Tutorial 2 | |
Part V: Student Presentations | |||
Mo Dec 2 | Student Presentations | ||
Tu Dec 3 | Student Presentations (Monday schedule used) | ||
We Dec 4 | Student Presentations (Monday schedule used) | ||
We Dec 4 | Classes end, Report due | Report due | |
Final Exam | |||
Th Dec 12 | Final Exam (8:30-10:30am)
Final exam, duration 2 hours, starting at 8:30am, location TBA. Exam schedule URL: https://www.dal.ca/exams/halifax-exam-schedule.html | F.Exam |