[ http://web.cs.dal.ca/~vlado/csci6509/coursecalendar.html ]
Fall 2021 (Sep 7-Dec 7) Faculty of Computer Science Dalhousie University |
# | Date | Title | |
---|---|---|---|
Part I: Introduction | |||
1 | Tu Sep 7 | Course Introduction
Course introduction: logistics, administrivia, references, evaluation, policies, schedule, A0; Introduction to NLP (reading Ch.1 [JM]): natural language and other languages, NLP applications, NLP as a research area, NLP Research Links and NLP Anthology http://aclweb.org/anthology/. Short history of NLP. Files: Syllabus (PDF), slides, lecture notes. Reading: [JM] Ch.1 | A0 out |
2 | Th Sep 9 | Levels of NLP; Course Project
NLP methodology overview; Levels of NLP; Why is NLP generally hard; Ambiguities at different levels of NLP. About Course Project: topics and teams, deliverables, P0, P1, P. Files: slides, lecture notes. | |
Part II: Stream-based Text Processing | |||
L1 | Tu Sep 14 | Lab 1: FCS Computing Environment, Perl Tutorial 1
Logging in using CSID, timberlea environment; Introduction to Perl programming language. Files: lab notes, slides. | |
3 | Tu Sep 14 | Finite Automata and Regular Expressions
About course project (continued): R (report), project types, choosing topic, resources, themes and previous topics. Part II: Stream-based Text Processing: Deterministic and Non-deterministic Automata. (Reading: Chapter 2 [JM]) Review of Deterministic Finite Automata (DFA) and Non-deterministic Finite Automata (NFA), and their use in NLP; NFA-to-DFA conversion. Review of regular expressions. Files: slides, lecture notes. Reading: [JM] Ch.2 | |
4 | Th Sep 16 | Text Processing in Perl
Regular expressions review continued: some regex references, history, examples; Introduction to Perl, main Perl features, program examples, syntactic elements, I/O, regular expressions in Perl. Files: slides, lecture notes. | |
Fr Sep 17 | Last day to add/drop courses | A0 due | |
Sa Sep 18 | A1 out | A1 out | |
L2 | Tu Sep 21 | Lab 2: Perl Tutorial 2
Regular expressions and character n-grams in Perl. Files: lab notes, slides. | |
5 | Tu Sep 21 | Elements of Morphology
More on Perl regular expressions; Text processing examples: tokenization, counting letters. Elements of Morphology: reading: Section 3.1 [JM]; morphemes, stems, affixes, tokenization, stemming, lemmatization; morphological processes. Characters, Words, and N-grams: counting words, Zipf's law, n-grams. Files: slides, lecture notes. | |
6 | Th Sep 23 | Elements of Information Retrieval
Perl examples with n-gram collection. Elements of information retrieval: typical IR system architecture, vector space model. Reading: [JM] 23.1 (Information Retrieval), [MS] Ch.15 (Topics in Information Retrieval). Some interesting links: Lucene, IR book by Manning, Raghavan, and Schütze. IR Evaluation: precision, recall, F-measure, precision-recall curve. Files: slides, lecture notes. Reading: [JM] 23.1 (Information Retrieval), [MS] Ch.15 (Topics in Information Retrieval) | |
L3 | Tu Sep 28 | Lab 3: Perl Tutorial 3
Perl: Arrays or lists; associative arrays or hashes; references. Files: lab notes, slides. | |
7 | Tu Sep 28 | Text Classification
Interpolated Precision-Recall curve. Text mining. Text Classification: classifier evaluation, evaluation measures for text classification, evaluation methods for text classification; Text clustering; Similarity-based text classification: CNG classification method for authorship attribution. Files: slides, lecture notes. | A1 due |
Th Sep 30 | National Day for Truth and Reconciliation, University closed | ||
Fr Oct 1 | Last day to drop classes without "W", change audit to credit or vv. | P0 due | |
L4 | Tu Oct 5 | Lab 4: Git and GitLab Tutorial
Introduction to GitLab and Git; adding and modifying files, setting up SSH key, add, commit, and push commands, checkout; creating branches and working collaboratively, pull, merge, rebase, resolving conflicts. Files: lab notes, slides. | |
Part III: Probabilistic Approach to NLP | |||
8 | Tu Oct 5 | Edit Distance; Probabilistic Modeling
Edit distance: introduction, properties, dynamic programming approach, example, algorithm. Probabilistic approach to NLP: logical vs. plausible reasoning in AI and NLP; Brief review of elements of probability theory. Files: slides, lecture notes. | |
9 | Th Oct 7 | P0 Topics Discussion (1)
Projects discussion: P-01, P-02, P-03, P-04, P-05, P-07, P-08, P-09, P-10, P-11, P-12, P-13, P-14, P-15. Files: P0 slides, slides, lecture notes. | |
Mo Oct 11 | Thanksgiving Day, University closed | ||
L5 | Tu Oct 12 | Lab 5: Python NLTK Tutorial 1
Introduction to Python: basics, lists, tuples, dictionaries; Introduction to NLTK: tokenization, stop-words, stemming, n-grams, frequency distribution, classification. Files: lab notes, slides. | |
10 | Tu Oct 12 | P0 Topics Discussion (2); Probabilistic Modeling
Projects discussion: P-06, P-16, P-17, P-18, P-19, P-21, P-23, P-24, P-26, P-27, P-28, P-29. Probabilistic modeling: probability theory review (continued). Files: P0 slides, slides, lecture notes. | |
We Oct 13 | A2 out | A2 out | |
11 | Th Oct 14 | Probabilistic Modeling
Bayesian inference, generative models. Probabilistic modeling: random variables, configurations, and models; computational tasks; joint distribution model; fully independent model. Files: slides, lecture notes. | |
L6 | Tu Oct 19 | Lab 6: Python NLTK Tutorial 2
Part-of-speech taggers in NLTK: HMM and CRF, Brill tagger; Named entity chunking; Jupyter and using JupyterHub. Files: lab notes, slides. | |
12 | Tu Oct 19 | Naive Bayes Model
Fully independent model (continued); efficient product-sum formula. Naive Bayes model: definition, assumption, graphical model, computational tasks, spam example, additional notes. N-gram model: role in language modeling, assumption (to be continued). Files: slides, lecture notes. | |
13 | Th Oct 21 | N-gram Model
N-gram model as Markov Chain; reading: [JM] Ch4 N-Grams; language model evaluation: perplexity; language modeling in classification; N-gram model smoothing: Laplace and Witten-Bell smoothing. POS tags: introduction, open and closed word categories. Reading: [JM] Ch5 Part-of-Speech Tagging. Open word categories: nouns (NN, NNS, NNP, NNPS), adjectives (JJ, JJR, JJS). Files: slides, lecture notes. Reading: [JM] Ch4 N-Grams. [JM] Ch5 Part-of-Speech Tagging. | |
L7 | Tu Oct 26 | Lab 7: Fetching Tweets with Python
Using Twitter API, Tweepy, retrieving user profile, retrieving tweets, saving tweets into a csv file, preprocessing tweets. Files: lab notes. | |
14 | Tu Oct 26 | POS Tags and POS Tagging
POS tags (continued): verbs (VB, VBP, VBZ, VBG, VBD, VBN), adverbs (RB, RBR, RBS); Closed word categories: DT, WDT, PDT, PRP, PRP$, WP, WP$, IN, RP, POS, MD, TO, RB (closed), WRB, CC, UH; Other POS classes: EX, FW, LS, punctuation, SYM. Examples. Hidden Markov Model (HMM): motivation, definition, HMM assumption, applications, POS tagging. Reading: [JM] Ch. 6 (HMM, first part). Files: slides, lecture notes. Reading: [JM] Ch. 6 (HMM, first part) | A2 due |
15 | Th Oct 28 | HMM as Bayesian Network
P1 requirements. HMM inference: Viterbi algorithm. HMM as a Bayesian Network: BN definition, burglar-earthquake example, computational tasks, brute-force inference in BNs, difficulty of inference in general BNs; Sum-product algorithms: factor graph. Files: slides, lecture notes. | |
Mo Nov 1 | Last day to drop classes with "W" | ||
16 | Tu Nov 2 | Sum-Product (Message Passing) Algorithms
Principles of message-passing algorithms, order of message calculation, cases in message calculation, inference tasks solved by message-passing algorithms, burglar-earthquake example with message passing. Files: slides, lecture notes. | |
17 | Th Nov 4 | Sum-Product HMM Example, Neural Network Models
POS tagging example using message passing algorithm. Neural networks and deep learning: applications, some main developments, large deep learning models, growth in size. Files: slides, lecture notes. | |
Fr Nov 5 | P1 due (postponed) | P1 due | |
Mo Nov 8 | Fall Study Break Nov 8-12, no classes, University open | A3 out | |
Th Nov 11 | Remembrance Day, University closed | ||
Part IV: Parsing (Syntactic Processing) | |||
L8 | Tu Nov 16 | Lab 8: Prolog Tutorial 1 Files: lab notes, slides. | |
18 | Tu Nov 16 | Introduction to Prolog and Unification
Deep learning (continued): biological neuron, perceptron, feed-forward network, activation functions, logistic regression as a simple network, softmax function; model overviews: neural language model, recurrent neural network, stacked and bidirectional RNN, LSTM, self-attention and transformers. Parsing: A brief introduction to Prolog, unification and backtracking. Files: slides, lecture notes. | |
19 | Th Nov 18 | Natural Language Syntax
Prolog (continued): variables, lists, structures; examples: factorial, member. Parsing (Syntactic Processing): Natural language syntax: phrase structure, clauses, sentences; reading: [JM] Ch 12; parsing, parse tree examples. Context-Free Grammars review: definition, parse trees, derivations and other concepts, bracket representation. Using Prolog to parse NL (started). Files: slides, lecture notes. | |
Mo Nov 22 | A3 due | A3 due | |
L9 | Tu Nov 23 | Lab 9: Prolog Tutorial 2 Files: lab notes, slides. | |
20 | Tu Nov 23 | NL Parsing in Prolog, PCFG
Using Prolog to parse NL (continued): Parsing natural language in Prolog using difference lists; Definite Clause Grammars (DCG): example, building parse tree, handling agreement, embedded code. Probabilistic Context-Free Grammars (PCFG): PCFG as a probabilistic model, computational tasks for PCFG model: evaluation, learning, simulation, proper PCFG, expressing PCFGs in DCGs. Reading: [JM] Chapters 13 and 14 (PCFG). CYK chart parsing algorithm, CNF. Files: slides, lecture notes. Reading: [JM] Chapters 13 and 14 (PCFG) | |
21 | Th Nov 25 | CYK Parsing for CFG and PCFG
CYK algorithm: example (continued), algorithm; CYK for PCFG: marginalization, completion; issues with PCFG. Typical phrase structure rules in English (started). Files: slides, lecture notes. | A4 out |
22 | Tu Nov 30 | Natural Language Syntax
Phrase structure in English (continued): NP, VP, PP, ADJP, ADVP; heads and dependency, dependency tree; non-context-free phenomena: agreement, movement, subcategorization. Parser evaluation. Elements of semantics (started). Files: slides, lecture notes. | |
23 | Th Dec 2 | Unification in Syntactic and Semantic Processing Files: slides, lecture notes. | |
Part VI: Student Presentations | |||
Mo Dec 6 | Student Presentations (during day)
12:30-13:00: PT-11* (Haorui, Dongyuan, Yilong, Jingwen), PT-12; 13:30-14:30: PT-13* (Leon), PT-14* (Asad), PT-15, PT-16; 14:45-15:45: PT-17* (Keelin), PT-18* (Justin), PT-19* (Zesheng), PT-20* (Arit); 16:00-17:00: PT-21* (Tongqi), PT-22* (Yixiao), PT-23, PT-24* (Ben, Urmzd); | ||
Tu Dec 7 | Student Presentations (during day)
09:30-10:30: PT-25* (Robert, Julia, Noah, Conor), PT-26* (Jemis, Kirtan), PT-27* (Yaoxin), PT-28* (Emily); 10:45-11:45: PT-29* (Frederik), PT-30* (Keshava), PT-31* (Patrick), PT-32* (Gaurav); 12:00-13:00: PT-33* (Adeolu, Aishik, Grant), PT-34* (Sanjana), PT-35* (Yuqing, Borong, Junqiao, Archer), PT-36* (Parvez); 13:30-14:30: PT-37* (Ohiduzzaman), PT-38* (Sigma), PT-39* (Usmi), PT-40* (Janvi, Bansi, Deep, Sanket); 14:45-15:45: PT-41* (Isaac), PT-42* (Will), PT-43* (Akhilesh), PT-44* (Mayank); 16:00-17:00: PT-45* (Kishan, Narendran, Nirmal, Tejeswi), PT-46* (Siddharth), PT-47* (Temi), PT-48* (Pranav, Yunzhong); | A4 due | |
Tu Dec 7 | Classes end, Monday schedule used, Report due | Reports due | |
Mo Dec 13 | Extended deadline for A4 | A4 | |
We Dec 15 | Extended deadline for Project Report | R | |
Final Exam | |||
Fr Dec 17 | Final Exam (8:30-10:30am)
Final exam, duration 2 hours, starting at 08:30am, On-line. Exams schedule URL: http://www.dal.ca/academics/exam_schedule/halifax_campus_exam_schedule.html | F.Exam |