[ http://web.cs.dal.ca/~vlado/csci6509/coursecalendar.html ]
Winter 2014 (Jan 6-Apr 7), Faculty of Computer Science, Dalhousie University |

# | Date | Title | |
---|---|---|---|

1 |
Mon Jan 6 | Course Introduction
Course information: logistics and administrivia, textbook and other main references, evaluation scheme, academic integrity policy, tentative course schedule. Introduction to NLP: reading Ch.1 [JM], what is a natural language and other kinds of languages; NLP applications, NLP as a research area. NLP Research links and NLP Anthology http://aclweb.org/anthology-new/. Short history of NLP. Handout: Course syllabus. Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Ch.1 | |

2 |
Wed Jan 8 | Introduction to NLP
Levels of NLP, some reasons why NLP is hard, ambiguities at different levels of NLP, examples of lexical and syntactic ambiguities. Files: Slides (PDF), Lecture notes (PDF). | A0 out |

3 |
Fri Jan 10 | About Course Project
Ambiguities at different levels of NLP (continued): syntactic, semantic, pragmatic levels; NLP methodology; about course project: deliverables, P0, P1, P, R; project types, choosing topic, resources. Files: Slides (PDF), Lecture notes (PDF). | |

Part I: Linguistic Background | |||

4 |
Mon Jan 13 | Elements of Morphology
Project discussion (continued): themes and previous topics; Part I: Linguistic background; Words and morphology, reading: [JM] Sec 3.1, Elements of morphology: morphemes, stems, affixes, tokenization, stemming, lemmatization; morphological processes: inflection, derivation, compounding, clitics; Parts-of-speech (POS), reading: [JM] Sec 5.1-5.3, POS tagging, open and closed categories, POS tag sets; Closed word categories, determiners (DT), interrogative determiners (WDT), predeterminers (PDT). Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Sec 3.1, Sec 5.1-5.3 | |

5 |
Wed Jan 15 | Part-of-Speech Tags
Brief introduction to Perl, some examples, handout: NL principles in Perl. Closed word categories (continued): predeterminers (PDT), personal pronouns (PRP), possessive pronouns (PRP$), wh-pronouns (WP) and wh-possessive (WP$), prepositions (IN), particles (RP), possessive ending (POS), modal verbs (MD), infinitive word 'to' (TO), qualifiers (RB), wh-adverbs (WRB), conjunctions (CC), interjections (UH); Open word categories: nouns (NN, NNS, NNP, NNPS), adverbial nouns. Files: Slides (PDF), Lecture notes (PDF). | A0 due |

L1 |
Wed Jan 15 | Lab 1: Perl Tutorial 1
Logging in, bluenose environment; Basic Perl program, about Perl, syntactic elements, variables, string literals, operators, example programs. Files: Slides (PDF), Lab notes (PDF). | |

6 |
Fri Jan 17 | Syntax
Part-of-Speech (POS) tags continued: adjectives (JJ, JJR, JJS), numbers (CD), verbs (VB, VBP, VBZ, VBG, VBD, VBN); adverbs (RB, RBR, RBS); Remaining POS classes: existential there (EX), foreign words (FW), list items (LS), punctuation; POS tagging examples; Syntax: phrase structure, phrases, clauses, sentences; reading: [JM] Ch 12; parsing, parse tree examples. Context-Free Grammars (CFG) review, examples, induced grammar, parse trees, derivations, left-most and right-most derivations. Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Ch 12 | |

7 |
Mon Jan 20 | Typical Phrase Structure Rules in English
Bracket representation of a parse tree; some notes about CFGs; Typical phrase structure rules in English: Sentence (S), Noun Phrase (NP), Verb Phrase (VP), Prepositional Phrase (PP), Adjective Phrase (ADJP), Adverbial Phrase (ADVP). Are NLs context-free? Natural Language Phenomena: agreement, movement, subcategorization; heads and dependency. Files: Slides (PDF), Lecture notes (PDF). | A1 out |

Wed Jan 22 | Snow day, University closed from 12pm, no class and no lab | ||

8 |
Fri Jan 24 | Semantics
Head-feature principle, dependency trees, arguments and adjuncts; Elements of semantics: semantic analysis, lexical semantics, semantic compositionality, semantic roles. Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] 17-17.2 (Representation of meaning), [JM] 18.6 (Idioms and Compositionality), [JM] 19-19.3 (Lexical Semantics and WordNet). | |

Part II: Statistical Approach to NLP | |||

9 |
Mon Jan 27 | Text Mining
Part II: Statistical approach to NLP, logical and plausible reasoning, two paradigms of NLP; Elements of text mining: counting words and n-grams, elements of information retrieval, basic task definition of ad-hoc retrieval, typical IR system architecture, vector space model, IR evaluation measures. Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] 23.1 (Information Retrieval), [MS] Ch.15 (Topics in Information Retrieval). | |

10 |
Wed Jan 29 | Text Classification and CNG
Example with precision-recall curves; Text classification; about text mining, text classification definition, types; creating classifiers; CNG classification method. Files: Slides (PDF), Lecture notes (PDF). Reading: [MS] Ch. 16 (Text Categorization) | |

L2 |
Wed Jan 29 | Lab 2: SVN Tutorial
Files: Slides (PDF), Lab notes (PDF). | |

11 |
Fri Jan 31 | Evaluating Classification
Some implementation topics: letter and word frequencies; evaluation measures in text classification, micro- and macro-averaging, general issues with classification evaluation. Files: Slides (PDF), Lecture notes (PDF). | A1 due |

12 |
Mon Feb 3 | Bayesian Inference
Evaluation methods for classification (continued), Elements of probability theory, generative models, Bayesian inference. Files: Slides (PDF), Lecture notes (PDF). | P0 due |

13 |
Wed Feb 5 | Probabilistic Modelling
Probabilistic modeling: random variables, random configurations, computational tasks in probabilistic modeling, spam detection example, joint distribution model. Handout: cng-paper.pdf. Files: Slides (PDF), Lecture notes (PDF). | |

L3 |
Wed Feb 5 | Lab 3: Perl Tutorial 2
Files: Slides (PDF), Lab notes (PDF). | |

Fri Feb 7 | Munro Day, University closed, no class | ||

14 |
Mon Feb 10 | Naive Bayes Model
Some implementational topics: Perl modules, Ngrams. Fully independent probabilistic model; Naive Bayes model: basic idea, assumption, graphical representation, example, number of parameters. Files: Slides (PDF), Lecture notes (PDF). | A2 out |

15 |
Wed Feb 12 | P0 Projects discussion (1st part)
Files: Slides (PDF), Lecture notes (PDF). | |

L4 |
Wed Feb 12 | Lab 4: Perl Tutorial 3
Files: Slides (PDF), Lab notes (PDF), Additional Perl Slides (PDF). | |

16 |
Fri Feb 14 | P0 Projects Discussion (2nd part), Ngrams Model
P0 projects discussion (2nd part); N-gram model (reading: Chapter 4 of [JM]), n-gram model assumption, graphical representation; Markov chain: stochastic process, Markov process, Markov chain. Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Ch 4 | A2 due |

Mon Feb 17 | Study break Mon-Sun, Feb 17-21 | ||

17 |
Mon Feb 24 | Smoothing, Hidden Markov Model
Markov chain graphical representation, perplexity and evaluation of N-gram models, text classification using language models; Smoothing: add-one (Laplace) smoothing, Witten-Bell smoothing (Witten-Bell discounting); Hidden Markov Model (HMM): graphical representation, applications, formal definition. Files: Slides (PDF), Lecture notes (PDF). | |

18 |
Wed Feb 26 | Hidden Markov Model and Bayesian Networks
HMM assumption, POS tagging example, reading: [JM] Sec. 5.5 (HMM POS Tagging), learning parameters, Viterbi algorithm example. Bayesian Networks: graphical representation, Bayesian networks assumption, conditional probability tables. Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Sec. 5.5 (HMM POS Tagging) | A3 out |

L5 |
Wed Feb 26 | Lab 5: Tutorial on Collecting Twitter Data
A tutorial by Jacek Wolkowicz. | |

19 |
Fri Feb 28 | Sum-product Algorithms
Bayesian Networks: computational tasks; inference in Bayesian Networks, efficient inference; message-passing framework for efficient inference. Files: Slides (PDF), Lecture notes (PDF). | |

20 |
Mon Mar 3 | Sum-product Algorithms (2)
Efficient Bayesian inference: sum-product algorithms for different tasks; alarm example with message passing. Files: Slides (PDF), Lecture notes (PDF). | P1 due |

21 |
Wed Mar 5 | Probabilistic Context-Free Grammar
HMM as a Bayesian Network example. Probabilistic Context-Free Grammar (PCFG). Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Chapters 13 and 14 (PCFG) | |

22 |
Fri Mar 7 | Probabilistic CYK Parsing
Computational tasks for PCFG model: evaluation, learning, simulation, proper PCFG; efficient inference in the PCFG model. Chomsky normal form, CYK algorithm by example. Files: Slides (PDF), Lecture notes (PDF). | |

Part III: Unification-based approach to NLP | |||

23 |
Mon Mar 10 | Introduction to Unification-based Approach to NLP
PCFG Marginalization using CYK-style algorithm, example; PCFG conditioning; PCFG completion using CYK-style algorithm, example; Topics related to PCFGs: PCFG as a BN; Issues with PCFGs: structural dependencies, lexical dependencies; Probabilistic lexicalized CFGs. Parser evaluation. reading: [JM] 14.7 (page 479, Evaluating parsers), Unification-based approach to NLP: bits of history; first-order predicate logic: constants, variables, functions, terms. reading: [JM] 17.3 (First-order Predicate Logic).
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] 14.7 (page 479, Evaluating parsers), [JM] 17.3 (First-order Predicate Logic) | A3 due |

24 |
Wed Mar 12 | Resolution and Unification
First-order predicate calculus (continued): predicates, formulae, sentences, axioms, theorems, proofs, inference rules, examples; Resolution-based inference system by Robinson. Resolution inference example, substitution, classical unification. Files: Slides (PDF), Lecture notes (PDF). | |

L6 |
Wed Mar 12 | Lab 6: Prolog Tutorial 1
Files: Slides (PDF), Lab notes (PDF). | |

25 |
Fri Mar 14 | Unification Algorithms
Unification-related concepts: substitution, unifiers and unifiability, composition of substitutions, most general unifier, Robinson's unification algorithm, exponential running time of Robinson's algorithm, unification using graph representation, Huet's unification algorithm, example. Files: Slides (PDF), Lecture notes (PDF). | |

26 |
Mon Mar 17 | DCG -- Definite Clause Grammars
Huet's algorithm example; Parsing with Prolog: Prolog overview: Horn clauses, rules and facts, running Prolog, overview of more Prolog examples; using difference lists in parsing; Definite Clause Grammars (DCG), DCG example with parse tree, handling agreement, embedded code, expressing PCFG. Files: Slides (PDF), Lecture notes (PDF). | A4 out |

27 |
Wed Mar 19 | Feature Structures
Unification-based grammars using feature structures: reading: [JM] chapter 15 (Features and Unification), feature structures or attribute-value matrices, DCG expressed using AVMs. Lists in AVMs, graph representation of feature structures, re-entrancy in AVMs, cyclic AVMs, PATR-II notation style; feature structure unification, example. Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] chapter 15 (Features and Unification) | |

L7 |
Wed Mar 19 | Lab 7: Prolog Tutorial 2
Files: Slides (PDF), Lab notes (PDF). | |

28 |
Fri Mar 21 | Unification-based Grammars and Chart Parsing
Huet's unification algorithm for feature structures; Example of a unification-based grammar. Files: Slides (PDF), Lecture notes (PDF). | |

Part IV: Course review | |||

29 |
Mon Mar 24 | Course Review (part 1)
Review of the sample exam from last year. Files: Slides (PDF), Lecture notes (PDF). | A4 due |

Wed Mar 26 | Course Review (no class, university closed, snow day)
Files: Slides (PDF), Lecture notes (PDF). | ||

Part V: Student Presentations | |||

Wed Mar 26 |
(LAB) No class, university closed, snow day | ||

30 |
Fri Mar 28 | Student presentations (PT-16*,PT-17*,PT-18*)
PT-16: Owen Davison. PT-17: Jie Mei and Jesse McMinn. PT-18: Hamid Hooshmandi. | |

31 |
Mon Mar 31 | Student presentations (PT-13*,PT-14*,PT-15*,PT-19*,PT-20*)
PT-13: Mateo Yorke. PT-14: Dan Su. PT-15: Lulu Huang. PT-19: Xiangru Wang. PT-20: Dhuha Al-Amiri. | A5 out |

32 |
Wed Apr 2 | Student presentations (PT-10*,PT-11*,PT-12*)
PT-10: Jayde Fanjoy. PT-11: Nathan Lapierre. PT-12: Vicky Cai. | |

33 |
Wed Apr 2 | (LAB) Student presentations (PT-07*,PT-08*,PT-09*)
PT-07: Arash Koushkestani. PT-08: Andrew Sampson. PT-09: Sarah Morash and Cuong Nguyen. | |

34 |
Fri Apr 4 | Student presentations (PT-04*,PT-05*,PT-06*)
PT-04: Yuqing Jiang. PT-05: Mengtao Ji. PT-06: Yixiao Zhu. | |

35 |
Mon Apr 7 | Student presentations (PT-01*,PT-02*,PT-03*)
PT-01: Matthew Thomas and Andrew Sangster. PT-02: Rob Butler. PT-03: Mathew Caines and Thomas Eaton. | Reports due, A5 due |

Mon Apr 14 | Final Exam (15:30-17:30, Dalplex). Exam schedule: http://www.dal.ca/academics/exam_schedule/halifax_campus_exam_schedule.html | Exam |

© 2002-2014 Vlado Keselj, last update: 07-Apr-2014