CSCI 6405 / ECMM 6014 - Course Project
The project can be done individually or in groups of up to four students.
Project presentations are done individually (groups should break up
their presentation into parts).
CSCI 6405 students can choose to do one of the three kinds of
projects:
- Implementation-based project,
- Theoretical project, or
- Application project
ECMM 6014 students can choose to do one of the four kinds of
projects:
- Implementation-based project,
- Theoretical project,
- Application project, or
- Survey project.
Submission
- Project presentation:
  - Each student is expected to give a short project presentation.
- Project report:
  - Submit a hard copy of your report. Do not exceed 30 pages.
- Code:
  - If you are doing an Implementation-based project or a Theoretical
    project, submit your code and data using the command submit on
    the machine borg. Make sure that your code works on this machine.
Implementation-based project
Implement a DM algorithm, such as association rules mining, Bayesian
text classification, clustering, characterization, etc.
Properly document and test your code.
Write a report giving user, design, and testing documentation.
Describe relevant work and background research, the algorithm,
references, limitations, and possible improvements to the program.
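As a rough illustration of the scale such an implementation might start from, here is a minimal sketch of association rule mining. It is a brute-force counter over small itemsets, not Apriori or any specific algorithm required by the course, and the function names, parameters, and the max_size limit are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of size 1..max_size.

    Brute-force counting (hypothetical helper): every subset of each
    transaction up to max_size is enumerated, so this only suits small data.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    total = len(transactions)
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


def rules(freq, min_confidence):
    """Derive rules lhs -> rhs from frequent pairs.

    confidence(lhs -> rhs) = support(lhs, rhs) / support(lhs).
    Each item of a frequent pair is itself frequent, so the lookup is safe.
    """
    found = []
    for itemset, support in freq.items():
        if len(itemset) != 2:
            continue
        a, b = itemset
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / freq[(lhs,)]
            if confidence >= min_confidence:
                found.append((lhs, rhs, support, confidence))
    return found
```

A full project would replace the brute-force counting with candidate pruning and would add the documentation and tests described above.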
Theoretical project
Research a theoretical problem and write a report in the style
of a scientific paper. Present background work and discuss your
method.
You will likely have to design, implement, and perform one or more
experiments to test your approach. Discuss your results. What are
the conclusions?
Application project
Solve a real-world problem using a DM package.
Find a real-world application (database), develop an application model
by analyzing the business information process, and translate the model
into data mining tasks.
Choose a DM package to use, such as DBMiner or Cognos.
Do necessary data preparation and apply DM tools to mine the data for
useful results.
Write a report describing: the problem, preparation phase, your model
and approach, the procedures used in the package, the obtained
results, analysis and discussion of the results, conclusions, and possible
future directions.
Survey project
Conduct a comprehensive survey of DM by searching the latest
publications and other materials through the Internet and the
library. The minimum number of references: 20. Some examples
are given below.
- Survey on a particular business domain, such as DM in electronic
commerce (preferred), etc.
- Survey on a particular type of application, for instance: fraud
detection, market basket analysis, stock prediction, etc.
- Survey on mining a particular type of knowledge, such as
classification, association, clustering, etc.
- Survey on a particular type of technology, such as decision tree,
clustering algorithms, Bayesian methods, neural networks, etc.
The survey should clearly state the domain task(s) and the main issues,
introduce the solutions and compare them (strengths, limitations,
etc.), present the details of application cases, summarize
lessons and conclusions, etc.
Resources
- URLs listed in Assignment 1
- http://citeseer.nj.nec.com/cs, a library of scientific literature
- Computing Research Repository
- Machine Learning Repository