Home | Syllabus | Course Topics | Recommended Books |
Course description
Data mining is a practical discipline that combines computer science, statistics, mathematics, and optimization techniques to analyze data and extract valuable knowledge from it. This course covers fundamental data mining concepts and provides hands-on experience with several methods. Students will develop a broad and deep background in data mining, along with the skills needed to solve practical data science challenges. Problems will involve the analysis of real databases from various fields, such as food science, astronomy, human resources, social sciences, and banking. Students are expected to have prior knowledge of the Python programming language.
Course Outcomes
By the end of this course, you will be able to identify the fundamental principles, techniques, and applications of data mining. You will also apply computational and statistical methods to visualize, explore, and prepare data for subsequent analysis. In addition, you will be able to translate real-life problems into supervised or unsupervised learning tasks. Finally, you will be able to apply different classification, prediction, and clustering approaches, evaluate them empirically, and justify your choice of the best one.
Grade | Range | Grade | Range |
A | 93–100% | C | 73–76% |
A- | 90–92% | C- | 70–72% |
B+ | 87–89% | D+ | 67–69% |
B | 83–86% | D | 63–66% |
B- | 80–82% | D- | 60–62% |
C+ | 77–79% | F | Below 60% |
Assignment | Weight |
Class Participation | 5% |
Midterm Exam | 25% |
Homework Assignments | 30% |
Final Project Presentation | 10% |
Final Project Paper | 30% |
Evaluation Activities
Take-home assignments will help students build skills and confidence in the topics each assignment reinforces. The final project will be an open-ended capstone project intended to cover a broader range of course content by implementing a data mining solution for a real problem with real data. We expect a thorough analysis and creative solutions to the problem. The final project can be done individually or in teams; teams, if any, should be formed before the midterm. Details will be provided with the announcement of each course activity.
Class attendance and participation
This class is built around in-class discussion and participation. Attendance is mandatory, and preparing the course material is highly recommended. Attendance includes arriving at, or connecting to, class on time. Classes will combine theory and practice with hands-on activities.
Schedule and Materials:
The schedule is approximate and subject to change!
Week 1: 1/11 Introduction to Data Mining
Week 2: 1/18 Data Analysis & Summarization
Week 3: 1/25 Data Preprocessing and Engineering
Week 4: 2/1 Parameter Estimation
Week 5: 2/8 Association Rule Mining
Week 6: 2/15 Unsupervised Machine Learning
Week 7: 2/22 Supervised Machine Learning
Week 8: 3/1 Logistic Regression: A Precursor to Deep Learning
Week 9: 3/8 Spring Break
Week 10: 3/15 Deep Neural Network Learning
Week 11: 3/22 Midterm Exam
Week 12: 3/29 Practicalities of Machine Learning
Week 13: 4/5 Special Topics
Week 14: 4/12 Industry Day
Week 15: 4/19 Project Presentations
Week 16: 4/26 Project Writeup Submissions Due