Project Rubric
##############################

Project Proposal
**************************

Each **individual student** will submit a project proposal (3% of final grade) which:

1. (1%) Describes and motivates a real-world problem where data science may provide helpful insights. Your description should be easily understood by a casual reader and include citations to motivating sources or relevant information (e.g. news articles or further reading links; Wikipedia makes for a poor reference, but the links it cites are usually promising).
2. (1%) Explicitly loads and shows your dataset. Provide a data dictionary which explains the meaning of each feature present. Demonstrate that this data is sufficient to make progress on the real-world problem described above.
3. (1%) States, in one or two sentences, how the data will be used to solve the problem. **This early in the semester, we won’t have studied the Machine Learning methods yet, but you should have a general idea of what the ML will set out to do.** For example:

   * “We’ll cluster the movies into sets of movies which are often watched by the same users. Doing so allows us to discover if there is a more natural grouping of movies than the traditional genres: horror, comedy, romantic-comedy, etc.”

.. note::
    Proposals must be submitted in ipynb format. The name of this file will be shared with other students as the title of the project on the course website during team formation, so please make it short and descriptive (avoid "project proposal.ipynb" or similar).

Data and Analysis Plan
***************************************

Each **project team** will submit a single Data and Analysis Plan (5% of final grade) which:

1. (1%) Expresses the central motivation of the project in one or two sentences. This may evolve a bit through the project.
2. (3%) Builds two visualizations (graphs) from the data which characterize the distribution of the data itself in some interesting way.
   Your visualizations will be graded based on how much information they effectively communicate to readers. Please make sure your visualizations are sufficiently distinct from each other.
3. (1%) Discusses what ML tools will be used. It's expected that many (most) analyses will fail to produce the results you're after. Be sure that your ML plans are flexible:

   - challenging goals should build in "fallback" milestones in the event the stretch goal doesn't work out as planned
   - otherwise, include alternate analyses which might also be interesting

Final Report
***************************************

Each **project team** will submit a single Final Report (11% of final grade) which includes:

- **Executive summary** (2%) Summarizes the main results in a paragraph of no more than 6 sentences which is easily understood by a lay reader. Link to the graphs or sections later in the report which support each of the claims given in the executive summary.
- **Introduction** (2%) A final, polished version of Project Proposal parts 1 and 3 above.
- **Data Description** (1%) A final, polished version of Project Proposal part 2 as well as Data and Analysis Plan part 2.
- **Method** (2%) Discuss which Machine Learning method you're applying and how it is valuable towards your initial goal.
- **Results** (2%)

  - Apply the ML method(s) of your choice and graph the results.
  - Include at least one graph of your ML results. All graphs should be “self-contained” in that the title, axis labels, legend, caption, or other graph elements are sufficient to understand the meaning of the graph without referring to the text of the document.
  - Beautiful software which is clearly documented and elegantly implemented.

- **Discussion** (2%)

  - Interpret your results in the context of the application. Are they ground-breaking or expected? Would this be a surprise to folks in this field? Should your results be accepted at face value? Why or why not? (e.g. is there any dataset bias or methodological issue?)
  - Takeaway: In your opinion, after the analysis has been conducted, what actions (if any) might be taken as a result of your project? How confident are you that these actions are justified? What questions might you have to answer before such an action is taken?
  - Discuss the most salient ethical implication of your project.

Project Presentation
***************************************

All project presentations (6% of final grade) will be delivered as video recordings in which the entire group walks the audience through their project.

- 1% will be awarded, in full, to all students who submit thoughtful, earnest feedback to their peers via an anonymous Google Form.
- 5% will be assigned by Prof Higger, who adjusts and synthesizes peer feedback scores.

Some tips:

* your presentation needn't share the entirety of your project; students often try to fit too much into the limited time given. Find the one or two most interesting results and plan backwards: what will you need to tell the audience to bring them to understanding?
* audience attention is often overestimated, so spend it wisely:

  * full sentences on slides distract from your spoken voice
  * lots of code is overwhelming and distracting
  * you needn't show any code in your presentation if you don't want to

I suggest using Zoom’s record feature to record your group walking through a few PowerPoint / Jupyter RISE slides, but other formats are welcome too so long as they can showcase your DS work. A strong video can be useful to you for years to come, as it is accessible to folks who only have a few moments to look at the project (such as those who found the video link on your resume ...)
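As an illustration of Project Proposal part 2 (load and show the dataset, provide a data dictionary), a proposal notebook might contain something like the minimal sketch below. It assumes pandas is available. The inline CSV, the column names, the descriptions, and the helper names ``load_dataset`` and ``undocumented_columns`` are all hypothetical placeholders rather than required names; a real proposal would read its own file instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for a real file; a proposal would typically call
# pd.read_csv on the path to its own dataset instead.
RAW_CSV = """movie_id,title,year,avg_rating
1,Example Movie A,1999,3.5
2,Example Movie B,2005,4.1
3,Example Movie C,2012,2.8
"""

# Data dictionary: one plain-language description per column (all hypothetical).
DATA_DICTIONARY = {
    "movie_id": "Unique integer identifier for each movie",
    "title": "Title of the movie",
    "year": "Year the movie was released",
    "avg_rating": "Mean user rating on a 0 to 5 scale",
}


def load_dataset(source):
    """Load the dataset from a path or file-like object into a DataFrame."""
    return pd.read_csv(source)


def undocumented_columns(df, data_dictionary):
    """Return the columns of df that have no entry in the data dictionary."""
    return [col for col in df.columns if col not in data_dictionary]


df = load_dataset(io.StringIO(RAW_CSV))

# Show the dataset: its shape and its first few rows.
print(df.shape)
print(df.head())

# Show the data dictionary as a table, next to each column's dtype.
dictionary_table = pd.DataFrame(
    {
        "description": pd.Series(DATA_DICTIONARY),
        "dtype": df.dtypes.astype(str),
    }
)
print(dictionary_table)

# Any column listed here still needs a description.
missing = undocumented_columns(df, DATA_DICTIONARY)
print(missing)
```

In a notebook, ending a cell with ``df.head()`` or ``dictionary_table`` renders the same information as a formatted table, which is usually easier to read than printed output.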