{"cells": [{"cell_type": "markdown", "id": "00102b66", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# DS 2500 - Lesson 0\n", "\n", "Please load this Jupyter Notebook (`day0.ipynb`) on your computer so you can follow along.\n", "- download `day0.ipynb` from the course website [https://course.ccs.neu.edu/ds2500/](https://course.ccs.neu.edu/ds2500/)\n", "- suggestion: make a ds2500 folder on your computer somewhere, throw this in \"notes/day0\"\n", "\n", "### Having trouble opening up the jupyter notebook?\n", "- [Setup instructions](https://course.ccs.neu.edu/ds2500/python_setup.html)\n", " - best bet might be google colab if you haven't started already\n", "- Yes!\n", " - ask a neighbor for help, be sure to share names and make a friend :)\n", "- Nope, it worked for me\n", " - say hello to your neighbors anyways, I'm sure you'll help each other through the semester another time!\n", " \n", "### Content:\n", "\n", "- motivating why clear communication is even more important for DS\n", "- admin\n", " - course website\n", " - ICA\n", "- jupyter\n", " - managing & running cells\n", " - gotchas (when in doubt: restart & run all)\n", "- markdown\n"]}, {"cell_type": "markdown", "id": "a1563984", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Backup Jupyter Notebook: Google Colab\n", "\n", "[https://colab.research.google.com/](https://colab.research.google.com/) can host our `day0.ipynb` for you today. Not a good long-term solution though ...\n", "\n", "- reach out on piazza if you're still struggling\n"]}, {"cell_type": "markdown", "id": "d7fd7b5c", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Meet your neighbors\n", "\n", "Goal: learn the following about 5 neighbors:\n", "- names (ensuring proper pronunciation)\n", "- how you're feeling about this course (e.g. 
nervous, excited, sleepy...)\n", "- a favorite hobby (or any other fun fact)\n"]}, {"cell_type": "markdown", "id": "2e007dc2", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Why \"meet your neighbors\"? \n", "\n", "Friends are a wonderful thing in life, but also:\n", "\n", "... it will super-charge your learning!\n", "\n", "- more folks will be comfortable raising their hand with a concern\n", "- primes relationships to ask questions during short in-class activities\n", "\n", "\n"]}, {"cell_type": "markdown", "id": "a9719af0", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# In Class Activity 0:\n", "\n", "Should we all be wearing masks right now? Convince the class that your response is most correct with the 30 seconds given (hint: use data)\n", "\n", "Write a quick summary of your group's thinking in the \"write your response here\" cell below\n", "\n", "We'll ask for volunteers to share. If you'd like, you can submit any supporting evidence to the \"quick submit\" link at the [top of the course webpage](https://course.ccs.neu.edu/ds2500/) so I can load and display it. 
When uploading, give your file a unique name: your own name is fine, but feel free to use a pseudonym\n", "\n"]}, {"cell_type": "markdown", "id": "8f8ac992", "metadata": {}, "source": ["Write your response here!\n", "\n", "(double click in this cell, so it has a green border, then edit away!)\n"]}, {"cell_type": "markdown", "id": "da0f4e0d", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# What's it take to do great Data Science?\n", "\n", "- clear communication\n", " - crisp, compelling visuals\n", "- beautiful software\n", " - not just a means towards some answer, the software's clarity is evidence itself\n", " - your software should be dynamic\n", " - data source change\n", " - analysis change\n", " - something else\n", "- thoughtful analysis\n", " - machine learning / statistics\n"]}, {"cell_type": "markdown", "id": "ce865fe8", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# How will DS2500 help develop your DS skills?\n", "\n", "- curriculum\n", "\n", "| classes | topic | detail |\n", "|---------|------------------------------|-------------------------------|\n", "| 4 | python | warm-up |\n", "| 3 | python for DS | pandas & plotting |\n", "| 3 | Object Oriented Programming | building complex programs |\n", "| 1 | stats | describing sets of numbers |\n", "| 10 | analysis | Machine Learning in practice |\n", "\n", "- labs\n", "- hws\n", "- project\n"]}, {"cell_type": "markdown", "id": "f99e1a70", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# What's the recipe for DS2500 success?\n", "\n", "- make friends with a few folks\n", " - email\n", "- attend all classes\n", "- start your HW early\n", "- push yourself during lab \n", " - make friends there too!\n", "- utilize course resources:\n", " - office hours\n", " - lab digest\n", " - piazza\n", " - textbooks (they're ok, not my first choice)\n"]}, {"cell_type": "markdown", "id": "aa27435d", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# and also 
... forgive your mistakes to enjoy learning a bit more\n", " - everyone can achieve mastery of these skills, it just takes time\n"]}, {"cell_type": "markdown", "id": "f93f5f86", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Zoom\n", "\n", "- all lessons available on zoom\n", "- all lessons recorded\n", "\n", "- I'm going to focus on the in-person students (this is an in-person class, the zoom is extra)\n", "\n", "\n", "\n", "- no lab will be available on zoom\n", "\n", "\n", "\n", "## You'll learn more effectively in person, please come to class!\n"]}, {"cell_type": "markdown", "id": "a701cf80", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Admin\n", "\n", "Flip through the [course website](https://course.ccs.neu.edu/ds2500/) together.\n"]}, {"cell_type": "markdown", "id": "de40d44c", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["## In Class Assignments\n", "- ICAs are graded on completion/effort (not correctness)\n", " - 0 points: not submitted, or submitted without effort shown\n", " - 1 point: submitted and partially complete or minimal effort shown / non-ICA materials included in submission\n", " - 2 points: submitted, all parts attempted with meaningful effort demonstrated\n", "- ICAs are due @ 11:59 PM EST each night after class\n", " - tip: just submit right after class is over to make sure you get credit\n", "- [We'll drop everyone's lowest ICA score at the end of the semester](https://course.ccs.neu.edu/ds2500/admin_syllabus.html?#in-class-assignments)\n", " - to keep a consistent grading standard we cannot accept any late ICAs\n", " - please do not email me after having missed an ICA because you forgot, this drop is intended to save us both the time\n", "- **Please submit a separate jupyter notebook with only the ICA questions and your solutions**\n", " - I'll show you how to do this at the end of class today\n", " \n"]}, {"cell_type": "markdown", "id": "f62d4678", "metadata": {"slideshow": 
{"slide_type": "slide"}}, "source": ["# Jupyter Notebooks\n", "\n", "Jupyter contains two cell types (cells are these blue / green rectangles):\n", "- markdown\n", " - markdown is a simple text/document formatting language\n", "- code\n", " - python code which can be run in the Jupyter Notebook \n", " - easily edited, modified, verified\n", " \n", "By merging both, Jupyter provides a 'living' document which includes:\n", "- results of analysis\n", "- method of how analysis was done (the code)\n", "- the ability to easily tweak a few things and poke around in an analysis\n", "\n", "For example:\n", " [final project examples](https://course.ccs.neu.edu/ds2500/proj_example.html)\n", " \n", "\n"]}, {"cell_type": "markdown", "id": "72306cb1", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Navigating Jupyter\n", "\n", "- selecting a cell\n", " - select (`↑` and `↓` arrow keys, or single click w/ mouse)\n", "- changing cell type\n", " - keyboard\n", " - `m` for markdown\n", " - `y` for code\n", " - dropdown menu w/ mouse\n", "- add a cell\n", "- remove a cell\n", "- what does running a cell do?\n", " - for markdown cell: renders text\n", " - for python cell: runs the code\n", "- how to run a cell?\n", " - run (`ctrl + enter`, or the Run button w/ mouse)\n", "\n", "(click that little keyboard button on the menu to see keyboard shortcuts for all of the above)\n", "\n", "quick note: I'm using a jupyter notebook extension [rise](https://rise.readthedocs.io/en/stable/) to make this notebook a slideshow. 
(it adds \"Slide Type\" @ top right of each cell)\n"]}, {"cell_type": "code", "execution_count": 1, "id": "aac068db", "metadata": {"scrolled": true}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["asdf\n"]}], "source": ["print('asdf')\n"]}, {"cell_type": "code", "execution_count": null, "id": "6f671f39", "metadata": {"scrolled": true}, "outputs": [], "source": ["\n"]}, {"cell_type": "code", "execution_count": null, "id": "1b606f74", "metadata": {"scrolled": true}, "outputs": [], "source": ["\n"]}, {"cell_type": "markdown", "id": "48139665", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# The Jupyter-Python Gotcha\n"]}, {"cell_type": "code", "execution_count": 2, "id": "5fe04ef1", "metadata": {"scrolled": true}, "outputs": [], "source": ["# seems like scale_it() is a function which multiplies a value by 4\n", "def scale_it(x):\n", "    return 4 * x\n"]}, {"cell_type": "code", "execution_count": 3, "id": "fced9d58", "metadata": {"scrolled": true}, "outputs": [{"data": {"text/plain": ["50"]}, "execution_count": 3, "metadata": {}, "output_type": "execute_result"}], "source": ["# then how come when we input 5 to scale_it(), the output is 50?\n", "scale_it(5)\n"]}, {"cell_type": "markdown", "id": "186b3a4b", "metadata": {}, "source": ["## The state of variables and functions may depend on previous cells which have since been modified or deleted.\n", "\n", "This can be problematic as `.ipynb` files are saved with the outputs of each cell!\n", "\n", "Mitigate the issue by:\n", "- observing the idx in `In [idx]` and `Out [idx]`\n", "\n", "Best practice:\n", "\n", "- Do a fresh `Kernel>Restart & Run All`\n", " - before sharing\n", " - when debugging\n", "\n", "**Note**: this is required of all your submissions for this class\n"]}, {"cell_type": "markdown", "id": "823ce296", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Jupyter Output\n"]}, {"cell_type": "code", "execution_count": 4, "id": 
"4e31e791", "metadata": {"scrolled": true}, "outputs": [{"data": {"text/plain": ["103"]}, "execution_count": 4, "metadata": {}, "output_type": "execute_result"}], "source": ["# by default jupyter echos the result of the final line's evaluation\n", "x = 3\n", "x + 10\n", "x + 100\n"]}, {"cell_type": "code", "execution_count": 5, "id": "6602d094", "metadata": {"scrolled": true}, "outputs": [], "source": ["# you can suppress it with ;\n", "x + 10;\n"]}, {"cell_type": "code", "execution_count": 6, "id": "c5f450b7", "metadata": {"scrolled": true}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["hey, does this work?\n", "how about this?\n"]}], "source": ["# jupyter reproduces anything printed to the command line, even if its not last\n", "print('hey, does this work?')\n", "print('how about this?');\n"]}, {"cell_type": "markdown", "id": "063b0236", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Markdown\n", "\n", "Markdown is a language which allows you to format text. \n", "\n", "In addition to these class notes, there are many markdown guides online. I like [this one](https://www.markdownguide.org/basic-syntax/)\n"]}, {"cell_type": "markdown", "id": "53790b59", "metadata": {}, "source": ["Plain old text is rendered as plain old text. 
You can use **bold** and *italics* to emphasize where necessary.\n"]}, {"cell_type": "markdown", "id": "ae973308", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Headings\n", "\n", "more #'s yield smaller headings\n", "\n", "# one #\n", "## two #\n", "### three #\n", "#### four #\n"]}, {"cell_type": "code", "execution_count": null, "id": "2d7ea548", "metadata": {"scrolled": true}, "outputs": [], "source": ["\n"]}, {"cell_type": "markdown", "id": "7f48a820", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["## Lists\n", "\n", "here is a list of things I love:\n", "* cycling\n", "- python\n", "- open source software\n", " - a\n", " - b\n", " - x\n", " - y\n", " - z\n"]}, {"cell_type": "markdown", "id": "4d3e3655", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["## Links\n", "[this is a markdown cheat sheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) which contains a more complete markdown reference\n"]}, {"cell_type": "markdown", "id": "533543b6", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["## Images\n", "\n", "![wikipedia](https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png)\n", "\n", "![what a cutie!](https://i.ibb.co/nD03WJf/baby-cool.png)\n", "\n", "- supports referencing a local file or any image on the web\n", "- strong preference: upload image to web (else your image file must travel with `.ipynb` ... yucky)\n", " - [https://imgbb.com/](https://imgbb.com/)\n", "\n", "Use HTML syntax to control its size:\n", "\n", "\n"]}, {"cell_type": "markdown", "id": "4c7e4dce", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["## Tables\n", "\n", "| Car Repair | Cost ($) | Prob | Salted Roads? 
|\n", "|------------------------------------|----------|------|---------------|\n", "| None | 0 | .9 | No |\n", "| Oxygen sensor replacement | 250 | .01 | No |\n", "| Under car rust repair | 1000 | .02 | Yes |\n", "| Timing Belt Replacement | 750 | .03 | No |\n", "| Fuel cap replacement or tightening | 25 | .03 | No |\n", "| rusted muffler repair | 250 | .01 | Yes |\n", "\n", "| asdf | Mon | oaiudsfiausdh | Weds | asofdhuaso | Fri |\n", "|:---------:|:---:|:--------------------:|:----:|:----------:|:---:|\n", "| toddler a | 0 | 0 | 1 | 0 | 1 |\n", "| toddler b | 0 | dsaf;laksjdf;laskjdf | 1 | 0 | 1 |\n", "| toddler c | 0 | 0 | 0 | 0 | 0 |\n", "\n", "Tables can be tough to generate by hand. Go ahead and use a [table generator](https://www.tablesgenerator.com/markdown_tables) online to save yourself some time.\n"]}, {"cell_type": "markdown", "id": "157835b1", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["## Block quote\n", "\n", "> This is a blockquote\n", "\n", "## Python code for display (not for running) \n", " \n", "\n", "```python\n", "import numpy as np\n", "rng = np.random.default_rng(seed=0)\n", "```\n"]}, {"cell_type": "markdown", "id": "2c684776", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["## Latex Math\n", "\n", "[latex](https://www.latex-project.org/) is a language which typesets math:\n", "\n", "$$ \\sum_{i=0}^n a_i = \\frac{a_0 + a_n}{2} (n + 1) $$\n", "\n", "There are [equation editors online](https://latex.codecogs.com/eqneditor/editor.php) which can help you get started with latex. (We won't use it much in DS2500, but nice to have!)\n"]}, {"cell_type": "markdown", "id": "62fac588", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# In Class Exercise 1\n", "\n", "Re-introduce yourself to your neighbors by writing a quick markdown biography of yourself. 
Be sure to use:\n", "- 2 different heading levels\n", "- a markdown table\n", "- a list\n", "- a link to some website\n", "- an image\n", " - avoid pictures of yourself please\n", " - link to something available online, see example above\n", "\n", "You're welcome to be funny, this is really an excuse to get warmed up with jupyter and markdown and meet each other. \n", "\n", "When you're done, swap laptops with 2 classmates who will add their own silly positive review, praising and encouraging whatever you've shared. \n", "\n", "Please be mindful:\n", "- everything you share should make all classmates feel safe and welcome\n", "- your response should be positive, take the moment to make somebody else smile and feel good :)\n"]}, {"cell_type": "markdown", "id": "d4a4a128", "metadata": {"slideshow": {"slide_type": "slide"}}, "source": ["# Let's get these ICAs on Gradescope before we forget ...\n", "\n", "- Please submit a separate jupyter notebook with only the ICA questions and your solutions\n", "\n", "- (note to self: demo gradescope submission w/ @gmail account)\n", " - gspwd\n"]}], "metadata": {"celltoolbar": "Slideshow", "kernelspec": {"display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6"}}, "nbformat": 4, "nbformat_minor": 5}