#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Felix Muzny
11/4/2022
DS 2000
Lecture 17 - string manipulation!
(a.k.a. it's time to listen to Taylor Swift)

Logistics:
- Homework 7 is due today @ 9pm
    - Dictionaries are required.
    - I'll be stopping lecture 20 minutes early to answer HW 7 questions
    - We'll pick up on Tuesday from where we get to today :)
- Homework 8 will be released next Tuesday, due two Fridays after that (11/18)
- remote attendance (https://bit.ly/remote-ds2000-muzny)

Three ways to participate (please do one of these!)
1) via the PollEverywhere website: https://pollev.com/muzny
2) via text: text "muzny" to the number 22333 to join the session
3) via Poll Everywhere app (available for iOS or Android)
"""

"""
String processing, text data, and computer programs
---
Let's go play with a chat bot!
http://psych.fullerton.edu/mbirnbaum/psych101/Eliza.htm

What can you say that will "break" this chatbot?
- can't give you any definitive answers
- if you put in certain phrases it will ask if you have psychological problems
- doesn't remember things/facts that you tell her

Natural language processing
- the task of getting computers to respond reasonably to human language
"""

"""
What does a computer "understand" about a string?
---
animal = "turtle"
- there is a variable called animal that is equal to the string "turtle"
- what is a turtle?
- what is an animal?

x = "cat"
y = "dog"
z = "lizard"
- which is most similar to x?
    - compare the length of the strings
    - look up pictures of these things and compare pixels

moral of the story:
- dealing with text data is *hard*
"""

"""
Let's do something fun with some lyrics!
---
We're going to start by creating a visualization where the y-axis is the
line number in the lyrics and the x-axis is the word number within that line.

We'll start by plotting a point where each word exists.

Example:
It's me, hi
I'm the problem, it's me
It's me, hi
I'm the problem, it's me
It's me, hi
Everybody agrees, everybody agrees

Becomes:
x x x
x x x x x
x x x
x x x x x
x x x
x x x x
"""

"""
Stop words
---
- common words that tend not to carry semantic meaning
    - semantics refers to what words symbolize
- examples: the, a, some, most
"""

"""
Sentiment
----
We'll do this in class on Nov 8th!
"""
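
"""
Sketch: word positions for the lyrics plot
---
A minimal sketch of the visualization described above, not the in-class
solution. It turns each lyric line into (x, y) points, where y is the line
number and x is the word number within that line, then draws the same grid
of x's as the example. It can also skip stop words.

Assumptions (none of these names come from the lecture):
- LYRICS, STOP_WORDS, clean_word, word_positions and render_grid are
  hypothetical names made up for this sketch
- words are whatever str.split() gives us when splitting on whitespace
- STOP_WORDS holds only the four example stop words from the notes
"""

import string

LYRICS = [
    "It's me, hi",
    "I'm the problem, it's me",
    "It's me, hi",
    "I'm the problem, it's me",
    "It's me, hi",
    "Everybody agrees, everybody agrees",
]

STOP_WORDS = {"the", "a", "some", "most"}


def clean_word(word):
    """Lowercase a word and strip punctuation from both of its ends."""
    return word.strip(string.punctuation).lower()


def word_positions(lines, skip=None):
    """Return a list of (x, y) points, one for each word that is kept.

    x is the word number within the line and y is the line number,
    both counting from 0. Words whose cleaned form is in skip are left out.
    """
    if skip is None:
        skip = set()
    points = []
    for y, line in enumerate(lines):
        for x, word in enumerate(line.split()):
            if clean_word(word) not in skip:
                points.append((x, y))
    return points


def render_grid(points):
    """Draw the points as text: one row per line number, 'x' per point."""
    if not points:
        return ""
    height = max(y for _, y in points) + 1
    rows = []
    for row in range(height):
        xs = {x for x, y in points if y == row}
        width = max(xs) + 1 if xs else 0
        marks = ["x" if col in xs else " " for col in range(width)]
        rows.append(" ".join(marks).rstrip())
    return "\n".join(rows)


if __name__ == "__main__":
    # every word gets a point, as in the "Becomes:" example
    print(render_grid(word_positions(LYRICS)))
    print()
    # same grid with the example stop words left out
    print(render_grid(word_positions(LYRICS, STOP_WORDS)))
    # The same points could be drawn with matplotlib instead, e.g. by
    # passing the x values and y values to plt.scatter; that part is
    # left out here so the sketch runs without extra libraries.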