## DS2500 Day 9

Feb 10, 2023

### Content:
- oop: overloading operators
- oop: class attributes & methods

### Admin:
- hw1 due Friday



# Goal: Build an intuitive and convenient way of measuring time

Student: take a minute to study the interface below. 
- What do you notice? 
- What questions do you have?

`TimeDelta` is a measurement of time between two moments.


```python
# build a 'TimeDelta', representing a period of time
x = TimeDelta(second=100)
assert str(x) == 'TimeDelta(second=40, minute=1, hour=0)'

# build another time delta
y = TimeDelta(second=100, minute=70)
assert str(y) == 'TimeDelta(second=40, minute=11, hour=1)'

# notice: we can add our TimeDelta objects together
assert str(x + y) == 'TimeDelta(second=20, minute=13, hour=1)'

# notice: we can multiply our TimeDelta objects by ints / floats
assert str(x * 100) == 'TimeDelta(second=40, minute=46, hour=2)'
```


### Mea Culpa: We're re-inventing the sundial
To study operator overloading today, let's pretend that Python doesn't already have a [wonderful package to manage time](https://docs.python.org/3/library/datetime.html), so that we can build our own `TimeDelta` object.


# Helpful Reminder: floor division `//` and the modulus operator `%`

Our `TimeDelta` constructor automatically converts every full chunk of 60 seconds in the input `second` into minutes:
```python
# build a 'TimeDelta', representing a period of time
x = TimeDelta(second=100)
assert str(x) == 'TimeDelta(second=40, minute=1, hour=0)'
```

How is this accomplished?


In [2]:
# normal division
100 / 60


1.6666666666666667

In [1]:
# notice: floor division takes the "floor" of the normal division
# (i.e. rounds down to the nearest integer)
100 // 60


1

In [4]:
# we can get the "remainder" of the division operation via the modulus operator
# (i.e. what remains after dividing 100 by 60?)
100 % 60


40
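
Putting the two operators together: a minimal sketch of how a raw count of seconds splits into whole minutes and leftover seconds. The variable names are only illustrative; the built-in `divmod()` returns both results in one call.

```python
# split a raw number of seconds into whole minutes and leftover seconds
total_second = 100

minute = total_second // 60   # whole chunks of 60 seconds
second = total_second % 60    # what remains after removing those chunks
assert (minute, second) == (1, 40)

# divmod() computes the floor division and the remainder in one call
assert divmod(total_second, 60) == (1, 40)

# the same trick carries minutes into hours
hour, minute = divmod(70, 60)
assert (hour, minute) == (1, 10)
```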

# In Class Activity A

Build a TimeDelta object which passes the assert statements below. Be sure to properly document your class definition.


In [8]:
class TimeDelta:
    """ a measurement of time between two moments

    Attributes:
        second (int): seconds between two moments (0 <= second < 60)
        minute (int): minutes between two moments (0 <= minute < 60)
        hour (int): hours between two moments
    """
    def __init__(self, second=0, minute=0, hour=0):
        # compute true seconds
        self.second = second % 60

        # add leftover seconds to input minute
        minute = minute + second // 60

        # compute true minutes
        self.minute = minute % 60

        # add leftover minutes to hour & store
        self.hour = hour + minute // 60

    def __repr__(self):
        return f'TimeDelta(second={self.second}, minute={self.minute}, hour={self.hour})'

In [10]:
# test0
x = TimeDelta(second=100)
assert str(x) == 'TimeDelta(second=40, minute=1, hour=0)'

In [11]:
# test1
y = TimeDelta(second=100, minute=70)
assert str(y) == 'TimeDelta(second=40, minute=11, hour=1)'

In [12]:
# test2
z = TimeDelta(second=3601)
assert str(z) == 'TimeDelta(second=1, minute=0, hour=1)'

# Operator Overloading

**Operator Overloading** is the process of defining custom operation methods (e.g. `+`, `-`, `*`, `/`) for our objects.

Per the given interface below, we see our target `TimeDelta` object has its own `+` and `*` methods:

```python
# build a few TimeDelta
a = TimeDelta(second=1, minute=2, hour=3)
b = TimeDelta(second=4, minute=5, hour=6)

# notice: we can add our TimeDelta objects together
assert str(a + b) == 'TimeDelta(second=5, minute=7, hour=9)'

# notice: we can multiply our TimeDelta objects by ints / floats
assert str(a * 2) == 'TimeDelta(second=2, minute=4, hour=6)'
```


# What is python really doing when it computes `a + b`?
1. identify the method name associated with the operation:
    1. [lookup table here](https://docs.python.org/3/library/operator.html#mapping-operators-to-functions)
    1. append two leading and trailing underscores (e.g. `__add__`)

1. search for the corresponding method in the leftward object
    - e.g. `a + b` is equivalent to `a.__add__(b)`
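
The lookup described above can be watched in action with a toy example. `Meters` is a hypothetical class (not part of the lecture code) whose `__add__` announces when it is called:

```python
class Meters:
    """ hypothetical toy class, used only to watch python call __add__ """
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        print('__add__ was called')
        return Meters(self.value + other.value)


a = Meters(1)
b = Meters(2)

# the + operator and the explicit method call give the same result
c = a + b
d = a.__add__(b)
assert c.value == d.value == 3
```

Strictly speaking, Python looks the method up on the class (`type(a).__add__(a, b)`), which gives the same result here.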

#### (++) a few extras
- If the method from step 2 above is not found (or returns the special value `NotImplemented`), Python searches for a "rightward" (reflected) version of the operation on the object on the right
    - e.g. `a + b` results in `b.__radd__(a)`
    - if that fails too, Python raises a `TypeError`
- Notice that defining each and every operation is a bit redundant, right?
    - i.e. do we really need a distinct operation for `>`, `>=`, `<`, `<=`, `==`?
    - no ... a "complete" subset will do
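
Both extras can be sketched with a toy class. `Meters` is hypothetical (not part of the lecture code). The sketch assumes addition is symmetric, so `__radd__` simply re-uses `__add__`, and it uses the standard library's `functools.total_ordering` as one way to fill in the remaining comparisons from `__eq__` and `__lt__`:

```python
from functools import total_ordering


@total_ordering
class Meters:
    """ hypothetical toy class (not part of the lecture code) """
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        # signal "I don't know how", so python can try the other operand
        return NotImplemented

    def __radd__(self, other):
        # called for `other + self` when other's __add__ gave up
        # (addition is symmetric here, so we can re-use __add__)
        return self.__add__(other)

    # total_ordering fills in <=, > and >= from __eq__ and __lt__
    def __eq__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented
        return self.value == other.value

    def __lt__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented
        return self.value < other.value


# int's __add__ returns NotImplemented for a Meters object,
# so python falls back to Meters.__radd__
x = 2 + Meters(3)
assert x.value == 5

# these comparisons were never written by hand
assert Meters(1) <= Meters(2)
assert Meters(2) > Meters(1)
assert Meters(2) >= Meters(2)

# when neither operand knows what to do, python raises a TypeError
try:
    Meters(1) + 'one'
    raised = False
except TypeError:
    raised = True
assert raised
```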


# Operator Overloading (implementation)

To define "addition" for our object, we must define the method `TimeDelta.__add__()`

```python
 def __add__(self, other): 
 # combine self and other into a new TimeDelta object (somehow)
 return new_time_delta_object
```

Notes:
- `other` represents the object we're adding to our `TimeDelta`
    - for addition, it's only meaningful when `other` is another `TimeDelta`
    - for multiplication, it's only meaningful when `other` is a float or int
- the output of this function is the new `TimeDelta` object
    - it's often expected that an operation gives a new output and doesn't modify the original inputs `self` and `other`


In [22]:
class TimeDelta:
    """ a measurement of time between two moments

    Attributes:
        second (int): seconds between two moments (0 <= second < 60)
        minute (int): minutes between two moments (0 <= minute < 60)
        hour (int): hours between two moments
    """
    sec_per_min = 60

    def __mul__(self, other):
        """ scale a TimeDelta object by some constant

        Args:
            other (float): some scale to apply
        """
        assert type(other) in (int, float), \
            'TimeDelta can only be multiplied by int or float'

        return TimeDelta(second=self.second * other,
                         minute=self.minute * other,
                         hour=self.hour * other)

    def __init__(self, second=0, minute=0, hour=0):
        # store seconds (after removing minutes)
        # note: inside a method, a class attribute must be reached via the
        # class (or self); a bare `sec_per_min` raises a NameError
        self.second = second % TimeDelta.sec_per_min

        # add any extra minutes (from seconds >= 60)
        minute += second // TimeDelta.sec_per_min

        # store minutes (after removing hours)
        self.minute = minute % 60

        # store hours, including any extra hours (from minutes >= 60)
        self.hour = hour + minute // 60

    def __repr__(self):
        return f'TimeDelta(second={self.second}, minute={self.minute}, hour={self.hour})'

    def __add__(self, other):
        """ sum time between two TimeDelta objects

        Args:
            other (TimeDelta): other TimeDelta
        """
        assert isinstance(other, TimeDelta), \
            'TimeDelta can only be added to other TimeDelta'

        return TimeDelta(second=self.second + other.second,
                         minute=self.minute + other.minute,
                         hour=self.hour + other.hour)

In [24]:
import pandas as pd

pd.__version__

'1.5.3'

In [21]:
# build a 'TimeDelta', representing a period of time
a = TimeDelta(second=1, minute=2, hour=3)
b = TimeDelta(second=4, minute=5, hour=6)

# notice: we can add our TimeDelta objects together
assert str(a + b) == 'TimeDelta(second=5, minute=7, hour=9)'

# notice: we can multiply our TimeDelta objects by ints / floats
assert str(a * 2) == 'TimeDelta(second=2, minute=4, hour=6)'



# Advice:

Overload an operator when the behavior can be unambiguously guessed:
- e.g. adding two `pd.Series` objects together

If behavior isn't obvious, it might be worth making a method with a real function name to cue the reader in:
- e.g. `a.combine_with_another_obj_somehow(b)`

#### Function names are a great opportunity to document your code! (don't miss the chance)


# In Class Activity B

Add a subtraction method to `TimeDelta` above so that it passes the asserts given below.

**Hint:** I [wonder](https://docs.python.org/3/library/operator.html#mapping-operators-to-functions) what method name python looks for to do a subtraction operation? ... this is the one we should be building.

(++) This might feel redundant. Is there a way we could re-use existing operations to build subtraction?


In [None]:
class TimeDelta:
    """ a measurement of time between two moments

    Attributes:
        second (int): seconds between two moments (0 <= second < 60)
        minute (int): minutes between two moments (0 <= minute < 60)
        hour (int): hours between two moments
    """
    def __init__(self, second=0, minute=0, hour=0):
        # store seconds (after removing minutes)
        self.second = second % 60

        # add any extra minutes (from seconds >= 60)
        minute += second // 60

        # store minutes (after removing hours)
        self.minute = minute % 60

        # store hours, including any extra hours (from minutes >= 60)
        self.hour = hour + minute // 60

    def __repr__(self):
        return f'TimeDelta(second={self.second}, minute={self.minute}, hour={self.hour})'

    def __add__(self, other):
        """ sum time between two TimeDelta objects

        Args:
            other (TimeDelta): other TimeDelta
        """
        assert isinstance(other, TimeDelta), \
            'TimeDelta can only be added to other TimeDelta'

        return TimeDelta(second=self.second + other.second,
                         minute=self.minute + other.minute,
                         hour=self.hour + other.hour)

    def __mul__(self, other):
        """ scale a TimeDelta object by some constant

        Args:
            other (float): some scale to apply
        """
        assert type(other) in (int, float), \
            'TimeDelta can only be multiplied by int or float'

        return TimeDelta(second=self.second * other,
                         minute=self.minute * other,
                         hour=self.hour * other)

    def __sub__(self, other):
        """ subtract time between two TimeDelta objects

        Args:
            other (TimeDelta): other TimeDelta
        """
        assert isinstance(other, TimeDelta), \
            'TimeDelta can only be subtracted from other TimeDelta'

        return TimeDelta(second=self.second - other.second,
                         minute=self.minute - other.minute,
                         hour=self.hour - other.hour)


In [None]:
# build a 'TimeDelta', representing a period of time
a = TimeDelta(second=1, minute=2, hour=3)
b = TimeDelta(second=4, minute=5, hour=6)

# notice: we can now subtract one timedelta from another
assert str(b - a) == 'TimeDelta(second=3, minute=3, hour=3)'
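
One possible answer to the (++) question, as a sketch rather than the official solution: since `b - a` is the same as `b + a * -1`, subtraction can re-use the `+` and `*` operators. The class below is a compact copy of the lecture's `TimeDelta` (type checks omitted) so the example runs on its own.

```python
class TimeDelta:
    """ compact copy of the lecture's TimeDelta, for a self-contained sketch """
    def __init__(self, second=0, minute=0, hour=0):
        self.second = second % 60
        minute += second // 60
        self.minute = minute % 60
        self.hour = hour + minute // 60

    def __repr__(self):
        return f'TimeDelta(second={self.second}, minute={self.minute}, hour={self.hour})'

    def __add__(self, other):
        return TimeDelta(second=self.second + other.second,
                         minute=self.minute + other.minute,
                         hour=self.hour + other.hour)

    def __mul__(self, other):
        return TimeDelta(second=self.second * other,
                         minute=self.minute * other,
                         hour=self.hour * other)

    def __sub__(self, other):
        # subtracting is adding the negated TimeDelta
        return self + other * -1


a = TimeDelta(second=1, minute=2, hour=3)
b = TimeDelta(second=4, minute=5, hour=6)
assert str(b - a) == 'TimeDelta(second=3, minute=3, hour=3)'

# borrowing works too: 1 minute minus 1 second is 59 seconds
diff = TimeDelta(minute=1) - TimeDelta(second=1)
assert str(diff) == 'TimeDelta(second=59, minute=0, hour=0)'
```

This works because Python's `//` rounds toward negative infinity and `%` returns a non-negative remainder for a positive divisor, so negative seconds automatically "borrow" from the minutes in the constructor.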


# Class Methods & Class Attributes

Remember:
- **attributes** are data (variables) associated with an object
- **methods** are functions associated with an object

However, sometimes we want to associate a particular attribute or method to the class itself (i.e. all objects of a particular class).


# Class Attributes

We can assign attributes to an entire Class, rather than a single instance of the class (an object):


In [39]:
class SillyClass:
    # how_many is a class attribute
    # all objects of type SillyClass can access it, effectively sharing the same variable
    how_many = 0

    def __init__(self):
        # increment counter of how many silly class instances have been made
        SillyClass.how_many += 1


In [40]:
# you can access this variable via the class directly
# (notice: there is no particular object in this cell ... though you can access that way too)
SillyClass.how_many

0

In [41]:
# notice: each constructor call (__init__) referred to the same variable how_many
silly_class0 = SillyClass()
silly_class1 = SillyClass()
silly_class2 = SillyClass()
SillyClass.how_many


3

In [44]:
silly_class0.how_many is SillyClass.how_many

True

In [30]:
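# you can also read this attribute from any instance
# (careful: assigning through an instance, as below, creates a new instance attribute
# on silly_class0 which shadows the class attribute; SillyClass.how_many is unchanged)
silly_class0.how_many += 100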
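
A short sketch of what assignment through an instance does to a class attribute. It re-defines `SillyClass` so it runs on its own; the names `silly0` and `silly1` are only illustrative.

```python
class SillyClass:
    how_many = 0

    def __init__(self):
        SillyClass.how_many += 1


silly0 = SillyClass()
silly1 = SillyClass()
assert SillyClass.how_many == 2

# `silly0.how_many += 100` reads the class attribute (2), adds 100 and
# stores the result as a new *instance* attribute on silly0 only
silly0.how_many += 100

assert silly0.how_many == 102      # instance attribute shadows the class one
assert silly1.how_many == 2        # other instances still see the class attribute
assert SillyClass.how_many == 2    # the class attribute itself is unchanged
assert 'how_many' in vars(silly0) and 'how_many' not in vars(silly1)
```

To change the shared value, assign through the class: `SillyClass.how_many += 100`.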


## When should I use a class attribute?

Use a class attribute when we want to store one value for all instances of the class because:
- the value is relevant to the set of all instances (as above)
- the value is constant across all instances
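
As an illustration of the second case (a value constant across all instances), here is a compact sketch of `TimeDelta` with its conversion factors stored as class attributes. The attribute `min_per_hour` and the method `total_second()` are hypothetical additions for this example.

```python
class TimeDelta:
    """ compact sketch; total_second() is a hypothetical helper """
    # constants shared by every TimeDelta (class attributes)
    sec_per_min = 60
    min_per_hour = 60

    def __init__(self, second=0, minute=0, hour=0):
        # inside a method, reach class attributes via the class (or self)
        self.second = second % TimeDelta.sec_per_min
        minute += second // TimeDelta.sec_per_min
        self.minute = minute % TimeDelta.min_per_hour
        self.hour = hour + minute // TimeDelta.min_per_hour

    def total_second(self):
        """ express the whole TimeDelta in seconds """
        total_minute = self.hour * TimeDelta.min_per_hour + self.minute
        return total_minute * TimeDelta.sec_per_min + self.second


x = TimeDelta(second=100, minute=70)
assert (x.second, x.minute, x.hour) == (40, 11, 1)
assert x.total_second() == 70 * 60 + 100
```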


### Purely hypothetically speaking (on a purely hypothetical hw2 ...) 
You're tasked with building a `MonopolyPropertyHand` 
- tracks an individual player's monopoly properties

Where would you store information about how many properties of each group (e.g. Dark Purple, Light Blue, Purple, Orange) are required to obtain a monopoly in that group?
- is the value relevant to a particular player's properties (attribute) or
- is the value constant & relevant to all players' properties (class attribute)


# Class Methods

Some functions are better associated with an entire class, rather than a particular instance object.

What if we wanted to add a method `.from_string()` which accepts a string to build a `TimeDelta` object?
- inputs: `03:02:01` implies 3 hours, 2 minutes and 1 second
- output: `TimeDelta(hour=3, minute=2, second=1)`
- notice that this behavior isn't associated with any particular `TimeDelta` object -> class method


In [45]:
# str.split() will split a string on a particular character
'03:02:01'.split(':')

['03', '02', '01']

In [46]:
# unpacking works nicely here
hour, minute, second = '03:02:01'.split(':')


In [47]:
hour


'03'

In [48]:
minute


'02'

In [49]:
second


'01'

In [55]:
class TimeDelta2:
    """ a measurement of time between two moments

    Attributes:
        second (int): seconds between two moments (0 <= second < 60)
        minute (int): minutes between two moments (0 <= minute < 60)
        hour (int): hours between two moments
    """
    def __init__(self, second=0, minute=0, hour=0):
        # store seconds (after removing minutes)
        self.second = second % 60

        # add any extra minutes (from seconds >= 60)
        minute += second // 60

        # store minutes (after removing hours)
        self.minute = minute % 60

        # store hours, including any extra hours (from minutes >= 60)
        self.hour = hour + minute // 60

    def __repr__(self):
        return f'TimeDelta2(second={self.second}, minute={self.minute}, hour={self.hour})'

    @classmethod
    def from_user(cls):
        """ builds a TimeDelta2 from hour, minute and second values typed in by the user """
        hour = input('input hour')
        minute = input('input minute')
        sec = input('input sec')

        # cast each to floats before building TimeDelta2 object
        # (cls is the class itself, so cls(...) is the same as TimeDelta2(...) here)
        return cls(second=float(sec),
                   minute=float(minute),
                   hour=float(hour))

    @classmethod
    def from_string(cls, str_time):
        """ builds a TimeDelta2 from a string of format HH:MM:SS

        Args:
            str_time (str): a string of format HH:MM:SS. we expect
                3 numerical values joined by ':'
        """
        print(f'What is the cls argument? {cls}')

        # split string into its hour, minute and second
        hour, minute, second = str_time.split(':')

        # cast each to floats before building TimeDelta2 object
        return cls(second=float(second),
                   minute=float(minute),
                   hour=float(hour))


In [56]:
TimeDelta2.from_user()

input hour10
input minute4
input sec123


TimeDelta2(second=3.0, minute=6.0, hour=10.0)

In [51]:
TimeDelta2.from_string('03:02:01')


What is the cls argument? <class '__main__.TimeDelta2'>


TimeDelta2(second=1.0, minute=2.0, hour=3.0)

In [52]:
TimeDelta2(minute=100)

TimeDelta2(second=0, minute=40, hour=1)

### Syntax of Class Method vs an ordinary Method

```python
 @classmethod
 def from_string(cls, str_time):
```

Two differences:
- it requires the `@classmethod` decorator
- the first argument is the class itself (here `TimeDelta2`), which we name `cls`
    - we reserve `self` for a particular instance object
    - convention: don't name the first input to a class method `self`
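
To see what `cls` buys us, here is a sketch in which the class method builds its result with `cls(...)` instead of naming the class. `WorkShift` is a hypothetical subclass, and this sketch casts to `int` rather than `float`; both are choices made only for this example.

```python
class TimeDelta2:
    """ compact sketch of the lecture's TimeDelta2 """
    def __init__(self, second=0, minute=0, hour=0):
        self.second = second % 60
        minute += second // 60
        self.minute = minute % 60
        self.hour = hour + minute // 60

    @classmethod
    def from_string(cls, str_time):
        """ builds an object from a string of format HH:MM:SS """
        hour, minute, second = str_time.split(':')
        # cls is whichever class the method was called on
        return cls(second=int(second), minute=int(minute), hour=int(hour))


class WorkShift(TimeDelta2):
    """ hypothetical subclass, only here to show what cls refers to """


x = TimeDelta2.from_string('03:02:01')
y = WorkShift.from_string('03:02:01')

assert type(x) is TimeDelta2
assert type(y) is WorkShift
assert (x.hour, x.minute, x.second) == (3, 2, 1)
```

Had `from_string()` returned `TimeDelta2(...)` explicitly, `WorkShift.from_string()` would have produced a plain `TimeDelta2`.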


# When should I use a class method?

Use a class method when
- the function does not use a particular object's attributes to run
    - common use case: provide an "alternate constructor" to build objects from another convenient data format
    - e.g. building a `TimeDelta2` object from a string such as `02:04:04`, as in `from_string()` above
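
The alternate-constructor pattern also appears in Python's own standard library. A few examples, shown only to illustrate the pattern (`date.fromisoformat()` needs Python 3.7 or later):

```python
from datetime import date

# date's "official" constructor takes year, month and day ...
d0 = date(2023, 2, 10)

# ... while the class method fromisoformat() builds one from a string
d1 = date.fromisoformat('2023-02-10')
assert d0 == d1

# dict.fromkeys() builds a dict from an iterable of keys and one shared value
count = dict.fromkeys(['a', 'b'], 0)
assert count == {'a': 0, 'b': 0}

# int.from_bytes() builds an int from raw bytes
assert int.from_bytes(b'\x01\x00', byteorder='big') == 256
```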


# In Class Activity C

Complete the `WordListWithStats` class definition below so that it passes the asserts which follow
- `add_word()`
- overload any operators used in the assert statement (`+` and `len()`)

(++) We'd rather have two interfaces to build these objects:
- passing `word_list` (in current `__init__()` method)
- passing both `char_count` and `word_list`

Is there some way to build an alternate constructor (hint: class method) which can support both interfaces? 
- Which interface should be the official `__init__()` while the other is `from_something()`? 
- study (and imitate) the class method in `TimeDelta2.from_string()`


In [58]:
for c in 'asdf':
    print(c)

a
s
d
f


In [67]:
['a', 'b'] + ['c', 'd']

['a', 'b', 'c', 'd']

In [68]:
from collections import defaultdict

class WordListWithStats:
    """ manages a list of words and character count across words

    Attributes:
        word_list (list): a list of words
        char_count (dict): keys are characters, values are how many
            times character appears in all words in self.word_list
    """
    def __init__(self, word_list=tuple()):
        # init empty attributes
        self.char_count = defaultdict(lambda: 0)
        self.word_list = list()

        for word in word_list:
            self.add_word(word)

    def add_word(self, word):
        """ adds a word into list, updates char_count

        Args:
            word (str): a word
        """
        self.word_list.append(word)

        for c in word:
            self.char_count[c] += 1

    def __add__(self, other):
        """ adds two WordListWithStats together

        Args:
            other (WordListWithStats): object whose words are appended after ours
        """
        return WordListWithStats(word_list=self.word_list + other.word_list)

    def rm_word(self, word):
        """ removes a word from list, updates char_count

        Args:
            word (str): a word
        """
        self.word_list.remove(word)
        for c in word:
            self.char_count[c] -= 1

            if not self.char_count[c]:
                # delete key if value is 0
                del self.char_count[c]

    def __len__(self):
        """ how many words are in word_list"""
        return len(self.word_list)
 
 


In [69]:
day_tup = 'monday', 'tuesday', 'wednesday'
day_list_with_stat = WordListWithStats(day_tup)
assert day_list_with_stat.word_list == ['monday', 'tuesday', 'wednesday']

day_list_with_stat.rm_word('wednesday')
assert day_list_with_stat.word_list == ['monday', 'tuesday']
assert dict(day_list_with_stat.char_count) == {'m': 1, 'o': 1, 'n': 1, 'd': 2, 
 'a': 2, 'y': 2, 't': 1, 'u': 1, 
 'e': 1, 's': 1}


In [70]:
beatles_tup = 'paul', 'george', 'ringo', 'john'
beatles_list_with_stat = WordListWithStats(beatles_tup)
sum_list_with_stat = beatles_list_with_stat + day_list_with_stat
assert sum_list_with_stat.word_list == ['paul', 'george', 'ringo', 'john', 'monday', 'tuesday']
assert dict(sum_list_with_stat.char_count) == {'p': 1, 'a': 3, 'u': 2, 'l': 1, 'g': 3,
 'e': 3, 'o': 4, 'r': 2, 'i': 1, 'n': 3,
 'j': 1, 'h': 1, 'm': 1, 'd': 2, 'y': 2,
 't': 1, 's': 1}


In [None]:
assert len(sum_list_with_stat) == 6
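
One possible take on the (++) question, offered as a sketch rather than the intended solution. It keeps `word_list` as the official `__init__()` interface (so the asserts above are unaffected) and adds a hypothetical class method `from_count()` that accepts a precomputed `char_count`. The class is trimmed to the pieces needed here.

```python
from collections import defaultdict


class WordListWithStats:
    """ trimmed sketch: only the pieces needed to show two constructors """
    def __init__(self, word_list=tuple()):
        self.char_count = defaultdict(int)
        self.word_list = list()
        for word in word_list:
            self.add_word(word)

    def add_word(self, word):
        self.word_list.append(word)
        for c in word:
            self.char_count[c] += 1

    @classmethod
    def from_count(cls, word_list, char_count):
        """ hypothetical alternate constructor: trusts a precomputed char_count """
        obj = cls()                      # start from an empty object
        obj.word_list = list(word_list)
        obj.char_count = defaultdict(int, char_count)
        return obj


a = WordListWithStats(['ab', 'b'])
b = WordListWithStats.from_count(word_list=['ab', 'b'],
                                 char_count={'a': 1, 'b': 2})
assert a.word_list == b.word_list
assert dict(a.char_count) == dict(b.char_count) == {'a': 1, 'b': 2}
```

The opposite split (an `__init__()` that takes both attributes, plus a `from_word_list()` class method that counts characters) is equally defensible. Note that `from_count()` does not verify that `char_count` actually matches `word_list`.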
