Python Style Guide
Warning
Deviations from this format will yield penalties on any submitted Python code.
Python comes with its own style recommendations, PEP 8. We prune that guide down to the themes relevant to this course. This guide is also influenced by Google’s style guide.
Function and Variable Names
Function and variable names should be all lowercase. Separate distinct words with underscores to improve readability. Use brief, simple language to name variables, since the names are themselves documentation. Pro tip: give corresponding variables names of identical length for extra readability:
# poor form
FirstGuysScoreInTooLongVariableName += 1
y += 1
# same functionality, but documented properly:
score_player0 += 1
score_player1 += 1
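A minimal sketch of these naming rules inside a complete function. The names `tally_scores` and `round_winners` are hypothetical, introduced only for illustration:

```python
def tally_scores(round_winners):
    """ counts how many rounds each of two players won

    Args:
        round_winners (list): index (0 or 1) of the winner of each round

    Returns:
        score_player0 (int): rounds won by player 0
        score_player1 (int): rounds won by player 1
    """
    # corresponding variables get names of identical length
    score_player0 = 0
    score_player1 = 0

    # credit each round to its winner
    for winner in round_winners:
        if winner == 0:
            score_player0 += 1
        else:
            score_player1 += 1

    return score_player0, score_player1


score_player0, score_player1 = tally_scores([0, 1, 1, 0, 1])
print(score_player0, score_player1)  # prints "2 3"
```

Because the two score names line up character for character, a typo or a copy-paste slip between them is easy to spot.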
Function Docstrings
Your functions should contain a docstring (the multi-line string which begins and ends with triple quotes in the example below) which consists of:
a single line which summarizes what the function does
(e.g. “computes the greatest …”)
(optional) a longer description of the function
(e.g. “the greatest common divisor …”)
a list and description of all arguments (i.e. inputs) if any
a list and description of all returned variables (i.e. outputs) if any
For example:
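def get_gcd(x, y):
    """ computes the greatest common divisor of two ints

    the greatest common divisor of two values is the biggest integer
    which evenly divides both values.  For example, the gcd of 12
    and 60 is 12.

    Args:
        x (int): input integer
        y (int): input integer

    Returns:
        gcd (int): gcd of x, y
    """
    # Euclid's algorithm: swap in the remainder until it reaches zero
    # (an illustrative body; the docstring is the focus of this example)
    while y != 0:
        x, y = y, x % y
    gcd = abs(x)

    return gcd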
Class Docstrings
Class names should use CamelCase. Give a brief, high-level description of the class and list its Attributes (the properties every instance of the object has) just as one would list the inputs / outputs of a function. Each method, being a function, should follow all of the function documentation rules above. Use self as the first argument of every method (it refers to the instance of the object whose method has been called).
class BankAccount:
    """ tracks balance & ownership of a bank account

    Attributes:
        owner (str): name of the bank account owner
        balance (float): how much money is in the account
        is_open (bool): true if bank account is open.  (false when bank
            account closed)
    """

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance
        self.is_open = True

    def change_balance(self, diff):
        """ credits (or debits if diff is negative) account

        Args:
            diff (float): amount account changes by
        """
        assert self.balance + diff >= 0, 'overdraft'
        self.balance = self.balance + diff
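A short usage sketch of the class above. The class is repeated so the snippet runs on its own; the owner name and the amounts are made up for illustration:

```python
class BankAccount:
    """ tracks balance & ownership of a bank account

    Attributes:
        owner (str): name of the bank account owner
        balance (float): how much money is in the account
        is_open (bool): true if bank account is open
    """

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance
        self.is_open = True

    def change_balance(self, diff):
        """ credits (or debits if diff is negative) account

        Args:
            diff (float): amount account changes by
        """
        assert self.balance + diff >= 0, 'overdraft'
        self.balance = self.balance + diff


# hypothetical usage: owner name and amounts are illustrative only
account = BankAccount(owner='Ada', balance=100)
account.change_balance(-30)
print(account.balance)  # prints "70"

# an overdraft trips the assert and leaves the balance unchanged
try:
    account.change_balance(-500)
except AssertionError as err:
    print(err)  # prints "overdraft"
```

Note that Python skips assert statements when run with the -O flag, so the overdraft check above only guards code run in the default mode.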
Comments and whitespace (within Python code)
Separate your code into “chunks”, each of which performs a single task, with a blank line between them. Label each chunk with a short comment which describes what it does. Ideally, these labels allow a reader to quickly find the lines of code which perform the particular task they’re interested in.
Consider the following function. Taken out of context, we expect readers to have a tough time understanding why it does what it does, but the chunking and comments (hopefully) provide an easy on-ramp for readers to begin learning about it. Notice how critical the documentation becomes when you’re tossed into this function without proper context, as one often is when writing software in a team:
import numpy as np


def snip_trial(df_mode, trial_len, feat_list, start_stamp=None, start_idx=None):
    """ extracts a single trial from a dataframe

    Args:
        df_mode (pd.DataFrame): dataframe, contains timestamp and trial data
        trial_len (int): number of samples in trial
        feat_list (list): columns of dataframe which make up trial data
        start_stamp (float): timestamp @ start of trial (inclusive)
        start_idx (int): index of start of trial (inclusive) in df_mode

    Returns:
        trial (np.array): (trial_len, len(feat_list)) trial data
    """
    # check that only start_stamp xor start_idx is passed
    assert (start_stamp is None) != (start_idx is None)

    # get start_idx from start_stamp
    if start_idx is None:
        timestamp = df_mode['timestamp'].to_numpy()
        start_idx = np.searchsorted(timestamp, v=start_stamp, side='left')
        assert start_idx.size == 1, 'non unique start'

    # extract trial (in time)
    stop_idx = int(start_idx + trial_len)
    trial = df_mode.iloc[start_idx: stop_idx, :]

    # extract trial (just relevant features) and cast to array
    trial = trial.loc[:, feat_list].to_numpy()

    # check that trial has proper shape
    if trial.shape[0] != trial_len:
        raise IOError('data stream ends before trial')

    return trial
Jupyter Notebook Style Notes
Your Jupyter Notebook should be shared empty or with results which are consistent with a fresh “Kernel -> Restart & Run All Cells”. To do otherwise is clumsy and could be considered misleading in professional contexts.
Use cells to chunk your program into pieces which perform a similar function.
Suppress all output which you do not want to draw the reader’s attention to. (A semicolon at the end of a cell’s last line prevents Jupyter from echoing that line’s value in Out[].)
Markdown provides you a chance to talk to your reader as they move through your analysis. Use it. Having clear language (and crisp visuals) goes a long way towards teaching the reader just what you’ve accomplished. Be as clear and brief as possible.
Odds and ends
Don’t do too much on one line:
# poor form
import numpy as np; import sklearn as skl; import pandas as pd;
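The same rule applies to expressions as well as imports. A minimal sketch, with hypothetical names (`values`, `mean_dense`, `mean_split`) chosen only for illustration, comparing a crowded line against the same computation spread over several lines:

```python
values = [1, 2, 3, 4, 5, 6]

# poor form: filtering, squaring and averaging crammed into one line
mean_dense = sum([v ** 2 for v in values if v % 2 == 0]) / len([v for v in values if v % 2 == 0])

# same functionality, one step per line
even_values = [v for v in values if v % 2 == 0]
even_square = [v ** 2 for v in even_values]
mean_split = sum(even_square) / len(even_square)

print(mean_dense == mean_split)  # prints "True"
```

The split version also gives each intermediate result a name, which documents the computation in the same way good variable names do elsewhere.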
Use single or double quotes for all strings, but don’t mix them in the same file:
# preferred (if used consistently throughout)
string0 = 'this is how Prof Higger does it'

# acceptable (if used consistently throughout code)
string1 = "I feel like such a rebel"

# don't mix and match
string2 = 'sometimes you feel like a nut'
string3 = "sometimes you dont"