Andrew Simoneau
DS 2500
Professor Higger
February 27, 2023
Millions of people around the world invest their hard-earned savings into the stock market in the hopes of earning returns to support various major life events such as retirement or sending children to college. Attached here is a link to one of my favorite finance writers, Ben Carlson's article which gives a background and reasoning for why to invest in the stock market. However, there are many risks associated with investing, which can lead to a major destruction of wealth. One way to attempt to reduce the risk of wealth destruction is to use the capital asset pricing model (CAPM). CAPM is widely used in the financial world to "[describe] the relationship between systematic risk, or the general perils of investing, and expected return for assets, particularly stocks." (Investopedia.com). The formula for CAPM is Er = Rf = B(Rm - Rf), where Er is the expected return of an asset, Rf is the risk-free rate of return, B is the beta of a stock, and Rm is the return of the overall market. The beta of a stock is defined as a stock's volatility compared to the overall market, and is calculated by dividing the covariance of the stock return and the market return by the variance of the market. In creating a CAPM regression and plotting where many stocks fall in relation to their expected return of CAPM, an investor can gauge if an asset is undervalued, overvalued, or fairly valued.
"""Directory iteration code source: https://www.techiedelight.com/iterate-over-files-directory-python/"""
import os
import pandas as pd
directory = '/Users/andrewsimoneau/Desktop/DS 2500/stock_market_data/sp500/csv'
dataframes = {}
for filename in os.listdir(directory):
filepath = os.path.join(directory, filename)
df = pd.read_csv(filepath, encoding='latin1')
dataframes[filename] = df
dataframes.keys()
dict_keys(['CSCO.csv', 'UAL.csv', 'TROW.csv', 'ISRG.csv', 'NVR.csv', 'MRO.csv', 'BA.csv', 'GILD.csv', 'NLSN.csv', 'EQIX.csv', 'MDT.csv', 'V.csv', 'QRVO.csv', 'A.csv', 'FLT.csv', 'MO.csv', 'SWKS.csv', 'MCHP.csv', 'CDNS.csv', 'MSCI.csv', 'CHTR.csv', 'EIX.csv', 'GS-PJ.csv', 'BBY.csv', 'WBA.csv', 'LVS.csv', 'HCA.csv', 'AJG.csv', 'DTE.csv', 'C.csv', 'T.csv', 'CF.csv', 'FPLPF.csv', 'MGM.csv', 'HUM.csv', 'CFG.csv', 'WU.csv', 'APH.csv', 'MSI.csv', 'FCX.csv', 'ADM.csv', 'LH.csv', 'GGG.csv', 'ALTR.csv', 'PKI.csv', 'LNT.csv', 'BAC.csv', 'LNC.csv', 'PSX.csv', 'GPN.csv', 'PPG.csv', 'KGNR.csv', 'IRM.csv', 'ESS.csv', 'NOV.csv', 'HAL.csv', 'FLS.csv', 'ADI.csv', 'F.csv', '.DS_Store', 'CPRT.csv', 'ULTA.csv', 'ARE.csv', 'SYK.csv', 'CB.csv', 'TSN.csv', 'PEP.csv', 'PEG.csv', 'NOXL.csv', 'NOW.csv', 'COST.csv', 'REG.csv', 'LOW.csv', 'MDLZ.csv', 'FMC.csv', 'XEL.csv', 'AIZ.csv', 'UEEC.csv', 'MET.csv', 'FAST.csv', 'TJX.csv', 'MPC.csv', 'BR.csv', 'D.csv', 'MRK.csv', 'STX.csv', 'NOC.csv', 'BXP.csv', 'KHC.csv', 'UNP.csv', 'ALLE.csv', 'ABBV.csv', 'ECL.csv', 'EBAY.csv', 'SBUX.csv', 'IR.csv', 'AMT.csv', 'INTU.csv', 'DPZ.csv', 'DRE.csv', 'IPGP.csv', 'PG.csv', 'CAT.csv', 'ODFL.csv', 'MCD.csv', 'MNST.csv', 'AMZN.csv', 'WSPOF.csv', 'PNR.csv', 'BDX.csv', 'COWN.csv', 'PWR.csv', 'DXCM.csv', 'EXR.csv', 'HOLX.csv', 'GM.csv', 'TXN.csv', 'VRSK.csv', 'TMO.csv', 'OXY.csv', 'RL.csv', 'TCYSF.csv', 'CCI.csv', 'MMM.csv', 'MOS.csv', 'HSY.csv', 'JNPR.csv', 'DHI.csv', 'ED.csv', 'ES.csv', 'ADSK.csv', 'IP.csv', 'KO.csv', 'PCAR.csv', 'WDC.csv', 'NEE.csv', 'UPS.csv', 'LEG.csv', 'EMR.csv', 'MSFT.csv', 'CTAS.csv', 'BIO.csv', 'NCTKF.csv', 'UDR.csv', 'WEC.csv', 'NVRO.csv', 'AME.csv', 'IT.csv', 'ACN.csv', 'VRSN.csv', 'EW.csv', 'CMG.csv', 'AWK.csv', 'COO.csv', 'SHW.csv', 'HPQ.csv', 'RIBT.csv', 'AMAT.csv', 'MLM.csv', 'AVY.csv', 'AAP.csv', 'ATVI.csv', 'EA.csv', 'DE.csv', 'SPG.csv', 'AMD.csv', 'NDAQ.csv', 'URI.csv', 'WHR.csv', 'KMX.csv', 'WRK.csv', 'RLI.csv', 'XLEFF.csv', 'BIIB.csv', 'CHRW.csv', 'ROP.csv', 'RXMD.csv', 'IDXX.csv', 'HES.csv', 'HD.csv', 'ALB.csv', 'CTQ.csv', 'AON.csv', 'ZTS.csv', 'FDX.csv', 'DG.csv', 'TYL.csv', 'CAG.csv', 'SCHW.csv', 'HSIC.csv', 'AZO.csv', 'AXP.csv', 'HPE.csv', 'DFS.csv', 'SEE.csv', 'HRL.csv', 'SO.csv', 'FRT.csv', 'ZBH.csv', 'FRC.csv', 'CME.csv', 'FN.csv', 'XOM.csv', 'AMP.csv', 'PNW.csv', 'ICE.csv', 'BEN.csv', 'UHS.csv', 'BHI.csv', 'EMN.csv', 'ROK.csv', 'SEGXF.csv', 'NRG.csv', 'NSC.csv', 'NMHLY.csv', 'KACPF.csv', 'FIS.csv', 'FANG.csv', 'VTR.csv', 'RF.csv', 'TAP.csv', 'MAR.csv', 'BMRA.csv', 'XYL.csv', 'CMI.csv', 'KR.csv', 'PLD.csv', 'IBM.csv', 'USB.csv', 'BSX.csv', 'LKQ.csv', 'FBHS.csv', 'ITW.csv', 'FRMC.csv', 'TWTR.csv', 'EOG.csv', 'PVH.csv', 'KMB.csv', 'NTRA.csv', 'INTH.csv', 'EL.csv', 'GD.csv', 'CNP.csv', 'PM.csv', 'RE.csv', 'MCO.csv', 'CLX.csv', 'CAH.csv', 'HRB.csv', 'HTLF.csv', 'DGX.csv', 'AVB.csv', 'DIS.csv', 'GE.csv', 'HII.csv', 'LDOS.csv', 'BF-A.csv', 'ALGN.csv', 'NFLX.csv', 'FITB.csv', 'WST.csv', 'GWW.csv', 'ENS.csv', 'NTRS.csv', 'MS-PF.csv', 'SRG.csv', 'AOS.csv', 'FE.csv', 'ABC.csv', 'JPM.csv', 'ABT.csv', 'OMC.csv', 'TSCO.csv', 'PH.csv', 'HST.csv', 'JBHT.csv', 'COP.csv', 'CNC.csv', 'MCK.csv', 'TXT.csv', 'SPY.csv', 'NTRR.csv', 'AKAM.csv', 'ROL.csv', 'RMD.csv', 'WRB.csv', 'ANET.csv', 'RSNHF.csv', 'TRAUF.csv', 'PAYX.csv', 'ALK.csv', 'DRI.csv', 'CUK.csv', 'ILMN.csv', 'AAL.csv', 'MRCR.csv', 'MAA.csv', 'FMBM.csv', 'MMC.csv', 'STZ-B.csv', 'FFIV.csv', 'CINF.csv', 'VMC.csv', 'MKTX.csv', 'SRE.csv', 'ORLY.csv', 'IVZ.csv', 'RCL.csv', 'PXD.csv', 'COTY.csv', 'SNPS.csv', 'GOOG.csv', 'SIVB.csv', 'YUM.csv', 'LYV.csv', 'KSS.csv', 'PFE.csv', 'REGN.csv', 'CL.csv', 'VFC.csv', 'BSHI.csv', 'UA.csv', 'VZ.csv', 'JCI.csv', 'CDE.csv', 'AMGN.csv', 'TEL.csv', 'JKHY.csv', 'ADP.csv', 'SONC.csv', 'STT.csv', 'RSG.csv', 'IFF.csv', 'LYB.csv', 'GIS.csv', 'PHM.csv', 'ROST.csv', 'LBTYA.csv', 'LUV.csv', 'CPB.csv', 'OKE.csv', 'BK.csv', 'SYF.csv', 'CHD.csv', 'SLG.csv', 'MHK.csv', 'DAL.csv', 'APA.csv', 'K.csv', 'NI.csv', 'NCLH.csv', 'ZION.csv', 'CPICQ.csv', 'PNWRF.csv', 'RJF.csv', 'HBAN.csv', 'PRU.csv', 'GPC.csv', 'FISV.csv', 'WMB.csv', 'EQR.csv', 'DVA.csv', 'HBI.csv', 'HON.csv', 'O.csv', 'BRK-A.csv', 'TTWO.csv', 'CNWT.csv', 'SLB.csv', 'AAPL.csv', 'WY.csv', 'APD.csv', 'GRMN.csv', 'AEE.csv', 'HLT.csv', 'NEOG.csv', 'DLTR.csv', 'HAS.csv', 'TMUS.csv', 'NTAP.csv', 'KIM.csv', 'BAX.csv', 'LMT.csv', 'ABMD.csv', 'KEY.csv', 'UNM.csv', 'KEYS.csv', 'BMY.csv', 'WYNN.csv', 'RHI.csv', 'EFX.csv', 'PKG.csv', 'NOK.csv', 'CTSH.csv', 'SWK.csv', 'MU.csv', 'TRV.csv', 'AEP.csv', 'TW.csv', 'JNJ.csv', 'WM.csv', 'DOV.csv', 'FTI.csv', 'CRM.csv', 'WAT.csv', 'TIME.csv', 'IEX.csv', 'BWA.csv', 'LRCX.csv', 'NWL.csv', 'BLK.csv'])
dataframes['CSCO.csv'].head()
Date | Low | Open | Volume | High | Close | Adjusted Close | |
---|---|---|---|---|---|---|---|
0 | 16-02-1990 | 0.073785 | 0.0 | 940636800 | 0.079861 | 0.077257 | 0.054862 |
1 | 20-02-1990 | 0.074653 | 0.0 | 151862400 | 0.079861 | 0.079861 | 0.056712 |
2 | 21-02-1990 | 0.075521 | 0.0 | 70531200 | 0.078993 | 0.078125 | 0.055479 |
3 | 22-02-1990 | 0.078993 | 0.0 | 45216000 | 0.081597 | 0.078993 | 0.056095 |
4 | 23-02-1990 | 0.078125 | 0.0 | 44697600 | 0.079861 | 0.078559 | 0.055787 |
The dataset is a directory containing csv files that can be iterated through to access different variables for different stocks or indices. Each csv in the dictionary dataframes (shown above), is a file with the following columns:
The open and close prices will be used to calculate intraday performance to calculate beta, and the open and close prices of the SPY index fund will be used to calculate the expected return on the market. With the beta of the stock calculated, the expected return can be calculated using CAPM, and a regression can be built based on these expected returns.
This data will be used to calculate the covariance and variance of each stock to the market, and therefore the beta of each stock, and create a regression model for CAPM. Once the regression line of CAPM is found, any stock over the regression line will be determined as undervalued, while any stock under the line will be determined as overvalued.
Below are the links to the data sources used:
Individual Stocks: https://www.kaggle.com/datasets/paultimothymooney/stock-market-data
S&P Index: https://finance.yahoo.com/quote/SPY/history?period1=728265600&period2=1677456000&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true