The S&P 500, tracked here through the SPY ETF, is a widely used indicator of economic health. It's made up of 500 large-cap companies across a wide range of sectors, so it's a good representation of how the overall market is doing. Its fluctuations can often be explained by current events. Historically, the largest swings have coincided with major international news or market crashes, and the SPY price chart is a useful place to study the timing of those swings since the fund's inception.
The question I'm looking at today concerns current market fluctuations. I want to use machine learning to build a projection of the S&P 500. Using time-series data, I want to better understand the market in order to project how its performance will change in the future.
import pandas as pd
# Load the daily SPY price history (one row per trading day)
df_spy = pd.read_csv("SPY (2).csv")
# Preview the first five rows
df_spy.head()
| Date | Open | High | Low | Close | Adj Close | Volume |
---|---|---|---|---|---|---|---|
0 | 1993-01-29 | 43.96875 | 43.96875 | 43.75000 | 43.93750 | 25.218227 | 1003200 |
1 | 1993-02-01 | 43.96875 | 44.25000 | 43.96875 | 44.25000 | 25.397583 | 480500 |
2 | 1993-02-02 | 44.21875 | 44.37500 | 44.12500 | 44.34375 | 25.451401 | 201300 |
3 | 1993-02-03 | 44.40625 | 44.84375 | 44.37500 | 44.81250 | 25.720449 | 529400 |
4 | 1993-02-04 | 44.96875 | 45.09375 | 44.46875 | 45.00000 | 25.828054 | 531500 |
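To see when the largest fluctuations occurred, one option is to rank trading days by their close-to-close percentage change. The following is a minimal sketch, not a definitive method: the helper name `largest_daily_moves`, the `Pct Change` column, and the choice of `Close` rather than `Adj Close` are assumptions made for illustration. The sample frame reuses the five rows shown above.

```python
import pandas as pd

def largest_daily_moves(df: pd.DataFrame, price_col: str = "Close", n: int = 3) -> pd.DataFrame:
    """Return the n trading days with the largest absolute day-over-day change.

    Hypothetical helper for illustration; expects 'Date' and price_col columns.
    """
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    # Sort from earliest to latest so each change compares a day to the previous one
    out = out.sort_values("Date").reset_index(drop=True)
    # Percentage change relative to the previous trading day's price
    out["Pct Change"] = out[price_col].pct_change() * 100
    # Rank by magnitude; the first row has no previous day, so its NaN sorts last
    order = out["Pct Change"].abs().sort_values(ascending=False).index
    return out.loc[order, ["Date", price_col, "Pct Change"]].head(n)

# The five rows from the preview above
sample = pd.DataFrame({
    "Date": ["1993-01-29", "1993-02-01", "1993-02-02", "1993-02-03", "1993-02-04"],
    "Close": [43.93750, 44.25000, 44.34375, 44.81250, 45.00000],
})
top = largest_daily_moves(sample, n=2)
print(top)
```

On the full dataset, the same call would be `largest_daily_moves(df_spy, n=20)`, and the resulting dates could then be compared against major news events.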
Through research, I found LightGBM, a gradient-boosting framework that can be applied to time-series forecasting. Using this model with the stock price data sorted from earliest to latest date, I hope to find relationships between stock prices and trading volumes and use them to create price projections. I could also rank trading days from largest to smallest price change to better study the fluctuations and their timing, which may help improve the model further.
https://www.kaggle.com/code/pinardogan/time-series-using-lightgbm-with-explanations
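A tree-based model such as LightGBM has no built-in notion of time, so a common approach is to turn the series into a table of lagged features and to split the data chronologically rather than randomly. The sketch below shows one way this could look, under stated assumptions: the helper names (`make_lag_features`, `chronological_split`, `fit_lightgbm`), the lag choices, the daily-return target, and the hyperparameters are all illustrative and are not taken from the linked notebook.

```python
import pandas as pd

def make_lag_features(df, price_col="Close", lags=(1, 2, 3)):
    """Build lagged return and volume features (hypothetical helper).

    The target is day t's return; every feature uses only days before t.
    """
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    out = out.sort_values("Date").reset_index(drop=True)
    ret = out[price_col].pct_change()
    needed = ["target"]
    for k in lags:
        out[f"ret_lag_{k}"] = ret.shift(k)            # return k days earlier
        out[f"vol_lag_{k}"] = out["Volume"].shift(k)  # volume k days earlier
        needed += [f"ret_lag_{k}", f"vol_lag_{k}"]
    out["target"] = ret
    # Drop the leading rows that lack a full lag history
    return out.dropna(subset=needed).reset_index(drop=True)

def chronological_split(features, train_frac=0.8):
    """Earliest rows for training, latest for testing; no shuffling."""
    cut = int(len(features) * train_frac)
    return features.iloc[:cut], features.iloc[cut:]

def fit_lightgbm(train, feature_cols):
    """Fit a LightGBM regressor; hyperparameters are untuned placeholders."""
    import lightgbm as lgb  # third-party package, imported only when needed
    model = lgb.LGBMRegressor(n_estimators=200, learning_rate=0.05)
    model.fit(train[feature_cols], train["target"])
    return model

# Tiny demonstration on the five rows previewed earlier (one lag only)
demo = pd.DataFrame({
    "Date": ["1993-01-29", "1993-02-01", "1993-02-02", "1993-02-03", "1993-02-04"],
    "Close": [43.93750, 44.25000, 44.34375, 44.81250, 45.00000],
    "Volume": [1003200, 480500, 201300, 529400, 531500],
})
demo_feats = make_lag_features(demo, lags=(1,))
print(demo_feats[["Date", "ret_lag_1", "vol_lag_1", "target"]])

# On the full file (not run here):
#   feats = make_lag_features(df_spy)
#   cols = [c for c in feats.columns if "_lag_" in c]
#   train, test = chronological_split(feats)
#   model = fit_lightgbm(train, cols)
#   preds = model.predict(test[cols])
```

Splitting by date keeps future prices out of the training set, which a random split would not guarantee. Predicting returns rather than raw prices is a design choice here: tree models cannot extrapolate beyond the price range seen in training.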