As the world becomes more connected through technology, the power of social media has become increasingly obvious. While social media's influence on trends in fashion, music, and pop culture seems natural, its reach has affected much more than lifestyle fads. "Meme stocks" are one of the most recent examples of social media influence, with GameStop's extreme price volatility in January 2021 being the most prominent. One hedge fund, Melvin Capital, lost 53% of its total investments by short selling GameStop stock that month (https://abcnews.go.com/Business/gamestop-timeline-closer-saga-upended-wall-street/story?id=75617315). This prompts the question: how much power does social media really have over the stock market? Can something as regulated, technical, and seemingly concrete as financial markets truly be influenced by something as simple as a viral social media post? Exploring social media's impact in this sphere is important because, as GameStop showed, spiraling social media trends in the financial industry can lead to severe losses for individual investors and companies, which could in turn contribute to economic uncertainty, increased unemployment, and recessionary pressure. What started as a playful joke with GameStop became a market-wide event that required federal action, and social media's interconnectedness and broad influence show no signs of slowing.
The dataset I chose for this project pairs tweets about particular stocks with the returns of those stocks. It includes sentiment scores that quantify the tone of each tweet, recent returns for the stock mentioned in the tweet, and 10-day and 30-day volatility scores for that stock.
import pandas as pd
df = pd.read_csv('reduced_dataset-release.csv')
df.head()
| | Unnamed: 0 | TWEET | STOCK | DATE | LAST_PRICE | 1_DAY_RETURN | 2_DAY_RETURN | 3_DAY_RETURN | 7_DAY_RETURN | PX_VOLUME | VOLATILITY_10D | VOLATILITY_30D | LSTM_POLARITY | TEXTBLOB_POLARITY | MENTION |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | RT @robertoglezcano: @amazon #Patents Show Fl... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | NaN | Amazon | 31/01/2017 | 823.48 | 0.008379 | 0.014924 | 0.014924 | -0.001263 | 3.137196e+06 | 13.447 | 16.992 | 1.000 | 0.0 | @amazon | NaN |
| 2 | 1 | @FAME95FM1 Jamaicans make money with @Payoneer... | PayPal | 31/01/2017 | 39.780000 | 0.002011 | 0.012318 | 0.012318 | 5.480141e-02 | 9100057.000 | 18.769 | 16.099 | -1 | 0.0 | @PayPal |
| 3 | 2 | @CBSi Jamaicans make money with @Payoneer @Pay... | PayPal | 31/01/2017 | 39.780000 | 0.002011 | 0.012318 | 0.012318 | 5.480141e-02 | 9100057.000 | 18.769 | 16.099 | 1 | 0.0 | @PayPal |
| 4 | 3 | @Hitz92fm Jamaicans make money with @Payoneer ... | PayPal | 31/01/2017 | 39.780000 | 0.002011 | 0.012318 | 0.012318 | 5.480141e-02 | 9100057.000 | 18.769 | 16.099 | -1 | 0.0 | @PayPal |
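In the preview above, rows 0 and 1 look like a single Amazon record that was split in two: row 0 holds only the tweet text, while row 1 has no `Unnamed: 0` value and its remaining fields sit one column to the left of where they belong. One possible way to handle this before any analysis is to keep only rows that have a row id, a stock name, and a date that parses in the day-first format shown in the preview. The sketch below is a minimal illustration under those assumptions; the helper name `drop_incomplete_rows` is hypothetical, and dropping (rather than repairing) the split rows is a simplification.

```python
import pandas as pd


def drop_incomplete_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows that have a row id, a stock name, and a parseable date.

    Assumes the column names shown in the df.head() preview.
    """
    out = df.copy()
    # Dates in the preview are day-first (e.g. 31/01/2017).
    # Anything that does not match the format becomes NaT.
    out["DATE"] = pd.to_datetime(out["DATE"], format="%d/%m/%Y", errors="coerce")
    # A well-formed row has all three of these fields populated.
    keep = out["Unnamed: 0"].notna() & out["STOCK"].notna() & out["DATE"].notna()
    return out.loc[keep].reset_index(drop=True)


# Hypothetical usage with the frame loaded earlier:
# clean = drop_incomplete_rows(df)
# print(len(df) - len(clean), "rows dropped")
```

Comparing the row counts before and after would show how much of the dataset is affected, which is worth knowing before deciding whether dropping these rows is acceptable.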
This data will help answer my question because it directly pairs tweets about stocks with those stocks' market movements, which could provide key insight into the impact of social media on markets and an opportunity to predict how trends, news, and major events affect them. I hope to first identify correlations by analyzing the relationships between variables such as sentiment score and volatility score, and then use these correlations, together with the text of the tweets, to help predict the effect that future tweets may have on any given stock.
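As a first pass at the correlation step described above, the sketch below computes a correlation table between the two sentiment columns and a few return and volatility columns. It is only an illustration: the function name `sentiment_market_correlations`, the choice of columns, and the use of Spearman rank correlation (chosen here because `LSTM_POLARITY` appears to take only the values -1 and 1 in the preview) are my assumptions, not fixed parts of the analysis.

```python
import pandas as pd

# Column names as they appear in the df.head() preview.
SENTIMENT_COLS = ["LSTM_POLARITY", "TEXTBLOB_POLARITY"]
MARKET_COLS = ["1_DAY_RETURN", "7_DAY_RETURN", "VOLATILITY_10D", "VOLATILITY_30D"]


def sentiment_market_correlations(df: pd.DataFrame, method: str = "spearman") -> pd.DataFrame:
    """Return a table of sentiment-vs-market correlations.

    Rows are sentiment columns, columns are return/volatility columns.
    """
    cols = SENTIMENT_COLS + MARKET_COLS
    # Coerce to numeric so stray text values become NaN instead of raising.
    numeric = df[cols].apply(pd.to_numeric, errors="coerce")
    # DataFrame.corr ignores NaN values pairwise.
    corr = numeric.corr(method=method)
    return corr.loc[SENTIMENT_COLS, MARKET_COLS]


# Hypothetical usage with the frame loaded earlier:
# print(sentiment_market_correlations(df))
```

One caveat when reading the result: the preview shows several tweets about the same stock on the same day sharing identical return and volatility values, so the rows are not independent observations. Averaging sentiment per stock and date before correlating may give a more honest picture.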