Social Media Influence on Stock Market Volatility

Markus Zaba

DS 2500

As the world becomes more connected through technology, the power of social media has become increasingly obvious. While social media's influence on trends like fashion, music, and pop culture seems natural, its reach has affected much more than lifestyle fads. "Meme stocks" are one of the most recent victims of social media influence, with GameStop's extreme price volatility in January 2021 being the most prominent example. One hedge fund, Melvin Capital, lost 53% of its total investments that month after short selling GameStop stock (https://abcnews.go.com/Business/gamestop-timeline-closer-saga-upended-wall-street/story?id=75617315). This prompts the question: how much power does social media really have over the stock market? Can something as regulated, technical, and seemingly concrete as financial markets truly be influenced by something as simple as a viral social media post?

Exploring social media's impact in this sphere is important because, as GameStop showed, spiraling social media trends in the financial industry can lead to severe economic losses for individual investors and companies, which could in turn cause economic uncertainty, increased unemployment, and recessionary pressure. What started as a playful joke with GameStop became a market-wide event that required federal action, and social media shows no signs of slowing its interconnectedness and broad influence.

The dataset I chose for this project pairs tweets about particular stocks with the returns of those stocks. It includes sentiment scores that quantify the tone of each tweet, recent returns for the stock mentioned in the tweet, and 10-day and 30-day volatility scores for that stock.

In [5]:
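import pandas as pd
# Load the tweet/stock dataset (the file is expected in the working directory)
df = pd.read_csv('reduced_dataset-release.csv')
# Preview the first five rows
df.head()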
Out[5]:
| | Unnamed: 0 | TWEET | STOCK | DATE | LAST_PRICE | 1_DAY_RETURN | 2_DAY_RETURN | 3_DAY_RETURN | 7_DAY_RETURN | PX_VOLUME | VOLATILITY_10D | VOLATILITY_30D | LSTM_POLARITY | TEXTBLOB_POLARITY | MENTION |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | RT @robertoglezcano: @amazon #Patents Show Fl... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | NaN | Amazon | 31/01/2017 | 823.48 | 0.008379 | 0.014924 | 0.014924 | -0.001263 | 3.137196e+06 | 13.447 | 16.992 | 1.000 | 0.0 | @amazon | NaN |
| 2 | 1 | @FAME95FM1 Jamaicans make money with @Payoneer... | PayPal | 31/01/2017 | 39.780000 | 0.002011 | 0.012318 | 0.012318 | 5.480141e-02 | 9100057.000 | 18.769 | 16.099 | -1 | 0.0 | @PayPal |
| 3 | 2 | @CBSi Jamaicans make money with @Payoneer @Pay... | PayPal | 31/01/2017 | 39.780000 | 0.002011 | 0.012318 | 0.012318 | 5.480141e-02 | 9100057.000 | 18.769 | 16.099 | 1 | 0.0 | @PayPal |
| 4 | 3 | @Hitz92fm Jamaicans make money with @Payoneer ... | PayPal | 31/01/2017 | 39.780000 | 0.002011 | 0.012318 | 0.012318 | 5.480141e-02 | 9100057.000 | 18.769 | 16.099 | -1 | 0.0 | @PayPal |
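In the output above, rows 0 and 1 look like a single Amazon tweet that was split across two rows (possibly by a line break inside the tweet text): row 0 has a tweet but no stock data, and row 1 has no row id and its values appear shifted one column to the left. A minimal sketch of one way to filter such rows out before analysis is below. The helper name `drop_malformed_rows` is hypothetical, and the rule it applies (require both `Unnamed: 0` and `STOCK` to be present) is my assumption about what marks a complete record. It drops the broken record rather than repairing it.

```python
import pandas as pd

def drop_malformed_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows that have both a row id and a stock name.

    Assumption: a row missing either value is a fragment of a split record.
    """
    is_complete = df["Unnamed: 0"].notna() & df["STOCK"].notna()
    return df.loc[is_complete].reset_index(drop=True)

# Tiny stand-in for the first rows of df.head() (tweet text abbreviated).
sample = pd.DataFrame({
    "Unnamed: 0": [0, None, 1, 2],
    "TWEET": ["RT @robertoglezcano: ...", "Amazon", "@FAME95FM1 ...", "@CBSi ..."],
    "STOCK": [None, "31/01/2017", "PayPal", "PayPal"],
})

clean = drop_malformed_rows(sample)
print(len(sample), "rows before,", len(clean), "rows after")
```

On the full dataset the same call would be `df = drop_malformed_rows(df)`; it would be worth checking how many rows it removes before relying on it.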

Data Dictionary for Each Variable in the Dataset

  • TWEET: The username and text of the tweet
  • STOCK: Name of the company whose stock is mentioned in the tweet
  • DATE: Date of the tweet (DD/MM/YYYY)
  • LAST_PRICE: Price at market close on the day of the tweet
  • 1_DAY_RETURN: 1-day market return on the stock
  • 2_DAY_RETURN: 2-day market return on the stock
  • 3_DAY_RETURN: 3-day market return on the stock
  • 7_DAY_RETURN: 7-day market return on the stock
  • PX_VOLUME: Volume traded (number of shares of the stock traded at the time of the tweet)
  • VOLATILITY_10D: 10-day volatility score of the stock
  • VOLATILITY_30D: 30-day volatility score of the stock
  • LSTM_POLARITY: Sentiment label assigned to the tweet by the LSTM model
  • TEXTBLOB_POLARITY: Sentiment score assigned to the tweet by the TextBlob model
  • MENTION: Twitter handle of the company mentioned in the tweet (for example, @PayPal)
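Before any analysis, the columns above need consistent types. The sketch below shows one way to do that, assuming dates are written as DD/MM/YYYY (as in `31/01/2017`) and that the price, return, volume, volatility, and polarity columns should all be numeric. The helper `coerce_types` and the `NUMERIC_COLS` list are hypothetical names of mine; values that cannot be parsed become missing (`NaT` or `NaN`) instead of raising an error.

```python
import pandas as pd

# Columns from the data dictionary that are assumed to hold numbers.
NUMERIC_COLS = [
    "LAST_PRICE", "1_DAY_RETURN", "2_DAY_RETURN", "3_DAY_RETURN",
    "7_DAY_RETURN", "PX_VOLUME", "VOLATILITY_10D", "VOLATILITY_30D",
    "LSTM_POLARITY", "TEXTBLOB_POLARITY",
]

def coerce_types(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with DATE parsed and numeric columns converted."""
    out = df.copy()
    # Day-first dates; unparseable entries become NaT.
    out["DATE"] = pd.to_datetime(out["DATE"], format="%d/%m/%Y", errors="coerce")
    for col in NUMERIC_COLS:
        if col in out.columns:
            # Non-numeric entries (e.g. a stray "@amazon") become NaN.
            out[col] = pd.to_numeric(out[col], errors="coerce")
    return out

# Small example: the second row mimics a shifted, malformed record.
sample = pd.DataFrame({
    "DATE": ["31/01/2017", "823.48"],
    "LAST_PRICE": ["39.78", "0.008379"],
    "TEXTBLOB_POLARITY": ["0.0", "@amazon"],
})
typed = coerce_types(sample)
print(typed.dtypes)
```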

How the data will help solve the problem

This data will help answer my question because it pairs tweets about stocks with the market movements of those stocks. That could provide key insight into the impact of social media on markets, and it also offers an opportunity to predict how trends, news, and major events affect them. I plan to first identify correlations by analyzing the relationship between variables like sentiment score and volatility score, and then use those correlations, together with the tweet text, to help predict the effect that future tweets may have on a given stock.
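As a first pass at the correlation step described above, the sketch below averages tweet sentiment per stock per day and computes the Pearson correlation with that day's 10-day volatility. The function name `sentiment_volatility_corr`, the choice of `TEXTBLOB_POLARITY` and `VOLATILITY_10D`, and the per-day averaging are all assumptions of mine, not a settled method. I average first because, in the preview of the data, several tweets about the same stock on the same date share identical volatility values, so counting each tweet separately would overweight heavily tweeted stocks.

```python
import pandas as pd

def sentiment_volatility_corr(
    df: pd.DataFrame,
    sentiment_col: str = "TEXTBLOB_POLARITY",
    vol_col: str = "VOLATILITY_10D",
) -> float:
    """Pearson correlation between daily mean sentiment and volatility."""
    daily = df.groupby(["STOCK", "DATE"], as_index=False).agg(
        sentiment=(sentiment_col, "mean"),  # average tone of that day's tweets
        volatility=(vol_col, "first"),      # assumed constant within a stock/day
    )
    return float(daily["sentiment"].corr(daily["volatility"]))

# Made-up numbers purely to show the call; not real results.
toy = pd.DataFrame({
    "STOCK": ["A", "A", "B", "B", "C"],
    "DATE": ["31/01/2017"] * 5,
    "TEXTBLOB_POLARITY": [0.2, 0.4, -0.5, -0.5, 1.0],
    "VOLATILITY_10D": [10.0, 10.0, 20.0, 20.0, 5.0],
})
r = sentiment_volatility_corr(toy)
print(round(r, 3))
```

A correlation found this way would only show that sentiment and volatility move together in the sample; it would not by itself show that tweets cause the volatility.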

All Sources

https://www.chase.com/personal/investments/learning-and-insights/article/social-medias-influence-on-the-investing-community

https://www.cnbc.com/2022/05/12/gamestop-surges-more-than-30percent-and-is-halted-in-odd-trading-amc-shares-also-pop.html

https://abcnews.go.com/Business/gamestop-timeline-closer-saga-upended-wall-street/story?id=75617315