Stock Returns and Factors¶

Motivation:¶

Problem¶

Nowadays, for a single investor, there are simply too many stocks and too much data to comprehensively choose which ones are worth their while. While some may choose to specialize in a certain industry, this can leave them exposed to industry-specific risk with limited diversification. What if there were a way to filter through hundreds or even thousands of stocks to find the prospects that might be the most lucrative, and then focus your analysis only on those few? Many financial metrics have been tied to stock performance, so shouldn't we be able to create a model that takes these metrics into account for stock selection?

Some relevant articles:

  • Study of momentum investing
  • Book to Price Ratios
  • PEG Ratios

Solution¶

So much data on stocks is now publicly available, and data miners have long been analyzing it to try to find any sort of premium in the market. This volume of financial information, combined with a near-constant stream of stock prices, allows us to try to draw a direct comparison between certain features and a stock's performance. **The goal of this project is to identify and use a relationship between a stock's features (e.g. P/E ratio, momentum, book-to-market ratio) and its performance over the next 6 months.**

Impact¶

If this is successful, this work may provide a simple and quick way for people, even those who are not especially financially literate, to parse a wide selection of stocks and view just the few that might earn them a premium. This would drastically cut down the time needed for people who cannot afford to sift through the thousands of stocks out there. If we use logistic regression, we may also find an easy method for choosing a basket of stocks that outperform the others, and thus outperform an index, something that is notoriously hard to do. Honestly, my hopes of that happening are extremely limited due to the complexity behind stock market returns, but it would be interesting.

One negative impact is that the model might not work on out-of-sample data, leaving the investor vulnerable to changes in the economy between the testing period and the new period. Also, reliance on these models may cause investors to miss out on stocks at different maturity levels, as older, more mature companies generally have stronger financials than new ones that are still building interest and require constant investment.

Dataset¶

Detail¶

We will use a dataset provided by one of my previous classes (Portfolio Management), which contains a wide range of information as well as out-of-context scenarios. I would usually try to find a source on my own, but this data source also does a good job of covering different market segments, so I think it is a good fit here. Different markets and industries perform drastically differently, so it would be wise to keep this in mind when doing our analysis.

Selected metrics are listed below (ideally we would test which of these metrics have an impact on the end returns):

  • Market Cap
  • P/E Ratio
  • Momentum 6 mo
  • Momentum 12 mo
  • Current Ratio
  • Cash Flow / Price
  • Quick Ratio
  • % of analyst predictions positive
  • ROA
  • ROE
  • B/P Ratio
  • Beta
  • PEG Ratio
  • EPS

Symbol | Name | GICS Ind Name | Ex-Post Returm (t+1) | Market Capitalization | Annual Gross Profit Margin L90D | Annual Operating Profit Margin L90D | Annual Fixed Asset Turnover Ratio | Annual Total Asset Turnover Ratio | Annual Receivables Turnover Ratio | Annual Inventory Turnover Ratio | Annual CapEx to Assets Ratio L90D | Annual Current Ratio | Annual Debt to Equity Ratio L90D | Annual Interest Coverage Ratio L90D | Annual Quick Ratio | Standardized Unexpected Earnings (SUE) | Estimated Cash Flow / Price (FY1) | Estimated Earnings / Price (FY1) | Dispersion in Earnings Estimates (FY1) | Revisions in Estimated Earnings (FY1) | % of Analysts Upgrading Stock Recommendation | % of Analysts Downgrading Stock Recommendation | % of Analysts Increasing Earnings Forecast (FY1) | % of Analysts Decreasing Earnings Forecast (FY1) | Median Analyst Stock Recommendation | Estimated EPS Growth Rate (FY1) | Annual Return on Assets | Annual Return on Common Equity | 60 Month Beta Against S&P 500 | Annual Earnings Variability | Annual Cash Flow to Price | Annual Book to Price | Annual Dividend Yield | Annual Earnings to Price | Annual Enterprise Value to EBIT | Annual PEG Ratio | Realized 5 Year EPS Growth Rate | 6-Month Momentum | 12-month momentum | Short-Term Reversal (ret[t-1,t])
DDD | 3D Systems Corporation | Technology Hardware Storage & Peripherals | -9.3 | 2,346.8 | 48.9 | -6.1 | 7.6 | 0.7 | 4.4 | 3.1 | 0.0 | 3.3 | 0.0 | -30.0 | 2.5 | 0.7 | 0.0 | 0.0 | 0.1 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 12.1 | -4.4 | -6.0 | 1.2 | #N/A | 0.0 | 0.3 | 0.0 | 0.0 | #N/A | #N/A | #N/A | 48.9 | 50.6 | 30.3
MMM | 3M Company | Industrial Conglomerates | 2.8 | 120,977.5 | 49.7 | 23.3 | 3.5 | 0.9 | 7.0 | 4.4 | 0.0 | 1.9 | 1.1 | 33.9 | 1.2 | 0.5 | 0.1 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 9.4 | 15.4 | 45.9 | 1.1 | 0.1 | 0.1 | 0.1 | 0.0 | 0.0 | 19.6 | 4.2 | 6.6 | 19.4 | 15.6 | 4.0
EGHT | 8x8, Inc. | Software | 7.0 | 1,236.7 | 75.5 | -2.6 | 17.6 | 0.8 | 20.0 | 87.0 | 0.0 | 4.5 | 0.0 | #N/A | 4.4 | 4.9 | 0.0 | 0.0 | 0.2 | 0.9 | 0.0 | 9.1 | 0.0 | 0.0 | 1.0 | -22.1 | -1.5 | -1.7 | 0.5 | 4.0 | 0.0 | 0.2 | 0.0 | 0.0 | #N/A | #N/A | #N/A | -0.4 | -6.9 | -6.5
AOS | A. O. Smith Corporation | Building Products | 3.6 | 9,437.8 | 41.7 | 17.2 | 5.9 | 1.0 | 5.3 | 6.6 | 0.0 | 2.0 | 0.2 | 63.2 | 1.7 | 1.2 | #N/A | 0.0 | 0.0 | 1.0 | 0.0 | 6.7 | 0.0 | 0.0 | 2.0 | 13.4 | 11.8 | 22.1 | 1.5 | 0.3 | 0.0 | 0.2 | 0.0 | 0.0 | 20.7 | 1.1 | 18.8 | 12.4 | 23.4 | 0.9
AAON | AAON, Inc. | Building Products | 1.9 | 1,901.5 | 30.5 | 20.7 | 3.6 | 1.5 | 7.4 | 6.2 | 0.1 | 3.6 | 0.0 | #N/A | 2.2 | -0.3 | #N/A | 0.0 | #N/A | 1.0 | 0.0 | 0.0 | #N/A | #N/A | 2.0 | 7.7 | 21.4 | 27.7 | 1.0 | 0.2 | 0.0 | 0.1 | 0.0 | 0.0 | 24.2 | 2.8 | 28.3 | 9.9 | 31.4 | -1.4
AIR | AAR CORP. | Aerospace & Defense | 0.8 | 1,184.4 | 14.2 | 4.0 | 5.5 | 1.1 | 7.0 | 3.2 | 0.1 | 2.7 | 0.2 | 10.3 | 0.8 | -1.1 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 25.5 | 2.7 | 4.7 | 1.4 | 0.5 | 0.0 | 0.7 | 0.0 | 0.0 | 19.8 | -1.4 | -17.1 | -6.1 | 47.8 | -4.1
AAN | Aaron's, Inc. | Specialty Retail | 7.7 | 2,551.5 | 47.6 | 8.0 | 14.7 | 1.2 | 16.9 | 1.6 | 0.0 | 3.4 | 0.3 | 11.0 | 1.1 | 1.9 | #N/A | 0.1 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 4.7 | 5.2 | 9.8 | 0.1 | 0.3 | 0.2 | 0.5 | 0.0 | 0.0 | 11.7 | 1.7 | 1.4 | 24.2 | 65.0 | 0.5

In addition to these, each stock has a 1-month return (recorded after these features were observed). Our project seeks to use the features above to estimate the return on a stock.
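Before any modelling, the sample needs some cleaning: it contains `#N/A` entries and thousands separators in the market cap column. The sketch below shows one way this might be handled with pandas. The three rows and five columns are copied from the sample above, but the CSV layout and the short name `ret_next` are assumptions made for illustration, since the format of the original file is not shown.

```python
import io
import pandas as pd

# Three rows copied from the sample above. The CSV layout is an assumption.
raw = io.StringIO(
    "Symbol,Ex-Post Returm (t+1),Market Capitalization,Annual Current Ratio,"
    "Annual Interest Coverage Ratio L90D,6-Month Momentum\n"
    'DDD,-9.3,"2,346.8",3.3,-30.0,48.9\n'
    'MMM,2.8,"120,977.5",1.9,33.9,19.4\n'
    'EGHT,7.0,"1,236.7",4.5,#N/A,-0.4\n'
)

# thousands="," turns "2,346.8" into a float; "#N/A" is read as missing.
df = pd.read_csv(raw, thousands=",", na_values=["#N/A"], index_col="Symbol")

# Hypothetical short name for the target column (header spelling kept as-is).
df = df.rename(columns={"Ex-Post Returm (t+1)": "ret_next"})

# Count missing values per column to decide what to drop or impute later.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Whether to drop or fill the missing entries is a separate decision, and it would likely depend on how many stocks are affected for each metric.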

Potential Problems¶

One potential problem I see is confounding variables, as there is usually a lot more going on behind a stock than a simple set of data can capture. There is also a big difference between in-sample and out-of-sample data, so we may find that although our model does well in sample, it does not work out of sample.
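The in-sample versus out-of-sample gap can be illustrated with a small experiment. The sketch below uses purely synthetic data (random "features" that have no relationship to the random "returns"), so every number in it is an assumption rather than a result from our dataset. A least-squares fit with many features still appears to explain part of the training returns, but that fit should not carry over to held-out data.

```python
import numpy as np

rng = np.random.default_rng(42)
n_train, n_test, k = 60, 60, 30

# Synthetic data: the returns are pure noise, unrelated to the features.
X = rng.normal(size=(n_train + n_test, k))
ret = rng.normal(size=n_train + n_test)

def add_const(M):
    """Prepend a column of ones so the fit includes an intercept."""
    return np.column_stack([np.ones(len(M)), M])

def r_squared(y, y_hat):
    """1 - SS_res / SS_tot, using the mean of the sample being scored."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

A_train, A_test = add_const(X[:n_train]), add_const(X[n_train:])

# Fit on the training half only, then score both halves.
coef, *_ = np.linalg.lstsq(A_train, ret[:n_train], rcond=None)
r2_in = r_squared(ret[:n_train], A_train @ coef)
r2_out = r_squared(ret[n_train:], A_test @ coef)

print(f"in-sample R^2:     {r2_in:.2f}")
print(f"out-of-sample R^2: {r2_out:.2f}")
```

This is why the evaluation should hold some data back: a good in-sample fit on its own says little about whether the features carry a real premium.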

There are also some problems in terms of market conditions. A bull market may benefit certain firms over others despite their having "poor" metrics, and those using this method will have to account for this. A possible solution would be to gather data on these stocks during different phases of the business cycle and then fit a different model for each scenario.

Method:¶

We pose our problem as a regression (line of best fit) problem: given the features listed above (except 'Name'), we seek to estimate the return of each stock over a 1-month period. One advantage of this approach is that it offers an intuitive output, as each feature will be explicitly associated with some increase or decrease in the estimated return.
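To make the "intuitive output" concrete, here is a minimal sketch of the regression on synthetic data. The feature names `momentum` and `book_to_price` and the coefficients used to generate the returns are made up for illustration. They are not estimates from the real dataset. Each fitted coefficient reads as the change in estimated return per one-unit change in that feature, holding the other fixed.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical standardized features (not taken from the real data).
momentum = rng.normal(size=n)
book_to_price = rng.normal(size=n)

# Synthetic "truth", chosen only so the fit has something to recover.
ret = 0.5 + 2.0 * momentum - 1.0 * book_to_price + rng.normal(scale=1.0, size=n)

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), momentum, book_to_price])
coef, *_ = np.linalg.lstsq(A, ret, rcond=None)
intercept, b_momentum, b_book = coef

print(f"intercept:     {intercept:.2f}")
print(f"momentum:      {b_momentum:.2f}")
print(f"book_to_price: {b_book:.2f}")
```

On real data the features would probably need to be standardized first so that coefficients on, say, market cap and a ratio are comparable in size.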

Another option I have been considering is to classify each stock as a good or bad investment using logistic regression, then test the two groups against each other.
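A rough sketch of the classification idea follows. It rests on several assumptions: the data is synthetic, "good" is defined as a return above the cross-sectional median (one possible labelling, not a decided one), and the logistic regression is fitted with a hand-written gradient descent loop rather than a library routine, to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Two made-up standardized factors and synthetic returns that depend on them.
X = rng.normal(size=(n, 2))
ret = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=2.0, size=n)

# Label: 1 = "good" (above-median return), 0 = "bad".
y = (ret > np.median(ret)).astype(float)

# Logistic regression via plain gradient descent on the mean log-loss.
A = np.column_stack([np.ones(n), X])
w = np.zeros(A.shape[1])
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(A @ w)))   # predicted probability of "good"
    w -= 0.1 * A.T @ (p - y) / n         # gradient step

# Test the two predicted groups against each other.
pred_good = (A @ w) > 0                  # same as predicted probability > 0.5
spread = ret[pred_good].mean() - ret[~pred_good].mean()
print(f"mean return spread (good - bad): {spread:.2f}")
```

The spread here is measured on the same data used for fitting, so a fair test of the two groups would repeat the comparison on stocks or periods the model has not seen.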

An essential part of this plan, though, is parsing through the available data to see which components will earn us a premium, then using those components to evaluate our stocks' performance.
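One simple way to start that screening is to rank features by the strength of their correlation with the next-period return. The sketch below does this on synthetic data. The feature names are hypothetical stand-ins for columns in the dataset, and the first feature is informative only because the example was built that way.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
names = ["momentum_6m", "book_to_price", "peg", "beta"]  # hypothetical names

# Synthetic features; by construction only the first one drives the return.
X = rng.normal(size=(n, len(names)))
ret = 1.0 * X[:, 0] + rng.normal(scale=2.0, size=n)

# Pearson correlation of each feature with the next-period return.
corr = np.array([np.corrcoef(X[:, j], ret)[0, 1] for j in range(len(names))])

# Rank features from strongest to weakest absolute correlation.
order = np.argsort(-np.abs(corr))
for j in order:
    print(f"{names[j]:>14}: {corr[j]:+.3f}")
```

Correlation only captures a linear, one-feature-at-a-time relationship, so it works as a first filter and not as proof that a component earns a premium.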