Predicting Successful Startups

Motivation:

  • Everyone has heard about the very high failure rate of startups. According to Forbes, 90% of startups fail. While this number is staggering, I am curious whether the companies that fail, or the ones that succeed, share many similarities. I could then use this data to make predictions for a new company. Hopefully, the resulting algorithm and the trends it uncovers would help someone who is looking into starting a new business see what they will need to do to succeed.
  • Works Cited:

https://www.forbes.com/sites/neilpatel/2015/01/16/90-of-startups-will-fail-heres-what-you-need-to-know-about-the-10/?sh=411652ee6679

In [1]:
import pandas as pd

# loads in dataset of startups
df_startups = pd.read_csv('big_startup_secsees_dataset.csv')

# maps each column name (key) to a short description of that column (value)
df_startups_dict = {'permalink': 'Link to organization',
        'name': 'Company name',
        'homepage_url': 'Startup site',
        'category_list': 'Field of company',
        'funding_total_usd': 'Total funding in USD',
        'status': 'Operating status',
        'country_code': 'Country code',
        'state_code': 'State code of company location',
        'region': 'Region of company location',
        'city': 'City of company location',
        'funding_rounds': 'Number of times company has received funding',
        'founded_at': 'Start date of company',
        'first_funding_at': 'First funding date',
        'last_funding_at': 'Most recent funding date'}
In [2]:
df_startups
Out[2]:
       permalink                                          name                                        homepage_url                       category_list                                      funding_total_usd  status     country_code  state_code  region                 city           funding_rounds  founded_at  first_funding_at  last_funding_at
0      /organization/-fame                                #fame                                       http://livfame.com                 Media                                              10000000           operating  IND           16          Mumbai                 Mumbai         1               NaN         2015-01-05        2015-01-05
1      /organization/-qounter                             :Qounter                                    http://www.qounter.com             Application Platforms|Real Time|Social Network...  700000             operating  USA           DE          DE - Other             Delaware City  2               2014-09-04  2014-03-01        2014-10-14
2      /organization/-the-one-of-them-inc-                (THE) ONE of THEM,Inc.                      http://oneofthem.jp                Apps|Games|Mobile                                  3406878            operating  NaN           NaN         NaN                    NaN            1               NaN         2014-01-30        2014-01-30
3      /organization/0-6-com                              0-6.com                                     http://www.0-6.com                 Curated Web                                        2000000            operating  CHN           22          Beijing                Beijing        1               2007-01-01  2008-03-19        2008-03-19
4      /organization/004-technologies                     004 Technologies                            http://004gmbh.de/en/004-interact  Software                                           -                  operating  USA           IL          Springfield, Illinois  Champaign      1               2010-01-01  2014-07-24        2014-07-24
...    ...                                                ...                                         ...                                ...                                                ...                ...        ...           ...         ...                    ...            ...             ...         ...               ...
66363  /organization/zznode-science-and-technology-co...  ZZNode Science and Technology               http://www.zznode.com              Enterprise Software                                1587301            operating  CHN           22          Beijing                Beijing        1               NaN         2012-04-01        2012-04-01
66364  /organization/zzzzapp-com                          Zzzzapp Wireless ltd.                       http://www.zzzzapp.com             Advertising|Mobile|Web Development|Wireless        114304             operating  HRV           15          Split                  Split          4               2012-05-13  2011-11-01        2014-03-01
66365  /organization/Áeron                                ÁERON                                       http://www.aeron.hu/               NaN                                                -                  operating  NaN           NaN         NaN                    NaN            1               2011-01-01  2014-08-01        2014-08-01
66366  /organization/Ôasys-2                              Ôasys                                       http://www.oasys.io/               Consumer Electronics|Internet of Things|Teleco...  18192              operating  USA           CA          SF Bay Area            San Francisco  1               2014-01-01  2015-01-01        2015-01-01
66367  /organization/İnovatiff-reklam-ve-tanıtım-hizm...  İnovatiff Reklam ve Tanıtım Hizmetleri Tic  http://inovatiff.com               Consumer Goods|E-Commerce|Internet                 14851              operating  NaN           NaN         NaN                    NaN            1               NaN         2013-10-01        2013-10-01

66368 rows × 14 columns
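
The preview above shows a few things worth handling before any analysis: `funding_total_usd` uses `-` where the amount is unknown, and the date columns contain missing values. Below is a minimal cleaning sketch, assuming these columns load as text. The helper name `clean_startups` is illustrative, and the three-row sample is hand-copied from the rows shown above rather than read from the CSV.

```python
import pandas as pd

# Small sample hand-copied from the preview above (not the full dataset).
sample = pd.DataFrame({
    "name": ["#fame", ":Qounter", "004 Technologies"],
    "funding_total_usd": ["10000000", "700000", "-"],
    "founded_at": [None, "2014-09-04", "2010-01-01"],
    "first_funding_at": ["2015-01-05", "2014-03-01", "2014-07-24"],
})


def clean_startups(df):
    """Return a copy with numeric funding and parsed date columns.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    out = df.copy()
    # "-" (and any other non-numeric text) becomes NaN instead of raising.
    out["funding_total_usd"] = pd.to_numeric(
        out["funding_total_usd"], errors="coerce"
    )
    # Missing or unparseable dates become NaT.
    for col in ("founded_at", "first_funding_at", "last_funding_at"):
        if col in out.columns:
            out[col] = pd.to_datetime(out[col], errors="coerce")
    return out


cleaned = clean_startups(sample)
print(cleaned.dtypes)
```

Even in this small sample, `:Qounter` has a first funding date (2014-03-01) earlier than its founding date (2014-09-04), which is the kind of inconsistency a check on the parsed dates can surface.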

Solution

  • This data will allow me to look into the characteristics of a large number of startup companies and look for similarities between the ones that succeed and the ones that fail.
  • I can then also use this information to make predictions about a new business based on some of the criteria above.
  • While this data clearly has some limitations, I think further analysis of it could still provide very useful metrics for this project.
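
One way the comparison between startups that succeed and those that fail might begin is by turning `status` into a label and comparing success rates across groups. The sketch below is only an assumption about how that could look. The preview shows only the `operating` status, so the values `acquired`, `ipo`, and `closed`, the labeling rule, the helper names, and the toy rows are all hypothetical.

```python
import pandas as pd

# Made-up toy rows for illustration only; they are NOT taken from the dataset.
toy = pd.DataFrame({
    "status": ["operating", "closed", "acquired", "ipo", "closed", "acquired"],
    "funding_rounds": [1, 1, 3, 4, 2, 2],
    "country_code": ["USA", "USA", "USA", "CHN", "IND", "IND"],
})

# Assumed labeling rule (not defined in the original write-up):
# 'acquired' and 'ipo' count as success, 'closed' as failure, and
# 'operating' stays unlabeled because its outcome is not yet known.
STATUS_TO_LABEL = {"acquired": 1, "ipo": 1, "closed": 0}


def add_success_label(df):
    """Add a 'succeeded' column (1, 0, or NaN when the outcome is unknown)."""
    out = df.copy()
    out["succeeded"] = out["status"].map(STATUS_TO_LABEL)
    return out


def success_rate_by(df, column):
    """Share of labeled startups that succeeded, grouped by `column`."""
    labeled = add_success_label(df).dropna(subset=["succeeded"])
    return labeled.groupby(column)["succeeded"].mean()


rates = success_rate_by(toy, "country_code")
print(rates)
```

Leaving `operating` companies unlabeled is a deliberate choice in this sketch: treating them as successes would inflate the success rate, since some of them may still close later.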