This dataset contains the 100 most-followed Instagram accounts, ranked by follower count, with 7 attributes each. The goal of this project is to determine how much impact hashtags have on the popularity of an Instagram account.
import pandas as pd
ig_df = pd.read_csv('most_followed_ig.csv', index_col='RANK', encoding='latin-1')
ig_df
RANK | BRAND | CATEGORIES 1 | CATEGORIES 2 | FOLLOWERS | ER | iPOSTS ON HASHTAG | MEDIA POSTED |
---|---|---|---|---|---|---|---|
1 | Selena Gomez | celebrities | musicians | 105.4Mæ(=) | 2.62%æ(1342) | 14.5Mæ(48) | 1.2kæ(2135) |
2 | Taylor Swift | celebrities | musicians | 95.2Mæ(=) | 1.96%æ(2040) | 10.5Mæ(66) | 958æ(2669) |
3 | Ariana Grande | celebrities | musicians | 92.3Mæ(=) | 1.43%æ(2759) | 16.9Mæ(41) | 2.8kæ(824) |
4 | Beyonce | celebrities | musicians | 90.6Mæ(=) | 2.53%æ(1427) | 9.2Mæ(70) | 1.4kæ(1897) |
5 | Kim Kardashian West | celebrities | tv | 89.3Mæ(=) | 1.39%æ(2812) | 5.1Mæ(130) | 3.6kæ(550) |
... | ... | ... | ... | ... | ... | ... | ... |
96 | DanialvesD2 My Twitter | celebrities | athletes | 11.7Mæ(=) | 1.62%æ(2477) | 122.4kæ(1486) | 1.7kæ(1508) |
97 | Dolce & Gabbana | fashion | luxury | 11.7Mæ(=) | 0.48%æ(4142) | 6.1Mæ(105) | 3.9kæ(471) |
98 | Tyga / T-Raww | celebrities | musicians | 11.6Mæ(=) | 1.31%æ(2922) | 1.2Mæ(421) | 2.5kæ(948) |
99 | Paul Labile Pogba | celebrities | athletes | 11.5Mæ(=) | 6.11%æ(170) | 77.6kæ(1745) | 396æ(4219) |
100 | Barack Obama | celebrities | political | 11.5Mæ(=) | 3.37%æ(826) | 2.5Mæ(240) | 231æ(4753) |
100 rows × 7 columns
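The numeric columns in the output above are still strings such as `105.4Mæ(=)`, so they need converting before any analysis. The sketch below is one possible way to do that. It assumes that the stray `æ` is an artifact of reading the file as `latin-1` and that the parenthesised part after it is a secondary annotation that can be dropped; neither point is documented in the dataset. The helper names `parse_count` and `parse_percent` are hypothetical.

```python
import re

# Multipliers for the magnitude suffixes that appear in the displayed rows.
_SUFFIX = {"": 1.0, "k": 1e3, "M": 1e6}


def parse_count(raw: str) -> float:
    """Convert a cell such as '105.4Mæ(=)' or '958æ(2669)' to a plain number.

    Assumption: everything from the stray 'æ' onward is an annotation
    that is not needed for the analysis, so it is discarded.
    """
    value = raw.split("æ", 1)[0].strip()
    match = re.fullmatch(r"([\d.]+)([kM]?)", value)
    if match is None:
        raise ValueError(f"unrecognised count format: {raw!r}")
    number, suffix = match.groups()
    return float(number) * _SUFFIX[suffix]


def parse_percent(raw: str) -> float:
    """Convert a cell such as '2.62%æ(1342)' to a fraction (about 0.0262)."""
    value = raw.split("æ", 1)[0].strip()
    return float(value.rstrip("%")) / 100.0
```

With the DataFrame loaded as above, the helpers could be applied column by column, for example `ig_df['FOLLOWERS'].map(parse_count)` or `ig_df['ER'].map(parse_percent)`.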
# Data dictionary: maps each column name to a short description of the feature.
# Keys match the DataFrame column names exactly (note the space in 'CATEGORIES 1').
col_dict = {'BRAND': 'Name of instagram account',
            'CATEGORIES 1': 'category of account',
            'CATEGORIES 2': 'specific field/profession of account',
            'FOLLOWERS': 'amount of followers',
            'ER': 'N/A',  # meaning not documented in the dataset
            'iPOSTS ON HASHTAG': 'amount of posts with hashtags',
            'MEDIA POSTED': 'amount of posts posted'}
col_dict
{'BRAND': 'Name of instagram account', 'CATEGORIES 1': 'category of account', 'CATEGORIES 2': 'specific field/profession of account', 'FOLLOWERS': 'amount of followers', 'ER': 'N/A', 'iPOSTS ON HASHTAG': 'amount of posts with hashtags', 'MEDIA POSTED': 'amount of posts posted'}
This data is sufficient to answer the question because the 'iPOSTS ON HASHTAG', 'MEDIA POSTED', and 'FOLLOWERS' attributes make it possible to test whether hashtag use correlates with follower count, which serves here as the measure of popularity.
To do this, we would build a subset of the ranked accounts restricted to the three attributes mentioned above. Visualizing that subset would help determine whether followers and hashtags are correlated.
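As a rough illustration of the correlation check described above, the sketch below computes a Pearson coefficient in plain Python. The function name `pearson` is hypothetical, and the sample values are the 'FOLLOWERS' and 'iPOSTS ON HASHTAG' figures from the first five displayed rows, already converted to numbers. Five rows are far too few to draw a conclusion; the full analysis would use all 100 accounts.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-deviations and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Values taken from ranks 1-5 of the table (illustrative subset only).
followers = [105.4e6, 95.2e6, 92.3e6, 90.6e6, 89.3e6]
hashtag_posts = [14.5e6, 10.5e6, 16.9e6, 9.2e6, 5.1e6]

r = pearson(followers, hashtag_posts)
print(f"Pearson r over the five sample rows: {r:.3f}")
```

Once the columns are numeric, pandas offers the same calculation through `Series.corr`. Because follower counts are heavily skewed, a rank-based method such as `method='spearman'` may be worth comparing against.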
One loosely related research paper is “The Effect of #Enhancement-Free Instagram Images and Hashtags on Women's Body Image” (Tiggemann et al., 2019).
It describes women's engagement with Instagram posts with and without body-enhancement hashtags, and reports that, for posts in which a woman's body was enhanced, more hashtags (and more engagement) correlated with higher dissatisfaction with the subject. This shows the impact that hashtags can have on a population. Analyzing how certain accounts gain popularity, and what they promote, may provide further insight into how powerful hashtags are at influencing a population.
Additionally, hashtags are widely regarded as a useful and powerful tool for gaining a following and popularity. It would be interesting to understand how heavily the most popular accounts on Instagram rely on them.
CITATIONS
Tiggemann, Marika, et al. “The Effect of #Enhancement-Free Instagram Images and Hashtags on Women's Body Image.” Body Image, Elsevier, 9 Oct. 2019, https://www.sciencedirect.com/science/article/pii/S1740144519300981.