Men and women have long been paid different wages, even within the same department or position. It is important to measure how large this gap currently is and to determine whether, besides gender, a person's age, job title, or level of education has a significant impact on the disparity.
Glassdoor is the second largest job site in the U.S., following Indeed. It is a free digital platform that gathers information and reviews from current and former employees about companies, salaries, and job openings. As a result, the dataset covers a wide range of information and is updated yearly. The goal of this project is to identify which factors, in combination with gender, have the greatest impact on the wage disparity and how they affect it.
If we can accurately analyze the disparities found in each job title, it will be possible to identify and assess this issue in a different way, focusing on the new insights gained.
We will use a Kaggle dataset on the gender pay gap to observe how the following features affect the gap:
JobTitle | Gender | Age | PerfEval | Education | Dept | Seniority | BasePay | Bonus |
---|---|---|---|---|---|---|---|---|
Graphic Designer | Female | 18 | 5 | College | Operations | 2 | 42363 | 9938 |
Software Engineer | Male | 21 | 5 | College | Management | 5 | 108476 | 11128 |
Warehouse Associate | Female | 19 | 4 | PhD | Administration | 5 | 90208 | 9268 |
Software Engineer | Male | 20 | 5 | Masters | Sales | 4 | 108080 | 10154 |
Graphic Designer | Male | 26 | 5 | Masters | Engineering | 5 | 99464 | 9319 |
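import pandas as pd

# Load the Glassdoor gender pay gap dataset downloaded from Kaggle
df = pd.read_csv('glassdoor_gender_pay _gap.csv')
# Display the DataFrame (first and last rows, plus its shape)
df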
| | JobTitle | Gender | Age | PerfEval | Education | Dept | Seniority | BasePay | Bonus |
---|---|---|---|---|---|---|---|---|---|
0 | Graphic Designer | Female | 18 | 5 | College | Operations | 2 | 42363 | 9938 |
1 | Software Engineer | Male | 21 | 5 | College | Management | 5 | 108476 | 11128 |
2 | Warehouse Associate | Female | 19 | 4 | PhD | Administration | 5 | 90208 | 9268 |
3 | Software Engineer | Male | 20 | 5 | Masters | Sales | 4 | 108080 | 10154 |
4 | Graphic Designer | Male | 26 | 5 | Masters | Engineering | 5 | 99464 | 9319 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
995 | Marketing Associate | Female | 61 | 1 | High School | Administration | 1 | 62644 | 3270 |
996 | Data Scientist | Male | 57 | 1 | Masters | Sales | 2 | 108977 | 3567 |
997 | Financial Analyst | Male | 48 | 1 | High School | Operations | 1 | 92347 | 2724 |
998 | Financial Analyst | Male | 65 | 2 | High School | Administration | 1 | 97376 | 2225 |
999 | Financial Analyst | Male | 60 | 1 | PhD | Sales | 2 | 123108 | 2244 |
1000 rows × 9 columns
Because this data is reported by employees and employers, we may encounter some outliers in base pay. Even a single outlier could significantly change the size of the gap for certain job titles or departments.
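One possible way to spot such outliers is to flag base pay values that fall outside the interquartile-range (IQR) fences within each job title. The sketch below is only an illustration: the helper name `flag_basepay_outliers`, the multiplier `k=1.5`, and the toy rows are assumptions, not part of the original analysis or the Glassdoor data.

```python
import pandas as pd

def flag_basepay_outliers(df, group_col="JobTitle", value_col="BasePay", k=1.5):
    """Return a boolean Series marking rows outside the per-group IQR fences."""
    grouped = df.groupby(group_col)[value_col]
    # Broadcast each group's quartiles back onto its rows
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    return (df[value_col] < q1 - k * iqr) | (df[value_col] > q3 + k * iqr)

# Made-up rows purely for illustration; not taken from the Glassdoor dataset
toy = pd.DataFrame({
    "JobTitle": ["Analyst"] * 5 + ["Designer"] * 5,
    "BasePay": [50000, 51000, 52000, 53000, 200000,
                40000, 41000, 42000, 43000, 44000],
})
toy["is_outlier"] = flag_basepay_outliers(toy)
print(toy[toy["is_outlier"]])
```

Grouping by job title keeps a naturally high-paying role from being flagged just because other roles pay less. Whether flagged rows should be dropped, capped, or kept is a separate judgment call.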
We will first clean the data by dropping the performance evaluation and bonus pay columns, then look at the gender wage gap as a whole. Next, we will identify the five job titles and the five departments where the gap is largest and smallest, and create a bar chart to visualize the gap. Finally, we will use a scatter plot to show whether there is any relationship between education level and salary.
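The first steps of this plan could be sketched roughly as follows. This is a minimal illustration under stated assumptions: the gap is defined here as the difference between mean male and mean female base pay, expressed as a percentage of mean male pay (the text does not fix a definition), the helper `pay_gap_by` is hypothetical, and the toy rows are made up so the example runs without the CSV file.

```python
import pandas as pd

def pay_gap_by(df, group_col):
    """Mean BasePay per gender within each group, plus the gap as a % of male pay."""
    means = df.pivot_table(index=group_col, columns="Gender",
                           values="BasePay", aggfunc="mean")
    # Assumed gap definition: (male mean - female mean) / male mean
    means["gap_pct"] = (means["Male"] - means["Female"]) / means["Male"] * 100
    return means.sort_values("gap_pct", ascending=False)

# Made-up rows purely for illustration; not taken from the Glassdoor dataset
toy = pd.DataFrame({
    "JobTitle": ["Analyst", "Analyst", "Designer", "Designer"],
    "Gender": ["Female", "Male", "Female", "Male"],
    "PerfEval": [3, 4, 5, 2],
    "BasePay": [90000, 100000, 60000, 80000],
    "Bonus": [5000, 6000, 4000, 3000],
})

# Step 1: drop the columns that are not part of this analysis
clean = toy.drop(columns=["PerfEval", "Bonus"])

# Step 2: gap across the whole dataset
overall = clean.groupby("Gender")["BasePay"].mean()
overall_gap_pct = (overall["Male"] - overall["Female"]) / overall["Male"] * 100

# Step 3: gap per job title, largest first
by_title = pay_gap_by(clean, "JobTitle")
print(round(overall_gap_pct, 2))
print(by_title)
```

On the real data, `by_title.head(5)` and `by_title.tail(5)` would give the job titles with the largest and smallest gaps, and the same helper can be called with `"Dept"` for departments. With matplotlib installed, `by_title["gap_pct"].plot.bar()` is one way to draw the bar chart.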