Salary Negotiation

What's the problem?

As we enter the workforce (including co-ops and internships), a big question on a lot of people's minds is "How much will I be getting paid?" I think a better question is "How much can I realistically be paid?" Being equipped with this information allows applicants to better understand their worth and helps in potential salary negotiations.

It can be intimidating to negotiate salary, but with an understanding of industry standards and of the pay for similar jobs in different industries, some of the stress surrounding negotiations can be alleviated.

CNBC and Fidelity Investments claim that while 58% of working Americans accepted their initial job offer, 85% of the Americans who made attempts at negotiating their salaries were successful.

In [4]:
import pandas as pd
# latin-1 maps every byte to a character, so the read cannot fail on non-UTF-8 bytes
datajobs = pd.read_csv('datajobs.csv', encoding='latin-1')
In [6]:
datajobs
Out[6]:
company job title location job description salary estimate company_size company_type company_sector company_industry company_founded ... python_yn spark_yn azure_yn aws_yn excel_yn machine_learning_yn job_simpl seniority description_len company_age
0 Microsoft Data & Applied Scientist Redmond, WA Microsoft 365 is a key part of the company’s c... 123486 10000+ Employees Company - Public Information Technology Computer Hardware Development 1975.0 ... 1 0 0 1 0 1 data scientist junior 359 47.0
1 UT Southwestern Medical Center Data Scientist or Bioinformatician (remote) Remote Center Information:\nThe Quantitative Biomedic... 93500 10000+ Employees Hospital Healthcare Health Care Services & Hospitals 1943.0 ... 1 0 0 0 0 1 data scientist mid 267 79.0
2 Notion Data Scientist, Growth New York, NY About Us:\nWe're on a mission to make it possi... 137853 201 to 500 Employees Company - Private Information Technology Enterprise Software & Network Solutions 2016.0 ... 1 0 0 0 0 0 data scientist Senior 589 6.0
3 Net2Aspire Jr. Data Scientist Remote ? Apply Statistical and Machine Learning metho... 72500 Unknown Company - Public NaN NaN NaN ... 0 0 0 0 0 1 data scientist junior 132 NaN
4 Ntropy Network Data Scientist Remote Over the last few decades, technological innov... 155000 1 to 50 Employees Company - Private NaN NaN NaN ... 1 0 0 1 0 0 data scientist mid 522 NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2079 YouTube Staff Software Engineer, Machine Learning, You... San Bruno, CA Minimum qualifications:\nBachelor's degree or ... 141704 1001 to 5000 Employees Subsidiary or Business Segment Information Technology Internet & Web Services 2005.0 ... 0 0 0 0 0 1 machine learning engineer Senior 498 17.0
2080 Hunter Engineering Data Science Co-Op Bridgeton, MO Overview:\nDo you have a passion for data scie... 88383 1001 to 5000 Employees Company - Private Manufacturing Machinery Manufacturing 1946.0 ... 1 0 0 0 1 1 other Senior 349 76.0
2081 precision technologies corp Jr UI/UX Designer Training and Placement Remote If you want to start your IT career as a UI/UX... 70600 201 to 500 Employees Company - Private Information Technology Information Technology Support Services 2008.0 ... 1 0 0 1 1 1 other junior 391 14.0
2082 Argonne National Laboratory Postdoctoral Appointee - Probabilistic Machine... Lemont, IL The Mathematics and Computer Science Division ... 54291 1001 to 5000 Employees Government Management & Consulting Research & Development 1946.0 ... 0 0 0 0 0 1 machine learning engineer Senior 506 76.0
2083 Colossal Biosciences Graduate Research Fellow, Machine Learning – S... Dallas, TX The Machine Learning Graduate Research Fellow ... 66609 1 to 50 Employees Company - Private NaN NaN NaN ... 1 0 0 0 0 1 machine learning engineer Senior 376 NaN

2084 rows × 23 columns

This dataset includes information (company, title, description, location, requirements, salary, etc.) on just over 2,000 data-related jobs. My hope is to use the value_counts() function to find common words in the job description and job title columns, and to use these together with the other columns as features for estimating a job's expected salary.
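As a rough sketch of the word-counting idea, the snippet below lowercases a few job titles, strips punctuation, splits them into words, and counts the words with value_counts(). The five titles are copied from the first rows of the preview above. The cleaning regex and the variable names are my own assumptions, not a settled approach.

```python
import pandas as pd

# Job titles copied from the first five rows of the preview above
titles = pd.Series([
    "Data & Applied Scientist",
    "Data Scientist or Bioinformatician (remote)",
    "Data Scientist, Growth",
    "Jr. Data Scientist",
    "Data Scientist",
])

# Lowercase, replace anything that is not a letter or whitespace with a space,
# split on whitespace, then put each word on its own row
words = (
    titles.str.lower()
    .str.replace(r"[^a-z\s]", " ", regex=True)
    .str.split()
    .explode()
)

# Frequency of each word across all titles, most common first
word_counts = words.value_counts()
print(word_counts.head())
```

The same chain should work on datajobs['job title'] or datajobs['job description'], although the descriptions would probably also need common filler words ("the", "and", "or") removed before the counts become informative.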

In the end, if someone were to input a dataset of jobs in the format of these columns, the intended output would be each job's expected salary. The goal is to allow for a greater sense of salary transparency for the applicant and to provide information that could prove useful in negotiations.
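To make the intended input and output concrete, here is a deliberately simple baseline, not the final model: it looks up the median salary for a job's simplified title and seniority. The rows are the first five from the preview above, and the expected_salary helper is a hypothetical name introduced only for illustration.

```python
import pandas as pd

# First five rows of the preview above, reduced to three columns
jobs = pd.DataFrame({
    "job_simpl": ["data scientist"] * 5,
    "seniority": ["junior", "mid", "Senior", "junior", "mid"],
    "salary estimate": [123486, 93500, 137853, 72500, 155000],
})

# The seniority labels mix cases ("junior" vs "Senior"), so normalize them first
jobs["seniority"] = jobs["seniority"].str.lower()

# Median salary for each (simplified title, seniority) group
baseline = jobs.groupby(["job_simpl", "seniority"])["salary estimate"].median()

def expected_salary(job_simpl, seniority):
    """Hypothetical helper: return the group median as the expected salary."""
    return float(baseline.loc[(job_simpl, seniority.lower())])

print(expected_salary("data scientist", "junior"))
```

A group median ignores most of the columns, so it only sets a floor that any feature-based model should beat.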

In [7]:
data_dict = {'company':'company name',
             'job title':'job title',
             'location':'office location, when available',
             'job description':'all available details regarding the position',
             'salary estimate':'average annual salary',
             'company_size':'approximation of # of employees at company',
             'company_type':'classification (public, private, etc)',
             'company_sector':'job/company sector',
             'company_industry':'company industry',
             'company_founded':'year company was founded',
             'company_revenue':'company annual revenue',
             'hourly':'whether the pay is hourly or not',
             'rating':'company rating on glassdoor',
             'python_yn':'is python a required skill?',
             'spark_yn':'is spark a required skill?',
             'azure_yn':'is azure a required skill?',
             'aws_yn':'is aws a required skill?',
             'excel_yn':'is excel a required skill?',
             'machine_learning_yn':'is machine learning a required skill?',
             'job_simpl':'simplified job title (data scientist, machine learning engineer, other, etc)',
             'seniority':'position ranking in company hierarchy',
             'description_len':'length of job description (word count)',
             'company_age':'how many years the company has been in business'}
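A data dictionary like this can drift out of sync with the DataFrame, so a quick check of which columns lack a description may be worthwhile. The sketch below uses a hypothetical helper, undocumented_columns, and a small made-up excerpt so that it runs on its own. In the notebook, the call would presumably be undocumented_columns(datajobs.columns, data_dict).

```python
def undocumented_columns(columns, descriptions):
    """Return the column names that have no entry in the descriptions dict."""
    return [col for col in columns if col not in descriptions]

# Made-up excerpt for illustration; 'new_column' is a hypothetical column name
example_columns = ['company', 'salary estimate', 'new_column']
example_dict = {'company': 'company name',
                'salary estimate': 'average annual salary'}

missing = undocumented_columns(example_columns, example_dict)
print(missing)
```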