Exploring U.S. Software Developer Salaries: A Comprehensive Analysis

Introduction:

Software developers are in high demand, and the tech industry is growing rapidly. The U.S. is one of the largest markets for software development, so understanding what developers earn matters to both employers and employees. The U.S. Software Developer Salaries dataset on Kaggle collects salary information for software developers in the United States. It contains information on over 16,000 software developers across various states, industries, and experience levels. In this project, we explore the dataset to gain insight into the factors that affect software developer salaries in the U.S.

Objectives:

- To explore the U.S. Software Developer Salaries dataset on Kaggle
- To analyze the distribution of software developer salaries across various states, industries, and experience levels
- To identify the factors that impact software developer salaries in the U.S.
- To build a predictive model to estimate software developer salaries based on various factors such as experience, education, industry, and location

Methodology:

The project will begin with an exploratory data analysis of the U.S. Software Developer Salaries dataset on Kaggle. We will examine the distribution of salaries across various states, industries, and experience levels, and identify any outliers or patterns in the data. We will then perform statistical analysis to identify the factors that impact software developer salaries, such as experience, education, industry, and location. We will use visualization tools such as bar plots, histograms, and scatterplots to communicate our findings. Finally, we will build a predictive model to estimate software developer salaries based on the identified factors.
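The outlier check mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual analysis: the six rows are made-up toy values, the column names `state` and `salary` are borrowed from the dataset key in this notebook, and the 1.5 × IQR rule is one common convention chosen here as an assumption.

```python
import pandas as pd

# Toy data, invented for illustration only (not taken from the Kaggle dataset).
toy = pd.DataFrame({
    "state": ["OH", "OH", "WA", "WA", "NC", "NC"],
    "salary": [90000, 95000, 100000, 105000, 110000, 300000],
})

# Median salary per state: a quick look at how pay varies by location.
median_by_state = toy.groupby("state")["salary"].median()

# Flag outliers with the 1.5 * IQR rule (an assumed convention,
# not a method stated in this notebook).
q1 = toy["salary"].quantile(0.25)
q3 = toy["salary"].quantile(0.75)
iqr = q3 - q1
is_outlier = (toy["salary"] < q1 - 1.5 * iqr) | (toy["salary"] > q3 + 1.5 * iqr)
outliers = toy[is_outlier]

print(median_by_state)
print(outliers)
```

The median is used instead of the mean because a single extreme salary, like the last toy row, would otherwise dominate a small group.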

Expected Outcomes:

The expected outcomes of this project include:

- An in-depth understanding of the distribution of software developer salaries across various states, industries, and experience levels in the U.S.
- Identification of the factors that impact software developer salaries in the U.S.
- A predictive model to estimate software developer salaries based on various factors such as experience, education, industry, and location.
- Insights for employers and employees on the current state of software developer salaries in the U.S.

Conclusion:

Software development is a critical industry that is constantly evolving. The U.S. Software Developer Salaries dataset on Kaggle provides a valuable resource for understanding the salaries of software developers in the U.S. By analyzing this dataset, we can gain insights into the factors that impact software developer salaries and build a predictive model to estimate salaries based on various factors. This project aims to contribute to the understanding of software developer salaries in the U.S. and provide insights for employers and employees alike.

Key:

state: The state where the software developer works

industry: The industry where the software developer works

sub_industry: The sub-industry where the software developer works

job_title: The job title of the software developer

years_of_experience: The number of years of experience of the software developer

size_of_company: The size of the company where the software developer works (number of employees)

salary: The annual salary of the software developer in USD

This dataset will be used to explore the factors that impact software developer salaries in the U.S. We can use statistical analysis and visualization tools to identify the relationships between the various features and the salary of software developers. Eventually, we can use machine learning methods to build a predictive model to estimate software developer salaries based on various factors such as experience, education, industry, and location.
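As a rough sketch of the "relationship, then predictive model" idea described above, the snippet below fits a straight line of salary against experience. Everything here is an assumption made for illustration: the five rows are invented toy values, only the column names `years_of_experience` and `salary` come from the key, and a one-variable least-squares line stands in for whatever model the project eventually uses.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only (not taken from the Kaggle dataset).
toy = pd.DataFrame({
    "years_of_experience": [1, 3, 5, 8, 10],
    "salary": [65000, 75000, 85000, 100000, 110000],
})

# Pearson correlation between experience and salary.
corr = toy["years_of_experience"].corr(toy["salary"])

# Least-squares straight line: salary ~ slope * years + intercept.
slope, intercept = np.polyfit(toy["years_of_experience"], toy["salary"], deg=1)


def predict_salary(years):
    """Estimate salary from years of experience using the fitted line."""
    return slope * years + intercept


print(f"correlation: {corr:.2f}")
print(f"predicted salary at 6 years: {predict_salary(6):.0f}")
```

A real model would need more features (industry, location, company size) and a held-out test set. The single-feature line is only meant to show the shape of the workflow.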

In [1]:
import pandas as pd

# Load the dataset
df = pd.read_csv('SofwareDeveloperIncomeExpensesperUSACity.csv')


# Show the first 5 rows of the dataset
print(df.head())
   Unnamed: 0                              Metro  \
0           0                       Columbus, OH   
1           1        Seattle-Tacoma-Bellevue, WA   
2           2  Charlotte-Concord-Gastonia, NC-SC   
3           3               Colorado Springs, CO   
4           4                         Dayton, OH   

   Mean Software Developer Salary (adjusted)  \
0                                   117552.0   
1                                   117323.0   
2                                   114122.0   
3                                   112118.0   
4                                   111616.0   

   Mean Software Developer Salary (unadjusted)  \
0                                     108500.0   
1                                     131167.0   
2                                     107046.0   
3                                     111670.0   
4                                      99338.0   

   Mean Unadjusted Salary (all occupations)  \
0                                   51260.0   
1                                   65400.0   
2                                   51000.0   
3                                   51430.0   
4                                   50100.0   

   Number of Software Developer Jobs  Median Home Price                  City  \
0                            13430.0           192000.0          Columbus, OH   
1                            65760.0           491600.0           Seattle, WA   
2                            12800.0           208500.0         Charlotte, NC   
3                             5780.0           296500.0  Colorado Springs, CO   
4                             4240.0           124100.0            Dayton, OH   

   Cost of Living avg  Rent avg  Cost of Living Plus Rent avg  \
0               984.8    1421.5                        2856.5   
1              1250.7    2528.2                        4091.5   
2               989.9    1974.5                        3221.1   
3              1049.2    1594.0                        3094.5   
4               961.2    1072.1                        2586.0   

   Local Purchasing Power avg  
0                      9335.4  
1                      8971.3  
2                      8939.8  
3                      8493.1  
4                      4887.7
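The printed frame includes an `Unnamed: 0` column, which looks like a row index that was written into the CSV. A minimal cleanup sketch follows, using only the five rows printed above, re-entered by hand so that it runs without the file. The derived column name `dev_to_all_ratio` is hypothetical and introduced here purely for illustration.

```python
import pandas as pd

# The five rows shown in the output above, re-entered by hand (subset of columns).
sample = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3, 4],
    "Metro": [
        "Columbus, OH",
        "Seattle-Tacoma-Bellevue, WA",
        "Charlotte-Concord-Gastonia, NC-SC",
        "Colorado Springs, CO",
        "Dayton, OH",
    ],
    "Mean Software Developer Salary (unadjusted)": [
        108500.0, 131167.0, 107046.0, 111670.0, 99338.0,
    ],
    "Mean Unadjusted Salary (all occupations)": [
        51260.0, 65400.0, 51000.0, 51430.0, 50100.0,
    ],
})

# Drop the leftover index column; it duplicates the DataFrame's own index.
sample = sample.drop(columns=["Unnamed: 0"])

# Hypothetical derived feature: developer pay relative to all occupations.
sample["dev_to_all_ratio"] = (
    sample["Mean Software Developer Salary (unadjusted)"]
    / sample["Mean Unadjusted Salary (all occupations)"]
)

print(sample[["Metro", "dev_to_all_ratio"]].round(2))
```

When loading the full file, passing `index_col=0` to `pd.read_csv` should have the same effect as the `drop` call, assuming the first column really is just a saved index.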