Drinking is prevalent in youth culture, but how much does drinking alcohol affect other aspects of students' lives?
We want to identify the relationship between a student's alcohol consumption and other factors in their life, such as grades, family relationships, free time, and health. This dataset could also show how factors such as parents' jobs, parents' education, and family relationships may affect a student's success in school.
If drinking alcohol is correlated with other aspects of a student's life, knowing this will allow current students to be more cognisant of how alcohol may affect them. We may also be able to predict whether a student drinks alcohol given certain factors, or even determine whether these factors lead to a student eventually drinking alcohol.
We will use the Student Alcohol Consumption dataset from Kaggle to observe the following variables in a student's life and see whether they affect the student's alcohol consumption:
import pandas as pd
df = pd.read_csv('student-mat.csv')
df.head()
 | school | sex | age | address | famsize | Pstatus | Medu | Fedu | Mjob | Fjob | ... | famrel | freetime | goout | Dalc | Walc | health | absences | G1 | G2 | G3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | GP | F | 18 | U | GT3 | A | 4 | 4 | at_home | teacher | ... | 4 | 3 | 4 | 1 | 1 | 3 | 6 | 5 | 6 | 6 |
1 | GP | F | 17 | U | GT3 | T | 1 | 1 | at_home | other | ... | 5 | 3 | 3 | 1 | 1 | 3 | 4 | 5 | 5 | 6 |
2 | GP | F | 15 | U | LE3 | T | 1 | 1 | at_home | other | ... | 4 | 3 | 2 | 2 | 3 | 3 | 10 | 7 | 8 | 10 |
3 | GP | F | 15 | U | GT3 | T | 4 | 2 | health | services | ... | 3 | 2 | 2 | 1 | 1 | 5 | 2 | 15 | 14 | 15 |
4 | GP | F | 16 | U | GT3 | T | 3 | 3 | other | other | ... | 4 | 3 | 2 | 1 | 2 | 5 | 4 | 6 | 10 | 10 |
5 rows × 33 columns
One problem we may run into is that the dataset has many variables (33 columns), so it may be difficult to determine which variables affect one another and which ones are relevant to the questions we are trying to answer.
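One possible way to narrow down the variables is to rank the numeric columns by how strongly they correlate with a drinking column such as `Walc`. The sketch below is only an illustration: `sample` holds just the five preview rows shown above (a few numeric columns), and the helper name `rank_by_correlation` is our own, not part of pandas. With so few rows the numbers are not meaningful; on the real data we would pass the full `df` instead.

```python
import pandas as pd

# Illustrative data only: the five preview rows from df.head(),
# restricted to a few numeric columns. Use the full DataFrame in practice.
sample = pd.DataFrame({
    "famrel":   [4, 5, 4, 3, 4],
    "goout":    [4, 3, 2, 2, 2],
    "Dalc":     [1, 1, 2, 1, 1],
    "Walc":     [1, 1, 3, 1, 2],
    "health":   [3, 3, 3, 5, 5],
    "absences": [6, 4, 10, 2, 4],
    "G3":       [6, 6, 10, 15, 10],
})


def rank_by_correlation(frame, target):
    """Return Pearson correlations with `target`, strongest (by magnitude) first.

    Hypothetical helper for this proposal; it only looks at numeric columns.
    """
    numeric = frame.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


ranking = rank_by_correlation(sample, "Walc")
print(ranking)
```

Correlation only captures linear, pairwise association, so this is a first screening step rather than evidence that one variable affects another.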
We can use a linear regression model to measure the strength of the relationship between two variables. The fitted model lets us estimate the value of the dependent variable for a given value of the independent variable. For example, we could determine whether there is a relationship between alcohol consumption and grades, or between alcohol consumption and family relationships.
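As a minimal sketch of this idea, the code below fits a straight line of final grade (`G3`) on weekend alcohol consumption (`Walc`) using ordinary least squares via `numpy.polyfit`. The choice of these two columns, the helper name `fit_line`, and the use of only the five preview rows are assumptions made for illustration; the real analysis would use the full `df` and could just as well use another regression library.

```python
import numpy as np
import pandas as pd

# Illustrative data only: Walc and G3 from the five preview rows of df.head().
sample = pd.DataFrame({
    "Walc": [1, 1, 3, 1, 2],
    "G3":   [6, 6, 10, 15, 10],
})


def fit_line(frame, x_col, y_col):
    """Fit y = slope * x + intercept by least squares.

    Hypothetical helper; also returns the Pearson correlation r
    as a measure of how strong the linear relationship is.
    """
    x = frame[x_col].to_numpy(dtype=float)
    y = frame[y_col].to_numpy(dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r


slope, intercept, r = fit_line(sample, "Walc", "G3")

# Estimate the dependent variable (G3) for a chosen value of the
# independent variable (Walc = 4).
predicted_g3 = slope * 4 + intercept
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}")
print(f"predicted G3 at Walc=4: {predicted_g3:.2f}")
```

The slope describes how the predicted grade changes per unit of `Walc`, while `r` (between -1 and 1) indicates how well a straight line describes the data. Five rows are far too few to draw any conclusion, so these values only show the mechanics.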