The Effect of Alcohol Consumption on Student Lives

Motivation:

Problem:

Drinking is prevalent in youth culture, but how much does alcohol consumption affect other aspects of students' lives?

Solution:

Identify how alcohol consumption relates to other factors in students' lives, such as grades, family relationships, free time, and health. This dataset could also show how factors such as parents' jobs, parents' education, and family relationships affect a student's success in school.

Impact:

If alcohol consumption correlates with other aspects of a student's life, the findings can make current students more aware of how alcohol may affect them. We may also be able to predict whether a student drinks alcohol given certain factors, or even determine whether those factors make a student more likely to start drinking.

Dataset

Detail

We will use the Kaggle Student Alcohol Consumption dataset to observe the following variables in a student's life and see whether they relate to the student's alcohol consumption:

  • school: (binary, 'GP' - Gabriel Pereira or 'MS' - Mousinho da Silveira)
  • sex: M or F
  • age: age (15 - 22)
  • Pstatus: parent status (binary: 'T' - together or 'A' - apart)
  • Dalc: workday alcohol consumption (numeric: from 1 - very low to 5 - very high)
  • Walc: Weekend alcohol consumption (numeric: from 1 - very low to 5 - very high)
  • famrel: quality of family relationships (numeric: from 1 - very bad to 5 - excellent)
  • romantic: with a romantic relationship (binary: yes or no)
  • health: Current health status (numeric: from 1 - very bad to 5 - very good)
  • goout: Going out with friends (numeric: from 1 - very low to 5 - very high)
  • activities: Extra-curricular activities (binary: yes or no)
  • famsup: Family educational support (binary: yes or no)
  • failures: Number of past class failures (numeric: n)
  • studytime: Weekly study time (numeric: 1 - <2 hours, 2 - 2 to 5 hours, 3 - 5 to 10 hours, or 4 - >10 hours)
  • Medu: Mother's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education, or 4 - higher education)
  • Fjob: Father's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home', or 'other')

In [3]:
import pandas as pd

# Load the dataset file into a DataFrame
df = pd.read_csv('student-mat.csv')
df.head()  # preview the first five rows
Out[3]:
   school  sex  age  address  famsize  Pstatus  Medu  Fedu     Mjob      Fjob  ...  famrel  freetime  goout  Dalc  Walc  health  absences  G1  G2  G3
0      GP    F   18        U      GT3        A     4     4  at_home   teacher  ...       4         3      4     1     1       3         6   5   6   6
1      GP    F   17        U      GT3        T     1     1  at_home     other  ...       5         3      3     1     1       3         4   5   5   6
2      GP    F   15        U      LE3        T     1     1  at_home     other  ...       4         3      2     2     3       3        10   7   8  10
3      GP    F   15        U      GT3        T     4     2   health  services  ...       3         2      2     1     1       5         2  15  14  15
4      GP    F   16        U      GT3        T     3     3    other     other  ...       4         3      2     1     2       5         4   6  10  10

5 rows × 33 columns

Potential Problems:

The dataset contains many variables, so it may be difficult to determine which variables affect one another and which are relevant to the questions we are trying to answer.

Method:

We can use a linear regression model to measure the strength of the relationship between two variables. The fitted model lets us estimate the value of the dependent variable for a given value of the independent variable. For example, we can test whether there is a relationship between alcohol consumption and grades, or between alcohol consumption and family relationships.