Sleep is essential to vigilance, learning ability, hand-eye coordination, mood, memory, and more, yet people of all socioeconomic backgrounds and ages struggle to get enough high-quality sleep each night. Factors like caffeine and alcohol consumption, exercise, early bedtimes, and sleep duration are often associated with sleep, but do they truly reflect sleep quality, or just duration?
The goal of this project is to estimate how much deep (quality) sleep a person can expect to receive based on how long they sleep, whether they have caffeine, when they get up in the morning, and how often they exercise.
If successful, this work may yield a regression model that predicts the amount of deep sleep (in hours) a person can expect to receive based on the features in the dataset.
A potential negative outcome of such a machine learning tool is that it may give people an inaccurate understanding of the health effects of factors like caffeine, alcohol, exercise, sleep duration, and bedtime, as these things impact each person uniquely and sleep quality depends on a variety of additional factors not counted here, including stress, noise pollution, and health.
This dataset includes the following features, which are factors in or measures of sleep:
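import pandas as pd
# Load the sleep efficiency dataset from a CSV file in the working directory
df_sleep_efficiency = pd.read_csv('sleep_efficiency.csv')
# Display the DataFrame (rendered as a table in a notebook)
df_sleep_efficiency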
 | ID | Age | Gender | Bedtime | Wakeup time | Sleep duration | Sleep efficiency | REM sleep percentage | Deep sleep percentage | Light sleep percentage | Awakenings | Caffeine consumption | Alcohol consumption | Smoking status | Exercise frequency |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 65 | Female | 2021-03-06 01:00:00 | 2021-03-06 07:00:00 | 6.0 | 0.88 | 18 | 70 | 12 | 0.0 | 0.0 | 0.0 | Yes | 3.0 |
1 | 2 | 69 | Male | 2021-12-05 02:00:00 | 2021-12-05 09:00:00 | 7.0 | 0.66 | 19 | 28 | 53 | 3.0 | 0.0 | 3.0 | Yes | 3.0 |
2 | 3 | 40 | Female | 2021-05-25 21:30:00 | 2021-05-25 05:30:00 | 8.0 | 0.89 | 20 | 70 | 10 | 1.0 | 0.0 | 0.0 | No | 3.0 |
3 | 4 | 40 | Female | 2021-11-03 02:30:00 | 2021-11-03 08:30:00 | 6.0 | 0.51 | 23 | 25 | 52 | 3.0 | 50.0 | 5.0 | Yes | 1.0 |
4 | 5 | 57 | Male | 2021-03-13 01:00:00 | 2021-03-13 09:00:00 | 8.0 | 0.76 | 27 | 55 | 18 | 3.0 | 0.0 | 3.0 | No | 3.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
447 | 448 | 27 | Female | 2021-11-13 22:00:00 | 2021-11-13 05:30:00 | 7.5 | 0.91 | 22 | 57 | 21 | 0.0 | 0.0 | 0.0 | No | 5.0 |
448 | 449 | 52 | Male | 2021-03-31 21:00:00 | 2021-03-31 03:00:00 | 6.0 | 0.74 | 28 | 57 | 15 | 4.0 | 25.0 | 0.0 | No | 3.0 |
449 | 450 | 40 | Female | 2021-09-07 23:00:00 | 2021-09-07 07:30:00 | 8.5 | 0.55 | 20 | 32 | 48 | 1.0 | NaN | 3.0 | Yes | 0.0 |
450 | 451 | 45 | Male | 2021-07-29 21:00:00 | 2021-07-29 04:00:00 | 7.0 | 0.76 | 18 | 72 | 10 | 3.0 | 0.0 | 0.0 | No | 3.0 |
451 | 452 | 18 | Male | 2021-03-17 02:30:00 | 2021-03-17 10:00:00 | 7.5 | 0.63 | 22 | 23 | 55 | 1.0 | 50.0 | 0.0 | No | 1.0 |
452 rows × 15 columns
The sleep features of interest include sleep efficiency, caffeine consumption, exercise frequency, and wakeup time. These features will be used in a regression model that predicts the amount of deep sleep in hours (sleep duration × deep sleep percentage / 100) that a person can expect to receive based on their lifestyle and sleep habits. These features are measured on different scales, so I would have to use scale normalization to keep any one feature from outweighing the others simply because of its units.
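As a rough sketch of the target computation and scale normalization described above, the snippet below uses four rows copied from the table so that it runs without the CSV. The column names come from the dataset; the names `deep_sleep_hours`, `wakeup_hour`, and `min_max_scale` are hypothetical, and min-max scaling is only one possible choice of normalization, assumed here for illustration.

```python
import pandas as pd

# Four rows copied from the table above (index 0, 1, 3, 4).
df = pd.DataFrame({
    "Wakeup time": ["2021-03-06 07:00:00", "2021-12-05 09:00:00",
                    "2021-11-03 08:30:00", "2021-03-13 09:00:00"],
    "Sleep duration": [6.0, 7.0, 6.0, 8.0],
    "Sleep efficiency": [0.88, 0.66, 0.51, 0.76],
    "Deep sleep percentage": [70, 28, 25, 55],
    "Caffeine consumption": [0.0, 0.0, 50.0, 0.0],
    "Exercise frequency": [3.0, 3.0, 1.0, 3.0],
})

# Target: hours of deep sleep. The percentage column holds whole-number
# percentages (e.g. 70), so divide by 100.
df["deep_sleep_hours"] = df["Sleep duration"] * df["Deep sleep percentage"] / 100

# Assumption: represent wakeup time as a decimal hour of the day (08:30 -> 8.5).
wakeup = pd.to_datetime(df["Wakeup time"])
df["wakeup_hour"] = wakeup.dt.hour + wakeup.dt.minute / 60

feature_cols = ["Sleep efficiency", "Caffeine consumption",
                "Exercise frequency", "wakeup_hour"]


def min_max_scale(features: pd.DataFrame) -> pd.DataFrame:
    """Rescale each column to the range [0, 1] (hypothetical helper)."""
    col_min = features.min()
    col_range = features.max() - col_min
    # A constant column would give a zero range and NaN after division.
    return (features - col_min) / col_range


X_scaled = min_max_scale(df[feature_cols])
y = df["deep_sleep_hours"]

print(X_scaled)
print(y)
```

On the full dataset, missing values (such as the NaN visible in the caffeine column) would need handling before scaling, and the scaling minimum and range would typically be computed on the training split only.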