Knowing what makes people romantically compatible in an ever-changing culture and environment is difficult. Knowing yourself and what you want is similarly challenging. However, with aggregated data, we can draw broad conclusions about dating preferences. Dating is hard. Let's make it easier.
This study uses questionnaire data and outcomes from 4-minute speed dates. Before and after each date, participants rated themselves and their partner on a range of qualities (personality, attractiveness, interests, etc.). Combined with the date outcome and self-assessed expectations, these ratings provide a basis for studying which factors correlate with attraction. The goal of this project is to identify the factors that predict attraction on an early date.
If successful, we will be able to predict matches from features such as similarity, personality traits, and demographics. This will reveal broader truths as well as general tips for early dates. Online dating websites could also use these conclusions to create better matches.
Negative outcome: we draw an incorrect conclusion that misleads readers.
We will use a Kaggle speed dating dataset with the following features:
has_null | wave | gender | age | age_o | d_age | d_d_age | race | race_o | samerace | importance_same_race | importance_same_religion | d_importance_same_race | d_importance_same_religion | field | pref_o_attractive | pref_o_sincere | pref_o_intelligence | pref_o_funny | pref_o_ambitious | pref_o_shared_interests | d_pref_o_attractive | d_pref_o_sincere | d_pref_o_intelligence | d_pref_o_funny | d_pref_o_ambitious | d_pref_o_shared_interests | attractive_o | sinsere_o | intelligence_o | funny_o | ambitous_o | shared_interests_o | d_attractive_o | d_sinsere_o | d_intelligence_o | d_funny_o | d_ambitous_o | d_shared_interests_o | attractive_important | sincere_important | intellicence_important | funny_important | ambtition_important | shared_interests_important | d_attractive_important | d_sincere_important | d_intellicence_important | d_funny_important | d_ambtition_important | d_shared_interests_important | attractive | sincere | intelligence | funny | ambition | d_attractive | d_sincere | d_intelligence | d_funny | d_ambition | attractive_partner | sincere_partner | intelligence_partner | funny_partner | ambition_partner | shared_interests_partner | d_attractive_partner | d_sincere_partner | d_intelligence_partner | d_funny_partner | d_ambition_partner | d_shared_interests_partner | sports | tvsports | exercise | dining | museums | art | hiking | gaming | clubbing | reading | tv | theater | movies | concerts | music | shopping | yoga | d_sports | d_tvsports | d_exercise | d_dining | d_museums | d_art | d_hiking | d_gaming | d_clubbing | d_reading | d_tv | d_theater | d_movies | d_concerts | d_music | d_shopping | d_yoga | interests_correlate | d_interests_correlate | expected_happy_with_sd_people | expected_num_interested_in_me | expected_num_matches | d_expected_happy_with_sd_people | d_expected_num_interested_in_me | d_expected_num_matches | like | guess_prob_liked | d_like | d_guess_prob_liked | met | decision | decision_o | match |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
b'' | 1.0 | b'female' | 21.0 | 27.0 | 6.0 | b'[4-6]' | b'Asian/Pacific Islander/Asian-American' | b'European/Caucasian-American' | b'0' | 2.0 | 4.0 | b'[2-5]' | b'[2-5]' | b'Law' | 35.0 | 20.0 | 20.0 | 20.0 | 0.0 | 5.0 | b'[21-100]' | b'[16-20]' | b'[16-20]' | b'[16-20]' | b'[0-15]' | b'[0-15]' | 6.0 | 8.0 | 8.0 | 8.0 | 8.0 | 6.0 | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | 15.0 | 20.0 | 20.0 | 15.0 | 15.0 | 15.0 | b'[0-15]' | b'[16-20]' | b'[16-20]' | b'[0-15]' | b'[0-15]' | b'[0-15]' | 6.0 | 8.0 | 8.0 | 8.0 | 7.0 | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | 6.0 | 9.0 | 7.0 | 7.0 | 6.0 | 5.0 | b'[6-8]' | b'[9-10]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[0-5]' | 9.0 | 2.0 | 8.0 | 9.0 | 1.0 | 1.0 | 5.0 | 1.0 | 5.0 | 6.0 | 9.0 | 1.0 | 10.0 | 10.0 | 9.0 | 8.0 | 1.0 | b'[9-10]' | b'[0-5]' | b'[6-8]' | b'[9-10]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[6-8]' | b'[9-10]' | b'[0-5]' | b'[9-10]' | b'[9-10]' | b'[9-10]' | b'[6-8]' | b'[0-5]' | 0.14 | b'[0-0.33]' | 3.0 | 2.0 | 4.0 | b'[0-4]' | b'[0-3]' | b'[3-5]' | 7.0 | 6.0 | b'[6-8]' | b'[5-6]' | 0.0 | b'1' | b'0' | b'0' |
b'' | 1.0 | b'female' | 21.0 | 22.0 | 1.0 | b'[0-1]' | b'Asian/Pacific Islander/Asian-American' | b'European/Caucasian-American' | b'0' | 2.0 | 4.0 | b'[2-5]' | b'[2-5]' | b'Law' | 60.0 | 0.0 | 0.0 | 40.0 | 0.0 | 0.0 | b'[21-100]' | b'[0-15]' | b'[0-15]' | b'[21-100]' | b'[0-15]' | b'[0-15]' | 7.0 | 8.0 | 10.0 | 7.0 | 7.0 | 5.0 | b'[6-8]' | b'[6-8]' | b'[9-10]' | b'[6-8]' | b'[6-8]' | b'[0-5]' | 15.0 | 20.0 | 20.0 | 15.0 | 15.0 | 15.0 | b'[0-15]' | b'[16-20]' | b'[16-20]' | b'[0-15]' | b'[0-15]' | b'[0-15]' | 6.0 | 8.0 | 8.0 | 8.0 | 7.0 | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | 7.0 | 8.0 | 7.0 | 8.0 | 5.0 | 6.0 | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[0-5]' | b'[6-8]' | 9.0 | 2.0 | 8.0 | 9.0 | 1.0 | 1.0 | 5.0 | 1.0 | 5.0 | 6.0 | 9.0 | 1.0 | 10.0 | 10.0 | 9.0 | 8.0 | 1.0 | b'[9-10]' | b'[0-5]' | b'[6-8]' | b'[9-10]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[6-8]' | b'[9-10]' | b'[0-5]' | b'[9-10]' | b'[9-10]' | b'[9-10]' | b'[6-8]' | b'[0-5]' | 0.54 | b'[0.33-1]' | 3.0 | 2.0 | 4.0 | b'[0-4]' | b'[0-3]' | b'[3-5]' | 7.0 | 5.0 | b'[6-8]' | b'[5-6]' | 1.0 | b'1' | b'0' | b'0' |
b'' | 1.0 | b'female' | 21.0 | 22.0 | 1.0 | b'[0-1]' | b'Asian/Pacific Islander/Asian-American' | b'Asian/Pacific Islander/Asian-American' | b'1' | 2.0 | 4.0 | b'[2-5]' | b'[2-5]' | b'Law' | 19.0 | 18.0 | 19.0 | 18.0 | 14.0 | 12.0 | b'[16-20]' | b'[16-20]' | b'[16-20]' | b'[16-20]' | b'[0-15]' | b'[0-15]' | 10.0 | 10.0 | 10.0 | 10.0 | 10.0 | 10.0 | b'[9-10]' | b'[9-10]' | b'[9-10]' | b'[9-10]' | b'[9-10]' | b'[9-10]' | 15.0 | 20.0 | 20.0 | 15.0 | 15.0 | 15.0 | b'[0-15]' | b'[16-20]' | b'[16-20]' | b'[0-15]' | b'[0-15]' | b'[0-15]' | 6.0 | 8.0 | 8.0 | 8.0 | 7.0 | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | 5.0 | 8.0 | 9.0 | 8.0 | 5.0 | 7.0 | b'[0-5]' | b'[6-8]' | b'[9-10]' | b'[6-8]' | b'[0-5]' | b'[6-8]' | 9.0 | 2.0 | 8.0 | 9.0 | 1.0 | 1.0 | 5.0 | 1.0 | 5.0 | 6.0 | 9.0 | 1.0 | 10.0 | 10.0 | 9.0 | 8.0 | 1.0 | b'[9-10]' | b'[0-5]' | b'[6-8]' | b'[9-10]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[6-8]' | b'[9-10]' | b'[0-5]' | b'[9-10]' | b'[9-10]' | b'[9-10]' | b'[6-8]' | b'[0-5]' | 0.16 | b'[0-0.33]' | 3.0 | 2.0 | 4.0 | b'[0-4]' | b'[0-3]' | b'[3-5]' | 7.0 | | b'[6-8]' | b'[0-4]' | 1.0 | b'1' | b'1' | b'1' |
b'' | 1.0 | b'female' | 21.0 | 23.0 | 2.0 | b'[2-3]' | b'Asian/Pacific Islander/Asian-American' | b'European/Caucasian-American' | b'0' | 2.0 | 4.0 | b'[2-5]' | b'[2-5]' | b'Law' | 30.0 | 5.0 | 15.0 | 40.0 | 5.0 | 5.0 | b'[21-100]' | b'[0-15]' | b'[0-15]' | b'[21-100]' | b'[0-15]' | b'[0-15]' | 7.0 | 8.0 | 9.0 | 8.0 | 9.0 | 8.0 | b'[6-8]' | b'[6-8]' | b'[9-10]' | b'[6-8]' | b'[9-10]' | b'[6-8]' | 15.0 | 20.0 | 20.0 | 15.0 | 15.0 | 15.0 | b'[0-15]' | b'[16-20]' | b'[16-20]' | b'[0-15]' | b'[0-15]' | b'[0-15]' | 6.0 | 8.0 | 8.0 | 8.0 | 7.0 | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | 7.0 | 6.0 | 8.0 | 7.0 | 6.0 | 8.0 | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | b'[6-8]' | 9.0 | 2.0 | 8.0 | 9.0 | 1.0 | 1.0 | 5.0 | 1.0 | 5.0 | 6.0 | 9.0 | 1.0 | 10.0 | 10.0 | 9.0 | 8.0 | 1.0 | b'[9-10]' | b'[0-5]' | b'[6-8]' | b'[9-10]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[0-5]' | b'[6-8]' | b'[9-10]' | b'[0-5]' | b'[9-10]' | b'[9-10]' | b'[9-10]' | b'[6-8]' | b'[0-5]' | 0.61 | b'[0.33-1]' | 3.0 | 2.0 | 4.0 | b'[0-4]' | b'[0-3]' | b'[3-5]' | 7.0 | 6.0 | b'[6-8]' | b'[5-6]' | 0.0 | b'1' | b'1' | b'1' |
Our project seeks to analyze the features above to predict matches.
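The preview above shows categorical values as Python byte strings (for example `b'female'` and `b'[4-6]'`), which typically happens when a file is loaded through an ARFF reader. A minimal sketch of one way to clean that up, assuming the data already sits in a pandas DataFrame: the helper name `decode_bytes` and the tiny `sample` frame are illustrative only, with cell values copied from the first and third preview rows.

```python
import pandas as pd


def decode_bytes(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which bytes values in object columns become str."""
    out = df.copy()
    for name in out.columns:
        if out[name].dtype == object:
            out[name] = out[name].map(
                lambda v: v.decode("utf-8") if isinstance(v, bytes) else v
            )
    return out


# Hand-built sample (an assumption for illustration, not the full dataset).
sample = pd.DataFrame({
    "gender": [b"female", b"female"],
    "d_d_age": [b"[4-6]", b"[0-1]"],
    "decision": [b"1", b"1"],
    "decision_o": [b"0", b"1"],
    "match": [b"0", b"1"],
})

clean = decode_bytes(sample)

# Sanity check: match should be 1 only when both decisions are 1.
flags = clean[["decision", "decision_o", "match"]].astype(int)
both_yes = flags["decision"] & flags["decision_o"]
consistent = bool((both_yes == flags["match"]).all())
print(clean)
print("match equals decision AND decision_o:", consistent)
```

Running the same consistency check on the full dataset would be a cheap way to confirm that `match` can be treated as the target derived from `decision` and `decision_o`.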
The dataset is large, and comparing all attributes may be beyond the scope of this assignment. In addition, the data may be skewed by self-perception bias in the questionnaires: people are rarely accurate when rating themselves. We will assume the data is accurate, but may weight the partner's perception more heavily than the self-assessment.
These results don't necessarily carry over to a normal dating environment, since four-minute speed dates are very short.
column_meanings = {'has_null': 'Missing values (binary)',
'wave': 'Group',
'gender': 'Gender of self',
'age': 'Age of self',
'age_o': 'Age of partner',
'd_age': 'difference in age',
'd_d_age': 'Binned version of d_age (difference in age)',
'race': 'Race of self',
'race_o': 'Race of partner',
'samerace': 'whether the two have the same race',
'importance_same_race': 'How important is it that partner is of the same race?',
'importance_same_religion': 'How important is it that partner has same religion?',
'd_importance_same_race': 'Binned version of importance_same_race',
'd_importance_same_religion': 'Binned version of importance_same_religion',
'field': 'Field of study',
'pref_o_attractive': 'How important partner rates attractiveness',
'pref_o_sincere': 'How important partner rates sincerity',
'pref_o_intelligence': 'How important partner rates intelligence',
'pref_o_funny': 'How important partner rates being funny',
'pref_o_ambitious': 'How important partner rates ambition',
'pref_o_shared_interests': 'How important partner rates shared interests',
'd_pref_o_attractive': 'Binned version of pref_o_attractive',
'd_pref_o_sincere': 'Binned version of pref_o_sincere',
'd_pref_o_intelligence': 'Binned version of pref_o_intelligence',
'd_pref_o_funny': 'Binned version of pref_o_funny',
'd_pref_o_ambitious': 'Binned version of pref_o_ambitious',
'd_pref_o_shared_interests': 'Binned version of pref_o_shared_interests',
'attractive_o': 'Rating by partner (about me) at night of event on attractiveness',
'sinsere_o': 'Rating by partner (about me) at night of event on sincerity',
'intelligence_o': 'Rating by partner (about me) at night of event on intelligence',
'funny_o': 'Rating by partner (about me) at night of event on being funny',
'ambitous_o': 'Rating by partner (about me) at night of event on ambition',
'shared_interests_o': 'Rating by partner (about me) at night of event on shared interests',
'd_attractive_o': 'Binned version of attractive_o',
'd_sinsere_o': 'Binned version of sinsere_o',
'd_intelligence_o': 'Binned version of intelligence_o',
'd_funny_o': 'Binned version of funny_o',
'd_ambitous_o': 'Binned version of ambitous_o',
'd_shared_interests_o': 'Binned version of shared_interests_o',
'attractive_important': 'What do you look for in a partner - attractiveness',
'sincere_important': 'What do you look for in a partner - sincerity',
'intellicence_important': 'What do you look for in a partner - intelligence',
'funny_important': 'What do you look for in a partner - being funny',
'ambtition_important': 'What do you look for in a partner - ambition',
'shared_interests_important': 'What do you look for in a partner - shared interests',
'd_attractive_important': 'Binned version of attractive_important',
'd_sincere_important': 'Binned version of sincere_important',
'd_intellicence_important': 'Binned version of intellicence_important',
'd_funny_important': 'Binned version of funny_important',
'd_ambtition_important': 'Binned version of ambtition_important',
'd_shared_interests_important': 'Binned version of shared_interests_important',
'attractive': 'Rate yourself - attractiveness',
'sincere': 'Rate yourself - sincerity',
'intelligence': 'Rate yourself - intelligence',
'funny': 'Rate yourself - funny',
'ambition': 'Rate yourself - ambition',
'd_attractive': 'Binned self-rating - attractiveness',
'd_sincere': 'Binned self-rating - sincerity',
'd_intelligence': 'Binned self-rating - intelligence',
'd_funny': 'Binned self-rating - funny',
'd_ambition': 'Binned self-rating - ambition',
'attractive_partner': 'Rate your partner - attractiveness',
'sincere_partner': 'Rate your partner - sincerity',
'intelligence_partner': 'Rate your partner - intelligence',
'funny_partner': 'Rate your partner - funny',
'ambition_partner': 'Rate your partner - ambition',
'shared_interests_partner': 'Rate your partner - shared interests',
'd_attractive_partner': 'Binned rating of partner - attractiveness',
'd_sincere_partner': 'Binned rating of partner - sincerity',
'd_intelligence_partner': 'Binned rating of partner - intelligence',
'd_funny_partner': 'Binned rating of partner - funny',
'd_ambition_partner': 'Binned rating of partner - ambition',
'd_shared_interests_partner': 'Binned rating of partner - shared interests',
'sports': '(1-10) Your interest in sports ',
'tvsports': '(1-10) Your interest in tvsports ',
'exercise': '(1-10) Your interest in exercise ',
'dining': '(1-10) Your interest in dining ',
'museums': '(1-10) Your interest in museums (who likes museums?) ',
'art': '(1-10) Your interest in art ',
'hiking': '(1-10) Your interest in hiking ',
'gaming': '(1-10) Your interest in gaming',
'clubbing': '(1-10) Your interest in clubbing ',
'reading': '(1-10) Your interest in reading ',
'tv': '(1-10) Your interest in tv',
'theater': '(1-10) Your interest in theater ',
'movies': '(1-10) Your interest in movies ',
'concerts': '(1-10) Your interest in concerts ',
'music': '(1-10) Your interest in music ',
'shopping': '(1-10) Your interest in shopping ',
'yoga': '(1-10) Your interest in yoga ',
'd_sports': 'Binned interest in sports',
'd_tvsports': 'Binned interest in tvsports',
'd_exercise': 'Binned interest in exercise',
'd_dining': 'Binned interest in dining',
'd_museums': 'Binned interest in museums',
'd_art': 'Binned interest in art',
'd_hiking': 'Binned interest in hiking',
'd_gaming': 'Binned interest in gaming',
'd_clubbing': 'Binned interest in clubbing',
'd_reading': 'Binned interest in reading',
'd_tv': 'Binned interest in tv',
'd_theater': 'Binned interest in theater',
'd_movies': 'Binned interest in movies',
'd_concerts': 'Binned interest in concerts',
'd_music': 'Binned interest in music',
'd_shopping': 'Binned interest in shopping',
'd_yoga': 'Binned interest in yoga',
'interests_correlate': 'Correlation between participant’s and partner’s ratings of interests',
'd_interests_correlate': 'Binned version of interests_correlate',
'expected_happy_with_sd_people': 'How happy do you expect to be with the people you meet during the speed-dating event?',
'expected_num_interested_in_me': 'Out of the 20 people you will meet, how many do you expect will be interested in dating you?',
'expected_num_matches': 'How many matches do you expect to get?',
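'd_expected_happy_with_sd_people': 'Binned version of expected_happy_with_sd_people',
'd_expected_num_interested_in_me': 'Binned version of expected_num_interested_in_me',
'd_expected_num_matches': 'Binned version of expected_num_matches',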
'like': '(1-10) did you like your partner?',
'guess_prob_liked': '(1-10) did you think your partner liked you?',
'd_like': 'Binned version of like',
'd_guess_prob_liked': 'Binned version of guess_prob_liked',
'met': 'Have you met your partner before?',
'decision': 'Did you choose to match?',
'decision_o': 'Did partner choose to match?',
'match': 'did both people say yes?'}
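Many of the `d_`-prefixed columns in the preview hold range labels such as `[4-6]` or `[0-0.33]` rather than numbers, so one way to shrink the feature set is to separate those binned columns from the raw ones. The sketch below is one possible approach, not the project's settled method. It assumes byte strings have already been decoded to `str`, and it detects bins by value pattern instead of by the `d_` prefix, because `d_age` is a raw numeric difference. The helper names and the small `sample` frame are hypothetical; its values are copied from the first two preview rows.

```python
import re

import pandas as pd

# Matches labels like "[4-6]" or "[0-0.33]"; the optional minus signs are an
# assumption in case a bin has a negative bound.
BIN_LABEL = re.compile(r"^\[-?\d+(\.\d+)?--?\d+(\.\d+)?\]$")


def is_bin_label(value) -> bool:
    """True when the value is a string that looks like a range label."""
    return isinstance(value, str) and bool(BIN_LABEL.match(value))


def split_binned(df: pd.DataFrame):
    """Return (raw_columns, binned_columns) based on the values in each column."""
    raw, binned = [], []
    for name in df.columns:
        values = df[name].dropna()
        if len(values) and values.map(is_bin_label).all():
            binned.append(name)
        else:
            raw.append(name)
    return raw, binned


# Hypothetical sample with already-decoded strings.
sample = pd.DataFrame({
    "age": [21.0, 21.0],
    "d_age": [6.0, 1.0],
    "d_d_age": ["[4-6]", "[0-1]"],
    "interests_correlate": [0.14, 0.54],
    "d_interests_correlate": ["[0-0.33]", "[0.33-1]"],
})

raw_cols, binned_cols = split_binned(sample)
print("raw:", raw_cols)
print("binned:", binned_cols)
```

Keeping only one of the two groups avoids feeding the model the same information twice.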
We propose logistic regression to determine which factors are predictive of a match. The fitted coefficients should also indicate the comparative impact of different traits.
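As a rough illustration of how logistic regression coefficients can be compared across traits, here is a minimal sketch using scikit-learn on synthetic data. Everything in it is an assumption for demonstration: the feature names are hypothetical, the data is randomly generated, and the numbers say nothing about the real speed dating results. Standardizing the features first puts the coefficients on a common scale so their magnitudes can be ranked.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples = 2000

# Hypothetical feature names; the real project would use dataset columns.
feature_names = ["strong_trait", "weak_trait", "noise_trait"]
X = rng.normal(size=(n_samples, 3))

# Synthetic ground truth: the first feature matters most, the third not at all.
true_weights = np.array([2.0, 0.5, 0.0])
probability = 1.0 / (1.0 + np.exp(-(X @ true_weights - 1.0)))
y = rng.binomial(1, probability)

# Standardize so coefficient magnitudes are comparable across features.
X_scaled = StandardScaler().fit_transform(X)
model = LogisticRegression().fit(X_scaled, y)

# Rank features by absolute coefficient size.
ranking = sorted(
    zip(feature_names, model.coef_[0]),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)
for name, coef in ranking:
    print(f"{name}: {coef:+.2f}")
```

On real data the coefficients describe association with the match outcome, not causation, and correlated features (such as a raw column and its binned copy) can distort the ranking.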