Superheroes¶

Introduction¶

The world of superheroes is vast. An article in the People, Ideas, and Things (PIT) Journal of the University of North Carolina at Chapel Hill (Talia 2016) discusses how superheroes have been prominent in American pop culture for over 80 years and how they represent ideals that are important to that culture, such as justice, morality, and the triumph of good over evil. The importance of superheroes cannot be overstated.

Problem¶

This world can be overwhelming. New comic book readers often find it difficult to choose which superheroes' comics to read. Each superhero has a unique set of abilities, backstory, and personality. A reader has limited time and energy and would ideally spend it on the superheroes that resonate most strongly with them. Superheroes rank differently depending on the factor considered, and readers have personal preferences for superpowers, genre, values, and so on. External factors such as media hype and peer recommendations can also strongly influence a reader's choice. Overall, the decision of which superhero to read about is complex and highly individualized.

Dataset¶

We will use a Kaggle dataset of superheroes and examine the following features of each superhero:

  • name
  • history_text
  • powers_text
  • intelligence_score
  • strength_score
  • speed_score
  • durability_score
  • power_score
  • combat_score
  • superpowers
  • creator
  • alignment
  • teams

The name field identifies each superhero. The history and powers text fields give a free-text description of each superhero. The six score fields provide numerical data on each superhero's strengths. We also have each superhero's individual superpowers, creator, alignment, and team(s). The dataset additionally contains many individual superpower columns, such as electrokinesis, matter manipulation, and shapeshifting, which could support a more in-depth analysis. Overall, the data looks well suited to catering to the preferences of an individual reader.

Method¶

We can build a model that classifies superheroes based on their powers, abilities, and characteristics. Natural language processing (NLP) can be applied to the history and powers text, for example to analyze sentiment. The data can also be used to build a recommendation system that uses collaborative filtering algorithms to suggest superheroes tailored to user input.
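
As a rough illustration of the recommendation idea, the sketch below ranks superheroes by how closely their six score fields match a reader's stated preferences. Note that it uses content-based cosine similarity rather than collaborative filtering, since collaborative filtering would need ratings from many users and none of the features listed above provide them. The hero names, score values, and the `recommend` helper are all hypothetical and exist only for this example.

```python
import math

# Order of the six score fields used to build each feature vector.
SCORE_FIELDS = [
    "intelligence_score", "strength_score", "speed_score",
    "durability_score", "power_score", "combat_score",
]

# Hypothetical toy records: made-up names and scores, NOT real dataset rows.
HEROES = {
    "Hero A": [90, 20, 30, 40, 50, 60],
    "Hero B": [30, 95, 60, 90, 80, 70],
    "Hero C": [85, 25, 35, 45, 55, 65],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def recommend(preferences, heroes, top_n=2):
    """Rank heroes by similarity to the reader's preferred score profile."""
    ranked = sorted(
        heroes,
        key=lambda name: cosine_similarity(preferences, heroes[name]),
        reverse=True,
    )
    return ranked[:top_n]


# A reader who values intelligence far above raw strength.
reader_preferences = [100, 10, 20, 30, 40, 50]
print(recommend(reader_preferences, HEROES))
```

Cosine similarity compares the shape of a score profile rather than its overall magnitude, which seems a reasonable starting point when a reader cares about the balance of traits. A fuller system could add the superpower columns or text features to the same vector.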

Additionally, clustering algorithms such as k-means or hierarchical clustering can be used to group superheroes based on their attributes. Dimensionality reduction techniques such as PCA can be used to identify the attributes that matter most in distinguishing between superheroes and to reveal clusters of similar superheroes.
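
To make the clustering idea concrete, here is a minimal k-means sketch in plain Python. It groups superheroes by two of the score fields (strength and intelligence), using made-up values rather than real dataset rows. The `kmeans` function and its naive initialization (the first `k` points become the starting centroids) are assumptions chosen to keep the example deterministic. A real project would more likely rely on a library such as scikit-learn, and PCA is not covered here.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points with basic k-means (Lloyd's algorithm).

    Naive initialization: the first k points serve as the starting centroids.
    Returns (assignments, centroids).
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical (strength_score, intelligence_score) pairs for six heroes.
points = [
    [90, 20], [20, 90], [85, 25],
    [25, 85], [95, 15], [15, 95],
]

assignments, centroids = kmeans(points, k=2)
print(assignments)
print(centroids)
```

With this toy data the heroes split into a high-strength group and a high-intelligence group. On the real dataset the scores would usually be standardized first so that no single field dominates the distance calculation.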