Exploratory Data Analysis on Categorical Data
Univariate, bivariate, and multivariate analyses on the Marvel Comics dataset, using R, Python, and Julia.
When I was 13, my younger brother and I collected the 1990 Marvel Universe (series 1) cards. Ten years later, when I was an interface engineer at KPE (a top interactive agency in the late 1990s/early 2000s based in Silicon Alley), I had the privilege of working on a Wolverine interactive Flash/ActionScript game: a Marvel Comics advergaming project, released in conjunction with the first X-Men movie (2000).
Today, with access to FiveThirtyEight's Marvel Universe dataset, we'll perform exploratory data analysis (EDA), as a continuation of my previous post (on Univariate, Bivariate, and Multivariate Analyses on Numerical Data).
Before we do (and to recap my previous post), why is EDA important? Univariate analysis allows us to explore the distribution and frequencies of individual categorical variables, revealing the prevalence of different categories. Bivariate analysis helps us examine relationships and associations between two categorical variables, enabling comparisons and identifying patterns. Multivariate analysis extends this further by considering interactions and dependencies among multiple categorical variables simultaneously.
EDA provides valuable insights into the composition, relationships, and dependencies within categorical data, allowing us to make informed decisions, identify trends, and gain a deeper understanding of various phenomena across domains like marketing, social sciences, and customer segmentation.
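To make the three levels concrete before touching the real data, here is a minimal pandas sketch. The `toy` data frame and its six rows are invented for illustration (they are not rows from the FiveThirtyEight dataset); only the column names mirror the ones used later in this post.

```python
import pandas

# Hypothetical toy records; not rows from the FiveThirtyEight dataset.
toy = pandas.DataFrame({
    "gender": ["Male", "Male", "Female", "Female", "Male", "Female"],
    "align": ["Bad", "Good", "Good", "Good", "Bad", "Neutral"],
    "living_status": ["Alive", "Alive", "Alive", "Deceased", "Deceased", "Alive"],
})

# Univariate: frequency of each category within a single variable.
gender_counts = toy["gender"].value_counts()

# Bivariate: two-way contingency table for a pair of variables.
gender_by_align = pandas.crosstab(index=toy["gender"], columns=toy["align"])

# Multivariate: counts for every observed combination of three variables.
three_way_counts = toy.groupby(["gender", "align", "living_status"]).size()

print(gender_counts)
print(gender_by_align)
print(three_way_counts)
```

The same three calls scale from this toy frame to the full dataset; only the interpretation gets harder as more variables are crossed.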
Getting Started
If you are interested in reproducing this work, here are the versions of R, Python, and Julia used (as well as the respective packages for each). Additionally, Leland Wilkinson's approach to data visualization (Grammar of Graphics) has been adopted for this work. Finally, my coding style here is verbose, so that it is easy to trace where functions, methods, and variables originate, and to make this a learning experience for everyone, including me.
cat(R.version$version.string, R.version$nickname)
R version 4.2.3 (2023-03-15) Shortstop Beagle
require(devtools)
devtools::install_version("dplyr", version="1.1.2", repos="http://cran.us.r-project.org")
devtools::install_version("ggplot2", version="3.4.2", repos="http://cran.us.r-project.org")
devtools::install_github("davidsjoberg/ggstream", dependencies=FALSE)
library(dplyr)
library(ggplot2)
library(ggstream)
import sys
print(sys.version)
3.11.4 (v3.11.4:d2340ef257, Jun 6 2023, 19:15:51) [Clang 13.0.0 (clang-1300.0.29.30)]
!pip install numpy==1.25.1
!pip install pandas==2.0.3
!pip install plotnine==0.12.2
import numpy
import pandas
import plotnine
using InteractiveUtils
InteractiveUtils.versioninfo()
Julia Version 1.9.2
Commit e4ee485e909 (2023-07-05 09:39 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin22.4.0)
CPU: 8 × Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
Threads: 1 on 8 virtual cores
Environment:
DYLD_FALLBACK_LIBRARY_PATH = /Library/Frameworks/R.framework/Resources/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/jre/lib/server
using Pkg
Pkg.add(name="CSV", version="0.10.11")
Pkg.add(name="DataFrames", version="1.6.1")
Pkg.add(name="CategoricalArrays", version="0.10.8")
Pkg.add(name="Colors", version="0.12.10")
Pkg.add(name="Cairo", version="1.0.5")
Pkg.add(name="Gadfly", version="1.3.4")
using DataFrames
using CSV
using CategoricalArrays
using Colors
using Cairo
using Gadfly
Importing, Wrangling, and Examining Data
As we can see from the data types below, most of the variables are categorical: some ordinal, and others non-ordinal (or nominal).
For the analyses to follow, I won't need page_id or urlslug, so those can be removed. The name of the character should be converted to a string data type. FIRST.APPEARANCE should be converted to a timestamp object. Any ordinal categorical variable should have its levels defined in the correct order. And finally, the variable names should follow a consistent naming convention (lower case with words separated by an underscore).
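The wrangling steps described above can be sketched in pandas as follows. This is an illustration under stated assumptions, not the code that produced the outputs below: the two-row `raw` frame is made up, and the raw column names `ALIGN` and `Year`, as well as the " Characters" suffix, are assumptions about the source file. One detail worth noting: parsing "Aug-62" with a two-digit `%y` would most likely land in 2062, so the sketch pairs the month abbreviation with the four-digit year column instead (and `%b` assumes an English locale).

```python
import pandas

# Hypothetical two-row sample; column names other than page_id, name, urlslug,
# and FIRST.APPEARANCE are assumptions about the raw file.
raw = pandas.DataFrame({
    "page_id": [1, 2],
    "name": ["Spider-Man (Peter Parker)", "Captain America (Steven Rogers)"],
    "urlslug": ["/Spider-Man_(Peter_Parker)", "/Captain_America_(Steven_Rogers)"],
    "ALIGN": ["Good Characters", "Good Characters"],
    "FIRST.APPEARANCE": ["Aug-62", "Mar-41"],
    "Year": [1962, 1941],
})

# Drop the columns that are not needed for the analyses.
clean = raw.drop(columns=["page_id", "urlslug"])

# Consistent naming convention: lower case, words separated by underscores.
clean.columns = [column.lower().replace(".", "_") for column in clean.columns]

# Character names as an explicit string dtype.
clean["name"] = clean["name"].astype("string")

# Ordinal variable with its levels in a meaningful order.
align_order = ["Bad", "Reformed Criminal", "Unknown", "Neutral", "Good"]
clean["align"] = pandas.Categorical(
    clean["align"].str.replace(" Characters", "", regex=False),
    categories=align_order,
    ordered=True,
)

# Timestamp built from the month abbreviation plus the four-digit year.
month = clean["first_appearance"].str.split("-").str[0]
clean["first_appearance"] = pandas.to_datetime(
    month + "-" + clean["year"].astype(str), format="%b-%Y"
)

print(clean.dtypes)
```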
After data wrangling, here is what the clean data looks like:
str(object=marvel_clean_r)
'data.frame': 16376 obs. of 11 variables:
$ name : chr "Spider-Man (Peter Parker)" "Captain America (Steven Rogers)" "Wolverine (James \\\"Logan\\\" Howlett)" "Iron Man (Anthony \\\"Tony\\\" Stark)" ...
$ identity : Factor w/ 5 levels "Unknown","Known to Authorities",..: 5 4 4 4 3 4 4 4 4 4 ...
$ align : Ord.factor w/ 5 levels "Bad"<"Reformed Criminal"<..: 5 5 4 5 5 5 5 5 4 5 ...
$ eye : Factor w/ 25 levels "Unknown","Amber",..: 11 5 5 5 5 5 6 6 6 5 ...
$ hair : Factor w/ 26 levels "Unknown","Auburn Hair",..: 8 25 4 4 5 15 8 8 8 5 ...
$ gender : Factor w/ 5 levels "Unknown","Agender",..: 5 5 5 5 5 5 5 5 5 5 ...
$ gender_sexual_minority: Factor w/ 7 levels "Non-GSM","Bisexual",..: 1 1 1 1 1 1 1 1 1 1 ...
$ living_status : Factor w/ 3 levels "Unknown","Deceased",..: 3 3 3 3 3 3 3 3 3 3 ...
$ appearances : int 4043 3360 3061 2961 2258 2255 2072 2017 1955 1934 ...
$ first_appearance_month: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
$ first_appearance_year : int 1962 1941 1974 1963 1950 1961 1961 1962 1963 1961 ...
head(x=marvel_clean_r, n=8)
name identity align eye hair gender gender_sexual_minority living_status appearances first_appearance_month first_appearance_year
1 Spider-Man (Peter Parker) Secret Good Hazel Brown Male Non-GSM Alive 4043 1962
2 Captain America (Steven Rogers) Public Good Blue White Male Non-GSM Alive 3360 1941
3 Wolverine (James \\"Logan\\" Howlett) Public Neutral Blue Black Male Non-GSM Alive 3061 1974
4 Iron Man (Anthony \\"Tony\\" Stark) Public Good Blue Black Male Non-GSM Alive 2961 1963
5 Thor (Thor Odinson) No Dual Good Blue Blond Male Non-GSM Alive 2258 1950
6 Benjamin Grimm (Earth-616) Public Good Blue No Hair Male Non-GSM Alive 2255 1961
7 Reed Richards (Earth-616) Public Good Brown Brown Male Non-GSM Alive 2072 1961
8 Hulk (Robert Bruce Banner) Public Good Brown Brown Male Non-GSM Alive 2017 1962
tail(x=marvel_clean_r, n=8)
name identity align eye hair gender gender_sexual_minority living_status appearances first_appearance_month first_appearance_year
16369 Marcy (Offer's employee) (Earth-616) Public Neutral Unknown Brown Female Non-GSM Alive NA NA
16370 Melanie Kapoor (Earth-616) Public Good Blue Black Female Non-GSM Alive NA NA
16371 Phoenix's Shadow (Earth-616) Unknown Neutral Unknown Unknown Unknown Non-GSM Alive NA NA
16372 Ru'ach (Earth-616) No Dual Bad Green No Hair Male Non-GSM Alive NA NA
16373 Thane (Thanos' son) (Earth-616) No Dual Good Blue Bald Male Non-GSM Alive NA NA
16374 Tinkerer (Skrull) (Earth-616) Secret Bad Black Bald Male Non-GSM Alive NA NA
16375 TK421 (Spiderling) (Earth-616) Secret Neutral Unknown Unknown Male Non-GSM Alive NA NA
16376 Yologarch (Earth-616) Unknown Bad Unknown Unknown Unknown Non-GSM Alive NA NA
levels(marvel_clean_r$identity)
[1] "Unknown" "Known to Authorities" "No Dual" "Public" "Secret"
levels(marvel_clean_r$align)
[1] "Bad" "Reformed Criminal" "Unknown" "Neutral" "Good"
levels(marvel_clean_r$eye)
[1] "Unknown" "Amber" "Black Eyeballs" "Black" "Blue" "Brown" "Compound Eyes" "Gold" "Green" "Grey" "Hazel" "Magenta Eyes" "Multiple Eyes" "No Eyes" "One Eye" "Orange" "Pink" "Purple" "Red" "Silver Eyes" "Variable Eyes" "Violet" "White" "Yellow Eyeballs" "Yellow"
levels(marvel_clean_r$hair)
[1] "Unknown" "Auburn Hair" "Bald" "Black" "Blond" "Blue" "Bronze Hair" "Brown" "Dyed Hair" "Gold" "Green" "Grey" "Light Brown Hair" "Magenta Hair" "No Hair" "Orange" "Orange-brown Hair" "Pink" "Purple" "Red" "Reddish Blond Hair" "Silver" "Strawberry Blond" "Variable Hair" "White" "Yellow Hair"
levels(marvel_clean_r$gender)
[1] "Unknown" "Agender" "Female" "Gender Fluid" "Male"
levels(marvel_clean_r$gender_sexual_minority)
[1] "Non-GSM" "Bisexual" "Gender Fluid" "Homosexual" "Pansexual" "Transgender" "Transvestites"
levels(marvel_clean_r$living_status)
[1] "Unknown" "Deceased" "Alive"
marvel_clean_py.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16376 entries, 0 to 16375
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 name 16376 non-null object
1 url_slug 16376 non-null object
2 id 12606 non-null object
3 align 13564 non-null object
4 eye 6609 non-null object
5 hair 12112 non-null object
6 gender 15522 non-null object
7 gender_sexual_minority 90 non-null object
8 living_status 16373 non-null object
9 appearances 15280 non-null float64
10 first_appearance_month 15561 non-null object
11 first_appearance_year 15561 non-null float64
dtypes: float64(2), object(10)
memory usage: 1.5+ MB
marvel_clean_py.head(n=8)
name url_slug id align eye hair gender gender_sexual_minority living_status appearances first_appearance_month first_appearance_year
0 Spider-Man (Peter Parker) \/Spider-Man_(Peter_Parker) Secret Identity Good Characters Hazel Eyes Brown Hair Male Characters NaN Living Characters 4043.0 Aug-62 1962.0
1 Captain America (Steven Rogers) \/Captain_America_(Steven_Rogers) Public Identity Good Characters Blue Eyes White Hair Male Characters NaN Living Characters 3360.0 Mar-41 1941.0
2 Wolverine (James \"Logan\" Howlett) \/Wolverine_(James_%22Logan%22_Howlett) Public Identity Neutral Characters Blue Eyes Black Hair Male Characters NaN Living Characters 3061.0 Oct-74 1974.0
3 Iron Man (Anthony \"Tony\" Stark) \/Iron_Man_(Anthony_%22Tony%22_Stark) Public Identity Good Characters Blue Eyes Black Hair Male Characters NaN Living Characters 2961.0 Mar-63 1963.0
4 Thor (Thor Odinson) \/Thor_(Thor_Odinson) No Dual Identity Good Characters Blue Eyes Blond Hair Male Characters NaN Living Characters 2258.0 Nov-50 1950.0
5 Benjamin Grimm (Earth-616) \/Benjamin_Grimm_(Earth-616) Public Identity Good Characters Blue Eyes No Hair Male Characters NaN Living Characters 2255.0 Nov-61 1961.0
6 Reed Richards (Earth-616) \/Reed_Richards_(Earth-616) Public Identity Good Characters Brown Eyes Brown Hair Male Characters NaN Living Characters 2072.0 Nov-61 1961.0
7 Hulk (Robert Bruce Banner) \/Hulk_(Robert_Bruce_Banner) Public Identity Good Characters Brown Eyes Brown Hair Male Characters NaN Living Characters 2017.0 May-62 1962.0
marvel_clean_py.tail(n=8)
name url_slug id align eye hair gender gender_sexual_minority living_status appearances first_appearance_month first_appearance_year
16368 Marcy (Offer's employee) (Earth-616) \/Marcy_(Offer%27s_employee)_(Earth-616) Public Identity Neutral Characters NaN Brown Hair Female Characters NaN Living Characters NaN NaN NaN
16369 Melanie Kapoor (Earth-616) \/Melanie_Kapoor_(Earth-616) Public Identity Good Characters Blue Eyes Black Hair Female Characters NaN Living Characters NaN NaN NaN
16370 Phoenix's Shadow (Earth-616) \/Phoenix%27s_Shadow_(Earth-616) NaN Neutral Characters NaN NaN NaN NaN Living Characters NaN NaN NaN
16371 Ru'ach (Earth-616) \/Ru%27ach_(Earth-616) No Dual Identity Bad Characters Green Eyes No Hair Male Characters NaN Living Characters NaN NaN NaN
16372 Thane (Thanos' son) (Earth-616) \/Thane_(Thanos%27_son)_(Earth-616) No Dual Identity Good Characters Blue Eyes Bald Male Characters NaN Living Characters NaN NaN NaN
16373 Tinkerer (Skrull) (Earth-616) \/Tinkerer_(Skrull)_(Earth-616) Secret Identity Bad Characters Black Eyes Bald Male Characters NaN Living Characters NaN NaN NaN
16374 TK421 (Spiderling) (Earth-616) \/TK421_(Spiderling)_(Earth-616) Secret Identity Neutral Characters NaN NaN Male Characters NaN Living Characters NaN NaN NaN
16375 Yologarch (Earth-616) \/Yologarch_(Earth-616) NaN Bad Characters NaN NaN NaN NaN Living Characters NaN NaN NaN
marvel_clean_jl
16376×11 DataFrame
Row │ name identity align eye hair gender gender_sexual_minority living_status appearances first_appearance_month first_appearance_year
│ String Cat…? Cat…? Cat…? Cat…? Cat…? Cat…? Cat…? Int64? String7? Int64?
───────┼────────────────────────────────────────────────────────────────────────────────
1 │ Spider-Man (Peter Parker) Secret Good Hazel Eyes Brown Hair Male Non-GSM Living 4043 Aug-62 1962
2 │ Captain America (Steven Rogers) Public Good Blue Eyes White Hair Male Non-GSM Living 3360 Mar-41 1941
3 │ Wolverine (James \\"Logan\\" How… Public Neutral Blue Eyes Black Hair Male Non-GSM Living 3061 Oct-74 1974
4 │ Iron Man (Anthony \\"Tony\\" Sta… Public Good Blue Eyes Black Hair Male Non-GSM Living 2961 Mar-63 1963
5 │ Thor (Thor Odinson) No Dual Good Blue Eyes Blond Hair Male Non-GSM Living 2258 Nov-50 1950
6 │ Benjamin Grimm (Earth-616) Public Good Blue Eyes No Hair Male Non-GSM Living 2255 Nov-61 1961
7 │ Reed Richards (Earth-616) Public Good Brown Eyes Brown Hair Male Non-GSM Living 2072 Nov-61 1961
8 │ Hulk (Robert Bruce Banner) Public Good Brown Eyes Brown Hair Male Non-GSM Living 2017 May-62 1962
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
16370 │ Melanie Kapoor (Earth-616) Public Good Blue Eyes Black Hair Female Non-GSM Living missing missing missing
16371 │ Phoenix's Shadow (Earth-616) Unknown Neutral Unknown Unknown Unknown Non-GSM Living missing missing missing
16372 │ Ru'ach (Earth-616) No Dual Bad Green Eyes No Hair Male Non-GSM Living missing missing missing
16373 │ Thane (Thanos' son) (Earth-616) No Dual Good Blue Eyes Bald Male Non-GSM Living missing missing missing
16374 │ Tinkerer (Skrull) (Earth-616) Secret Bad Black Eyes Bald Male Non-GSM Living missing missing missing
16375 │ TK421 (Spiderling) (Earth-616) Secret Neutral Unknown Unknown Male Non-GSM Living missing missing missing
16376 │ Yologarch (Earth-616) Unknown Bad Unknown Unknown Unknown Non-GSM Living missing missing missing
16361 rows omitted
Summary Statistics
summary(marvel_clean_r)
name identity align eye hair gender gender_sexual_minority living_status appearances first_appearance_month first_appearance_year
Length:16376 Unknown :3770 Bad :6720 Unknown:9767 Unknown:4264 Unknown : 854 Non-GSM :16286 Unknown : 3 Min. : 1 :16376 Min. :1939
Class :character Known to Authorities: 15 Reformed Criminal: 0 Blue :1962 Black :3755 Agender : 45 Bisexual : 19 Deceased: 3765 1st Qu.: 1 1st Qu.:1974
Mode :character No Dual :1788 Unknown :2812 Brown :1924 Brown :2339 Female : 3837 Gender Fluid : 1 Alive :12608 Median : 3 Median :1990
Public :4528 Neutral :2208 Green : 613 Blond :1582 Gender Fluid: 2 Homosexual : 66 Mean : 17 Mean :1985
Secret :6275 Good :4636 Black : 555 No Hair:1176 Male :11638 Pansexual : 1 3rd Qu.: 8 3rd Qu.:2000
Red : 508 Bald : 838 Transgender : 2 Max. :4043 Max. :2013
(Other):1047 (Other):2422 Transvestites: 1 NA's :1096 NA's :815
marvel_clean_py.describe()
appearances first_appearance_year
count 15280.000000 15561.000000
mean 17.033377 1984.951803
std 96.372959 19.663571
min 1.000000 1939.000000
25% 1.000000 1974.000000
50% 3.000000 1990.000000
75% 8.000000 2000.000000
max 4043.000000 2013.000000
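By default, `describe()` in pandas summarizes only the two numeric columns, which says little about a mostly categorical dataset. A rough complement is the share of missing values per column, sketched here from the non-null counts reported by `marvel_clean_py.info()` earlier. The counts are copied from that output; the variable names `non_null` and `missing_share` are my own.

```python
import pandas

total_rows = 16376  # RangeIndex size reported by info()

# Non-null counts copied from the info() output shown earlier.
non_null = pandas.Series({
    "id": 12606,
    "align": 13564,
    "eye": 6609,
    "hair": 12112,
    "gender": 15522,
    "gender_sexual_minority": 90,
    "living_status": 16373,
    "appearances": 15280,
    "first_appearance_year": 15561,
})

# Fraction of rows with no recorded value, largest first.
missing_share = (1 - non_null / total_rows).sort_values(ascending=False)
print(missing_share.round(3))
```

On the data frame itself, `marvel_clean_py.isna().mean()` should give the same fractions without retyping the counts. Eye color stands out: 9,767 of 16,376 characters have none recorded, which matches the "Unknown" count in the R summary.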
describe(marvel_clean_jl)
11×7 DataFrame
Row │ variable mean min median max nmissing eltype
│ Symbol Union… Any Union… Any Int64 Type
─────┼────────────────────────────────────────────────────────────────────────────────
1 │ name 'Spinner (Earth-616) \\u00c4kr\\u00e4s (Earth-616) 0 String
2 │ identity Unknown Secret 0 Union{Missing, CategoricalValue{…
3 │ align Bad Good 0 Union{Missing, CategoricalValue{…
4 │ eye Amber Eyes Unknown 0 Union{Missing, CategoricalValue{…
5 │ hair Auburn Hair Unknown 0 Union{Missing, CategoricalValue{…
6 │ gender Agender Unknown 0 Union{Missing, CategoricalValue{…
7 │ gender_sexual_minority Bisexual Non-GSM 0 Union{Missing, CategoricalValue{…
8 │ living_status Deceased Unknown 0 Union{Missing, CategoricalValue{…
9 │ appearances 17.0334 1 3.0 4043 1096 Union{Missing, Int64}
10 │ first_appearance_month Apr-00 Sep-99 815 Union{Missing, String7}
11 │ first_appearance_year 1984.95 1939 1990.0 2013 815 Union{Missing, Int64}
Univariate Analysis
# Categorical distribution via bar chart
univariate_bar_gender_r <- ggplot2::ggplot(data=marvel_clean_r, ggplot2::aes(x=gender, y=ggplot2::after_stat(count))) + # after_stat() replaces the ..count.. notation deprecated in ggplot2 3.4.0
ggplot2::geom_bar(
stat="count",
colour=palette_michaelmallari_r[19],
fill=palette_michaelmallari_r[19]
) +
ggplot2::geom_text(
ggplot2::aes(label=ggplot2::after_stat(count)),
stat="count",
vjust=1.5,
colour=palette_michaelmallari_r[1]
) +
ggplot2::scale_x_discrete(limits=c("Male", "Female", "Unknown", "Agender", "Gender Fluid")) +
ggplot2::scale_y_continuous(expand=c(0, 0), position="right") + # Scale
ggplot2::guides(fill=guide_legend(reverse=TRUE)) +
ggplot2::labs(
title="Male-Dominated Characters",
alt="Male-Dominated Characters",
subtitle="Distribution by gender, Marvel Universe characters = 16,376",
x=NULL,
y=NULL,
fill=NULL,
caption="Source: FiveThirtyEight"
) +
theme_michaelmallari_r()
univariate_bar_gender_r
Summary Statistics
summary(object=marvel_clean_r$gender)
Unknown Agender Female Gender Fluid Male
854 45 3837 2 11638
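Raw counts are easier to compare once they are turned into shares. The sketch below takes the five counts from the R summary above and normalizes them; the variable names are mine, and nothing here touches the underlying data frame.

```python
import pandas

# Counts copied from summary(marvel_clean_r$gender) above.
gender_counts = pandas.Series({
    "Unknown": 854,
    "Agender": 45,
    "Female": 3837,
    "Gender Fluid": 2,
    "Male": 11638,
})

# Share of all characters in each category, largest first.
gender_share = (gender_counts / gender_counts.sum()).sort_values(ascending=False)
print((gender_share * 100).round(1))
```

Male characters make up about 71% of the 16,376 records and female characters about 23%, which is what the "Male-Dominated Characters" title is summarizing. With the data frame at hand, `marvel_clean_py["gender"].value_counts(normalize=True)` is the usual shortcut, although it leaves out missing values unless `dropna=False` is passed.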
univariate_bar_gender_py = (
plotnine.ggplot(marvel_clean_py, plotnine.aes(x="gender"))
+ plotnine.geom_bar()
)
univariate_bar_gender_py
<Figure Size: (1280 x 960)>
univariate_bar_gender_jl = Gadfly.plot(
marvel_clean_jl,
x=:gender,
Gadfly.Geom.bar(),
Gadfly.Scale.x_discrete,
Gadfly.Scale.y_continuous(format=:plain),
Gadfly.Guide.title("Male-Dominated Characters")
);
Bivariate Analysis
# Comparing categories via stacked bar
multivariate_stacked_bar_time_gender_align_r <- ggplot2::ggplot(data=marvel_clean_r, ggplot2::aes(x=gender, y=ggplot2::after_stat(count), fill=align)) + # after_stat() replaces the deprecated ..count..
ggplot2::geom_bar(position="stack", stat="count") +
ggplot2::scale_x_discrete(limits=c("Male", "Female", "Unknown", "Agender", "Gender Fluid")) +
ggplot2::scale_y_continuous(expand=c(0, 0), position="right") + # Scale
ggplot2::scale_fill_manual(values=c("Good"=palette_michaelmallari_r[19], "Neutral"=palette_michaelmallari_r[20], "Unknown"=palette_michaelmallari_r[21], "Bad"=palette_michaelmallari_r[2])) +
ggplot2::guides(fill=guide_legend(reverse=TRUE)) +
ggplot2::labs(
title="Male-Dominated Characters (With Roughly Half as Villains)",
alt="Male-Dominated Characters (With Roughly Half as Villains)",
subtitle="Count of alignment by gender, Marvel Universe characters = 16,376",
x=NULL,
y=NULL,
fill=NULL,
caption="Source: FiveThirtyEight"
) +
theme_michaelmallari_r()
multivariate_stacked_bar_time_gender_align_r
2-Way Contingency Table
# Comparing categories via 2-way contingency table
table(marvel_clean_r$gender, marvel_clean_r$align)
Bad Reformed Criminal Unknown Neutral Good
Unknown 386 0 232 114 122
Agender 20 0 2 13 10
Female 976 0 684 640 1537
Gender Fluid 0 0 0 1 1
Male 5338 0 1894 1440 2966
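A contingency table of raw counts is dominated by the largest group, so it helps to look at row proportions: within each gender, what share falls into each alignment? The sketch below rebuilds the table above by hand (the numbers are copied from the R output) and divides each row by its total. The names `counts` and `row_share` are my own.

```python
import pandas

alignments = ["Bad", "Reformed Criminal", "Unknown", "Neutral", "Good"]

# Counts copied from table(marvel_clean_r$gender, marvel_clean_r$align) above.
counts = pandas.DataFrame(
    [
        [386, 0, 232, 114, 122],
        [20, 0, 2, 13, 10],
        [976, 0, 684, 640, 1537],
        [0, 0, 0, 1, 1],
        [5338, 0, 1894, 1440, 2966],
    ],
    index=["Unknown", "Agender", "Female", "Gender Fluid", "Male"],
    columns=alignments,
)

# Divide every row by its own total so that each row sums to 1.
row_share = counts.div(counts.sum(axis=1), axis=0)
print((row_share * 100).round(1))
```

Read this way, about 46% of male characters are aligned "Bad" (the "roughly half" in the chart title), compared with about 25% of female characters, while "Good" is the most common alignment among female characters at about 40%. On the data frame itself, `pandas.crosstab(..., normalize="index")` produces the same row shares directly.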
Multivariate Analysis
multivariate_line_chart_time_align_count_r <- ggplot2::ggplot(data=marvel_clean_r, ggplot2::aes(x=first_appearance_year, y=ggplot2::after_stat(count), color=align)) + # after_stat() replaces the deprecated ..count..
ggplot2::geom_line(stat="count") +
ggplot2::scale_y_continuous(expand=c(0, 0), position="right") + # Scale
ggplot2::scale_color_manual(values=c("Good"=palette_michaelmallari_r[19], "Neutral"=palette_michaelmallari_r[20], "Unknown"=palette_michaelmallari_r[21], "Bad"=palette_michaelmallari_r[2])) +
ggplot2::guides(color=guide_legend(reverse=TRUE)) +
ggplot2::labs(
title="Disproportionate Rise of Villains in the 1990s",
alt="Disproportionate Rise of Villains in the 1990s",
subtitle="First appearances between 1939 and 2013, Marvel Universe characters = 16,376",
x=NULL,
y=NULL,
color=NULL,
caption="Source: FiveThirtyEight"
) +
theme_michaelmallari_r()
multivariate_line_chart_time_align_count_r
marvel_mean_appearances_align_first_year_r <- marvel_clean_r %>%
group_by(align, first_appearance_year) %>%
summarize(
appearances_mean=mean(appearances, na.rm=TRUE), # ignore missing counts instead of returning NA for the whole group
.groups="drop"
)
marvel_mean_appearances_align_first_year_r$appearances_mean[is.na(marvel_mean_appearances_align_first_year_r$appearances_mean)] <- 0
multivariate_streamgraph_time_appearances_align_r <- ggplot2::ggplot(
data=marvel_mean_appearances_align_first_year_r,
aes(x=first_appearance_year, y=appearances_mean, fill=align)) +
ggstream::geom_stream() +
ggstream::geom_stream_label(aes(label=align)) +
ggplot2::scale_y_continuous(expand=c(0, 0), position="right") + # Scale
ggplot2::scale_fill_manual(values=c("Good"=palette_michaelmallari_r[19], "Neutral"=palette_michaelmallari_r[20], "Unknown"=palette_michaelmallari_r[21], "Bad"=palette_michaelmallari_r[2])) +
ggplot2::guides(fill=guide_legend(reverse=TRUE)) +
ggplot2::labs(
title="Getting Mileage From the OGs",
alt="Getting Mileage From the OGs",
subtitle="Average appearances based on alignment, Marvel Universe characters = 16,376",
x="Year of First Appearance",
y=NULL,
fill=NULL,
caption="Source: FiveThirtyEight"
) +
theme_michaelmallari_r() +
ggplot2::theme(legend.position="none")
multivariate_streamgraph_time_appearances_align_r
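The aggregation behind the streamgraph (mean appearances per alignment and year of first appearance) depends on how missing appearance counts are handled. Here is a pandas sketch on invented rows: `toy` and its values are hypothetical, and only the column names follow this post. In pandas, a grouped `mean()` skips missing values by default, whereas R's `mean()` returns `NA` for any group containing a missing value unless `na.rm=TRUE` is passed.

```python
import pandas

# Hypothetical toy rows; None marks a character with no recorded appearance count.
toy = pandas.DataFrame({
    "align": ["Good", "Good", "Bad", "Bad", "Bad"],
    "first_appearance_year": [1962, 1962, 1962, 1990, 1990],
    "appearances": [4043, None, 10, None, None],
})

mean_appearances = (
    toy.groupby(["align", "first_appearance_year"])["appearances"]
    .mean()      # missing values are skipped within each group
    .fillna(0)   # groups with no recorded counts at all become 0
    .reset_index(name="appearances_mean")
)
print(mean_appearances)
```

In this toy example the ("Good", 1962) group keeps its mean of 4043 even though one of its two rows is missing, and only the ("Bad", 1990) group, which has no counts at all, is set to 0.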
References
- Mallari, M. (n.d.). Michael Mallari (michaelmallari) - Profile. Pinterest. https://www.pinterest.com/michaelmallari/
- Schwabish, J. (2021). Better Data Visualizations: A Guide for Scholars, Researchers, and Wonks. Columbia University Press.
- Wickham, H. (2010). A Layered Grammar of Graphics. Journal of Computational and Graphical Statistics, 19(1), 3–28. https://www.jstor.org/stable/25651297
- Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis (2nd ed.). Springer. https://doi.org/10.1007/978-3-319-24277-4
- Wilkinson, L. (2005). The Grammar of Graphics (2nd ed.). Springer.