Handling an overplotting on a scatter plot: `geom_count()`/`stat_sum()`#

The geom_count() counts the number of observations at each location.

Computed variables:

import pandas as pd

from lets_plot import *

LetsPlot.setup_html() 

mpg_df = pd.read_csv ("https://raw.githubusercontent.com/JetBrains/lets-plot-docs/master/data/mpg.csv")
mpg_df.head()

	Unnamed: 0	manufacturer	model	displ	year	cyl	trans	drv	cty	hwy	fl	class
0	1	audi	a4	1.8	1999	4	auto(l5)	f	18	29	p	compact
1	2	audi	a4	1.8	1999	4	manual(m5)	f	21	29	p	compact
2	3	audi	a4	2.0	2008	4	manual(m6)	f	20	31	p	compact
3	4	audi	a4	2.0	2008	4	auto(av)	f	21	30	p	compact
4	5	audi	a4	2.8	1999	6	auto(l5)	f	16	26	p	compact

p = ggplot(mpg_df, aes(x=as_discrete('class', order=1), y=as_discrete('drv', order=1)))

1. Plot an Observation Count by Location#

p + geom_count()

p + stat_sum()