Plotly Express plot types

Below are examples of plots that can be created using Plotly Express. For the full list of plot types and their options, see the Plotly Express documentation.

Plotly Express provides sample datasets, which we will use in most of the examples below.

[2]:

import plotly.express as px

# load DataFrames with sample data
tips = px.data.tips()
gapminder = px.data.gapminder()

print("\ntips:")
display(tips.head())
print("\ngapminder:")
display(gapminder.head())


tips:

	total_bill	tip	sex	smoker	day	time	size
0	16.99	1.01	Female	No	Sun	Dinner	2
1	10.34	1.66	Male	No	Sun	Dinner	3
2	21.01	3.50	Male	No	Sun	Dinner	3
3	23.68	3.31	Male	No	Sun	Dinner	2
4	24.59	3.61	Female	No	Sun	Dinner	4


gapminder:

	country	continent	year	lifeExp	pop	gdpPercap	iso_alpha	iso_num
0	Afghanistan	Asia	1952	28.801	8425333	779.445314	AFG	4
1	Afghanistan	Asia	1957	30.332	9240934	820.853030	AFG	4
2	Afghanistan	Asia	1962	31.997	10267083	853.100710	AFG	4
3	Afghanistan	Asia	1967	34.020	11537966	836.197138	AFG	4
4	Afghanistan	Asia	1972	36.088	13079460	739.981106	AFG	4

Scatter plot

[3]:

fig = px.scatter(tips,
                 x="total_bill",
                 y="tip",
                 color="sex",
                 width=750,   # plot width
                 height=500,  # plot height
                 title = "Scatter plot"
                )
fig.show()

Line plot

By default, Plotly Express uses DataFrame column names to label the coordinate axes, the legend title, etc. This can be changed using the labels argument. Its value should be a dictionary whose keys are column names and whose values are the labels we want to use.

[4]:

# select countries whose names start with "A"
ac = gapminder[gapminder["country"].str[0] == "A"]

fig = px.line(ac,
              x="year",
              y="gdpPercap",
              color = "country",
              labels = {"year" : "Year", # change x-axis label
                        "gdpPercap" : "GDP per capita",  # change y-axis label
                        "country" : "Country name"},  # change legend title

              title = "Line plot"
             )
fig.show()

Bar plot

By default, values in a column with categorical data are plotted in the order in which they are encountered in the DataFrame. In the example below, this means that the order of days on the x-axis would not necessarily correspond to the usual ordering of days in a week. We can override this by passing a dictionary as the category_orders argument. The dictionary keys are column names, and the value for a given column is a list of the values appearing in that column, in the order in which we want them plotted.

[5]:

# DataFrame with total tip amounts for a given day and sex
t = tips.groupby(["day", "sex"])["tip"].sum().reset_index()
display(t.head())

	day	sex	tip
0	Fri	Female	25.03
1	Fri	Male	26.93
2	Sat	Female	78.45
3	Sat	Male	181.95
4	Sun	Female	60.61

[6]:

fig = px.bar(t,
             x="day",
             y="tip",
             color="sex",
             barmode="group",
             category_orders = {"day" : ["Thur", "Fri", "Sat", "Sun"]}, # order days on x-axis
             title="Bar plot"
            )
fig.show()
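
The difference between the default order and an explicit order can be illustrated without plotting anything. The sketch below uses a short, made-up list of day values (not the actual row order of the tips dataset) and builds a dictionary of the form expected by category_orders. All variable names are chosen for illustration only.

```python
# Hypothetical sequence of values in a "day" column (made up for illustration)
days = ["Sun", "Sat", "Sun", "Thur", "Fri", "Sat"]

# Order of first appearance, i.e. the default plotting order described above;
# dict.fromkeys() keeps the first occurrence of each value and preserves order
default_order = list(dict.fromkeys(days))

# Desired order: the usual ordering of days in a week
week = ["Mon", "Tue", "Wed", "Thur", "Fri", "Sat", "Sun"]

# Keep only the days that actually occur in the data
present = set(days)
category_orders = {"day": [d for d in week if d in present]}

print(default_order)
print(category_orders)
```

A dictionary built this way could then be passed as the category_orders argument, in place of the list written out by hand above.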

Strip plot

In a strip plot, the values in each category are plotted along the y-axis. The x-coordinates of the plotted points are slightly randomized (jittered) to reduce overlap.

[7]:

fig = px.strip(tips,
               x="day",
               y="tip",
               color="sex",
               category_orders = {"day" : ["Thur", "Fri", "Sat", "Sun"]}, # order days on x-axis
               title = "Strip plot"
              )
fig.show()
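
The idea of randomizing x-coordinates can be sketched in a few lines of plain Python. This is only an illustration of the concept under assumed values: the helper name jitter and the width of 0.2 are hypothetical, and the way Plotly itself places the points may differ.

```python
import random

def jitter(position, count, width=0.2, seed=0):
    """Return `count` x-coordinates scattered uniformly around `position`."""
    rng = random.Random(seed)  # fixed seed, so that the result is reproducible
    return [position + rng.uniform(-width, width) for _ in range(count)]

# Suppose a category is drawn at x = 0 and contains five points
xs = jitter(position=0, count=5)

# Every point stays within `width` of the category position
print(all(abs(x) <= 0.2 for x in xs))
```

Without the random offsets, all five points would share the same x-coordinate, and points with similar y-values would be drawn on top of each other.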

Box plot

Components of a box plot:

The lower edge of a box marks the first quartile: 25% of data values are below it.
The line inside a box marks the median: 50% of data values are below it, and 50% are above it.
The upper edge of a box marks the third quartile: 75% of data values are below it.
The height of the box (i.e. the difference between the third and first quartiles) is called the interquartile range (IQR).
The whiskers of a box extend to the smallest and largest data values which are within 1.5 \(\times\) IQR of the lower and upper edges of the box.
Data values which are outside the range of whiskers are considered to be outliers. They are plotted as individual points.

[8]:

fig = px.box(tips,
             x="day",
             y="total_bill",
             color="sex",
             labels = {"total_bill" : "total bill"}, # change label of y-axis
             category_orders = {"day" : ["Thur", "Fri", "Sat", "Sun"]},  # order days on x-axis
             title="Box plot")
fig.show()
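
The components of a box plot listed above can be computed by hand. The sketch below does this for a small, made-up list of values (not taken from the tips dataset), using the statistics module from the Python standard library. Note that several conventions for computing quartiles exist. The "inclusive" method used here is an assumption, and the quartiles reported by Plotly may differ slightly.

```python
import statistics

# Hypothetical data values (made up for illustration)
values = [3.0, 10.0, 12.0, 13.0, 14.0, 15.0, 16.0, 18.0, 20.0, 40.0]

# First quartile, median and third quartile
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Interquartile range: the height of the box
iqr = q3 - q1

# Limits located 1.5 x IQR below and above the edges of the box
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Whiskers end at the most extreme data values inside these limits
lower_whisker = min(v for v in values if v >= lower_limit)
upper_whisker = max(v for v in values if v <= upper_limit)

# Values outside the limits are outliers, plotted as individual points
outliers = [v for v in values if v < lower_limit or v > upper_limit]

print(q1, median, q3, iqr)
print(lower_whisker, upper_whisker, outliers)
```

In this example the whiskers end at data values (10.0 and 20.0), not at the computed limits themselves, and the two remaining values are reported as outliers.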

Violin plot

A violin plot shows a kernel density estimate (KDE) of the data.

[9]:

fig = px.violin(tips,
                x="day",
                y="total_bill",
                color="sex",
                labels = {"total_bill" : "total bill"},  # change label of y-axis
                category_orders = {"day" : ["Thur", "Fri", "Sat", "Sun"]}, # order days on x-axis
                title="Violin plot")

fig.show()

A violin plot can be combined with a box plot:

[10]:

fig = px.violin(tips,
                x="day",
                y="total_bill",
                color="sex",
                labels = {"total_bill" : "total bill"},  # change label of y-axis
                category_orders = {"day" : ["Thur", "Fri", "Sat", "Sun"]}, # order days on x-axis
                box=True,   # add box plots
                title="Violin plot with boxes")

fig.show()
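
To make the notion of a kernel density estimate more concrete, the sketch below computes one in plain Python: a small Gaussian "bump" is placed at each data value, and the bumps are averaged. The sample values, the helper name gaussian_kde and the bandwidth of 2.0 are assumptions made for illustration. Plotly selects its own bandwidth, so this code does not reproduce the exact shape of the violins above.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function which estimates the density of `data` at a point x."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # sum of Gaussian bumps centered at the data values
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)

    return density

# Hypothetical data values (made up for illustration)
sample = [10.3, 12.5, 13.4, 16.0, 16.9, 18.2, 21.0, 24.6]
kde = gaussian_kde(sample, bandwidth=2.0)

# Evaluate the estimate on a grid from 0 to 40 with step 0.5
step = 0.5
grid = [i * step for i in range(81)]
density_values = [kde(x) for x in grid]

# A density integrates to 1; check this with a simple Riemann sum
area = sum(density_values) * step
print(round(area, 3))
```

The width of a violin at a given height corresponds to the value of such a density estimate at that point: the violin is wide where data values are concentrated and narrow where they are sparse.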

Histogram plot

Figures produced by Plotly Express can be customized using other Plotly tools. Below we use the update_layout() method of the figure to add a bit of space between the bars of a histogram (by default, the bars are plotted directly next to each other).

[11]:

fig = px.histogram(tips,
                   x="total_bill",
                   labels = {"total_bill" : "total bill"},
                   title = "Histogram"
                  )

fig.update_layout({"bargap": 0.02})  # add space between bars

fig.show()

Sunburst plot

[12]:

fig = px.sunburst(tips,
                  path=["day", "time", "sex"],
                  values="total_bill",
                  title="Sunburst plot")
fig.show()

Marginal plots

Several types of plots, including scatter plots and histograms, have options to include one or two marginal subplots. Below we use these options to add a rug plot on the margin of the x-axis and a box plot on the margin of the y-axis. Possible types of marginal plots are "rug", "box", "violin" and "histogram".

[13]:

fig = px.scatter(tips,
                 x="total_bill",
                 y="tip",
                 color="sex",
                 marginal_x="rug",  # plot on x-axis margin
                 marginal_y="box",  # plot on y-axis margin
                 title="Scatter plot with margin plots")
fig.show()

Pair plot

A scatter plot shows the relationship between two variables. The function px.scatter_matrix() is useful when we are dealing with more than two variables. It produces a grid of scatter plots, one plot for each pair of variables.

[14]:

fig = px.scatter_matrix(tips,
                        dimensions=["tip", "total_bill", "size"],  # names of columns used for the plot
                        color="sex",
                        title="Pair plot"
                       )
fig.show()

Animated plots

Some types of plots can be animated using the animation_frame and animation_group arguments. See the Plotly documentation for more details.

[15]:

fig = px.scatter(gapminder,
                 y="lifeExp",
                 x="gdpPercap",
                 color="continent",
                 size="pop",
                 hover_name="country",
                 animation_frame="year",   # values of this column create animation frames
                 animation_group="country",   # rows with the same value in this column describe the same marker in every frame
                 log_x = True,   # logarithmic scale on the x-axis
                 size_max=60,   # maximum size of markers
                 range_x=[200,60000],   # range of values on the x-axis
                 range_y=[25,90],   # range of values on the y-axis
                 labels = {"gdpPercap" : "GDP per capita",   # change label of x-axis
                           "lifeExp" : "life expectancy", },   # change label of y-axis
                 title="Animated scatter plot")
fig.show()
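
The roles of the two animation arguments can be pictured as follows: each distinct value of the animation_frame column yields one frame, and rows sharing a value of the animation_group column are treated as the same marker, which moves from frame to frame. The sketch below mimics this grouping in plain Python. It is a conceptual illustration with made-up rows, not a description of how Plotly is implemented.

```python
from collections import defaultdict

# Hypothetical rows of the form (country, year, lifeExp); values are made up
rows = [
    ("A", 1952, 30.0),
    ("B", 1952, 60.0),
    ("A", 1957, 32.0),
    ("B", 1957, 62.0),
]

# One frame per year; within a frame, one marker per country
frames = defaultdict(dict)
for country, year, life_exp in rows:
    frames[year][country] = life_exp

# Frames are shown in order; marker "A" moves from 30.0 to 32.0
for year in sorted(frames):
    print(year, frames[year])
```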

Choropleth maps

Choropleth maps represent statistical data for geographical areas by coloring each area according to the value of the data. To illustrate this, we will use data on agricultural exports of individual US states in 2011:

[3]:

import pandas as pd

url = "https://raw.githubusercontent.com/plotly/datasets/master/2011_us_ag_exports.csv"
df = pd.read_csv(url)
df.head(5)

[3]:

	code	state	category	total exports	beef	pork	poultry	dairy	fruits fresh	fruits proc	total fruits	veggies fresh	veggies proc	total veggies	corn	wheat	cotton
0	AL	Alabama	state	1390.63	34.4	10.6	481.0	4.06	8.0	17.1	25.11	5.5	8.9	14.33	34.9	70.0	317.61
1	AK	Alaska	state	13.31	0.2	0.1	0.0	0.19	0.0	0.0	0.00	0.6	1.0	1.56	0.0	0.0	0.00
2	AZ	Arizona	state	1463.17	71.3	17.9	0.0	105.48	19.3	41.0	60.27	147.5	239.4	386.91	7.3	48.7	423.95
3	AR	Arkansas	state	3586.02	53.2	29.4	562.9	3.53	2.2	4.7	6.88	4.4	7.1	11.45	69.5	114.5	665.44
4	CA	California	state	16472.88	228.7	11.1	225.4	929.95	2791.8	5944.6	8736.40	803.2	1303.5	2106.79	34.6	249.3	1064.95

We can create an additional column indicating if a state exports cotton:

[4]:

df["cotton_exports"] = df["cotton"] > 0
df.head(5)

[4]:

	code	state	category	total exports	beef	pork	poultry	dairy	fruits fresh	fruits proc	total fruits	veggies fresh	veggies proc	total veggies	corn	wheat	cotton	cotton_exports
0	AL	Alabama	state	1390.63	34.4	10.6	481.0	4.06	8.0	17.1	25.11	5.5	8.9	14.33	34.9	70.0	317.61	True
1	AK	Alaska	state	13.31	0.2	0.1	0.0	0.19	0.0	0.0	0.00	0.6	1.0	1.56	0.0	0.0	0.00	False
2	AZ	Arizona	state	1463.17	71.3	17.9	0.0	105.48	19.3	41.0	60.27	147.5	239.4	386.91	7.3	48.7	423.95	True
3	AR	Arkansas	state	3586.02	53.2	29.4	562.9	3.53	2.2	4.7	6.88	4.4	7.1	11.45	69.5	114.5	665.44	True
4	CA	California	state	16472.88	228.7	11.1	225.4	929.95	2791.8	5944.6	8736.40	803.2	1303.5	2106.79	34.6	249.3	1064.95	True

A choropleth map showing cotton-exporting states can be produced as follows:

[18]:

fig = px.choropleth(df,
                    scope="usa",   # scope of the map
                    locationmode="USA-states",    # we will specify US states using their codes
                    locations="code", # dataframe column with US state codes
                    color="cotton_exports", # values of this column will determine colors
                    category_orders={"cotton_exports": [True, False]},
                    color_discrete_sequence=["red", "lightgray"], # colors used in the map
                    title="States exporting cotton",
                    hover_name="state",
                    hover_data={"cotton_exports": False, "code": False},
                    labels={"cotton_exports": "Cotton exporters"}
                   )
fig.show()
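
In the call above, category_orders and color_discrete_sequence work together: Plotly Express assigns colors from the sequence to the categories in the given order, cycling through the sequence if there are more categories than colors. The sketch below mimics this pairing in plain Python. It is a conceptual illustration, not Plotly's implementation, and the variable names are chosen for illustration only.

```python
from itertools import cycle

categories = [True, False]      # order given in category_orders
colors = ["red", "lightgray"]   # colors given in color_discrete_sequence

# Pair each category with the next color, reusing colors if they run out
color_map = dict(zip(categories, cycle(colors)))
print(color_map)
```

A mapping of this kind, from category values to colors, is also the form taken by the color_discrete_map argument, which can be used to specify the colors explicitly.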

As another example, we will create a map showing what percentage of agricultural exports of each state are vegetables. First, we compute a new column with this percentage data:

[5]:

df["veggies_%"] = (df["total veggies"]/df["total exports"])*100
df.head(5)

[5]:

	code	state	category	total exports	beef	pork	poultry	dairy	fruits fresh	fruits proc	total fruits	veggies fresh	veggies proc	total veggies	corn	wheat	cotton	cotton_exports	veggies_%
0	AL	Alabama	state	1390.63	34.4	10.6	481.0	4.06	8.0	17.1	25.11	5.5	8.9	14.33	34.9	70.0	317.61	True	1.030468
1	AK	Alaska	state	13.31	0.2	0.1	0.0	0.19	0.0	0.0	0.00	0.6	1.0	1.56	0.0	0.0	0.00	False	11.720511
2	AZ	Arizona	state	1463.17	71.3	17.9	0.0	105.48	19.3	41.0	60.27	147.5	239.4	386.91	7.3	48.7	423.95	True	26.443270
3	AR	Arkansas	state	3586.02	53.2	29.4	562.9	3.53	2.2	4.7	6.88	4.4	7.1	11.45	69.5	114.5	665.44	True	0.319295
4	CA	California	state	16472.88	228.7	11.1	225.4	929.95	2791.8	5944.6	8736.40	803.2	1303.5	2106.79	34.6	249.3	1064.95	True	12.789445

Next, we plot this data:

[20]:

fig = px.choropleth(df,
                    scope="usa",
                    locationmode="USA-states",
                    locations="code",
                    color="veggies_%",
                    color_continuous_scale = "tempo", # color scale to use
                                                      # see px.colors.sequential for available scales
                    title="Percentage of veggie exports",
                    hover_name="state",
                    hover_data={"code": False},
                    labels={"veggies_%": "Veggie Exports %"},
                   )

fig.update_traces(marker_line_color="white") # plot state boundaries in white
fig.show()

See the Plotly documentation for additional information on plotting choropleth maps.