Creating interactive visualizations
In the first recipe, we gave a short preview of creating interactive visualizations in Python. In this recipe, we will show how to create interactive line plots using three different libraries: cufflinks, plotly, and bokeh. Naturally, these are not the only available libraries for interactive visualizations. Another popular one you might want to investigate further is altair.
The plotly library is built on top of d3.js (a JavaScript library used for creating interactive visualizations in web browsers) and is known for creating high-quality plots with a significant degree of interactivity (inspecting values of observations, viewing tooltips of a given point, zooming in, and so on). Plotly is also the name of the company that develops the library and provides hosting for our visualizations. We can create an unlimited number of offline visualizations and a few free ones to share online (with a limited number of views per day).
cufflinks is a wrapper library built on top of plotly. It was released before plotly.express was introduced as part of the plotly framework. The main advantages of cufflinks are:
- It makes the plotting much easier than pure plotly.
- It enables us to create the plotly visualizations directly on top of pandas DataFrames.
- It contains a selection of interesting specialized visualizations, including a special class for quantitative finance (which we will cover in the next recipe).
Lastly, bokeh is another library for creating interactive visualizations, aiming particularly for modern web browsers. Using bokeh, we can create beautiful interactive graphics, from simple line plots to complex interactive dashboards with streaming datasets. The visualizations of bokeh are powered by JavaScript, but actual knowledge of JavaScript is not explicitly required for creating the visualizations.
In this recipe, we will create a few interactive line plots using Microsoft’s stock price from 2020.
How to do it…
Execute the following steps to download Microsoft’s stock prices and create interactive visualizations:
- Import the libraries and initialize the notebook display:
    import pandas as pd
    import yfinance as yf
    import cufflinks as cf
    from plotly.offline import iplot, init_notebook_mode
    import plotly.express as px
    import pandas_bokeh

    cf.go_offline()
    pandas_bokeh.output_notebook()
- Download Microsoft’s stock prices from 2020 and calculate simple returns:
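    df = yf.download("MSFT",
                     start="2020-01-01",
                     end="2020-12-31",
                     auto_adjust=False,
                     progress=False)

    # simple returns based on the adjusted close price
    df["simple_rtn"] = df["Adj Close"].pct_change()

    # keep the two columns used for plotting and drop the first row,
    # which has no return
    df = df.loc[:, ["Adj Close", "simple_rtn"]].dropna()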
- Create the plot using cufflinks:
    df.iplot(subplots=True, shape=(2,1),
             shared_xaxes=True,
             title="MSFT time series")
Running the code creates the following plot:
Figure 3.11: Example of time series visualization using cufflinks
With the plots generated using cufflinks and plotly, we can hover over the line to see the tooltip containing the date of the observation and the exact value (or any other available information). We can also select a part of the plot that we would like to zoom in on for easier analysis.
- Create the plot using bokeh:
    df["Adj Close"].plot_bokeh(kind="line",
                               rangetool=True,
                               title="MSFT time series")
Executing the code generates the following plot:
Figure 3.12: Microsoft’s adjusted stock prices visualized using Bokeh
Because we set rangetool=True, the bokeh plot comes not only with the tooltip and zooming functionalities, but also with a range slider. We can use it to easily narrow down the range of dates that we would like to see in the plot.
- Create the plot using plotly.express:
    fig = px.line(data_frame=df,
                  y="Adj Close",
                  title="MSFT time series")
    fig.show()
Running the code results in the following visualization:
Figure 3.13: Example of time series visualization using plotly
In Figure 3.13, you can see an example of the interactive tooltip, which is useful for identifying particular observations within the analyzed time series.
How it works…
In the first step, we imported the libraries and initialized the notebook display for bokeh and the offline mode for cufflinks. Then, we downloaded Microsoft’s stock prices from 2020, calculated simple returns using the adjusted close price, and only kept those two columns for further plotting.
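To make the return calculation concrete, the following is a minimal sketch on a handful of made-up prices (hypothetical values, not actual MSFT data). It shows that pct_change is equivalent to dividing each price by the previous one and subtracting 1, and why the first row ends up missing.

```python
import pandas as pd

# Hypothetical prices for illustration only (not actual MSFT data).
prices = pd.Series(
    [100.0, 102.0, 99.96],
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
    name="Adj Close",
)

# pct_change computes P_t / P_{t-1} - 1 for each observation.
simple_rtn = prices.pct_change()

# The same calculation written out explicitly with shift().
manual_rtn = prices / prices.shift(1) - 1

# The first value is NaN because there is no previous price,
# which is why the recipe drops missing values afterwards.
print(simple_rtn)
print(manual_rtn)
```

The first observation has no predecessor, so its return is undefined. Calling dropna removes that row before plotting.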
In the third step, we created the first interactive visualization using cufflinks. As mentioned in the introduction, thanks to cufflinks, we can use the iplot method directly on top of the pandas DataFrame. It works similarly to the original plot method. Here, we indicated that we wanted to create subplots in one column, sharing the x-axis. The library handled the rest and created a nice and interactive visualization.
In Step 4, we created a line plot using bokeh. We did not use the pure bokeh library, but a wrapper that connects it to pandas: pandas_bokeh. Thanks to it, we could access the plot_bokeh method directly on top of the pandas DataFrame to simplify the process of creating the plot.
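To give a sense of what the wrapper saves us, here is a minimal sketch of a comparable line plot with a hover tooltip written in pure bokeh. It is an illustration under assumptions rather than what pandas_bokeh generates internally: the data is synthetic, and the names demo_df, source, and bokeh_fig are hypothetical.

```python
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Hypothetical data for illustration only (not actual MSFT prices).
demo_df = pd.DataFrame(
    {
        "Date": pd.bdate_range("2020-01-01", periods=5),
        "price": [100.0, 101.0, 99.5, 102.0, 103.5],
    }
)

# bokeh glyphs read their data from a ColumnDataSource.
source = ColumnDataSource(demo_df)

# A datetime x-axis makes bokeh format the ticks as dates.
bokeh_fig = figure(x_axis_type="datetime", title="MSFT time series (synthetic data)")
bokeh_fig.line(x="Date", y="price", source=source, line_width=2)

# Tooltip showing the date and the value of the hovered observation.
hover = HoverTool(
    tooltips=[("date", "@Date{%F}"), ("price", "@price{0.2f}")],
    formatters={"@Date": "datetime"},
)
bokeh_fig.add_tools(hover)

# To render: from bokeh.plotting import show; show(bokeh_fig)
```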
Lastly, we used the plotly.express framework, which is now officially part of the plotly library (it used to be a standalone library). Using the px.line function, we can easily create a simple, yet interactive line plot.
There’s more…
When using visualizations to tell a story or to present the outputs of our analyses to stakeholders or a non-technical audience, there are a few techniques that might improve the plot’s ability to convey a given message. Annotations are one of those techniques and we can easily add them to the plots generated with plotly (we can do so with other libraries as well).
We show the required steps below:
- Import the libraries:
from datetime import date
- Define the annotations for the plotly plot:
    selected_date_1 = date(2020, 2, 19)
    selected_date_2 = date(2020, 3, 23)

    first_annotation = {
        "x": selected_date_1,
        "y": df.query(f"index == '{selected_date_1}'")["Adj Close"].squeeze(),
        "arrowhead": 5,
        "text": "COVID decline starting",
        "font": {"size": 15, "color": "red"},
    }

    second_annotation = {
        "x": selected_date_2,
        "y": df.query(f"index == '{selected_date_2}'")["Adj Close"].squeeze(),
        "arrowhead": 5,
        "text": "COVID recovery starting",
        "font": {"size": 15, "color": "green"},
        "ax": 150,
        "ay": 10
    }
The dictionaries contain a few elements that might be worthwhile to explain:
- x/y: The location of the annotation on the x- and y-axes respectively
- text: The text of the annotation
- font: The font’s formatting
- arrowhead: The shape of the arrowhead we want to use
- ax/ay: The offset along the x- and y-axes from the indicated point
We frequently use the offset to make sure that the annotations are not overlapping with each other or with other elements of the plot.
After defining the annotations, we can simply add them to the plot.
- Update the layout of the plot and show it:
    fig.update_layout(
        {"annotations": [first_annotation, second_annotation]}
    )
    fig.show()
Running the snippet generates the following plot:
Figure 3.14: Time series visualization with added annotations
Using the annotations, we have marked the dates when the market started to decline due to the COVID-19 pandemic, as well as when it started to recover and rise again. The dates used for annotations were selected simply by viewing the plot.
See also
- https://bokeh.org/: For more information about bokeh.
- https://altair-viz.github.io/: You can also inspect altair, another popular Python library for interactive visualizations.
- https://plotly.com/python/: plotly's Python documentation. The library is also available for other programming languages such as R, MATLAB, or Julia.