Exploring the Binance API in Python — Part I: The Order Book

Louis Tiao
5 min read · Feb 9, 2021


In this post, we will explore the live order book data on Binance through its official API using Python.

Photo by Chris Liverani on Unsplash

We interact with the API endpoints directly and make the low-level HTTP requests ourselves. If you’re just looking for a high-level way to interact with the API that abstracts away these details, check out python-binance, an unofficial but slick and well-designed Python client for the Binance API.

We will be making the requests using the requests library. Thereafter, we will process the results with pandas, and visualize them with matplotlib and seaborn. Let’s import these dependencies now:

import requests
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

To make a GET request for the symbol ETHBUSD from the /depth endpoint:

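r = requests.get("https://api.binance.com/api/v3/depth",
                 params=dict(symbol="ETHBUSD"))
results = r.json()
# Values used in the plot titles further down.
last_update_id = results["lastUpdateId"]
t = pd.Timestamp.now(tz="UTC")  # assumption: time of the request; /depth returns no timestamp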

Load the buy and sell orders, or bids and asks, into respective DataFrames:

frames = {side: pd.DataFrame(data=results[side],
                             columns=["price", "quantity"], dtype=float)
          for side in ["bids", "asks"]}

Concatenate the DataFrames containing bids and asks into one big frame:

frames_list = [frames[side].assign(side=side) for side in frames]
data = pd.concat(frames_list, axis="index",
                 ignore_index=True, sort=True)

Get a statistical summary of the price levels in the bids and asks:

price_summary = data.groupby("side").price.describe()
price_summary.to_markdown()
HTML Rendering of Markdown table output from Pandas

Note that, by default, the /depth endpoint returns only the lowest 100 asks
and the highest 100 bids (see the count column).
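The number of levels returned per side is controlled by the endpoint's limit query parameter, which defaults to 100. As a minimal sketch (the helper name depth_url is hypothetical, and the accepted values of limit should be checked against the current Binance API documentation), the URL for a deeper snapshot could be built like so:

```python
from urllib.parse import urlencode

DEPTH_ENDPOINT = "https://api.binance.com/api/v3/depth"


def depth_url(symbol, limit=100):
    """Return the /depth URL for `symbol` with `limit` price levels per side.

    Hypothetical helper for illustration; it only builds the URL and does
    not send a request.
    """
    query = urlencode({"symbol": symbol, "limit": limit})
    return f"{DEPTH_ENDPOINT}?{query}"


url = depth_url("ETHBUSD", limit=500)
print(url)
```

Equivalently, limit can be added to the params dictionary passed to requests.get, e.g. params=dict(symbol="ETHBUSD", limit=500).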

Top of the book

The highest bid price and the lowest asking price are together known as the top of the book, and the difference between these two price levels is known as the bid-ask spread. The prices of the most recent trades will typically be at or between these two levels.

>>> frames["bids"].price.max()
1056.58
>>> frames["asks"].price.min()
1056.64

We can also get this information from the /ticker/bookTicker endpoint:

r = requests.get("https://api.binance.com/api/v3/ticker/bookTicker",
                 params=dict(symbol="ETHBUSD"))
book_top = r.json()

Read this into a Pandas Series and render it as a Markdown table:

name = book_top.pop("symbol")  # get symbol and also delete
s = pd.Series(book_top, name=name, dtype=float)
s.to_markdown()
HTML Rendering of Markdown table output from Pandas
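Given the top of the book, the bid-ask spread and the mid price follow directly. The snippet below is a small sketch on a hand-written payload shaped like the /ticker/bookTicker response; the prices match the output shown earlier, while the quantities are made-up placeholders.

```python
import pandas as pd

# Sample payload in the shape returned by /ticker/bookTicker.
# Prices are taken from the output shown earlier; quantities are illustrative only.
book_top = {
    "symbol": "ETHBUSD",
    "bidPrice": "1056.58",
    "bidQty": "2.5",
    "askPrice": "1056.64",
    "askQty": "1.0",
}

name = book_top.pop("symbol")
s = pd.Series(book_top, name=name).astype(float)  # strings -> floats

spread = s["askPrice"] - s["bidPrice"]            # gap between best ask and best bid
mid_price = (s["askPrice"] + s["bidPrice"]) / 2   # midpoint of the two

print(f"spread: {spread:.2f}, mid price: {mid_price:.2f}")
```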

Scatter plot

Let us visualize all the order book entries using a scatter plot, showing price along the x-axis, and quantity along the y-axis. The hue signifies whether the entry is an “ask” or a “bid”.

fig, ax = plt.subplots()
ax.set_title(f"Last update: {t} (ID: {last_update_id})")
sns.scatterplot(x="price", y="quantity", hue="side",
                data=data, ax=ax)
ax.set_xlabel("Price")
ax.set_ylabel("Quantity")
plt.show()
Scatter Plot Visualization of the Order Book

This is the most verbose visualization, displaying all the raw information, but perhaps also providing the least amount of actionable insights.

Histogram plot

We can compress this information into a histogram.

binwidth = 0.25  # assumption: bin width in price units (BUSD); not stated in the original
fig, ax = plt.subplots()
ax.set_title(f"Last update: {t} (ID: {last_update_id})")
sns.histplot(x="price", hue="side", binwidth=binwidth,
             data=data, ax=ax)
sns.rugplot(x="price", hue="side", data=data, ax=ax)
plt.show()

This shows the number of order book entries falling in each price bin, but obscures the volume (or quantity) behind them.

This can be misleading. For example, there could be 1 entry around price p_1 and 100 entries around price p_2. However, the 1 entry around p_1 could be for 100 ETH, while each of those 100 entries around p_2 could be for just 1 ETH. In both bins, the total quantity of ETH being bid is in fact identical. Yet this plot would suggest that there is 100 times greater demand for ETH around p_2.
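To make the discrepancy concrete, here is a small sketch on synthetic data (not real market data) that compares the number of entries per price bin with the total quantity per bin. The unit-wide bins are an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic bids: one large entry in the first price bin, and 100 small
# entries spread across the second.
small_prices = 1051 + np.arange(100) / 100   # 1051.00, 1051.01, ..., 1051.99
bids = pd.DataFrame({
    "price": np.concatenate([[1050.50], small_prices]),
    "quantity": np.concatenate([[100.0], np.ones(100)]),
})

# Round each price down to a whole unit to form bins, then compare the
# number of entries per bin with the total quantity per bin.
bins = np.floor(bids["price"]).astype(int).rename("price_bin")
summary = bids.groupby(bins)["quantity"].agg(["count", "sum"])
print(summary)
```

The count column is what an unweighted histogram displays, whereas the sum column is the quantity actually being bid in each bin.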

Weighted histogram plot

This is easy to fix by weighting each entry by its quantity, which just amounts to setting weights="quantity":

fig, ax = plt.subplots()
ax.set_title(f"Last update: {t} (ID: {last_update_id})")
sns.histplot(x="price", weights="quantity", hue="side",
             binwidth=binwidth, data=data, ax=ax)
sns.scatterplot(x="price", y="quantity", hue="side",
                data=data, ax=ax)
ax.set_xlabel("Price")
ax.set_ylabel("Quantity")
plt.show()
Weighted histogram plot

This paints a more accurate picture of supply and demand, but still offers
limited actionable insight.

For example, suppose we wanted to purchase 200 ETH. Based on this visualization alone, can you tell at what price you need to bid so that your buy is guaranteed to be filled? Nope.

To obtain this information, you need to sort the asks by price in ascending order and take the cumulative sum of their quantities. The first price at which the running total reaches 200 ETH is the lowest bid that the current book could fill in full. Conversely, if you wanted to work out at what price you should be asking so that your sale is filled, you need to do the same with the bids, but with the prices in descending order.
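The procedure just described can be sketched in a few lines of pandas. The helper price_to_fill and the order book below are hypothetical and synthetic, and the result only holds for a static snapshot of the book.

```python
import pandas as pd


def price_to_fill(levels, quantity, ascending):
    """Return the first price at which cumulative quantity reaches `quantity`.

    levels: DataFrame with "price" and "quantity" columns (one side of the book).
    ascending: True to walk asks upwards (buying), False to walk bids
    downwards (selling). Returns None if the side is not deep enough.
    Hypothetical helper for illustration only.
    """
    book = levels.sort_values("price", ascending=ascending)
    cumulative = book["quantity"].cumsum()
    reached = (cumulative >= quantity).to_numpy()
    if not reached.any():
        return None
    return float(book["price"].iloc[reached.argmax()])


# Synthetic order book (not real market data).
asks = pd.DataFrame({"price": [1057.50, 1056.64, 1057.00, 1056.70],
                     "quantity": [40.0, 50.0, 80.0, 60.0]})
bids = pd.DataFrame({"price": [1056.00, 1056.58, 1056.50],
                     "quantity": [100.0, 30.0, 100.0]})

buy_price = price_to_fill(asks, 200, ascending=True)     # walk asks upwards
sell_price = price_to_fill(bids, 120, ascending=False)   # walk bids downwards
print(buy_price, sell_price)
```

Returning None when the visible book is too shallow is a design choice made here to keep the sketch explicit about the limit on the number of levels the snapshot contains.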

Weighted empirical CDF (ECDF) plot — aka the “Depth Chart”

We finally arrive at the depth chart, which is a popular visualization that is ubiquitous across exchanges and trading platforms. The depth chart is essentially just a combination of two empirical cumulative distribution function (CDF), or ECDF, plots.

More precisely, they are weighted and unnormalized ECDF plots. As before, they are weighted by the quantity, and they are unnormalized in the sense that they are not rescaled to be proportions in [0, 1]. Rather, they are simply kept as counts. Additionally, in the case of bids, we take the complementary ECDF (which basically reverses the order in which the cumulative sum is taken).

In code, this amounts to making calls to sns.ecdfplot with the options weights="quantity" (self-explanatory) and stat="count" (to keep the plot
unnormalized). Finally, for the bids, we add the option complementary=True. Putting it all together:

fig, ax = plt.subplots()
ax.set_title(f"Last update: {t} (ID: {last_update_id})")
sns.ecdfplot(x="price", weights="quantity", stat="count",
             complementary=True, data=frames["bids"], ax=ax)
sns.ecdfplot(x="price", weights="quantity", stat="count",
             data=frames["asks"], ax=ax)
sns.scatterplot(x="price", y="quantity", hue="side",
                data=data, ax=ax)
ax.set_xlabel("Price")
ax.set_ylabel("Quantity")
plt.show()
Weighted empirical CDF (ECDF) plot
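For intuition, the two curves of the depth chart can also be computed by hand with cumulative sums. This is a sketch on synthetic data, not seaborn's implementation, and the exact placement of the step edges may differ from what sns.ecdfplot draws.

```python
import pandas as pd

# Synthetic order book (not real market data).
frames = {
    "bids": pd.DataFrame({"price": [1056.58, 1056.50, 1056.00],
                          "quantity": [30.0, 100.0, 100.0]}),
    "asks": pd.DataFrame({"price": [1056.64, 1056.70, 1057.00],
                          "quantity": [50.0, 60.0, 80.0]}),
}

# Asks: total quantity offered at or below each price (ascending cumsum).
asks_depth = frames["asks"].sort_values("price")
asks_depth = asks_depth.assign(depth=asks_depth["quantity"].cumsum())

# Bids: total quantity bid at or above each price (descending cumsum),
# which plays the role of the complementary curve.
bids_depth = frames["bids"].sort_values("price", ascending=False)
bids_depth = bids_depth.assign(depth=bids_depth["quantity"].cumsum())

print(bids_depth)
print(asks_depth)
```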

With that, let us return to the question I posed earlier.

Suppose we wanted to purchase 200 ETH. Based on this visualization alone, can you tell at what price you need to bid so that your buy is guaranteed to be filled?

Easy. Roughly speaking, a bid at almost exactly halfway between $1,057 and $1,058 would be filled by the matching engine right away, assuming the book has not moved in the meantime.

This post was originally featured on my blog. To receive updates on more posts like this, follow me on Twitter and GitHub!

