Lan D. Phan



Options trading data analysis — Part 2-Visuals

In Part 1, we started our journey by obtaining and learning about transactional brokerage data, creating a Pandas dataframe, and performing basic analysis to calculate the running capital and principal from this data. That’s a start, but I am personally a visual person, so in this part, let’s explore what’s available to us for visualizing the information we’ve gathered. There are several popular visualization libraries available, such as Matplotlib and Altair. Let’s explore how we can use Altair for visualization.

Continuing from where we left off, we simply import altair.

import json
import pandas as pd
from pandas.io.json import json_normalize
import numpy as np
import altair

What we’re interested in plotting here is the principal and capital over time. In our dataframe, we can use transactionDate as the time axis. Notice first, however, that it is stored as a string, for example: 2020-06-22T19:56:19+0000. Pandas has a to_datetime function that can do the conversion for us. We can either specify the exact datetime format or let the function infer it. We’ll specify errors='raise' so that we get notified if there’s a conversion problem.

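# Option 1: specify the exact datetime format.
#transactionsDf['transactionDate'] = pd.to_datetime(transactionsDf['transactionDate'], errors='raise', format='%Y-%m-%dT%H:%M:%S%z')
# Option 2: let Pandas infer the format (pandas 2.0+ infers by default and deprecates infer_datetime_format).
transactionsDf['transactionDate'] = pd.to_datetime(transactionsDf['transactionDate'], errors='raise', infer_datetime_format=True)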
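
To see the conversion in isolation, here is a minimal sketch on two made-up timestamp strings (sample values, not real brokerage data). It uses the explicit format and also shows that, with errors='raise', a value that doesn’t match is reported as a ValueError rather than passing through silently.

```python
import pandas as pd

# Made-up sample strings in the same shape as the brokerage timestamps.
raw = pd.Series(["2020-06-22T19:56:19+0000", "2020-06-23T14:01:05+0000"])

# Explicit format: %z consumes the +0000 UTC offset.
parsed = pd.to_datetime(raw, errors="raise", format="%Y-%m-%dT%H:%M:%S%z")
print(parsed)

# With errors='raise', a value that does not match stops the conversion.
conversion_failed = False
try:
    pd.to_datetime(pd.Series(["not a date"]), errors="raise",
                   format="%Y-%m-%dT%H:%M:%S%z")
except ValueError as exc:
    conversion_failed = True
    print("conversion problem:", exc)
```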

We can then create a slice of the dataframe with just the three columns we need.

capitalPrincipal = transactionsDf.loc[:,['transactionDate', 'capital', 'principal']]

The output for my sample data looks something like this:

Figure 1

The following code generates the capital and principal plots over time and displays them side by side.

capitalPlot = altair.Chart(capitalPrincipal).mark_point()  \
    .encode(x='transactionDate', y='capital') \
    .properties(height=300, width=300)
principalPlot = altair.Chart(capitalPrincipal).mark_point()  \
    .encode(x='transactionDate', y='principal') \
    .properties(height=300, width=300)
capitalPlot | principalPlot

Figure 2

We can use + instead of | to put both the capital and principal in the same graph: capitalPlot + principalPlot

Figure 3

It’s also a good idea to use different colors for capital and principal. We can do this by specifying the color we want in the mark_point call.

principalPlot = altair.Chart(capitalPrincipal).mark_point(color='orange') \
    .encode(x='transactionDate', y='principal') \
    .properties(height=300, width=500)

Figure 4

Though it’s easy to create separate plots for different data series and combine them, this doesn’t scale. We can instead reshape the dataframe so that the data are combined for us. We can do this using the melt operation provided by Pandas, which keeps an identifier column and stacks all the other columns into two new columns: one holding the original column name and one holding the value. In our case, the identifier column is transactionDate. We’ll name the column of original column names capitalPrincipal and the column of values amount.

capitalPrincipalRs = capitalPrincipal.melt('transactionDate', var_name='capitalPrincipal', value_name='amount')

This will shape the dataframe as follows:

Figure 5
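
To make the reshaping concrete, here is a minimal sketch of the same melt call on a tiny made-up dataframe (two rows of sample values, not the real data). The names wide_df and long_df are only for illustration.

```python
import pandas as pd

# Tiny made-up "wide" dataframe: one row per date, one column per series.
wide_df = pd.DataFrame({
    "transactionDate": pd.to_datetime(["2020-06-01", "2020-06-02"]),
    "capital": [1050.0, 1100.0],
    "principal": [1000.0, 1000.0],
})

# Keep transactionDate as the identifier; stack the other columns.
long_df = wide_df.melt("transactionDate",
                       var_name="capitalPrincipal", value_name="amount")
print(long_df)
```

Each date now appears once per original column, so two rows become four.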

As you can see, we’ve doubled the number of rows compared to the original dataframe, since the capital and principal columns have been combined into a single capitalPrincipal column and the original column names have become its values. The original values are now in the amount column. We can now generate a single graph.

rsPlot = altair.Chart(capitalPrincipalRs).mark_point()  \
    .encode(x='transactionDate', y='amount', color='capitalPrincipal') \
    .properties(height=300, width=500)

Figure 6

We tell Altair to color the points by the values in the capitalPrincipal column, which produces a differently colored series for each value. We also get a nice legend.

So far, we can see a general picture of what’s happening overall. Capital generally stays above principal, which is a good thing. But with this plot, we have to keep this fact in mind constantly as we look at it. You might start asking, “This is not good enough? How lazy are you?”, but I want a little extra. If we can get the delta between the capital and principal, then I won’t have to carry this mental state around. We’ll generate a new column for the dataframe.

capitalPrincipal['delta'] = capitalPrincipal['capital'] - capitalPrincipal['principal']

The output looks like this:

Figure 7

And we can simply generate a chart.

altair.Chart(capitalPrincipal).mark_point()  \
    .encode(x='transactionDate', y='delta') \
    .properties(height=300, width=500)

Figure 8

Now with this chart, I can sleep better at night seeing that it doesn’t go below 0. In fact, it needs to go up over time. Also, notice there’s an outlier in the data. This is likely due to some options being assigned or exercised, probably from a vertical spread. We can smooth this out by resampling. Pandas is really good with time series and is basically made for this. We can resample to a 7-day average as follows:

capitalPrincipalResampled = capitalPrincipal.resample('7D', on='transactionDate').mean().reset_index()

Figure 9
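
Here is a minimal sketch of what the 7-day resampling does, using four made-up delta values (one of them deliberately large to play the outlier). The names daily and weekly are only for illustration.

```python
import pandas as pd

# Made-up sample: two transactions in each of two consecutive 7-day windows.
daily = pd.DataFrame({
    "transactionDate": pd.to_datetime(
        ["2020-06-01", "2020-06-03", "2020-06-09", "2020-06-10"]),
    "delta": [100.0, 200.0, 1000.0, 400.0],  # 1000.0 plays the outlier
})

# Group rows into 7-day bins on transactionDate and average each bin.
weekly = daily.resample("7D", on="transactionDate").mean().reset_index()
print(weekly)
```

The first bin starts on the day of the earliest transaction, so the outlier of 1000 is averaged with 400 into a single point of 700 for the second week.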

Now we can see that the outlier’s effect is reduced. And while we’re at it, let’s also generate and plot the delta as a percentage of the principal.

capitalPrincipal['deltaApoPrincipal'] = (capitalPrincipal['delta'] / capitalPrincipal['principal']) * 100
capitalPrincipalResampled = capitalPrincipal.resample('7D', on='transactionDate').mean().reset_index()
altair.Chart(capitalPrincipalResampled).mark_point()  \
    .encode(x='transactionDate', y='deltaApoPrincipal') \
    .properties(height=300, width=500)

Figure 10
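
To check the arithmetic behind the percentage column, here is a minimal sketch on made-up numbers (not the real account values). The dataframe name frame is only for illustration; the column names follow the ones used above.

```python
import pandas as pd

# Made-up sample: the principal stays at 1000 while the capital moves.
frame = pd.DataFrame({
    "capital": [1100.0, 1250.0, 950.0],
    "principal": [1000.0, 1000.0, 1000.0],
})

# Delta is capital minus principal; the percentage is relative to principal.
frame["delta"] = frame["capital"] - frame["principal"]
frame["deltaApoPrincipal"] = (frame["delta"] / frame["principal"]) * 100
print(frame)
```

A negative percentage, as in the last row, means the capital has dropped below the principal.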

There we have it. Thanks to all the available useful tools and with a bit of motivation, we are able to extract information and develop insights from data. Thanks for reading and let’s continue our journey soon.

Part 3-The openings
