avatarZach Quinn

Free AI web copilot to create summaries, insights and extended knowledge, download it at here

4596

Abstract

n’t accept or require additional parameters. Just make a call to this URL and you’re set: <a href="https://api.stripe.com/v1/payouts">https://api.stripe.com/v1/payouts</a>.</p><p id="b281">Let’s test the connection to make sure the configuration is correct.</p><div id="2c5b"><pre><span class="hljs-keyword">def</span> <span class="hljs-title function_">make_request</span>(<span class="hljs-params">url, token</span>):

logging.info(<span class="hljs-string">"Making request to Stripe 'payout' end point..."</span>)

req = requests.get(url, headers=token)

logging.info(<span class="hljs-string">f"Response: <span class="hljs-subst">{req.status_code}</span>"</span>)

make_request(cfg.base_url, token)</pre></div><figure id="5329"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*PGyqclF-t_NuHkQsL8M0XQ.jpeg"><figcaption>200 response. Screenshot by the author.</figcaption></figure><p id="5c5c"><i>Pardon the interruption: For more Python, SQL and cloud computing walkthroughs, follow <a href="https://medium.com/pipeline-a-data-engineering-resource"><b>Pipeline: Your Data Engineering Resource</b></a>.</i></p><p id="74cd"><i>To receive my latest writing, you can <a href="https://medium.com/@zachl-quinn"><b>follow me</b></a> as well.</i></p><h1 id="19d6">Getting Your Earnings Data</h1><p id="7c55">Now if we make a real call, we should see a few months’ worth of payments data in JSON form.</p><div id="8b84"><pre><span class="hljs-keyword">def</span> <span class="hljs-title function_">make_request</span>(<span class="hljs-params">url, token</span>):

logging.info(<span class="hljs-string">"Making request to Stripe 'payout' end point..."</span>)

req = requests.get(url, headers=token)

logging.info(<span class="hljs-string">f"Response: <span class="hljs-subst">{req.status_code}</span>"</span>)

<span class="hljs-keyword">if</span> req.status_code == <span class="hljs-number">200</span>:
    data = req.json()[<span class="hljs-string">'data'</span>]
    
    <span class="hljs-keyword">return</span> data
<span class="hljs-keyword">else</span>:
    logging.info(<span class="hljs-string">f"Response is <span class="hljs-subst">{req.status_code}</span>."</span>)
    
    <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>

make_request(cfg.base_url, token)</pre></div><figure id="bcde"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*SJ4EjoeNE70X0JvAj0nzOA.jpeg"><figcaption>Stripe payout object response. Screenshot by the author.</figcaption></figure><p id="d0bb">To confirm that we’re getting production data we can check the field “livemode.”</p><p id="efe6">If livemode is equivalent to true then we’re getting live (production) data.</p><figure id="cbb1"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*-ewskHbU_1CUpK74ejEFhQ.jpeg"><figcaption>Livemode equals true. Screenshot by the author.</figcaption></figure><p id="dc03">The good news is that the JSON returned is unnested, which means we don’t need to do any additional parsing.</p><p id="efb6">We just need to access the “data” key. Since this isn’t nested JSON, we can pass the “data” payload directly to Pandas’ pd.DataFrame() to create a quick data frame.</p><div id="e031"><pre><span class="hljs-keyword">def</span> <span class="hljs-title function_">make_request</span>(<span class="hljs-params">url, token</span>):

logging.info(<span class="hljs-string">"Making request to Stripe 'payout' end point..."</span>)

req = requests.get(url, headers=token)

logging.info(<span class="hljs-string">f"Response: <span class="hljs-subst">{req.status_code}</span>"</span>)

<span class="hljs-keyword">if</span> req.status_code == <span class="hljs-number">200</span>:
    data = req.json()[<span class="hljs-string">'data'</span>]
    
    <span class="hljs-keyword">return</span> data
<span class="hljs-keyword">else</span>:
    logging.info(<span class="hljs-string">f"Response is <span class="hljs-subst">{req.status_code}</span>."</span>)
    
    <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>

<span class="hljs-keyword">def</span> <span class="hljs-title function_">format_df</span>(<span class="hljs-params">data</span>):

df = pd.DataFrame(data)

<span class="hljs-keyword">return</span> df

<span class="hljs-keyword">def</span> <span class="hljs-title function_">main</span>():

data = make_request(cfg.base_url, token)

df = pd.DataFrame(data)

<sp

Options

an class="hljs-keyword">return</span> df

main()</pre></div><p id="dc3e">Now we can see a cleaner, nearly-finalized output of what we seek: Updated payment data.</p><figure id="4edc"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*62IelK-QW7aKrYTiJfmGlg.jpeg"><figcaption>Data frame. Screenshot by the author.</figcaption></figure><p id="4eb1">However, if you’re going to use this in a dashboard, you will want to clean up the respective fields by doing some data frame manipulation.</p><h1 id="fb62">Cleaning Up Your Earnings Data Frame</h1><p id="1983">For me, the biggest culprits are the amount and arrival_date fields.</p><figure id="e5d5"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*27vejzHxzR25kKuZIJdWWg.jpeg"><figcaption></figcaption></figure><p id="55e3">To clarify, arrival_date is the date that the funds arrived in Stripe from the payment source. Amount, obviously, is the amount.</p><p id="a54a"><i>Note: For my privacy I’ll be excluding outputs that include amount.</i></p><p id="88e6">However, your amount value might be inflated because it’s not correctly converted to a float value.</p><div id="9fcb"><pre>df[<span class="hljs-string">'amount'</span>] = df[<span class="hljs-string">'amount'</span>].astype(<span class="hljs-built_in">float</span>) / <span class="hljs-number">100</span></pre></div><p id="c0fb">Similarly, your arrival_date field might be ingested as a rather messy unix value that you’ll also want to convert.</p><div id="86a8"><pre>df[<span class="hljs-string">'arrival_date'</span>] = pd.to_datetime(df[<span class="hljs-string">'arrival_date'</span>], unit=<span class="hljs-string">'s'</span>) df[<span class="hljs-string">'arrival_date'</span>] = df[<span class="hljs-string">'arrival_date'</span>].dt.strftime(<span class="hljs-string">"%Y-%m-%d"</span>)</pre></div><p id="a32a">After applying these tweaks, we can see a much neater data frame emerge.</p><figure id="382b"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*O62Dw4RR4dE-IbBJsUDxUA.jpeg"><figcaption>Arrival date field. Screenshot by the author.</figcaption></figure><p id="18a6">But if you’ve had funds deposited into your Stripe account for a longer period of time (say, over a year), you’ll notice that the API doesn’t return every value.</p><p id="d0ea">Stripe APIs include pagination, which mean that you will only retrieve the first page of results.</p><p id="5f93">If I were building out a pipeline that needed to ingest historic data on a recurring basis, I’d care about this. But since, going forward, I just want to ingest the prior month’s earnings, I can cheat a bit when it comes to getting the rest of the values.</p><p id="1b79">If you look in the Stripe dashboard you can see a parameter for a custom date range.</p><figure id="6a16"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*dd4uYZi6dYtfPXyMGkdLJw.jpeg"><figcaption>Stripe UI, overview. Screenshot by the author.</figcaption></figure><p id="79ad">If you tweak this you can get all the data for the entire lifespan of your account.</p><figure id="2242"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*K27jXOFqaRSoRE9Au59ZAQ.jpeg"><figcaption>Earnings line graph. Screenshot by the author.</figcaption></figure><p id="14e4">Unless you’ve had a Stripe integration for years, entering these values manually shouldn’t be too much of a chore. Again, since this is a backfill, you only need to do this once.</p><p id="daf6">As a final check, I highly recommend taking a sum of your amount column and summing the values in your Stripe dashboard.</p><p id="1447">If they match, congratulations, you did it!</p><h1 id="66cd">Taking The Next Steps…</h1><p id="de07">You’ve completed the leg work of identifying your data source (Stripe’s payouts end point), connecting to the API, conducting a quality assurance check (QA) against your results and, finally, backfilling your missing values.</p><p id="f87d">But the fun is just beginning, because you don’t want to have to manually make this call every month to get your sweet, sweet earnings.</p><p id="eed7">You want this to run in the background and maybe even get an email report each time your gross earnings are updated. Wouldn’t that be nice?</p><p id="cdec">Click for <a href="https://readmedium.com/theres-an-api-to-get-your-partner-program-earnings-but-it-s-not-what-you-think-pt-2-2381309ebf2f">Part II</a>.</p><p id="eb62"><b>Create a job-worthy data portfolio. Learn how <a href="https://pipe_line.ck.page/e97fc26c83">with my free project guide</a>.</b></p></article></body>

There’s An API To Get Your Partner Program Earnings — But It’s Not What You Think

Track your writer/creator revenue by sourcing, ingesting and cleaning a data frame of your Partner Program earnings data.

I need your help. Take a minute to answer a 3-question survey to tell me how I can help you outside this blog. All responses receive a free gift.

Why You Can’t (Easily) Find Your Earnings Data

In addition to living in a golden age of democratized content, creators are simultaneously living through a different renaissance: The age of dashboards.

As much as writers and other creatives like to claim they “don’t read the comments” or “don’t look at stats”, they (myself included), need metrics to know, at the very least, if something “works” (whatever that means for your individual use case).

To see the importance of dashboard utility, look no further than this platform’s latest update. In the same breath that proclaimed a commitment to memorable, human writing, the team also unveiled a new and improved dashboard to track engagement and earnings.

If you’re a data-minded creative and want to go a step further, there are even API integrations that can get you the same data so you can build out bespoke reports.

But even third-party APIs lack one very important (to some) data source: Earnings. Specifically the monthly lump sum reported in the Stats dashboard.

As I built out a custom dashboard that got me nearly every available engagement metric, I hit a wall when it came to ingesting and integrating earnings data in my reports.

The reason for this is frustratingly simple.

When it comes to what matters, that monthly payout, this data doesn’t even live in Medium.

For that we’ll need to open an account you probably don’t think about as much: Stripe.

Photo by Travis Essinger on Unsplash

Find Your Data — And Your Money

One of the misconceptions of data engineering is that the hardest part of the job is building out a pipeline. I would argue one of the most difficult parts of the job is simply finding where your data lives.

In this case I discovered the aptly-named “payouts” end point wading through the dense Stripe API documentation while working on a separate work project.

Payout object in Stripe. Screenshot by the author.

If you’re in the Medium Partner Program and your stories earn money, your earnings are transferred from Medium to Stripe and then Stripe to your bank account. Since the Partner Program only pays out once monthly, in this case, the payouts end point contains one row per month.

You can easily see this data in your dashboard, which allows you to shift the date range and other variables to fit your individual analysis and use case.

Stripe dashboard overview. Screenshot by the author.

To access the back end of this dashboard, as you might have guessed, you need an API key.

Authentication

Stripe’s reporting APIs are pretty conventional, so this API key is a bearer token.

If you see a key that begins with “Bearer” and then a string of alphanumeric characters, you’re looking at the correct value.

Be careful though because in your developer portal Stripe will provide both a test key and production key. If you use a test key and you haven’t configured test values, you’ll see nothing.

Stripe developers portal. Screenshot by the author.

The payouts end point (v1) is especially easy to interact with because it doesn’t accept or require additional parameters. Just make a call to this URL and you’re set: https://api.stripe.com/v1/payouts.

Let’s test the connection to make sure the configuration is correct.

def make_request(url, token):
    
    logging.info("Making request to Stripe 'payout' end point...")

    req = requests.get(url, headers=token)
    
    logging.info(f"Response: {req.status_code}")

make_request(cfg.base_url, token)
200 response. Screenshot by the author.

Pardon the interruption: For more Python, SQL and cloud computing walkthroughs, follow Pipeline: Your Data Engineering Resource.

To receive my latest writing, you can follow me as well.

Getting Your Earnings Data

Now if we make a real call, we should see a few months’ worth of payments data in JSON form.

def make_request(url, token):
    
    logging.info("Making request to Stripe 'payout' end point...")

    req = requests.get(url, headers=token)
    
    logging.info(f"Response: {req.status_code}")
    
    if req.status_code == 200:
        data = req.json()['data']
        
        return data
    else:
        logging.info(f"Response is {req.status_code}.")
        
        return 0

make_request(cfg.base_url, token)
Stripe payout object response. Screenshot by the author.

To confirm that we’re getting production data we can check the field “livemode.”

If livemode is equivalent to true then we’re getting live (production) data.

Livemode equals true. Screenshot by the author.

The good news is that the JSON returned is unnested, which means we don’t need to do any additional parsing.

We just need to access the “data” key. Since this isn’t nested JSON, we can pass the “data” payload directly to Pandas’ pd.DataFrame() to create a quick data frame.

def make_request(url, token):
    
    logging.info("Making request to Stripe 'payout' end point...")

    req = requests.get(url, headers=token)
    
    logging.info(f"Response: {req.status_code}")
    
    if req.status_code == 200:
        data = req.json()['data']
        
        return data
    else:
        logging.info(f"Response is {req.status_code}.")
        
        return 0
    
def format_df(data):
    
    df = pd.DataFrame(data)

    return df

def main():

  data = make_request(cfg.base_url, token)
  
  df = pd.DataFrame(data)

  return df

main()

Now we can see a cleaner, nearly-finalized output of what we seek: Updated payment data.

Data frame. Screenshot by the author.

However, if you’re going to use this in a dashboard, you will want to clean up the respective fields by doing some data frame manipulation.

Cleaning Up Your Earnings Data Frame

For me, the biggest culprits are the amount and arrival_date fields.

To clarify, arrival_date is the date that the funds arrived in Stripe from the payment source. Amount, obviously, is the amount.

Note: For my privacy I’ll be excluding outputs that include amount.

However, your amount value might be inflated because it’s not correctly converted to a float value.

df['amount'] = df['amount'].astype(float) / 100

Similarly, your arrival_date field might be ingested as a rather messy unix value that you’ll also want to convert.

df['arrival_date'] = pd.to_datetime(df['arrival_date'], unit='s')
df['arrival_date'] = df['arrival_date'].dt.strftime("%Y-%m-%d")

After applying these tweaks, we can see a much neater data frame emerge.

Arrival date field. Screenshot by the author.

But if you’ve had funds deposited into your Stripe account for a longer period of time (say, over a year), you’ll notice that the API doesn’t return every value.

Stripe APIs include pagination, which mean that you will only retrieve the first page of results.

If I were building out a pipeline that needed to ingest historic data on a recurring basis, I’d care about this. But since, going forward, I just want to ingest the prior month’s earnings, I can cheat a bit when it comes to getting the rest of the values.

If you look in the Stripe dashboard you can see a parameter for a custom date range.

Stripe UI, overview. Screenshot by the author.

If you tweak this you can get all the data for the entire lifespan of your account.

Earnings line graph. Screenshot by the author.

Unless you’ve had a Stripe integration for years, entering these values manually shouldn’t be too much of a chore. Again, since this is a backfill, you only need to do this once.

As a final check, I highly recommend taking a sum of your amount column and summing the values in your Stripe dashboard.

If they match, congratulations, you did it!

Taking The Next Steps…

You’ve completed the leg work of identifying your data source (Stripe’s payouts end point), connecting to the API, conducting a quality assurance check (QA) against your results and, finally, backfilling your missing values.

But the fun is just beginning, because you don’t want to have to manually make this call every month to get your sweet, sweet earnings.

You want this to run in the background and maybe even get an email report each time your gross earnings are updated. Wouldn’t that be nice?

Click for Part II.

Create a job-worthy data portfolio. Learn how with my free project guide.

Data Engineering
Medium
Data Science
Python
Stripe
Recommended from ReadMedium