How to Break Down Blockchain and Show the Value of Data Lineage
Bitcoin, blockchain and data lineage.

Tracking data can be a pretty tricky challenge. After all, data flies in through the door and once in the building it moves around. It goes from system to system and then back out the door again to clients, regulators and suppliers. Data lineage is the tracking of that information as it goes on its myriad journeys, transformations and adventures.
It’s valuable because when we look at an item of data, we want to know where it came from, how accurate it is and, ultimately, whether it can be relied on.
So we have a picture of data lineage and see value in it, but it takes elbow grease to get there. Before we make any firm commitments, we really need a sense of just how valuable data lineage can be.
Now what better way to demonstrate value than by looking for a rock star? If it is possible for data lineage to have one then this is it. Cue a baseline for the rock star of data lineage — blockchain — the data innovation behind Bitcoin.
Bitcoin Basics
These days pretty much everyone has heard of Bitcoin. We read about it in our morning news and we follow its price rollercoaster down into the dips and up over the peaks.
But what exactly is Bitcoin?
Bitcoin is a peer-to-peer digital currency that anyone in the world can hold. It’s open source ( → here’s the code) and it’s decentralised. Everyone on the Bitcoin network helps to validate the transactions of everyone else. That’s one of the main reasons why people are excited about it. There is no central authority to go through, no single point of failure or control.
In order to keep track of the movements of Bitcoin, the complete history of all bitcoins travelling from one wallet to the next is on a public database — the Bitcoin blockchain.
The blockchain tracks the lineage of every transaction of Bitcoin — all the way from where the first bitcoin was born to where it is now. The complete data lineage.
Breaking Down Blockchain
To say there is a lot of excitement around blockchain is an understatement. In November 2021 Jack Dorsey left Twitter. In December 2021, he renamed his payments giant from Square to Block and became its full-time CEO. The message was clear — Jack (and Block) was now 100% focused on blockchain.
Earlier this year Apple co-founder, Steve Wozniak, described the technology behind Bitcoin as a mathematical miracle.
Meanwhile, we’re told that the world is upgrading… to Web 3.0. Facebook is now Meta. And now Meta is building the Metaverse… but they’re not the only ones. Believers in a decentralised Metaverse are building too, and they’re building theirs on a blockchain.
So, all of this suggests that the (digital) future may well be built on a blockchain.
But what exactly is a blockchain?

Incredibly, the basic idea is simple.
A blockchain has blocks like this:
[ ]
With data in them:
[ data ]
And a way to chain each block together, called a hash:
[ prior block hash, data, block hash ]
A hash is produced by a cryptographic algorithm that turns any input into a fixed-size value. The algorithm is one-way, meaning the original input cannot be recovered from the hash value. And the way the algorithm works means that two different inputs will practically never lead to the same hash value. That makes the hash the key ingredient of a blockchain.
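To make that concrete, here is a minimal sketch using SHA-256 from Python’s standard hashlib module. SHA-256 is the hash function Bitcoin builds on, but the helper name sha256_hex and the sample messages are made up for this illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Two inputs that differ by a single character (hypothetical sample data)
a = sha256_hex("Alice pays Bob 1 BTC")
b = sha256_hex("Alice pays Bob 2 BTC")

print(a)
print(b)
print("same hash?", a == b)
```

Both values have the same length whatever the size of the input, and nothing in them hints at how similar the two inputs were.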
Each block gets a hash. Each block also stores the hash of the prior block. The hash for each block is derived from the other data in the block, then the hash itself is added to the block.
So we can think of the calculation for a block hash as:
Block hash = hash* ( [ prior block hash, data ] )
*Where hash() is the algorithm doing the hashing for us
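That calculation can be sketched in a few lines of Python. This is a simplified illustration, not Bitcoin’s actual format: the dictionary layout, the JSON serialisation and the function names block_hash and make_block are all assumptions made for the example. Real Bitcoin hashes a compact binary block header instead.

```python
import hashlib
import json


def block_hash(prior_block_hash: str, data: str) -> str:
    """Block hash = hash([prior block hash, data])."""
    # Serialise the two fields in a fixed order (JSON is an assumption here)
    payload = json.dumps([prior_block_hash, data]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_block(prior_block_hash: str, data: str) -> dict:
    """Build a block, then add its own hash to it."""
    return {
        "prior_block_hash": prior_block_hash,
        "data": data,
        "block_hash": block_hash(prior_block_hash, data),
    }


# The first block has no predecessor, so this sketch uses a placeholder of zeros
genesis = make_block("0" * 64, "first block")
second = make_block(genesis["block_hash"], "Alice pays Bob 1 BTC")

print(genesis["block_hash"])
print(second["prior_block_hash"])  # equals the line above: the chain link
```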
The thing to remember here is that a unique code (the hash) is being created for each block that everyone on the network can use to verify blocks as they receive them.
So if you send me a block then I can verify it is valid. I do this by calculating the hash of the block myself and checking that it is equal to the hash in the block. Because each block’s hash covers the prior block’s hash, which in turn covers the one before it, a match also tells me the block is consistent with the entire history of all prior blocks.
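Here is a rough sketch of that check in Python. It is self-contained, and the block layout and the names block_hash and is_valid_block are assumptions for illustration only, not how a real Bitcoin node is written.

```python
import hashlib
import json


def block_hash(prior_block_hash: str, data: str) -> str:
    """Hash the prior block's hash together with this block's data."""
    payload = json.dumps([prior_block_hash, data]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def is_valid_block(block: dict, prior_block: dict) -> bool:
    """Recompute the hash ourselves and compare it with what we were sent."""
    links_to_prior = block["prior_block_hash"] == prior_block["block_hash"]
    recomputed = block_hash(block["prior_block_hash"], block["data"])
    return links_to_prior and recomputed == block["block_hash"]


# A block I already trust (placeholder prior hash of zeros)
genesis = {
    "prior_block_hash": "0" * 64,
    "data": "first block",
    "block_hash": block_hash("0" * 64, "first block"),
}

# A block you send me that follows the rules
honest = {
    "prior_block_hash": genesis["block_hash"],
    "data": "Alice pays Bob 1 BTC",
    "block_hash": block_hash(genesis["block_hash"], "Alice pays Bob 1 BTC"),
}

# The same block with the amount edited but the old hash left in place
forged = dict(honest, data="Alice pays Bob 100 BTC")

print(is_valid_block(honest, genesis))
print(is_valid_block(forged, genesis))
```

The honest block passes and the forged one fails, because the edited data no longer produces the hash stored in the block.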
But the devil is in the details and a blockchain needs to be completely secure. Even with hashing, a decentralised blockchain remains vulnerable to an attack.
For Bitcoin that means facing the risk of a malicious actor submitting seemingly valid blocks in an attempt to double-spend their coin while the rest of the network sweats to reach consensus on the blocks.
To protect Bitcoin against this type of attack, a condition is added to the hash creation. An arbitrary number, known as a nonce, is added to every block, and that number must be set so that the hash generated is within a target range. Because a hash can’t be predicted from its input, the only way to find such a number is to try one value after another, which is what makes the process so energy-intensive. The process is called proof-of-work and finding a number that solves this condition is called Bitcoin mining.
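A toy version of mining can be sketched as follows. The target here is simplified to "the hash must start with a given number of zero hex digits", whereas Bitcoin compares the hash against a numeric target and applies SHA-256 twice to a binary block header. The function names, the "|" separator and the difficulty parameter are assumptions for this example.

```python
import hashlib
from typing import Tuple


def hash_with_nonce(prior_block_hash: str, data: str, nonce: int) -> str:
    """Hash the block contents together with a candidate nonce."""
    payload = f"{prior_block_hash}|{data}|{nonce}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine(prior_block_hash: str, data: str, difficulty: int = 4) -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hash_with_nonce(prior_block_hash, data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("0" * 64, "Alice pays Bob 1 BTC", difficulty=4)
print("nonce found:", nonce)
print("block hash: ", digest)

# Checking the work takes a single hash, however long the search took
print(hash_with_nonce("0" * 64, "Alice pays Bob 1 BTC", nonce) == digest)
```

Each extra zero demanded in this sketch makes the search roughly 16 times longer on average, while verification stays a single hash. That asymmetry is what makes it costly to submit bogus blocks.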
You can read Satoshi Nakamoto’s 2008 Bitcoin Whitepaper → here
You can read about the Bitcoin data structure and hashing procedure → here
You can check out the Bitcoin genesis (first) block → here

What This Teaches Us About Data Lineage
Let’s step back from the detail of a blockchain and look at its bigger picture again. What was the benefit of it? What is different about using a blockchain? The answer I like most to both of those questions is that on a blockchain —
The data lineage is wrapped up with the data
There is no guesswork needed to understand the route that the data took to get to where it is today. Everyone can see it and there’s no need for a central authority.
Because every transaction on a blockchain has to follow a strict protocol in order to occur at all, we know that every data point on it has passed the same validation. The process is cast-iron. On top of that, the data is immutable. That simply means that you can’t erase or change the history. The protocol in place doesn’t allow it.
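To see why history is so hard to change, here is a small self-contained sketch of a three-block chain where someone edits the first block. The chain layout, the sample transactions and the names build_chain and first_broken_block are assumptions for illustration, and proof-of-work is left out to keep it short.

```python
import hashlib
import json


def block_hash(prior_block_hash, data):
    """Hash the prior block's hash together with this block's data."""
    payload = json.dumps([prior_block_hash, data]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(items):
    """Chain the items together, each block storing the prior block's hash."""
    chain, prior = [], "0" * 64
    for data in items:
        current = block_hash(prior, data)
        chain.append({"prior_block_hash": prior, "data": data, "block_hash": current})
        prior = current
    return chain


def first_broken_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prior = "0" * 64
    for i, block in enumerate(chain):
        if block["prior_block_hash"] != prior:
            return i
        if block["block_hash"] != block_hash(block["prior_block_hash"], block["data"]):
            return i
        prior = block["block_hash"]
    return None


chain = build_chain(["Alice pays Bob 1", "Bob pays Carol 1", "Carol pays Dave 1"])
untouched = first_broken_block(chain)

# Rewrite history in block 0, and even recompute that block's own hash
chain[0]["data"] = "Alice pays Mallory 1"
chain[0]["block_hash"] = block_hash(chain[0]["prior_block_hash"], chain[0]["data"])
broken_at = first_broken_block(chain)

print(untouched)  # None: the original chain verifies
print(broken_at)  # 1: the next block no longer links to the edited one
```

To hide the edit, every later block would have to be rebuilt as well, and on Bitcoin that means redoing the proof-of-work for each of them while the rest of the network keeps extending the original chain.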
Summary
Today Bitcoin and other blockchain-based cryptocurrencies have a combined market cap of over $1 trillion. This makes blockchain a perfect example of the value and promise of data lineage.
By breaking down a blockchain, we have seen that it comes with strict constraints. To be valuable, everyone has to follow the exact same well-defined process to write data to it. No-one can change the data once it’s there.
Compare how a blockchain works to business processes within an enterprise, which are often flexible by necessity. Data can be overwritten when there is a need and data lineage between internal and external services can be a tricky challenge to track.
Think for a second about all data transfers — within companies, across companies, to and from governments, open sources, and individuals.
Perhaps now, the next conversation may be about what else is possible.
I hope you found this post helpful!
If you’re interested in deeper dives on this topic, then I’ve also written about How to Code a Blockchain in 6 Steps and What Data Management Actually Means




