avatarJerry Ng

Free AI web copilot to create summaries, insights and extended knowledge, download it at here

5745

Abstract

ass="hljs-string">"F"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"job"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"CTO"</span> <span class="hljs-punctuation">}</span> <span class="hljs-punctuation">]</span> <span class="hljs-punctuation">}</span> <span class="hljs-punctuation">}</span></pre></div><ul><li>Each of the user’s information is saved as an individual column in the database</li><li>Whenever we need to add a new piece of information, i.e. <code>phone_no</code>, <b>we will need to run database migration</b>, add a new field to the employees' query, etc.</li><li><b>Front-end has the</b> <b>flexibility</b> to always get exactly what they need</li></ul><h2 id="e847">2. Approach “Schema B”</h2><div id="bc7d"><pre><span class="hljs-comment"># Schema B GraphQL Query</span> query { employees { <span class="hljs-built_in">date</span> metadata } }</pre></div><div id="fdd3"><pre><span class="hljs-comment">// JSON Response</span> <span class="hljs-punctuation">{</span> <span class="hljs-attr">"data"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span> <span class="hljs-attr">"employees"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">[</span> <span class="hljs-punctuation">{</span> <span class="hljs-attr">"date"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"2020-09-25 00:00:00+08:00"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"metadata"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span> <span class="hljs-attr">"name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Jack"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"age"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">30</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"salary"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">3000</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"sex"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"M"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"job"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Software Engineer"</span> <span class="hljs-punctuation">}</span> <span class="hljs-punctuation">}</span><span class="hljs-punctuation">,</span> <span class="hljs-punctuation">{</span> <span class="hljs-attr">"date"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"2020-09-26 00:00:00+08:00"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"metadata"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span> <span class="hljs-attr">"name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Jimmy"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"age"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">32</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"salary"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">4000</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"sex"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"M"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"job"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Senior Software Engineer"</span> <span class="hljs-punctuation">}</span> <span class="hljs-punctuation">}</span><span class="hljs-punctuation">,</span> <span class="hljs-punctuation">{</span> <span class="hljs-attr">"date"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"2020-09-27 00:00:00+08:00"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"metadata"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span> <span class="hljs-attr">"name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Jessie"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"age"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">38</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"salary"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">10000</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"sex"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"F"</span><span class="hljs-punctuation">,</span> <span class="hljs-attr">"job"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"CTO"</span> <span class="hljs-punctuation">}</span> <span class="hljs-punctuation">}</span> <span class="hljs-punctuation">]</span> <span class="hljs-punctuation">}</span> <span class="hljs-punctuation">}</span></pre></div><ul><li>In this approach, we save all the employee information as a single JSON data type column in our database, called <code>metadata</code></li><li>This solution can also be known as schemaless data where employee data does not conform to a rigid schema</li><li><b>No migrations</b> needed if we were to add new pieces of information about the employee</li><li><b>JSON is extremely flexible</b> where it allows us to update or add new information very quickly</li><li>Whenever we add or update a new key-value pair to the <code>metadata</code> field, <b>mobile developer does not require to push a new build</b> to re

Options

ceive that new piece of information (this is great because unlike web applications, not every user tend to update their mobile applications)</li></ul><h1 id="c388">Comparing “Schema A” vs “Schema B”</h1><p id="aa66">In our current scenario when designing <i>page A</i>, <i>Schema B</i> seems like a possible solution to go for, we get everything we need (<code>date</code>, <code>name</code>, <code>job</code> , <code>age</code>, <code>salary</code>, <code>sex</code>) by querying the <code>metadata</code> field alone. However…</p><h2 id="4a21">Over-fetching</h2><p id="4278">In <b>Schema B, </b>lumping all the information together as a single <code>metadata</code> column will result in having to retrieve a lot of unnecessary information when accessing single information. Here’s an example:</p><p id="887b" type="7">Three weeks down the road… We need page B</p><ul><li><i>Page B </i>needs <code>name</code> and <code>age</code>, <code>job</code> fields <b>only</b></li><li>Sure we could still use <i>schema B</i>, but we are fetching <b>ALL</b> the information because the <code>metadata</code> does not provide us the flexibility to choose — <i>Over-fetch </i>😕</li></ul><p id="e32d" type="7">Another five weeks later… We need page C</p><ul><li><i>Page C</i> needs <code>salary</code> and <code>name</code> only — <i>Over-fetch </i>😟</li></ul><p id="da70" type="7">A year down the road… Yikes! We need page D with different sets of data️</p><ul><li>Frontend loses the ability to <b>only ask for what they want</b> where we end up over-fetching ☹️</li><li>Over-fetching is bad because it slows down our application as it scales</li></ul><div id="cb27"><pre><span class="hljs-comment"># Schema B GraphQL Query</span> query { employees { <span class="hljs-built_in">date</span> metadata <span class="hljs-comment"># What happens if you want to omit age? You can't -- Over-fetch</span> } }</pre></div><div id="14dd"><pre><span class="hljs-comment"># Schema A GraphQL Query</span> query { employees { <span class="hljs-keyword">date</span> name job salary sex age <span class="hljs-comment"># If you do not need to get the age information, you can just omit this</span> } }</pre></div><ul><li>Sure, one could argue that whenever we have a new <code>rating</code> field, the front-end does not need to do anything because <code>metadata</code> field can be updated behind the scene to include <code>rating</code> data inside it</li><li><i>But how often are we not going to update the frontend code whenever new data comes in? </i>💭</li></ul><h2 id="6511">Development Pitfalls</h2><ul><li>Another temptation of using JSON is to avoid migration schema — however, if a team is not <b>disciplined</b> enough, it is very easy for the team to stick to yet another key/value pair into a JSON field</li><li>Since there is no migration for it, years down the road, no one will remember what it was for and there is <b>no validation</b> to it</li><li>One day, this flexibility will come at a cost and it will become a real pain point as it will become harder and <b>harder to change</b> because you might have already built so many things on top of this design decision</li><li>Over time, adding new features, having the data in JSON field will lead to more <b>complicated looking queries</b> as compared to using traditional columns</li></ul><h2 id="485b">So, when to use JSON Columns?</h2><ul><li>If you are dependent on data whose schema is <b>not fixed</b></li><li>Storing configuration data that has <b>no defined schema</b></li><li>If you are dependent on data coming from an external API as JSON, you can avoid the process of normalizing this data into different tables. Instead, you can store this data in the same format and structure that you received it in</li></ul><h1 id="b567">Final Thoughts</h1><ul><li>Like most things, it depends. There is no wrong/good or bad in and of itself to store data in columns or JSON. It boils down to what you need and understanding what the technical trade-offs are</li><li>Looking backward, JSON column’s flexibility has allowed us to get something out of the door and iterate fairly quickly</li><li>However, over a period of time as the team size grows, the JSON fields became junk drawers for little pieces of this and that where there are no consistency, validation, or integrity between information</li></ul><h2 id="64d0">Read my other articles at jerrynsh.com</h2><div id="c1e6" class="link-block"> <a href="https://betterprogramming.pub/how-to-write-clean-code-in-python-5d67746133f2"> <div> <div> <h2>How To Write Clean Code in Python</h2> <div><h3>3 tips for writing clean code with Python examples</h3></div> <div><p>betterprogramming.pub</p></div> </div> <div> <div style="background-image: url(https://miro.readmedium.com/v2/resize:fit:320/0*S7l8m1iFRMJve25V)"></div> </div> </div> </a> </div><div id="bfb7" class="link-block"> <a href="https://readmedium.com/fast-track-your-api-development-with-insomnia-rest-client-d02521c31b9d"> <div> <div> <h2>Fast Track Your API Development With Insomnia REST Client</h2> <div><h3>A Fast and Simple Alternative to Postman for Sending REST and GraphQL Requests</h3></div> <div><p>medium.com</p></div> </div> <div> <div style="background-image: url(https://miro.readmedium.com/v2/resize:fit:320/0*qk-YkW0_3prCwZsS.png)"></div> </div> </div> </a> </div></article></body>

Database — Individual Columns vs JSON Column

Read About Storing JSON in Database vs Having a New Column for Each Key in Just 5 Minutes (with GraphQL as Example)

Photo by Caspar Camille Rubin on Unsplash

When designing a new data model, there are many possible data models for storing information. We could choose to store information with a JSON blob or create a database table with columns for each information.

In this blog post, I am going to provide an example scenario with two different approaches, and then compare these two approaches in GraphQL terms.

I also wrote a short piece on how to boost your GraphQL query performance using Dataloader in Python:

Scenario

  • Imagine we are going to design an admin dashboard (page A), where we need to display our users’ information such as date, name, age, sex, job, salary etc.
  • This application has both mobile and web view
  • Using GraphQL API with relational database

Proposed Solutions

1. Approach “Schema A”

# GraphQL Query
query {
  employees {
    date
    name
    age
    salary
    sex
    job
  }
}
// JSON Response
{
 "data": {
  "employees": [
    {
    "date": "2020-09-25 00:00:00+08:00",
    "name": "Jack",
    "age": 30,
    "salary": 3000,
    "sex": "M",
    "job": "Software Engineer"
   },
   {
    "date": "2020-09-26 00:00:00+08:00",
    "name": "Jimmy",
    "age": 32,
    "salary": 4000,
    "sex": "M",
    "job": "Senior Software Engineer"
   },
   {
    "date": "2020-09-27 00:00:00+08:00",
    "name": "Jessie",
    "age": 38,
    "salary": 10000,
    "sex": "F",
    "job": "CTO"
   }
  ]
 }
}
  • Each of the user’s information is saved as an individual column in the database
  • Whenever we need to add a new piece of information, i.e. phone_no, we will need to run database migration, add a new field to the employees' query, etc.
  • Front-end has the flexibility to always get exactly what they need

2. Approach “Schema B”

# Schema B GraphQL Query
query {
  employees {
    date
    metadata
  }
}
// JSON Response
{
 "data": {
  "employees": [
    {
    "date": "2020-09-25 00:00:00+08:00",
    "metadata": {
     "name": "Jack",
     "age": 30,
     "salary": 3000,
     "sex": "M",
     "job": "Software Engineer"
    }
   },
   {
    "date": "2020-09-26 00:00:00+08:00",
    "metadata": {
     "name": "Jimmy",
     "age": 32,
     "salary": 4000,
     "sex": "M",
     "job": "Senior Software Engineer"
    }
   },
   {
    "date": "2020-09-27 00:00:00+08:00",
    "metadata": {
     "name": "Jessie",
     "age": 38,
     "salary": 10000,
     "sex": "F",
     "job": "CTO"
    }
   }
  ]
 }
}
  • In this approach, we save all the employee information as a single JSON data type column in our database, called metadata
  • This solution can also be known as schemaless data where employee data does not conform to a rigid schema
  • No migrations needed if we were to add new pieces of information about the employee
  • JSON is extremely flexible where it allows us to update or add new information very quickly
  • Whenever we add or update a new key-value pair to the metadata field, mobile developer does not require to push a new build to receive that new piece of information (this is great because unlike web applications, not every user tend to update their mobile applications)

Comparing “Schema A” vs “Schema B”

In our current scenario when designing page A, Schema B seems like a possible solution to go for, we get everything we need (date, name, job , age, salary, sex) by querying the metadata field alone. However…

Over-fetching

In Schema B, lumping all the information together as a single metadata column will result in having to retrieve a lot of unnecessary information when accessing single information. Here’s an example:

Three weeks down the road… We need page B

  • Page B needs name and age, job fields only
  • Sure we could still use schema B, but we are fetching ALL the information because the metadata does not provide us the flexibility to choose — Over-fetch 😕

Another five weeks later… We need page C

  • Page C needs salary and name only — Over-fetch 😟

A year down the road… Yikes! We need page D with different sets of data️

  • Frontend loses the ability to only ask for what they want where we end up over-fetching ☹️
  • Over-fetching is bad because it slows down our application as it scales
# Schema B GraphQL Query
query {
  employees {
    date
    metadata # What happens if you want to omit `age`? You can't -- Over-fetch
  }
}
# Schema A GraphQL Query
query {
  employees {
    date
    name
    job
    salary
    sex
    age # If you do not need to get the `age` information, you can just omit this
  }
}
  • Sure, one could argue that whenever we have a new rating field, the front-end does not need to do anything because metadata field can be updated behind the scene to include rating data inside it
  • But how often are we not going to update the frontend code whenever new data comes in? 💭

Development Pitfalls

  • Another temptation of using JSON is to avoid migration schema — however, if a team is not disciplined enough, it is very easy for the team to stick to yet another key/value pair into a JSON field
  • Since there is no migration for it, years down the road, no one will remember what it was for and there is no validation to it
  • One day, this flexibility will come at a cost and it will become a real pain point as it will become harder and harder to change because you might have already built so many things on top of this design decision
  • Over time, adding new features, having the data in JSON field will lead to more complicated looking queries as compared to using traditional columns

So, when to use JSON Columns?

  • If you are dependent on data whose schema is not fixed
  • Storing configuration data that has no defined schema
  • If you are dependent on data coming from an external API as JSON, you can avoid the process of normalizing this data into different tables. Instead, you can store this data in the same format and structure that you received it in

Final Thoughts

  • Like most things, it depends. There is no wrong/good or bad in and of itself to store data in columns or JSON. It boils down to what you need and understanding what the technical trade-offs are
  • Looking backward, JSON column’s flexibility has allowed us to get something out of the door and iterate fairly quickly
  • However, over a period of time as the team size grows, the JSON fields became junk drawers for little pieces of this and that where there are no consistency, validation, or integrity between information

Read my other articles at jerrynsh.com

Database
GraphQL
Json
Column
Data Model
Recommended from ReadMedium