Managing multiple databases migrations with Alembic

Coming from a background in ORM and python, I loved working with SQLAlchemy as it makes our life so much easier that we don’t have to overthink whether we are using PostgresSQL, MySQL or any other database engine. Alembic just adds more spice to our tech stack when it comes to a migrate your database from one state to another without writing the raw queries. So basically my world was like a fair “PythonyTale”, I meant fairytale.

Everything was going smooth like a JSON object. But then we thought to convert our project from a monolithic app to micro-services.

Now comes the twist in architecture. Generally, if you want to convert a project into a microservice then the first thing you have to think for the candidates that can move out and acts as microservices. Microservices will have their own resources, backing services, environment basically everything. So we thought first it would be easy to separate out databases and then move out the resource layer from the project.

So we have to separate our database into smaller independent DBS and as we were following rolling updates so we shipped different DBS to production and in the next release the plan was to ship the other stuff. So in the meantime, we have to support different DBS in a single project.

The easiest way is to use different alembic.ini and env.py for different databases. But that will lead to a lot of code replication and maintainability of code. So we came up with a solution

So let’s get started

Installation

As easy as being an influencer.

pip install alembic
pip install pymysql

Alembic Setup

Alembic comes with a different types of templates for different use cases. To check that simply type

alembic list_templates

So bydefault alembic has async, multidb and generic template. We are not using multidb template because we are not using multidb the functionality of SQLalchemy. The multidb template in alembic is interesting, but it tracks all of your migrations in a single alembic_version table. (We found this to be undesirable because if we wanted to downgrade a database, we only wanted those changes to be isolated to that particular database–this doesn’t seem possible with a single alembic_version table for multiple databases.

So in our use case, we will be going to use the default template that is generic.

To initialize migration with a default template.

alembic init migrations

Here migrations are the migration directory that alembic will be going to create.

It will create alembic.ini (on the root level). migrations directory on the root level. Inside migrations, there is a versions directory that will have all the migrations files and on the same level, there is env.py.

So for our use case, we are going to play with two files alembic.ini and env.py

This is how the default alembic.ini looks like

# A generic, single database configuration.

[alembic]
# path to migration scripts
script_location = migrations

# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
# Uncomment the line below if you want the files to be prepended with date and time
# see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
# for all available tokens
# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s

# sys.path path, will be prepended to sys.path if present.
# defaults to the current working directory.
prepend_sys_path = .

# timezone to use when rendering the date within the migration file
# as well as the filename.
# If specified, requires the python-dateutil library that can be
# installed by adding `alembic[tz]` to the pip requirements
# string value is passed to dateutil.tz.gettz()
# leave blank for localtime
# timezone =

# max length of characters to apply to the
# "slug" field
# truncate_slug_length = 40

# set to 'true' to run the environment during
# the 'revision' command, regardless of autogenerate
# revision_environment = false

# set to 'true' to allow .pyc and .pyo files without
# a source .py file to be detected as revisions in the
# versions/ directory
# sourceless = false

# version location specification; This defaults
# to migrations/versions.  When using multiple version
# directories, initial revisions must be specified with --version-path.
# The path separator used here should be the separator specified by "version_path_separator" below.
# version_locations = %(here)s/bar:%(here)s/bat:migrations/versions

# version path separator; As mentioned above, this is the character used to split
# version_locations. The default within new alembic.ini files is "os", which uses os.pathsep.
# If this key is omitted entirely, it falls back to the legacy behavior of splitting on spaces and/or commas.
# Valid values for version_path_separator are:
#
# version_path_separator = :
# version_path_separator = ;
# version_path_separator = space
version_path_separator = os  # Use os.pathsep. Default configuration used for new projects.

# the output encoding used when revision files
# are written from script.py.mako
# output_encoding = utf-8

sqlalchemy.url = driver://user:pass@localhost/dbname


[post_write_hooks]
# post_write_hooks defines scripts or Python functions that are run
# on newly generated revision scripts.  See the documentation for further
# detail and examples

# format using "black" - use the console_scripts runner, against the "black" entrypoint
# hooks = black
# black.type = console_scripts
# black.entrypoint = black
# black.options = -l 79 REVISION_SCRIPT_FILENAME

# Logging configuration
[loggers]
keys = root,sqlalchemy,alembic

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = WARN
handlers = console
qualname =

[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine

[logger_alembic]
level = INFO
handlers =
qualname = alembic

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S

Make sure you create a backup before editing this file

Alembic.ini

Alembic placed a file alembic.ini into the current directory. This is a file that the alembic script looks for when invoked.

So first create three databases in MySQL

Now I have three databases. Let’s our databases in the default alembic section of the ini file

# A generic, single database configuration.

[alembic]
# path to migration scripts
databases=first, second, third

Let’s add three databases in the alembic.ini as the three sections now so that we can use them while invoking script and also add some additional property in it.

[first]
sqlalchemy.url = driver://user:pass@localhost/dbname
version_locations = ./migrations/first

[second]
sqlalchemy.url = driver://user:pass@localhost/dbname
version_locations = ./migrations/second

[third]
sqlalchemy.url = driver://user:pass@localhost/dbname
version_locations = ./migrations/third

# version_locations is the key here to segregate migrations file for different databases

For all the databases settings I have put them in default section

[default]
script_location = migrations
# template used to generate migration files
file_template = %%(year)d-%%(month).2d-%%(day).2d-%%(hour).2d-%%(minute).2d-%%(second).2d_%%(rev)s_%%(slug)s
truncate_slug_length = 60

I have also added file_template that is to give your auto-generated file name a good structure and also to truncate it if the length is more than 60

Final alembic.ini looks like this

# A generic, single database configuration.

[alembic]
# path to migration scripts
databases = first, second, third


[DEFAULT]
# template used to generate migration files
file_template = %%(year)d-%%(month).2d-%%(day).2d-%%(hour).2d-%%(minute).2d-%%(second).2d_%%(rev)s_%%(slug)s
truncate_slug_length = 60

# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
# Uncomment the line below if you want the files to be prepended with date and time
# see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
# for all available tokens
# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s

# sys.path path, will be prepended to sys.path if present.
# defaults to the current working directory.
prepend_sys_path = .
script_location = ./migrations/
# timezone to use when rendering the date within the migration file
# as well as the filename.
# If specified, requires the python-dateutil library that can be
# installed by adding `alembic[tz]` to the pip requirements
# string value is passed to dateutil.tz.gettz()
# leave blank for localtime
# timezone =

# max length of characters to apply to the
# "slug" field


# set to 'true' to run the environment during
# the 'revision' command, regardless of autogenerate
# revision_environment = false

# set to 'true' to allow .pyc and .pyo files without
# a source .py file to be detected as revisions in the
# versions/ directory
# sourceless = false

# version location specification; This defaults
# to migrations/versions.  When using multiple version
# directories, initial revisions must be specified with --version-path.
# The path separator used here should be the separator specified by "version_path_separator" below.
# version_locations = %(here)s/bar:%(here)s/bat:migrations/versions

# version path separator; As mentioned above, this is the character used to split
# version_locations. The default within new alembic.ini files is "os", which uses os.pathsep.
# If this key is omitted entirely, it falls back to the legacy behavior of splitting on spaces and/or commas.
# Valid values for version_path_separator are:
#
# version_path_separator = :
# version_path_separator = ;
# version_path_separator = space
version_path_separator = os

# the output encoding used when revision files
# are written from script.py.mako
# output_encoding = utf-8

sqlalchemy.url = driver://user:pass@localhost/dbname

[first]
sqlalchemy.url = driver://user:pass@localhost/dbname
version_locations = ./migrations/first

[second]
sqlalchemy.url = driver://user:pass@localhost/dbname
version_locations = ./migrations/second

[third]
sqlalchemy.url = driver://user:pass@localhost/dbname
version_locations = ./migrations/third
# version_locations is the key here to segregate migrations file for different databases

[post_write_hooks]
# post_write_hooks defines scripts or Python functions that are run
# on newly generated revision scripts.  See the documentation for further
# detail and examples

# format using "black" - use the console_scripts runner, against the "black" entrypoint
# hooks = black
# black.type = console_scripts
# black.entrypoint = black
# black.options = -l 79 REVISION_SCRIPT_FILENAME

# Logging configuration
[loggers]
keys = root,sqlalchemy,alembic

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = WARN
handlers = console
qualname =

[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine

[logger_alembic]
level = INFO
handlers =
qualname = alembic

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S

env.py

Now let's update the env.py file, for this blog we are mainly going to edit the run_migrations_online function. So let’s go one by one.

Let's if you remember that in the alembic.ini file we left the sqlalchemy.url section to a placeholder URL only. So here in this file, we are going to set the URL. Add the below lines in env.py

db_name = config.config_ini_section # active config ini section is the db name that we have chosen

config.set_main_option(
    "sqlalchemy.url",
    f'{os.environ["SQLALCHEMY_DATABASE_URI"]}/{db_name}'
)

Change the run_migrations_online

def run_migrations_online() -> None:
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.

    """
    connectable = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )

    def include_object(object, name, type_, reflected, compare_to):
        if type_ == 'foreign_key_constraint' and compare_to and (
                compare_to.elements[0].target_fullname == db_name + '.' +
                object.elements[0].target_fullname or
                db_name + '.' + compare_to.elements[0].target_fullname == object.elements[
                    0].target_fullname):
            return False
        if type_ == 'table':
            if object.schema == db_name or object.schema is None:
                return True
        elif object.table.schema == db_name or object.table.schema is None:
            return True
        else:
            return False

    with connectable.connect() as connection:
        context.configure(
            connection=connection, target_metadata=target_metadata, include_object=include_object
        )

        with context.begin_transaction():
            context.run_migrations()

Now let’s create some SQLalchemy models for test tables for all three different databases.

from sqlalchemy import Text, Column, Integer
from sqlalchemy.ext.declarative import declarative_base


Base = declarative_base()


class FirstTest(Base):
    __table_args__ = ({'schema': 'first'})
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True)
    name = Column(Text, nullable=True)


class SecondTest(Base):
    __table_args__ = ({'schema': 'second'})
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True)
    name = Column(Text, nullable=True)


class ThirdTest(Base):
    __table_args__ = ({'schema': 'third'})
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True)
    name = Column(Text, nullable=True)

If we want to autogenerate the migrations file we want to include them Base to alembic configuration in env.py

# add your model's MetaData object here
# for 'autogenerate' support
# from myapp import mymodel
# target_metadata = mymodel.Base.metadata

from models import Base
target_metadata = Base.metadata

Now let’s test our configurations

RUN

Export database URL

export SQLALCHEMY_DATABASE_URI=mysql+pymysql://root:[email protected]:3306

To test the configuration just check the current version of alembic

# Use -n argument to specify the configuration and which db to choose

alembic -n first current
alembic -n second current
alembic -n third current

Now let’s generate our first autogenerated migration

alembic -n first revision --autogenerate -m "First test created"

It will create a migration file inside migrations/first directory

Then run same command for second and third database

alembic -n second revision --autogenerate -m "Second test created"
alembic -n third revision --autogenerate -m "Third test created"

So the final directory structure will look like this

Every migration has its own versions separated from each other. All the databases have their own alembic_version table in them to track versions.

The migrations file looks like this

Conclusion

You need to change the env.py according to your use case especially the include_object function object. If you feel any trouble with the code please checkout the git project of this blog at this link

GitHub - anandtripathi5/alembic-multidatabase

docker-compose -f docker-compose-mysql.yml up -d alembic init migrations alembic list_templates alembic -n first…

github.com

If you liked this story, you might also like a Medium membership. It’s only $5 a month (a price of a cup of coffee!) but it will give you unlimited access to stories while supporting your favourite writers. If you sign up using this link, I’ll earn a small commission. Thanks!

Join Medium with my referral link - Anand Tripathi

Read every story from Anand Tripathi (and thousands of other writers on Medium). Your membership fee directly supports…

anandtripathi5.medium.com

For more content like this follow our publication Pythonistas and become a Pythonista

Pythonistas

Home to every Pythonistas out there

medium.com