Receiving over 19 billion ad requests every day, Smaato operates one of the world’s largest mobile advertising marketplaces. Our talented team of developers ensures the smooth operation of this marketplace — both our client-facing platforms (SPX and SDX) and the core technology “under the hood.” We're excited to provide a glimpse into the challenges and successes that our development team recently encountered: how to manage database changes in an efficient and effective way.
The Challenge When Updating Databases
Nearly all software applications use some kind of data storage: either relational or non-relational databases. This means that not only will data changes be required, but database refactoring as well. So, how can we manage database changes in an efficient way?
Let’s begin with an example. You have a shared database, you use a CI tool and you need to make a change in the database when implementing a new feature. Unfortunately, this change will break the tests for other branches, so you need to execute your changes, run your tests and return the database to its previous state. You will then need to repeat this complicated procedure whenever you want to test again.
Furthermore, if you choose to execute database updates manually in production mode, you will need to precisely configure the order in which your database changes should be executed — for example, by finding out if they are dependent or if they are touching the same tables.
One way or another, you will run into the need for a database migration tool.
What Are Database Migration Tools?
Database migration is the process of automating and managing database changes. There are many tools available to automate this process — one of the database migration tools that we are currently using at Smaato is Liquibase.
This database migration tool helps manage and control database changes by providing automated features for updating a database to a new state or for rolling back a database to a previous state. It can help manage system boot-up data and test data independently from the database your application will run against. This tool is also useful for achieving Continuous Delivery.
Liquibase is an open source database-independent library for tracking, managing and applying database schema changes. It offers many useful features, including the support of multiple databases, the ability to read changes in multiple formats (JSON, XML, SQL) and the ability to run from Maven or command line (e.g. Java JAR).
Most of the database migration tools share a common idea, which is to maintain a table for recording executed scripts and a change log directory that holds all of the changes that should be executed. When the tool runs, it compares these two directories and executes the newly-added scripts as visualized below:
How Liquibase Works
- In the Liquibase world, a change is called a changeSet.
- Each changeSet tag is uniquely identified by the combination of the “id” tag, the “author” tag and the changelog file classpath.
- Liquibase executes the databaseChangeLog, which is an XML file where changeSet files are listed. It reads the changeSets in the same order these changes are mentioned in the changelog.
- Liquibase checks the databaseChangeLog table to see if the combination of id/author/filepath has been run.
- Liquibase will insert a new row with the id/author/filepath along with an md5sum of the changeSet in the databaseChangeLog table.
- Liquibase checks if a file was executed or not by using its md5sum hash. Executed changes should not be touched, as this can mess up the schema version and tracking changes. To make a change in the database, always create a new changeSet. If an old executing file is changed, then the tool will fail to run.
- Liquibase offers a rollback feature. This gives you the control to roll back certain changes. For example:
- You can tag your current state of changes with a unique id (e.g. last released project version), execute your new changes and then call a rollback to the created tag.
- You can rollback to a specific date.
- You can rollback the last x number of executed changes.
How to Use Liquibase
To integrate Liquibase into your application, follow these steps:
- Add Liquibase Maven plugin in project POM file.
- Add liquibase.properties in project resources.
- Add Liquibase master (this is the main file used by Liquibase as the ChangeLog, which should point to your changeSets files).
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
- Add a changeSet. You can use pure SQL or Liquibase’s own DSL — we chose pure SQL.
UPDATE student_configuration SET propertyValue = 'someValue' WHERE propertyKey = 'someKey';
UPDATE student_configuration SET propertyValue = 'oldValue' WHERE propertyKey = 'someKey';
How to Integrate Liquibase Into Your Workflow
- Tag the database state — for example, the current project version
- Execute database changes
- Run tests
- Rollback database changes to return database to old state
Database Migrations and Continuous Delivery
In order to achieve Continuous Delivery, all deployment steps should be automated. One of these steps, of course, is to execute the database changes.
Let’s say we have a cluster of nodes to deploy our application, with some shared database server, and we want to add a non-nullable field into one of our tables. If we do that in one changeSet that inserts a non-nullable column, we will have the following problem:
- One node is taken out of the load balancer.
- Application is deployed to this node.
- Database migrations run against the shared database, so insert changeSet is executed.
- All live nodes fail to insert into this table because it uses the old code that doesn't know about the newly added field.
The solution is to ensure that every change is backward compatible with the rest of the system. This means that for a feature database change, we need to break the change down into multiple backward compatible steps, and for each step, we will need a separate release. Therefore, we need to run deployment multiple times in order for the feature to be completely released.Example:
- Add a nullable column to database — system keeps adding rows, nulls are fine, reads ignore the null.
- Code change to write correct value to new rows and handle reading unexpected nulls — database doesn’t change, now we have some null rows and some rows with data.
- Run data migration to fill the other null records with default values.
- Add the non-null constraint — the database now has no nulls, and your new code is writing the correct data.
- Remove the code that handles the null case.
By automating database migrations with Liquibase, we can subtract manual intervention (and human error) from the equation as much as possible. The implementation of this database migration tool also allows us to achieve Continuous Delivery, which is vital for increasing speed, improving reliance and boosting productivity.