We built the Civis Platform with Ruby on Rails. In the Ruby ecosystem, there are few dependencies that are more involved than this one. And while it may not be exciting to keep it updated, it is important. Bug fixes, security patches, and new features are some of the main reasons to do so. Plus, the Ruby on Rails maintenance policy only provides security patches for the two most recent versions. While the most recent upgrade turned out to be one of the most time-consuming upgrades we’ve done yet, we learned some important lessons along the way about mitigating risk and increasing efficiency.
The Upgrade Process
Step 1: Familiarize yourself with the software upgrade
Start by reading the release notes and changelog, looking in particular for breaking changes and deprecations. Before we did anything else, we read the release notes and tried to identify the changes without worrying about how to address them. We knew that would become clear later in the process. Don’t forget that a big piece of code like Rails has many dependencies that may also have changed significantly. For example, Rails is made up of component gems such as Railties and Active Record, each with its own changelog, which can complicate matters.
Step 2: Create tickets for everything you think you’ll need to fix
Use task management software, and enjoy the peace of mind that comes with it. First, file tickets based on what you’ve learned from the release notes. Second, create a branch off of master with the only difference being the upgrade, and upgrades to dependencies. At this point, your test suite will identify issues with the upgrade. Third, file tickets for all the issues—it helps to group similar tests and issues into one ticket.
When we did this with Rails, our test suite identified A LOT of issues, and more materialized as we got further into the process. We were glad to have a record of everything that we did. This tracking paid off months down the line when we wanted to know everything that went into performing the upgrade.
Step 3: Start fixing things, incrementally!
You may have the urge to fix every failed test you encounter in a single branch. Resist it. When we started doing this, we realized we were going to have a branch with an unmanageable number of changes. Merging a large branch with master and deploying it to production introduces a large risk to the stability of your application. What’s more, large branches make it harder to review code, find bugs, and avoid conflicts and incompatibilities. It’s best to keep the branches small.
We realized that many, if not most, of the changes needed for the upgrade were backwards-compatible, so we decided to create a single branch for each fix. This meant that the change had to work in both the old and the new versions. Here’s a step-by-step outline of what we did:
- Create a new branch from master for the actual version bump (we usually call ours “rails_[version]_bump”).
- Fix a single small issue on this “rails_[version]_bump” branch.
- Copy these small changes to a new branch from master that still runs on the old version of the software.
- Ensure the changes still work as intended on the old version of the software.
Example of how to introduce a backwards-compatible change
We’ll take you through our process from when we were upgrading from Rails 4.1 to 4.2 for context. During that upgrade, we noticed that in a couple of places in our code base, we used the `Hash#to_h` method (defined on standard Ruby `Hash` objects) on `ActionController::Parameters` objects to turn the objects into Ruby `Hash` objects. This worked because `ActionController::Parameters` is a descendant of the Ruby `Hash` class. For example, here is a simplified example controller that works in Rails 4.1:
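As a rough illustration of that 4.1-era behavior, here is a minimal plain-Ruby sketch that runs without Rails. `FakeParams41` is a hypothetical stand-in for `ActionController::Parameters` that only mirrors the fact that the real class descended from `Hash`, and the parameter names are made up:

```ruby
# Hypothetical stand-in for ActionController::Parameters on Rails 4.1.
# It only models the inheritance from Hash; it is not the Rails class.
class FakeParams41 < Hash
end

params = FakeParams41.new
params['name'] = 'report'
params['admin'] = 'true'

# Hash#to_h on a Hash subclass returns a plain Hash with every key intact.
plain = params.to_h

puts plain.class        # Hash
puts plain.keys.inspect # ["name", "admin"]
```

Because nothing overrides `to_h` here, every key survives the conversion, which is the behavior our code relied on.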
However, in Rails 4.2, developers introduced an `ActionController::Parameters#to_h` override directly on the `ActionController::Parameters` class. This caused problems because the new method returns a Hash but “with all unpermitted keys removed.” For example, this is what would happen on our example controller once on Rails 4.2:
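The change can be imitated in plain Ruby. This is only a sketch under stated assumptions: `FakeParams42` is a hypothetical stand-in, its `permit` is a simplified imitation of the Rails method, and its `to_h` returns an empty Hash for unpermitted parameters rather than reproducing the real implementation:

```ruby
# Hypothetical stand-in imitating the Rails 4.2 behavior described above:
# to_h only yields keys that were explicitly permitted.
class FakeParams42 < Hash
  # Simplified imitation of permit: copy the listed keys into a new,
  # permitted instance.
  def permit(*keys)
    permitted = self.class.new
    keys.each { |k| permitted[k.to_s] = self[k.to_s] if key?(k.to_s) }
    permitted.instance_variable_set(:@permitted, true)
    permitted
  end

  def permitted?
    @permitted == true
  end

  # Assumption: unpermitted parameters collapse to an empty Hash.
  def to_h
    permitted? ? super : {}
  end
end

params = FakeParams42.new
params['name'] = 'report'
params['admin'] = 'true'

puts params.to_h.empty?                      # true
puts params.permit(:name).to_h.keys.inspect  # ["name"]
```

The same `to_h` call that used to return everything now silently drops keys unless each one has been permitted first.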
This means we would need to explicitly call `permit` in order to get a Ruby `Hash` with the right key/value pairs we wanted. We realized that the number of parameters we would need to permit was large, and we saw the new permit functionality built into `ActionController::Parameters#to_h` as an unnecessary level of protection since we already validate that there are no unpermitted parameters in our endpoints.
Given these considerations, our refactor/fix was to do the following: Wherever we called `to_h` on `ActionController::Parameters` objects, we changed our code to explicitly call `to_hash`. On a plain `Hash`, `to_hash` simply returns self, but `ActionController::Parameters` inherits a `to_hash` from `ActiveSupport::HashWithIndifferentAccess` that returns a standard Ruby `Hash` with string keys, exactly as we intended with our original code. Going back to our example, here is our updated controller:
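A minimal plain-Ruby sketch of the idea, runnable without Rails. `StandInParams` and `widget_attributes` are hypothetical names; the stand-in assumes a simplified `to_h` that returns an empty Hash unless `permit!` was called, and a `to_hash` that copies every key into a plain Hash:

```ruby
# Hypothetical stand-in for ActionController::Parameters on Rails 4.2.
# Assumption: to_h is simplified to return {} unless permit! was called,
# and to_hash copies every key into a plain Hash.
class StandInParams < Hash
  def permit!
    @permitted = true
    self
  end

  def to_h
    @permitted ? to_hash : {}
  end

  def to_hash
    {}.merge(self)
  end
end

# Hypothetical helper standing in for the body of a controller action.
def widget_attributes(params)
  params.to_hash
end

params = StandInParams.new
params['name'] = 'report'
params['admin'] = 'true'

attrs = widget_attributes(params)
puts attrs.class        # Hash
puts attrs.keys.inspect # ["name", "admin"]
puts params.to_h.empty? # true, since nothing was permitted
```

Calling `to_hash` sidesteps the permit check entirely, so the helper keeps every key whether or not the parameters were permitted.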
The `to_hash` method behaves the same in both Rails 4.1 and 4.2, so this is a great example of a code change we can ease into the code base by first testing on 4.2, verifying it works, testing on 4.1, and then merging into master and deploying while still on 4.1.
Once the “fix_to_h_on_rails_4_2” branch was fixing all the tests we expected it to, we moved these changes into a new branch from master called “use_to_hash_over_to_h”.
Then, once we verified that everything still worked in the “use_to_hash_over_to_h” branch running on Rails 4.1, we merged this branch into master and deployed. If for some reason something went wrong in production, we would be able to isolate the problem to this very small subset of changes (changing `obj.to_h` to `obj.to_hash`) and roll back if necessary.
The last thing we did during this cycle was merge master back into our “rails_4_2_bump” branch so that it could absorb the new `obj.to_hash` changes!
This was the final version control tree:
This was one of the simplest changes we encountered. Often, these changes will be more involved and require more investigation and refactoring. We decided to upgrade third-party libraries with the same process, ensuring they work in both versions. As with our own source code changes, this allowed us to incrementally introduce new code and isolate any problems to the version upgrade of the particular gem.
This process works well when your changes can work in both a pre- and post-upgrade environment. For incompatible changes, there are a couple of different options. Remember that since the “rails_[version]_bump” is the final branch that will actually do the upgrade, you want to try to keep this branch as small as possible. If you keep all of your incompatible changes in the “rails_[version]_bump” branch and the branch is still relatively small (fewer than 200 lines), then it might be okay to leave this as-is and merge that branch directly once you are ready to upgrade.
If this branch becomes too big, you might consider writing some conditional code that can switch between the pre-upgrade code and the post-upgrade code. You can write logic that checks the version of Rails and calls the appropriate methods, or if applicable, you can use logic with Rails’s `#try` method to see if a method exists and works to toggle between pre-upgrade and post-upgrade. Don’t forget to remove this logic once your upgrade is complete!
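Both kinds of toggle might look something like the sketch below. The helper names are hypothetical. The version string is passed in as an argument so the example runs without Rails; in an application it could come from `Rails::VERSION::STRING`. The second helper uses plain Ruby's `respond_to?` in place of `#try`, since `#try` is only available with Active Support loaded:

```ruby
require 'rubygems' # provides Gem::Version (already loaded in modern Ruby)

# Hypothetical helper: pick the conversion call based on the framework version.
def params_to_plain_hash(params, rails_version)
  if Gem::Version.new(rails_version) >= Gem::Version.new('4.2')
    params.to_hash # post-upgrade path
  else
    params.to_h    # pre-upgrade path
  end
end

# Hypothetical helper: feature detection instead of a version check.
# Calls the preferred method when the object responds to it, else the fallback.
def call_if_available(object, preferred, fallback)
  if object.respond_to?(preferred)
    object.public_send(preferred)
  else
    object.public_send(fallback)
  end
end

puts params_to_plain_hash({ 'name' => 'report' }, '4.1.16').keys.inspect # ["name"]
puts call_if_available('abc', :upcase, :to_s)                            # ABC
puts call_if_available(5, :upcase, :to_s)                                # 5
```

Keeping each toggle inside one small helper makes it easy to find and delete once the upgrade has shipped.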
In addition to validating changes with your test suite, you should perform other checks to make sure every change is compatible and working correctly. We tested changes using a staging environment and deployed to production during maintenance windows when we felt there was even a slight risk of adverse behavior. With software upgrades, it’s hard to know everything that may be affected, so we took the utmost precaution every step of the way.
Step 4: The Grand Finale
After making all our changes—roughly 30 different pull requests that took 2 engineers working full time over 4 months—we checked for the following:
- Our “rails_[version]_bump” branch had a green passing test suite.
- Our “rails_[version]_bump” branch had the minimal set of changes.
- We’d deployed “rails_[version]_bump” to a staging environment and felt comfortable it worked like production.
Then, we merged and deployed the branch during one of our scheduled maintenance windows. In the next week, we found a few minor bugs—but overall, the process was successful. We upgraded Rails and nobody even noticed!
Stay tuned as we’ll share a list of tips and lessons learned for taking the pain out of upgrading in a future post.