pt-online-schema-change is a tool in the Percona Toolkit which allows you to make large table alters without locking the database.
Newer versions of MySQL have Online DDL which reduce the need for this tool. But if you're like us,
pt-online-schema-change is still very valuable because we use Galera. Running a DDL query on a Galera cluster blocks the entire cluster (even other databases in the same cluster).
I've successfully used
pt-online-schema-change in the past with great success. This week I tried to use it again to perform some expensive queries and I ran into an error message like this:
Error creating new table: DBD::mysql::db do failed: Can't write; duplicate key in table '_child_new'
A bit of Google-fu and I discovered it was linked to this bug.
The bug is marked as fixed in version 2.2.17 -- but it is not. If you need a truly fixed version, jump to the end of this post: The Fix.
Briefly: How the tool works
To understand the bug, you need to understand a bit about how the tool actually works.
pt-online-schema-change creates a brand new table, applies your ALTER query on the new table (which is "fast" because it's empty), and then proceeds to copy records from the old table to the new table in chunks. It installs triggers on the old table to handle deletes and updates. Basically, you're syncing the two tables together.
When the new table is synced with the old table, the tool switches them: The old table is dropped and the new table is renamed to the real table name.
The Problem: Foreign Keys
The problem is foreign keys. The FK needs to exist simultaneously in the old table and the new table, but FK names must be unique across the entire database.
To work around this,
pt-online-schema-change prefixes the FK's with an underscore in the new table. Example:
- Original FK:
- New FK:
Then when the tables are swapped, you're still left with that prefixed FK in the production table.
Run the tool a second time? Well, once upon a time, you'd just end up with a double-prefixed FK. So the second time, you'd have
__MY_FK. The third time, you'd have
The issue is that the maximum length of a FK is 64 characters. Imagine a schema changing many times over the course of a few years, you can easily breach this limit.
So Percona patched the tool so it would detect if there was a FK with an underscore, and switch to stripping the underscore. In other words, they wanted it to alternate:
_MY_FK, then back to
MY_FK again if you ran it again.
The bug is that the original patch checked for any FK with underscores. Meaning, if you had
MY_OTHER_FK, it would detect the underscore on the first one and assume that it needed to strip FKs. Of course, there is no underscore on the second, so it would be a no-op.
- New FKs:
As you see, a new table would attempt to create a FK with a name that already exists. And that is the error.
Peter Dolberg, the op of the bug report, supplied a suggested patch. Despite the bug being marked as fixed in 2.2.17, it is not. What I ended up doing is copying the source, patching it myself, and then I just used the patched version:
1) Install percona-toolkit the usual way.
2) Copy the source of the
cp `which pt-online-schema-change` my-pt-osc
3) Patch it according to Peter's instructions. I took the liberty of creating a patch file: