Triclinic Periodic Boxes

OpenMM 6.3 was recently released, and it adds a very important new feature: triclinic periodic boxes.  Previous releases supported only rectangular boxes.  When you used periodic boundary conditions, you specified the shape of the periodic box with three numbers: the size of the box along the x, y, and z axes.  Now you specify three vectors instead of three numbers.  A triclinic box is a parallelepiped, the three-dimensional analogue of a parallelogram.  Its edges are given by the three box vectors, and the angles between them don't all need to be 90 degrees.
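
To make the difference concrete, here is a minimal sketch in plain Python (it does not need OpenMM) that writes a rectangular box as three vectors and compares it with a sheared one.  The helper names and the numbers are made up for illustration.  In OpenMM itself, vectors like these are what you would pass to System.setDefaultPeriodicBoxVectors().

```python
import math

def rectangular_box(lx, ly, lz):
    """Write a rectangular box (three edge lengths) as three box vectors."""
    return (lx, 0.0, 0.0), (0.0, ly, 0.0), (0.0, 0.0, lz)

def volume(a, b, c):
    """Volume of the box: the scalar triple product |a . (b x c)|."""
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    return abs(sum(ai * xi for ai, xi in zip(a, cross)))

def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def angle_degrees(u, v):
    """Angle between two box vectors, in degrees."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    return math.degrees(math.acos(dot / (length(u) * length(v))))

# Old style: three numbers (nm), i.e. three axis-aligned vectors.
rect = rectangular_box(4.0, 4.0, 4.0)

# New style: three arbitrary vectors (nm). These values are illustrative only.
tri = ((4.0, 0.0, 0.0), (1.0, 4.0, 0.0), (1.0, 1.0, 4.0))

for name, (a, b, c) in (("rectangular", rect), ("triclinic", tri)):
    print(name, "volume:", volume(a, b, c),
          "angle between a and b:", round(angle_degrees(a, b), 1))
```

Shearing the box changes the angles but not the volume: both boxes in this example enclose 64 cubic nm.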

So why is this useful?  There are two reasons.  First, you might want to simulate a crystal, and not all crystals have rectangular unit cells.  You need to be able to specify whatever unit cell shape the actual crystal has, and with OpenMM 6.3 you can now do that.
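
Crystal unit cells are usually described by three edge lengths and three angles rather than by vectors.  The sketch below shows one standard way to convert between the two, placing the first vector along x and the second in the xy plane.  The function name and the example cell are hypothetical.  OpenMM additionally expects box vectors to be in a reduced form, which this sketch does not check.

```python
import math

def cell_to_box_vectors(a, b, c, alpha, beta, gamma):
    """Convert cell edge lengths and angles (degrees) to three box vectors.

    alpha is the angle between b and c, beta between a and c, and gamma
    between a and b. The first vector lies along x and the second lies
    in the xy plane.
    """
    al, be, ga = (math.radians(x) for x in (alpha, beta, gamma))
    va = (a, 0.0, 0.0)
    vb = (b * math.cos(ga), b * math.sin(ga), 0.0)
    cx = c * math.cos(be)
    cy = c * (math.cos(al) - math.cos(be) * math.cos(ga)) / math.sin(ga)
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return va, vb, (cx, cy, cz)

# Hypothetical hexagonal cell: a = b = 3 nm, c = 5 nm, gamma = 120 degrees.
va, vb, vc = cell_to_box_vectors(3.0, 3.0, 5.0, 90.0, 90.0, 120.0)
print(va)
print(vb)
print(vc)
```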

The other reason is less obvious, but probably more important.  When you simulate a protein in water, you should make the periodic box big enough that the protein doesn't interact with its own periodic copies.  "Big enough" is hard to define exactly, since Coulomb interactions are long-ranged, but generally you want at least 1 nm of water between periodic copies of the protein, and more is better.

But a protein in water is free to rotate, and that can change the distance between copies.  You need to make sure that, no matter how the protein rotates, there will still be enough padding.  That is, you need a sufficiently large sphere of water surrounding it.  You can then embed that sphere in a cube, but that's wasteful: a cube has almost twice the volume of the sphere inscribed in it (a ratio of 6/π, or about 1.91).  The extra water makes your simulation run slower.  Ideally, you want your periodic box to be a sphere.

That's impossible, since you can't pack spheres to fill space.  But there are polyhedra you can use that at least come closer to being spherical, like the truncated octahedron and rhombic dodecahedron.  By using one of those shapes, you can get away with fewer water molecules while still keeping the same minimum distance between copies of the protein.

So what does this have to do with triclinic boxes?  There turns out to be an amazing mathematical result: any shape that fills all of space when repeated periodically (that is, by translation alone) is exactly equivalent to a triclinic box.  Given a truncated octahedron, for example, there is an equivalent triclinic box with exactly the same volume that, when tiled periodically throughout space, produces exactly the same repeating pattern of atoms.  And that means that if you have triclinic boxes, you automatically have all other shapes as well.
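
As a quick check of the "same volume" claim, the sketch below builds a triclinic box for a truncated octahedron and compares its volume with that of the polyhedron itself.  The box vectors are the ones commonly tabulated for this shape in simulation package documentation, not something taken from OpenMM, and the function names and the value of d are illustrative.

```python
import math

def truncated_octahedron_box(d):
    """Box vectors of a triclinic cell equivalent to a truncated octahedron.

    d is the distance between a point and its nearest periodic image.
    """
    return ((d, 0.0, 0.0),
            (d / 3.0, 2.0 * math.sqrt(2.0) * d / 3.0, 0.0),
            (-d / 3.0, math.sqrt(2.0) * d / 3.0, math.sqrt(6.0) * d / 3.0))

def volume(a, b, c):
    """Volume of the box: the scalar triple product |a . (b x c)|."""
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    return abs(sum(ai * xi for ai, xi in zip(a, cross)))

d = 6.0  # nm, illustrative
box_volume = volume(*truncated_octahedron_box(d))

# Volume of the polyhedron itself: 8*sqrt(2)*e**3 for edge length e.
# Opposite hexagonal faces are sqrt(6)*e apart, so e = d / sqrt(6).
edge = d / math.sqrt(6.0)
polyhedron_volume = 8.0 * math.sqrt(2.0) * edge ** 3

print("triclinic box volume:", box_volume)
print("truncated octahedron volume:", polyhedron_volume)
```

The two volumes agree up to floating-point rounding, at about 77% of the volume of a cube with edge d.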

Given the minimum required distance between periodic copies, the optimal triclinic cell has about 71% of the volume of the optimal rectangular cell, which means your system needs only about 71% as many atoms.  So although this isn't a huge optimization, it does allow you to use a somewhat smaller system that can be simulated somewhat faster.
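
The 71% figure can be reproduced with a few lines of Python, assuming the optimal triclinic cell meant here is the rhombic dodecahedron and the optimal rectangular cell is a cube.  The box vectors are the commonly tabulated ones for that shape, and the brute-force search for the nearest periodic image is only a sketch that is adequate for compact boxes like these.

```python
import itertools
import math

def cubic_box(d):
    """Box vectors of a cube with edge d."""
    return (d, 0.0, 0.0), (0.0, d, 0.0), (0.0, 0.0, d)

def rhombic_dodecahedron_box(d):
    """Box vectors of a triclinic cell equivalent to a rhombic dodecahedron."""
    return ((d, 0.0, 0.0),
            (0.0, d, 0.0),
            (d / 2.0, d / 2.0, math.sqrt(2.0) * d / 2.0))

def volume(a, b, c):
    """Volume of the box: the scalar triple product |a . (b x c)|."""
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    return abs(sum(ai * xi for ai, xi in zip(a, cross)))

def nearest_image_distance(a, b, c, reach=2):
    """Length of the shortest nonzero lattice vector i*a + j*b + k*c.

    Brute force over |i|, |j|, |k| <= reach, which is enough for the
    compact boxes used here.
    """
    best = math.inf
    for i, j, k in itertools.product(range(-reach, reach + 1), repeat=3):
        if (i, j, k) == (0, 0, 0):
            continue
        v = [i * a[n] + j * b[n] + k * c[n] for n in range(3)]
        best = min(best, math.sqrt(sum(x * x for x in v)))
    return best

d = 6.0  # nm, illustrative minimum distance between periodic copies
cube = cubic_box(d)
dodecahedron = rhombic_dodecahedron_box(d)
ratio = volume(*dodecahedron) / volume(*cube)

print("nearest image, cube:", nearest_image_distance(*cube))
print("nearest image, rhombic dodecahedron:", nearest_image_distance(*dodecahedron))
print("volume ratio:", round(ratio, 3))
```

Both boxes keep every periodic image at least d away, but the rhombic dodecahedron does so with √2/2, or about 0.707, of the cube's volume.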