Summary after some investigation on my part:
The MTZ file in question was generated from a cryo-EM map using phenix.map_to_structure_factors. Due to the large box size of cryo-EM maps, such files tend to be huge: in this case, 1.8 billion reflections, about 800 times larger than the largest crystallographic dataset in the wwPDB. That alone wouldn't necessarily be a problem, but as well as the map amplitudes and phases (F/PHI), the MTZ output also contains an (undocumented, as far as I'm aware) "F-obs/SIGF-obs" array, which is the data type normally used to store crystallographic observations. Clipper interprets it as such, and tries to generate a set of live-recalculating maps.

This is where things were going a bit awry: the extreme number of reflections broke some (naive) assumptions I'd made in the outlier rejection step, causing it to blow out from <1 second to almost 5 hours (!). That's now been fixed (will be in the next full or dev release, whichever comes first), which should have the knock-on effect of making the loading of large crystallographic datasets much faster.

Even with that fix, memory is still a limit at this size. On my Windows laptop (32 GB RAM, 32 GB swap) it eventually manages to generate the maps; on my Linux desktop (32 GB RAM, 2 GB swap) it eventually gets killed when it runs out of memory.

The upshot right now:
- unless you have a very good reason to do otherwise, just work directly with the cryo-EM map rather than converting to structure factors first.
- if you do need to work with structure factors, you can use the Reflection File Editor in the Phenix GUI to copy just the F/PHI columns to a new MTZ. That opens in about 2-3 minutes and behaves as expected.
I'll also look into adding a GUI option to choose which columns to actually turn into maps when a multi-dataset MTZ file is opened.
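For a rough sense of scale, here is a back-of-envelope sketch of how much raw reflection data is involved. It relies on the MTZ format storing every column value as a 4-byte float; the column counts are my assumption (H, K, L plus the arrays mentioned above), and the function name is just for illustration. This is only the raw data size, not a measurement of what Clipper actually allocates when it builds the maps.

```python
# Rough estimate of raw reflection data size for an MTZ file.
# MTZ stores each column value as a 32-bit (4-byte) float.
BYTES_PER_VALUE = 4


def mtz_data_size_gb(n_reflections, n_columns):
    """Return the approximate size of the reflection table in GB (10^9 bytes)."""
    return n_reflections * n_columns * BYTES_PER_VALUE / 1e9


N_REFLECTIONS = 1_800_000_000  # figure quoted above

# Assumed column layouts (not read from the actual file):
# full:    H, K, L, F, PHI, F-obs, SIGF-obs
# trimmed: H, K, L, F, PHI
full_gb = mtz_data_size_gb(N_REFLECTIONS, 7)
trimmed_gb = mtz_data_size_gb(N_REFLECTIONS, 5)

print(f"full file:      ~{full_gb:.1f} GB")
print(f"F/PHI only:     ~{trimmed_gb:.1f} GB")
```

Under those assumptions, the raw data alone is already larger than the 32 GB of physical RAM on either machine, before any working arrays are counted. That is consistent with the Linux box running out of memory when it only has 2 GB of swap to fall back on.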
-- Tristan