Mol2 Trajectory Reader?
Hello,

How hard would it be to write an extension to Chimera that reads a multi-molecule Mol2 file into Chimera as a trajectory? That would be really useful for me. Please let me know if you can do it, or give me some advice on how to proceed.

S Joshua Swamidass
On Dec 20, 2004, at 1:49 PM, S Joshua Swamidass wrote:
How hard would it be to write an extension to chimera that read a multimolecule mol2 file into chimera as a trajectory? That would be really useful for me. Please let me know if you can do it or some advice on how to proceed.
Hi Joshua,

I guess I'm curious as to what package puts out trajectories in Mol2 format...

The difficulty of adding a format is directly proportional to how fast you need to have the trajectory read. Reading the entire trajectory at startup in interpreted Python is not too difficult to code. Using a C/C++ Python module to read the trajectory and/or reading the frames on demand both increase the coding effort.

If you provided me with examples of Mol2-format trajectories, I could probably have support for them done in about a month or possibly less, given the various other demands on my time. Performance would be similar to the multi-MODEL PDB trajectory case, since the entire trajectory would be read in on startup, but using the C++ layer.

I've appended an outline of how to add a new trajectory format, in case you want to give it a stab yourself...

Eric Pettersen
UCSF Computer Graphics Lab
pett@cgl.ucsf.edu
http://www.cgl.ucsf.edu

Adding a New Trajectory Format
----------------------------------------------

Movie actually uses the Trajectory module (chimera/share/Trajectory) to read the various formats. Trajectory has a subdirectory named "formats" that in turn has subdirectories for each supported format (the subdirectories are Python modules). By convention the module name for each format is the name of the format with the initial letter of each word capitalized and all other letters lowercase (e.g. the MMTK module's name is Mmtk).

A format's module is typically structured so that the code that interfaces with Trajectory's generic format handling is in the __init__.py file, and the code specific to supporting reading the format's files is in another file -- usually named after the format itself (e.g. Gromos.py).
__init__.py:

The __init__.py file needs to support the following things:

1) If the name of the format as displayed to the user is different from the module name (which, due to capitalization, it usually is) then there has to be a global variable named "formatName" that is initialized to the display name of the format.

2) A class named ParamGUI needs to be defined that handles presenting the file-loading interface for that format to the user. It must have two methods:

   2.1) __init__, which receives a Tkinter.Frame instance argument. The __init__ method should populate the frame with widgets for gathering the input information for the format from the user.

   2.2) loadEnsemble, which takes as arguments a starting frame number, ending frame number, and callback function. loadEnsemble needs to compose a list of the arguments that were provided by the user to the widgets defined in the __init__ method, and then call this module's global loadEnsemble function (see below) with that list as the first argument and the start/end frame numbers and callback as the remaining three arguments.

3) A global loadEnsemble function that generates an ensemble instance (discussed later). This function is called not only by the ParamGUI.loadEnsemble method, but also when the user uses a "metafile" to specify the input parameters. This function takes four arguments: a format-specific list of input parameters, a starting frame number, an ending frame number, and a callback function to start the Movie interface. This function should call the Movie-interface callback with the generated ensemble as an argument. This function should also remember the provided format-specific values as preferred defaults for future uses of the format.

The code for a format's __init__.py file is very similar from format to format. The easiest way to write your own is to grab another format's __init__.py file and modify it.
The __init__.py file for the Gromos format is a good example since it uses multiple input files and has a non-file parameter as well, so it pretty much covers all the bases in what you might need.

The format-specific .py file:

This file defines an "ensemble" class that gets instantiated from __init__.py's loadEnsemble function. The ensemble class needs to support the following methods:

1) An __init__ method that takes the format's input parameters and start/end frames as arguments. The __init__ method may read input files or do whatever is necessary to support the other instance methods (e.g. call into a C/C++ module to read the files -- the Amber format does this).

2) A GetDict method that takes a string argument. The string specifies what data should be returned. The possible string values are:

   2.1) atomnames -- return a list of the atom names; a residue's atoms must be consecutive

   2.2) elements -- return a list of the atom elements. These should be instances of chimera.Element (which can be initialized with a string (e.g. "Fe") or a number). Trajectory's determineElementFromMass function may be useful here if the format doesn't specify the atomic number directly or it can't be easily determined from the atom name.

   2.3) resnames -- return a list of the residue names

   2.4) bonds -- return a list of "bonds": two-tuples of indices into the atomnames list

   2.5) ipres -- return a list of the first atom of each residue (indices into atomnames, but unlike previous indices these are 1-based, so the first element of ipres will always be 1)

3) A __getitem__ method taking a frame-number argument (starting with 1): return a list of 3-tuples corresponding to the xyz coordinates of the atoms in that frame (same order as atomnames). The coordinates should be in angstroms.

4) A __len__ method that returns the total number of frames in the trajectory (not just the number of frames between the user-specified start/end frames).
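As an illustration of the ensemble interface described above, here is a rough pure-Python sketch that treats each @<TRIPOS>MOLECULE record of a multi-molecule Mol2 file as one frame. It is a sketch under several assumptions, not Chimera code: the class name Mol2Ensemble is made up, the constructor takes the file contents as a string, atom IDs are assumed to run 1..N in file order, every molecule is assumed to have the same atoms as the first, and elements are returned as plain symbols guessed from the SYBYL atom type (inside Chimera these would need to be wrapped in chimera.Element).

```python
class Mol2Ensemble:
    """Hypothetical ensemble: one frame per @<TRIPOS>MOLECULE record."""

    def __init__(self, mol2_text, start_frame=1, end_frame=None):
        self.start_frame = start_frame
        self.end_frame = end_frame
        self._frames = []      # one list of (x, y, z) tuples per frame
        self._topology = None  # taken from the first molecule only
        # Whatever precedes the first record marker is skipped.
        for chunk in mol2_text.split("@<TRIPOS>MOLECULE")[1:]:
            atoms, bonds = self._parse_molecule(chunk)
            if self._topology is None:
                self._set_topology(atoms, bonds)
            self._frames.append([atom[1] for atom in atoms])

    @staticmethod
    def _parse_molecule(chunk):
        atoms, bonds, section = [], [], None
        for line in chunk.splitlines():
            line = line.strip()
            if line.startswith("@<TRIPOS>"):
                section = line[len("@<TRIPOS>"):]
                continue
            fields = line.split()
            if section == "ATOM" and len(fields) >= 6:
                # atom_id name x y z type [subst_id [subst_name ...]]
                xyz = tuple(float(value) for value in fields[2:5])
                element = fields[5].split(".")[0]  # crude guess, e.g. C.3 -> C
                res_id = fields[6] if len(fields) > 6 else "1"
                res_name = fields[7] if len(fields) > 7 else "UNK"
                atoms.append((fields[1], xyz, element, res_id, res_name))
            elif section == "BOND" and len(fields) >= 3:
                # bond_id origin_id target_id ...; convert to 0-based indices
                bonds.append((int(fields[1]) - 1, int(fields[2]) - 1))
        return atoms, bonds

    def _set_topology(self, atoms, bonds):
        ipres, resnames, previous = [], [], None
        for index, atom in enumerate(atoms):
            if atom[3] != previous:       # a new residue starts here
                ipres.append(index + 1)   # 1-based, as the outline requires
                resnames.append(atom[4])
                previous = atom[3]
        self._topology = {
            "atomnames": [atom[0] for atom in atoms],
            "elements": [atom[2] for atom in atoms],
            "resnames": resnames,
            "bonds": bonds,
            "ipres": ipres,
        }

    def GetDict(self, key):
        return self._topology[key]

    def __getitem__(self, frame_number):
        # Frames are numbered from 1.
        if not 1 <= frame_number <= len(self._frames):
            raise IndexError("frame number out of range")
        return self._frames[frame_number - 1]

    def __len__(self):
        # Total frames in the file, regardless of start_frame/end_frame.
        return len(self._frames)
```

This reads every frame up front, which matches the "read the entire trajectory at startup" approach; reading frames on demand would mean storing file offsets instead of coordinates.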