
I am dealing with the average structure (a protein complex embedded in a POPC membrane and solvated in water) derived with Amber's ptraj from a 1.5 ns MD run.

Opening this PDB file in Chimera 1.2470 has become extremely slow. The file is 6.4 MB. First, at the bottom of the window, the warning "Ignored bad PDB record found on line #" is shown for lines 1 to 114154. This may take some 10 minutes.

The PDB records read as in the following examples:

For the lipid:
ATOM 20 2C21 POP 1 25.569 20.201 48.492 0.00 0.00

For the protein:
ATOM 10915 HB2 ALA 117 44.211 74.567 28.832 0.00 0.00

For water:
ATOM 22264 O WAT 1771 25.558 39.417 16.580 0.00 0.00
ATOM 22265 H1 WAT 1771 25.582 39.432 16.549 0.00 0.00
ATOM 22266 H2 WAT 1771 25.569 39.482 16.611 0.00 0.00

After that, the message changes to "Computed secondary structure assignments (see reply log)", which stays up for more than 1 hour and 20 minutes. During this time, the "top" command shows python using 12% MEM and 99% CPU. Then the graphics appear, with the membrane-protein complex not centered in the water box.

During all that time, I had to avoid doing anything else in the GNOME interface. A second terminal for "top" had to be opened before launching Chimera from another terminal window; otherwise, spurious graphics were superimposed on the Chimera window.

I could then rapidly map the protein residues around the single-residue ligand (select protein & :ligandname z<#), which was what I wanted to do.

The same happened with shorter trajectories. At ca. 0.7 ns, Chimera took about 15 minutes to process the average-structure PDB file. Clearly, the time grows much faster than linearly with trajectory length.

I can partly compare these events with VMD: the PDB file from the 1.5 ns MD opens in VMD in less than 1 minute, and the resulting membrane-protein complex is centered in the water box.

Is all this caused by Amber's outdated atom naming? Or could execution be accelerated by calling C routines from Python?

Chimera was run on a modest desktop: Athlon 1 GHz, 1 GB RAM, a poor main board (product: K7S5A, vendor: ECS, version: 1.0), Debian Linux i386.

Thanks
francesco pietra

Hi Francesco,

You are a patient man! I have no doubt what you say is true but have no idea why processing your trajectory would be so slow if it's not a memory issue (and it doesn't seem to be based on your "top" output). Also, I have no idea why VMD would show a PDB-based trajectory one way and Chimera another.

Could you send me a compressed version of your trajectory directly? Being a PDB file, it should compress a lot and therefore not be too big a problem to email. Thanks!

--Eric
Eric Pettersen
UCSF Computer Graphics Lab
http://www.cgl.ucsf.edu

On Jan 1, 2008, at 1:44 PM, Francesco Pietra wrote:

> I am dealing with the average structure (a protein complex embedded in a POPC membrane and water solvated) derived with Amber's ptraj from a 1.5 ns MD.
>
> Opening this pdb file in 1.2470 Chimera has become extremely slow. The file is 6.4MB. First, below the screen it is warned "Ignored bad PDB record found on line #", for lines from 1 to 114154. This may take some 10 minutes.

These are for the water ATOM records where the atom serial number and/or residue number were "****" (what FORTRAN inserts when a number won't fit inside a field width).

> After that, the warning message changes to "Computed secondary structure assignments (see reply log)" which lasts for longer than 1 hour and 20 minutes. During this time, "top" command shows that python is using 12% MEM and 99% CPU.

Because this is an "average" structure, Chimera's estimation of the connectivity is bad for many parts of the structure -- particularly the POP residues in the membrane. This creates a rat's nest of intra-residue connectivity which the ring-finding algorithm (designed for "reasonable" structures) takes a long time to operate on. Normally Chimera wouldn't run ring-finding as a structure opens, but due to some interesting naming of hydrogens in the POP residues (e.g. RH16) it assigns some of the hydrogens to be other elements (e.g. rhodium, as per PDB atom naming rules). Since rhodium is a metal, it wants to depict it as a sphere, which means it needs to know the radius, which in turn depends on the atom type, which needs to find rings...

> Then, the graphics appears, with the membrane-protein-complex not centered in the water box.

This is due to the "****" waters being ignored.

> I could then carry out rapid mapping of the protein residues around the single-residue ligand (select protein & :ligandname z<#), which was what I wanted to do.

If you only care about the protein and ligand in your analysis, you should just edit your file to strip the waters and lipids. When I did this with the file you sent it only took moments to open.

--Eric
Eric Pettersen
UCSF Computer Graphics Lab
http://www.cgl.ucsf.edu
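
A minimal sketch of the stripping step suggested above, assuming the file follows the standard fixed-column PDB layout (serial number in columns 7-11, residue name in columns 18-20, residue number in columns 23-26). The function names and the command-line usage are hypothetical, and the residue names WAT and POP are simply the ones shown in the example records. It also counts records whose number fields contain "*", the FORTRAN overflow marker mentioned above.

```python
import sys


def resname(line):
    """Residue name from fixed PDB columns 18-20."""
    return line[17:20].strip()


def has_overflow(line):
    """True if the serial (cols 7-11) or residue number (cols 23-26) contains '*'."""
    return "*" in line[6:11] or "*" in line[22:26]


def strip_residues(lines, drop=("WAT", "POP")):
    """Return (kept_lines, n_overflow).

    ATOM/HETATM records whose residue name is in `drop` are removed;
    every other line (including non-atom records) is passed through unchanged.
    n_overflow counts atom records with '*' in a number field, kept or not.
    """
    kept = []
    n_overflow = 0
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            if has_overflow(line):
                n_overflow += 1
            if resname(line) in drop:
                continue
        kept.append(line)
    return kept, n_overflow


if __name__ == "__main__":
    # Hypothetical usage: python strip_pdb.py average.pdb stripped.pdb
    with open(sys.argv[1]) as src:
        kept, n_overflow = strip_residues(src)
    with open(sys.argv[2], "w") as dst:
        dst.writelines(kept)
    print(f"{n_overflow} atom records had overflowed ('*') number fields")
```

The filter works on column positions rather than whitespace-split fields because four-character atom names can run into neighbouring columns in real PDB files.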

Eric:

Thanks a lot for this lesson, surely useful to many others, too.

Whether to look at an "average structure" or at some other output structure is a question on which I got no clarification from the Amber mailing list. Unfortunately, Prof. Brozell is not active on Dockfans in this period, so I cannot ask him. My question was: is the "average structure" representative of the interactions occurring between the protein and the ligand, as resulting from the MD carried out? Although DOCK6.1 does that with amber score, it is only for an implicit medium surrounding the complex. Ideally, I would have liked to estimate the free energy of interaction in the presence of the explicit surrounding medium, but Amber does not appear to have a way to do that for a complex in a membrane (or, anyway, for a non-standard ligand). Therefore, what I am relying on is the distance between the protein residues and the ligand. The closer the protein residues are to the ligand, the more relevant they are considered to be.

At any event, given the problem Chimera has encountered with an average structure, do you believe that mapping the protein environment around the ligand with Chimera's "zone" is correct? From your "lesson" I understand YES.

I would appreciate any comment or suggestion about that.

francesco
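
The distance criterion described in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Chimera's zone implementation: it assumes fixed-column PDB records, treats a residue as "near" when any of its atoms lies within the cutoff of any ligand atom, and uses a hypothetical ligand residue name (LIG) and hypothetical function names.

```python
import math


def parse_atoms(lines):
    """Return (resname, resseq, (x, y, z)) for each ATOM/HETATM record.

    Uses fixed PDB columns: residue name 18-20, residue number 23-26,
    coordinates 31-38, 39-46, 47-54. resseq stays a string on purpose.
    """
    atoms = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((line[17:20].strip(), line[22:26].strip(), xyz))
    return atoms


def residues_near(atoms, ligand_resname, cutoff, exclude=("WAT", "POP")):
    """List ((resname, resseq), min_distance) for residues within cutoff of the ligand.

    Solvent and lipid residue names in `exclude` are skipped.
    The result is sorted by increasing minimum distance.
    """
    ligand = [xyz for name, _, xyz in atoms if name == ligand_resname]
    if not ligand:
        return []
    best = {}
    for name, seq, xyz in atoms:
        if name == ligand_resname or name in exclude:
            continue
        d = min(math.dist(xyz, lig_xyz) for lig_xyz in ligand)
        key = (name, seq)
        if d <= cutoff and d < best.get(key, float("inf")):
            best[key] = d
    return sorted(best.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical usage with a stripped file and a ligand residue named LIG.
    with open("stripped.pdb") as handle:
        atoms = parse_atoms(handle)
    for (name, seq), dist in residues_near(atoms, "LIG", 4.0):
        print(f"{name} {seq}: {dist:.2f}")
```

Sorting by minimum distance gives the ranking described above, with the closest residues listed first. The all-pairs loop is adequate for a protein plus one ligand; a full solvated system would call for a spatial grid.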

Well, I'm no expert at this type of analysis. I should think that if your trajectory is dominated by one binding mode then an average structure could provide insight, but then so would most individual frames of the trajectory. If your trajectory indicates multiple binding modes then an average structure would be misleading at best. Perhaps you could carry out Elaine's suggestion per cluster analysis until such time as it's integrated with MD Movie.

--Eric
Eric Pettersen
UCSF Computer Graphics Lab
http://www.cgl.ucsf.edu

On Jan 2, 2008, at 11:04 PM, Francesco Pietra wrote:

> Eric: Thanks a lot for this lesson, surely useful to many other guys, too.
>
> Whether looking at an "average structure" or some other output structure was a problem I got no clarification from the Amber mailing list. Unfortunately, Pr Brozell is not active in this period on Dockfans to ask him. My question was, is the "average structure" representative of the interactions occurring between the protein and the ligand as resulting from the MD carried out? Although DOCK6.1 does that with amber score, it is only for implicit medium surrounding the complex. Ideally, I would have liked to estimate the free energy of interaction in the presence of explicit surrounding medium but Amber does not appear to have a way to that for a complex in a membrane (or anyway for a non-standard ligand). Therefore, what I am relying on, is the distance between the protein residues and the ligand. The closest the protein residues are to the ligand, the more they are considered to be relevant.
>
> At any event, given the problem Chimera has encountered with an average structure, do you believe that mapping the protein environment around the ligand with Chimera's "zone" is correct? From your "lesson" I understand YES.
>
> I would appreciate any comment or suggestion about that.
>
> francesco
participants (2)
- Eric Pettersen
- Francesco Pietra