[chimera-dev] source of ~10x slowdown in opening of Molecules in latest chimera

After much digging, it turns out that the slowdown occurs when the PDBio wrapper returns the new molecules to chimera. Due to wrappy's C++/Python reference count caching, the molecule's attributes are fetched at that time, and one of those attributes, sortedAtoms, takes a long time to compute (for 4371 atoms it takes 4.1 seconds on socrates, 3.2 on spin, and 5.9 on my home P4). A timing script shows that sortedAtoms isn't any slower in the latest chimera, and sortedAtoms has been an attribute since 2000/05/02, so an optimization to speed up the initial sorting of atoms seems to have been lost recently. So far, I have been unable to find it, so I'm hoping Eric or Conrad will chime in.

In the meantime, making sortedAtoms a member function instead of an attribute (which wrappy much prefers anyway) avoids the delay. Luckily, no .py file in /usr/local/src/chimera calls sortedAtoms, so it is a fairly safe fix.

- Greg
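
The effect described above can be illustrated with a minimal Python sketch. It is not wrappy's generated code or Chimera's API: the class names, the `prefetch_attributes` helper, and the synthetic atom names are all hypothetical, and the eager "read every attribute once" step is only an assumption about what attribute caching at wrap time amounts to. The sketch shows that an expensive value exposed as an attribute is computed during prefetching, while the same value exposed as a member function is not.

```python
import time


class _BaseMolecule:
    """Hypothetical stand-in for a wrapped molecule (not Chimera's class)."""

    def __init__(self, atom_names):
        self._atom_names = list(atom_names)
        self.sort_calls = 0  # how many times the expensive sort has run

    def _sort(self):
        self.sort_calls += 1
        return sorted(self._atom_names)


class AttrMolecule(_BaseMolecule):
    @property
    def sortedAtoms(self):
        # Exposed as an attribute: any code that reads all attributes
        # triggers the sort, whether or not the caller needs the result.
        return self._sort()


class MethodMolecule(_BaseMolecule):
    def sortedAtoms(self):
        # Exposed as a member function: runs only when explicitly called.
        return self._sort()


def prefetch_attributes(obj):
    """Hypothetical eager step: read every property once and cache the value."""
    cache = {}
    for klass in type(obj).__mro__:
        for name, member in vars(klass).items():
            if isinstance(member, property) and name not in cache:
                cache[name] = getattr(obj, name)
    return cache


# Synthetic atom names, in reverse order so the sort has work to do.
names = [f"A{i}" for i in range(4371, 0, -1)]
attr_mol = AttrMolecule(names)
method_mol = MethodMolecule(names)

start = time.perf_counter()
attr_cache = prefetch_attributes(attr_mol)
attr_seconds = time.perf_counter() - start

method_cache = prefetch_attributes(method_mol)

print("sorts during prefetch (attribute):", attr_mol.sort_calls)
print("sorts during prefetch (method):   ", method_mol.sort_calls)
print(f"prefetch time with attribute: {attr_seconds:.6f} s")
```

In this sketch the attribute version pays for the sort as soon as the object is prefetched, while the method version defers the cost until a caller asks for it. The timings here say nothing about the real sortedAtoms, whose cost depends on Chimera's C++ implementation.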

Obviously, the code for sortedAtoms() hasn't changed in a long time. So one way to look at it is that the "problem" is the new pre-fetching behavior of wrappy. Is it true, then, that with the new behavior of wrappy it is effectively impossible to have a lazily-evaluated attribute of a Molecule (or Atom or Bond, for that matter)? I think I already ran afoul of this new behavior when I couldn't make primaryBonds() or allLocations() into attributes.

sortedAtoms() was being used in my analysis of trajectories, where the molecules in question were hundreds of atoms in size and performance as-is was more than acceptable. Now that sortedAtoms is forcibly executed no matter what, it operates on molecules that are many thousands of atoms in size, and you get what you see now.

Perhaps we need a discussion before we commit to never having lazily-evaluated attributes.

--Eric

(In reply to Greg Couch's message of Tuesday, June 3, 2003, 02:44 PM.)
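
For readers unfamiliar with the term, a lazily-evaluated attribute is one whose value is computed on first access rather than when the object is created. The following is a minimal sketch in modern Python (3.8 or later, using `functools.cached_property`); it is not how wrappy or Chimera implement attributes, and the `LazyMolecule` class and its atom names are hypothetical.

```python
from functools import cached_property


class LazyMolecule:
    """Hypothetical molecule with a lazily-evaluated, cached attribute."""

    def __init__(self, atom_names):
        self._atom_names = list(atom_names)
        self.sort_calls = 0  # how many times the sort has actually run

    @cached_property
    def sortedAtoms(self):
        # Computed on first access only; later reads reuse the stored list.
        self.sort_calls += 1
        return sorted(self._atom_names)


mol = LazyMolecule(["CB", "CA", "N"])
calls_before = mol.sort_calls  # nothing has been sorted yet

first = mol.sortedAtoms   # triggers the sort
second = mol.sortedAtoms  # served from the cache, no second sort

print(calls_before, mol.sort_calls, first)
```

Laziness of this kind only helps if nothing reads the attribute up front. A wrapper layer that fetches every attribute when the object is handed to Python would force the computation anyway, which is the concern raised in the message above.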

Less obvious is that the code that wrappy generates hasn't changed in a long time either. The "new pre-fetching behavior of wrappy" was added on 2000/11/07. I'm still trying to figure out why this problem with sortedAtoms happened now, but it was just waiting to be a problem. Whether or not we support lazily-evaluated attributes belongs in a discussion of a new version of wrappy or a wrappy replacement. It is not supported now.

- Greg

(In reply to Eric Pettersen's message of Tue, 3 Jun 2003 15:47:48 -0700.)
_______________________________________________ Chimera-dev mailing list Chimera-dev@cgl.ucsf.edu http://www.cgl.ucsf.edu/mailman/listinfo/chimera-dev