This seems to be a memory leak in Chimera. Using the 1.3 production release (1.3.2577) the weak key dictionary discards entries as they are closed, but with the 1.4 production release (1.4.1) and later the entries accumulate in the dictionary. That means that some code somewhere in Chimera is holding onto a reference to the closed model, preventing it from being destroyed and its memory freed (and its weak-key dictionary references from being deleted). This is why your "flush cache" trick helps some but doesn't completely solve the problem: the closed models aren't freeing up memory like they should. I will be investigating this and will post something here when I've fixed it, but memory leaks are pretty difficult to track down so it might take until sometime next week for me to find/fix it.

--Eric

On Jul 28, 2011, at 11:24 AM, Maciek Wójcikowski wrote:
Hello Eric,
My script is quite simple, but I'm not a Python expert:
#! /usr/bin/env python
# -*- coding: utf-8 -*-

import os
import glob
import sys

if len(sys.argv) < 2:
    print 'No directory specified.'
    sys.exit()

path = sys.argv[1]

from chimera import runCommand
runCommand("open 0 model.mol2")

i=0

for files in glob.glob( os.path.join(path, '*.mol2') ):
    #print path+files
    runCommand('open 1 %s' % files)
    if i < 100:
        cache = " cacheDA true"
    else:
        cache = ""
        i=0
    try:
        runCommand("hbonds intramodel false distSlop 0.8 angleSlop 40" + cache )
    except:
        pass
    runCommand("close 1")
    i+=1
I've added a non-cached run every 100 iterations because I noticed in the source code that a non-cached hbonds check triggers flushCache(), and it seems to help. When I previously ran all iterations with "cacheDA true", Chimera was eating RAM like a beast (up to about 2 GB per 100,000 molecules), which made the cache searches slow. Ideally, only the donors and acceptors from the model (the protein) would be cached.
Some technical details: Chimera 1.5.3 on Fedora 14, CentOS 6.0, and Debian Sid; all of them have this problem.
PS. Is it possible to get the number of donors and acceptors somehow? Or should I add this to the source code? It should be easy to add, since the code already counts them but doesn't print the numbers.
Thank you in advance for your help.

----
Pozdrawiam, | Best regards,
Maciek Wójcikowski
maciek@wojcikowski.pl
2011/7/28 Eric Pettersen <pett@cgl.ucsf.edu>

On Jul 28, 2011, at 3:35 AM, Maciek Wójcikowski wrote:
Hello everyone,
I'm trying to compute hbonds for a fairly large molecular database, so I'm doing it from the command line. I've added the cacheDA parameter, which at first speeds up the whole process about 10 times. Over time, though, the cache keeps growing and operations slow down, as one would expect, because Chimera is caching every compound. Is there a way to limit the size of the cache or, even better, to tell Chimera to cache only the protein donors and acceptors?
Hi Maciek,

Are you closing the compound models after you do the H-bond computation? I ask because the caching uses a "weak key dictionary" where the key is the model. What this means is that if the model is closed it should simply disappear from the cache, no fuss no muss. If you are closing the models then either the slowdown is due to something else, or the models aren't being properly removed from the cache. Let me know if your script is closing the models but still having this problem and I will investigate.
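As a rough illustration of the weak-key behavior described above, here is a minimal sketch in plain Python. It is not Chimera code: the Model class, the cache variable, and the stray list are hypothetical stand-ins, and the cached values are placeholders. It only shows that an entry vanishes once nothing else refers to its key, and that a single leftover reference is enough to keep the entry (and the object) alive.

```python
import gc
import weakref


class Model(object):
    """Hypothetical stand-in for a molecular model (not Chimera's class)."""
    def __init__(self, name):
        self.name = name


# Cache keyed weakly on the model: the cache alone never keeps a model alive.
cache = weakref.WeakKeyDictionary()


def cache_size():
    # Force a collection so the result does not depend on GC timing.
    gc.collect()
    return len(cache)


# Case 1: no other references, so dropping the model empties the cache.
m = Model("ligand-1")
cache[m] = "donor/acceptor data (placeholder)"
size_open = cache_size()
del m                      # plays the role of closing the model
size_after_close = cache_size()

# Case 2: a stray reference elsewhere keeps the model, and its entry, alive.
stray = []
m = Model("ligand-2")
cache[m] = "donor/acceptor data (placeholder)"
stray.append(m)            # some other code holds onto the model
del m
size_with_leak = cache_size()

del stray[:]               # releasing the stray reference frees the entry
size_after_fix = cache_size()

print((size_open, size_after_close, size_with_leak, size_after_fix))
```

If the script closes its models and the cache still grows, the situation is the second case: something other than the cache is still referring to the closed models.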
--Eric
Eric Pettersen
UCSF Computer Graphics Lab
http://www.cgl.ucsf.edu