pdb parsing problem-- hydrogen as mercury
I'm getting errors when reading PDB files containing hydrogens added by TRIPOS SYBYL. For example, atoms 1677-1679 are parsed as mercury (HG):

ATOM 1674 HD11 ILE A 107 4.939 -2.338 17.094 1.00 0.00
ATOM 1675 HD12 ILE A 107 6.613 -2.979 16.971 1.00 0.00
ATOM 1676 HD13 ILE A 107 5.703 -3.133 18.512 1.00 0.00
ATOM 1677 HG21 ILE A 107 4.770 -2.095 20.141 1.00 0.00
ATOM 1678 HG22 ILE A 107 6.235 -1.314 20.827 1.00 0.00
ATOM 1679 HG23 ILE A 107 4.630 -0.527 21.006 1.00 0.00
ATOM 1680 H ILE A 107 5.291 0.799 16.773 1.00 0.00
ATOM 1681 N TYR A 108 3.267 1.810 19.498 1.00 21.63

According to the table at http://www.bmrb.wisc.edu/ref_info/atom_nom.tbl there are discrepancies between PDB-like file formats, and the official PDB format would specify "1HG2" instead of "HG21".

Does anyone know of a quick way to deal with this? I can write a perl search/replace script if need be.

Thanks,
Josh
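For readers wondering why "HG21" turns into mercury: in the PDB format the atom name occupies columns 13-16, and by convention the element symbol is right-justified in columns 13-14. A name that starts with "HG" in column 13 therefore looks like Hg when no element is given in columns 77-78. The sketch below only illustrates that heuristic. `guess_element` is a hypothetical helper, not the code of any particular reader, and it assumes columns 77-78 are blank or ignored.

```python
def guess_element(line):
    """Guess an element symbol from the atom name field (columns 13-16).

    Hypothetical illustration: assumes the element symbol is
    right-justified in columns 13-14 and that columns 77-78 are not used.
    """
    symbol = line[12:14].strip()
    # A leading digit (as in "1HG2") is not part of the element symbol.
    symbol = symbol.lstrip("0123456789")
    return symbol.capitalize()


# Build records with the name in columns 13-16 (spacing assumed).
sybyl_line = "ATOM  %5d %-4s ILE A 107" % (1677, "HG21")
pdb_line = "ATOM  %5d %-4s ILE A 107" % (1677, "1HG2")
print(guess_element(sybyl_line))  # SYBYL-style name
print(guess_element(pdb_line))    # official PDB-style name
```

Under this heuristic the SYBYL-style name is read as Hg while the official name is read as H, which matches the behaviour described above.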
On Fri, 18 Feb 2005, Josh Tasman wrote:
I'm getting errors when reading PDB files containing hydrogens added by TRIPOS SYBYL. For example, atoms 1677-1679 are parsed as mercury (HG):
ATOM 1674 HD11 ILE A 107 4.939 -2.338 17.094 1.00 0.00
ATOM 1675 HD12 ILE A 107 6.613 -2.979 16.971 1.00 0.00
ATOM 1676 HD13 ILE A 107 5.703 -3.133 18.512 1.00 0.00
ATOM 1677 HG21 ILE A 107 4.770 -2.095 20.141 1.00 0.00
ATOM 1678 HG22 ILE A 107 6.235 -1.314 20.827 1.00 0.00
ATOM 1679 HG23 ILE A 107 4.630 -0.527 21.006 1.00 0.00
ATOM 1680 H ILE A 107 5.291 0.799 16.773 1.00 0.00
ATOM 1681 N TYR A 108 3.267 1.810 19.498 1.00 21.63
According to the table at
http://www.bmrb.wisc.edu/ref_info/atom_nom.tbl there are discrepancies between PDB-like file formats, and the official PDB format would specify "1HG2" instead of "HG21".
Does anyone know of a quick way to deal with this?
I can write a perl search/replace script if need be.
We'll take the suggestion under advisement. Please file a bug report with TRIPOS. We'd be especially happy if they fixed the names, but it would also help if they put the atomic element symbol in columns 77-78 as allowed by PDB version 2. A perl script is an excellent workaround.

Regards,
Greg
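Until the files are fixed at the source, the element-column idea can also be applied after the fact. The following is a minimal sketch, not a tested tool: `tag_hydrogen` is a hypothetical helper that writes " H" into columns 77-78 of records whose atom name looks like a hydrogen. It assumes the reader honors columns 77-78 and that the name sits in columns 13-16. The name test is a heuristic, so real mercury atoms named "HG" (no trailing digit) are left alone.

```python
def tag_hydrogen(line):
    """Return the record with ' H' in columns 77-78 if it looks like a hydrogen.

    Hypothetical helper; the name heuristics below are assumptions.
    """
    if not line.startswith(("ATOM  ", "HETATM")):
        return line
    body = line.rstrip("\r\n")
    ending = line[len(body):]          # preserve the original line ending
    name = body[12:16]
    if len(name) < 4:
        return line
    sybyl_style = name[0] == "H" and name[3].isdigit()        # e.g. "HG21"
    pdb_style = name[0] in " 0123456789" and name[1] == "H"   # e.g. "1HG2", " H  "
    if not (sybyl_style or pdb_style):
        return line
    body = body.ljust(78)              # pad short records out to column 78
    return body[:76] + " H" + body[78:] + ending
```

Each line of the input file would be passed through `tag_hydrogen` and written back out unchanged otherwise.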
Hi Josh,

There's a short Python script at the end of this message that produces a fixed PDB file. If there is an 'H' in the problem column and the atom name is four characters long ending in a digit, it will move the fourth character of the name to the front. The script expects the problem file name as a command-line argument and prints the corrected file to standard output.

Hope this helps...

Eric Pettersen
UCSF Computer Graphics Lab
pett@cgl.ucsf.edu
http://www.cgl.ucsf.edu

--- script ---
#!/bin/env python
import sys
pdbFile = file(sys.argv[1], "rU")
for line in pdbFile:
    # 'H' in column 13 and a digit in column 16: rotate the digit to the front
    if (line.startswith("ATOM ") or line.startswith("HETATM")) and line[12] == "H" and line[15].isdigit():
        print line[:12] + line[15] + line[12:15] + line[16:],
    else:
        # every other line is passed through unchanged
        print line,
pdbFile.close()
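The script above uses Python 2 constructs (`file(...)` and the `print` statement). For anyone running Python 3, here is a rough equivalent that keeps the same column test. The function names `fix_hydrogen_name` and `fix_file` are my own and not part of the original message, so treat this as a sketch rather than a drop-in replacement.

```python
import sys


def fix_hydrogen_name(line):
    """Move the trailing digit of a SYBYL-style hydrogen name to the front.

    Same test as the original script: 'H' in column 13 and a digit in
    column 16 of an ATOM/HETATM record, so "HG21" becomes "1HG2".
    """
    is_atom = line.startswith(("ATOM  ", "HETATM"))
    if is_atom and len(line) > 15 and line[12] == "H" and line[15].isdigit():
        return line[:12] + line[15] + line[12:15] + line[16:]
    return line


def fix_file(path, out=sys.stdout):
    """Write a corrected copy of the PDB file at `path` to `out`."""
    with open(path) as pdb_file:
        for line in pdb_file:
            out.write(fix_hydrogen_name(line))
```

To get the same command-line behaviour as the original, call `fix_file(sys.argv[1])` at the bottom of the file and redirect standard output to the corrected file.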
participants (3)
- Eric Pettersen
- Greg Couch
- Josh Tasman