data:image/s3,"s3://crabby-images/b17a9/b17a975573d5607c4ebef4fb86e114259b5be97c" alt=""
Hi there, I'm looking for a neat way to retrieve the sequence of an atom selection (in this case part of a double-stranded DNA molecule) from within a python script. I've been playing around with: selection = run (session, 'select #1/A:2-4|#1/B:6-9') for a in selection.atoms.unique_residues: print(a) which gives me: 151 atoms, 169 bonds, 7 residues, 1 model selected /A DG 2 /A DG 3 /A DG 4 /B G 6 /B G 7 /B A 8 /B T 9 This is a good start, but ideally I'd like to get back a contiguous sequence string for each chain (like (('A', 'GGG'), ('B', GGAT'))). Also, I do not really understand, why residue names are reported as 'DG' for strand A, but 'G' for strand B, which makes extracting a sequence string from the above output quite difficult. I've also tried: selection = run (session, 'select #1/A:2-4|#1/B:6-9') for a in selection.atoms.by_chain: print(a[1]) print(a[2].unique_residues.unique_sequences[0][1]) which gives me: 151 atoms, 169 bonds, 7 residues, 1 model selected A AGGGAGTAATCCCCTTG B CAAGGGGATTACTCCCT This is obviously also not what I want since it returns the entire sequence of each chain. I guess I could take the residue numbers from the first command and use those to slice the strings returned by the second, but that feels a little awkward, and so I was wondering if there was a better, more direct way to achieve this seemingly simple task. Your help would be much appreciated. Cheers, Maik