Hi
I am running into a problem with multimer prediction using the sequences I provide as input. It always reports: "Missing or invalid "sequences" argument: Sequences argument" and " is not a chain specifier, alignment id, UniProt id, or sequence characters"
The input is always in something like FASTA format:
seq_id
ACCCC
seq_id2
ALLPAAAA
May I know how I can rectify this?
Thanks! - Dennis
Sorry, I couldn't generate the PDB file from the ChimeraX Colab run - is there anything that I may have missed?
-Dennis
Hi Dennis, Your sequence input is wrong - it should contain only the sequences pasted as plain text, with only a comma between them (NOT the ">description" line because it is not supposed to be in fasta format). How to input sequence(s) is explained in the AlphaFold help page.
https://rbvi.ucsf.edu/chimerax/docs/user/tools/alphafold.html#predict
"For predicting a complex (multimer), the sequences of all chains in the complex must be given. The same sequence must be given multiple times if it occurs in multiple copies in the complex. The sequences can be specified either collectively as a model number chosen from the menu of currently open models (e.g. when that model contains multiple chains), or individually within a comma-separated list of UniProt identifiers or pasted-in amino acid sequences."
E.g. something like
ACCCC,ALLPAAAA
I hope this helps,
Elaine
-----
Elaine C. Meng, Ph.D.
UCSF Chimera(X) team
Department of Pharmaceutical Chemistry
University of California, San Francisco
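For anyone starting from FASTA text, the conversion to the comma-separated form described above can be scripted. This is only a minimal sketch, not part of ChimeraX: the function name fasta_to_comma_list is made up for illustration, and it assumes header lines begin with ">" as in standard FASTA.

```python
def fasta_to_comma_list(fasta_text: str) -> str:
    """Join the sequences of FASTA-style text into one comma-separated string.

    Hypothetical helper for illustration; assumes header lines start with ">".
    """
    sequences = []
    current = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header starts a new record, so store the previous sequence.
            if current:
                sequences.append("".join(current))
                current = []
            continue
        current.append(line)  # sequence data, possibly wrapped over lines
    if current:
        sequences.append("".join(current))
    return ",".join(sequences)


example = """>seq_id
ACCCC
>seq_id2
ALLPAAAA
"""
print(fasta_to_comma_list(example))
```

The printed string is what would be pasted into the sequence field; the header names are dropped because only the sequences are expected.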
Hi Elaine,
Thanks, I was trying to predict a multimer or the overall structure of many subunit chains using individual sequences, each separated with a comma in colab. But it seems that there was some error and no pdb file was generated; the error message is as follows:
ERROR:colabfold.batch:Could not generate input features af1848: Invalid character in the sequence:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/colabfold/batch.py", line 1357, in run
    model_type,
  File "<ipython-input-1-d6881d38b934>", line 122, in generate_input_feature_wrapper
    (input_features, domain_names) = batch.generate_input_feature_orig(*args, **kw)
  File "/usr/local/lib/python3.7/dist-packages/colabfold/batch.py", line 1018, in generate_input_feature
    sequence, input_msa, template_features[sequence_index]
  File "/usr/local/lib/python3.7/dist-packages/colabfold/batch.py", line 869, in build_monomer_feature
    sequence=sequence, description="none", num_res=len(sequence)
  File "/usr/local/lib/python3.7/dist-packages/alphafold/data/pipeline.py", line 43, in make_sequence_features
    map_unknown_to_x=True)
  File "/usr/local/lib/python3.7/dist-packages/alphafold/common/residue_constants.py", line 580, in sequence_to_onehot
    raise ValueError(f'Invalid character in the sequence: {aa_type}')
ValueError: Invalid character in the sequence:
INFO:colabfold.batch:Done
Downloading structure predictions to directory Downloads/ChimeraX/AlphaFold
cp: cannot stat '*_relaxed_rank_1_model_*.pdb': No such file or directory
cp: cannot stat '*_unrelaxed_rank_1_model_*_scores.json': No such file or directory
-Dennis
Well, the message says there is an invalid character, so all I can say is to make sure that you are pasting plain text, and check to see that you have only standard amino acid codes and commas.
Without seeing exactly what you pasted, we can't tell which part caused the problem.
Elaine
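One way to see which part of the pasted text is the problem is to list every character that is neither a standard amino acid code nor a comma. The following is a rough sketch, not a ChimeraX or ColabFold function: the name find_invalid_characters is hypothetical, and treating only the 20 uppercase one-letter codes as valid is an assumption made here for illustration.

```python
import unicodedata

# Assumption: only the 20 standard one-letter codes, uppercase, count as valid.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def find_invalid_characters(pasted: str):
    """Return (position, character, Unicode name) for each disallowed character."""
    problems = []
    for position, char in enumerate(pasted):
        if char == "," or char in STANDARD_AMINO_ACIDS:
            continue
        # Characters without a Unicode name (e.g. newlines) get a placeholder.
        problems.append((position, char, unicodedata.name(char, "UNNAMED")))
    return problems


# Example with a zero-width space hidden after the comma.
for position, char, name in find_invalid_characters("ACCCC,\u200bALLPAAAA"):
    print(position, repr(char), name)
```

Printing repr(char) together with the Unicode name makes characters visible that would otherwise look like nothing at all in the input field.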
Also, if you are copying the sequence from a text editor, make sure the editor is displaying the sequence as plain text or that you are copying as plain text; otherwise, invisible formatting characters may be embedded in what you paste.
--Eric
Eric Pettersen UCSF Computer Graphics Lab
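If it is unclear whether the copied text is really plain, it can be cleaned before pasting. Below is a minimal sketch under stated assumptions: strip_invisible is a made-up helper name, and it assumes that whitespace plus the Unicode control (Cc) and format (Cf) categories cover the invisible characters in question.

```python
import unicodedata


def strip_invisible(pasted: str) -> str:
    """Drop whitespace and Unicode control/format characters from pasted text."""
    kept = []
    for char in pasted:
        if char.isspace():
            continue  # spaces, tabs, newlines, non-breaking spaces
        # Cc = control characters, Cf = format characters such as
        # zero-width spaces and byte-order marks.
        if unicodedata.category(char) in ("Cc", "Cf"):
            continue
        kept.append(char)
    return "".join(kept)


# Input with a byte-order mark, a zero-width space and a non-breaking space.
dirty = "\ufeffACCCC,\u200b ALLPAAAA\u00a0\n"
print(strip_invisible(dirty))
```

This only removes characters; it does not check that what remains is a valid sequence list.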
ChimeraX-users mailing list ChimeraX-users@cgl.ucsf.edu Manage subscription: https://www.rbvi.ucsf.edu/mailman/listinfo/chimerax-users
Hi Eric
1. It seems that I don't have sufficient GPU memory in colab.
I have the following error message:
INFO:colabfold.batch:Running model_1
/usr/local/lib/python3.7/dist-packages/haiku/_src/data_structures.py:195: FutureWarning: jax.tree_flatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_flatten instead.
  leaves, structure = jax.tree_flatten(mapping)
/usr/local/lib/python3.7/dist-packages/haiku/_src/data_structures.py:203: FutureWarning: jax.tree_unflatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_unflatten instead.
  self._mapping = jax.tree_unflatten(self._structure, self._leaves)
/usr/local/lib/python3.7/dist-packages/haiku/_src/stateful.py:457: FutureWarning: jax.tree_leaves is deprecated, and will be removed in a future release. Use jax.tree_util.tree_leaves instead.
  length = jax.tree_leaves(xs)[0].shape[0]
/usr/local/lib/python3.7/dist-packages/alphafold/model/geometry/struct_of_array.py:136: FutureWarning: jax.tree_flatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_flatten instead.
  flat_array_like, inner_treedef = jax.tree_flatten(array_like)
/usr/local/lib/python3.7/dist-packages/alphafold/model/geometry/struct_of_array.py:210: FutureWarning: jax.tree_unflatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_unflatten instead.
  inner_treedef, data[array_start:array_start + num_array])
/usr/local/lib/python3.7/dist-packages/alphafold/model/mapping.py:50: FutureWarning: jax.tree_flatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_flatten instead.
  values_tree_def = jax.tree_flatten(values)[1]
/usr/local/lib/python3.7/dist-packages/alphafold/model/mapping.py:54: FutureWarning: jax.tree_unflatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_unflatten instead.
  return jax.tree_unflatten(values_tree_def, flat_axes)
/usr/local/lib/python3.7/dist-packages/alphafold/model/mapping.py:129: FutureWarning: jax.tree_flatten is deprecated, and will be removed in a future release. Use jax.tree_util.tree_flatten instead.
  flat_sizes = jax.tree_flatten(in_sizes)[0]
ERROR:colabfold.batch:Could not predict af1819. Not Enough GPU memory? INTERNAL: cublas error
INFO:colabfold.batch:Done
Downloading structure predictions to directory Downloads/ChimeraX/AlphaFold
cp: cannot stat '*_relaxed_rank_1_model_*.pdb': No such file or directory
cp: cannot stat '*_unrelaxed_rank_1_model_*_scores.json': No such file or directory
2. Would it be possible to run this in jupyter? Or are there alternatives?
3. "Prediction may fail with total sequence length over 1000 residues due to limited GPU memory." - does this total sequence length mean the combined length of all the sequences in the list?
4. I seem to also have some pdbxx.m8 and afxxxx.csv files - may I know what these files are for?
Thanks! -Dennis
Hi Dennis,
AlphaFold on Google Colab will only work for a total sequence length of about 1000 residues because the older Google Colab GPUs have only 16 GB of memory. Modern GPUs for machine learning such as an Nvidia A40 (48 GB), A100 (40 or 80 GB) or A6000 (48 GB) can handle more than 3000 residues, but those GPUs cost $5000-$15000. A consumer Nvidia RTX 3090 (24 GB, $1500) can manage up to 2000 residues. Setting up AlphaFold on your own machine is a bit of work. I see from your output file name "af1848" that you tried to run a total sequence length of 1848, which will certainly fail on Google Colab. For examples of AlphaFold runs at different sequence lengths see
https://www.rbvi.ucsf.edu/chimerax/data/alphafold-jan2022/afspeed.html
Also, AlphaFold has a minimum length of 16 residues for each individual sequence, so your example ACCCC and ALLPAAAA will not work. This minimum exists because AlphaFold uses a sequence alignment made by searching a billion database sequences, and searching with a shorter sequence produces millions of hits that are not evolutionarily related.
The sequence length limits are described in the ChimeraX documentation
https://www.rbvi.ucsf.edu/chimerax/docs/user/commands/alphafold.html#caveats
Tom
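The two limits mentioned above can be checked before submitting a job. The sketch below is illustrative only: check_lengths and the constant names are hypothetical, the values 16 and 1000 are taken from this message, and it assumes that "total sequence length" means the summed length of every chain in the comma-separated list (with repeated copies counted each time), as the reading of "af1848" suggests.

```python
MIN_CHAIN_LENGTH = 16      # per-sequence minimum quoted in the message above
COLAB_TOTAL_LIMIT = 1000   # approximate Colab limit quoted in the message above


def check_lengths(sequence_list: str):
    """Return (total_length, warnings) for a comma-separated sequence list.

    Assumption: the total is the sum of all chain lengths in the list.
    """
    chains = [s.strip() for s in sequence_list.split(",") if s.strip()]
    total = sum(len(chain) for chain in chains)
    warnings = []
    for index, chain in enumerate(chains, start=1):
        if len(chain) < MIN_CHAIN_LENGTH:
            warnings.append(
                f"sequence {index} has {len(chain)} residues; "
                f"minimum is {MIN_CHAIN_LENGTH}"
            )
    if total > COLAB_TOTAL_LIMIT:
        warnings.append(
            f"total length {total} exceeds about {COLAB_TOTAL_LIMIT} residues"
        )
    return total, warnings


total, warnings = check_lengths("ACCCC,ALLPAAAA")
print(total)
for message in warnings:
    print(message)
```

For the example sequences from this thread, both chains fall below the per-sequence minimum, which matches the point made above that they will not work.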
participants (4)
- Dennis Poh
- Elaine Meng
- Eric Pettersen
- Tom Goddard