Peptide-shaker doesn't progress from "Importing PSMs from mgf.comet.pep.xml.gz" #496

Open
MigRibe opened this issue Nov 15, 2022 · 27 comments

@MigRibe

MigRibe commented Nov 15, 2022

I'm trying to use SearchGUI and PeptideShaker for peptide mass fingerprinting. Both seem intuitive and complete, but the processing stalls with some of the search algorithms in SearchGUI, and more often in PeptideShaker (see the attached screenshots). Could you help me, please?
[Attached screenshots: Capturar, Capturar2]

@hbarsnes hbarsnes self-assigned this Nov 15, 2022
@hbarsnes
Member

If PeptideShaker is stuck at that particular step for long, you can cancel the process and have a look at the PeptideShaker log file. You can find it from the PeptideShaker Welcome dialog > Settings & Help > Help > Bug Report. If you share that file here we can hopefully figure out what is going on.

@MigRibe
Author

MigRibe commented Nov 15, 2022 via email

@hbarsnes
Member

I don't see any errors in the log file related to the progress halting when importing the PSMs. Would it be possible for you to share the SearchGUI output file with me so that I can try to reproduce the issue on my end?

@MigRibe
Author

MigRibe commented Nov 15, 2022 via email

@MigRibe
Author

MigRibe commented Nov 16, 2022

searchgui_out.zip

@hbarsnes
Member

Thanks for sharing the SearchGUI zip file! However, in order to try loading the data I will also need the spectrum and FASTA files. Any chance you could share those as well?

@MigRibe
Author

MigRibe commented Nov 16, 2022 via email

@hbarsnes
Member

Thanks again! Any chance you can share the mgf file C:\Gui\MS\Glutenmgf.mgf instead of the raw file? (As I'm having some issues with the spectrum titles not matching yours if I do the conversion myself.)

@MigRibe
Author

MigRibe commented Nov 16, 2022

https://we.tl/t-Jc3VXihQPr
Many thanks

@hbarsnes
Member

Your data initially loaded fine on my end without any issues. However, when I downgraded my Java version to the same as the one you are using, I was finally able to reproduce the issue. I would therefore recommend that you upgrade your Java version to a more recent version and see if that solves the problem. You can find the most recent versions here: https://adoptium.net/temurin/releases. After upgrading, you can check and set the Java version used in PeptideShaker via the Welcome dialog > Settings & Help > Settings > Java Settings.

By the way, it seems that your FASTA file does not contain any decoy sequences. Without decoys, PeptideShaker cannot calculate the FDR for your dataset, and you cannot be sure which identifications to trust. Decoys can easily be added in SearchGUI when selecting a FASTA file: simply click Yes when asked whether you would like to add decoys. Note that you would then have to repeat the search before loading the data in PeptideShaker.
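
To illustrate what adding decoys means, here is a minimal Python sketch that appends a reversed copy of every protein to a set of FASTA records. It is not SearchGUI's own implementation: the `_REVERSED` accession suffix and the function names are assumptions made for this example. In practice it is simplest to let SearchGUI add the decoys, so that they are annotated the way PeptideShaker expects.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def add_reversed_decoys(records, tag="_REVERSED"):
    """Return the targets followed by one reversed decoy per target.

    The tag is appended to the first word of the header (the accession).
    The tag value is an assumption for this sketch.
    """
    decoys = []
    for header, seq in records:
        accession, _, rest = header.partition(" ")
        decoy_header = (accession + tag + " " + rest).strip()
        decoys.append((decoy_header, seq[::-1]))
    return records + decoys


example = ">P1 test protein\nMKTAYIAK\nQRQISFVK\n"
for header, seq in add_reversed_decoys(read_fasta(example)):
    print(">" + header)
    print(seq)
```

Each target is kept unchanged and followed by a decoy of identical length and composition, which is what allows the decoy hits to serve as an estimate of the false positives among the target hits.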

@MigRibe
Author

MigRibe commented Nov 17, 2022 via email

@MigRibe
Author

MigRibe commented Nov 18, 2022

Dear Harald,
I followed your advice and managed to get to the end of the PeptideShaker analysis using Comet. Many thanks!
As for MS Amanda, it seems to be stuck: after about 4 hours of waiting it has not moved past 33% in PeptideShaker (the file is the same). The log file is attached.
PeptideShaker 2.2.17 log.txt

@hbarsnes
Member

Great to hear that you can now load the Comet data!

As for the MS Amanda issue, I don't see any explanations for this in the log file. Would you be able to share the SearchGUI zip file again?

@MigRibe
Author

MigRibe commented Nov 21, 2022

Dear Harald
Many thanks for your reply.
https://we.tl/t-fgazVhaiAE

Also, I tried a search on another computer, but the step to open peptideshaker fails. And I can't open the file manually. I leave the log.
PeptideShaker.log

@hbarsnes
Member

Aha, so it's not PeptideShaker that fails when loading the data (as for the Comet output), but MS Amanda itself that fails when searching the data? Not much we can do there I'm afraid. I would recommend having a closer look at your search parameters to see if there is anything you can consider changing there? If that does not help, you can always try contacting the MS Amanda developers directly at https://groups.google.com/g/msamanda.

> Also, I tried a search on another computer, but the step to open peptideshaker fails. And I can't open the file manually. I leave the log.

I don't see anything in the error log that can explain this. What do you mean specifically by "the step to open peptideshaker fails" though?

@MigRibe
Author

MigRibe commented Nov 21, 2022 via email

@hbarsnes
Member

> Regarding the error on the other computer, everything seems to go well in SearchGUI, but then there is a failure and the ongoing process (SearchGUI + PeptideShaker) is interrupted.

You can always try running the two tools separately, i.e. first run only SearchGUI and then later try to load the data in PeptideShaker. This can perhaps provide you with a bit more information about what is happening.

@MigRibe
Author

MigRibe commented Nov 21, 2022 via email

@hbarsnes
Member

Seems like you shared the wrong mzML file? At least PeptideShaker is asking for ModA1_GlucoseWBA.mzML and you only shared ModC1_GlioxalWBA.mzML.

@MigRibe
Author

MigRibe commented Nov 22, 2022 via email

@hbarsnes
Member

I'm now able to process your data but it ends in the following message:

Tue Nov 22 15:20:38 CET 2022        Importing sequences from UniprotSprotTD_concatenated_target_decoy.fasta.
Tue Nov 22 15:20:50 CET 2022        Importing gene mappings.
Tue Nov 22 15:22:06 CET 2022        Establishing local database connection.
Tue Nov 22 15:22:06 CET 2022        Reading identification files.
Tue Nov 22 15:22:06 CET 2022        Parsing ModA1_GlucoseWBA.comet.pep.xml.gz.
Tue Nov 22 15:22:06 CET 2022        Checking spectra for ModA1_GlucoseWBA.comet.pep.xml.gz.
Tue Nov 22 15:22:06 CET 2022        Importing PSMs from ModA1_GlucoseWBA.comet.pep.xml.gz
Tue Nov 22 15:22:09 CET 2022        666 identified spectra (78.9%) did not present a valid peptide.
Tue Nov 22 15:22:09 CET 2022        11136 of the best scoring peptides were excluded by the import filters:
Tue Nov 22 15:22:09 CET 2022            - 72.1% peptide mapping to both target and decoy.
Tue Nov 22 15:22:09 CET 2022            - 27.9% peptide length less than 8 or greater than 30.
Tue Nov 22 15:22:09 CET 2022        Warning: More than 75% of the PSMs did not pass the import filters.
 Apparently your database contains a high degree of shared peptides between the target and decoy sequences. Please verify your database.
 Please verify that your peptide selection criteria are not too restrictive.

Tue Nov 22 15:22:09 CET 2022        No identification results.

I would recommend having a look at your search settings and your FASTA file.

As for why you are not able to get to the message above, could it be that you are running out of space or memory?
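
For the shared-peptide warning in the log above, a rough way to inspect the FASTA file is to digest the target and decoy sequences in silico and count the peptides that occur in both. The sketch below is only an approximation under stated assumptions: fully tryptic cleavage with no missed cleavages, decoys recognised by a `_REVERSED` tag in the header (an assumed tag), and the 8 to 30 residue length window taken from the import filter message. It counts distinct peptides rather than PSMs, so its numbers will not match the percentages PeptideShaker reports.

```python
import re


def tryptic_peptides(sequence, min_len=8, max_len=30):
    """Fully tryptic peptides (no missed cleavages) within the length window.

    Cleaves after K or R unless the next residue is P.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in pieces if min_len <= len(p) <= max_len}


def shared_fraction(records, decoy_tag="_REVERSED"):
    """Fraction of distinct peptides found in both target and decoy proteins.

    records is a list of (header, sequence) tuples. The decoy tag is an
    assumption for this sketch.
    """
    target, decoy = set(), set()
    for header, seq in records:
        bucket = decoy if decoy_tag in header else target
        bucket.update(tryptic_peptides(seq))
    all_peptides = target | decoy
    if not all_peptides:
        return 0.0
    return len(target & decoy) / len(all_peptides)


# Hypothetical toy records: T2 is palindromic, so its decoy shares a peptide.
records = [
    ("T1", "AAAAAAAAKCCCCCCCCR"),
    ("T1_REVERSED", "RCCCCCCCCKAAAAAAAA"),
    ("T2", "KAAAAAAAAK"),
    ("T2_REVERSED", "KAAAAAAAAK"),
]
print(shared_fraction(records))
```

A high fraction on a real database would point at the decoy generation or at duplicated entries, in line with the "Please verify your database" hint in the log.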

@MigRibe
Author

MigRibe commented Nov 24, 2022 via email

@hbarsnes
Member

> X! Tandem says the spectrum doesn't meet the necessary criteria, but I've changed several settings and it still doesn't work. I ran a test with the same spectrum using X! Tandem in Petunia and the analysis ran. Is the X! Tandem version the one from the latest release?

The version included in SearchGUI is X! TANDEM Vengeance (2015.12.15.2). The specific search engine settings can be found in the resources\temp\search_engines folder during the running of the search (but are removed when the search is completed). Perhaps a direct comparison of the parameters will indicate why there are differences in the results.

Also be sure to check out the advanced X! Tandem settings by clicking the cog wheel next to X! Tandem in the main SearchGUI dialog.

> When looking for modifications in peptides, if there is more than one modification on an amino acid, PeptideShaker fails in the visualization and in the coverage bar; with just one modification there is no problem.

Generally, more than one modification is not supported on the same residue unless it is at the termini. Can you share an example where you have two modifications on the same residue (that is not at the termini)? These should have been filtered out when loading the data in PeptideShaker. And can you also share the error log for when the front end fails?

> I also noticed some differences when using Comet in PeptideShaker and Petunia. PeptideShaker does not detect fragments with a charge greater than 3. Is there any reason why the results are different using the same search criteria?

You would again have to compare the specific search settings. But the default value for the maximum fragment charge for Comet is 3. You may however change this in the advanced Comet settings by clicking the cog wheel next to Comet in the main SearchGUI dialog.

> I am also interested in identifying cross-linked peptides. Is there any routine you would recommend in SearchGUI + PeptideShaker? Is there a possibility for PeptideShaker to work with Kojak?

The identification of cross-linked peptides is not supported by the search engines included in SearchGUI, or at least it is not implemented in SearchGUI at the moment. I have not looked into this in detail, but as far as I remember only MetaMorpheus has support for this (https://github.com/smith-chem-wisc/MetaMorpheus/wiki/Crosslink-Search-Task), however not from inside SearchGUI (nor PeptideShaker).

> Finally, I would like to ask whether a workstation would substantially reduce the search time, or rather what requirements you would recommend. We have some searches taking more than 2 hours (10th-gen i7, 16 GB RAM, SSD).

This depends on what you mean by search. If you are talking about the processing in PeptideShaker, the main limiting factor is the amount of memory. Note that you will have to set this manually in the PeptideShaker Welcome dialog > Settings & Help > Settings > Java Settings > Memory. Increasing the number of CPUs can also speed up the processing as then you can utilize more parallel processing.

When it comes to the searches in SearchGUI we have much less control over how the individual search engines utilize the resources provided, but generally the same reasoning applies, i.e. the more memory and CPUs the better.

@MigRibe
Author

MigRibe commented Dec 9, 2022 via email

@hbarsnes
Member

> The potential modifications found by the program can actually happen, as we are using experimental models to check chemical derivatizations. So what happens is that the same peptide (in terms of identity) can have different modifications on the same amino acid, assigned from different scans. When this happens, the location on the protein is the same (bottom bar) and it gives an error when you click there. At the moment I don't have access to the files. If it's important I'll try to upload them.

Aha, so what you are talking about are different PSMs for the same peptide having different PTMs on the same residue? That can of course happen. I thought you were talking about more than one PTM on the same residue for the same PSM. I guess we just never came across any such cases ourselves. Would be great if you could point me to one such example. Maybe you can also share the error log for when this happens?

> How does PeptideShaker merge the results of different search algorithms?

It basically compares all of the matches for the same spectrum across the search engines and picks the one with the best score.
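
As a rough illustration of that idea (not PeptideShaker's actual code), the sketch below groups matches by spectrum and keeps the best one. It assumes that each engine's score has already been converted to a comparable scale where lower is better, such as an error probability. The tuple layout, function name, and example values are hypothetical.

```python
def best_match_per_spectrum(matches):
    """Keep the best match for each spectrum across all search engines.

    matches is an iterable of (spectrum_title, engine, peptide, score)
    tuples, where a lower score is assumed to be better.
    """
    best = {}
    for title, engine, peptide, score in matches:
        current = best.get(title)
        if current is None or score < current[2]:
            best[title] = (engine, peptide, score)
    return best


# Hypothetical matches for two spectra from two engines.
matches = [
    ("scan=101", "Comet", "PEPTIDEK", 0.02),
    ("scan=101", "X! Tandem", "PEPTLDEK", 0.10),
    ("scan=102", "X! Tandem", "SAMPLER", 0.01),
]
for title, (engine, peptide, score) in best_match_per_spectrum(matches).items():
    print(title, engine, peptide, score)
```

The important assumption is the common score scale: raw scores from different engines are not directly comparable, so they have to be put on the same footing before the best match can be picked.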

> Is it possible to use a random database (scrambled decoys) instead of reversed decoys for the FDR calculation?

I think you can create your decoys any way you want as long as they are properly annotated in the FASTA headers. I'm not sure if we have ever done any testing with scrambled decoys though. In general there is however no real difference between reversed and scrambled decoys with regards to the results. Hence I would not bother with this unless you have very good reasons to do so.
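
For reference, a scrambled decoy is simply a shuffled copy of the target sequence. A minimal Python sketch follows; the function name and the fixed seed are choices made for this example only, and the resulting entries would still need decoy annotation in their FASTA headers as mentioned above.

```python
import random


def scrambled_decoy(sequence, seed=0):
    """Return a shuffled copy of the sequence with the same composition.

    A fixed seed makes the decoy reproducible between runs.
    """
    residues = list(sequence)
    random.Random(seed).shuffle(residues)
    return "".join(residues)


target = "MKTAYIAKQRQISFVK"
decoy = scrambled_decoy(target)
print(target)
print(decoy)
```

Unlike reversal, shuffling does not preserve the spacing between cleavage sites, so the decoy peptide length distribution can differ from that of the targets.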

> How does PeptideShaker use data from, for example, two replicates (i.e. adding two samples at once)? Does it merge all the spectra?

If results from more than one spectrum file are loaded at the same time, the results are indeed merged into one common result. If you want to look at the two replicates separately you will therefore have to load them individually. You can however check out the Fractions tab, as it gives you some idea of which peptides were detected in each spectrum file, but this is generally intended for loading multiple fractions from the same sample and not for comparing different replicates or samples.

@MigRibe
Author

MigRibe commented Aug 2, 2023 via email

@hbarsnes
Member

hbarsnes commented Aug 7, 2023

> After not using the program for some time I tried it again, but PeptideShaker fails to open. I am sending the report, and I was able to locate the .psdb file. Can you please help me?

Can you please also share the PeptideShaker log file: C:\Users\Miguel Ribeiro\Desktop\PeptideShaker-2.2.25\PeptideShaker-2.2.25\resources\PeptideShaker.log?

> I would also like to ask whether it is possible to search for adducts, but for a polymer-type adduct. That is, let's imagine that I have a monomer linked to an amino acid: is it possible to search for different species of [x]n polymers?

I guess you could create your own polymer adduct modifications and search for those. However, you would have to create one modification per chain length n and include each in your search. So I guess it depends on how high n would be. If n is, for example, 2 or 3 you should be ok, but if n is much higher you probably need a custom search engine. I do not have any experience with such searches though. Perhaps it would be worth asking some of the search engine developers directly? See for example: https://groups.google.com/g/comet-ms.
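
To get a feel for how many custom modifications this means, the mass shifts for each chain length can be tabulated first. The Python sketch below assumes that every additional monomer adds the same mass, with no end-group or condensation-loss correction. The monomer mass of 100.0 Da is a made-up placeholder, not a real adduct.

```python
def polymer_adduct_masses(monomer_mass, max_n):
    """Mass shift for each chain length n = 1..max_n, as {n: n * monomer_mass}.

    Assumes each monomer contributes the same mass shift.
    """
    if max_n < 1:
        raise ValueError("max_n must be at least 1")
    return {n: round(n * monomer_mass, 6) for n in range(1, max_n + 1)}


# Hypothetical monomer mass shift in Da, used only for illustration.
for n, mass in polymer_adduct_masses(100.0, 3).items():
    print(f"adduct x{n}: +{mass:.4f} Da")
```

Each entry would then be defined as a separate variable modification, so the search space grows with every chain length that is included.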
