Max bin don't find enough marker genes. #29

Closed
SilasK opened this issue Oct 24, 2017 · 8 comments

@SilasK
Member

SilasK commented Oct 24, 2017

Several times now, maxbin has stopped with the following error:

MaxBin 2.2.1
Input contig: F35/F35_contigs.fasta
Located abundance file [F35/genomic_bins/F35_contig_coverage.tsv]
out header: F35/genomic_bins/F35
Min contig length: 200
Thread: 6
Probability threshold: 0.9
Max iteration: 50
Searching against 107 marker genes to find starting seed contigs for [F35/F35_contigs.fasta]...
Try harder to dig out marker genes from contigs.
Marker gene search reveals that the dataset cannot be binned (the medium of marker gene number <= 1). Program stop.

But my contig statistics don't look that bad, so I don't know what the problem is.

| n_scaffolds | scaf_bp  | scaf_N50 | scaf_L50 | scaf_N90 | scaf_L90 | scaf_max |
|-------------|----------|----------|----------|----------|----------|----------|
| 12470       | 65564177 | 1130     | 13095    | 7423     | 1725     | 165795   |

The coverage file looks like this:

F35_0   18.9348
F35_1   15.5894
F35_2   18.8915
F35_3   20.1290
F35_4   28.7693
F35_5   33.6804
F35_6   24.2962
F35_7   19.3721
F35_8   20.3123
F35_9   17.1395
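
One quick check before blaming the binner is to confirm that the abundance file parses cleanly and holds plausible values. The following is a minimal sketch, assuming the file has two whitespace-separated columns (contig name, mean coverage) as in the excerpt above; `read_coverage` is a hypothetical helper, not part of MaxBin or atlas.

```python
import io
import statistics

def read_coverage(handle):
    """Parse 'contig<whitespace>coverage' lines into a dict of floats."""
    coverage = {}
    for line in handle:
        if not line.strip():
            continue  # ignore blank lines
        # Split on the last run of whitespace: name on the left, value on the right.
        name, value = line.rsplit(None, 1)
        coverage[name] = float(value)
    return coverage

# First rows of the file quoted above, used here as sample input.
example = io.StringIO("F35_0\t18.9348\nF35_1\t15.5894\nF35_2\t18.8915\n")
cov = read_coverage(example)
print(len(cov), "contigs; median coverage:", statistics.median(cov.values()))
```

For a real run, the number of parsed entries can be compared against the number of contigs in the FASTA file.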
@SilasK SilasK added the bug label Oct 25, 2017
@brwnj
Member

brwnj commented Nov 7, 2017

I'm running into the same thing. We could likely add support for alternative binning codes as you previously suggested, like CONCOCT, GroopM, and/or MetaBAT.

@vindarbot

I had the same problem. After deleting the files from a previous analysis that used the same output (rm sample_0/maxbin/MaxBin.out*), run_Maxbin.pl finally worked properly.
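
The cleanup described above can also be scripted. This is only a sketch of the same `rm prefix*` idea; `remove_stale_outputs` is a hypothetical helper, and the file names in the demo are made up so that nothing real is deleted.

```python
import glob
import os
import tempfile

def remove_stale_outputs(prefix):
    """Delete regular files whose path starts with prefix; return the removed paths."""
    removed = []
    # glob.escape keeps special characters in the prefix from acting as wildcards.
    for path in sorted(glob.glob(glob.escape(prefix) + "*")):
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed

# Demo in a throwaway directory with placeholder file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("MaxBin.out.log", "MaxBin.out.001.fasta", "contigs.fasta"):
        open(os.path.join(tmp, name), "w").close()
    removed = remove_stale_outputs(os.path.join(tmp, "MaxBin.out"))
    remaining = sorted(os.listdir(tmp))
    print(len(removed), "removed; remaining:", remaining)
```

In the case above, the prefix would be `sample_0/maxbin/MaxBin.out`.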

@CMagnoBR

CMagnoBR commented Mar 24, 2023

Hi everyone! Sorry for reopening this issue.
I ran into the same problem using Atlas 2.14.3 with Snakemake 7.18.2 and SE reads. It seems to be a problem in the maxbin binning step. Below is the maxbin.log:

MaxBin 2.2.7
Input contig: SRR19257246.trim.woutplant/SRR19257246.trim.woutplant_contigs.fasta
Located abundance file [SRR19257246.trim.woutplant/binning/coverage/SRR19257246.trim.woutplant_coverage.txt]
out header: SRR19257246.trim.woutplant/binning/maxbin/intermediate_files/SRR19257246.trim.woutplant
Min contig length: 1000
Thread: 8
Probability threshold: 0.9
Max iteration: 50
Searching against 107 marker genes to find starting seed contigs for [SRR19257246.trim.woutplant/SRR19257246.trim.woutplant_contigs.fasta]...
Running FragGeneScan....
Running HMMER hmmsearch....
Try harder to dig out marker genes from contigs.
Marker gene search reveals that the dataset cannot be binned (the medium of marker gene number <= 1). Program stop.

Please, help me!

@SilasK
Member Author

SilasK commented Mar 24, 2023

You have the following options:

  1. Use another final_binner, e.g. vamb or metabat.
  2. Remove the sample from the sample.tsv.
  • I should add an option to exclude a sample from binning only, while keeping it for quantification.
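
For option 2, the sample table can be filtered with a few lines of Python. This is a sketch under stated assumptions: the table is tab-separated with a header row and the sample name in the first column. `drop_samples` and the column names in the demo are hypothetical.

```python
import csv
import io

def drop_samples(in_handle, out_handle, to_drop):
    """Copy a tab-separated sample table, skipping rows whose first column is in to_drop."""
    reader = csv.reader(in_handle, delimiter="\t")
    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
    writer.writerow(next(reader))  # keep the header row unchanged
    kept = 0
    for row in reader:
        if not row or row[0] in to_drop:
            continue  # skip blank lines and unwanted samples
        writer.writerow(row)
        kept += 1
    return kept

# Made-up table; real column names will differ.
table = "sample\treads\nF35\tF35.fastq.gz\nF36\tF36.fastq.gz\n"
out = io.StringIO()
kept = drop_samples(io.StringIO(table), out, {"F35"})
print(kept, "sample(s) kept")
print(out.getvalue(), end="")
```

Writing to a new file rather than editing in place keeps the original table available for quantification runs.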

@SilasK SilasK reopened this Mar 24, 2023
@mladen5000
Contributor

Still having this issue. Just a thought: would it be possible to touch a file upon completion of maxbin (or Semibin, vamb, etc.)?

> You have the following options:
>
> 1. Use another `final_binner`, e.g. vamb or metabat.
> 2. Remove the sample from the sample.tsv.
>
> * [ ] I should add the option to exclude a sample only for the binning but not for quantification
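
The "touch a file upon completion" idea could look roughly like the sketch below: run the binner, then always write a marker file that a workflow can depend on, whether or not any bins were produced. `run_and_mark`, the marker path, and the stand-in command are all hypothetical; this is not how atlas currently handles the step.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_and_mark(command, marker):
    """Run command, then write a marker file recording its exit code."""
    result = subprocess.run(command)
    marker = Path(marker)
    marker.parent.mkdir(parents=True, exist_ok=True)
    # The marker exists after every run; its content says how the run ended.
    marker.write_text(f"exit_code={result.returncode}\n")
    return result.returncode

# Stand-in for a binner call that stops early; replace with the real command.
fake_binner = [sys.executable, "-c", "import sys; sys.exit(1)"]
with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "binning" / "sample.done"
    code = run_and_mark(fake_binner, marker)
    marker_text = marker.read_text()
print(code, marker_text.strip())
```

Recording the exit code in the marker, instead of leaving it empty, lets a later step tell a sample that could not be binned apart from one that was binned successfully.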

@SilasK
Member Author

SilasK commented May 15, 2023

Let's discuss this in #651.

@SilasK SilasK closed this as completed May 15, 2023
@LLansing
Contributor

I get exactly the same maxbin log as CMagnoBR for 4 of my 14 samples.
I understand that choosing a different binner is an option, but @SilasK could you please provide some context as to what the output means and what it says about my samples?

@SilasK
Member Author

SilasK commented Jul 25, 2023

As I understand it, maxbin estimates how many genomes can be found in a sample in order to set its binning parameters.
If it doesn't find enough marker genes, it reports that the dataset cannot be binned. This probably means that your sample was sequenced too shallowly or is not well assembled.
Other binners might still recover bins, but probably not many high-quality (hq) ones anyway.

Unfortunately, there is currently no easy way in atlas to ignore such a sample. We are working on it in #651.

For now, the best options are to drop very bad samples and/or choose another binner, e.g. vamb.
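
To make the log message ("the medium of marker gene number <= 1") more concrete, here is a sketch of a median-based check. It is based only on the wording of that message, not on MaxBin's source code. Counting markers with no hits as zero is an assumption, and both function names are hypothetical.

```python
import statistics

def median_marker_count(hit_counts, n_markers=107):
    """Median number of hits per marker gene; markers without hits count as 0."""
    counts = list(hit_counts.values())
    counts += [0] * (n_markers - len(counts))  # pad markers that were never found
    return statistics.median(counts)

def can_be_binned(hit_counts, n_markers=107):
    """Mirror the log message: stop when the median marker count is <= 1."""
    return median_marker_count(hit_counts, n_markers) > 1

# Made-up examples: a shallow sample and a sample with about three genomes.
shallow = {f"marker_{i}": 1 for i in range(40)}
deep = {f"marker_{i}": 3 for i in range(100)}
print(median_marker_count(shallow), can_be_binned(shallow))
print(median_marker_count(deep), can_be_binned(deep))
```

Under this reading, a sample where most marker genes are found at most once looks like it holds no more than one incomplete genome, which fits the explanation of shallow or poorly assembled data.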
