Tool for the Quality Control of Long-Read Defined Transcriptomes

dnwissel/SQANTI3_pip

SQANTI3

SQANTI3 is the newest version of the SQANTI tool that merges features from SQANTI and SQANTI2, together with new additions. SQANTI3 will continue as an integrated development aiming to provide the best characterization for your new long read-defined transcriptome.

SQANTI3 is the first module of the Functional IsoTranscriptomics (FIT) framework, which also includes IsoAnnot and tappAS.

Latest updates

The latest SQANTI3 release (19/01/2023) is version 5.1.2. See our wiki for installation instructions.

WARNING: v5.0 was a major release of the SQANTI3 software. SQANTI3 versions >= 5.0 are not backward compatible with previous releases (v4.3 and earlier) or their output. Users who wish to apply any of the new functionalities in v5.0 to output files from older versions will therefore need to re-run SQANTI3 QC. See below for a full list of changes implemented in SQANTI3 v5.0.
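
As a rough illustration of the compatibility rule above, the sketch below compares a version string against the 5.0 cutoff. The helpers `parse_version` and `needs_qc_rerun` are hypothetical names written for this example and are not part of SQANTI3. The sketch also assumes you already know which SQANTI3 version produced your output files.

```python
def parse_version(text):
    """Convert a string such as 'v5.1.2' or '4.3' into a 3-part tuple of ints."""
    parts = [int(part) for part in text.strip().lstrip("vV").split(".")]
    # Pad with zeros so that '5' and '5.0' both compare as (5, 0, 0).
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def needs_qc_rerun(output_version, breaking_version="5.0"):
    """Return True if output from `output_version` predates the v5.0 break.

    Hypothetical helper: per the warning above, output from v4.3 and
    earlier must be regenerated with SQANTI3 QC before using v5.0+ features.
    """
    return parse_version(output_version) < parse_version(breaking_version)


if __name__ == "__main__":
    for version in ("4.3", "v5.0", "5.1.2"):
        print(version, "re-run QC" if needs_qc_rerun(version) else "compatible")
```

Tuple comparison is used here rather than string comparison so that multi-digit components (for example a hypothetical 4.10) are ordered numerically.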


Patch to SQANTI3 v5.1.2 [LATEST]:

  • Speed improvements in rescue when finding best match IDs for rescue candidates.
  • All reference transcripts associated with FSM/ISM found during QC are now output if the --isoform_hits flag is supplied.
  • Fixed bug in ML filter leading to errors when input data had no mono-exonic transcripts.
  • Fixed saturation curve bug in pigeon report.
  • Fixed conda environment installation of bcbio-gff.

Patch to SQANTI3 v5.1.1:

  • Adapted for pigeon compatibility.
  • Bug fixes in STM function.
  • Minor fixes in ML filter documentation and output file handling.
  • Fixed bug leading to incorrect classification of genic intron transcripts.

New features in SQANTI3 v5.1:

Major changes:

  • Implemented new rescue strategy to recover transcriptome diversity lost after filtering (see details at the SQ rescue wiki).
  • Updated conda environment to include rescue dependencies. We recommend creating the environment again in order for SQANTI3 to run without error.
  • Fixed behavior of mono-exon transcripts during ML filter:
    • FSM now undergo intra-priming evaluation if they are mono-exons.
    • Corrected ML filter output when --force_multi_exon option is supplied: mono-exon transcripts will now be labeled as Artifacts.
  • Fixed reasons file output by rules filter: the table now includes correct filtering reasons for mono-exon transcripts.
  • Added an option to rules filter to control for mono-exon transcripts (previously available in ML filter).
  • Modified the output of SQANTI3 QC to incorporate the creation of a complete params.txt file, i.e. including all arguments and the full paths of all supplied files.

Minor fixes/enhancements:

  • Fixed output path for IsoAnnotLite GFF3 that prevented writing the file to the correct output directory when -gff3 option was not used.
  • Set temporary file dir for HTML report creation (fixes Singularity container error).

New features in SQANTI3 v5.0:

Major changes:

  • Implemented new machine learning-based filter.
  • Updated rules filter: users can now define their own set of rules using a JSON file. By default, the rules filter applies the same set of rules that were implemented in the old sqanti3_RulesFilter.py script.
  • The sqanti3_RulesFilter.py script is now deprecated and has been replaced by sqanti3_filter.py, which works as a wrapper for both filters (see details in the documentation).
  • IsoAnnotLite updated to version 2.7.3.
  • Substantially modified the SQANTI3 directory structure: the utilities folder is now divided into subfolders that group the scripts by function.
  • Added a column in the classification file to indicate whether a polyA motif was found, which adds to the existing column detailing the detected motif (details here).
  • Changed CAGE argument and CAGE/polyA columns to capital letters (for consistency across columns and arguments).
  • The example folder now includes sample commands and output files for SQANTI3 QC, rules filter and machine learning filter.
  • Added new supported transcript model (STM) plots to the SQANTI3 QC report.

Minor fixes/enhancements:

  • Included cython (cDNA_cupcake dependency) as a dependency in the SQANTI3 conda environment.
  • pip is now installed in the conda environment.
  • When sqanti3_qc.py output files are supplied, the new sqanti3_filter.py filters them using the filter result (rules or ML). This was not previously done by sqanti3_RulesFilter.py.
  • Antisense vs intergenic bug: fixed inconsistencies in classification of isoforms across the two categories.
  • Fixed deprecation warnings in calculation of ratioTSS.
  • Minor report updates.

Documentation

For detailed documentation, please visit the SQANTI3 wiki.

Wiki contents:

Please note that we are currently updating and expanding the wiki to provide as much information as possible and enhance the SQANTI3 user experience. Pages under construction (or where information is still missing) will be indicated where appropriate. Thank you for your patience!

How to cite SQANTI3

The SQANTI3 paper is currently in preparation. In the meantime, when using SQANTI3 in your research, please cite the original SQANTI paper as well as this repository:

  • Tardaguila M, de la Fuente L, Marti C, et al. SQANTI: extensive characterization of long-read transcript sequences for quality control in full-length transcriptome identification and quantification. Genome Res, 2018. 28(3):396-411. doi:10.1101/gr.222976.117
