Equivalent Junctions

This repository contains the python scripts that can be used to extarct equivalent junctions for a genome. The codes are pretty general and can be used for any genome.

Software

Python 3.6.1 with Biopython 1.71 installed

Pyhton scripts

getJunctionsFromGTF.py:

determines equivalent junctions from a genome fasta file and an annotation file in the GTF format

usage:

python getJunctionsFromGTF.py -f genome_file.fa
                              -a annotation_file.gtf
                              -o output_file.txt

output file:
- a tab delimited file with an extracted equivalent junction sequence for each annotated exon-exon boundary:

      equiv_junc_sequence    gene_name    chromosome    strand    donor_exon_coordinate    acceptor_exon_coordinate

example equivalent junction in the output file:

                                 AG   LEPR   chr1    +    65420740    65425302

getJunctionsFromTxt.py:

determines equivalent junction sequences from a genome fasta file and annotatiaon file in the BED format containing chr, donor, and acceptor coordinates.
usage:

  python getJunctionsFromTxt.py -f genome_file.fa
                                -t annotation_file.txt
                                -o output_file.txt
                                -c choromosome_column (defult=0) 
                                -d donor_coordinate_column (default=1) 
                                -a acceptor_coordinate_column (default=2)
                                -s strand_column (default=3)

output file:
- a tab delimited file with the extracted equivalent junction sequence for each annotated exon-exon boundary:

      equiv_junc_sequence    gene_name    chromosome    strand    donor_exon_coordinate    acceptor_exon_coordinate

Note: Example genome fasta and annotation files that have been analyzed for equivalent junctions are provided in equiv_junc_genomes.

Computing the number of equivalent junctions:

After running the python script and obtaining the output_file containing the equivalent junction sequence for each exon-exon junction, the following command gives the total number of each equivalent junction sequence:

cut -f1,3,4,5,6 output_file.txt | sort -u |cut -f1 | sort |uniq -c| sort -k1nr > output_file_equivSeqCounts.txt

Name		Name	Last commit message	Last commit date
Latest commit History 71 Commits
README.md		README.md
equiv_junc_genomes.md		equiv_junc_genomes.md
getJunctionsFromGTF.py		getJunctionsFromGTF.py
getJunctionsFromTxt.py		getJunctionsFromTxt.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Equivalent Junctions

Software

Pyhton scripts

getJunctionsFromGTF.py:

getJunctionsFromTxt.py:

Computing the number of equivalent junctions:

About

Releases

Packages

Languages

roozbehdn/Equivalent-Junctions

Folders and files

Latest commit

History

Repository files navigation

Equivalent Junctions

Software

Pyhton scripts

getJunctionsFromGTF.py:

getJunctionsFromTxt.py:

Computing the number of equivalent junctions:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages