Bowtie
- Author: Dreycey Albin
- Date: 05/10/2019
- Updates: 05/12/2019 -- finished
- Purpose: mapping short sequences (reads) onto larger reference sequences
- documentation (website): URL TO WEBSITE
- documentation (manual): http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml
- documentation (publication): BOWTIE 2
INSTRUCTIONS ON HOW TO INSTALL USING TERMINAL
- installation for MacOS
conda install -c conda-forge -c bioconda bowtie2
OR
docker
OR
wget -O bowtie2-2.3.5.1-macos-x86_64.zip https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.3.5.1/bowtie2-2.3.5.1-macos-x86_64.zip/download
- installation for Linux
conda install -c conda-forge -c bioconda bowtie2
OR
docker;
OR
wget -O bowtie2-2.3.5.1-linux-x86_64.zip https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.3.5.1/bowtie2-2.3.5.1-linux-x86_64.zip/download
- installation from source
wget -O bowtie2-2.3.5.1-source.zip https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.3.5.1/bowtie2-2.3.5.1-source.zip/download
OR, for a precompiled Linux binary with SRA support:
wget -O bowtie2-2.3.5.1-sra-linux-x86_64.zip https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.3.5.1/bowtie2-2.3.5.1-sra-linux-x86_64.zip/download
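Whichever install route is used, it helps to confirm that the executables are actually on the PATH before building an index. The following is a minimal sketch; `require_tool` is a hypothetical helper name made up for this guide, not something shipped with Bowtie 2, and the list of tools simply mirrors the commands used further down this page.

```shell
# require_tool NAME
# Hypothetical helper: reports whether a command is available on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s found at %s\n' "$1" "$(command -v "$1")"
    else
        printf '%s not found on PATH\n' "$1" >&2
        return 1
    fi
}

# Check the executables used in this guide (missing tools are reported, not fatal).
for tool in bowtie2-build bowtie2 samtools; do
    require_tool "$tool" || true
done

# Print the installed Bowtie 2 version, if there is one.
if command -v bowtie2 >/dev/null 2>&1; then
    bowtie2 --version | head -n 1
fi
```

If the zip download was used instead of conda, the unpacked directory has to be added to the PATH (or the tools called by full path) before this check will succeed.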
- building the index
bowtie2-build [options]* <reference_in> <bt2_base>
- aligning the reads
bowtie2 [options]* -x <bt2-idx> {-1 <m1> -2 <m2> | -U <r> | --interleaved <i> | --sra-acc <acc> | -b <bam>} -S [<sam>]
PUT FILES HERE
GENERAL OUTPUT FROM PROGRAM (files and commands)
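- Bowtie 2 supports two alignment modes
  - end-to-end (the default): every base of the read takes part in the alignment
  - local (`--local`): bases at either end of the read may be soft-clipped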
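- Scoring techniques
  - End-to-end alignment score: the best possible score is 0; mismatches and gaps subtract penalties, so scores are zero or negative
  - Local alignment score: each matching base adds a bonus (`--ma`, 2 by default), so good alignments have positive scores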
- Each scoring method has a minimum-score threshold; alignments scoring below it are not considered valid
  - the minimum score can be changed with the `--score-min` flag
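To make the thresholds concrete, here is a small sketch that evaluates the default `--score-min` functions described in the Bowtie 2 manual: `L,-0.6,-0.6` (linear, -0.6 + -0.6 * read length) for end-to-end mode and `G,20,8` (20 + 8 * ln(read length)) for local mode. `min_score` is a hypothetical helper name, and the defaults are taken from the 2.3.x manual, so check them against the version you have installed.

```shell
# min_score MODE READ_LENGTH
# Hypothetical helper: prints the default minimum alignment score for a read.
min_score() {
    awk -v mode="$1" -v len="$2" 'BEGIN {
        if (mode == "end-to-end")
            printf "%.1f\n", -0.6 + -0.6 * len      # default L,-0.6,-0.6
        else if (mode == "local")
            printf "%.1f\n", 20 + 8 * log(len)      # default G,20,8 (natural log)
        else {
            print "unknown mode: " mode > "/dev/stderr"
            exit 1
        }
    }'
}

min_score end-to-end 50   # prints -30.6
min_score local 50        # prints 51.3
```

A stricter threshold can then be passed to the aligner directly, for example `--score-min L,0,-0.3` in end-to-end mode.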
- Mapping quality metric (MAPQ)
  - can be thought of as a measure of uniqueness
  - Q = -10 log10(p), where p is the probability that the alignment does not correspond to the read's true point of origin
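The formula is easier to read with numbers plugged in. This sketch converts in both directions with `awk`; the function names `mapq_from_p` and `p_from_mapq` are hypothetical, chosen only for this example.

```shell
# mapq_from_p P: mapping quality for an error probability P (rounded to an integer)
mapq_from_p() {
    awk -v p="$1" 'BEGIN { printf "%.0f\n", -10 * log(p) / log(10) }'
}

# p_from_mapq Q: error probability implied by mapping quality Q
p_from_mapq() {
    awk -v q="$1" 'BEGIN { printf "%g\n", 10 ^ (-q / 10) }'
}

mapq_from_p 0.01   # prints 20
p_from_mapq 30     # prints 0.001
```

So a MAPQ of 20 corresponds to a 1 in 100 chance that the read is placed wrongly, and 30 to 1 in 1000.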
- Concordance describes whether the two mates of a paired-end read align consistently with each other
  - the mates must align within the expected distance of each other and in the expected orientation
  - pairs that align discordantly can be used to identify structural rearrangements
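One rough way to see how many reads ended up in concordant pairs is to look at the SAM FLAG column, where bit 0x2 marks a record as part of a properly aligned pair. The sketch below counts those records with plain `awk`; `count_proper_pairs` is a hypothetical helper, and it assumes a standard tab-separated SAM file such as the ones written with `-S` in this guide. The expected distance itself is controlled by the `-I`/`--minins` and `-X`/`--maxins` options of `bowtie2`.

```shell
# count_proper_pairs FILE.sam
# Hypothetical helper: counts SAM records whose FLAG has bit 0x2 (proper pair) set.
count_proper_pairs() {
    awk -F '\t' '
        /^@/ { next }                          # skip header lines
        { total++ }
        int($2 / 2) % 2 == 1 { proper++ }      # FLAG bit 0x2 is set
        END { printf "%d of %d records flagged as properly paired\n", proper, total }
    ' "$1"
}
```

Usage would be `count_proper_pairs albinout.sam`. Note that this counts records (one per mate), not pairs.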
- Build the index for the Illumina example
bowtie2-build -f GCF_000006765.1_ASM676v1_genomic.fna illumina_output
- Build the index
bowtie2-build -f GCF_000005845.2_ASM584v2_genomic.fna output
- index files
output.1.bt2
output.2.bt2
output.3.bt2
output.4.bt2
output.rev.1.bt2
output.rev.2.bt2
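Since `bowtie2 -x` takes only the basename, a quick check that all six files exist for that basename can save a confusing error later. This is a sketch under the assumption of a small (`.bt2`) index like the bacterial genomes used here; very large genomes get `.bt2l` files instead, which this does not handle. `check_bt2_index` is a hypothetical helper name.

```shell
# check_bt2_index BASENAME
# Hypothetical helper: verifies that the six small-index files exist and are non-empty.
check_bt2_index() {
    missing=0
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -s "$1.$suffix.bt2" ]; then
            echo "missing or empty: $1.$suffix.bt2" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the index built above, `check_bt2_index output && echo "index looks complete"` would be the call.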
- Run Bowtie2
- build the index
bowtie2-build -f GCF_000005845.2_ASM584v2_genomic.fna illumina_output_2
- next, test out bowtie2 (`-p` sets the number of threads; adjust it to the cores available on your machine)
bowtie2 -p 70 -x illumina_output_2 -1 paired_dat1.fq -2 paired_dat2.fq -S albinout.sam
- index files
illumina_output_2.1.bt2
illumina_output_2.3.bt2
illumina_output_2.rev.1.bt2
illumina_output_2.2.bt2
illumina_output_2.4.bt2
illumina_output_2.rev.2.bt2
- Main output
albinout.sam
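Bowtie 2 prints its own alignment summary to stderr, but it can be handy to recompute a rough figure from the SAM file afterwards. The sketch below uses FLAG bit 0x4 (read unaligned) to split records into aligned and unaligned; `sam_alignment_summary` is a hypothetical helper, and the numbers are per SAM record, so they will not match Bowtie 2's per-pair summary exactly.

```shell
# sam_alignment_summary FILE.sam
# Hypothetical helper: counts aligned and unaligned records using FLAG bit 0x4.
sam_alignment_summary() {
    awk -F '\t' '
        /^@/ { next }                                   # skip header lines
        int($2 / 4) % 2 == 1 { unaligned++; next }      # bit 0x4: read did not align
        { aligned++ }
        END {
            total = aligned + unaligned
            rate = (total > 0) ? 100 * aligned / total : 0
            printf "aligned: %d, unaligned: %d, rate: %.2f%%\n", aligned, unaligned, rate
        }
    ' "$1"
}
```

Usage would be `sam_alignment_summary albinout.sam`.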
- obtain reference genome
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/006/765/GCF_000006765.1_ASM676v1/GCF_000006765.1_ASM676v1_genomic.fna.gz;
gzip -d GCF_000006765.1_ASM676v1_genomic.fna.gz
- get illumina reads
wget http://resources.qiagenbioinformatics.com/testdata/paeruginosa-reads.zip;
- build index
bowtie2-build -f GCF_000006765.1_ASM676v1_genomic.fna illumina_output_test
- test out read mapping
bowtie2 -p 70 -x illumina_output_test -1 paeruginosa-reads/SRR396636.sra_1.fastq -2 paeruginosa-reads/SRR396636.sra_2.fastq -S illumina_test_out.sam
- convert sam to bam
./../samtools/samtools view -S -b illumina_test_out.sam > illumina.bam
- view the output
samtools view illumina.bam | head
- sort the BAM file
./../samtools/samtools sort illumina.bam -o illumina.sorted.bam
- view the output
samtools view illumina.sorted.bam | head
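The convert and sort steps can be folded into one helper. This is only a sketch: `sam_to_sorted_bam` is a hypothetical name, it assumes a reasonably recent samtools (1.x), where `samtools sort` accepts SAM input directly, and it reads the samtools location from a `SAMTOOLS` variable because this guide calls the binary by relative path. It also adds an index step, which most downstream viewers expect.

```shell
# sam_to_sorted_bam IN.sam [OUT_PREFIX]
# Hypothetical helper: writes OUT_PREFIX.sorted.bam and its index.
# Set SAMTOOLS to the path of the samtools binary if it is not on PATH.
sam_to_sorted_bam() {
    tool=${SAMTOOLS:-samtools}
    in_sam=$1
    prefix=${2:-${in_sam%.sam}}
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "samtools not found: $tool" >&2
        return 1
    fi
    if [ ! -f "$in_sam" ]; then
        echo "no such file: $in_sam" >&2
        return 1
    fi
    # samtools sort reads SAM directly, so no separate "view -b" step is needed
    "$tool" sort -o "$prefix.sorted.bam" "$in_sam" || return 1
    # the index is written next to the BAM as $prefix.sorted.bam.bai
    "$tool" index "$prefix.sorted.bam"
}
```

With the layout used above, the call would look like `SAMTOOLS=./../samtools/samtools sam_to_sorted_bam illumina_test_out.sam illumina`.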
- index files
illumina_output_test.1.bt2
illumina_output_test.3.bt2
illumina_output_test.rev.1.bt2
illumina_output_test.2.bt2
illumina_output_test.4.bt2
illumina_output_test.rev.2.bt2
- main output
illumina.bam
illumina.sorted.bam
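To align paired-end reads included with Bowtie 2, first build the example index (`$BT2_HOME/bowtie2-build $BT2_HOME/example/reference/lambda_virus.fa lambda_virus`), then stay in the same directory and run: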
$BT2_HOME/bowtie2 -x lambda_virus -1 $BT2_HOME/example/reads/reads_1.fq -2 $BT2_HOME/example/reads/reads_2.fq -S eg2.sam
This aligns a set of paired-end reads to the reference genome, with results written to the file eg2.sam.
To use local alignment to align some longer reads included with Bowtie 2, stay in the same directory and run:
$BT2_HOME/bowtie2 --local -x lambda_virus -U $BT2_HOME/example/reads/longreads.fq -S eg3.sam
This aligns the long reads to the reference genome using local alignment, with results written to the file eg3.sam.
Parameters for Bowtie 2 (from the website manual)
### Main arguments
-x <bt2-idx>
The basename of the index for the reference genome. The basename is the name of any of the index files up to but not including the final `.1.bt2` / `.rev.1.bt2` / etc. `bowtie2` looks for the specified index first in the current directory, then in the directory specified in the `BOWTIE2_INDEXES` environment variable.
-1 <m1>
Comma-separated list of files containing mate 1s (filename usually includes `_1`), e.g. `-1 flyA_1.fq,flyB_1.fq`. Sequences specified with this option must correspond file-for-file and read-for-read with those specified in `<m2>`. Reads may be a mix of different lengths. If `-` is specified, `bowtie2` will read the mate 1s from the “standard in” or “stdin” filehandle.
-2 <m2>
Comma-separated list of files containing mate 2s (filename usually includes `_2`), e.g. `-2 flyA_2.fq,flyB_2.fq`. Sequences specified with this option must correspond file-for-file and read-for-read with those specified in `<m1>`. Reads may be a mix of different lengths. If `-` is specified, `bowtie2` will read the mate 2s from the “standard in” or “stdin” filehandle.
-U <r>
Comma-separated list of files containing unpaired reads to be aligned, e.g. `lane1.fq,lane2.fq,lane3.fq,lane4.fq`. Reads may be a mix of different lengths. If `-` is specified, `bowtie2` gets the reads from the “standard in” or “stdin” filehandle.
--interleaved
Reads interleaved FASTQ files where the first two records (8 lines) represent a mate pair.
--sra-acc
Reads are SRA accessions. If the accession provided cannot be found in local storage it will be fetched from the NCBI database. If you find that SRA alignments are long running please rerun your command with the [`-p`/`--threads`](http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#bowtie2-options-p) parameter set to desired number of threads.
NB: this option is only available if bowtie 2 is compiled with the necessary SRA libraries. See [Obtaining Bowtie 2](http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#obtaining-bowtie-2) for details.
-b <bam>
Reads are unaligned BAM records sorted by read name. The [`--align-paired-reads`](http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#bowtie2-options-align-paired-reads) and [`--preserve-tags`](http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#bowtie2-options-preserve-tags) options affect the way Bowtie 2 processes records.
-S <sam>
File to write SAM alignments to. By default, alignments are written to the “standard out” or “stdout” filehandle (i.e. the console).
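The `--interleaved` layout described above (mate 1 record, then mate 2 record, four lines each) is easy to get wrong, so here is a sketch that splits such a file back into the two files that `-1` and `-2` expect. `deinterleave_fastq` is a hypothetical helper, and it assumes plain four-line FASTQ records with no wrapped sequence lines.

```shell
# deinterleave_fastq INTERLEAVED.fq OUT_1.fq OUT_2.fq
# Hypothetical helper: lines 1-4 of every 8-line block go to OUT_1, lines 5-8 to OUT_2.
deinterleave_fastq() {
    awk -v out1="$2" -v out2="$3" '
        (NR - 1) % 8 < 4 { print > out1; next }   # first record of the pair (mate 1)
        { print > out2 }                           # second record of the pair (mate 2)
    ' "$1"
}
```

After `deinterleave_fastq reads.fq reads_1.fq reads_2.fq` (file names here are placeholders), aligning with `-1 reads_1.fq -2 reads_2.fq` should be equivalent to passing `--interleaved reads.fq`.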
This is a growing collection of manuals for commonly used bioinformatics tools.
Just go to the page for the tool you are trying to use, and scroll through the page to download and install. That simple. The goal is to add extra documentation for using these tools, in addition to what is already supplied by the manual pages for the programs.