CRISPR-spacer identification

Identify CRISPR arrays using CRT and PILERCR
python identify_crispr.py -i example/GUT_GENOME147678.fna -o out

The programs CRT and PILERCR are located in the bin directory. The program first splits your input sequence into chunks of 50 Mbp to limit RAM usage. This is not an issue for small genomes, but it can be one for large metagenomes. Each sequence is padded with 50 Ns, which helps to find CRISPR arrays at contig boundaries. CRT v1.2 and PILERCR v1.06 are then run on each chunk of data. The results are parsed and written to the output directory.
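The padding and chunking steps described above can be sketched roughly as follows. This is only an illustration and not the code in identify_crispr.py: the function names read_fasta and pad_and_chunk are hypothetical, and the sketch assumes that the 50 Ns are added to both ends of each sequence and that a chunk is a group of whole sequences totalling at most 50 Mbp.

```python
from typing import Iterable, Iterator, List, Tuple

PAD = "N" * 50          # padding length, taken from the description above
CHUNK_BP = 50_000_000   # 50 Mbp chunk size, taken from the description above


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs from FASTA-formatted lines."""
    seq_id, parts = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(parts)
            header = line[1:].split()
            seq_id, parts = (header[0] if header else ""), []
        elif line and seq_id is not None:
            parts.append(line)
    if seq_id is not None:
        yield seq_id, "".join(parts)


def pad_and_chunk(
    records: Iterable[Tuple[str, str]],
    chunk_bp: int = CHUNK_BP,
    pad: str = PAD,
) -> Iterator[List[Tuple[str, str]]]:
    """Pad each sequence on both ends (assumption), then group whole
    sequences into chunks of at most chunk_bp bases. A single sequence
    longer than chunk_bp ends up alone in its own chunk."""
    chunk, size = [], 0
    for seq_id, seq in records:
        padded = pad + seq + pad
        # Start a new chunk if adding this sequence would exceed the limit.
        if chunk and size + len(padded) > chunk_bp:
            yield chunk
            chunk, size = [], 0
        chunk.append((seq_id, padded))
        size += len(padded)
    if chunk:
        yield chunk


# Toy usage with a tiny chunk size and pad so the grouping is visible.
records = list(read_fasta([">a desc", "ACGT", "AC", ">b", "GG"]))
chunks = list(pad_and_chunk(records, chunk_bp=10, pad="NN"))
```

Because the padding shifts coordinates, any positions reported on a padded sequence would need the pad length subtracted to map back to the original contig.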

Merge overlapping CRISPR arrays identified using CRT and PILERCR
python merge_crispr.py out/crt out/pilercr out/merged
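As a rough illustration of what merging overlapping arrays can mean, the sketch below combines arrays from both tools that overlap on the same contig. It is not the logic of merge_crispr.py, which may resolve overlaps differently (for example by keeping one tool's array). The Array record, its fields, and the merge_arrays function are hypothetical, and coordinates are assumed to be inclusive.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class Array(NamedTuple):
    """Hypothetical record for one predicted CRISPR array."""
    contig: str
    start: int               # inclusive start coordinate (assumption)
    end: int                 # inclusive end coordinate (assumption)
    tools: FrozenSet[str]    # tools that reported this array


def merge_arrays(arrays: Iterable[Array]) -> List[Array]:
    """Merge arrays that overlap on the same contig into one interval,
    recording which tools contributed to each merged array."""
    merged: List[Array] = []
    for arr in sorted(arrays, key=lambda a: (a.contig, a.start, a.end)):
        last = merged[-1] if merged else None
        if last and last.contig == arr.contig and arr.start <= last.end:
            # Overlap: extend the previous interval and union the tool sets.
            merged[-1] = Array(
                last.contig,
                last.start,
                max(last.end, arr.end),
                last.tools | arr.tools,
            )
        else:
            merged.append(arr)
    return merged


# Toy usage with made-up coordinates.
crt = [
    Array("c1", 100, 500, frozenset({"crt"})),
    Array("c2", 10, 90, frozenset({"crt"})),
]
pilercr = [Array("c1", 450, 700, frozenset({"pilercr"}))]
merged = merge_arrays(crt + pilercr)
```

In this toy input the two arrays on c1 overlap and collapse into a single interval from 100 to 700 attributed to both tools, while the array on c2 is kept unchanged.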
