Navigate to scripts/ and run with the following arguments (ordered):
- Path to SNP bed/starch file with the p-value or log10 Bayes factor in column 5. The file should already be filtered to remove MHC and missense SNPs.
- Trait name. Corresponds to column 1 in trait_genes_key.txt. (Must match exactly)
- Either "-log10_P-value" or "log10_Bayes_factor". Describes column 5 of "trait_snps" (the SNP file given as the first argument).
- Either "All_SNPs", "DHS_SNPs", or "Trait-Specific_DHS_SNPs". Describes the SNP set in the first argument.
- Path to txt file of positive control genes. Format: one column of gene IDs.
- Name of positive control set. Corresponds to columns 2 and 3 in trait_genes_key.txt. (Must match exactly)
- Path to bed/starch file with gene ID in column 4. Contains all genes in the transcript model.
- Directory name for output files.
./ "$snp_dir/glucose_dhs_snps_pvals.bed"
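Because the script expects a numeric score in column 5 of the SNP file, a quick pre-flight check can catch malformed input before a run. The following is a minimal sketch, not part of the pipeline: `check_col5` is a hypothetical helper name, and it assumes a tab-delimited, uncompressed BED file (a starch archive would need to be extracted first).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): verify that every line
# of a tab-delimited BED file has at least 5 columns and that column 5 is
# a plain or scientific-notation number.
check_col5() {
    awk -F'\t' '
        NF < 5 {
            printf "line %d: fewer than 5 columns\n", NR
            bad = 1
            exit
        }
        $5 !~ /^[-+]?[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$/ {
            printf "line %d: column 5 is not numeric: %s\n", NR, $5
            bad = 1
            exit
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example call (path taken from the command above):
# check_col5 "$snp_dir/glucose_dhs_snps_pvals.bed"
```

The function returns a non-zero exit status on the first offending line, so it can be chained with `&&` in front of the actual run.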
Run this script once for each trait/positive control gene set pairing. Give each run an output directory name that separates results from different SNP sets (e.g. All_SNPs vs DHS_SNPs), so that output files do not collide.
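One way to keep output directories distinct is to derive the name from the trait, the positive control set, and the SNP set. This is only a sketch of a naming scheme: `out_dir_name` is a hypothetical helper, and the trait and gene set names are placeholders, not entries from trait_genes_key.txt.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): build one output
# directory name per trait / gene set / SNP set combination.
out_dir_name() {
    local trait=$1 gene_set=$2 snp_set=$3
    # Spaces and slashes are awkward in directory names; map them to "_".
    printf '%s__%s__%s\n' "$trait" "$gene_set" "$snp_set" | tr ' /' '__'
}

# Placeholder names for illustration only.
for snp_set in All_SNPs DHS_SNPs; do
    out_dir_name "example_trait" "example_gene_set" "$snp_set"
done
```

The resulting name can then be passed as the last argument (the output directory) for the corresponding run.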
Run ./ Combines all files within "../data/intermediate_files/" (the output of ./). Produces the file "../data/all_enrichments.db.txt".
Run 3_plot_dist_effect.R. Reads "../data/all_enrichments.db.txt" and produces the plot at "../plots/distance_effect.pdf".