2.1.3: fixes
weber8thomas committed Jul 6, 2023
1 parent 0324577 commit 7239154
Showing 4 changed files with 31 additions and 85 deletions.
79 changes: 1 addition & 78 deletions docs/output.md
@@ -1,85 +1,9 @@
# Outputs (ongoing)
# Outputs

This document describes the final outputs produced by the pipeline. Most of the plots are taken from report generated from the [full-sized test dataset](https://sandbox.zenodo.org/record/1074721) for the pipeline.

The files listed below will be created in the selected results directory (`output_location` parameter). All paths are relative to the top-level results directory.

## Directory structure (example for `<SAMPLE>`=_RPE-BM510_)

```bash
<DATA_LOCATION>/<SAMPLE>
|-- alfred
|   |-- Celln.tsv.gz
|   `-- Celln.json.gz
|-- bam
|   |-- Cell1.sort.mdup.bam
|   |-- Cell2.sort.mdup.bam
|   `-- Celln.sort.mdup.bam
|-- cell_selection
|   |-- labels_raw.tsv
|   `-- labels.tsv
|-- config
|   |-- chroms_to_exclude.txt
|   `-- single_paired_end_detection.txt
|-- counts
|   `-- RPE-BM510
|       `-- counts-per-cell
|-- fastq
|   |-- Cell1.1.fastq.gz
|   |-- Cell1.2.fastq.gz
|   |-- Cell2.1.fastq.gz
|   `-- Cell2.2.fastq.gz
|-- haplotag
|   |-- bam
|   |   `-- RPE-BM510
|   |-- bed
|   `-- table
|       `-- RPE-BM510
|           `-- by-cell
|-- log
|   |-- ...
|   `-- ...
|-- merged_bam
|   `-- merged_bam.bam
|-- mosaiclassifier
|   |-- haplotag_likelihoods
|   |-- postprocessing
|   |   |-- filter
|   |   |   `-- RPE-BM510
|   |   |-- group-table
|   |   |   `-- RPE-BM510
|   |   `-- merge
|   |       `-- RPE-BM510
|   |-- sv_calls
|   |   `-- RPE-BM510
|   `-- sv_probabilities
|       `-- RPE-BM510
|-- plots
|   `-- RPE-BM510
|       |-- counts
|       |-- final_results
|       |-- sv_calls
|       |-- sv_clustering
|       `-- sv_consistency
|-- segmentation
|   `-- RPE-BM510
|       `-- segmentation-per-cell
|-- snv_genotyping
|   `-- RPE-BM510
|-- stats
|   `-- RPE-BM510
`-- strandphaser
    |-- phased-snvs
    |-- RPE-BM510
    |   `-- StrandPhaseR_analysis.chr21
    |       |-- browserFiles
    |       |-- data
    |       |-- Phased
    |       |-- SingleCellHaps
    |       `-- VCFfiles
    `-- R_setup
```

## Plots folder

### Mosaic count - reads density across bins
@@ -208,7 +132,6 @@ By using these heatmaps, the user can easily identify subclones based on the SV

---


File path: `<OUTPUT_FOLDER>/<SAMPLE>/stats/stats-merged.tsv`

Report category: `Stats`
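
The file path pattern above can be resolved programmatically. The following is a minimal Python sketch, assuming the stats file is a tab-separated table with a header row; `stats_path` and `read_stats` are hypothetical helper names and are not part of the pipeline.

```python
import csv
from pathlib import Path
from typing import Dict, List


def stats_path(output_folder: str, sample: str) -> Path:
    """Build <OUTPUT_FOLDER>/<SAMPLE>/stats/stats-merged.tsv (pattern from the docs)."""
    return Path(output_folder) / sample / "stats" / "stats-merged.tsv"


def read_stats(path: Path) -> List[Dict[str, str]]:
    """Read the TSV into a list of row dicts keyed by the header row.

    Assumption: the first line holds the column names; no column name is
    hard-coded here because the docs excerpt does not list them.
    """
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Calling `read_stats(stats_path("<OUTPUT_FOLDER>", "RPE-BM510"))` with a real output folder would return one dict per row of the merged stats table.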
15 changes: 9 additions & 6 deletions docs/parameters.md
@@ -33,8 +33,9 @@ All these arguments can be specified in two ways:
| ---------------------------------------- | --------------------------------------------------------------------------------------------------- | ------- | ------------ |
| `multistep_normalisation_analysis` | Allow to perform multistep normalisation including GC correction for visualization (Marco Cosenza). | False | False |
| `multistep_normalisation_for_SV_calling` | Allow to use multistep normalisation count file during SV calling (Marco Cosenza). | False | False |
| `arbigent` | Enable ArbiGent mode of execution to genotype SV based on arbitrary segments | False | True |
| `scNOVA` | Enable scNOVA mode of execution to compute Nucleosome Occupancy (NO) of detected SV | False | True |
| `hgsvc_based_normalized_counts` | Use HGSVC based normalisation . | True | False |
| `arbigent` | Enable ArbiGent mode of execution to genotype SV based on arbitrary segments | False | False |
| `scNOVA` | Enable scNOVA mode of execution to compute Nucleosome Occupancy (NO) of detected SV | False | False |

### External files

@@ -66,10 +67,12 @@ All these arguments can be specified in two ways:

### EMBL specific options

| Parameter | Comment | Default |
| ---------------------- | ----------------------------------------------------------------------------------------------------- | ------- |
| `genecore` | Enable/disable genecore mode to give as input the genecore shared folder in /g/korbel/shared/genecore | False |
| `genecore_date_folder` | Specify folder to be processed | |
| Parameter | Comment | Default |
| ------------------------- | ----------------------------------------------------------------------------------------------------- | ------------------------------------------- |
| `genecore` | Enable/disable genecore mode to give as input the genecore shared folder in /g/korbel/shared/genecore | False |
| `genecore_date_folder` | Specify folder to be processed | |
| `genecore_prefix` | Specify genecore prefix folder | /g/korbel/STOCKS/Data/Assay/sequencing/2023 |
| `genecore_regex_elements` | Specify genecore regex element to be used to distinguish sample from well number | PE20 |

If `genecore` and `genecore_date_folder` are correctly specified, each plate will be processed independently by creating a specific folder in the `data_location` folder.
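
As an illustration of how the genecore options in the table could be passed on the command line, here is a minimal Python sketch that assembles Snakemake `--config KEY=VALUE` arguments. The helper name `genecore_config_args` and the example date folder are hypothetical; only the parameter names come from the table above.

```python
from typing import List, Optional


def genecore_config_args(
    date_folder: str,
    prefix: Optional[str] = None,
    regex_elements: Optional[str] = None,
) -> List[str]:
    """Return `--config` arguments that enable genecore mode.

    Optional values are only emitted when given, so the pipeline's own
    defaults (see the table) apply otherwise.
    """
    args = ["--config", "genecore=True", f"genecore_date_folder={date_folder}"]
    if prefix is not None:
        args.append(f"genecore_prefix={prefix}")
    if regex_elements is not None:
        args.append(f"genecore_regex_elements={regex_elements}")
    return args


# Hypothetical date folder name, for illustration only.
example_args = genecore_config_args("2023-07-06-EXAMPLE", regex_elements="PE20")
```

The resulting list can be appended to a `snakemake` invocation, for example via `subprocess.run(["snakemake", *example_args, ...])`.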

2 changes: 1 addition & 1 deletion docs/usage.md
@@ -61,7 +61,7 @@ snakemake \

**ℹ️ Note for 🇪🇺 EMBL users**

- You can load already installed snakemake modusl on the HPC (by connecting to login01 & login02) using the following `module load snakemake/7.14.0-foss-2022a`
- You can load already installed snakemake modules on the HPC (by connecting to login01 & login02) using the following `module load snakemake/7.14.0-foss-2022a`
- Use the following command for singularity-args parameter: `--singularity-args "-B /g:/g -B /scratch:/scratch"`
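
The bind-mount string in the note above can also be generated rather than typed by hand. This is a minimal Python sketch, assuming the pipeline is launched with Snakemake's `--use-singularity` flag (an assumption; the excerpt only shows `--singularity-args`). `BIND_MOUNTS`, `singularity_args`, and `snakemake_command` are hypothetical names.

```python
import shlex
from typing import Iterable, List

# Bind mounts taken from the EMBL note above.
BIND_MOUNTS = ["/g:/g", "/scratch:/scratch"]


def singularity_args(binds: Iterable[str]) -> str:
    """Join bind mounts into the value expected by --singularity-args."""
    return " ".join(f"-B {bind}" for bind in binds)


def snakemake_command(extra: List[str]) -> List[str]:
    """Assemble an argument list; `extra` holds any further Snakemake options."""
    return [
        "snakemake",
        "--use-singularity",  # assumption: container mode is enabled this way
        "--singularity-args",
        singularity_args(BIND_MOUNTS),
        *extra,
    ]


def as_shell_line(command: List[str]) -> str:
    """Render the argument list as a single, safely quoted shell line."""
    return shlex.join(command)
```

Passing the list form to `subprocess.run` avoids shell quoting issues; `as_shell_line` is only useful for logging or copy-pasting the command.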

---
20 changes: 20 additions & 0 deletions workflow/rules/external_data.smk
@@ -147,11 +147,31 @@ rule download_scnova_data:
            keep_local=True,
        ),
    output:
        "workflow/data/scNOVA/utils/bin_chr_length.bed",
        "workflow/data/scNOVA/utils/bin_Genebody_all.bed ",
        "workflow/data/scNOVA/utils/bin_Genes_for_CNN_num_sort_ann_sort_GC_ensemble.txt",
        "workflow/data/scNOVA/utils/bin_Genes_for_CNN_num_sort.txt",
        "workflow/data/scNOVA/utils/bin_Genes_for_CNN_reshape_annot.txt",
        "workflow/data/scNOVA/utils/bin_Genes_for_CNN_sort.txt.corrected ",
        "workflow/data/scNOVA/utils/Deeptool_Genes_for_CNN_merge_sort_lab_final.txt",
        "workflow/data/scNOVA/utils/Features_reshape_CpG_orientation_impute.txt",
        "workflow/data/scNOVA/utils/Features_reshape_CpG_orientation.txt",
        "workflow/data/scNOVA/utils/Features_reshape_GC_orientation_impute.txt",
        "workflow/data/scNOVA/utils/Features_reshape_GC_orientation.txt",
        "workflow/data/scNOVA/utils/Features_reshape_RT_orientation.txt",
        "workflow/data/scNOVA/utils/Features_reshape_size_orientation.txt",
        "workflow/data/scNOVA/utils/FPKM_sort_LCL_RPE_19770_renamed.txt",
        "workflow/data/scNOVA/utils/regions_all_hg38_v2_resize_2kb_sort_num_sort_for_chromVAR.bed",
        "workflow/data/scNOVA/utils/regions_all_hg38_v2_resize_2kb_sort.bed ",
        "workflow/data/scNOVA/utils/Strand_seq_matrix_Genebody_for_SCDE.txt",
        "workflow/data/scNOVA/utils/Strand_seq_matrix_Genebody_for_SVM.txt",
        "workflow/data/scNOVA/utils/Strand_seq_matrix_TES_for_SVM.txt",
        "workflow/data/scNOVA/utils/Strand_seq_matrix_TSS_for_SVM.txt",
    log:
        touch("log/config/dl_arbigent_mappability_track.ok"),
    conda:
        "../envs/scNOVA/scNOVA_DL.yaml"
    container: None
    shell:
        """
        directory="workflow/data/ref_genomes/"
