Merge pull request #6525 from pavanvidem/xref-seurat

Seurat add bio.tools
galaxyproject · Nov 5, 2024 · 566984b
2 parents 92d9a56 + f72f923
commit 566984b

Showing 8 changed files with 54 additions and 42 deletions.
diff --git a/tools/seurat/create_seurat.xml b/tools/seurat/create_seurat.xml
@@ -3,6 +3,7 @@
     <macros>
         <import>macros.xml</import>
     </macros>
+    <expand macro="bio_tools"/>
     <expand macro="requirements"/>
     <expand macro="version_command"/>
     <command detect_errors="exit_code"><![CDATA[
@@ -128,15 +129,15 @@
             pattern = '$method.percent_mt.pattern',
         )
         #end if
-        
+
         #if $method.input_type.citeseq.citeseq == 'true'
             citeseq<-read.table(
                 'citeseq.tab',
                 header = TRUE,
                 row.names = 1,
                 sep = "\t"
             )
-        
+
             seurat_obj[['ADT']]<-CreateAssay5Object(counts = citeseq)
         #end if
     #end if
@@ -446,19 +447,19 @@ seurat_obj[['$method.col_name']]<-PercentageFeatureSet(
 Seurat
 ======
 
-Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data. 
+Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data.
 
 Seurat aims to enable users to identify and interpret sources of heterogeneity from single-cell transcriptomic measurements, and to integrate diverse types of single-cell data.
 
-Creating a Seurat Object 
+Creating a Seurat Object
 ========================
 
-Seurat objects can be created from single cell data in matrix market or tab-delimited table formats, using the Read10X or read.table functions followed by CreateSeuratObject. 
+Seurat objects can be created from single cell data in matrix market or tab-delimited table formats, using the Read10X or read.table functions followed by CreateSeuratObject.
 The input should be a single cell matrix with cells as rows and genes as columns.
 
 Both RNA-seq and combined RNA and CITE-seq data can be used as inputs.
 
-Read10X 
+Read10X
 ========
 
 Load sparse data matrices provided by 10X genomics.
@@ -469,36 +470,36 @@ More details on the `seurat documentation
 read.table
 ==========
 
-Read a tab-delimited tsv or tabular file into an RDS file as a table. 
+Read a tab-delimited tsv or tabular file into an RDS file as a table.
 
 More details on the `R documentation
 <https://www.rdocumentation.org/packages/utils/versions/3.6.2/topics/read.table>`__
 
 CreateSeuratObject
 ==================
 
-Create a Seurat Object from raw data in RDS format. 
+Create a Seurat Object from raw data in RDS format.
 
 names.field
 
-For the initial identity class for each cell, choose this field from the cell's name. 
+For the initial identity class for each cell, choose this field from the cell's name.
 E.g. If your cells are named as BARCODE_CLUSTER_CELLTYPE in the input matrix, set names.field to 3 to set the initial identities to CELLTYPE.
 
 names.delim
 
-For the initial identity class for each cell, choose this delimiter from the cell's column name. 
+For the initial identity class for each cell, choose this delimiter from the cell's column name.
 E.g. If your cells are named as BARCODE-CLUSTER-CELLTYPE, set this to “-” to separate the cell name into its component parts for picking the relevant field.
 
 meta.data
 
-Additional cell-level metadata to add to the Seurat object. Should be a data.frame where the rows are cell names and the columns are additional metadata fields. 
+Additional cell-level metadata to add to the Seurat object. Should be a data.frame where the rows are cell names and the columns are additional metadata fields.
 Row names in the metadata need to match the column names of the counts matrix.
 
 Filtering can also be performed on:
 
-min.cells = only include features/genes detected in at least this many cells 
+min.cells = only include features/genes detected in at least this many cells
 
-min.features = only include cells where at least this many features are detected 
+min.features = only include cells where at least this many features are detected
 
 Some QC metrics are added when creating a Seurat Object (nCount_RNA and nFeature_RNA).
 Mito percentage can optionally be calculated - it will be based on gene names starting with "MT-". If this pattern does not work for your gene names then you can use the separate 'Calculate QC Metrics' function instead.
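
Note on the CreateSeuratObject help text above: the names.field / names.delim and min.cells / min.features options can be illustrated with a small Python sketch. This is not the tool's R code. The function names, the toy matrix layout (a gene-to-cell mapping), and the order of the two filters (cells first, then genes) are assumptions made for the example.

```python
def initial_identity(cell_name, names_field=1, names_delim="_"):
    """Split the cell name on the delimiter and return the 1-based field."""
    return cell_name.split(names_delim)[names_field - 1]


def filter_counts(counts, min_cells=0, min_features=0):
    """Filter a {gene: {cell: count}} mapping.

    Assumed order (hypothetical): drop cells with too few detected features
    first, then drop genes detected in too few of the remaining cells.
    """
    cells = {cell for per_cell in counts.values() for cell in per_cell}
    kept_cells = {
        cell
        for cell in cells
        if sum(1 for per_cell in counts.values() if per_cell.get(cell, 0) > 0)
        >= min_features
    }
    return {
        gene: {c: n for c, n in per_cell.items() if c in kept_cells}
        for gene, per_cell in counts.items()
        if sum(1 for c, n in per_cell.items() if c in kept_cells and n > 0)
        >= min_cells
    }


# Toy data, invented for illustration only.
toy_counts = {
    "GeneA": {"c1": 3, "c2": 0, "c3": 1},
    "GeneB": {"c1": 2, "c2": 0, "c3": 0},
    "GeneC": {"c1": 0, "c2": 5, "c3": 0},
}

print(initial_identity("BARCODE_CLUSTER_CELLTYPE", names_field=3, names_delim="_"))
print(filter_counts(toy_counts, min_cells=2, min_features=1))
```

With names.field set to 3 the identity is the third delimited field, and with min.cells = 2 only GeneA survives because it is the only gene detected in at least two cells.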
@@ -509,7 +510,7 @@ More details on the `seurat documentation
 Calculate QC Metrics
 ====================
 
-Calculate the percentage of all the counts belonging to a subset of the possible features for each cell. This is useful when trying to compute the percentage of transcripts that map to mitochondrial genes for example. 
+Calculate the percentage of all the counts belonging to a subset of the possible features for each cell. This is useful when trying to compute the percentage of transcripts that map to mitochondrial genes for example.
 The calculation here is simply the column sum of the matrix present in the counts slot for features belonging to the set divided by the column sum for all features times 100.
 
 Feature sets can be defined by entering a list of genes or using a shared pattern in the gene names, such as "^MT-" or "^RP[LS]" for human mitochondrial or ribosomal genes.
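
Note on the Calculate QC Metrics help text above: the stated calculation (per-cell sum of counts for the feature set, divided by the per-cell sum for all features, times 100) can be sketched in plain Python. This is an illustration under assumptions, not Seurat's implementation: the function name, the gene-to-cell mapping used for the counts, and the choice to skip cells with zero total counts are all hypothetical.

```python
import re


def percentage_feature_set(counts, pattern=None, features=None):
    """Return {cell: percent of counts in the feature set}.

    counts is a {gene: {cell: count}} mapping. The feature set is either an
    explicit list of gene names or every gene whose name matches the pattern.
    """
    if features is None:
        if pattern is None:
            raise ValueError("provide either a pattern or a list of features")
        features = [gene for gene in counts if re.search(pattern, gene)]
    selected = set(features)

    totals, subset = {}, {}
    for gene, per_cell in counts.items():
        for cell, n in per_cell.items():
            totals[cell] = totals.get(cell, 0) + n
            if gene in selected:
                subset[cell] = subset.get(cell, 0) + n

    # Cells with no counts at all are skipped here to avoid dividing by zero.
    return {
        cell: 100 * subset.get(cell, 0) / total
        for cell, total in totals.items()
        if total > 0
    }


# Toy data, invented for illustration only.
toy_counts = {
    "MT-CO1": {"c1": 5, "c2": 0},
    "MT-ND1": {"c1": 5, "c2": 10},
    "ACTB": {"c1": 90, "c2": 30},
}

print(percentage_feature_set(toy_counts, pattern="^MT-"))
```

For the toy matrix, cell c1 has 10 of 100 counts in genes matching "^MT-" and cell c2 has 10 of 40, so the sketch reports 10% and 25%.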
@@ -520,7 +521,7 @@ More details on the `seurat documentation
 Filter Cells
 ============
 
-Filter cells based on QC metrics. 
+Filter cells based on QC metrics.
 
 nFeature_RNA = number of unique genes identified in the cell
 

diff --git a/tools/seurat/inspect_and_manipulate.xml b/tools/seurat/inspect_and_manipulate.xml
@@ -3,6 +3,7 @@
     <macros>
         <import>macros.xml</import>
     </macros>
+    <expand macro="bio_tools"/>
     <expand macro="requirements"/>
     <expand macro="version_command"/>
     <command detect_errors="exit_code"><![CDATA[
@@ -54,8 +55,8 @@
 
     #else if $method.inspect.inspect == 'Matrix'
         inspect<-LayerData(
-            seurat_obj, 
-            assay='$method.inspect.assay', 
+            seurat_obj,
+            assay='$method.inspect.assay',
             layer='$method.inspect.layer'
             )
         row.names = TRUE
@@ -672,7 +673,7 @@
 Seurat
 ======
 
-Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data. 
+Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data.
 
 Seurat aims to enable users to identify and interpret sources of heterogeneity from single-cell transcriptomic measurements, and to integrate diverse types of single-cell data.
 
@@ -717,7 +718,7 @@ AddMetaData
 Merge
 =====
 
-Combine two Seurat Objects into a single Seurat Object. 
+Combine two Seurat Objects into a single Seurat Object.
 Each object will be placed in a separate layer, but you can choose to run the JoinLayers function after merging to combine the objects into a single layer.
 
 Subset
@@ -728,7 +729,7 @@ Subset a group of cells based on their ident or another grouping in your cell me
 DietSeurat
 ==========
 
-Keep only certain aspects of the Seurat object. 
+Keep only certain aspects of the Seurat object.
 Can be useful in functions that utilize merge as it reduces the amount of data in the merge
 
 More details on these essential commands can be found in the `seurat documentation
@@ -737,7 +738,7 @@ More details on these essential commands can be found in the `seurat documentati
 AggregateExpression
 ===================
 
-Returns summed counts ("pseudobulk") for each identity class. 
+Returns summed counts ("pseudobulk") for each identity class.
 
 More details on the `seurat documentation
 <https://satijalab.org/seurat/reference/aggregateexpression>`__
@@ -746,7 +747,7 @@ More details on the `seurat documentation
 DefaultAssay
 ============
 
-Set the default assay for multimodal data. 
+Set the default assay for multimodal data.
 
 You can use the Inspect - General function to check which assay is currently active and which other assays are available.
 

diff --git a/tools/seurat/integrate.xml b/tools/seurat/integrate.xml
@@ -3,6 +3,7 @@
     <macros>
         <import>macros.xml</import>
     </macros>
+    <expand macro="bio_tools"/>
     <expand macro="requirements"/>
     <expand macro="version_command"/>
     <command detect_errors="exit_code"><![CDATA[
@@ -15,7 +16,7 @@
 
 #if $method.method == 'SplitLayers'
     seurat_obj[['$method.assay']]<-split(
-        seurat_obj[['$method.assay']], 
+        seurat_obj[['$method.assay']],
         f = seurat_obj[['$method.factor', drop = TRUE]]
     )
 
@@ -26,12 +27,12 @@
     #end if
 
     seurat_obj<-IntegrateLayers(
-        seurat_obj, 
+        seurat_obj,
         method = $method.integration.integration_method,
         #if $method.integration.integration_method == 'CCAIntegration'
             #if $method.integration.adv.k_filter
             k.filter = $method.integration.adv.k_filter,
-            #else           
+            #else
             k.filter = NA,
             #end if
             dims = 1:$method.integration.adv.dims,
@@ -73,7 +74,7 @@
         #else if $method.integration.integration_method == 'RPCAIntegration'
             #if $method.integration.adv.k_filter
             k.filter = $method.integration.adv.k_filter,
-            #else           
+            #else
             k.filter = NA,
             #end if
             dims = 1:$method.integration.adv.dims,
@@ -87,7 +88,7 @@
             sd.weight = $method.integration.adv.sd_weight,
             preserve.order = $method.integration.adv.preserve_order,
         #end if
-        orig.reduction = '$method.orig_reduction', 
+        orig.reduction = '$method.orig_reduction',
         new.reduction = '$method.new_reduction',
         #if $method.assay != ''
         assay = '$method.assay',
@@ -219,7 +220,7 @@
             <conditional name="method">
                 <param name="method" value="SplitLayers"/>
                 <param name="assay" value="RNA"/>
-                <param name="factor" value="Group"/>                 
+                <param name="factor" value="Group"/>
             </conditional>
             <section name="advanced_common">
                 <param name="show_log" value="true"/>
@@ -261,7 +262,7 @@
                     <param name="integration_method" value="HarmonyIntegration"/>
                 </conditional>
                 <param name="orig_reduction" value="pca"/>
-                <param name="new_reduction" value="integrated.harm"/>            
+                <param name="new_reduction" value="integrated.harm"/>
             </conditional>
             <section name="advanced_common">
                 <param name="show_log" value="true"/>
@@ -310,7 +311,7 @@
 Seurat
 ======
 
-Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data. 
+Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data.
 
 Seurat aims to enable users to identify and interpret sources of heterogeneity from single-cell transcriptomic measurements, and to integrate diverse types of single-cell data.
 
@@ -327,7 +328,7 @@ More details on the `R documentation
 Integrate
 =========
 
-Multiple layers are integrated to enable them to be analysed together. 
+Multiple layers are integrated to enable them to be analysed together.
 
 Available methods are: CCA, Harmony, JointPCA, RPCA, FastMNN and scVI.
 
@@ -345,7 +346,7 @@ More details on the `seurat documentation
 PrepSCTFindMarkers
 ==================
 
-Given a merged object with multiple SCT models, this function uses minimum of the median UMI (calculated using the raw UMI counts) of individual objects to reverse the individual SCT regression model using minimum of median UMI as the sequencing depth covariate. 
+Given a merged object with multiple SCT models, this function uses minimum of the median UMI (calculated using the raw UMI counts) of individual objects to reverse the individual SCT regression model using minimum of median UMI as the sequencing depth covariate.
 The counts slot of the SCT assay is replaced with recorrected counts and the data slot is replaced with log1p of recorrected counts.
 
 More details on the `seurat documentation

diff --git a/tools/seurat/macros.xml b/tools/seurat/macros.xml
@@ -2,6 +2,11 @@
     <token name="@TOOL_VERSION@">5.0</token>
     <token name="@VERSION_SUFFIX@">0</token>
     <token name="@PROFILE@">23.0</token>
+    <xml name="bio_tools">
+        <xrefs>
+            <xref type="bio.tools">seurat</xref>
+        </xrefs>
+    </xml>
     <xml name="requirements">
         <requirements>
             <requirement type="package" version="@TOOL_VERSION@">r-seurat</requirement>
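
Note on the bio_tools macro added above: each tool file now contains an expand element naming this macro, and the element is replaced by the macro's contents, so every tool ends up with the same xrefs block. The Python sketch below imitates that substitution with the standard library. It is a simplified stand-in, not Galaxy's macro loader: the expand_macros function and the example_tool wrapper are hypothetical, and nested macros, tokens, and yield are not handled.

```python
import copy
import xml.etree.ElementTree as ET

# Macro definition mirroring the one added to macros.xml in this change.
MACROS = """<macros>
    <xml name="bio_tools">
        <xrefs>
            <xref type="bio.tools">seurat</xref>
        </xrefs>
    </xml>
</macros>"""

# Hypothetical minimal tool file that uses the macro.
TOOL = """<tool id="example_tool" name="Example">
    <expand macro="bio_tools"/>
</tool>"""


def expand_macros(tool_xml, macros_xml):
    """Replace each <expand macro="NAME"/> with the children of <xml name="NAME">."""
    macros = {m.get("name"): list(m) for m in ET.fromstring(macros_xml).findall("xml")}
    tool = ET.fromstring(tool_xml)
    for parent in list(tool.iter()):
        expanded = []
        for child in list(parent):
            if child.tag == "expand" and child.get("macro") in macros:
                # Copy so the same macro can be expanded in several places.
                expanded.extend(copy.deepcopy(macros[child.get("macro")]))
            else:
                expanded.append(child)
        parent[:] = expanded
    return tool


tool = expand_macros(TOOL, MACROS)
xref = tool.find("xrefs/xref")
print(xref.get("type"), xref.text)
```

Defining the cross-reference once in macros.xml and expanding it in each tool keeps the bio.tools identifier consistent across the suite.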
@@ -141,7 +146,7 @@ write.csv(seurat_obj, 'markers_out.csv', quote = FALSE)
         </data>
     </xml>
     <token name="@CMD_inspect_rds_outputs@"><![CDATA[
-write.table(inspect, 'inspect_out.tab', sep="\t", col.names = col.names, row.names = row.names, quote = FALSE)    
+write.table(inspect, 'inspect_out.tab', sep="\t", col.names = col.names, row.names = row.names, quote = FALSE)
     ]]>
     </token>
     <xml name="plot_out">

diff --git a/tools/seurat/neighbors_clusters_markers.xml b/tools/seurat/neighbors_clusters_markers.xml
@@ -3,6 +3,7 @@
     <macros>
         <import>macros.xml</import>
     </macros>
+    <expand macro="bio_tools"/>
     <expand macro="requirements"/>
     <expand macro="version_command"/>
     <command detect_errors="exit_code"><![CDATA[
@@ -754,7 +755,7 @@ More details on the `seurat documentation
 FindMultiModalNeighbors
 =======================
 
-This function will construct a weighted nearest neighbor (WNN) graph for two modalities (e.g. RNA-seq and CITE-seq). For each cell, we identify the nearest neighbors based on a weighted combination of two modalities. 
+This function will construct a weighted nearest neighbor (WNN) graph for two modalities (e.g. RNA-seq and CITE-seq). For each cell, we identify the nearest neighbors based on a weighted combination of two modalities.
 
 Takes as input two dimensional reductions, one computed for each modality.
 

diff --git a/tools/seurat/normalize_select_features_scale.xml b/tools/seurat/normalize_select_features_scale.xml
@@ -3,6 +3,7 @@
     <macros>
         <import>macros.xml</import>
     </macros>
+    <expand macro="bio_tools"/>
     <expand macro="requirements"/>
     <expand macro="version_command"/>
     <command detect_errors="exit_code"><![CDATA[
@@ -404,7 +405,7 @@ seurat_obj<-SCTransform(
 Seurat
 ======
 
-Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data. 
+Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data.
 
 Seurat aims to enable users to identify and interpret sources of heterogeneity from single-cell transcriptomic measurements, and to integrate diverse types of single-cell data.
 
@@ -444,7 +445,7 @@ More details on the `seurat documentation
 Scale and regress the data with ScaleData
 =========================================
 
-Scale and center features in the dataset. 
+Scale and center features in the dataset.
 
 If variables are provided in vars.to.regress, they are individually regressed against each feature, and the resulting residuals are then scaled and centered.
 
@@ -454,7 +455,7 @@ More details on the `seurat documentation
 SCTransform
 ===========
 
-Use this function as an alternative to the NormalizeData, FindVariableFeatures, ScaleData workflow. 
+Use this function as an alternative to the NormalizeData, FindVariableFeatures, ScaleData workflow.
 
 Results are saved in a new assay (named SCT by default) with counts being (corrected) counts, data being log1p(counts), scale.data being pearson residuals; sctransform::vst intermediate results are saved in misc slot of new assay.
 

diff --git a/tools/seurat/plot.xml b/tools/seurat/plot.xml
@@ -3,6 +3,7 @@
     <macros>
         <import>macros.xml</import>
     </macros>
+    <expand macro="bio_tools"/>
     <expand macro="requirements"/>
     <expand macro="version_command"/>
     <command detect_errors="exit_code"><![CDATA[
@@ -1177,7 +1178,7 @@ More details on the `seurat documentation
 FeatureScatter
 ==============
 
-Create a scatter plot of two features (typically feature expression), across a set of single cells. Cells are colored by their identity class. 
+Create a scatter plot of two features (typically feature expression), across a set of single cells. Cells are colored by their identity class.
 Pearson correlation between the two features is displayed above the plot.
 
 More details on the `seurat documentation
@@ -1210,7 +1211,7 @@ More details on the `seurat documentation
 DimPlot
 =======
 
-Graph the output of a dimensional reduction technique on a 2D scatter plot where each point is a cell and it's positioned based on the cell embeddings determined by the reduction technique. 
+Graph the output of a dimensional reduction technique on a 2D scatter plot where each point is a cell and it's positioned based on the cell embeddings determined by the reduction technique.
 By default, cells are colored by their identity class (can be changed with the group.by parameter).
 
 More details on the `seurat documentation
@@ -1219,7 +1220,7 @@ More details on the `seurat documentation
 DimHeatmap
 ==========
 
-Draw a heatmap focusing on a principal component. Both cells and genes are sorted by their principal component scores. 
+Draw a heatmap focusing on a principal component. Both cells and genes are sorted by their principal component scores.
 Allows for nice visualization of sources of heterogeneity in the dataset.
 
 More details on the `seurat documentation
@@ -1228,7 +1229,7 @@ More details on the `seurat documentation
 ElbowPlot
 =========
 
-Plot the standard deviations of the principal components for easy identification of an elbow in the graph - plots PCA as default reduction. 
+Plot the standard deviations of the principal components for easy identification of an elbow in the graph - plots PCA as default reduction.
 
 More details on the `seurat documentation
 <https://satijalab.org/seurat/reference/elbowplot>`__
@@ -1252,7 +1253,7 @@ More details on the `seurat documentation
 DotPlot
 =======
 
-Visualize how feature expression changes across different identity classes (e.g. clusters). 
+Visualize how feature expression changes across different identity classes (e.g. clusters).
 The size of the dot encodes the percentage of cells within a class that express the gene, while the color encodes the AverageExpression level across all cells within a class.
 
 More details on the `seurat documentation