Skip to content

Commit

Permalink
Ampvis subset allow file input for taxa selection (#6463)
Browse files Browse the repository at this point in the history
* input update and test case

* fixed test

* bump

---------

Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
  • Loading branch information
paulzierep and bgruening authored Oct 25, 2024
1 parent 938f808 commit d929359
Show file tree
Hide file tree
Showing 2 changed files with 72 additions and 10 deletions.
80 changes: 70 additions & 10 deletions tools/ampvis2/subset_taxa.xml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
<tool id="ampvis2_subset_taxa" name="ampvis2 subset data" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@" license="MIT">
<tool id="ampvis2_subset_taxa" name="ampvis2 subset data" version="@TOOL_VERSION@+galaxy1" profile="@PROFILE@" license="MIT">
<description>by taxonomy</description>
<macros>
<import>macros.xml</import>
Expand All @@ -11,33 +11,62 @@
<configfile name="rscript"><![CDATA[
library(ampvis2, quietly = TRUE)
data <- readRDS("$data")
#set tax_vector = 'c("' + '", "'.join(str($tax_vector).split(",")) + '")'
data <- amp_subset_taxa(
#if $select_param == "option_input_selected_file"
## Define the file path (assuming it's a text file with one value per line)
file_path <- "$selected_taxonomy_list"
## Read the file into a character vector, where each line represents a value
lines <- readLines(file_path)
## Remove any leading or trailing whitespace (if necessary)
tax_vector <- trimws(lines)
#else
tax_vector <- unlist(strsplit("$tax_vector", ","))
#end if
data <- amp_subset_taxa(
data,
tax_vector = $tax_vector,
tax_vector = tax_vector,
normalise = $normalise,
remove = $remove
)
)
saveRDS(data, "$ampvis")
@SAVE_TAX_LIST@
]]></configfile>
</configfiles>
<inputs>
<expand macro="rds_input_macro"/>
<!-- add validator -->
<param name="taxonomy_list" type="data" optional="false" format="tabular" label="Taxonomy list" help="Generated with ampvis2: load"/>
<expand macro="taxonomy_select_macro" argument="tax_vector" label="Taxa to keep"/>
<conditional name="taxonomy_select_type">
<param name="select_param" type="select" label="Choose taxa input option">
<option value="option_input_file">Select taxa from taxonomy list file</option>
<option value="option_input_selected_file">Provide a file with selected taxa</option>
</param>

<when value="option_input_selected_file">
<param name="selected_taxonomy_list" type="data" optional="false" format="tabular" label="Selected taxonomy list" help="A file containing the taxa to subsample. One taxa per line."/>
</when>

<when value="option_input_file">
<param name="taxonomy_list" type="data" optional="false" format="tabular" label="Taxonomy list" help="Generated with ampvis2: load"/>
<expand macro="taxonomy_select_macro" argument="tax_vector" label="Taxa to keep"/>
</when>
</conditional>

<expand macro="normalise_macro"/>
<param argument="remove" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="false" label="Remove" help="Selected taxa will be removed instead of being the only ones kept in the data."/>
</inputs>
<outputs>
<data name="ampvis" format="ampvis2"/>
<data name="taxonomy_list_out" format="tabular" label="${tool.name} on ${on_string}: taxonomy list"/>
</outputs>
<!-- Test cases -->
<tests>
<!-- defaults -->
<!-- Test case for using comma-separated taxonomy list -->
<test expect_num_outputs="2">
<param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
<param name="select_param" value="option_input_file"/>
<param name="taxonomy_list" value="AalborgWWTPs-taxonomy.list"/>
<param name="tax_vector" value="f__Caldilineaceae,p__Elusimicrobia"/>
<output name="ampvis" value="AalborgWWTPs-subset_taxa.rds" ftype="ampvis2" compare="sim_size"/>
Expand All @@ -48,9 +77,11 @@
</assert_contents>
</output>
</test>
<!-- defaults -->

<!-- Test case for using comma-separated taxonomy list and remove=true -->
<test expect_num_outputs="2">
<param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
<param name="select_param" value="option_input_file"/>
<param name="taxonomy_list" value="AalborgWWTPs-taxonomy.list"/>
<param name="tax_vector" value="f__Caldilineaceae,p__Elusimicrobia"/>
<param name="remove" value="true"/>
Expand All @@ -62,6 +93,21 @@
</assert_contents>
</output>
</test>

<!-- Test case for using selected taxonomy list from a file -->
<test expect_num_outputs="2">
<param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
<param name="select_param" value="option_input_selected_file"/>
<param name="selected_taxonomy_list" value="AalborgWWTPs-selected.taxonomy.list"/>
<param name="remove" value="true"/>
<output name="ampvis" value="AalborgWWTPs-subset_taxa.rds" ftype="ampvis2" compare="sim_size"/>
<output name="taxonomy_list_out" ftype="tabular">
<assert_contents>
<not_has_text text="f__Caldilineaceae"/>
<not_has_text text="p__Elusimicrobia"/>
</assert_contents>
</output>
</test>
</tests>
<help><![CDATA[
What it does
Expand All @@ -73,12 +119,26 @@ The Galaxy tool calls the `amp_subset_taxa
<https://kasperskytte.github.io/ampvis2/reference/amp_filter_taxa.html>`_ function
of the ampvis2 package.
This tool provides two options for specifying the `subset`:
1. **Provide taxa names matching the taxonomy table**:
The taxonomy subset is done by providing a tax_vector of taxa names which are
then matched to the taxonomy table, where all other taxa not matching the
"Taxa to keep" (``tax_vector``) are removed. If "Remove" is enabled, then the
matching taxa are the ones being removed instead. The taxa names
will be matched in all columns of the taxonomy table.
2. **Provide a file of selected taxonomies**:
In this option, you can upload a file containing the taxa you wish to subset.
The file should contain one taxon name per line.
This option is useful when you have a pre-defined list of taxa stored in a file.
This is useful if you want to use the tool in a workflow or if you automatically generate
a taxonomy selection e.g. with differential abundance tools like MaAsLi2 and you only want to
plot these taxa.
@HELP_RELATIVE_ABUNDANCES@
Input
Expand All @@ -95,4 +155,4 @@ An ampvis2_rds dataset containing subsetted data and an updated taxonomy list da
]]></help>
<expand macro="citations"/>
</tool>
</tool>
2 changes: 2 additions & 0 deletions tools/ampvis2/test-data/AalborgWWTPs-selected.taxonomy.list
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
f__Caldilineaceae
p__Elusimicrobia

0 comments on commit d929359

Please sign in to comment.