Add more data to recount3 #50

lcolladotor · 2023-12-01T14:34:29Z

This is a recurrent goal as new data is deposited nearly every day to the Sequence Read Archive.

To add more data to recount3, we first need computing credits on large computing resources such as ACCESS (formerly called XSEDE) https://access-ci.org/.
Next, we have to run Monorail https://github.com/langmead-lab/monorail-external to process new data.
The outputs are then transferred to a local cluster where we can keep a backup of the data. In the recount3 paper, this is called the aggregation node. There, files across studies are aggregated.
The data is then uploaded to IDIES, the AWS Open Data Sponsorship Program https://aws.amazon.com/marketplace/pp/prodview-t3rflz3f557jq#resources, AnVIL, or any other active mirrors. The uploaded data has to follow the data structure that the recount3 R package expects.
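
As a rough illustration of that last requirement, here is a minimal Python sketch that builds the relative path where a gene-level counts file would be expected on a mirror and reports which expected files are missing from a listing. The directory layout used here (organism, then `data_sources`, the data source, the file type, the last two characters of the project ID, and the project ID) is an assumption modeled on the URLs that recount3 typically resolves; the helper names `gene_sums_path` and `missing_files` are hypothetical, and the layout should be checked against the recount3 R package before relying on it.

```python
from pathlib import PurePosixPath


def gene_sums_path(organism: str, data_source: str, project: str, annotation: str) -> str:
    """Build the relative path of a gene-level counts file.

    ASSUMPTION: the layout below mirrors what the recount3 R package is
    believed to expect; verify it against the package before use.
    """
    shard = project[-2:]  # assumed: last two characters of the project ID
    filename = f"{data_source}.gene_sums.{project}.{annotation}.gz"
    return str(
        PurePosixPath(
            organism, "data_sources", data_source, "gene_sums", shard, project, filename
        )
    )


def missing_files(expected, present):
    """Return the expected relative paths that are absent from a mirror listing."""
    present_set = set(present)
    return [path for path in expected if path not in present_set]


if __name__ == "__main__":
    expected = [gene_sums_path("human", "sra", "SRP009615", "G026")]
    # An empty listing stands in for a mirror that has not been populated yet.
    print(missing_files(expected, present=[]))
```

A check along these lines could be run against each mirror after an upload, so that a study is only announced once every mirror serves the files the package will request.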

There are additional steps that are part of the recount3 world such as:

Generating tissue predictions and all the other predictions that Shijie C. Zheng ran for the initial recount3 release.
Running the processing steps needed to generate the Snaptron compilations. Christopher Wilks ran these for the initial release.
Obtaining genotype predictions. Afrooz Razi et al. recently used part of the Monorail output that is not publicly shared to do this https://doi.org/10.1101/2023.10.21.562237
Updating the recount3 study browser https://github.com/LieberInstitute/recount3-docs/tree/master/study-explorer

This goal really falls outside the recount3 R package, though the R package is one of the most commonly used interfaces for the data. Accomplishing this goal will likely need its own support and/or coordination with Wilks et al. and/or Razi et al.

lcolladotor added the "depends on new monorail output" and "enhancement" (New feature or request) labels Dec 1, 2023

lcolladotor modified the milestone: bioc v3.23 Dec 1, 2023
