Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve definition for 'metagenomic sample' #1014

Open
jfy133 opened this issue Nov 9, 2022 · 8 comments
Open

Improve definition for 'metagenomic sample' #1014

jfy133 opened this issue Nov 9, 2022 · 8 comments
Assignees
Labels
question Further information is requested

Comments

@jfy133
Copy link
Member

jfy133 commented Nov 9, 2022

https://spaam-community.slack.com/archives/C0183TC8B0R/p1667987904955569

@jfy133 jfy133 self-assigned this Nov 9, 2022
@jfy133 jfy133 added the question Further information is requested label Nov 9, 2022
@jfy133
Copy link
Member Author

jfy133 commented Nov 9, 2022

To summarise, I was unsure whether to include #1001 or not.

It is a sedaDNA sample, however the ultimate purposes/outcome of the paper was for population genetics of particular eukaryotic species, rather than metagenomic analyses. However they did upload shotgun data and they did as a screening step run Centrifuge (a metagenomic classifier) for screening.

After the discussion on slack, there was an overall consensus to indeed include it, however there was some discussion about improving the definitions of metagenomes and how/what to include them.

The following suggestion from @pheintzman was made:

Perhaps a definition of a metagenomics sample could be something like: “An ancient sample that is either expected or has been shown to contain genomic information from multiple taxa”

Where by 'expected' would be environmental deposits (sediments, dental calculus, etc), and 'shown' would be tissue from individuals where pathogens and/or microbiomes were found. And that [this would] avoid[s] the rabbit hole of including any and all ancient samples, which may well, but not necessarily, be metagenomic.

However, myself and @aidaanva pointed out that this didn't really work with what we count in the host-associated metagenomic table as

because [in some cases] a tooth may have a very minimal multi-taxa presence still. The oral signal [can be] present (even if [highly] skewed), so we need to be a bit more precise I think

Whereby I mean being more precise in the sense that for the host-associated microbiome while teeth may hold the oral signal it is so skewed you wouldn't want to use it for 'ecological' analyses.

@jfy133
Copy link
Member Author

jfy133 commented Nov 9, 2022

When questioned about technically any 'ancient' sample will be metagenomic: @alexhbnr also provided a further refinement when comparing to a bone: "So we expect a similar signal as for a sediment sample that was studied for microbes and not mammals."

@jfy133
Copy link
Member Author

jfy133 commented Nov 9, 2022

Our current definition:

We define here 'metagenome' in a broad sense, primarily focusing any data where the whole DNA content is analysed and explored. Examples for this are (but not limited to) ancient microbiomes (host associated metagenome), ancient sedimentary DNA (environmental) and also samples used for ancient pathogen screening (single genomes).

@jfy133
Copy link
Member Author

jfy133 commented Nov 9, 2022

I sort of like what we already have, and that technically includes Gelabert... I wonder if we could improve it slightly by saying something like;

where the whole DNA content is analysed and explored at some point during a scientific publication?

@ivelsko
Copy link
Contributor

ivelsko commented Nov 9, 2022

Maybe instead something like "we explicitly exclude host-derived samples for which the host DNA was the primary focus of the study"? Because I don't think that "where the whole DNA content is analysed and explored" always applies to pathogen capture data - isn't the metagenomic analysis for these is often just to look for signals of the species of interest, then focus is on the reads from that species?

@ivelsko
Copy link
Contributor

ivelsko commented Nov 9, 2022

Or "sole focus" rather than "primary focus"

@jfy133
Copy link
Member Author

jfy133 commented Nov 9, 2022

Yes, that's correct, I like your idea of:

we explicitly exclude host-derived samples for which the host DNA was the sole focus of the study

However to clarify - this would only apply to host-associated samples right? Not sedimentary samples?

@ivelsko
Copy link
Contributor

ivelsko commented Nov 9, 2022

Right, not to sedimentary samples

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants