Some records with unclear ionization and precursor #164

Open
meowcat opened this issue Mar 29, 2021 · 4 comments

meowcat commented Mar 29, 2021

A number of records have no MS$FOCUSED_ION: PRECURSOR_TYPE. At least a block of NAIST records have this issue.

For some it is quite clear that they're [M+H]+ or [M-H]-, for others an adduct can be extrapolated. For some, I haven't come up with an explanation. E.g. KNA00172:

https://massbank.eu/MassBank/RecordDisplay?id=KNA00172

Molecule mass is 181.0738, and precursor m/z is 284.10.

If my (still WIP) calculator is correct, this matches none of the 113 adducts/ions specified in RMassBank:::getAdductInformation(""). It also matches none of the adducts in the Fiehn lab table https://fiehnlab.ucdavis.edu/images/files/software/ESI-MS-adducts-2020.xls .

A hypothetical [M+CF3CO2H+H]+ adduct would be at 284.074. But the authors claim formate, not TFA as a modifier.
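
To make the arithmetic above easy to re-check, here is a minimal Python sketch (not the WIP calculator mentioned above) that computes m/z for a few candidate ions from monoisotopic element masses. The function names and the candidate list are illustrative assumptions; the element masses are standard monoisotopic values, rounded.

```python
# Monoisotopic masses in u (standard values, rounded to 8 decimals).
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "F": 18.99840316,
    "Na": 22.98976928,
}
ELECTRON = 0.00054858


def formula_mass(counts):
    """Mass of an element-count dict; negative counts mean atoms lost."""
    return sum(MONO[element] * n for element, n in counts.items())


def adduct_mz(neutral_mass, added, charge, n_molecules=1):
    """m/z of an ion built from n_molecules * M plus the atoms in `added`.

    `charge` is signed: +1 removes one electron mass, -1 adds one.
    """
    ion_mass = n_molecules * neutral_mass + formula_mass(added) - charge * ELECTRON
    return ion_mass / abs(charge)


M = 181.0738        # molecule mass quoted for KNA00172
OBSERVED = 284.10   # precursor m/z quoted for KNA00172

# Hypothetical candidate list, for illustration only.
CANDIDATES = {
    "[M+H]+": ({"H": 1}, 1),
    "[M+NH4]+": ({"N": 1, "H": 4}, 1),
    "[M+Na]+": ({"Na": 1}, 1),
    "[M-H]-": ({"H": -1}, -1),
    # CF3CO2H is C2HF3O2; the extra H is the proton.
    "[M+CF3CO2H+H]+": ({"C": 2, "H": 2, "F": 3, "O": 2}, 1),
}

for name, (added, charge) in CANDIDATES.items():
    mz = adduct_mz(M, added, charge)
    print(f"{name:<18}{mz:10.4f}{OBSERVED - mz:+10.4f}")
```

With these masses, [M+H]+ comes out at about 182.081 and none of the simple candidates lands near 284.10; the unexplained difference is 284.10 - 181.0738 = 103.03. Note that this sketch puts [M+CF3CO2H+H]+ at about 296.074, which is 12.000 (one carbon) above the 284.074 quoted above, so that value may be worth re-checking.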

Should I try to find and flag these records? Should I try and annotate the adducts where they can be inferred with some confidence?

(Note that I don't use the addition and charge from the RMassBank table, because I think base mass multiplication and charge multiplication aren't working properly in RMassBank right now; see MassBank/RMassBank#284. Instead, I parse and process the adductStrings myself, which handles things like [2M+3Na-4H]+ and the trivial name for ACN.)
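
As a rough illustration of what parsing an adduct string directly can look like, here is a minimal Python sketch. It is not the calculator described above and not RMassBank code; the function names, the regular expressions, and the small trivial-name table are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical alias table for trivial names; extend as needed.
TRIVIAL = {"ACN": "C2H3N", "FA": "CH2O2", "TFA": "C2HF3O2"}

# [<n>M<terms>]<z><sign>, e.g. [2M+3Na-4H]+ or [M+2H]2+
ADDUCT_RE = re.compile(r"^\[(\d*)M((?:[+-]\d*[A-Za-z][A-Za-z0-9]*)*)\](\d*)([+-])$")
# One term: sign, optional multiplier, then a formula or trivial name.
TERM_RE = re.compile(r"([+-])(\d*)([A-Za-z][A-Za-z0-9]*)")
# One element symbol with an optional count inside a formula.
ELEMENT_RE = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Count elements in a simple formula such as 'C2H3N' (no brackets, no isotopes)."""
    counts = Counter()
    for element, n in ELEMENT_RE.findall(formula):
        counts[element] += int(n or 1)
    return counts


def parse_adduct(adduct):
    """Return (number of M, net element changes, signed charge)."""
    match = ADDUCT_RE.match(adduct)
    if match is None:
        raise ValueError(f"cannot parse adduct string: {adduct!r}")
    n_molecules = int(match.group(1) or 1)
    charge = int(match.group(3) or 1) * (1 if match.group(4) == "+" else -1)
    delta = Counter()
    for sign, multiplier, name in TERM_RE.findall(match.group(2)):
        formula = TRIVIAL.get(name, name)  # resolve trivial names first
        factor = int(multiplier or 1) * (1 if sign == "+" else -1)
        for element, n in parse_formula(formula).items():
            delta[element] += factor * n
    # Drop elements that cancel out, e.g. in [M+H-H]+.
    return n_molecules, {e: n for e, n in delta.items() if n != 0}, charge


print(parse_adduct("[2M+3Na-4H]+"))
print(parse_adduct("[M+ACN+H]+"))
```

The base-mass multiplier and the charge come straight from the string here, so nothing depends on separate multiplication columns in a lookup table. The sketch does no validation of element symbols, so an unknown trivial name would silently be read as a formula.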

meowcat commented Mar 30, 2021

Quick summary:

  • There are a number of records that are not using MS$FOCUSED_ION: PRECURSOR_TYPE.
  • The vast majority use MS$FOCUSED_ION: ION_TYPE instead. This harks back to an earlier discussion: "Meaning of MS$FOCUSED_ION: ION_TYPE", MassBank-web#176. The distinction is still not completely clear to me. I believe MS$FOCUSED_ION: PRECURSOR_TYPE should probably be the only tag used for MS2, with ION_TYPE possibly reserved for MS1. For the purposes of building the msms_spectrum view, we can probably fix this in the view directly without changing the data, but it would help to specify a single tag in the record format.
  • ~100 records have the precursor ion only in the title
  • The only records with no info at all seem to be the 438 KNA records and 4 CASMI 2012 records.

5 records Atsushi FFF: precursor type is in title only
45 records JEOL Ltd JEL: precursor type is in title only
438 records Takahashi KNA: just precursor m/z without ion information
53 records Maoka MSJ: using MS$FOCUSED_ION: ION_TYPE
121 records Parejo et al. PM...: using MS$FOCUSED_ION: ION_TYPE
261 records from RIKEN PR...: using MS$FOCUSED_ION: ION_TYPE
3604 records from RIKEN PS...: using MS$FOCUSED_ION: ION_TYPE
917 records from RIKEN PT...: using MS$FOCUSED_ION: ION_TYPE
4 records SMI00106..00164 from CASMI 2012: just precursor m/z without ion information
45 records Tanaka TY...: precursor type is in title only
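
A tally like the one above could be produced by sorting each record into one of these categories. The following is a minimal Python sketch under stated assumptions: the category labels, the function name, and the title regex are made up for illustration, and the two sample records are invented rather than taken from MassBank.

```python
import re
from collections import Counter

# Something that looks like an adduct, e.g. [M+H]+ or [2M-H]-, inside the title.
ADDUCT_IN_TITLE = re.compile(r"\[\d*M[^\]]*\]\d*[+-]")


def classify_record(text):
    """Return where (if anywhere) a record states its precursor ion type."""
    title = ""
    focused = {}
    for line in text.splitlines():
        if line.startswith("RECORD_TITLE:"):
            title = line[len("RECORD_TITLE:"):].strip()
        elif line.startswith("MS$FOCUSED_ION:"):
            parts = line[len("MS$FOCUSED_ION:"):].split(None, 1)
            if parts:
                focused[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    if focused.get("PRECURSOR_TYPE"):
        return "PRECURSOR_TYPE"
    if focused.get("ION_TYPE"):
        return "ION_TYPE only"
    if ADDUCT_IN_TITLE.search(title):
        return "title only"
    if focused.get("PRECURSOR_M/Z"):
        return "m/z only"
    return "no precursor info"


# Invented sample records, for illustration only.
SAMPLES = [
    "RECORD_TITLE: Example A; LC-ESI-QTOF; MS2; [M+H]+\n"
    "MS$FOCUSED_ION: PRECURSOR_M/Z 182.08\n",
    "RECORD_TITLE: Example B; LC-ESI-IT; MS2\n"
    "MS$FOCUSED_ION: PRECURSOR_M/Z 284.10\n",
]

print(Counter(classify_record(sample) for sample in SAMPLES))
```

The order of the checks matters: a record that has both tags is counted under PRECURSOR_TYPE, and the title is only consulted when neither tag is present.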

meowcat commented Mar 30, 2021

In addition, a significant number of records (3881) have a correct MS$FOCUSED_ION: PRECURSOR_TYPE but no MS$FOCUSED_ION: PRECURSOR_M/Z set:

  • FIO (FIOCRUZ) records
  • NGA (RIKEN) records
  • PB (IPB Halle) records
  • PR (RIKEN) records using MS$FOCUSED_ION: FULL_SCAN_FRAGMENT_ION_PEAK

takaakin commented Mar 30, 2021 via email

tsufz commented Mar 30, 2021

Dear Takaaki-san and all,
I suggest we first try automated curation of the records. If that does not give reliable results, we could curate them manually. If that is not possible either, the authors could check the records. I don't know the records in detail, but many precursors could probably be derived from context and checked against the masses. If we cannot assign reliable precursor ions, we will deprecate those records.

Best wishes,
Tobias
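
One way such an automated first pass could look is sketched below in Python: compare the observed shift (precursor m/z minus exact mass) against a short table of common ions and only assign a precursor type when exactly one candidate matches. The function names, the 0.01 Da tolerance, and the deliberately short shift table are assumptions for this example; a real run would use a fuller adduct list.

```python
# Mass shifts (Da) relative to the neutral monoisotopic mass M for a few
# common singly charged ions. Intentionally short; not a complete table.
SHIFTS = {
    "POSITIVE": {
        "[M+H]+": 1.007276,
        "[M+NH4]+": 18.033823,
        "[M+Na]+": 22.989218,
        "[M+K]+": 38.963158,
    },
    "NEGATIVE": {
        "[M-H]-": -1.007276,
        "[M+Cl]-": 34.969402,
        "[M+HCOO]-": 44.998201,
    },
}


def infer_precursor_type(exact_mass, precursor_mz, ion_mode, tol=0.01):
    """Return all candidate ions whose shift matches within tol (Da)."""
    observed_shift = precursor_mz - exact_mass
    return [
        name
        for name, shift in SHIFTS[ion_mode].items()
        if abs(observed_shift - shift) <= tol
    ]


def curate(exact_mass, precursor_mz, ion_mode, tol=0.01):
    """Assign only unambiguous matches; everything else goes to manual review."""
    matches = infer_precursor_type(exact_mass, precursor_mz, ion_mode, tol)
    if len(matches) == 1:
        return ("assign", matches[0])
    return ("manual check", None)


# An invented [M+H]+-like case, then the KNA00172 values from this thread
# (positive mode is assumed here).
print(curate(181.0738, 182.08, "POSITIVE"))
print(curate(181.0738, 284.10, "POSITIVE"))
```

Records that come back as "manual check" would be the ones to curate by hand, pass on to the authors, or eventually deprecate.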
