Query SOAR metadata #46

Open
ebuchlin opened this issue Aug 24, 2022 · 19 comments
Labels
enhancement New feature or request

Comments

@ebuchlin

Describe the feature

My understanding is that sunpy-soar currently only supports queries by instrument / time / level / product, as this is basically what is available in the SOAR web query form and in the v_sc_data_item and v_ll_data_item tables. However, the user should also be able to do queries with different metadata (other Fido attributes).

Proposed solution

The list of all tables and their columns is available from SOAR with TAP. I attach a human-readable version (tree by schema / table / column), generated by XSLT with this XSL stylesheet.

This shows that more complete metadata are available in SOAR, in instrument-specific tables, e.g. v_spi_sc_fits. Example query: http://soar.esac.esa.int/soar-sl-tap/tap//sync?REQUEST=doQuery&LANG=ADQL&FORMAT=json&QUERY=SELECT+TOP+10+%2A+FROM+v_spi_sc_fits
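As a minimal sketch, the example URL above can be rebuilt from its parts with the Python standard library. The helper name `build_tap_url` is hypothetical; the endpoint and parameters are copied from the example URL, and nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Endpoint copied verbatim from the example URL above (double slash included).
SOAR_TAP_SYNC = "http://soar.esac.esa.int/soar-sl-tap/tap//sync"


def build_tap_url(adql, fmt="json"):
    """Return a synchronous TAP request URL for an ADQL query string.

    Hypothetical helper for illustration; not part of sunpy-soar.
    """
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "FORMAT": fmt,
        "QUERY": adql,
    }
    # urlencode escapes spaces as '+' and '*' as '%2A', as in the example URL.
    return SOAR_TAP_SYNC + "?" + urlencode(params)


url = build_tap_url("SELECT TOP 10 * FROM v_spi_sc_fits")
print(url)
```

The resulting URL can then be fetched with any HTTP client to retrieve the first ten rows of v_spi_sc_fits as JSON.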

Fido attributes should be linked to columns in these different instrument-specific tables. For a query with multiple instruments, multiple tables should be queried... (or should this not be supported?). Metadata for LL files can also be queried, from yet other tables.

@dstansby dstansby added the enhancement New feature or request label Oct 7, 2022
@ebuchlin
Author

ebuchlin commented Feb 5, 2023

There is now draft documentation for the tables, views, and columns of the SOAR TAP interface: https://www.cosmos.esa.int/web/soar/tables-views-and-columns

@ebuchlin
Author

ebuchlin commented May 10, 2023

There are now new columns soop_name and soop_type available in the SOAR TAP interface.

@wtbarnes
Member

> There are now new columns soop_name and soop_type available in the SOAR TAP interface.

Is this covered by #84?

@ebuchlin
Author

> > There are now new columns soop_name and soop_type available in the SOAR TAP interface.
>
> Is this covered by #84?

For queries by SOOP name it seems that #84 covers it, yes.

@wtbarnes
Member

@ebuchlin sorry for the lack of traffic on this issue. I definitely agree we should be supporting more complex queries against the SOAR through Fido, but I'm a bit confused as to the scope. Looking at the docs you linked above, it is not quite clear to me what attributes should be supported through the attrs interface. Could you provide an example of what a Fido query would look like with these additional metadata?

The example from @hayesla in #66 makes it a bit clearer to me, but again the issue is what subset of that metadata we should support. I don't think it is practical to try to translate each bit of SOAR metadata to a Fido attribute. However, maybe there could be some sort of interface for specifying these filters as strings, similar to what we allow with JSOC keywords in sunpy.net.jsoc.attrs.

@ebuchlin
Author

This is a generic issue meant to point out that the SOAR TAP interface offers more possibilities than the ones initially used by sunpy-soar (the details of the TAP interface were undocumented at that time). Now that we have some documentation and that queries by SOOP name have been implemented, we can be more specific about other potentially useful attributes, starting from existing sunpy.net ones:

  • Detector: from the v_<instrument>_<ll/sc>_fits tables, column detector. Partially overlaps the use cases for a.soar.Product.
  • Wavelength: from the v_<instrument>_<ll/sc>_fits tables, column wavelength
    • SPICE window wavelengths (one range per window) are not all accessible through SOAR. I think that only the first window is, unless the full list is in the undocumented v_<instrument>_<sc/ll>_extension_fits tables for <instrument> = SPICE (just a guess).
    • STIX has energy bands rather than wavelengths.
  • Resolution: for AIA and HMI, this is a factor relative to the highest resolution. There is some information in the cdelt[n], total_binning_factor and binning_factor columns (in one row per dimension), but I am not sure how to combine this into something meaningful and consistent with the existing meaning for AIA and HMI. Also, should this be limited to spatial resolution?
  • Phyobs: not in SOAR; could be deduced from a.soar.Product?
  • Extent (as in sunpy.net.vso.attrs) could be useful, but the meaning should be clarified (some overlap with FOV?) and it might be difficult to implement.

An issue is that v_<instrument>_<ll/sc>_fits is actually multiple tables, one per instrument and per data type (low-latency or science), and that there must then be join operations with the v_<ll/sc>_data_item tables.
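The join described above could look roughly like the following sketch, which only builds the ADQL string. The helper name `build_join_query` is hypothetical, the table names follow the v_<instrument>_<ll/sc>_fits pattern, and the join key `data_item_id` is an assumption that should be checked against the SOAR table documentation.

```python
def build_join_query(instrument, detector, level="sc", top=10):
    """Build an ADQL query joining a data-item table with an instrument FITS table.

    Hypothetical helper for illustration; not part of sunpy-soar.
    `instrument` is the short name used in the table names (e.g. "spi"),
    `level` is "sc" (science) or "ll" (low latency).
    """
    if level not in ("sc", "ll"):
        raise ValueError("level must be 'sc' or 'll'")
    if not instrument.isalnum():
        raise ValueError("instrument must be alphanumeric")
    data_table = f"v_{level}_data_item"
    fits_table = f"v_{instrument.lower()}_{level}_fits"
    # Double any single quote so the value stays a valid ADQL string literal.
    detector_literal = detector.replace("'", "''")
    return (
        f"SELECT TOP {int(top)} d.*, f.detector, f.wavelength "
        f"FROM {data_table} AS d "
        # ASSUMPTION: data_item_id is the key shared by both tables.
        f"JOIN {fits_table} AS f ON d.data_item_id = f.data_item_id "
        f"WHERE f.detector = '{detector_literal}'"
    )


query = build_join_query("eui", "HRI_EUV")
print(query)
```

A multi-instrument search would need one such query per instrument table, with the results merged on the client side.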

In case one would like to have access to previous versions of the files (instead of only the latest version), the v_<ll/sc>_repository_file tables would also have to be considered.

For a start, we can of course ignore previous versions of files, ignore low-latency observations, and prioritize attributes in the above list. The effort should also be balanced against the effort put into access to Solar Orbiter data through the VSO as a data provider.

@ebuchlin
Author

For complex SOAR TAP queries, here is a tutorial on TAP queries that we did at IAS; it could provide ideas for how to do some of the queries we would like to be doable using Fido.

@nabobalis
Contributor

@ebuchlin we are going to add this as a GSoC project and I have a really rough draft here: https://github.com/OpenAstronomy/openastronomy.github.io/pull/350/files#diff-03a99800468bb348b3741103deee0d442348ced2997c4a20c1aa6479cd7729e9

If you had time could you review it and would you be willing to help with the project in an advisory capacity?

@esdcheliodevops

Hello @ebuchlin and @hayesla - I would just like to comment that if you find the current structure of Tables in SOAR TAP difficult to work with, then please do make suggestions about how we can improve that. :-)

They are currently structured with the internal relational database in mind. But of course, we are always open to the possibility of making more user friendly views to combine data etc. This would help avoid making complex queries with joins which are often slow due to lack of indexes etc on certain columns.

It would be great to capture this kind of feedback which would surely benefit the whole community of SOAR TAP users.

Many thanks,
Jonathan Cook (I am using a shared ESDC github account we have)

@ebuchlin
Author

Hello, here is a new analysis (notebook PDF, notebook source) of what could be done with the following keywords, given how they are currently filled by the instrument teams and/or the SOAR (even though most of them are optional keywords):

  • sensor could be used for:
    • EUI (values: FSI174, FSI304, HRI, HRI1216, HRI174; detector would be better)
    • Metis (values: UV, VL)
    • PHI (values: FDT, HRT)
    • Not filled for SoloHI and STIX, wrong values for SPICE
  • detector could be used for:
    • EUI (values: FSI, HRI_EUV, HRI_LYA)
    • Metis (values: UV, VL; same as sensor)
    • PHI (values: FDT, HRT, and a few probably erroneous values)
    • SoloHI (values: 1, 2, 3, 4). I guess that these correspond to different parts of the FOV, but I am not sure.
    • Not filled for STIX; for SPICE it is the detector of the first spectral window only (so not relevant for the file as a whole).
  • telescope is always in the form SOLO / instrument / detector, so redundant with instrument and detector.
  • btype could be useful in principle when doing multi-instrument searches, if the values were standardized, but not much when doing searches on a specific instrument (except for PHI?):
    • EUI (value: Flux)
    • Metis (values: Stokes I, UV Lyman-alpha intensity, VL fixed-polarization intensity, VL polarization angle, VL polarized brightness, VL total brightness; are these values standardized?)
    • PHI (values: Intensity, Magnetic Field Strength, and a few files with other values)
    • Not filled for SoloHI and STIX; corresponds to the first window only for SPICE (values: Radiance, Spectral Radiance)
  • filter is not filled, or is redundant with sensor for Metis, or is filled and meaningful for EUI but probably not very useful for end users.
  • observation_mode could be used for:
    • EUI (16 values over the considered period)
    • Metis (9 values)
    • SoloHI (8 values)
    • SPICE (101 values). However these values are potentially too many (and growing) for the users to use in a meaningful way. The list of values will have to be updated in some way (attributes in sunpy-soar updated at a new sunpy-soar release; attributes that can be updated by the user; or list maintained online).
    • 1 value for PHI which is not useful, 2 values (probably not relevant for user) for STIX.
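As a rough illustration of how such value lists might be used on the client side, the sketch below stores the detector values reported above in a lookup table and checks a user-supplied value against it. The names `DETECTOR_VALUES` and `is_known_detector` are hypothetical, and the values only reflect how the keyword was filled at the time of this analysis.

```python
# Detector values copied from the analysis above; they may change as the
# instrument teams and the SOAR update how the keyword is filled.
DETECTOR_VALUES = {
    "EUI": {"FSI", "HRI_EUV", "HRI_LYA"},
    "METIS": {"UV", "VL"},
    "PHI": {"FDT", "HRT"},
    "SOLOHI": {"1", "2", "3", "4"},
}


def is_known_detector(instrument, detector):
    """Return True if `detector` is a listed value for `instrument`.

    Hypothetical helper for illustration; not part of sunpy-soar.
    """
    known = DETECTOR_VALUES.get(instrument.upper())
    if known is None:
        # STIX (keyword not filled) and SPICE (first window only) are left out.
        return False
    return str(detector) in known


print(is_known_detector("EUI", "HRI_EUV"))
print(is_known_detector("STIX", "anything"))
```

The same pattern would apply to observation_mode, although a hard-coded list would go stale quickly given how many SPICE values there are.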
