Document use within a SPARQL pipeline #34

Open
donpellegrino opened this issue May 26, 2023 · 3 comments
@donpellegrino

It would be useful to document a few examples where the Rust hdt library is used within a full pipeline, starting from an HDT file (as generated by hdt-cpp) to SPARQL query results.

For example, the Python rdflib-hdt library wraps hdt-cpp, and this function is the point where the triple pattern query over the HDT file is handed to the rdflib SPARQL query processor: (https://github.com/RDFLib/rdflib-hdt/blob/master/rdflib_hdt/hdt_document.py#L114)

Documenting how Rust hdt might provide triple pattern query results to a few separate SPARQL query engines would show users how the Rust hdt library can fit into a broader pipeline from data to SPARQL query results.
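
To make the idea concrete, here is a minimal sketch of how a SPARQL basic graph pattern could be answered using nothing but triple pattern lookups. Everything in it is hypothetical: the `TriplePatternSource` trait, the in-memory `MemoryStore` standing in for an HDT-backed store, and the naive nested-loop join are illustrations only, and they are not the API of the Rust hdt crate or of any SPARQL engine.

```rust
use std::collections::HashMap;

/// (subject, predicate, object) as plain strings.
type Triple = (String, String, String);
/// One query solution: variable name -> bound value.
type Bindings = HashMap<String, String>;

/// Hypothetical abstraction over a store that answers triple patterns.
/// `None` acts as a wildcard. This is NOT the hdt crate's API.
trait TriplePatternSource {
    fn triples_with_pattern<'a>(
        &'a self,
        s: Option<&'a str>,
        p: Option<&'a str>,
        o: Option<&'a str>,
    ) -> Box<dyn Iterator<Item = Triple> + 'a>;
}

/// In-memory stand-in for an HDT-backed store (assumption for this sketch).
struct MemoryStore {
    triples: Vec<Triple>,
}

impl TriplePatternSource for MemoryStore {
    fn triples_with_pattern<'a>(
        &'a self,
        s: Option<&'a str>,
        p: Option<&'a str>,
        o: Option<&'a str>,
    ) -> Box<dyn Iterator<Item = Triple> + 'a> {
        Box::new(
            self.triples
                .iter()
                .filter(move |t| {
                    s.map_or(true, |s| s == t.0.as_str())
                        && p.map_or(true, |p| p == t.1.as_str())
                        && o.map_or(true, |o| o == t.2.as_str())
                })
                .cloned(),
        )
    }
}

/// A position in a query pattern: a variable or a constant term.
enum Term {
    Var(String),
    Const(String),
}

fn var(name: &str) -> Term {
    Term::Var(name.to_string())
}

fn iri(value: &str) -> Term {
    Term::Const(value.to_string())
}

/// Constants and already-bound variables become fixed positions in the lookup.
fn resolve<'a>(term: &'a Term, binding: &'a Bindings) -> Option<&'a str> {
    match term {
        Term::Const(value) => Some(value.as_str()),
        Term::Var(name) => binding.get(name).map(|v| v.as_str()),
    }
}

/// Records a variable binding; returns false if it conflicts with an earlier one.
fn bind(term: &Term, value: &str, binding: &mut Bindings) -> bool {
    if let Term::Var(name) = term {
        if let Some(existing) = binding.get(name) {
            return existing.as_str() == value;
        }
        binding.insert(name.clone(), value.to_string());
    }
    true
}

/// Evaluates the patterns left to right with a naive nested-loop join.
fn evaluate_bgp(source: &dyn TriplePatternSource, patterns: &[[Term; 3]]) -> Vec<Bindings> {
    let mut solutions = vec![Bindings::new()];
    for pattern in patterns {
        let [s, p, o] = pattern;
        let mut next = Vec::new();
        for binding in &solutions {
            let matches = source.triples_with_pattern(
                resolve(s, binding),
                resolve(p, binding),
                resolve(o, binding),
            );
            for (subj, pred, obj) in matches {
                let mut extended = binding.clone();
                if bind(s, &subj, &mut extended)
                    && bind(p, &pred, &mut extended)
                    && bind(o, &obj, &mut extended)
                {
                    next.push(extended);
                }
            }
        }
        solutions = next;
    }
    solutions
}

/// Tiny example dataset with made-up `ex:` terms.
fn example_store() -> MemoryStore {
    let t = |s: &str, p: &str, o: &str| (s.to_string(), p.to_string(), o.to_string());
    MemoryStore {
        triples: vec![
            t("ex:alice", "ex:knows", "ex:bob"),
            t("ex:bob", "ex:knows", "ex:carol"),
            t("ex:bob", "ex:name", "\"Bob\""),
            t("ex:carol", "ex:name", "\"Carol\""),
        ],
    }
}
```

Calling `evaluate_bgp` on `example_store()` with the two patterns `ex:alice ex:knows ?friend` and `?friend ex:name ?name` corresponds to the `WHERE` clause of a simple `SELECT` query. A real engine would add join ordering, filters, and the rest of the SPARQL algebra on top, but the only thing it needs from the storage layer is the triple pattern lookup.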

@KonradHoeffner KonradHoeffner self-assigned this Jul 14, 2023
@KonradHoeffner KonradHoeffner added the docs Improvements or additions to documentation label Jul 14, 2023
@KonradHoeffner KonradHoeffner added this to the next milestone Jul 14, 2023
@KonradHoeffner
Owner

KonradHoeffner commented Nov 2, 2023

My current use case for the Rust HDT library is the RickView RDF browser, which does not currently require anything beyond the low-level triple pattern query features of HDT, so I don't have a high-level SPARQL pipeline available that could be documented. I actually quite like using triple patterns without full SPARQL because, among other reasons, it doesn't risk high time complexity and overloading the server.

On the other hand, there already is the Rust database Oxigraph, which implements SPARQL, so I'm not sure in which situation it would make sense to implement a full SPARQL pipeline on top of hdt-rs that Oxigraph doesn't cover (though I haven't used Oxigraph yet).

However, if you have a use case, I can investigate and document the best way to generate SPARQL results using hdt-rs. One candidate would be to use the HDT Sophia adapter and find out whether it's possible to answer SPARQL queries with a Sophia graph.

One such use case could be a very lightweight, immutable SPARQL endpoint. I could certainly use one for several projects where I currently use Virtuoso, which is much too heavyweight to host a 10 MB knowledge base. However, https://github.com/dice-group/tentris (written in C++) seems to be a very promising and lightweight development in this direction.
It would certainly be interesting to create an alternative SPARQL endpoint based on hdt-rs, but due to my current lack of time I'm not optimistic that this would be more than a prototype.

@donpellegrino
Author

@KonradHoeffner - Thanks for the background. That makes perfect sense.

I am attempting to integrate Oxigraph and this HDT crate. I have a development branch at https://github.com/DeciSym/oxigraph/tree/1-enable-sparql-query-of-hdt-storage that passes the W3C SPARQL 1.0 Basic test cases. Based on that work, the approach of using Oxigraph as a SPARQL front-end to the low-level triple pattern query feature of HDT appears to be feasible. The development branch needs to be cleaned up, and I would like to run it through more of the W3C SPARQL test cases before submitting it as a pull request to Oxigraph.

@KonradHoeffner
Owner

@donpellegrino: Is this still in development, and if so, can I support it somehow?

I think SPARQL support is also a topic that should be discussed in the new RDF Rust Common Crates Community Group at https://www.w3.org/community/r2c2/.
Once a common knowledge graph API is agreed on, adding SPARQL support on top of it may be a worthwhile endeavor, or at least something to consider.
