Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drafting SPARQL Update support #320

Draft
wants to merge 13 commits into
base: main
Choose a base branch
from
Draft

Drafting SPARQL Update support #320

wants to merge 13 commits into from

Conversation

kjetilk
Copy link
Member

@kjetilk kjetilk commented Oct 11, 2021

Here is a first pass over #125 with a suggested BNF and some text.

I'll mark it WiP, because I have a feeling we should be able to simplify it further especially around GroupGraphPattern, which shouldn't need that, it seems.

@RubenVerborgh
Copy link
Contributor

As an open question (that is perhaps answered somewhere else): is BNF the most effective way to capture a subset of SPARQL? Both now and moving forward?

Because I expect that we might have multiple subsets in the future, and the BNF might become repetitive (and in essence already might be compared to the SPARQL spec itself).

Copy link
Contributor

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The particular subset looks good to me; I'm just not sure whether BNF is the most sustainable way to capture this (see other comment).

Should we already mention something about atomicity?
Note that NSS implements a subset of SPARQL 1.1 with a different semantics, namely it requires exactly one match for the WHERE clause.

@@ -704,6 +704,12 @@ <h3 property="schema:name">Writing Resources</h3>

<p><span about="" id="server-post-target-not-found" rel="spec:requirement" resource="#server-post-target-not-found"><span property="spec:statement">When a <code>POST</code> method request targets a resource without an existing representation, the <span rel="spec:requirementSubject" resource="spec:Server">server</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> respond with the <code>404</code> status code.</span></span> [<a href="https://github.com/solid/specification/issues/108#issuecomment-549448159" rel="cito:citesAsSourceDocument">Source</a>]</p>

<p>
<span about="" id="server-patch-sparql-update" rel="spec:requirement" resource="#server-patch-sparql-update"><span property="spec:statement">When the target resource of a <code>PATCH</code> request is represented by an <em>RDF document</em> [<cite><a class="bibref" href="#bib-rdf11-concepts">RDF11-CONCEPTS</a></cite>], <span rel="spec:requirementSubject" resource="spec:Server">servers</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> support changing the representation using the request body containing a subset of SPARQL Update as defined in the <a href="#sparql-update-subset">Appendix</a>.</span></span>
<span about="" id="server-patch-sparql-all" rel="spec:requirement" resource="#server-patch-sparql-all"><span property="spec:statement"> <span rel="spec:requirementSubject" resource="spec:Server">Servers</span> <span rel="spec:requirementLevel" resource="spec:MAY">MAY</span> further support SPARQL 1.1 Update [<cite><a class="bibref" href="#bib-sparql-overview">SPARQL</a></cite>],</span></span>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…but how will they indicate that they do? If not answered now, shall we leave a comment here pointing to an open issue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are two open issues to this: One is discovery, whether a client can discover that before issuing a request. I think that is something we shouldn't design in the immediate term unless you have a ready-made solution for that in CSS. SPARQL 1.1 has a service description that we could probably borrow from. The other is what happens if the server has to signal an error if it gets a query it does not support, or if the query tries to modify a different resource. That, I think, we need to define now. I had an issue for that in solid-contrib/query-panel#6 but since we don't have that forum, I guess I'll just propose something in subsequent commits. Always happy to hear about ways it is being done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the above; main action point would be to link to these from the spec test, to show that we are aware. We can then follow up at any point later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 422 stands out as the most appropriate status code for responding to a query that isn't supported. I added a commit with that. Do you think we need to record an issue in the spec text itself about discovery? I think we should have open issues in here on github, at least for now. Otherwise, it is going to be a lot of them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I thought we also pointed to relevant issues on GitHub from the spec, but no strong opinion.

@kjetilk
Copy link
Member Author

kjetilk commented Oct 12, 2021

As an open question (that is perhaps answered somewhere else): is BNF the most effective way to capture a subset of SPARQL? Both now and moving forward?

Because I expect that we might have multiple subsets in the future, and the BNF might become repetitive (and in essence already might be compared to the SPARQL spec itself).

Right. I was quite confident about using a BNF grammar to define the subset as it is very concise and would be sure to capture the same semantics and thus make reuse of the algebra defined in the spec. For that reason, I honestly didn't think of a different way to do it as every other way would need to be more verbose.

Unfortunately, the grammar isn't as well defined as I hoped, in particular, it doesn't actually define INSERT DATA and DELETE DATA without patterns, like other parts of the spec does. So, I am certainly open to other ideas in this area, but perhaps not in the immediate term?

@ericprud
Copy link

Should we already mention something about atomicity?

This text precludes one way to provide atomic updates over multiple Resources (e.g. to keep them in a mutually-consistent state):

except that servers MUST NOT allow a request with a PATCH method to change other resources than the target resource. [Source]

@RubenVerborgh , is that what you meant by "atomicity"?

(Precluding could be a good thing 'cause it leaves us room to spec it later.)

@csarven
Copy link
Member

csarven commented Oct 12, 2021

FWIW, HTTP, LDP specs use ABNF. wac-allow, Accept-Put is also in ABNF.

@kjetilk
Copy link
Member Author

kjetilk commented Oct 12, 2021

The particular subset looks good to me; I'm just not sure whether BNF is the most sustainable way to capture this (see other comment).

Should we already mention something about atomicity? Note that NSS implements a subset of SPARQL 1.1 with a different semantics, namely it requires exactly one match for the WHERE clause.

I need to re-review the NSS code on this point. I am aware of the willful violation of SPARQL when it comes to the semaphore mechanism, but I thought the matching was connected to that.

@RubenVerborgh
Copy link
Contributor

except that servers MUST NOT allow a request with a PATCH method to change other resources than the target resource. [Source]

@RubenVerborgh , is that what you meant by "atomicity"?

No; I'm talking about semaphore behavior within a single document (so perhaps I used the wrong term).
The relevant issues are #139 and solid/solid-spec#193

@kjetilk
Copy link
Member Author

kjetilk commented Oct 13, 2021

@RubenVerborgh , is that what you meant by "atomicity"?

No; I'm talking about semaphore behavior within a single document (so perhaps I used the wrong term). The relevant issues are #139 and solid/solid-spec#193

Right, among the ACID properties, it is really isolation that we're talking about there. It is a mechanism that is supposed to be helping servers do the right thing when they can't make sure each query doesn't run in isolation.

And yes, I think it is within our immediate scope to specify that behavior even though I see large problems with. The main issue for it was solid-contrib/query-panel#3 .

In the short term, my main concern isn't so much that we have that behavior, it is that if we need to change the semantics of SPARQL, it is likely that you cannot just put in a SPARQL engine if you have one and so, it creates a tension between those that implement the minimal subset and those who have a SPARQL engine. I think it is important to avoid that.

@kjetilk
Copy link
Member Author

kjetilk commented Oct 13, 2021

From reading the NSS source code, I have concluded that what it does is this:

When the request body of a PATCH request has a SPARQL Update query that contains an INSERT keyword, the server must treat the request as an Append operation. When the query contains a WHERE keyword, the server must treat the request as a Read operation. When the query contains a DELETE keyword, the server must treat the request as a Read and Write operation.

So, that's what I'll be putting in the spec shortly. Even if it is a pretty blunt way of doing it.

@RubenVerborgh

This comment has been minimized.

@kjetilk

This comment has been minimized.

@kjetilk
Copy link
Member Author

kjetilk commented Oct 14, 2021

I added another commit with the above with links and all. That would close #220 too.

Copy link
Contributor

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, agree this closes #220 now!

protocol.html Outdated
<div class="note" id="sparql-subset-definition-note" inlist="" rel="schema:hasPart" resource="#sparql-subset-definition-note">
<h4 property="schema:name"><span>Note</span>: Definition of SPARQL Update Subset</h4>
<div datatype="rdf:HTML" property="schema:description">
<p>This specification alters the grammar of 10 rules in the original grammar of the SPARQL 1.1 Query Language [<cite><a class="bibref" href="#bib-sparql-overview">SPARQL</a></cite>] grammar and introduces another 10 new rules to accommodate for <code>INSERT DATA</code> and <code>DELETE DATA</code>, which was defined in the language definition but not precisely defined in the grammar. This specification references the original grammar definition where no changes were made.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"alter"—do we want to specifically add into the phrasing that it is compatible? Or does "subset" sufficiently capture that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that it covers that, but what matters is what is understood by implementors.... We need to figure out if a change to the semantics is required, I have the only 1 match in mind in particular.

<p>
<span about="" id="server-patch-sparql-insert" rel="spec:requirement" resource="#server-patch-sparql-insert"><span property="spec:statement">When the request body of a <code>PATCH</code> request has a SPARQL Update query that contains an <code>INSERT</code> keyword, <span rel="spec:requirementSubject" resource="spec:Server">servers</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> treat the request as an <a href="#append-operation">Append operation</a>.</span></span>
<span about="" id="server-patch-sparql-where" rel="spec:requirement" resource="#server-patch-sparql-where"><span property="spec:statement">When the query contains a <code>WHERE</code> keyword, <span rel="spec:requirementSubject" resource="spec:Server">servers</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> treat the request as a <a href="#read-operation">Read operation</a>.</span></span>
<span about="" id="server-patch-sparql-delete" rel="spec:requirement" resource="#server-patch-sparql-delete"><span property="spec:statement">When the query contains a <code>DELETE</code> keyword, <span rel="spec:requirementSubject" resource="spec:Server">servers</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> treat the request as a <a href="#read-operation">Read</a> and <a href="#write-operation">Write operation</a>.</span></span>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something about them being cumulative?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I'm inclined to say it is covered actually, as I say "contains"... Does it really need further elaboration?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still be explicit; right now this could be read as a priority list (stop when match) or a cumulative list. The "contains" does not change that.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe ditch the 2112 and introduce a Request Actions concept? Each line could say "the Request Actions include Read operation."

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like i'm missing a subtlety about why containing a DELETE implies READ.

@kjetilk
Copy link
Member Author

kjetilk commented Oct 14, 2021

I opened #322 to understand the semaphore mechanism in further depth.

@kjetilk
Copy link
Member Author

kjetilk commented Oct 15, 2021

I have now made three PRs to this branch that can be reviewed more superficially for inclusion into this PR.

They are

I hope you will all have a look at this. If we can get #321 and #324 merged into this branch, I think this PR can go from draft to actual review. I'd like that to happen ASAP.

<p>
<span about="" id="server-patch-sparql-insert" rel="spec:requirement" resource="#server-patch-sparql-insert"><span property="spec:statement">When the request body of a <code>PATCH</code> request has a SPARQL Update query that contains an <code>INSERT</code> keyword, <span rel="spec:requirementSubject" resource="spec:Server">servers</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> treat the request as an <a href="#append-operation">Append operation</a>.</span></span>
<span about="" id="server-patch-sparql-where" rel="spec:requirement" resource="#server-patch-sparql-where"><span property="spec:statement">When the query contains a <code>WHERE</code> keyword, <span rel="spec:requirementSubject" resource="spec:Server">servers</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> treat the request as a <a href="#read-operation">Read operation</a>.</span></span>
<span about="" id="server-patch-sparql-delete" rel="spec:requirement" resource="#server-patch-sparql-delete"><span property="spec:statement">When the query contains a <code>DELETE</code> keyword, <span rel="spec:requirementSubject" resource="spec:Server">servers</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> treat the request as a <a href="#read-operation">Read</a> and <a href="#write-operation">Write operation</a>.</span></span>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like i'm missing a subtlety about why containing a DELETE implies READ.

@RubenVerborgh
Copy link
Contributor

why containing a DELETE implies READ.

Because the semaphore mechanism #322 currently lets DELETE requests only succeed if there was exactly one match, giving away the existence of certain triples.

Which makes me wonder… is the semaphore mechanism the only reason? If so, Read should probably be dropped until we agree on the semaphore.

@kjetilk
Copy link
Member Author

kjetilk commented Oct 18, 2021

Which makes me wonder… is the semaphore mechanism the only reason? If so, Read should probably be dropped until we agree on the semaphore.

Yes, it is the only reason, AFAICT. Standard SPARQL doesn't have this problem since it always succeeds. But the semaphore is required for 0.9, so we might as well have it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: To Do
Development

Successfully merging this pull request may close these issues.

4 participants