Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce constraint on PATCH to only patch target resource #349

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

kjetilk
Copy link
Member

@kjetilk kjetilk commented Nov 12, 2021

Even if SPARQL isn't specified in 0.9, we should add some of the constraints that we agreed on in that context. The constraint that PATCH should only modify the target resource is a sound constraint that I think should go in. This text is mostly derived from earlier PRs, but generalized to any media type (which I guess is a good idea anyway).

@kjetilk kjetilk requested a review from a team November 12, 2021 23:19
@kjetilk kjetilk self-assigned this Nov 12, 2021
@kjetilk kjetilk added this to the Release 0.9 milestone Nov 12, 2021
protocol.html Outdated Show resolved Hide resolved
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Copy link
Contributor

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this?
Because it seems to limit extensibility.

Furthermore, the spec does not define any methods that could change multiple resources, and the server cannot be expected to understand patch formats that are not covered by the spec.

So I don't think this text is necessary (or testable beyond what the specific patch formats already require, given that we cannot test any other patch formats because the server is not required to support them).

@kjetilk
Copy link
Member Author

kjetilk commented Nov 23, 2021

I'd like to have such things in because it is a dangerous thing to do, if people were writing new patch formats, they shouldn't do it.

There is a specific case of WITH in SPARQL, which could reasonable have an interpretation that the graph is a resource. This is to prevent such an interpretation.

@RubenVerborgh
Copy link
Contributor

if people were writing new patch formats, they shouldn't do it.

Does that mean we can never have atomic updates to multiple resources at any point in the future? Or rather that we cannot have them via PATCH? (It seems to be the latter.)

I could get behind that principle.


However, then there is another obstacle:

MUST NOT allow a request with a PATCH method to change resources other than the target resource

What does change mean? To give a trivial example, the mtime listed in the container might change.

Or a patch to one resource could trigger something in another resource. (If I switch my main controller off, then all of my lights go off too.)

So language-wise, perhaps something to the extent of: servers MUST NOT allow clients to explicitly request multi-resource changes via PATCH, but the server may let effects propagate.

@kjetilk
Copy link
Member Author

kjetilk commented Nov 23, 2021

if people were writing new patch formats, they shouldn't do it.

Does that mean we can never have atomic updates to multiple resources at any point in the future? Or rather that we cannot have them via PATCH? (It seems to be the latter.)

I could get behind that principle.

Yeah, definitely! Or, we could remove the constraint if we are sufficiently confident it would not cause harm. What I'm trying to do is to harden the servers so that a malicious actor can't exploit a client to confuse a user to modify a resource they shouldn't. If we can be confident through other means that can't happen, then OK. But until then, I think such measures are very important.

It can be done throught SPARQL Update on a SPARQL Endpoint, for example, that would lie outside of the current protocol, but it should be doable.

However, then there is another obstacle:

MUST NOT allow a request with a PATCH method to change resources other than the target resource

What does change mean?

Oh, yes, that's a real issue, because we do rely on side-effects for much of the functionality, and...

To give a trivial example, the mtime listed in the container might change.

Yup.

Or a patch to one resource could trigger something in another resource. (If I switch my main controller off, then all of my lights go off too.)

Yeah, and there could be cascading writes, like in #227...

So language-wise, perhaps something to the extent of: servers MUST NOT allow clients to explicitly request multi-resource changes via PATCH, but the server may let effects propagate.

OK, great, I'll adopt that!

@TallTed
Copy link
Contributor

TallTed commented Nov 23, 2021

The latest language is a big improvement. SPARQL Update on a SPARQL Endpoint is not only outside the purview of Solid per se, but also well beyond "expert mode", and ramifications of any change a user makes vis such "expert mode" should also be considered outside the purview of Solid, and the user assumes the responsibility of such ramifications. Much like tinkering inside the cover of any appliance voids the warranty... The tinkering is not impossible, just reasonably guarded and warned against.

@RubenVerborgh
Copy link
Contributor

SPARQL Update on a SPARQL Endpoint is not only outside the purview of Solid per se

Well no; just not with PATCH.

Copy link
Contributor

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@kjetilk kjetilk requested a review from a team November 23, 2021 22:18
@@ -708,6 +708,11 @@ <h3 property="schema:name">Writing Resources</h3>

<p><span about="" id="server-protect-containment" rel="spec:requirement" resource="#server-protect-containment"><span property="spec:statement"><span rel="spec:requirementSubject" resource="spec:Server">Servers</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> NOT allow HTTP <code>POST</code>, <code>PUT</code> and <code>PATCH</code> to update a container’s containment triples; if the server receives such a request, it <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> respond with a <code>409</code> status code.</span></span> [<a href="https://github.com/solid/specification/issues/40#issuecomment-573358652" rel="cito:citesAsSourceDocument">Source</a>]</p>

<p>
<span about="" id="server-patch-not-other" rel="spec:requirement" resource="#server-patch-not-other"><span property="spec:statement"><span rel="spec:requirementSubject" resource="spec:Server">Servers</span> <span rel="spec:requirementLevel" resource="spec:MUST-NOT">MUST NOT</span> allow a client to explicitly request multi-resource changes via the <code>PATCH</code> method, but the server is permitted to propagate side-effects of a change to the target resource to other resources.</span></span> [<a href="https://github.com/solid/specification/issues/125#issuecomment-873035679" rel="cito:citesAsSourceDocument">Source</a>]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR title and the first comment is clear to me. But I don't quite understand the way the requirement reads.

The first sentence seems as though it is the payload that will 1) attempt to change different resources 2) besides the request's target resource. And then it says "but" it can propagate the changes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the server is permitted to propagate side-effects of a change to the target resource to other resources.

I think this bit provides more confusion than clarity. If there are explicit allowed and/or expected side-effects that aren't covered elsewhere in the specification those should be mentioned. For example, if a patch is used to create the target resource R in container P, containment information in P is affected, but that is covered elsewhere in the spec, so it wouldn't need to be mentioned. Are there some specific side-effects we want to address here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I can think of. @RubenVerborgh , would you be OK to revert to the previous commit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's correct then, but happy to abstain from voting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RubenVerborgh I think the idea is that your concerns are addressed with my generic language elsewhere in the spec. I tend to agree that the last commits are more explicit and is thus clearer. But I also see the point of @csarven and @justinwb , containment triples are dealt with elsewhere, and so, the fact that PATCH can create a resource and thus have the side effect of adding a containment triple, doesn't need to be mentioned here. That some changes to a resource will also have a the side-effect of updating timestamps are also defined elsewhere, again, it doesn't need to be mentioned here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are mentioned in other places, but this statement creates a contradiction with those other places. Whereas the only intention here was to say "PATCH targets one resource".

Copy link
Member

@csarven csarven Nov 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear: my review comment above was about the soundness of the statements in the PR. No judgement on the solution until the statements are clarified.

#349 (comment) is the suggestion to make the request semantics more clear and what happens when applied to the target resource.

For example, if the payload of the PATCH were to result in deleting resources besides the resource that's same as the effective request URI, then those are part of the semantics of PATCH with a particular media type. That is not a "side effect". It is part of processing the request. "Side effects" may indeed be things like changes to containment when a resource is created with PATCH; server creating a memento resource of the target resource; adding/tracking creator of the resource; associating an ACL resource and a description resource with the created resource; updating authoritative information, and so on.

Kjetil, I only mentioned containment as a simple example to you in private and later in the editors meeting ( https://github.com/solid/specification/blob/main/meetings/2021-11-24.md#constraint-on-patch-to-target-resource ) . I think Justin understood and agreed with my point in the meeting and repeating it here because you asked him to leave a comment.

Ruben, I'd like to know what potential conflicts you are referring to and how to avoid them. Or are we talking passed each other?

I think we are in agreement with the intention re "PATCH targets one resource" and "the update should only change the description of the target resource", right?

PATCH /foo
INSERT DATA foo bar baz
INSERT DATA qux bar baz
DELETE DATA quux bar baz
INSERT DATA quuz bar baz

That only needs to be processed to update the description of the resource foo. It should not update the descriptions of the resources qux, quux, quuz (in their respective resource descriptions - external to foo) or create resources qux and quux if they happen to not exist or any of the example side effects mentioned above. If we want those things, they should be incorporated into the semantics of the request, case by case. Later on.

We shouldn't spend more time on dissecting the current requirement which is problematic and not really want we want. Kjetil, can you update?

@csarven
Copy link
Member

csarven commented Nov 24, 2021

The update should only change the description of the target resource. Don't mention propagations or side-effects until the request semantics can be further specialised.

@kjetilk
Copy link
Member Author

kjetilk commented Nov 24, 2021

The update should only change the description of the target resource. Don't describe propagations or side-effects.

I'm fine with that, since that is what I originally did, but then @RubenVerborgh wanted some specific language to address that. I'm fine either way. Perhaps we haven't got enough on side-effects in the spec, perhaps we should have a section about that that we can refer to in such cases. Meanwhile, I think we could merge it as it is now, and do that for 1.0.

@@ -708,6 +708,11 @@ <h3 property="schema:name">Writing Resources</h3>

<p><span about="" id="server-protect-containment" rel="spec:requirement" resource="#server-protect-containment"><span property="spec:statement"><span rel="spec:requirementSubject" resource="spec:Server">Servers</span> <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> NOT allow HTTP <code>POST</code>, <code>PUT</code> and <code>PATCH</code> to update a container’s containment triples; if the server receives such a request, it <span rel="spec:requirementLevel" resource="spec:MUST">MUST</span> respond with a <code>409</code> status code.</span></span> [<a href="https://github.com/solid/specification/issues/40#issuecomment-573358652" rel="cito:citesAsSourceDocument">Source</a>]</p>

<p>
<span about="" id="server-patch-not-other" rel="spec:requirement" resource="#server-patch-not-other"><span property="spec:statement"><span rel="spec:requirementSubject" resource="spec:Server">Servers</span> <span rel="spec:requirementLevel" resource="spec:MUST-NOT">MUST NOT</span> allow a client to explicitly request multi-resource changes via the <code>PATCH</code> method, but the server is permitted to propagate side-effects of a change to the target resource to other resources.</span></span> [<a href="https://github.com/solid/specification/issues/125#issuecomment-873035679" rel="cito:citesAsSourceDocument">Source</a>]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a bit of rephrasing will help matters? I've linked to a relevant DBpedia resource, but that could be changed however the Editors deem appropriate.

Suggested change
<span about="" id="server-patch-not-other" rel="spec:requirement" resource="#server-patch-not-other"><span property="spec:statement"><span rel="spec:requirementSubject" resource="spec:Server">Servers</span> <span rel="spec:requirementLevel" resource="spec:MUST-NOT">MUST NOT</span> allow a client to explicitly request multi-resource changes via the <code>PATCH</code> method, but the server is permitted to propagate side-effects of a change to the target resource to other resources.</span></span> [<a href="https://github.com/solid/specification/issues/125#issuecomment-873035679" rel="cito:citesAsSourceDocument">Source</a>]
<span about="" id="server-patch-not-other" rel="spec:requirement" resource="#server-patch-not-other"><span property="spec:statement"><span rel="spec:requirementSubject" resource="spec:Server">Servers</span> <span rel="spec:requirementLevel" resource="spec:MUST-NOT">MUST NOT</span> allow a client request to explicitly change multiple resources via the <code>PATCH</code> method. However, a <code>PATCH</code> method request that explicitly changes one resource <span rel="spec:requirementLevel" resource="spec:MAY">MAY</span> cause the server to change one or more other resources as a side-effect of the explicitly requested change (akin to a <a href="http://dbpedia.org/resource/Database_trigger">`TRIGGER` firing in a SQL DBMS</a>).</span></span> [<a href="https://github.com/solid/specification/issues/125#issuecomment-873035679" rel="cito:citesAsSourceDocument">Source</a>]

@kjetilk
Copy link
Member Author

kjetilk commented Nov 29, 2021

I'll remove this from the current milestone so that we can wordsmith this into shape at some later point. We need to finish 0.9 urgently.

@kjetilk kjetilk removed this from the Release 0.9 milestone Nov 29, 2021
@timbl
Copy link
Contributor

timbl commented Jan 12, 2022

The updateMany function in rdflib https://github.com/linkeddata/rdflib.js/blob/main/src/update-manager.ts#L685 is what we suggest developers call when the change updates more than one resource. It currently iterates over the resources needed to be changed, doing a patch to each one. There is no chance of atomicity.

It would be faster if all the requests could be combined in a more complex PATCH which patches more than one resource. And great is atomicity were an option too. So this issue is not in the spirit of going into th multiple resource patch would be an enhancement.

@timbl
Copy link
Contributor

timbl commented Jan 12, 2022

See Road map item "Atomic multi-file updates"
https://solidproject.solidcommunity.net/Roadmap/state.ttl#Iss1595438497668
The "To Be Done" state means we had decided it was a good thing to do though it is not scheduled.

@kjetilk
Copy link
Member Author

kjetilk commented Jan 12, 2022

I absolutely see multi-resource patches as desirable, and atomic ones important too.

The question is whether that should be going through this kind of interface.

My concern is that users may be duped by apps, possibly because the app has a vulnerability, to modify a resource they didn't intend to modify. I'd like to plug that possible hole at the protocol level.

To support multi-resource updates in a single request, I don't think a resource-centric interface makes sense, it is not a single resource anyway. In that case, thinking in terms of endpoints (e.g. a SPARQL endpoint) makes much sense, and I suppose you could do this already using the SPARQL Protocol.

@kjetilk kjetilk added the status: Nominated An issue that has been nominated for the next monthly milestone label Feb 15, 2022
@kjetilk
Copy link
Member Author

kjetilk commented Feb 22, 2022

Another concern is that if authorization always depends on the body of the message, then parsing the body and checking for other possible targets is a cost that must be paid by every PATCH request, unless it can be discerned from metadata (with SPARQL, there is nothing that can do that at present).

Currently, we don't have a cheap method for append (though see #305 for a proposal to fix that), so even the simplest append operation will have this cost.

It will also complicate caching on intermediaries as it implies that any resource might change based on any PATCH request (in contrast with just parent container and aux resources belonging to the resource), and so, implementations would need a mechanism to invalidate the cache for any resource.

This suggests to me that if we are going to have multi-resource operations, it ought to be on a different interface (an endpoint-oriented one), and we should only do it once we have adverse side effects under control.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc: Protocol status: Nominated An issue that has been nominated for the next monthly milestone topic: resource access
Projects
Status: Consensus Phase
Development

Successfully merging this pull request may close these issues.

6 participants