FEP-4f05: Soft Deletion

  • Hi all,

    Some discussion regarding NodeBB's handling of soft deleted posts and Discourse's parallel implementation prompted the creation of this FEP, which attempts to describe how the concept of soft deletion can be published without the introduction of new activities—using as:Delete as-is and relying on a backreference check for Tombstone in order to signal a soft delete.

    https://codeberg.org/fediverse/fep/src/branch/main/fep/4f05/fep-4f05.md

    @Claire, in Feb 2002, you created a topic where you mentioned soft deletes. While this isn't strictly related to Undo(Delete), this FEP recommends thinking of a received Delete as an instruction to invalidate the cache, and re-fetch, which would give you a better answer as to how to handle the received Delete or Undo(Delete).

    Perhaps this might help.

  • >Request the object (via its id) from the origin server directly

    Couldn't the Delete activity itself indicate the type of operation?

    For example, if the Delete contains an embedded Tombstone, then treat it as a soft delete. Otherwise, treat it as a hard delete.

    >The Forums and Threaded Discussions Task Force (ForumWG) has identified a common nomenclature when referring to organized objects in a threaded discussion model.

    I find this nomenclature a bit confusing. Commented on the linked issue.

  • The assumption is that the object is not embedded. If it is, then it stands to reason that the embedded object can be used as is. I'll call it out in that section, thanks.
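
A minimal receiver-side sketch of the distinction discussed above, assuming the activity has already been parsed into a dict. The helper names `object_types` and `classify_delete` and the return labels are hypothetical and not taken from the FEP. An embedded Tombstone is used as is; anything else falls back to the backreference check. The stricter variant suggested above would instead treat every non-Tombstone case as a hard delete.

```python
from typing import Any, Dict, List


def object_types(obj: Dict[str, Any]) -> List[str]:
    """Return the object's type(s) as a list; AS2 allows a string or an array."""
    value = obj.get("type", [])
    return [value] if isinstance(value, str) else list(value)


def classify_delete(activity: Dict[str, Any]) -> str:
    """Decide how to treat a received Delete (labels are illustrative only)."""
    obj = activity.get("object")
    if isinstance(obj, dict) and "Tombstone" in object_types(obj):
        # Embedded Tombstone: use it as is, no request to the origin needed.
        return "soft"
    # Bare id (or anything else): fall back to fetching the object by id.
    return "fetch"
```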

  • @rimu@piefed.social I noticed today that PieFed supports the concept of soft deletes:

    7b0318bb-2838-4675-b53e-28e6904ebf45-image.png

    Perhaps this FEP would be of interest to you.

  • What would happen if you receive a Delete for an object that you believe to have been soft deleted, but now it shows up as an object instead of a Tombstone? Like, it was undeleted by the time you receive the Delete or something?

    Likewise, you receive an Undo(Delete) and when you fetch the referenced object, it returns back a Tombstone instead of the object?

    It'd be good to document those cases, because I think the answers are:

    • If you receive a Delete and the object returns an object, not a 410 / 404 or Tombstone, then you discard the Delete
    • If you receive an Undo(Delete) and the object returns a 404, 410 or Tombstone, then you discard the Undo(Delete)
  • Hi Emelia, thanks for the second pair of eyes on this.

    I will amend the FEP with those behaviours. It makes sense that no action be taken if the backreference check fails.

    Secondly, on re-read of my own FEP it is unclear that a backreference call is to be made, so I will need to make it clearer as well.

  • I have amended the FEP with an "Unexpected Responses" section.

    Of note, it's less so that you discard the activity, but since you already made the request, you may as well go through with what you received back.

    So if you get a Delete and a backreference shows the object alive and well, then just process it as an Update if you so wish.

    https://codeberg.org/fediverse/fep/pulls/665/files
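
The cases discussed above can be sketched as a small decision function. This is an illustration under stated assumptions, not the FEP's wording: `resolve_backreference`, its arguments, and the returned labels are all hypothetical, the fetch itself is assumed to have happened elsewhere, and the `"undetermined"` branch for other status codes is an assumption the thread does not cover.

```python
from typing import Any, Dict, Optional

GONE = (404, 410)  # status codes the thread treats as "object no longer exists"


def resolve_backreference(
    activity: str, status: int, body: Optional[Dict[str, Any]] = None
) -> str:
    """Map (activity, fetch result) to an action; labels are illustrative."""
    if status in GONE:
        state = "gone"
    elif status == 200 and body is not None:
        state = "tombstone" if body.get("type") == "Tombstone" else "live"
    else:
        # Assumption: anything else (timeouts, 5xx, ...) is left undecided.
        return "undetermined"

    if activity == "Delete":
        return {
            "gone": "hard_delete",
            "tombstone": "soft_delete",
            # Object is alive and well: process what came back as an Update.
            "live": "treat_as_update",
        }[state]
    if activity == "Undo(Delete)":
        # A 404, 410 or Tombstone means the object is still deleted.
        return "restore" if state == "live" else "keep_deleted"
    raise ValueError(f"unsupported activity: {activity}")
```

Either reading from the thread (discard the activity, or act on what the fetch returned) ends in the same local state for the mismatched cases, which is why the sketch returns a single label for each.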

  • Hi @devnull

    this regards soft deletion + context collections (as a collection of posts). This topic started at

    https://codeberg.org/silverpill/feps/issues/19

    I'm curious what should happen if the context contains three elements ap-obj, reply, and reply2. reply2 is a reply of reply. Now reply is deleted. How many elements does the context then contain?

    @silverpill said that for mitra the context would contain 1 element ap-obj.

    The scenario as Gherkin:

    Background:
      Given A new user called "Alice"
      And A new user called "Bob"
      And An ActivityPub object called "ap-obj"

    Scenario: Reply to reply with parent reply deleted
      Given "Alice" replied to "ap-obj" with "Nice post!" as "reply"
      And "Bob" replied to "reply" with "Good point!" as "reply2"
      When "Alice" deletes "reply"
      Then For "Alice", the "context" collection of "ap-obj" contains "?" elements
  • Hey Helge.

    Per my understanding, when processing a deletion of reply, you would not presume deletion of any or all downstream objects. Only the referenced object is deleted.

    Deleting multiple objects at once would require multiple activities, or perhaps a single (and as-yet undefined) "batch" style activity.

  • It is not about deleting the objects, it's about if they are in the context collection or not.

    If I understand you correctly, we would have before Alice deletes her reply

    context[ap-obj] = [ap-obj, reply, reply2]

    and

    context[ap-obj] = [ap-obj, reply2]

    afterwards

  • Yes, that's correct. Deletion of one object will not affect membership of downstream objects in the context collection.

  • @julian @helge I'll mention in FEP-f228 that behavior differs between implementations.
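
To make the two outcomes from this exchange concrete, here is a small sketch that models the thread as a mapping from each post to its parent. The function name and the `prune_descendants` flag are hypothetical. The pruning branch only reproduces the one-element result reported for Mitra; how Mitra actually arrives at it is not stated in the thread.

```python
from typing import Dict, List, Optional


def context_after_delete(
    in_reply_to: Dict[str, Optional[str]],
    deleted: str,
    prune_descendants: bool = False,
) -> List[str]:
    """Return the context members left after `deleted` is removed."""

    def descends_from_deleted(node: str) -> bool:
        # Walk up the reply chain; assumes the thread is a tree (no cycles).
        parent = in_reply_to.get(node)
        while parent is not None:
            if parent == deleted:
                return True
            parent = in_reply_to.get(parent)
        return False

    return [
        node
        for node in in_reply_to
        if node != deleted
        and not (prune_descendants and descends_from_deleted(node))
    ]


thread = {"ap-obj": None, "reply": "ap-obj", "reply2": "reply"}
print(context_after_delete(thread, "reply"))                          # ['ap-obj', 'reply2']
print(context_after_delete(thread, "reply", prune_descendants=True))  # ['ap-obj']
```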

  • Hi!

    Sorry for being late to the party.

    I understand the need to be able to undo deletions, this is something we face at Mastodon for the edge case of appealing moderation decisions (currently, most moderation decisions can be reversed upon appeal, but not post deletion).

    I have some concerns with the FEP as it stands regarding performance, and ensuring consistency wrt. chronology of events, caching and possible out-of-order activities.

    Indeed, performance-wise, the FEP asks recipients of a Delete to fetch the object that has just been deleted. This means that for a post that has, over its lifetime, reached a thousand different servers, in addition to ideally reaching all of those servers again (either directly or through inbox forwarding), the authoring server must now handle all of these servers fetching the now-deleted post all at once. I fear this is an especially bad instance of the thundering herd issue.

    As for ensuring consistency wrt. chronology of events, we face a lot of challenges:

    • depending on their architecture, servers may emit outgoing activities (or process incoming ones) out-of-order (for instance, Mastodon queues jobs into work queues, but if there are multiple workers, a later job can finish before an earlier job does)
    • due to network failures, servers may fail to deliver an activity on time and retry later
    • due to caching (e.g. Mastodon offers short-time caching on reverse proxies, but does not invalidate the reverse-proxy cache when the resource is changed), fetched data might actually be older than just-delivered data

    The ActivityPub primer makes note of this but offers no solutions besides “The receiving server, if it receives an activity that refers to an unknown activity, should store that activity for later processing.” While this is relatively easy to do when an object cannot be brought back once it’s deleted, this breaks down if you can undo the Delete, and I have seen no solution offered for that in the current FEP.

    Using published in activities and published/updated or similar in objects might help with that, but I’m afraid this might not be enough because of the seconds resolution of xsd:dateTime (and it would require extra care that the lifecycle of an object is indeed serialized with a monotonic time).

  • Regarding the performance issue, and avoiding the thundering herd problem, one could simply embed the object itself (so, a Delete with an expanded Tombstone in object) into the activity. You could additionally sign it (LD Signature) or attach a proof (Object Integrity Proofs) if necessary.
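
A sender-side sketch of the embedding idea above, assuming plain dicts serialized to JSON. `build_soft_delete` and its parameters are hypothetical names. The `formerType` and `deleted` properties are the ones ActivityStreams defines for Tombstone. The activity's own `id`, addressing, and any signature or integrity proof are left out for brevity.

```python
import json
from typing import Any, Dict


def build_soft_delete(
    actor: str, object_id: str, former_type: str, deleted_at: str
) -> Dict[str, Any]:
    """Build a Delete whose object is an expanded Tombstone (illustrative)."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Delete",
        "actor": actor,
        "object": {
            "id": object_id,
            "type": "Tombstone",
            "formerType": former_type,
            "deleted": deleted_at,  # xsd:dateTime string
        },
    }


# Example values below are placeholders, not real URLs.
activity = build_soft_delete(
    "https://example.org/users/alice",
    "https://example.org/posts/1",
    "Note",
    "2024-05-01T12:00:00.250Z",
)
print(json.dumps(activity, indent=2))
```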

    As for sub-second resolution of updated/published... is xsd:dateTime required? I've honestly just been sending ISO strings, which include millisecond-level precision.

  • @claire@social.sitedethib.com I re-read the text of the FEP and noted the following:

    > When a Delete activity is encountered, the referenced object MAY be either the full object or a reference to one.
    >
    > If object is a reference, the server MUST request the object (via its id) from the origin server directly.

    Emphasis is mine. In situations where you choose to embed the full object in the activity, then you are not bound by the MUST to refetch the object.

    Now, when talking about hard deletes, you cannot literally embed a non-existent object, so a re-fetch would be necessary, although I am hoping that 404 handlers are a great deal faster.

    I like published. I can add that to the FEP if it makes it easier to handle situations where multiple Deletes and Updates are encountered out-of-order due to network congestion, parallel processing, etc.
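
One way the published idea could look on the receiving side, as a sketch only: keep the timestamp of the last activity applied to an object and skip anything that is not newer. `parse_xsd_datetime` and `is_stale` are hypothetical helpers, and the UTC fallback for timestamps without an offset is an assumption. As noted earlier in the thread, this only helps if the origin serializes an object's lifecycle with a monotonic clock, and it does nothing about stale reverse-proxy caches.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_xsd_datetime(value: str) -> datetime:
    """Parse a timestamp such as 2024-05-01T12:00:00.250Z."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # and older versions want exactly 3 or 6 fractional digits.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: treat timestamps without an offset as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_stale(incoming_published: str, last_applied: Optional[str]) -> bool:
    """True if the incoming activity is not newer than the last one applied."""
    if last_applied is None:
        return False
    return parse_xsd_datetime(incoming_published) <= parse_xsd_datetime(last_applied)
```

With millisecond timestamps, a Delete published at 12:00:00.250 that arrives after an Undo(Delete) published at 12:00:00.750 would be recognised as stale, which second-level resolution could not distinguish.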

  • xsd:dateTime is required as per https://www.w3.org/TR/activitystreams-vocabulary/#dfn-published but I skimmed over the definition too fast; it definitely allows fractional seconds!

    julian2:

    > Emphasis is mine. In situations where you choose to embed the full object in the activity, then you are not bound by the MUST to refetch the object.

    It appears I must have read too fast once again, and was confused by the “Unexpected responses” section.

    julian2:

    > Now, when talking about hard deletes, you cannot literally embed a non-existent object, so a re-fetch would be necessary, although I am hoping that 404 handlers are a great deal faster.

    That can still be an issue: negative hits are still expensive and in general you may not want to cache them (to avoid an attacker targeting something that does not exist yet).

  • julian2:

    > If object is a reference, the server MUST request the object (via its id) from the origin server directly.

    i think this requirement can be removed, as the behavior on receiving a Delete is up to the receiver and not the sender. that's also where the issue lies -- receivers assuming Delete is a permanent removal. any or all of the following behaviors on receiving a Delete are "valid" in some sense:

    • do nothing to the object, just store the activity
    • expunge object from HTTP cache
    • expunge object from AS2/RDF dataset
    • edit the object to say it is "deleted"
    • convert object to a Tombstone
    • prevent reuse of the object.id
    • fetch the object using HTTP GET and handle caching/refetching using HTTP cache control headers

    having a reference doesn't imply needing to fetch it if you already have information about it. if you don't already have information about it then you can also choose not to fetch on Delete activities. the point of having an id is that you can choose whether or not to obtain additional information! that's what linked data is founded on -- the linking. every link is in effect a boundary between two records of information.

    if the goal is to prevent receivers from completely purging an object, then you can't really do this. if the goal is to stop receivers from preventing reuse of the id, then recommend that they SHOULD NOT do this.

    more generally i would ask you to consider two different senses of "deletion":

    • Delete / Undo Delete
    • Update(object.formerType=object.type, object.type=Tombstone) / Update(object.type=object.formerType)

    a Tombstone is still an Object and can have all the properties of Object btw, so it's valid to have this:

    type: Tombstone
    formerType: Note
    content: "[deleted]"
    attributedTo: 

    or this:

    type: Tombstone
    formerType: Note
    content: "the text is still there but the account was deleted"
    attributedTo:
      type: Tombstone
      formerType: Person

    or this:

    type: Tombstone
    formerType: Note
    content: "the text is still there but the account was deleted"
    attributedTo: # GET someone HTTP/1.1
    # HTTP/1.1 404 Not Found
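
The second sense of deletion listed above, expressed as an Update that swaps type and formerType, could be sketched like this. `to_tombstone` and `from_tombstone` are hypothetical helpers operating on plain dicts; wrapping the result in an Update activity and delivering it is left out.

```python
from typing import Any, Dict


def to_tombstone(obj: Dict[str, Any]) -> Dict[str, Any]:
    """Copy the object, remember its type in formerType, mark it a Tombstone."""
    stone = dict(obj)
    stone["formerType"] = obj["type"]
    stone["type"] = "Tombstone"
    return stone


def from_tombstone(stone: Dict[str, Any]) -> Dict[str, Any]:
    """Reverse the swap: restore type from formerType (KeyError if absent)."""
    obj = dict(stone)
    obj["type"] = obj.pop("formerType")
    return obj


note = {"id": "https://example.org/posts/1", "type": "Note", "content": "hello"}
stone = to_tombstone(note)
print(stone["type"], stone["formerType"])  # Tombstone Note
print(from_tombstone(stone) == note)       # True
```

Because the other properties are copied unchanged, the publisher decides separately whether to blank or replace fields such as content, as in the examples above.
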
  • Okay, I am perfectly fine to relax the requirement from a MUST to a SHOULD.

    Does that resolve the thundering herd concern acceptably?

    Other solutions would entail:

    1. Setting explicit null as object (yes @trwnh@mastodon.social this is yet another example of a place where null makes sense!) if the object is hard deleted.
    2. Sending an ETag header with the Delete activity. When re-requesting, send that same value in If-None-Match and the receiver can opt to terminate execution early with an HTTP 304.
  • @julian how does null have anything to do with this? Delete null doesn't make sense

  • @trwnh@mastodon.social hm, you're right. I should stop thinking about FEPs after business hours.

