#ActivityPub developers only please: how many items should be in a full collection page?
-
So, here's the trade-off: adding embedded objects can reduce the number of extra HTTP requests required to render the page of objects. For example, if showing a `followers` collection, adding each actor's name, avatar, and so on can be a real savings.
However, it puts a lot of costs on the server -- looking up cached or local data about each object.
Long story short: adding embedded objects is a pressure towards having smaller page sizes.
If you're not showing embedded objects, then filling up a collection page is usually just a couple of database queries. And adding more items to the page has very little extra cost.
The bigger your pages are, the fewer requests a client has to make to get all the data.
So, I think if you're not doing embedded objects, the pressure is towards bigger pages.
-
There are a couple of other confounding factors.
Adding embedded objects makes supporting HTTP caching harder. The `ETag` header isn't too hard, but `Last-Modified` is difficult. You need to check not only the collection page's modification date, but also that of each embedded object (and take the max date!). It's a pain and most folks don't even implement it.
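A minimal sketch of the "take the max date" step, assuming the server already has a timezone-aware modification time for the page and for each embedded object. The function names and arguments here are hypothetical, not from any particular ActivityPub server.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def page_last_modified(page_updated, embedded_updated):
    """Return the newest modification time across the page and its embedded objects."""
    return max([page_updated, *embedded_updated])


def last_modified_header(dt):
    """Format an aware datetime as an HTTP-date for the Last-Modified header."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


# Hypothetical data: the page itself is old, but one embedded actor changed recently.
page_updated = datetime(2024, 1, 10, 8, 0, tzinfo=timezone.utc)
actors_updated = [
    datetime(2023, 12, 1, 9, 30, tzinfo=timezone.utc),
    datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
]

newest = page_last_modified(page_updated, actors_updated)
header_value = last_modified_header(newest)
```

With references only, `embedded_updated` is empty and the header depends on the page alone, which is the simpler case described above.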
-
The other thing is collection filtering. This is where you check each item in a page to see if the client can actually read it, and leave it out if not. It's very important if you include embedded objects, and less important if you don't: if you only include references, access can be checked when the client tries to fetch each one.
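A rough sketch of what filtering a page could look like, assuming each stored item carries a precomputed set of actor IDs allowed to read it. The `readers` field and both function names are assumptions for illustration; a real server would derive access from its own addressing and block rules.

```python
# The ActivityStreams "Public" collection IRI marks publicly readable objects.
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def can_read(viewer_id, item):
    """Hypothetical access check: public items, or items listing the viewer."""
    readers = item.get("readers", set())
    return PUBLIC in readers or viewer_id in readers


def filter_page(items, viewer_id):
    """Return only the IDs of items the viewer is allowed to read."""
    return [item["id"] for item in items if can_read(viewer_id, item)]


# Hypothetical page contents.
items = [
    {"id": "https://example.com/note/1", "readers": {PUBLIC}},
    {"id": "https://example.com/note/2", "readers": {"https://example.com/users/alice"}},
    {"id": "https://example.com/note/3", "readers": {"https://example.com/users/bob"}},
]

visible_to_alice = filter_page(items, "https://example.com/users/alice")
visible_to_anonymous = filter_page(items, None)
```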
-
Another thing is whether "pages" in your collection are real objects -- buckets that fill up with items as time goes on -- or just fixed-length offsets from the most recent item. I think having real pages is much better for caching and synchronization.
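To make the difference concrete, here is a small sketch comparing the two paging schemes on a toy list. Both helper names and the page size of 3 are made up for the example; items are plain integers standing in for object IDs in chronological order.

```python
def bucket_pages(items, size):
    """'Real' pages: fixed buckets filled in chronological order.

    Page n always holds items n*size .. (n+1)*size - 1, so only the
    last bucket changes when a new item is appended.
    """
    return [items[i:i + size] for i in range(0, len(items), size)]


def offset_pages(items, size):
    """Offset pages: fixed-length windows counted back from the newest item."""
    newest_first = items[::-1]
    return [newest_first[i:i + size] for i in range(0, len(newest_first), size)]


before = list(range(1, 8))   # items 1..7
after = before + [8]         # one new item arrives

buckets_before = bucket_pages(before, 3)
buckets_after = bucket_pages(after, 3)
offsets_before = offset_pages(before, 3)
offsets_after = offset_pages(after, 3)

# Count how many existing pages kept exactly the same contents.
stable_buckets = sum(a == b for a, b in zip(buckets_before, buckets_after))
stable_offsets = sum(a == b for a, b in zip(offsets_before, offsets_after))
```

In this toy case the full buckets survive the append untouched, while every offset window shifts by one item, which is what makes offset pages hard to cache.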
-
Anyway, here's my thought: I think the advantages of embedded objects are offset by the problems with caching. I think we should make collection pages real, stable objects, with fixed contents and real modification dates. Return only references, not embedded objects. Do filtering, though. And make pages big -- 100 items or more.
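As an illustration only, a references-only page along these lines might be built like this. The `/page/N` URL layout and the `build_page` helper are hypothetical; the property names (`OrderedCollectionPage`, `partOf`, `orderedItems`, `next`) come from the Activity Streams 2.0 vocabulary.

```python
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def build_page(collection_id, page_number, item_ids, has_next):
    """Build a page whose items are plain IRIs rather than embedded objects."""
    page = {
        "@context": AS2_CONTEXT,
        "id": f"{collection_id}/page/{page_number}",  # hypothetical URL scheme
        "type": "OrderedCollectionPage",
        "partOf": collection_id,
        "orderedItems": list(item_ids),
    }
    if has_next:
        page["next"] = f"{collection_id}/page/{page_number + 1}"
    return page


# Hypothetical followers collection with 100 references on the page.
followers = "https://example.com/users/evan/followers"
ids = [f"https://example.com/users/follower{n}" for n in range(100)]
page = build_page(followers, 1, ids, has_next=True)
```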
-
@macgirvin the 12-item page size is a real kick in the teeth.
-
@evan@cosocial.ca remind me to change pagination size to 1 on April 1st
-
@macgirvin oh, but: I disagree about configurable page sizes. I think pages should have stable contents, have last-modified dates, and be easily cacheable. It makes traversal and synchronization much better.
-
@julian you monster
-
@macgirvin like, only the most recent page should be volatile. Except for deletions, older pages should not change.
-
@evan@cosocial.ca in reverse chron every page changes when a new item is pushed to the stack.
Even in chronological sets, if you need to account for deletions you've basically given up already, because there's no immutability guarantee! This is why caching headers exist, no?