ActivityPub API Client Reputation

evan@activitypub.space

For the ActivityPub API Task Force, I started an issue to discuss OAuth client reputation systems.

A reputation system tracks which OAuth clients are known good, known bad, or unknown. Servers could use this information to limit what clients can do. For example, a server could prevent users from logging in with a known bad client.
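
As a rough illustration of the idea, a server could consult a reputation table when a client asks for authorization. This is only a minimal sketch: the `Reputation` enum, the `REPUTATION` table, and the `may_authorize` function are hypothetical names, not part of any ActivityPub or OAuth specification.

```python
from enum import Enum


class Reputation(Enum):
    GOOD = "good"
    BAD = "bad"
    UNKNOWN = "unknown"


# Hypothetical table keyed by OAuth client_id. Real data could come from
# human curation or from automated evidence about past behaviour.
REPUTATION = {
    "https://good-client.example/metadata.json": Reputation.GOOD,
    "https://bad-client.example/metadata.json": Reputation.BAD,
}


def reputation_of(client_id: str) -> Reputation:
    """Look up a client; anything not listed is treated as UNKNOWN."""
    return REPUTATION.get(client_id, Reputation.UNKNOWN)


def may_authorize(client_id: str) -> bool:
    """Refuse logins from known bad clients; allow good and unknown ones."""
    return reputation_of(client_id) is not Reputation.BAD
```

A stricter server could also limit what unknown clients are allowed to do instead of treating them like known good ones.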

The reputation could be based on human curation and review, or on automated collection of evidence from historical behaviour of the client.

I'm trying to find examples in the OAuth ecosystem of this kind of reputation system -- either local or distributed.

App store approval (and user reviews) are a good example for native apps. OpenBanking keeps a client directory that needs human curation and review.

I don't have examples from OAuth -- especially with dynamic registration or CIMD.

Any ideas?

evan@cosocial.ca

@evan@activitypub.space I want to take a moment to note how nice the NodeBB content looks in Mastodon.

brunogirin@mastodon.me.uk

@evan what factors would impact the reputation and who decides what is a good or bad client?

evan@activitypub.space

@brunogirin@mastodon.me.uk

I'd suggest that there are two parties that should get to decide what is a good or bad client:

- The ActivityPub user who uses the client.
- The administrator of the server that the ActivityPub user uses.

I think there's a third group, which is other admins, developers, and users, who share similar values with the user and the admin. They may have information to share with the user and/or admin.

I don't think these values are universal, so I don't think we need a universal reputation. But I can give what I think are bad things for an API client to do.

- Generating activities on behalf of the user that don't match the user's express or implied intentions. For example, if the user logs into a client app, and it posts a public message, "I think this client app is the best and everyone should try it!"
- Extracting the user's data for reasons that the user wasn't informed of. For example, a client app that copies all your private messages to cloud backup controlled by the app developer.
- Abusing public or private resources, even if the user intends to abuse. For example, a client app for spamming, or a client app for brigading.

I think there are a few signals that could identify what I would call "bad" clients:

- User complaints would be the biggest
- Complaints from other users about the user's behaviour when using the app
- Security researcher reports

brunogirin@mastodon.me.uk

@evan
Sounds good!

I suppose it would be useful to be able to specify the version too so that you may ban a known buggy version of a client or any version prior to a known CVE fix.
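
A version-based ban could look roughly like the sketch below. It assumes the client reports a dotted numeric version such as `1.9.2`. OAuth dynamic registration (RFC 7591) does define a `software_version` metadata field, but it is an opaque string there, so the ordering used here is an assumption. The function names are hypothetical.

```python
from typing import Optional, Set, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version like '2.4.1' into (2, 4, 1) for comparison."""
    return tuple(int(part) for part in version.split("."))


def is_banned(version: str, banned: Set[str],
              first_fixed: Optional[str] = None) -> bool:
    """Ban explicitly listed versions and anything older than the fixed release."""
    if version in banned:
        return True
    if first_fixed is not None:
        # Assumption: every release before the CVE fix is vulnerable.
        return parse_version(version) < parse_version(first_fixed)
    return False
```

Comparing tuples of integers rather than raw strings matters here: as strings, `"1.9.9"` would sort after `"1.10.0"`.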

It could also be useful to make those lists shareable so that a new Fedi instance can start with something if they wish to.

thisismissem@activitypub.space

Client reputation isn't really something you can track and share in a decentralized network without introducing some centralisation. You could try to do web of trust style things, but that would mean writing a record that publicly says "good client is good", but then a malicious app could just write that record on sign-in: how many iOS apps nag you for a positive review? Particularly with somewhat dark patterns of "are you enjoying ? Yes / no" where "no" pushes you to the app's feedback and yes pushes to write a review, trying to deliberately avoid negative reviews.

The other downside of publicly disclosing which clients you use is that it tells attackers where to look for security exploits, because now you can pick a set of targets and try to attack the software they use.

Raw usage numbers also don't help, because a bad client can quite easily become viral; see for example Cambridge Analytica, who iirc used games to gain access to sensitive data.

You'd also need moderation tools that can moderate clients in some sort of meaningful way — that's near impossible for dynamic client registration. That's why we wrote the CIMD spec. A large Mastodon server usually has 10-20x the number of registered clients as number of accounts.

Things that can add up to trust are things like:

- privacy policies & terms of service
- client_uri (website) matching the client metadata (requires some crawling)
- client authentication mechanism (public client vs private_key_jwt auth)
- scopes/authorization requested being fine-grained enough, instead of asking for full unrestricted access.
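
One of the cheaper checks in that list can be sketched without any crawling: compare the origin of the `client_id` URL (the location of the metadata document, as with CIMD) against the `client_uri` it advertises. This is only an illustrative heuristic under that assumption; `same_origin` is a hypothetical helper, and actually fetching and inspecting the website is not covered.

```python
from urllib.parse import urlsplit


def same_origin(client_id: str, client_uri: str) -> bool:
    """True when both URLs share scheme, host, and port."""
    a, b = urlsplit(client_id), urlsplit(client_uri)
    # .hostname is lower-cased by urlsplit; .port is None when not given.
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)
```

A mismatch is not proof of anything bad, since a developer may host metadata on a different domain, but it is a signal a moderator might want to see.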

But OAuth security and trust models are complex and generally proprietary

evan@activitypub.space

@thisismissem said in ActivityPub API Client Reputation:
> Client reputation isn't really something you can track and share in a decentralized network without introducing some centralisation.

I think that some centralisation is fine, as long as admins or even users can choose their reputation data provider. We do this with blocklists for instances already; there's no reason to think that client blocklists would need to be any different. They'd have to have the same caveats; a trusted provider, transparency in the process, etc.

> You'd also need moderation tools that can moderate clients in some sort of meaningful way — that's near impossible for dynamic client registration.

Agreed. The best I can think of is using the redirect_uri, but that's not really unique -- especially for command-line clients that use localhost!

I think the ticket you're working on for moderating OAuth clients for Mastodon is a really big deal. I think it'd be a similar issue for ActivityPub API clients.

> That's why we wrote the CIMD spec.

Yes! Using the same identifier for clients in a verifiable way is a big help in having a reputation for using on a single server or multiple servers.

> But OAuth security and trust models are complex and generally proprietary

I think you could get to some pretty useful metrics pretty quickly, though. Some good ones to use might be:

- How many people on this server (or other servers) have authorized the client
- What the average rating has been (but you need a way to rate clients!)
- How many Flag activities have been submitted for this client (you need a way to report clients)
- Reviews of the client (you need a way to write a review of a client)

That data could be local to the server, or could be shared from other trusted servers. A trusted intermediary like IFTAS could be helpful.
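
To make those metrics concrete, here is a minimal sketch of how a server might hold them and combine local numbers with numbers shared by another trusted server. The `ClientStats` shape and the function names are hypothetical; no existing server is known to expose data in this form.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClientStats:
    authorizations: int   # users who have authorized the client
    ratings: List[int]    # individual ratings, e.g. 1 to 5
    flags: int            # reports filed against the client


def average_rating(stats: ClientStats) -> Optional[float]:
    """Mean rating, or None when nobody has rated the client."""
    if not stats.ratings:
        return None
    return sum(stats.ratings) / len(stats.ratings)


def flags_per_thousand(stats: ClientStats) -> float:
    """Flags per 1,000 authorizations, so popular clients aren't penalised for size."""
    if stats.authorizations == 0:
        return 0.0
    return 1000 * stats.flags / stats.authorizations


def combine(local: ClientStats, shared: ClientStats) -> ClientStats:
    """Add up stats from this server and from a trusted server."""
    return ClientStats(
        authorizations=local.authorizations + shared.authorizations,
        ratings=local.ratings + shared.ratings,
        flags=local.flags + shared.flags,
    )
```

Normalising flags by authorization count is one design choice among several; raw counts would be simpler but would favour obscure clients.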

thisismissem@activitypub.space

@evan said in ActivityPub API Client Reputation:
> the ticket you're working on for moderating OAuth clients for Mastodon is a really big deal.

I'm not actively working on any Mastodon features at the moment because they can't give credit where credit is due, which means it's not financially viable for me to contribute. I also just opened that ticket explaining the problem. CIMDs would fix.

> > That's why we wrote the CIMD spec.
>
> Yes! Using the same identifier for clients in a verifiable way is a big help in having a reputation for using on a single server or multiple servers.

You cannot rely on the contents of a CIMD not changing though, for that you'd need to calculate like the CBOR CID of the JSON (that's what I do in https://cimd-service.fly.dev)
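
The idea of pinning a CIMD to its contents can be sketched as follows. Note that this is not a CBOR CID: as a simpler stand-in, it takes a SHA-256 digest over JSON serialised with sorted keys. The `content_hash` function is a hypothetical name, and sorted-key JSON is only an approximation of a real canonical form.

```python
import hashlib
import json


def content_hash(metadata: dict) -> str:
    """Hex SHA-256 of the metadata, independent of key order and whitespace."""
    canonical = json.dumps(
        metadata, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Any change to the document, such as asking for a broader scope, yields a different hash, so a reputation attached to the old hash no longer applies.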

> > But OAuth security and trust models are complex and generally proprietary
>
> I think you could get to some pretty useful metrics pretty quickly, though. Some good ones to use might be:

You'd be surprised, but no. Whilst I was on the hachyderm infra team, I ran a tonne of queries for research on the data they have for registered OAuth clients, and there's really not a lot of great insight, besides "this app was added a lot to accounts", which isn't really a good score of trust (see: Cambridge Analytica).

> - How many people on this server (or other servers) have authorized the client

Meaning number, overall. The top registered client on Hachyderm was actually a dead research project if memory serves (found that out after reaching out to the author, and promptly revoked all 200k access tokens it had left on our servers unrevoked)

> - What the average rating has been (but you need a way to rate clients!)

Not something 99.9% of people will do meaningfully; see app store ratings and brigading of apps to tank their scores.

> - How many Flag activities have been submitted for this client (you need a way to report clients)

You can't Flag a non-activitypub JSON document. The majority of fediverse software doesn't support multi-modal moderation reports, Pixelfed is one of the few that does.

> - Reviews of the client (you need a way to write a review of a client)

See prior note on App Stores.

> That data could be local to the server, or could be shared from other trusted servers. A trusted intermediary like IFTAS could be helpful.

Sure, maybe, but it needs to reference a CIMD at a specific content-hash. Otherwise I can attack that system by changing my metadata to gain more access.
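
A sketch of what that pinning could mean for shared reputation data: verdicts are keyed by the pair of client identifier and content hash, so a client whose metadata has changed falls back to "unknown". The `Verdicts` type and `verdict_for` function are hypothetical, and the hash strings are placeholders.

```python
from typing import Dict, Tuple

# (client_id, content_hash) -> verdict such as "good" or "bad"
Verdicts = Dict[Tuple[str, str], str]


def verdict_for(verdicts: Verdicts, client_id: str, current_hash: str) -> str:
    """Return the verdict for this exact version of the metadata, if any."""
    # A verdict recorded for an older hash is deliberately ignored.
    return verdicts.get((client_id, current_hash), "unknown")
```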

evan@activitypub.space

@thisismissem said in ActivityPub API Client Reputation:

> I'm not actively working on any Mastodon features at the moment because they can't give credit where credit is due, which means it's not financially viable for me to contribute. I also just opened that ticket explaining the problem. CIMDs would fix.

Oof. Let's hope they get around to it before the bad guys do. I'd rather we all don't learn a lesson about security the hard way.

> You can't Flag a non-activitypub JSON document.

I think you can, if you use the Link type.

{
   "@context": "https://www.w3.org/ns/activitystreams",
   "type": "Flag",
   "id": "https://social.example/activity/flag/1",
   "actor": "https://social.example/user/3",
   "object": {
       "type": "Link",
       "mediaType": "application/json",
       "href": "https://client.dev/oauth/metadata.json"
   },
   "content": "This is an example Flag activity for a CIMD document."
}

A reputation system doesn't have to be perfect to be useful. And it's much more important to collect and share negative signals than positive ones.

I understand that you don't trust app store reviews or ratings but literally billions of other people do. When I go to download an app and it's got a 2.8/5 score, it gives me pause, and I read the reviews to see what the problem is. Sometimes I'll google the app by name. I am unlikely to install it, unless it's really the only software out there that does what I need it to do.

At the very least, manual moderation is important. "This app isn't allowed on this server." That depends on human judgement, CVE reports, whatever.

I think I understand the use of the content hash, thanks!

thisismissem@activitypub.space

@evan said in ActivityPub API Client Reputation:
> @thisismissem said in ActivityPub API Client Reputation:
>
> > I'm not actively working on any Mastodon features at the moment because they can't give credit where credit is due, which means it's not financially viable for me to contribute. I also just opened that ticket explaining the problem. CIMDs would fix.
>
> Oof. Let's hope they get around to it before the bad guys do. I'd rather we all don't learn a lesson about security the hard way.

One could hope, but they weren't willing to back the huge amount of work to deprecate non-expiring access tokens, so that'll probably be exploited first, since there's quite literally millions of non-revoked access tokens out there.

I tried to do the work to fix it on my own, but it's literally months of work to implement correctly with enough test coverage. Without them either paying me or promoting/acknowledging my work, I ran out of my own budget to be able to fix their problems.

> > You can't Flag a non-activitypub JSON document.
>
> I think you can, if you use the Link type.
>
> {
>    "@context": "https://www.w3.org/ns/activitystreams",
>    "type": "Flag",
>    "id": "https://social.example/activity/flag/1",
>    "actor": "https://social.example/user/3",
>    "object": {
>        "type": "Link",
>        "mediaType": "application/json",
>        "href": "https://client.dev/oauth/metadata.json"
>    },
>    "content": "This is an example Flag activity for a CIMD document."
> }

That'll flag it at this point in time, and the contents can change. And software in the fediverse is unlikely to be able to understand receiving a flag like that.

> At the very least, manual moderation is important. "This app isn't allowed on this server." That depends on human judgement, CVE reports, whatever.

Yeah, requires folks to actually build moderation tools for that and ensure moderating against an application revokes its access completely. Revoking access tokens doesn't prevent usage of data already harvested or whatever, but does prevent ongoing abuse
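
As a minimal sketch of what "moderating against an application revokes its access completely" could mean, here is an in-memory token store where blocking a client both revokes its existing tokens and refuses new ones. `TokenStore` and its methods are hypothetical and do not reflect how Mastodon or any other server stores tokens.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class TokenStore:
    tokens: Dict[str, str] = field(default_factory=dict)  # token -> client_id
    blocked: Set[str] = field(default_factory=set)        # blocked client_ids

    def issue(self, token: str, client_id: str) -> bool:
        """Record a new token unless the client is blocked."""
        if client_id in self.blocked:
            return False
        self.tokens[token] = client_id
        return True

    def block_client(self, client_id: str) -> int:
        """Block the client and revoke all its tokens; return how many were revoked."""
        self.blocked.add(client_id)
        doomed = [t for t, c in self.tokens.items() if c == client_id]
        for token in doomed:
            del self.tokens[token]
        return len(doomed)

    def is_valid(self, token: str) -> bool:
        return token in self.tokens
```

As noted above, this only stops ongoing access; it does nothing about data the client has already collected.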
