does anyone know who's behind "open.news"?

ansuz@social.cryptography.dog

> Open.News is the command center for the decentralized newsverse.

Looks like they're ingesting people's fediverse feeds into LLMs and feeding slop to people. I only noticed because it was mostly visiting non-existent or malformed URLs.

> We index live conversations across RSS, Bluesky, and Mastodon so you never miss the story behind the story. FeedBrainer's conversational AI transforms the firehose into a calm, contextual briefing tailored to you.

#slop #scrapers

ansuz@social.cryptography.dog

it says "powered by Feedbrainer.ai", and while I didn't find any matches for that I did find https://feedbrain.ai/

I can't tell if open.news is a subsidiary of feedbrain, or just someone depending on their API. If it is them, they're based in Dubai:

https://feedbrain.ai/terms

ansuz@social.cryptography.dog

sysadmins should be able to grep for an "open.news" user-agent.

They're not generating enough traffic to cause any problems, but at this point I have zero patience left for LLM companies.
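For admins who'd rather script the check than grep by hand, here is a minimal Python sketch of the user-agent search mentioned above. It assumes access logs in the common "combined" format, where the user-agent is the last quoted field. The sample log lines and the exact user-agent string are made up for illustration; only the "open.news" substring comes from this thread.

```python
import re
from collections import Counter

# Assumption: "combined" log format, user-agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')
# The request field looks like: "GET /path HTTP/1.1"
REQUEST_PATTERN = re.compile(r'"(?:[A-Z]+) (\S+) [^"]*"')


def find_user_agent_hits(lines, needle="open.news"):
    """Count requested paths for lines whose user-agent contains `needle`."""
    hits = Counter()
    for line in lines:
        ua = UA_PATTERN.search(line)
        if ua and needle.lower() in ua.group(1).lower():
            req = REQUEST_PATTERN.search(line)
            # Fall back to "?" if the request field is malformed.
            hits[req.group(1) if req else "?"] += 1
    return hits


# Hypothetical sample lines; the real user-agent string may look different.
sample = [
    '203.0.113.7 - - [20/Nov/2025:10:00:00 +0000] "GET /blog/missing HTTP/1.1" '
    '404 153 "-" "Mozilla/5.0 (compatible; open.news/1.0)"',
    '198.51.100.2 - - [20/Nov/2025:10:00:01 +0000] "GET / HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 Firefox/128.0"',
]

if __name__ == "__main__":
    for path, count in find_user_agent_hits(sample).most_common():
        print(count, path)
```

To run it against a real log, pass an open file object instead of `sample`. The log path depends on your web server and is not something this thread specifies.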

ansuz@social.cryptography.dog

looking deeper into my logs it seems like their first attempts to scrape my sites coincided with one of my threads that got picked up by an unusual number of "trending-bots".

my best guess at the moment is that this service is using these bots as a starting point for their scraping campaigns, so I might just start blocking them

ansuz@social.cryptography.dog

Update on this: open.news seems to be operated by the same people as "readily.news"

They have a "sign up" button, but if you click through that (do it in a container tab if you want to check) they ask you for your mastodon account's id.

Once you enter your account's id, you should be redirected to your instance's sign-in page.

I haven't gone further than that, but I guess this service is gaining full access to people's accounts in this way, then using those accounts to scrape the network so that their AI can provide a daily digest of what happened on fedi

ansuz@social.cryptography.dog

I'm kind of tempted to make a throwaway account on mastodon.social or something like that so that I can see how the rest of this works.

I did it with pixelfed.social, and the page loads as it normally would, but with an extra hidden iframe. I don't see that behaviour with mastodon.social.

Maybe this only works with particular fedi instances that lack some security feature?

ansuz@social.cryptography.dog

Okay, so it's just the standard OAuth workflow, but I was otherwise right, and their app basically gets full access to your account just as a mobile app for mastodon might 😬

ansuz@social.cryptography.dog

hey @Mastodon,

are you aware that https://readily.news/ is leveraging full access to users' mastodon accounts to scrape followers-only fediverse posts?

ansuz@social.cryptography.dog

I found a mastodon.social account with a link to readily.news in their bio:

https://mastodon.social/@librenews

I can't tell if he's affiliated with the project/company or if they've injected their link into his bio after he'd given them access to his account. Both seem plausible.

ansuz@social.cryptography.dog

Oh, and it seems readily.news was discussed at FediForum 2023:

https://fediforum.org/2023-03/session/4-c/

Do I know anyone who attended that who remembers any relevant details?

Matt's name appears there too, so it's really looking like it might be his project, but who knows 🤷

ansuz@social.cryptography.dog

I'm working on a (hopefully brief) write-up of everything I know about this latest fediverse scraper

ansuz@social.cryptography.dog

As usual, this post turned out somewhat longer than I'd originally intended.

This is pretty much everything I know about "readily.news", the latest non-consensual service attempting to scrape the Fediverse for all it is worth:

https://cryptography.dog/blog/what-little-i-know-about-readily-news/

I've done my part. You'll need to supply your own torches and pitchforks.

ansuz@social.cryptography.dog

It's been a few days since I posted about https://readily.news AKA "open.news", a service which:

1. asks for complete access to your Mastodon/fedi account

2. ingests whatever your account can see and summarizes it using LLMs (seemingly from OpenAI?)

3. sends you a daily, personalized newsletter

It's a particularly bad kind of scraper because it basically hijacks existing community infra to do the scraping for it.

Because accounts' host instances are the actors gathering up all the content, there's no way for remote servers to detect which of their followers' accounts have been compromised, nor to block their posts from ending up in the hands of the upstream LLM providers.

We'll probably need admins of affected instances to run a database query to detect and revoke permissions granted to this service via OAuth to limit its access.

I asked the guy who appears to be behind it (https://mastodon.social/@librenews) if he could confirm his affiliation, but he doesn't actually seem to be very active on Mastodon (preferring Bluesky) and so he still hasn't responded.

I'm actually a little surprised at how little reaction there's been to this based on how quickly other scrapers were run off the network, but I get that people are busy.

If you want more details, the specifics of my investigation are in this post:

https://cryptography.dog/blog/what-little-i-know-about-readily-news/

...and I'd appreciate if others could corroborate my findings.

#infosec #fediscrapers #scrapers #LLMs #AI

homebrewandhacking@mastodon.ie

@ansuz

Thanks for the heads up. I've put the block in.

Had you considered adding # fediblock to your post? Seems like a moderator scale problem.

@davey_cakes

ansuz@social.cryptography.dog

@Homebrewandhacking

I considered tagging it as such, but given the way the platform/service/malware works it doesn't seem to be a situation where the usual blocking methods will be effective.

I caught their user-agent requesting a file that had only been mentioned in a followers-only post, but I still don't know which of my followers leaked it.

ansuz@social.cryptography.dog

In case anybody who is more deeply familiar with Mastodon's database internals feels like helping to shut this service down:

I think it would be great to have a command instance admins could run to identify which (if any) of the accounts they host have handed over account access to Readily.news.

It achieves access through the OAuth confirmation dialog shown in the attached screenshot
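As a starting point for such a command, here is a rough sketch of the kind of query an admin might adapt. Everything here is an assumption to check against your own instance. The table and column names are a simplified stand-in modelled on the Doorkeeper-style OAuth tables that Mastodon is generally understood to use, the demo rows are invented, and I don't know what name, website or redirect URI the service actually registers with. It runs against an in-memory SQLite database purely to show the join, not against a real Mastodon database.

```python
import sqlite3

# Hypothetical, simplified schema; verify names against your real database.
SCHEMA = """
CREATE TABLE accounts (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE users (id INTEGER PRIMARY KEY, account_id INTEGER);
CREATE TABLE oauth_applications (
    id INTEGER PRIMARY KEY, name TEXT, website TEXT, redirect_uri TEXT);
CREATE TABLE oauth_access_tokens (
    id INTEGER PRIMARY KEY, application_id INTEGER,
    resource_owner_id INTEGER, revoked_at TEXT);
"""

# Local accounts holding an unrevoked token for an app tied to the domain.
FIND_GRANTS = """
SELECT DISTINCT a.username, app.name
FROM oauth_access_tokens t
JOIN oauth_applications app ON app.id = t.application_id
JOIN users u ON u.id = t.resource_owner_id
JOIN accounts a ON a.id = u.account_id
WHERE t.revoked_at IS NULL
  AND (app.website LIKE :pat OR app.redirect_uri LIKE :pat)
ORDER BY a.username
"""


def accounts_with_grants(conn, domain):
    """Return (username, app name) pairs with active grants for `domain`."""
    return conn.execute(FIND_GRANTS, {"pat": f"%{domain}%"}).fetchall()


def build_demo_db():
    """Create an in-memory database filled with invented demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [(1, "alice"), (2, "bob")])
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(10, 1), (11, 2)])
    conn.executemany(
        "INSERT INTO oauth_applications VALUES (?, ?, ?, ?)",
        [(1, "Readily", "https://readily.news", "https://readily.news/cb"),
         (2, "SomeClient", "https://example.org", "https://example.org/cb")])
    conn.executemany(
        "INSERT INTO oauth_access_tokens VALUES (?, ?, ?, ?)",
        [(1, 1, 10, None),          # alice: active grant
         (2, 1, 11, "2025-11-20"),  # bob: grant already revoked
         (3, 2, 11, None)])         # bob: unrelated app
    return conn


if __name__ == "__main__":
    for username, app_name in accounts_with_grants(build_demo_db(), "readily.news"):
        print(username, app_name)
```

Matching on the app's website or redirect URI rather than its display name is a design guess: a name is trivial to change, whereas the redirect URI has to point somewhere the service controls. Revoking the grants would be a separate step and is left out here.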

ansuz@social.cryptography.dog

Alternatively, if anybody else feels they can toot about this in a way that gets more traction, please do!

Maybe I ought to change the headline of my article to "Techbro builds Cambridge Analytica for the Fediverse" or something more inflammatory like that?

jdp23@neuromatch.social

Yikes. Thanks for tracking this. Seems bad!

FYI @moderation we might want to have a look at this.

Also FYI @onepict -- thread starts at https://social.cryptography.dog/@ansuz/115580791098152318.

EDIT: oops, sorry, I now see that you were involved earlier in the thread

@ansuz @Homebrewandhacking

onepict@chaos.social

@jdp23 @moderation @ansuz @Homebrewandhacking It's all cool 😊.

Thank you for thinking of me Jon

But Sheesh there's a pipeline to the fediverse that ignores consent.

It's all about the content aggregation. Because obviously the fediverse only wins if everyone is on it /s 🙄

"So it still needs to be called out, because there's a pipeline to the Fediverse that thinks about the technical aspect first and the wider community a long time afterwards."

https://dotart.blog/cobbles/reflections-on-fedicon-and-foss-guilt-by-association

jdp23@neuromatch.social

Thanks @ansuz for the detailed writeup about readily.news. I can share the link with the guy who appears to be behind it on Bluesky if that would be useful, or alternatively start by engaging with him and asking questions.

@misc @haubles it looks like you were both at the fediforum session on Discovery and the Fediverse (algos, curation, interfaces). Do you happen to remember anything from the discussion about Readily, which at the time was "adding Brave Goggles to it’s daily digest to allow users to choose what gets boosted"?

If you want a clickbait-y title, another option is to include non-consensual, fediverse scraper, and LLM in it. That said, the understated approach of the current title is good too, and its style is more fedi-like. It's always hard to predict which of these incidents get traction and which don't -- and stuff often flies under the radar for a while.
