sysadmins/webmasters of fedi:
-
Since so many people are boosting this thread I think I'll take the opportunity to mention that I'm available for hire on a part-time or contract basis.
Feel free to reach out if you like my ideas about computer-related topics and have both the budget and need of someone who has such ideas.
I can be reached by Fediverse DM or the contact form on my website:
I just published a blog post summing up my most pertinent thoughts about dealing with badly-behaved web-scraping bots:
https://cryptography.dog/blog/AI-scrapers-request-commented-scripts/
It isn't exactly a Hallowe'en-themed article, but today is the 31st and the topic is concerned with pranking people who come knocking on my website's ports, so it's somewhat appropriate.
#infosec #bots #halloween #scrapers #AI #someMoreHashtagsHere
-
I just published a blog post summing up my most pertinent thoughts about dealing with badly-behaved web-scraping bots:
https://cryptography.dog/blog/AI-scrapers-request-commented-scripts/
It isn't exactly a Hallowe'en-themed article, but today is the 31st and the topic is concerned with pranking people who come knocking on my website's ports, so it's somewhat appropriate.
#infosec #bots #halloween #scrapers #AI #someMoreHashtagsHere
looks like someone reshared my article to Hacker News where someone has (predictably) already commented on the headline without reading the article 😅
-
looks like someone reshared my article to Hacker News where someone has (predictably) already commented on the headline without reading the article 😅
it's like reddit if every user was a meth-head tech bro
-
looks like someone reshared my article to Hacker News where someone has (predictably) already commented on the headline without reading the article 😅
my system for catching anomalous HTTP traffic just flagged someone sending me an API key as an "x-api-key" header.
that's about what I'd expect from the average HN reader
-
it's like reddit if every user was a meth-head tech bro
@ansuz Some of the replies/comments in that thread are just insane!
Scraping is fine, even when there's a robots.txt (and other signs) saying not to, but DDoSing is illegal and you should sue people who do that.
Just ... what ?!?
Some of these people are in a totally different reality.
No, not reality ... the other thing.
-
@ansuz Some of the replies/comments in that thread are just insane!
Scraping is fine, even when there's a robots.txt (and other signs) saying not to, but DDoSing is illegal and you should sue people who do that.
Just ... what ?!?
Some of these people are in a totally different reality.
No, not reality ... the other thing.
@ColinTheMathmo oh yea, a lot of them have very clearly had way too much silicon-valley-kool-aid to drink and were basically incoherent.
I was actually pleasantly surprised at a few others, though, I can't help but feel that the more sensible ones might be wasting their time commenting on HN and should just join fedi
-
@ColinTheMathmo oh yea, a lot of them have very clearly had way too much silicon-valley-kool-aid to drink and were basically incoherent.
I was actually pleasantly surprised at a few others, though, I can't help but feel that the more sensible ones might be wasting their time commenting on HN and should just join fedi
Definitely a "take the rough with the smooth" thing.
I don't tend to read HN commentary on my stuff. (Sometimes, but not always)
-
sysadmins/webmasters of fedi:
I am looking for suggestions of which search engine crawlers I should consider permitting in my robots.txt file.
There can definitely be value in having a site indexed by a search engine, but I would like to deliberately exclude all of those which are using the same data to train LLMs and other genAI. More specifically, I would only like to allow those which have an explicit stance against training on others data in this fashion.
Currently I reject everything other than Marginalia (https://marginalia-search.com/). Are there any others I should consider?
@ansuz If Marginalia is already there then:
https://alpha.mwmbl.org
https://stract.com -
@ColinTheMathmo oh yea, a lot of them have very clearly had way too much silicon-valley-kool-aid to drink and were basically incoherent.
I was actually pleasantly surprised at a few others, though, I can't help but feel that the more sensible ones might be wasting their time commenting on HN and should just join fedi
@ansuz @ColinTheMathmo this has been my experience also in the rare cases when my writing gets shared on HN (or lobste.rs or reddit, FWIW): there a bunch of ignorami who are full-in on the corporate propaganda, and a few sane voices that try to highlight how bad that is for everyone. 1/2
-
@ansuz @ColinTheMathmo this has been my experience also in the rare cases when my writing gets shared on HN (or lobste.rs or reddit, FWIW): there a bunch of ignorami who are full-in on the corporate propaganda, and a few sane voices that try to highlight how bad that is for everyone. 1/2
On the one hand, I agree that they're mostly wasting their time and would do better to spend it elsewhere; OTOH, I am glad for the work they do, and who knows, maybe their words reach others that can then do better.
2/2
-
@ColinTheMathmo oh yea, a lot of them have very clearly had way too much silicon-valley-kool-aid to drink and were basically incoherent.
I was actually pleasantly surprised at a few others, though, I can't help but feel that the more sensible ones might be wasting their time commenting on HN and should just join fedi
@ansuz responding to someone on hacker news that they're too good for HN and should join fedi would be absolutely hilarious to no one but me
-
@ansuz responding to someone on hacker news that they're too good for HN and should join fedi would be absolutely hilarious to no one but me
@rune I might have tried it if I had an account there, but then imagine if I was wrong and that was their one good take. fedi equivalent of buyer's remorse