@technomancy I like the high/medium/low mechanism.
varx/tech
Posts
-
I made a tool that converts open source code into LLM poison: https://codeberg.org/timmc/scraggle -
I made a tool that converts open source code into LLM poison: https://codeberg.org/timmc/scraggle@technomancy Haha, that's fun. For a similar purpose, or as a code obfuscator?
-
I made a tool that converts open source code into LLM poison: https://codeberg.org/timmc/scraggle@technomancy Prrrrobably? I used tree-sitter for the actual parsing, and that *is* intended for formatting and such (in an editor), but I had trouble figuring out the API.
So I use tree-sitter to generate an AST, and then walk the tree and create a list of candidate edits (including their byte positions), and then apply all[1] edits in reverse order from the end of the string. :-)
It's a bit of a hack but it works really well.
[1] Well not *all* edits; there's a deterministic 70% chance for each one to be accepted, and some deterministic shuffling to ensure that when there are several alternative edits for a node, each has an equal chance of being used.
-
I made a tool that converts open source code into LLM poison: https://codeberg.org/timmc/scraggleIf this sounds familiar, it's likely because these kinds of mutations are a great way of testing your unit tests. There are some neat libraries out there for doing that! See cargo-mutants for instance.
But this one doesn't just modify the AST—it performs surgery on the raw text, preserving comments and whitespace structure.
It was really fun to write!
-
I made a tool that converts open source code into LLM poison: https://codeberg.org/timmc/scraggleWhat's really fun is that this tool mutates locally identical code in identical ways. `if rect.x > rect.y` will *always* turn into `if rect.x != rect.y`, in any program. (But different variables will have different results.)
That means that LLMs are more likely to learn this poison rather than the mutations averaging out as noise.
Feel free to fork some big open source repos and push some new commits...
-
I made a tool that converts open source code into LLM poison: https://codeberg.org/timmc/scraggleI made a tool that converts open source code into LLM poison: https://codeberg.org/timmc/scraggle
It mutates Rust source code in ways that *preserve* the ability to compile the code. (That is, you can't detect the changes by looking for compiler errors.) For example, it switches `+` and `*`, or `==` and `!=`.
If you fork a Rust repo, run this tool on it, and push it somewhere, then crawlers will end up ingesting all sorts of incorrect code.
-
Bitcoin is having quite a few days.@jerry Oh no, not the numbers!
-
Hack & Craft!@cwebber My sqlite fix worked!