Piero Bosio Social Web Site Personale

@technomancy I like the high/medium/low mechanism.

@technomancy Haha, that's fun. For a similar purpose, or as a code obfuscator?

@technomancy Prrrrobably? I used tree-sitter for the actual parsing, and that *is* intended for formatting and such (in an editor), but I had trouble figuring out the API.

So I use tree-sitter to generate an AST, walk the tree to build a list of candidate edits (including their byte positions), and then apply all[1] edits in reverse order, starting from the end of the string. :-)

It's a bit of a hack but it works really well.

[1] Well not *all* edits; there's a deterministic 70% chance for each one to be accepted, and some deterministic shuffling to ensure that when there are several alternative edits for a node, each has an equal chance of being used.
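
A minimal sketch of the "apply edits in reverse order" step, to show why working from the end of the string keeps earlier byte offsets valid. The `Edit` struct and `apply_edits` function are hypothetical names for illustration, not taken from scraggle, and the byte offsets are written by hand here instead of coming from tree-sitter.

```rust
/// A candidate text edit: replace the bytes in `start..end` with `replacement`.
/// (Hypothetical type for illustration; not scraggle's actual data structure.)
struct Edit {
    start: usize,
    end: usize,
    replacement: &'static str,
}

/// Apply non-overlapping edits starting from the end of the string, so that a
/// replacement that changes the text length never shifts an offset that is
/// still waiting to be used.
fn apply_edits(source: &str, mut edits: Vec<Edit>) -> String {
    let mut out = source.to_string();
    // Sort by start offset, largest first.
    edits.sort_by(|a, b| b.start.cmp(&a.start));
    for e in edits {
        // Everything outside start..end (comments, whitespace) is untouched.
        out.replace_range(e.start..e.end, e.replacement);
    }
    out
}

fn main() {
    let src = "if x > y { x + y } // keep me";
    let edits = vec![
        Edit { start: 5, end: 6, replacement: "!=" }, // `>` becomes `!=`
        Edit { start: 13, end: 14, replacement: "*" }, // `+` becomes `*`
    ];
    println!("{}", apply_edits(src, edits));
}
```

Applying the `+` edit first means the one-byte-longer `!=` replacement cannot invalidate its offset, and the trailing comment survives because only the edited byte ranges are rewritten.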

If this sounds familiar, it's likely because these kinds of mutations are a great way of testing your unit tests. There are some neat libraries out there for doing that! See cargo-mutants for instance.

But this one doesn't just modify the AST—it performs surgery on the raw text, preserving comments and whitespace structure.

It was really fun to write!

What's really fun is that this tool mutates locally identical code in identical ways. `if rect.x > rect.y` will *always* turn into `if rect.x != rect.y`, in any program. (But different variables will have different results.)
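
One way to get that property, sketched under assumptions: derive every decision from a hash of the local source text alone. This is not scraggle's actual code. The `fnv1a` and `choose` names are hypothetical, FNV-1a is just a convenient fixed hash, and picking an alternative by modulo stands in for the deterministic shuffling mentioned elsewhere in the thread. The 70% acceptance rate is the figure from the thread.

```rust
/// FNV-1a: a small fixed hash, so results are the same on every run and machine.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Decide from the local source text alone whether to mutate, and which
/// alternative to use. Returns `None` when the candidate edit is skipped.
/// (Hypothetical helper; an assumption about how such a tool could work.)
fn choose<'a>(local_text: &str, alternatives: &[&'a str]) -> Option<&'a str> {
    if alternatives.is_empty() {
        return None;
    }
    let h = fnv1a(local_text.as_bytes());
    // Accept roughly 70% of candidates, always the same ones.
    if h % 100 >= 70 {
        return None;
    }
    // Use the upper bits to pick among the alternatives.
    let idx = ((h >> 32) as usize) % alternatives.len();
    Some(alternatives[idx])
}

fn main() {
    let alternatives = [">=", "<", "!="];
    // Same local text gives the same decision in any program;
    // different variable names hash differently and may decide differently.
    for text in ["rect.x > rect.y", "rect.x > rect.y", "a > b"] {
        println!("{:?} -> {:?}", text, choose(text, &alternatives));
    }
}
```

Because the input to the hash is only the text of the expression, two unrelated repositories containing the same expression receive the same mutation.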

That means that LLMs are more likely to learn this poison, rather than having the mutations average out as noise.

Feel free to fork some big open source repos and push some new commits...

#scraggle #RustLang #LLMPoisoning

I made a tool that converts open source code into LLM poison: https://codeberg.org/timmc/scraggle

It mutates Rust source code in ways that *preserve* the ability to compile the code. (That is, you can't detect the changes by looking for compiler errors.) For example, it switches `+` and `*`, or `==` and `!=`.
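
As an illustration only, the two swaps named above can be written as a lookup. The `swap_operator` function is a hypothetical name, and the tool's real mutation table is not shown in this thread, so nothing beyond these two pairs is assumed.

```rust
/// Map an operator to its swapped counterpart, covering only the two pairs
/// mentioned in the post. (Hypothetical helper, not scraggle's actual table.)
fn swap_operator(op: &str) -> Option<&'static str> {
    match op {
        "+" => Some("*"),
        "*" => Some("+"),
        "==" => Some("!="),
        "!=" => Some("=="),
        // Anything else is left alone in this sketch.
        _ => None,
    }
}

fn main() {
    for op in ["+", "*", "==", "!=", "&&"] {
        println!("{:?} -> {:?}", op, swap_operator(op));
    }
}
```

Swaps like these tend to keep code compiling because `==` and `!=` both come from `PartialEq`, and numeric types implement both addition and multiplication; types that support only one of `+` and `*` would be an exception.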

If you fork a Rust repo, run this tool on it, and push it somewhere, then crawlers will end up ingesting all sorts of incorrect code.

#scraggle #RustLang #LLMPoisoning

@jerry Oh no, not the numbers!

@cwebber My sqlite fix worked!
