blog!
-
blog! “Removing "/Subtype /Watermark" images from a PDF using Linux”
Problem: I've received a PDF which has a large "watermark" obscuring every page.
Investigating: Opening the PDF in LibreOffice Draw allowed me to see that the watermark was a separate image floating above the others.
Manual Solution: Hit page down, select image, delete, repeat 500 times. …
Read more: https://shkspr.mobi/blog/2026/01/removing-subtype-watermark-images-from-a-pdf-using-linux/
⸻
#LLM #pdf #python
-
@Edent Replacing the content with whitespace works just as well without having to fix up the /Length.
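A minimal sketch of the whitespace approach, assuming the content stream has already been decompressed and that marked-content sections are not nested (so the first EMC closes the /Artifact section). The names `ARTIFACT_SPAN` and `blank_watermarks` and the sample stream are made up for illustration. Because every removed byte is replaced by a space, the stream keeps its original byte count and the /Length entry stays valid.

```python
import re

# Assumption: decompressed stream, no nested marked content, and no literal
# "EMC" inside a string, so the non-greedy match ends at the right operator.
ARTIFACT_SPAN = re.compile(rb"/Artifact\b.*?\bEMC\b", re.DOTALL)


def blank_watermarks(stream: bytes) -> bytes:
    """Overwrite each watermark /Artifact ... EMC span with spaces of equal length."""

    def blank(match: re.Match) -> bytes:
        span = match.group(0)
        # Other artifacts (page numbers, headers) are left untouched.
        return b" " * len(span) if b"/Watermark" in span else span

    return ARTIFACT_SPAN.sub(blank, stream)


# Hypothetical content stream: a page image, a watermark, then some text.
sample = (
    b"q 1 0 0 1 0 0 cm /Im0 Do Q\n"
    b"/Artifact <</Subtype /Watermark /Type /Pagination>> BDC\n"
    b"q /Fm0 Do Q\n"
    b"EMC\n"
    b"BT /F1 12 Tf (Hello) Tj ET\n"
)
cleaned = blank_watermarks(sample)
```

Whitespace is insignificant between operators in a content stream, so the padded result should draw the same page minus the watermark.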
-
@Edent Hi! The watermarks are everything from /Artifact (this is actually the first argument to the BMC operator) to EMC. They don't have a length, but once you remove them you have to fix up the /Length on the enclosing content stream. However! Since you decompressed the stream, you can just set the /Length to 0, and any PDF viewer can figure it out.
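As a rough sketch of the removal described above: delete each watermark section from a decompressed content stream, then recompute the byte count for the /Length entry rather than relying on a viewer to repair it. This assumes non-nested marked content; `ARTIFACT_SPAN`, `strip_watermarks` and the sample stream are hypothetical. When the tag carries a property dictionary such as `<</Subtype /Watermark>>`, the operator that follows it is BDC rather than BMC, which is the form used in the sample.

```python
import re

# Assumption: decompressed stream, no nested marked content. The trailing \s*
# also swallows the line break left behind after EMC.
ARTIFACT_SPAN = re.compile(rb"/Artifact\b.*?\bEMC\b\s*", re.DOTALL)


def strip_watermarks(stream: bytes) -> tuple[bytes, int]:
    """Delete watermark /Artifact ... EMC spans; return the new stream and its length."""

    def drop(match: re.Match) -> bytes:
        span = match.group(0)
        # Keep artifacts that are not watermarks.
        return b"" if b"/Watermark" in span else span

    cleaned = ARTIFACT_SPAN.sub(drop, stream)
    return cleaned, len(cleaned)


# Hypothetical content stream with a watermark between two text runs.
sample = (
    b"BT /F1 12 Tf (Hello) Tj ET\n"
    b"/Artifact <</Subtype /Watermark /Type /Pagination>> BDC\n"
    b"q /Fm0 Do Q\n"
    b"EMC\n"
    b"BT /F1 12 Tf (World) Tj ET\n"
)
cleaned, new_length = strip_watermarks(sample)

# The value to write back into the stream dictionary.
length_entry = b"/Length %d" % new_length
```

If /Length is an indirect reference (for example `/Length 6 0 R`), the number to update lives in that other object instead.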
-
@Edent Given you have vibe coded a script to remove the watermark, can we also assume that the 'large watermark' is there to tell everyone that the contents of the PDF were originally generated by an LLM?
If so, can't have viewers seeing that, can we?
-
@marjolica eh? I've no idea what you're talking about.
It was a pre-print book with a publisher's watermark.
-