in reply to Finding Rectangles in PDFs
I'm not aware of any canned module for extracting black rectangles from PDFs. I could be wrong of course, but I'm afraid you'll have to do some low-level coding yourself, similar in spirit to what you've already tried, i.e. searching the content streams for re commands, or certain combinations of line+fill operators.
Personally, when it comes to low-level messing with PDFs, I'm a big fan of pdftk, as it allows easy uncompressing of the PDF's content streams. For example, doing the following
$ pdftk GBA.pdf output - uncompress | grep ' re$' | wc
   4730   23650  132978
counts 4730 rectangle drawing instructions (though they may of course not all qualify as redaction rectangles...)
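To narrow that count down to rectangles that are actually filled in black, a rough sketch along the following lines might be a starting point. The function name black_rects is made up for illustration, and the script rests on several assumptions: each operator sits on its own line (as the grep above also assumes), colours are set only with g, rg or k, and "black" means exactly 0 gray, 0 0 0 RGB or 0 0 0 1 CMYK. Colours set via cs/sc/scn, patterns, and clipping are ignored, so treat the result as a candidate list rather than a definitive answer.

```shell
#!/bin/sh
# Hypothetical helper: print "x y w h" for every rectangle that gets
# filled while the current fill colour is black.
# Reads uncompressed content-stream text on stdin.
# Assumption: one operator per line.
black_rects() {
  awk '
    # The initial fill colour in a PDF content stream is black.
    BEGIN { black = 1 }

    # q/Q save and restore the graphics state, including the fill colour.
    NF == 1 && $1 == "q" { stack[++sp] = black }
    NF == 1 && $1 == "Q" { if (sp > 0) black = stack[sp--] }

    # Track the fill colour (gray, RGB, CMYK operators only).
    NF == 2 && $NF == "g"  { black = ($1 == 0) }
    NF == 4 && $NF == "rg" { black = ($1 == 0 && $2 == 0 && $3 == 0) }
    NF == 5 && $NF == "k"  { black = ($1 == 0 && $2 == 0 && $3 == 0 && $4 == 1) }

    # Remember rectangles appended to the current path.
    NF == 5 && $NF == "re" { pending[++n] = $1 " " $2 " " $3 " " $4 }

    # A fill operator paints the pending path.
    NF == 1 && $1 ~ /^(f|F|f\*|B|B\*|b|b\*)$/ {
      if (black) for (i = 1; i <= n; i++) print pending[i]
      n = 0
    }

    # Stroke-only and no-op path endings discard the pending rectangles.
    NF == 1 && ($1 == "n" || $1 == "S" || $1 == "s") { n = 0 }
  '
}
```

With that defined, something like `pdftk GBA.pdf output - uncompress | black_rects | wc -l` should give a count of the candidates, and dropping the `wc -l` lists their coordinates.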
Anyhow, what's the idea behind extracting those rectangles, i.e. what are you really trying to do? Maybe there is some entirely different approach to solving your problem.
Replies are listed 'Best First'.
Re^2: Finding Rectangles in PDFs by jffry (Hermit) on Jan 13, 2010 at 14:50 UTC
Re^2: Finding Rectangles in PDFs by Anonymous Monk on Jan 13, 2010 at 16:50 UTC