If I remember correctly, there is (supposed to be) a central table near the end of the PDF file, the cross-reference (xref) table, that lists all of the actual objects in the file and their byte offsets. You will need to read the offsets from there rather than scanning for them with a regex: unless you actually parse every object in the PDF, you have no guarantee that a binary stream will not happen to contain a byte sequence that looks like the beginning of an object. This may seem spectacularly unlikely for ordinary files, but it becomes a serious problem if you are handling untrusted and potentially malicious input, where someone can plant such a sequence on purpose.
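To make the idea concrete, here is a minimal sketch in Python (not the language of the original question, just a convenient one for illustration). It assumes a classic, uncompressed xref table located through the final `startxref` keyword; it does not handle cross-reference streams, incremental updates, or damaged files. The function name `read_xref_table` is my own invention, not part of any library.

```python
import re

def read_xref_table(data):
    """Return {object number: byte offset} for the in-use objects.

    Assumption: `data` is the whole file as bytes and it ends with a
    classic xref table (not a cross-reference stream).
    """
    # The last 'startxref' keyword is followed by the offset of the table.
    pos = data.rfind(b"startxref")
    if pos < 0:
        raise ValueError("no startxref keyword found")
    m = re.match(rb"startxref\s+(\d+)", data[pos:])
    if not m:
        raise ValueError("malformed startxref")
    xref_pos = int(m.group(1))
    if not data.startswith(b"xref", xref_pos):
        raise ValueError("no classic xref table at the stated offset")

    lines = data[xref_pos:].splitlines()
    offsets = {}
    i = 1  # lines[0] is the 'xref' keyword itself
    while i < len(lines):
        # Subsection header: "<first object number> <entry count>"
        header = re.fullmatch(rb"\s*(\d+)\s+(\d+)\s*", lines[i])
        if not header:
            break  # reached 'trailer' or something unexpected
        first, count = int(header.group(1)), int(header.group(2))
        if i + count >= len(lines):
            raise ValueError("xref subsection is truncated")
        for n in range(count):
            # Entry: 10-digit offset, 5-digit generation, 'n' (in use) or 'f' (free)
            entry = re.match(rb"(\d{10}) (\d{5}) ([nf])", lines[i + 1 + n])
            if not entry:
                raise ValueError("malformed xref entry")
            if entry.group(3) == b"n":
                offsets[first + n] = int(entry.group(1))
        i += 1 + count
    return offsets
```

With offsets taken from the table, a stream that happens to contain something like `9 0 obj` is never mistaken for a real object, because you only ever look at the positions the table names. For untrusted input you would still want to check that each offset lies inside the file and really points at the expected `N G obj` header.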
In reply to Re: Calculated position incorrect when using regex in text file that also contains binary info
by jcb
in thread Calculated position incorrect when using regex in text file that also contains binary info
by geertvc