I'm writing a parser that returns the text of PDFs to a swish-search indexer. Records in the character stream are delimited by a "content-length" header, which must be given in bytes or the next header is missed. When using:
use bytes;
my $size = length($txt);
no bytes;
print <<EOF;
content-length: $size
EOF
the "bytes" pragma doesn't appear to work properly when the text output from "pdftotext" contains multi-byte characters. It still seems to count characters instead of bytes, and it reports a smaller size than the same data has when written to a file, thereby throwing off my indexer.
Does anyone know a way around this besides writing the text out to a file and getting its size with stat?
Thanks, Jeff
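One possible way around the temporary file, sketched under some assumptions: this supposes $txt holds a decoded character string and that the indexer expects UTF-8 on its input. The idea is to encode the text explicitly with the core Encode module, take length() of the resulting byte string, and print those same bytes on a raw filehandle, so the header and the payload cannot disagree. The helper name with_content_length is made up for illustration, and the sample string stands in for real pdftotext output.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not from the original post): returns the header line
# and the exact octets it describes, so both come from one encoding step.
sub with_content_length {
    my ($txt) = @_;
    my $octets = encode('UTF-8', $txt);   # character string -> UTF-8 bytes
    my $size   = length $octets;          # length of a byte string is a byte count
    return ("content-length: $size\n", $octets);
}

# Sample text standing in for pdftotext output: 12 characters, 16 UTF-8 bytes.
my $txt = "na\x{ef}ve caf\x{e9} \x{263a}";
my ($header, $octets) = with_content_length($txt);

binmode STDOUT;      # raw output: no encoding layer may change the byte count
print $header, $octets, "\n";
```

If $txt is instead already a raw byte string (for example, read from pdftotext without a decoding layer), plain length($txt) should already be a byte count, and the mismatch is more likely to come from an I/O layer on the output handle.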
In reply to use bytes and length problem by muad33b