mnj200g has asked for the wisdom of the Perl Monks concerning the following question:

PERL Newbie. Also, binary newbie. I need to access a binary data file (STDF, Standard Test Data Format; spec at http://etidweb.tamu.edu/cdrom0/image/stdf/spec.pdf). How do I even begin? Can someone explain how to open such a file and then extract the data into a readable form?

I've seen the article with the following code:

    $gifname = "picture.gif";
    open(GIF, $gifname) or die "can't open $gifname: $!";
    binmode(GIF);     # now DOS won't mangle binary input from GIF
    binmode(STDOUT);  # now DOS won't mangle binary output to STDOUT
    while (read(GIF, $buff, 8 * 2**10)) {
        print STDOUT $buff;
    }

What is the "8 * 2**10" doing?

I am working with SunOS 5.8 if that matters.

Thanks.

mnj200g


Re: Need Help: PERL and Binary Data
by GrandFather (Saint) on Dec 06, 2006 at 02:16 UTC

    read(GIF, $buff, 8 * 2**10) reads an 8192-byte block (or the remainder of the file, whichever is smaller) from the file and puts it in $buff.

    You would then typically use unpack and substr to pull apart the buffered data (refilling the buffer as required). You may find that the node "Updated QuickTime format movie file dumper" helps you understand the techniques involved: it does something similar to parse QuickTime files.
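
As a rough illustration of the read-then-unpack approach described above, here is a minimal sketch. The record layout (a 2-byte little-endian length, then a 1-byte type and a 1-byte subtype, then the payload) and the sample bytes are assumptions made up for the example; check the actual field sizes and byte order against the STDF spec before relying on them.

```perl
use strict;
use warnings;

# Build a small in-memory "file" of two records so the sketch is self-contained.
# ASSUMED layout per record: 2-byte little-endian length, 1-byte type,
# 1-byte subtype, then <length> bytes of payload. The values are invented.
my $data = pack('v C C a*', 2, 0, 10, "\x02\x04")
         . pack('v C C a*', 3, 1, 20, "abc");

open(my $fh, '<', \$data) or die "can't open in-memory file: $!";
binmode($fh);

my @records;
while (1) {
    # read returns the number of bytes read, 0 at end of file, undef on error
    my $got = read($fh, my $header, 4);
    die "read failed: $!" unless defined $got;
    last if $got == 0;
    die "truncated header" if $got < 4;

    # Split the 4 header bytes into three numbers
    my ($len, $type, $sub) = unpack('v C C', $header);

    my $payload = '';
    if ($len > 0) {
        $got = read($fh, $payload, $len);
        die "truncated record" unless defined $got && $got == $len;
    }
    push @records, { len => $len, type => $type, sub => $sub, payload => $payload };
}
close($fh);

# Print each record with its payload shown as hex digits
printf "type=%d sub=%d len=%d payload=%s\n",
    $_->{type}, $_->{sub}, $_->{len}, unpack('H*', $_->{payload}) for @records;
```

To try it on a real file, replace the in-memory open with open(my $fh, '<', $filename). If the data turns out to be big-endian, the 'n' template replaces 'v' in the unpack call.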


    DWIM is Perl's answer to Gödel
Re: Need Help: PERL and Binary Data
by Util (Priest) on Dec 06, 2006 at 09:01 UTC
    How do I even begin?

    1. You search CPAN, the central repository for Perl code. This is always the first step, because you are likely not the first person to try your task. Unfortunately, no one has submitted any STDF modules to CPAN (yet).
    2. You search Google (classic) and Google Code. When I searched for combinations of Perl, STDF, and "Standard Test Data Format", I found (and ranked worst-to-best in my opinion):
      • FreeSTDF is an active project to manipulate STDF files.
        • It is written in C.
        • It is BSD-licensed.
      • PySTDF is a *very* active project to manipulate STDF files.
        • It is written in Python.
        • It is an event-based stream parser, and so may be too complex for your purposes.
        • It is GPL-licensed.
      • Michael Hackerott has a large (>9000 lines) body of code to handle STDF files.
        • It is written in Perl.
        • It is dual-licensed like Perl: GPL and Artistic licenses.
        • There is one module, two programs that use the module, and two independent programs.
        • The code is, IMHO, of high quality.
        • It can output an XML-ized version of STDF, for use by other programs.
      • Datalog sells STDF4X, which bears a promising description:
        STDF Binary Datalog Reader - ... In addition to its powerful data locating and filtering features and easy to use GUI, the tool can be used with your favorite scripting language such as Perl or Python to automate the frequently performed tasks. ... Output is optimally formatted for further filtering with regular expressions.
        • It is a commercial product.
        • It has an option to recognize well-known STDF quirks from other vendors, and compensate for them.
        • I have no idea what it costs, but I would bet that, long-term, it is cheaper to buy it instead of writing or adapting equivalent code.
        • It is Windows-only, but you could use Samba or NFS to allow Windows and SunOS to both work on your files.

    From looking at the STDF Specification and Michael Hackerott's code, I judge the format to be of difficulty 5 (out of 10, where 1 is tab separated values, and 10 is MPEG video {and SMB is 11}). The format is hairy enough that I would press Management to buy the STDF4X product, even if only to double-check that my Perl code was completely accurate.

    Hope this helps, and good luck to you.

Re: Need Help: PERL and Binary Data
by grep (Monsignor) on Dec 06, 2006 at 02:14 UTC
    First off, it's perl or Perl. Never PERL.

    To read binary files, start by reading the documentation for open, read and binmode. As for the '8 * 2**10', read about Exponentiation. perldoc.com is an excellent resource.
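
A small sketch of the arithmetic behind '8 * 2**10', in case it helps: ** is Perl's exponentiation operator and binds more tightly than *, so the expression is 8 times 1024. The variable names here are only for illustration.

```perl
use strict;
use warnings;

my $block_size = 8 * 2**10;     # ** is evaluated first: 8 * 1024
my $grouped    = (8 * 2)**10;   # parentheses change the meaning: 16**10

print "8 * 2**10   = $block_size\n";
print "(8 * 2)**10 = $grouped\n";
```

Writing the size this way is just a readable spelling of "8 kilobytes"; a literal 8192 passed to read would behave the same.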

    But really, you should pick up Learning Perl. It will get your Perl experience off to a good start. If you run into trouble, feel free to post here.

    grep
    XP matters not. Look at me. Judge me by my XP, do you?

Re: Need Help: PERL and Binary Data
by Popcorn Dave (Abbot) on Dec 06, 2006 at 04:24 UTC
    If I correctly understand what you're asking, I think you're comparing apples and oranges. A quick scan through the PDF you gave the link to didn't show any graphics, but all text.

    If you're trying to get the text out of the PDF you should look at the PDF modules on CPAN or take advantage of a service that Adobe offers to translate PDF files to text files. Then you could use Perl to do whatever you want with the data.

    Of course if you're trying to open a PDF as a binary file, the other monks have got you started in the right direction.

    Good luck!

    Revolution. Today, 3 O'Clock. Meet behind the monkey bars.