bontchev has asked for the wisdom of the Perl Monks concerning the following question:

Hello enlightened ones.

An OLE2 file (e.g., the documents produced by Word, Excel, etc.) is a "hierarchical file system in a file" with its own "files" (called "streams"), "directories" (called "storages"), FAT, clusters, etc. The package OLE::Storage provides a very nice interface to the internal structures of the OLE2 files. For instance, you can enumerate the streams and storages, open them, read them, etc.

Each stream or storage has various properties - like name, size, date, time and so on. In particular, it has a property called CLSID.

In practice, the CLSID is a 16-byte value associated with each stream or storage. For a project of mine, I need to obtain the CLSID of the root storage. The problem is, I can't figure out how to obtain it with OLE::Storage. :-(

According to the documentation, the package provides the method clsid() - as in $clsid == $D->clsid($pps). But I can't figure out how to use it. :-(

I tried

print $Doc->clsid($pps) if ($Doc->is_root($pps));

but something bizarre gets printed - like

OLE::Storage::Property=HASH(0x1c5a718)

Any ideas how to use this package properly to obtain the CLSID?

Re: How to get the CLSID with OLE::Storage?
by erroneousBollock (Curate) on Nov 10, 2007 at 05:48 UTC
    As per the docs, if you're sure you've got a scalar property, call the string method on it:

    use OLE::Storage::Property;
    use strict;
    my $clsid = $Doc->clsid($pps);
    die "not a scalar property" unless is_scalar($clsid);
    print "CLSID => ".$clsid->string."\n";
    Is that what you're looking for?

    -David

      > Is that what you're looking for?

      Not quite but your message suggested to me what the proper solution is, thanks. The solution is to apply the string() method to the CLSID returned by the package:

      print $Doc->clsid($pps)->string()

      works perfectly.

      Regards, Vesselin

        Interestingly, OLE::Storage does something bizarre with the byte endianness of the CLSID... For instance, in one document the CLSID of the root storage consists of the following byte sequence:

        D8 F4 50 30 B5 98 CF 11 BB 82 00 AA 00 BD CE 0B

        (I can see it with a hex editor.) For this, $Doc->clsid($pps)->string() returns

        3050F4D8-98B5-11CF-BB82-00AA00BDCE0B

        In other words, it has assumed that the CLSID consists of a little-endian DWORD, a little-endian WORD, another little-endian WORD, a big-endian WORD, and 6 bytes (or is it 3 big-endian WORDs?). This is a bug, IMHO. The CLSID is just a sequence of 16 bytes (no endianness) and should be returned as such.
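The byte shuffling described above can be reproduced with plain pack/unpack, without OLE::Storage. The sketch below is only an illustration: the helper names guid_string, raw_hex and guid_to_bytes are made up for this example and are not part of the package. It decodes the 16 bytes the way a Windows GUID structure is conventionally laid out (one little-endian DWORD, two little-endian WORDs, then 8 bytes in file order), which appears to be what string() does, and it also shows how to get the untouched byte sequence back if that is what you need.

```perl
use strict;
use warnings;

# The 16 bytes of the root storage CLSID, in file order (as seen in the hex editor).
my $raw = pack 'H*', 'D8F45030B598CF11BB8200AA00BDCE0B';

# Hypothetical helper: format the bytes as a GUID string.
# Layout assumed: little-endian DWORD, two little-endian WORDs,
# then 8 bytes taken in file order.
sub guid_string {
    my ($bytes) = @_;
    my ($d1, $d2, $d3, @d4) = unpack 'V v v C8', $bytes;
    return sprintf '%08X-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X',
        $d1, $d2, $d3, @d4;
}

# Hypothetical helper: plain hex dump of the bytes in file order.
sub raw_hex {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes;
}

# Hypothetical helper: turn a GUID string back into the 16 file-order bytes.
sub guid_to_bytes {
    my ($guid) = @_;
    my ($d1, $d2, $d3, $d4a, $d4b) = split /-/, $guid;
    return pack('V v v', hex($d1), hex($d2), hex($d3))
         . pack('H*', $d4a . $d4b);
}

print guid_string($raw), "\n";
print raw_hex($raw), "\n";
```

With the bytes from the example document, the first line printed is 3050F4D8-98B5-11CF-BB82-00AA00BDCE0B and the second is D8 F4 50 30 B5 98 CF 11 BB 82 00 AA 00 BD CE 0B. So if only the string form is available, guid_to_bytes recovers the original byte sequence from it.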

        Regards,
        Vesselin

        > print $Doc->clsid($pps)->string()

        Err, that's exactly the same as what I wrote. :-)

        -David