tanfur has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, I have a number of CD's with gzipped .MDB files. I'd like to extract the data, without loading it up into MS Access. Is there a module for this? I'm doing it on a Solaris box. It would be -extrmely- labor intensive to load them up manually, so I want to write a script for it. Thanks, tanfur

Replies are listed 'Best First'.
Re: Read static .MDB files?
by baku (Scribe) on Feb 06, 2001 at 03:14 UTC

    Quick search on the domain of doom came up with:

    http://msdn.microsoft.com/library/default.asp?URL=/library/books/dnjet/c1_body_16.htm

    Disclaimer: I didn't browse too deep (This site dislikes Netscape a bit!) so I don't know if they actually provide enough detail for implementation; however, this seems to be the best bet.

    That'll bring you into their manual on the "Jet" (aka "Access") file format; scroll down in the left frame to find the appropriate chapters. I don't know of a module to do this already, but a DBD::MDB object would probably be the way to go (possible derived from DBD::CSV?)...?

    You might also want to look into the formats used by e.g. MS Word; they (Microsoft) tend to use a chunk-structured file format (called OLE like everything else?) similar (...) to PNG or TIFF, so Access likely is quite similar; work toward parsing Word files might have a lot of useful info. for you. One starting-point might be WV

    Another option might be some kind of MDB2CSV|XML filter :-)

    Good luck!


    Amendment: Today's post has an actual ref to an MDB2CSV converter, which should be 'importable' to DBD::CSV. Voila! Thanks to jeroenes for the info

Re: Read static .MDB files?
by MeowChow (Vicar) on Feb 06, 2001 at 04:11 UTC
    When you say "without loading it up into MS Access", do you mean "without interactively running MS Access", or do you mean "without using the MS Access database engine"?

    If they are static files, your best bet is whip up a Win32 Perl script to extract the databases from the gzips, load them up using DBI::ODBC, and them write them out again in the format or database of your choice. Is there a reason you need to actually perform the extraction on Solaris?

       MeowChow                                               
                    print $/='"',(`$^X\144oc $^X\146aq1`)[-2]
Have you had a look yet?
by hotyopa (Scribe) on Feb 06, 2001 at 03:09 UTC
    There certainly is an answer to this oft repeated question.

    See that little box next to the word search at the top of the page? Try typing 'access' in there and have a look at the results.

    (Methinks I assumed to much...)

    Update:It appears I was a little hasty. Without opening the file, you might just need OLE::Storage to read the raw OLE storage streams, but it would be complicated. It would certainly make you question whether it was worth it.

    Have a think about what runrig has suggested.

    *~-}hotyopa{-~*

      We've determined through the CB that he wants to access the data from gzipped files on Unix. DBI won't help, you'd need something like Parse::Access which doesn't AFAIK exist yet.

      Any reason why you can't get the files to a PC with Access on it, then use DBI/DBD::ODBC or Win32::ODBC? Or export the data to a delimited or other format? Maybe even write a script to unzip the files and Net::FTP them to a Windoze box with Access on it? Automate any necessary commands on the Windoze box with Net::Telnet or maybe Expect (I've never used Net::Telnet, but it looks like if you use it you don't need Expect...)?

        As I recall you don't actually need access installed to use *.mdb files. I _think_ I recall someone saying they had used DBD::ODBC to read those files under a unix system. However you may (or may not, try it) need an ODBC driver, but there is supposedly (according to the docs), a ODBC driver included with the DBD::ODBC source. Otherwise there are links to other cross-platform ODBC drivers in those docs.

      Notice how he says he wants to be able to do this *without running Access*. He wants to parse the database file directly.

      Philosophy can be made out of anything. Or less -- Jerry A. Fodor