lihao has asked for the wisdom of the Perl Monks concerning the following question:

Hi, monks:

What's the Perl equivalent (a module) of the 'file' command on a Linux box for grabbing file information? For example:

bash ~> file mypic.jpg
mypic.jpg: PC bitmap data, Windows 3.x format, 640 x 480 x 24

Many thanks

lihao

Replies are listed 'Best First'.
Re: Question: module to grab file information??
by rafl (Friar) on Apr 30, 2008 at 16:04 UTC

    CPAN has several modules that can guess a file's type, like File::MimeInfo and File::MimeInfo::Magic. However, most of those modules can't give you much more information than the file type. Therefore I prefer using File::Extractor, which can collect detailed metadata for lots of file types, like the resolution of image files, or the artist, title, etc. of audio files, and much more. The only downside is that it requires an external C library: libextractor.

    Also note that I might be quite a bit biased as I'm the author of File::Extractor.

      Thanks. I need more information than the MIME type, so File::Type and File::MimeInfo don't seem to fit. I tried to install your module through the cpan shell, but the build fails:

      cpan> install ExtUtils::PkgConfig
      ...
      Can't locate ExtUtils/PkgConfig.pm in @INC ....
      ...
      cpan> install ExtUtils::PkgConfig
      ...snip ok info...
      cpan> install File::Extractor
      Running install for module File::Extractor
      Running make for F/FL/FLORA/File-Extractor-0.03.tar.gz
      Is already unwrapped into directory /root/.cpan/build/File-Extractor-0.03
      Makefile.PL returned status 65280
      Running make test
      Make had some problems, maybe interrupted? Won't test
      Running make install
      Make had some problems, maybe interrupted? Won't install

      I installed libextractor (http://gnunet.org/libextractor/download/Extractor-0.5.tar.gz) by following the instructions (i.e. running "python setup.py install"; everything looked OK), but I still get the same error. Any idea? Many thanks...

      lihao

        CPAN's error message is caused by the cpan shell caching some state. Restarting the shell and trying to install the module again should fix that.

        Also, it looks like you installed the Python bindings for libextractor but not libextractor itself. The library itself is most likely what you want.

Re: Question: module to grab file information??
by kyle (Abbot) on Apr 30, 2008 at 16:01 UTC

    I haven't used File::Type myself, but it looks like what you want. You can also use backticks to call file, if you want.

    Update: Actually, I have used File::Type, but it was so long ago, I forgot. It worked fine at the time.

Re: Question: module to grab file information??
by tachyon-II (Chaplain) on Apr 30, 2008 at 16:27 UTC

    Depending on your task (is it getting image size, by chance?), you may find that a module like Image::Size or Image::Info does the trick. If you want to see how it's done, here it is in Perl. Essentially, just read the header, see what it matches, then unpack the size data.

    #!/usr/bin/perl -w
    use strict;

    # Return (width, height, type) for GIF, JPEG, PNG or BMP data.
    sub image_size {
        return unless $_[0];
        my ($width, $height, $sig);
        if ( $_[0] =~ m/^GIF8..(....)/s ) {
            $sig = 'GIF';
            # "v" = 16-bit little-endian, independent of the host byte order
            ($width, $height) = unpack( "vv", $1 );
        }
        elsif ( $_[0] =~ m/^\xFF\xD8.{4}JFIF/s ) {
            $sig = 'JPEG';
            # the SOF0 segment stores height before width, both big-endian
            ($height, $width) = unpack( "nn", $1 ) if $_[0] =~ /\xFF\xC0...(....)/s;
        }
        elsif ( $_[0] =~ /^\x89PNG\x0d\x0a\x1a\x0a/ ) {
            $sig = 'PNG';
            ($width, $height) = unpack( "NN", $1 ) if $_[0] =~ /IHDR(.{8})/s;
        }
        elsif ( $_[0] =~ /^BM.{16}(.{8})/s ) {
            $sig = 'BMP';
            # "V" = 32-bit little-endian
            ($width, $height) = unpack( "VV", $1 );
        }
        return $width, $height, $sig;
    }

    for my $img ( qw( c:/sample.bmp c:/sample.jpg c:/sample.gif c:/sample.png ) ) {
        my $data = get_file($img);
        my @res  = image_size($data);
        print "$img $res[2] width $res[0] height $res[1]\n";
    }

    # Slurp a whole file in binary mode.
    sub get_file {
        open my $fh, '<', $_[0] or die $!;
        binmode $fh;
        local $/;
        <$fh>;
    }

    __END__
    c:/sample.bmp BMP width 512 height 384
    c:/sample.jpg JPEG width 100 height 149
    c:/sample.gif GIF width 110 height 58
    c:/sample.png PNG width 256 height 192
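
    For the BMP case alone, the same header-matching idea can be sketched without slurping the whole file. This is only an illustration, not part of the original post: the helper names bmp_size_from_header and bmp_size are made up, and it assumes the 40-byte "Windows 3.x" info header, where width and height are signed 32-bit little-endian values at byte offsets 18 and 22. Older OS/2-style headers store 16-bit sizes and would need different handling.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: extract (width, height) from the first 26 bytes
# of a BMP file. Assumes a Windows 3.x style info header.
sub bmp_size_from_header {
    my ($header) = @_;
    return unless defined $header && length($header) >= 26;
    return unless substr( $header, 0, 2 ) eq 'BM';
    # skip 18 bytes, then read two signed 32-bit little-endian integers
    my ( $width, $height ) = unpack 'x18 l< l<', $header;
    # a negative height only means the rows are stored top-down
    return ( $width, abs $height );
}

# Hypothetical helper: read just the header bytes from a file on disk.
sub bmp_size {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $got = read $fh, my $header, 26;
    close $fh;
    return unless defined $got;
    return bmp_size_from_header($header);
}

# Demo on a synthetic header built in memory (no file needed):
# 14-byte file header followed by the start of a 40-byte info header.
my $fake = 'BM' . pack( 'V x4 V', 0, 54 ) . pack( 'V l< l< v v', 40, 640, 480, 1, 24 );
printf "%d x %d\n", bmp_size_from_header($fake);
```

    Reading 26 bytes is enough because the size fields end at offset 25, so large images cost no more than small ones.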

      Hi, tachyon-II:

      You are right. I have used GD::Image to grab the size information for about 99% of the image files. Only those of type "PC bitmap data, Windows 3.x format" are not available through GD::Image, but I can grab this missing information by running the Linux 'file' command anyway. I just want to know if there is a Perl module which can handle this, instead of using code like the one below (untested):

      my $imginfo = qx[ file $myimg ];
      my ($width, $height) = ($imginfo =~ m{(\d+) x (\d+) x \d+});

      thanks.

      lihao
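
      If shelling out to 'file' stays in the picture, a slightly more defensive variant of the snippet above might look like the sketch below. The helper names dims_from_file_output and file_description are invented for illustration. It assumes a file(1) that accepts -b (brief output, no file name prefix) and uses a list-form pipe open so the file name never passes through a shell.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull "W x H" out of a line of file(1) output such as
# "mypic.jpg: PC bitmap data, Windows 3.x format, 640 x 480 x 24".
# Returns an empty list when no dimensions are present.
sub dims_from_file_output {
    my ($line) = @_;
    return unless defined $line;
    return $line =~ /(\d+) x (\d+)/ ? ( $1, $2 ) : ();
}

# Hypothetical helper: run file(1) on one path and return its first
# output line. The list form of open bypasses the shell, so spaces or
# quotes in $path cannot be interpreted as shell syntax.
sub file_description {
    my ($path) = @_;
    open my $pipe, '-|', 'file', '-b', '--', $path
        or die "Cannot run file: $!";
    my $line = <$pipe>;
    close $pipe;
    return $line;
}

# Demo on the sample output from the question (no external command run)
my $sample = 'mypic.jpg: PC bitmap data, Windows 3.x format, 640 x 480 x 24';
my ( $w, $h ) = dims_from_file_output($sample);
print "$w x $h\n";
```

      With a real file, the two helpers combine as dims_from_file_output( file_description($myimg) ); an empty list back means 'file' reported no dimensions for that type.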

        Does the code above not work? Is Image::Size not a module that does this?

Re: Question: module to grab file information??
by cdarke (Prior) on Apr 30, 2008 at 18:47 UTC
    There is a file "command" written in Perl as part of the ppt (Perl Power Tools) package. I don't have any experience with it, though.