in reply to Multipage TIFFs

Instead of reading the entire file, you should read just the TIFF header in each file. I'm not certain offhand whether that's supported in Image::Magick, but you can get a head start on reading the header here. In essence, you need to scan for the tags in the header that indicate the image boundaries. Reading just the header should be VERY fast.
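
For context, the fixed part of a TIFF header is only eight bytes: a two-byte byte-order mark ('II' or 'MM'), the magic number 42, and the four-byte offset of the first image file directory (IFD). A rough, untested sketch of reading just that much (the file name is made up):

use strict;
use warnings;

# Read only the 8-byte TIFF header: 2-byte byte order, 2-byte magic (42),
# and the 4-byte offset of the first image file directory (IFD).
my $file = 'example.tif';    # hypothetical file name
open my $fh, '<', $file or die "Can't open $file: $!\n";
binmode $fh;
read $fh, my $header, 8;

my ( $order, $rest ) = ( substr( $header, 0, 2 ), substr( $header, 2 ) );
my ( $magic, $ifd_offset ) =
    $order eq 'II' ? unpack( 'v V', $rest )    # little-endian (Intel)
  : $order eq 'MM' ? unpack( 'n N', $rest )    # big-endian (Motorola)
  :                  die "Not a TIFF: unknown byte order\n";

die "Not a TIFF: bad magic number\n" unless $magic == 42;
print "First IFD starts at byte $ifd_offset\n";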
--
I used to drive a Heisenbergmobile, but every time I looked at the speedometer, I got lost.

Replies are listed 'Best First'.
Re^2: Multipage TIFFs
by halley (Prior) on Nov 01, 2007 at 19:53 UTC
    There is an ImageMagick command called "identify" which will read the image just deeply enough to give answers. They call this "pinging" the file, as opposed to reading it completely. I am pretty sure the number of pages or sub-images is reported by this process.
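
    For instance (a rough, untested sketch; the file name is made up), counting the lines that identify prints, one per sub-image, should give the page count without decoding the image data. PerlMagick's Ping is shown as an alternative, under the assumption that it reports one width/height/size/format quadruple per page:

    use Image::Magick;

    my $file = 'scan.tif';    # hypothetical file name

    # The identify utility prints one line per sub-image.
    my @lines = `identify $file`;
    print "$file: ", scalar @lines, " page(s) via identify\n";

    # Ping reads just enough to report width, height, file size and format;
    # assuming one quadruple per page, the count is the list length / 4.
    my $magick = Image::Magick->new;
    my @info   = $magick->Ping($file);
    print "$file: ", scalar(@info) / 4, " page(s) via Ping\n";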

    --
    [ e d @ h a l l e y . c c ]

Re^2: Multipage TIFFs
by oligmd (Sexton) on Nov 01, 2007 at 20:54 UTC
    I played around with Image::ExifTool and had a realization. When I printed all of the tags from a TIFF using Image::ExifTool, it didn't return any tag that explicitly contained the number of pages in the TIFF. It did, however, return tags for each page past page 1, named in a "$tag ($page-1)" format.

    I wrote this ugly piece of code which does the job perfectly:
    use Image::ExifTool;

    sub pageCount {
        my $FILE     = shift;
        my $EXIFTOOL = Image::ExifTool->new;
        my $INFO     = $EXIFTOOL->ImageInfo( $FILE );

        # ExifTool names duplicate tags "ImageWidth (1)", "ImageWidth (2)", ...
        # so probe successive copy numbers until one is missing.
        my $PAGECOUNT = 1;
        my $DONE      = 0;
        while ( $DONE == 0 ) {
            my $TAG = "ImageWidth ($PAGECOUNT)";
            my $VAL = $EXIFTOOL->GetValue( $TAG );
            if ( !$VAL ) {
                $DONE = 1;
            }
            else {
                $PAGECOUNT++;
            }
        }
        return $PAGECOUNT;
    }
    In essence, I keep looking for pages until GetValue returns undefined. It's far from a great solution, but it works.

    Thanks for pointing me in the right direction!

      I see you solved the problem, but since I already wrote this I'll post it for anyone who can use it. The program traverses the TIFF file's image file directory (IFD) chain. Testing on an old 900MHz windoz box, it ripped through 1200 TIFF files, over 2 GB, in 23 seconds.

      use strict;
      use warnings;

      my @files = glob "D:/pics/tiff/*.tif";
      for (@files) {
          my $images = imagecounter($_);
          printf "%40s : %3d \n", $_, $images;
      }

      sub imagecounter {
          my $tiff   = shift;
          my @endian;
          my $offset = 0;
          my $count  = 0;

          open( my $fh, "<", $tiff )
              or do { warn "Can't open $tiff: $!\n"; return 0 };
          binmode $fh;

          # Bytes 0-1 hold the byte order: 'II' (little-endian) or 'MM'
          # (big-endian). Remember the matching unpack templates for the
          # 16-bit and 32-bit reads that follow.
          seek( $fh, $offset, 0 );
          read( $fh, $endian[0], 2 );
          $endian[0] = unpack "a2", $endian[0];
          if ( $endian[0] eq 'II' ) {
              $endian[1] = 'v';
              $endian[2] = 'V';
          }
          elsif ( $endian[0] eq 'MM' ) {
              $endian[1] = 'n';
              $endian[2] = 'N';
          }
          else {
              warn "Unknown byte order in $tiff, possible bad file\n";
              return 0;
          }

          # Bytes 4-7 hold the offset of the first image file directory (IFD).
          seek( $fh, 4, 0 );
          read( $fh, $offset, 4 );
          $offset = unpack "$endian[2]", $offset;

          # Each IFD starts with a 2-byte entry count, followed by 12-byte
          # entries, then a 4-byte offset to the next IFD (0 means no more
          # pages). Count IFDs until the chain ends.
          while ($offset) {
              $count++;
              my $length;
              seek( $fh, $offset, 0 );
              read( $fh, $length, 2 );
              $length = unpack "$endian[1]", $length;
              seek( $fh, $length * 12, 1 );
              read( $fh, $offset, 4 );
              $offset = unpack "$endian[2]", $offset;
              if ( $count > 100 ) {
                  warn "Count too high in $tiff, possible bad file\n";
                  return 0;
              }
          }
          close $fh;
          return $count;
      }