kosmarnik has asked for the wisdom of the Perl Monks concerning the following question:

Hi!

Has anyone got an idea how to test JPEGs for validity on Windows quickly and efficiently? There is a CPAN module, Image::TestJPG, but it's Linux-only.

I tried compiling it... and got nowhere.

It would be nice if the method were fast, as I need to check ~2 million JPEGs.

Thanks!

Replies are listed 'Best First'.
Re: Testing JPEGs for validity (windows)
by ikegami (Patriarch) on Jul 13, 2010 at 05:57 UTC
    It doesn't need Linux, it needs libjpeg. libjpeg can be compiled on Windows.

      Could you point me in the right direction?

      I have GnuWin32 and added the paths to its lib and include directories, but it still complains that the lib is missing. I even tried just copying the files to mydatasrc, and that seemed OK until it got circular: first it asked for jpeglib.h, then jconf.h, etc., until it flat-out said it can't open jpeglib.h anymore.

      Currently I get up to here:

      c:\Perl\test\Image-TestJPG-1.0\Image-TestJPG-1.0>perl Makefile.PL
      Checking if your kit is complete...
      Looks good
      Note (probably harmless): No library found for -ljpeg
      MakeMaker (v6.56)
      Writing Makefile for Image::TestJPG::mydatasrc
      Writing Makefile for Image::TestJPG

      c:\Perl\test\Image-TestJPG-1.0\Image-TestJPG-1.0>nmake

      Microsoft (R) Program Maintenance Utility Version 9.00.21022.08
      Copyright (C) Microsoft Corporation. All rights reserved.

      cp TestJPG.pm blib\lib\Image\TestJPG.pm
      cd mydatasrc && c:\PROGRA~2\MICROS~1.0\VC\BIN\nmake.exe

      Microsoft (R) Program Maintenance Utility Version 9.00.21022.08
      Copyright (C) Microsoft Corporation. All rights reserved.

      cl -c -nologo -GF -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE -DPRIVLIB_LAST_IN_INC -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX -MD -Zi -DNDEBUG -O1 -DVERSION=\"\" -DXS_VERSION=\"\" "-IC:\Perl\lib\CORE" mydatasrc.c
      mydatasrc.c
      c:\perl\test\image-testjpg-1.0\image-testjpg-1.0\mydatasrc\jinclude.h(20) : fatal error C1083: Cannot open include file: 'jconfig.h': No such file or directory
      NMAKE : fatal error U1077: '"c:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\cl.EXE"' : return code '0x2'
      Stop.
      NMAKE : fatal error U1077: 'cd' : return code '0x2'
      Stop.
Re: Testing JPEGs for validity (windows)
by CountZero (Bishop) on Jul 13, 2010 at 06:24 UTC
    And indeed it does not compile on Windows. It dies with errors like TestJPG.xs:47:30: error: macro "PerlProc_setjmp" requires 2 arguments, but only 1 given

    Could that be due to a missing library?

    Have you tried Image::JpegCheck? The docs are unclear whether it will check if the whole of the jpeg-stream is OK, but you can give it a try with some known good and bad files.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      #if defined(_MSC_VER) || defined(WIN32)
      #undef setjmp
      #define setjmp _setjmp
      #endif

      That helped my code compile on Strawberry Perl 5.12. It's not a bullet-proof fix, but I'm already using many Win32-specific switches, so one more doesn't matter.
        Thanks, that helped.
        Now I have to install it ... and have no clue where to put the resulting files. I tried blindly copying them to perl/lib perl/lib but it won't do.

      Image::JpegCheck does what it says: it checks whether the file is a JPEG, not necessarily a valid one.

      I've no clue how to compile on Windows 7 x64. I tried following some guides and made some progress, but could not make it compile :(

      Any other ideas?

        I've no clue how to compile on Windows 7 x64. I tried following some guides and made some progress, but could not make it compile :( Any other ideas?

        Install Strawberry Perl; it comes with libjpeg.

Re: Testing JPEGs for validity (windows)
by kosmarnik (Acolyte) on Jul 13, 2010 at 08:34 UTC

    I sorted the dependencies for libjpeg and fixed the 'cr' error but it still won't compile:

    c:\Perl\test\Image-TestJPG-1.0\Image-TestJPG-1.0\mydatasrc> lib /out:libmydatasrc.lib mydatasrc.obj
    Microsoft (R) Library Manager Version 9.00.21022.08
    Copyright (C) Microsoft Corporation. All rights reserved.

    c:\Perl\test\Image-TestJPG-1.0\Image-TestJPG-1.0>nmake

    Microsoft (R) Program Maintenance Utility Version 9.00.21022.08
    Copyright (C) Microsoft Corporation. All rights reserved.

    c:\PROGRA~2\MICROS~1.0\VC\BIN\nmake.exe -f Makefile all -nologo
    C:\Perl\bin\perl.exe -MExtUtils::Command -e "rm_rf" -- ..\blib\arch\auto\Image\TestJPG\mydatasrc\mydatasrc.lib
    lib -out:..\blib\arch\auto\Image\TestJPG\mydatasrc\mydatasrc.lib mydatasrc.obj
    Microsoft (R) Library Manager Version 9.00.21022.08
    Copyright (C) Microsoft Corporation. All rights reserved.

    C:\Perl\bin\perl.exe -MExtUtils::Command -e "chmod" -- 755 ..\blib\arch\auto\Image\TestJPG\mydatasrc\mydatasrc.lib
    cd ..
    C:\Perl\bin\perl.exe C:\Perl\site\lib\ExtUtils\xsubpp -typemap C:\Perl\lib\ExtUtils\typemap TestJPG.xs > TestJPG.xsc && C:\Perl\bin\perl.exe -MExtUtils::Command -e "mv" -- TestJPG.xsc TestJPG.c
    cl -c -nologo -GF -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE -DPRIVLIB_LAST_IN_INC -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX -MD -Zi -DNDEBUG -O1 -DVERSION=\"1.0\" -DXS_VERSION=\"1.0\" "-IC:\Perl\lib\CORE" TestJPG.c
    TestJPG.c
    TestJPG.xs(47) : warning C4003: not enough actual parameters for macro 'PerlProc_setjmp'
    TestJPG.xs(47) : warning C4013: 'PerlProc_setjmp' undefined; assuming extern returning int
    Running Mkbootstrap for Image::TestJPG ()
    C:\Perl\bin\perl.exe -MExtUtils::Command -e "chmod" -- 644 TestJPG.bs
    C:\Perl\bin\perl.exe -MExtUtils::Mksymlists -e "Mksymlists('NAME'=>\"Image::TestJPG\", 'DLBASE' => 'TestJPG', 'DL_FUNCS' => { }, 'FUNCLIST' => [], 'IMPORTS' => { }, 'DL_VARS' => []);"
    link -out:blib\arch\auto\Image\TestJPG\TestJPG.dll -dll -nologo -nodefaultlib -debug -opt:ref,icf -libpath:"C:\Perl\lib\CORE" -machine:x86 TestJPG.obj mydatasrc/libmydatasrc.lib ... ... a long list of paths snipped ... -def:TestJPG.def
    Creating library blib\arch\auto\Image\TestJPG\TestJPG.lib and object blib\arch\auto\Image\TestJPG\TestJPG.exp
    TestJPG.obj : error LNK2019: unresolved external symbol _PerlProc_setjmp referenced in function _XS_Image__TestJPG_testJPG
    blib\arch\auto\Image\TestJPG\TestJPG.dll : fatal error LNK1120: 1 unresolved externals
    NMAKE : fatal error U1077: '"c:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\link.EXE"' : return code '0x460'
    Stop.
Re: Testing JPEGs for validity (windows)
by dk (Chaplain) on Jul 13, 2010 at 10:50 UTC
    A really quick and dirty hack would be to test if the first two bytes in file are 0xFF and 0xD8.
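A minimal sketch of that two-byte check in core Perl, for illustration only. The sub name starts_like_jpeg is made up here, and the check only tells you the file claims to be a JPEG (it starts with the SOI marker 0xFF 0xD8); it says nothing about whether the rest of the stream decodes.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the file begins with the JPEG SOI marker
# (bytes 0xFF 0xD8). Unreadable or too-short files count as failures.
sub starts_like_jpeg {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or return 0;
    my $head;
    my $n = read($fh, $head, 2);
    close $fh;
    return (defined($n) && $n == 2 && $head eq "\xFF\xD8") ? 1 : 0;
}

# Print every file named on the command line that fails the check.
print "$_\n" for grep { !starts_like_jpeg($_) } @ARGV;
```

Since it reads only two bytes per file, this is cheap enough to run as a first pass before handing the survivors to a real decoder.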
      Nah, that would just be checking whether they're JPEGs. I need to test the integrity, i.e. actually decode the JPEG. That got me thinking: what about using something like ImageMagick to try to decode each file and collect the error report, if it gives one? Got something surefire off the top of your head?
        Well, ImageMagick also requires libjpeg, and so does any other tool that goes through full decoding. If I were you, I'd just apply the patch with _setjmp above to the needed tools.
Re: Testing JPEGs for validity (windows)
by sanbeg (Novice) on Sep 28, 2012 at 21:08 UTC
      Awesome, I'll test soon. The Check_jpeg method is just what I was looking for, as the main problem with the JPEGs I have is truncated files. Now I just need such a tool for TIFFs :) I don't have as many of them (a few hundred thousand), but they're 12 MP, 16-bit, ~70 MiB per file, and ImageMagick is slow and produces too many false positives.
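For truncated files specifically, one cheap pre-filter (not something proposed in this thread, and not a substitute for decoding) is to check that the file ends with the JPEG EOI marker, 0xFF 0xD9. The sketch below uses only core Perl; the sub name ends_with_eoi is invented for illustration. Treat it as a heuristic: some valid JPEGs carry trailing bytes after EOI, and a damaged file can still happen to end with those two bytes.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_END);

# Hypothetical helper: true if the last two bytes of the file are the
# JPEG EOI marker (0xFF 0xD9). Files shorter than 4 bytes cannot hold
# both SOI and EOI, so they are rejected outright.
sub ends_with_eoi {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or return 0;
    my $size = -s $fh;
    if (!defined($size) || $size < 4) {
        close $fh;
        return 0;
    }
    unless (seek($fh, -2, SEEK_END)) {
        close $fh;
        return 0;
    }
    my $tail;
    my $n = read($fh, $tail, 2);
    close $fh;
    return (defined($n) && $n == 2 && $tail eq "\xFF\xD9") ? 1 : 0;
}
```

Because it seeks straight to the end, the cost per file does not depend on file size, which matters at the scale of millions of files.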
        Have you tried the Ping method in ImageMagick?

        Ping() is a convenience method that returns information about an image without having to read the image into memory. It returns the width, height, file size in bytes, and the file format of the image. You can specify more than one filename but only one filehandle:

        ...

        This is a more efficient and less memory-intensive way to query if an image exists and what its characteristics are.