PerlMonks  

Detecting 'binary' in a variable

by kirbyk (Friar)
on Jul 05, 2005 at 17:39 UTC ( [id://472535]=perlquestion )

kirbyk has asked for the wisdom of the Perl Monks concerning the following question:

I have an application running under Apache/mod_perl where a user can upload a CSV file. The file gets uploaded and sits in a Perl variable, eventually to be loaded into an Oracle CLOB.

I want to detect whether the file they've uploaded is binary or text only, and give the user a helpful error message (for example, if they upload a .xls file). Note that the file never exists on a filesystem, so I can't use any Unix tricks, and I don't want to write out a temp file.

I figure I can go character-by-character in a loop and look at the ASCII values, but that seems horribly inefficient. Is there a quick regex that could do this check? I'm not worried about Unicode characters, but it'd be nice to allow extended ASCII characters through, say, 165 (to get all the accented characters).

-- Kirby, WhitePages.com

Replies are listed 'Best First'.
Re: Detecting 'binary' in a variable
by Transient (Hermit) on Jul 05, 2005 at 17:49 UTC
    Would a simple

    if ( $file =~ /[^\x00-\xA5]/ ) {
        # binary
    } else {
        # text
    }

    suffice?

    Update: Also looks like there's a CGI::UploadEasy method "fileinfo" (in case you're using or could use that module)
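
    A minimal sketch of that check wrapped in a sub. The names looks_binary and looks_binary_strict are made up for illustration. The strict variant is an assumption that goes beyond the regex suggested here: it also rejects control bytes such as NUL, which the original character class accepts as text.

```perl
use strict;
use warnings;

# True if $data contains any byte outside \x00-\xA5.
# This is the character class suggested in the reply above.
sub looks_binary {
    my ($data) = @_;
    return $data =~ /[^\x00-\xA5]/ ? 1 : 0;
}

# Stricter variant (an assumption, not from the thread): allow only
# tab, LF, CR, printable ASCII, and the extended range \x80-\xA5.
sub looks_binary_strict {
    my ($data) = @_;
    return $data =~ /[^\t\n\r\x20-\x7E\x80-\xA5]/ ? 1 : 0;
}
```

    Either sub scans the whole string in a single regex pass, so there is no need for an explicit character-by-character loop.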

      Not always

      $ perl -le '{local$/; $_=<>;}print /^[\x00-\xA5]/ ? "binary" : "text"' \
          /mnt/win/WINDOWS/system32/command.com
      text
        That's correct, but that's not the same regexp:

        /^[\x00-\xA5]/
        ne
        /[^\x00-\xA5]/
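
        To make the difference concrete, here is a small sketch that runs both patterns on the same string. The sub name compare_patterns and the sample strings are illustrative only. The anchored form tests just the first byte; the negated class looks for any byte outside the range anywhere in the string.

```perl
use strict;
use warnings;

# Returns (anchored, negated) match results for the two patterns.
sub compare_patterns {
    my ($data) = @_;
    my $anchored = $data =~ /^[\x00-\xA5]/ ? 1 : 0;  # first byte is in range
    my $negated  = $data =~ /[^\x00-\xA5]/ ? 1 : 0;  # some byte is out of range
    return ($anchored, $negated);
}

for my $sample ("abc", "ab\xE9", "\xE9bc") {
    my ($anchored, $negated) = compare_patterns($sample);
    printf "anchored=%d negated=%d\n", $anchored, $negated;
}
```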
      Thanks, that regex does the trick.

      -- Kirby, WhitePages.com

Re: Detecting 'binary' in a variable
by brian_d_foy (Abbot) on Jul 05, 2005 at 18:01 UTC

    In Perl 5.8, you can open a virtual filehandle on a scalar reference. That might do the trick for you. If you are using an older perl, Tie::Handle::ToMemory does the same thing. You could then use the file test operators, or something like File::Magic. If that doesn't work for you, you can try to match a specific signature for an Excel file (or whatever you might get) with what you see in the uploaded data, but that's a lot more work.

    Good luck!

    --
    brian d foy <brian@stonehenge.com>
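
    A rough sketch of the in-memory filehandle plus signature-matching idea, assuming Perl 5.8 or later. The sub name sniff_signature and the signature table are hypothetical; the two byte sequences are the well-known leading bytes of OLE2 compound documents (the container used by legacy .xls files) and of ZIP archives. A real application would probably want a longer list.

```perl
use strict;
use warnings;

# Leading byte sequences for a couple of common binary formats.
my %SIGNATURES = (
    "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1" => 'OLE2 compound document (e.g. legacy .xls)',
    "PK\x03\x04"                       => 'ZIP archive',
);

# Returns a description if $data starts with a known signature,
# or nothing if no signature matches.
sub sniff_signature {
    my ($data) = @_;

    # Open a read-only filehandle on the scalar itself; no temp file needed.
    open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";
    binmode $fh;
    my $head = '';
    read($fh, $head, 8);    # the longest signature above is 8 bytes
    close $fh;

    for my $sig (keys %SIGNATURES) {
        return $SIGNATURES{$sig} if index($head, $sig) == 0;
    }
    return;
}
```

    For a fixed-length prefix like this, substr($data, 0, 8) would do the same job without a filehandle; the handle is mainly useful when passing the data to code that expects to read from a file.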

Node Type: perlquestion [id://472535]
Approved by xorl