horsechoker has asked for the wisdom of the Perl Monks concerning the following question:

Hello Friends,

I'm trying to do the following:

1. Open CSV File.

2. Parse text and search for any Base64 encoded text.

3. Take that text and decode it using the MIME::Base64 package.

4. Replace encoded text with decoded text.

I'm running into some issues here. I first tried using the Text::CSV package but found it a bit difficult to use (I'm a bit new to Perl, and had issues accessing data directly in the array that Text::CSV created when reading the CSV file).

The main issue I'm having is finding a legit way to just search the document for Base64 encoded text. I've exhausted my googling skills and have come to you gents for advice.

Anyone know of a way in which I can search a document for Base64 encoded text? I've been trying to use the following regex which I found on this forum. I have no idea how to implement it into my code....

m{ ^ (?: [A-Za-z0-9+/]{4} )* (?: [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] = | [A-Za-z0-9+/] [AQgw] == )? \z }x
If needed, I can post my code thus far. Thanks!

Re: Searching for Base64
by zwon (Abbot) on May 12, 2010 at 21:27 UTC

    Generally Text::CSV is the right way to handle CSV files; I'm not sure what exactly you don't like about it. Here's an example:

    use strict;
    use warnings;
    use Text::CSV;
    use MIME::Base64;

    my $csv = Text::CSV->new( { binary => 1, eol => "\n" } );
    while ( my $row = $csv->getline( \*DATA ) ) {
        for (@$row) {
            $_ = decode_base64($_)
              if m{ ^ (?: [A-Za-z0-9+/]{4} )* (?: [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] = | [A-Za-z0-9+/] [AQgw] == )? $ }x;
        }
        $csv->print( \*STDOUT, $row );
    }
    __DATA__
    1,Just some data,c29tZSBzdHJpbmc=
    2,VGhlIG5leHQgY29sdW1uIGlzIG5vdCBhY3R1YWxseSBCYXNlNjQgZW5jb2RlZA==,Oops

    Note that "Oops" is valid Base64, so you can't always distinguish between a plain string and Base64-encoded data.

      I appreciate the help - but I'm a bit confused as to where you're opening the csv file with this example.

        In this example I'm not opening any files. I'm using the special DATA filehandle, which allows me to read data embedded in the script below the __DATA__ line. See also SelfLoader.
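
        To read from a real file instead of DATA, the same loop can be pointed at a filehandle from open. The following is only a sketch: the helper name decode_csv_file and the file names in the usage comment are placeholders, and it skips empty fields because the pattern also matches the empty string.

        ```perl
        use strict;
        use warnings;
        use Text::CSV;
        use MIME::Base64;

        # Same strict Base64 pattern as in the example above.
        my $BASE64_RE = qr{ ^ (?: [A-Za-z0-9+/]{4} )* (?: [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] = | [A-Za-z0-9+/] [AQgw] == )? \z }x;

        # Hypothetical helper: copy $in_path to $out_path, decoding every
        # field that matches the Base64 pattern.
        sub decode_csv_file {
            my ( $in_path, $out_path ) = @_;
            my $csv = Text::CSV->new( { binary => 1, eol => "\n" } );

            open my $in,  '<', $in_path  or die "Cannot open $in_path: $!";
            open my $out, '>', $out_path or die "Cannot open $out_path: $!";

            while ( my $row = $csv->getline($in) ) {
                for (@$row) {
                    # Skip empty fields; decode only fields matching the pattern.
                    $_ = decode_base64($_) if $_ ne '' && $_ =~ $BASE64_RE;
                }
                $csv->print( $out, $row );
            }

            close $in;
            close $out or die "Cannot close $out_path: $!";
        }

        # Usage with placeholder file names:
        # decode_csv_file( 'input.csv', 'output.csv' );
        ```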

Re: Searching for Base64
by ChiefAl (Initiate) on May 12, 2010 at 21:02 UTC

    I think an example of the text you are trying to parse would be useful. But for something as simple as a CSV file, split() usually does the trick. Then compare each field against the character class for Base64-encoded characters. When a match is found, convert it. Store everything in a new array and print that out when you are done.

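    use MIME::Base64;

    # $string should contain the 'slurped' file
    my @newfields;
    foreach ( split( /\s*,\s*/, $string ) ) {
        # Allow up to two trailing '=' so that padded Base64 fields also match
        s/^([A-Za-z0-9+\/]+={0,2})$/decode_base64($1)/e;
        push( @newfields, $_ );
    }
    my $newfile = join( ',', @newfields );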

    The problem you are probably going to have is that Base64 text looks a lot like normal text (that's kind of the point of it). You need something to identify a field as Base64 text; otherwise a sentence or any word will also look like Base64 text.
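
    One possible heuristic, offered only as a sketch: treat a field as Base64 only if it matches the strict pattern from the question and its decoded bytes are all printable ASCII. The helper name maybe_decode and the printable-ASCII criterion are assumptions for illustration, not something MIME::Base64 provides. Ordinary words whose decoded bytes happen to be printable will still slip through.

    ```perl
    use strict;
    use warnings;
    use MIME::Base64;

    # Strict Base64 pattern from the original question.
    my $BASE64_RE = qr{ ^ (?: [A-Za-z0-9+/]{4} )* (?: [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] = | [A-Za-z0-9+/] [AQgw] == )? \z }x;

    # Hypothetical helper: return the decoded text if $field looks like
    # Base64 that decodes to printable ASCII; otherwise return it unchanged.
    sub maybe_decode {
        my ($field) = @_;
        return $field unless length($field) && $field =~ $BASE64_RE;

        my $decoded = decode_base64($field);

        # Assumption: real payloads are plain text, so reject binary results.
        return $field if $decoded =~ /[^\x20-\x7E\t\r\n]/;
        return $decoded;
    }

    print maybe_decode('c29tZSBzdHJpbmc='), "\n";  # decoded: "some string"
    print maybe_decode('Oops'), "\n";              # kept: decoded bytes are not printable
    print maybe_decode('Just some data'), "\n";    # kept: does not match the pattern
    ```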