in reply to Searching for Base64

Generally, Text::CSV is the right way to handle CSV files; I'm not sure what exactly you don't like about it. Here's an example:

use strict;
use warnings;
use Text::CSV;
use MIME::Base64;

my $csv = Text::CSV->new( { binary => 1, eol => "\n" } );
while ( my $row = $csv->getline( \*DATA ) ) {
    for (@$row) {
        $_ = decode_base64($_)
          if m{
            ^
            (?: [A-Za-z0-9+/]{4} )*
            (?:
                [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] =
              | [A-Za-z0-9+/] [AQgw] ==
            )?
            $
          }x;
    }
    $csv->print( \*STDOUT, $row );
}
__DATA__
1,Just some data,c29tZSBzdHJpbmc=
2,VGhlIG5leHQgY29sdW1uIGlzIG5vdCBhY3R1YWxseSBCYXNlNjQgZW5jb2RlZA==,Oops

Note that "Oops" is itself valid Base64, so you can't always distinguish a plain string from Base64-encoded data.
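To make the ambiguity concrete, here is a small sketch that wraps the same pattern in a helper and runs it over a few strings. The name looks_like_base64 is made up for this illustration; only MIME::Base64 is needed.

```perl
use strict;
use warnings;
use MIME::Base64;

# Hypothetical helper: returns 1 if the string is syntactically valid
# Base64 according to the pattern used in the example, 0 otherwise.
sub looks_like_base64 {
    my ($str) = @_;
    return $str =~ m{
        ^
        (?: [A-Za-z0-9+/]{4} )*
        (?:
            [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] =
          | [A-Za-z0-9+/] [AQgw] ==
        )?
        $
    }x ? 1 : 0;
}

# An ordinary word, a phrase with spaces, and a real Base64 string.
for my $word ( 'Oops', 'Just some data', 'c29tZSBzdHJpbmc=' ) {
    printf "%-18s %s\n", $word,
        looks_like_base64($word) ? 'matches' : 'no match';
}
```

Any four-character word built only from letters and digits passes the test, so a check like this can only say that a field could be Base64, not that it is.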

Re^2: Searching for Base64
by horsechoker (Initiate) on May 12, 2010 at 21:50 UTC
    I appreciate the help - but I'm a bit confused as to where you're opening the csv file with this example.

      In this example I'm not opening any files. I'm using the special DATA filehandle, which lets me read data embedded in the script below the __DATA__ line. See also SelfLoader.

        So the code works ALMOST as intended -- it parses the CSV, looks for Base64-encoded fields and decodes them. The script then creates a new CSV file like the first, only with the Base64 decoded. The problem is in the way the Text::CSV module parses data from the original CSV file. It seems to read it in column by column instead of row by row, making it difficult for me to edit and format the data correctly. Here is the code I have so far. I'm parsing any CSV file with Base64-encoded data. Here is the file I'm trying to parse:
        A1,A2,A3,A4
        B1,B2,B3,B4
        1,Just some data,c29tZSBzdHJpbmc=
        C1,C2,C3,C4
        D1,VGhpcyBpcyBkZWNvZGVkIQ==,D3,D4,GibberishdlkfjsDKjdlslksJAoiasosaSDSD==
        Here is the code I'm using to parse the data:
        #csvDecode.pl VERSION 1
        #Written by Andrew Hoyt
        #Takes LogData from the MSS Portal and decodes the base64 characters (assuming there is any base64 to be decoded, and if so, that it's in column eleven).
        #!/usr/bin/perl
        use strict;
        use warnings;
        use Text::CSV;
        use MIME::Base64;

        my $i=0;
        my @columns;
        my @cleanData;
        my @input;
        my $file = 'data.csv';
        my $csv = Text::CSV->new();
        open (CSV, "<", $file) or die $!;
        my $string;
        while (<CSV>) {
            if ($csv->parse($_)) {
                @columns = $csv->fields();
                #my $numCols = $#columns;
                #print "($numCols)\n";
            } else {
                my $err = $csv->error_input;
                print "Failed to parse line: $err";
            }
            foreach (@columns){
                $_ = check($_);
                $cleanData[$i] = $_;
                $i++;
                print $_;
            }
            print "\n";
        }
        close CSV;
        open (MYFILE, ">", "output.csv") or die $!;
        foreach (@cleanData){
            print MYFILE "$_,";
            #print MYFILE "\n";
        }
        #print MYFILE "\n";
        close MYFILE;

        sub check{
            my $col = shift;
            $col = decode_base64($col)
              if m{
                ^
                (?: [A-Za-z0-9+/]{4} )*
                (?:
                    [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] =
                  | [A-Za-z0-9+/] [AQgw] ==
                )?
                $
              }x;
            return $col;
        }
        Unfortunately I can't format my output correctly. It all gets put on one line with no line breaks, and I can't seem to figure out a way to newline the output correctly. Here is the output I get with the above code. Please note that this output is being written to a csv file called output.csv
        A1,A2,A3,A4,B1,B2,B3,B4,1,Just some data,some string,C1,C2,C3,C4,D1,This is decoded!,D3,D4,GibberishdlkfjsDKjdlslksJAoiasosaSDSD==,
        Any idea how I can format my output correctly? I just need it to newline correctly like the original.
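One possible approach, sketched here rather than taken from the thread: the script above copies every field into one flat @cleanData array, so the row boundaries are already gone by the time output.csv is written. Writing each row as soon as it has been parsed, and letting Text::CSV add the line ending through its eol option, keeps the structure. The helper names maybe_decode and decode_csv are made up for this sketch, and the demo uses in-memory filehandles in place of data.csv and output.csv. Note also that the pattern is bound to the argument explicitly with =~, whereas check() above matches against $_ and only works because the foreach loop happens to alias $_ to the same field.

```perl
use strict;
use warnings;
use Text::CSV;
use MIME::Base64;

# Hypothetical helper: decode a field only if it looks like Base64.
sub maybe_decode {
    my ($field) = @_;
    return $field unless $field =~ m{
        ^
        (?: [A-Za-z0-9+/]{4} )*
        (?:
            [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] =
          | [A-Za-z0-9+/] [AQgw] ==
        )?
        $
    }x;
    return decode_base64($field);
}

# Hypothetical helper: copy CSV from $in to $out one row at a time,
# so each input row becomes exactly one output line.
sub decode_csv {
    my ( $in, $out ) = @_;
    my $csv = Text::CSV->new( { binary => 1, eol => "\n" } )
        or die "Cannot use Text::CSV: " . Text::CSV->error_diag;
    while ( my $row = $csv->getline($in) ) {
        $csv->print( $out, [ map { maybe_decode($_) } @$row ] );
    }
    return;
}

# Demo on in-memory filehandles; open real files here instead if needed.
my $input = "A1,A2,A3,A4\n1,Just some data,c29tZSBzdHJpbmc=\n";
open my $in,  '<', \$input     or die $!;
open my $out, '>', \my $output or die $!;
decode_csv( $in, $out );
close $out;
print $output;
```

By default Text::CSV quotes fields that contain spaces, so the decoded text may come out wrapped in double quotes; the quote_space option controls that behaviour.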