in reply to Re^2: Searching for Base64
in thread Searching for Base64

In this example I'm not opening any files; I'm using the special DATA filehandle, which lets me read data embedded in the script below the __DATA__ line. See also SelfLoader.
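As a minimal sketch of that idea (the helper name read_embedded_lines and the two sample rows are made up for illustration), everything after the __DATA__ marker can be read through the DATA filehandle just like an opened file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read_embedded_lines() is a hypothetical helper for this sketch.
# It reads every line that follows the __DATA__ marker in this same file.
sub read_embedded_lines {
    my @lines;
    while ( my $line = <DATA> ) {
        chomp $line;          # strip the trailing newline
        push @lines, $line;
    }
    return @lines;
}

my @lines = read_embedded_lines();
print "$_\n" for @lines;

__DATA__
A1,A2,A3,A4
B1,B2,B3,B4
```

To work on a real file, the same loop should work with a filehandle returned by open in place of DATA.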

Re^4: Searching for Base64
by Anonymous Monk on May 25, 2010 at 11:35 UTC
    So the code works ALMOST as intended: it parses the CSV, looks for base64-encoded fields, and decodes them. The script then creates a new CSV file like the first, only with the base64 decoded. The problem seems to be in the way the Text::CSV package parses data from the original CSV file. It seems to read it in by column instead of by row, which makes it difficult for me to edit and format the data correctly. My code so far is below. It should handle any CSV file with base64-encoded data; here is the file I'm trying to parse:
    A1,A2,A3,A4
    B1,B2,B3,B4
    1,Just some data,c29tZSBzdHJpbmc=
    C1,C2,C3,C4
    D1,VGhpcyBpcyBkZWNvZGVkIQ==,D3,D4,GibberishdlkfjsDKjdlslksJAoiasosaSDSD==
    Here is the code I'm using to parse the data:
    #csvDecode.pl VERSION 1
    #Written by Andrew Hoyt
    #Takes LogData from the MSS Portal and decodes the base64 characters (assuming there is any base64 to be decoded, and if so, that it's in column eleven).
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::CSV;
    use MIME::Base64;

    my $i=0;
    my @columns;
    my @cleanData;
    my @input;
    my $file = 'data.csv';
    my $csv = Text::CSV->new();
    open (CSV, "<", $file) or die $!;
    my $string;

    while (<CSV>) {
        if ($csv->parse($_)) {
            @columns = $csv->fields();
            #my $numCols = $#columns;
            #print "($numCols)\n";
        } else {
            my $err = $csv->error_input;
            print "Failed to parse line: $err";
        }
        foreach (@columns){
            $_ = check($_);
            $cleanData[$i] = $_;
            $i++;
            print $_;
        }
        print "\n";
    }
    close CSV;

    open (MYFILE, ">", "output.csv") or die $!;
    foreach (@cleanData){
        print MYFILE "$_,";
        #print MYFILE "\n";
    }
    #print MYFILE "\n";
    close MYFILE;

    sub check{
        my $col = shift;
        $col = decode_base64($col) if m{
            ^
            (?: [A-Za-z0-9+/]{4} )*
            (?:
                [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] =
              |
                [A-Za-z0-9+/] [AQgw] ==
            )?
            $
        }x;
        return $col;
    }
    Unfortunately I can't format my output correctly. It all gets put on one line with no line breaks, and I can't figure out how to break the output into lines correctly. Here is the output I get with the above code. Note that this output is written to a CSV file called output.csv.
    A1,A2,A3,A4,B1,B2,B3,B4,1,Just some data,some string,C1,C2,C3,C4,D1,This is decoded!,D3,D4,GibberishdlkfjsDKjdlslksJAoiasosaSDSD==,
    Any idea how I can format my output correctly? I just need it to newline correctly like the original.
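One possible way to keep the row structure, sketched here under some assumptions: each input line is decoded and written out as soon as it is parsed, instead of collecting every field into one flat array. The helper names decode_row and decode_file are hypothetical, and the quote_space => 0 option assumes a reasonably recent Text::CSV; without it, fields containing spaces would come back wrapped in quotes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
use MIME::Base64;

# Same pattern as check() in the post: the whole field must be padded base64.
my $base64_re = qr{
    ^ (?: [A-Za-z0-9+/]{4} )*
      (?: [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] =
        | [A-Za-z0-9+/] [AQgw] == )? $
}x;

# Hypothetical helper: parse one CSV line, decode any base64 fields,
# and return the re-joined line (no trailing newline). Returns nothing
# if the line cannot be parsed or re-joined.
sub decode_row {
    my ( $csv, $line ) = @_;
    return unless $csv->parse($line);
    my @fields = $csv->fields();
    for my $field (@fields) {
        $field = decode_base64($field)
            if length($field) && $field =~ $base64_re;
    }
    return unless $csv->combine(@fields);
    return $csv->string();
}

# Hypothetical helper: copy $in_path to $out_path one row at a time,
# so every input line produces exactly one output line.
sub decode_file {
    my ( $in_path, $out_path ) = @_;
    my $csv = Text::CSV->new( { binary => 1, quote_space => 0 } )
        or die "Cannot create Text::CSV object";
    open my $in,  '<', $in_path  or die "$in_path: $!";
    open my $out, '>', $out_path or die "$out_path: $!";
    while ( my $line = <$in> ) {
        chomp $line;
        my $row = decode_row( $csv, $line );
        if ( defined $row ) {
            print {$out} "$row\n";    # newline per row keeps the layout
        }
        else {
            warn "Failed to parse line: $line\n";
        }
    }
    close $in;
    close $out or die "$out_path: $!";
}
```

Calling decode_file('data.csv', 'output.csv') with the file names from the post should then give one output line per input line. Note that any field that happens to look like valid base64 (a four-letter word, for instance) would also be decoded, just as with the original check().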