rpinnam has asked for the wisdom of the Perl Monks concerning the following question:

I have my input csv file as below:
RS,52,,,P,,402,,,,,,,,HK,,900,G,,,,,,,1,1,1,CM,,,,,Wearing,,,44,DD,,,J +PYK RAd,CON,,St,567,JPYK,,,Kana,Kana,456,JPKY,,1111111111 RC,CNDY,1,5657,WM 85PLY,112,43200 RC,VN,1,6633,WM 8PL5L,389,2010 RC,VN,1,1921,WMNS T40PL ,289,2010
I want my output into two parts. First part is duplicating the entire row of RS header and RAd header with the number that matches with the entire row of RC. Suppose header RC has 3 rows , I want RS and RAd to have 3 rows and then combine each line as below:
RS,52,,,P,,402,,,,,,,,HK,,900,G,,,,,,,1,1,1,CM,,,,,Wearing,,,44,DD,,,J +PYK RAd,CON,,St,567,JPYK,,,Kana,Kana,456,JPKY,,1111111111 RC,CNDY,1,5657,WM 85PLY,112,43200 RS,52,,,P,,402,,,,,,,,HK,,900,G,,,,,,,1,1,1,CM,,,,,Wearing,,,44,DD,,,J +PYK RAd,CON,,St,567,JPYK,,,Kana,Kana,456,JPKY,,1111111111 RC,VN,1,6633,WM 8PL5L,389,2010 RS,52,,,P,,402,,,,,,,,HK,,900,G,,,,,,,1,1,1,CM,,,,,Wearing,,,44,DD,,,J +PYK RAd,CON,,St,567,JPYK,,,Kana,Kana,456,JPKY,,1111111111 RC,VN,1,1921,WMNS T40PL ,289,2010
Lastly I want to join the above lines of RS,RAd and RC header.Final output should look like the one below:
RS,52,,,P,,402,,,,,,,,HK,,900,G,,,,,,,1,1,1,CM,,,,,Wearing,,,44,DD,,,J +PYK, RAd,CON,,St,567,JPYK,,,Kana,Kana,456,JPKY,,1111111111, RC,CNDY,1 +,5657,WM 85PLY,112,43200 RS,52,,,P,,402,,,,,,,,HK,,900,G,,,,,,,1,1,1,CM,,,,,Wearing,,,44,DD,,,J +PYK, RAd,CON,,St,567,JPYK,,,Kana,Kana,456,JPKY,,1111111111, RC,VN,1,6 +633,WM 8PL5L,389,2010 RS,52,,,P,,402,,,,,,,,HK,,900,G,,,,,,,1,1,1,CM,,,,,Wearing,,,44,DD,,,J +PYK, RAd,CON,,St,567,JPYK,,,Kana,Kana,456,JPKY,,1111111111, RC,VN,1,1 +921,WMNS T40PL ,289,2010

Replies are listed 'Best First'.
Re: duplicating the rows and joining three lines into one using perl script
by graff (Chancellor) on Oct 07, 2015 at 02:04 UTC
    That seems pretty simple and straightforward. What have you tried? Is there something stopping you from trying?
      Yes .I have tried importing this csv into excel. I am not able to duplicate the rows and join. Any help would be appreciated. Here is my code:
      #!/usr/bin/perl -w use strict; use Spreadsheet::WriteExcel; my $workbook = new Spreadsheet::WriteExcel("test.xls"); my $sheet = $workbook->add_worksheet(); my $moneyFormat = $workbook->add_format(); $moneyFormat->set_num_format('$#,##0'); my $boldFormat = $workbook->add_format(); $boldFormat->set_bold(); my $row=0; my @longest=(0,0,0,0,0,0,0); open(MOV, "<test.csv") || die "Can't open movies.csv: $!"; while(<MOV>) { chomp; my @fields = split(/,/, $_); for(my $col = 0; $col < @fields; $col++) { if ($row == 0) { $sheet->write($row, $col, $fields[$col], $boldFo +rmat); } else { if ($col == 2 or $col == 3) { $sheet->write($row, $col, $fields[$col], $moneyForma +t); } else { $sheet->write($row, $col, $fields[$col]) +; } } if ($longest[$col] < length($fields[$col])) { $longest[$col] = length($fields[$col]); } } $row++; } $longest[2]+=3; $longest[3]+=3; for(my $i = 0; $i < @longest; $i++) { $sheet->set_column($i,$i,$longest[$i]); } close(MOV); $workbook->close();
        Thank you for using code tags. There is absolutely no need to bring Excel into this process. If I understand correctly, here's what you want to do (in pseudo-code):
        while ( reading from the input file yields a line of data ) { if ( the line begins with "RS," ) { replace the newline character(s) at the end of the line with a c +omma save this line in a variable called $line_out } elsif ( the line begins with "RAd," ) { replace the final newline character(s) with a comma (just like a +bove) append this line to $line_out } elsif ( the line begins with "RC," ) { print $line_out with the current line appended to it. } }
        Rendering that in actual perl syntax is pretty simple, and will be less verbose than what I've shown. Let us know if you have a problem with that step.

        UPDATE: I added a more specific condition to get into the third block (for printing output), and I added commas to the regex-match conditions for all three blocks, just to be "safe".

        Another update: I didn't intend to be obtuse - here's what I meant in actual perl code:

        use strict; use warnings; my $line_out; open ( MOV, "<", "test.cxv" ) or die "test.csv: $!\n"; while (<MOV>) { if ( /^RS,/ ) { s/\s+$/,/; $line_out = $_; } elsif ( /^RAd,/ ) { s/\s+$/,/; $line_out .= $_; } elsif ( /^RC,/ ) { print $line_out . $_; } }
Re: duplicating the rows and joining three lines into one using perl script
by Anonymous Monk on Oct 07, 2015 at 02:07 UTC
      Sure, that module could be used, but the problem does not require any parsing of the CSV lines (except identifying the type of line with the two or three first characters), it actually does not even need to know that the lines are CSV.

      Using a CSV module for that seems IMHO to be overkill.