The title makes this problem seem much easier than it is. I know what I want to do programmatically but I am still new to Perl and putting what I want to do into Perl code is where I am getting hung up. The following code takes an Excel and PDF version of one file and takes what I need and outputs a CSV file.
#!/usr/bin/perl -w use warnings; use strict; use PDF::API2; use CAM::PDF; use CAM::PDF::PageText; use String::Util ':all'; use Spreadsheet::ParseXLSX; use Text::CSV; use Date::Format; use DateTime::Format::DateParse; my $pdfFilename = "<insert PDF file name>"; my $pdf = CAM::PDF->new($pdfFilename); my $text = $pdf->getPageText(1); my $endString = "DELIVERY"; my $headIndex = index($text, $endString); my $newText = substr($text, 0, $headIndex); my $header = trim($newText); my $amdNumStart = "NO. "; my $amdNumEnd = "GAS"; my $amdNumStartIndex = index($header, $amdNumStart)+3; my $amdNumEndIndex = index($header, $amdNumEnd); my $amdNumString = substr($header, $amdNumStartIndex, $amdNumEndIndex +- $amdNumStartIndex); my $amdNum = trim($amdNumString); my $poolNumStart = "30202"; my $poolNumEnd = "TIGER"; my $poolNumStartIndex = index($header, $poolNumStart); my $poolNumEndIndex = index($header, $poolNumEnd); my $poolNumString = substr($header, $poolNumStartIndex, $poolNumEndInd +ex - $poolNumStartIndex); my $poolNum = trim($poolNumString); my $dateStart = "DATED "; my $dateStartIndex = index($header, $dateStart)+6; my $dateEndIndex = index($header, $dateStart); my $dateString = substr($header, $dateStartIndex, $dateEndIndex - $dat +eStartIndex); my $reportDate = trim($dateString); my @date = split(' ', $reportDate); my $month = $date[0]; my $day = $date[1]; my $comma = chop($day); my $year = $date[2]; my $dateJoin = join(' ', $month,$day,$year); my $parsedDate = DateTime::Format::DateParse->parse_datetime($dateJoin +); my $billMonth = $parsedDate->strftime('%m%d%Y'); my $fileName = "<insert Excel file name>"; my $parser = Spreadsheet::ParseXLSX->new(); my $workbook = $parser->parse($fileName); my $outfile = "NMGC_Snapshot_$billMonth.csv"; die $parser->error(), ".\n" if (!defined $workbook); open(my $out, '>', $outfile) or die "Could not open $outfile' $!\n"; my $csv = Text::CSV->new(); my @fields = ('Customer Name','Street Address','City','Station Num','C +ustomer Num','Premise Num','Point MDQ','Priority Class','Rate Schedul +e','Consolidated Station Num'); $csv->combine($amdNum, $poolNum, $billMonth,'','','','','','',''); print $out $csv->string, "\n"; $csv->combine(@fields); print $out $csv->string, "\r\n\r\n"; for my $worksheet($workbook->worksheet(0)){ my($firstRow, $lastRow) = $worksheet->row_range(); for my $row(6 .. $lastRow){ #array of cell values my $RowCellValues = []; for (my $col = 1; $col < 10; $col++) { my $cell = $worksheet->get_cell($row, $col); next unless $cell; push($RowCellValues, $cell->value()); } my $status = $csv->print ($out, $RowCellValues); print $out "\r\n"; } }
Every once in awhile in the file there is a blank line. Once my code hits the blank line it needs to go back up one row and see if it is a consolidated station. If that is true it takes the number from that line and needs to write that number onto the end of every line up to the previous blank line. If the line right above the blank line is not a consolidated station i need to write the station number from each individual account onto the end of each line up to the previous blank line. This first example is a piece of what is currently being output to my CSV file. The first two lines are header lines.
87,30202,05012014,,,,,,, "Customer Name","Street Address",City,"Station Num","Customer Num","Pr +emise Num","Point MDQ","Priority Class","Rate Schedule","Consolidated + Station Num" "Customer1","123 Main St.",City A,48,801,6,5,2,54 "Customer2","123 Main St.","City S",22,614,79,3,2,54 "Customer3","123 Main St.","City S",23,615,95,4,2,54 "Customer4","123 Main St.","City S",24,616,66,1,2,54 "Customer5","123 Main St.","City S",25,617,14,2,2,54 "Customer6","123 Main St.","City S",26,618,22,2,2,54 "CUSTOMER CONSOLIDATED STATION",,,20,,,, "Customer7","123 Main St.",City A,16,305,16,94,1,54 "Customer8","123 Main St.","City P",28,214,43,1,2,54 "Customer9","123 Main St.","City P",29,215,99,143,2,54 "Customer10","123 Main St.","City P",30,216,39,21,2,54 "Customer Consolidated Station",31 "Customer11","123 Main St.",City A,70,902,25,7,2,54 "Customer12","123 Main St.",City A,71,903,72,6,2,54
Here is what I am needing the final output to be. You will see the consolidated station numbers added to the end of each customer record.
87,30202,05012014,,,,,,, "Customer Name","Street Address",City,"Station Num","Customer Num","Pr +emise Num","Point MDQ","Priority Class","Rate Schedule","Consolidated + Station Num" "Customer1","123 Main St.",City A,48,801,6,5,2,54,48 "Customer2","123 Main St.","City S",22,614,79,3,2,54,20 "Customer3","123 Main St.","City S",23,615,95,4,2,54,20 "Customer4","123 Main St.","City S",24,616,66,1,2,54,20 "Customer5","123 Main St.","City S",25,617,14,2,2,54,20 "Customer6","123 Main St.","City S",26,618,22,2,2,54,20 "CUSTOMER CONSOLIDATED STATION",,,20,,,, "Customer7","123 Main St.",City A,16,305,16,94,1,54,16 "Customer8","123 Main St.","City P",28,214,43,1,2,54,31 "Customer9","123 Main St.","City P",29,215,99,143,2,54,31 "Customer10","123 Main St.","City P",30,216,39,21,2,54,31 "Customer Consolidated Station",31 "Customer11","123 Main St.",City A,70,902,25,7,2,54,70 "Customer12","123 Main St.",City A,71,903,72,6,2,54,71
I have a feeling i need to read back up the array that is helping to write the CSV every time i hit a blank line, but I am not sure how to do this. Also if anyone has a better idea that would be very helpful.
In reply to Reading back up through an array by TheStig
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |