The title makes this problem seem much easier than it is. I know what I want to do programmatically but I am still new to Perl and putting what I want to do into Perl code is where I am getting hung up. The following code takes an Excel and PDF version of one file and takes what I need and outputs a CSV file.

#!/usr/bin/perl -w use warnings; use strict; use PDF::API2; use CAM::PDF; use CAM::PDF::PageText; use String::Util ':all'; use Spreadsheet::ParseXLSX; use Text::CSV; use Date::Format; use DateTime::Format::DateParse; my $pdfFilename = "<insert PDF file name>"; my $pdf = CAM::PDF->new($pdfFilename); my $text = $pdf->getPageText(1); my $endString = "DELIVERY"; my $headIndex = index($text, $endString); my $newText = substr($text, 0, $headIndex); my $header = trim($newText); my $amdNumStart = "NO. "; my $amdNumEnd = "GAS"; my $amdNumStartIndex = index($header, $amdNumStart)+3; my $amdNumEndIndex = index($header, $amdNumEnd); my $amdNumString = substr($header, $amdNumStartIndex, $amdNumEndIndex +- $amdNumStartIndex); my $amdNum = trim($amdNumString); my $poolNumStart = "30202"; my $poolNumEnd = "TIGER"; my $poolNumStartIndex = index($header, $poolNumStart); my $poolNumEndIndex = index($header, $poolNumEnd); my $poolNumString = substr($header, $poolNumStartIndex, $poolNumEndInd +ex - $poolNumStartIndex); my $poolNum = trim($poolNumString); my $dateStart = "DATED "; my $dateStartIndex = index($header, $dateStart)+6; my $dateEndIndex = index($header, $dateStart); my $dateString = substr($header, $dateStartIndex, $dateEndIndex - $dat +eStartIndex); my $reportDate = trim($dateString); my @date = split(' ', $reportDate); my $month = $date[0]; my $day = $date[1]; my $comma = chop($day); my $year = $date[2]; my $dateJoin = join(' ', $month,$day,$year); my $parsedDate = DateTime::Format::DateParse->parse_datetime($dateJoin +); my $billMonth = $parsedDate->strftime('%m%d%Y'); my $fileName = "<insert Excel file name>"; my $parser = Spreadsheet::ParseXLSX->new(); my $workbook = $parser->parse($fileName); my $outfile = "NMGC_Snapshot_$billMonth.csv"; die $parser->error(), ".\n" if (!defined $workbook); open(my $out, '>', $outfile) or die "Could not open $outfile' $!\n"; my $csv = Text::CSV->new(); my @fields = ('Customer Name','Street Address','City','Station Num','C +ustomer Num','Premise Num','Point MDQ','Priority Class','Rate Schedul +e','Consolidated Station Num'); $csv->combine($amdNum, $poolNum, $billMonth,'','','','','','',''); print $out $csv->string, "\n"; $csv->combine(@fields); print $out $csv->string, "\r\n\r\n"; for my $worksheet($workbook->worksheet(0)){ my($firstRow, $lastRow) = $worksheet->row_range(); for my $row(6 .. $lastRow){ #array of cell values my $RowCellValues = []; for (my $col = 1; $col < 10; $col++) { my $cell = $worksheet->get_cell($row, $col); next unless $cell; push($RowCellValues, $cell->value()); } my $status = $csv->print ($out, $RowCellValues); print $out "\r\n"; } }

Every once in awhile in the file there is a blank line. Once my code hits the blank line it needs to go back up one row and see if it is a consolidated station. If that is true it takes the number from that line and needs to write that number onto the end of every line up to the previous blank line. If the line right above the blank line is not a consolidated station i need to write the station number from each individual account onto the end of each line up to the previous blank line. This first example is a piece of what is currently being output to my CSV file. The first two lines are header lines.

87,30202,05012014,,,,,,, "Customer Name","Street Address",City,"Station Num","Customer Num","Pr +emise Num","Point MDQ","Priority Class","Rate Schedule","Consolidated + Station Num" "Customer1","123 Main St.",City A,48,801,6,5,2,54 "Customer2","123 Main St.","City S",22,614,79,3,2,54 "Customer3","123 Main St.","City S",23,615,95,4,2,54 "Customer4","123 Main St.","City S",24,616,66,1,2,54 "Customer5","123 Main St.","City S",25,617,14,2,2,54 "Customer6","123 Main St.","City S",26,618,22,2,2,54 "CUSTOMER CONSOLIDATED STATION",,,20,,,, "Customer7","123 Main St.",City A,16,305,16,94,1,54 "Customer8","123 Main St.","City P",28,214,43,1,2,54 "Customer9","123 Main St.","City P",29,215,99,143,2,54 "Customer10","123 Main St.","City P",30,216,39,21,2,54 "Customer Consolidated Station",31 "Customer11","123 Main St.",City A,70,902,25,7,2,54 "Customer12","123 Main St.",City A,71,903,72,6,2,54

Here is what I am needing the final output to be. You will see the consolidated station numbers added to the end of each customer record.

87,30202,05012014,,,,,,, "Customer Name","Street Address",City,"Station Num","Customer Num","Pr +emise Num","Point MDQ","Priority Class","Rate Schedule","Consolidated + Station Num" "Customer1","123 Main St.",City A,48,801,6,5,2,54,48 "Customer2","123 Main St.","City S",22,614,79,3,2,54,20 "Customer3","123 Main St.","City S",23,615,95,4,2,54,20 "Customer4","123 Main St.","City S",24,616,66,1,2,54,20 "Customer5","123 Main St.","City S",25,617,14,2,2,54,20 "Customer6","123 Main St.","City S",26,618,22,2,2,54,20 "CUSTOMER CONSOLIDATED STATION",,,20,,,, "Customer7","123 Main St.",City A,16,305,16,94,1,54,16 "Customer8","123 Main St.","City P",28,214,43,1,2,54,31 "Customer9","123 Main St.","City P",29,215,99,143,2,54,31 "Customer10","123 Main St.","City P",30,216,39,21,2,54,31 "Customer Consolidated Station",31 "Customer11","123 Main St.",City A,70,902,25,7,2,54,70 "Customer12","123 Main St.",City A,71,903,72,6,2,54,71

I have a feeling i need to read back up the array that is helping to write the CSV every time i hit a blank line, but I am not sure how to do this. Also if anyone has a better idea that would be very helpful.


In reply to Reading back up through an array by TheStig

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.