
I have a set of data that I need to parse into a CSV file. The data looks like the following:

                                                                 06

   01720168-00000000257980

                                      123 S Somewhere HWY 192
                                       172016-8

        Company NATURAL GAS CO., INC.  Business
        P O BOX 1547                   123 Road Dr.
        Town ST 12345                  SUITE#  1234
                                       Town, ST  12345








                                        6/23/2014            $257.98



 Business                                                  6/23/2014
 123 S Road  HWY 123                172016-8               6/09/2014
 Town  ST  12345                                             $257.98



 02CS   4/30   5/28           3117.0    3259.0      142.0
 Meter #    C204508                                 142.0     232.99
 Pipe Replacement Pgm SNR Comm                                  3.27
 RESEARCH & DEVELOPMENT TARIFF                                   .03
 3.00% Rate Increase County Co Sc Tax on 236.29                 7.09
 6.00% State Tax on 243.38                                     14.60

                                 Current Charges              257.98

                                 Previous Amount Due          351.60
                                 Payment Received  5/22       351.60CR

                                 Total Amount Due             257.98













                     1-877-123-4567               66.0  28       142.0
 8:00am to 4:00pm                                 58.6  30       203.0
                                                  70.3  28       174.0

The file has one record per 58 lines. I can handle the fine-grained parsing needed to pull the variables out of each line, but what I am having trouble wrapping my head around is a method for grabbing 58 lines at a time and then processing each batch. For example, once I have the 58 lines read in, I know that lines 8-11, positions 1-39, contain the return address, which I will write as "return1","return2","return3","return4" in the CSV file. The same applies to every record, and to each of the other pieces I need to parse.
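To make the return-address example concrete, here is a minimal sketch of the extraction step, assuming the record is already in an array with one line per element. The helper name return_address is hypothetical, and the line and column numbers are the 1-based ones given above.

```perl
use strict;
use warnings;

# Hypothetical helper: take columns 1-39 of record lines 8-11
# (1-based, as described in the post) and trim surrounding whitespace.
sub return_address {
    my @lines = @_;                      # one record, one line per element
    my @fields;
    for my $n (8 .. 11) {
        my $line = $lines[$n - 1] // ''; # tolerate a short record
        # Pad to at least 39 characters, then keep columns 1-39 only.
        my $text = substr(sprintf('%-39s', $line), 0, 39);
        $text =~ s/^\s+|\s+$//g;         # trim both ends
        push @fields, $text;
    }
    return @fields;                      # return1 .. return4
}
```

Padding before calling substr keeps blank or short lines from producing warnings. A line that is blank in columns 1-39 simply yields an empty field.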

I thought about using a counter and resetting it after every 58 lines while looping through the entire file, but that didn't seem like the best solution. As I'm by no means a Perl expert, I wanted to check here to see if anyone has a better starting point or some ideas on how to make this cleaner and more efficient.
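One possible alternative to a manual counter is a small helper that reads a fixed number of lines per call. This is only a sketch: read_chunk is a hypothetical name, and the demo uses an in-memory file of 5 lines taken 2 at a time. For the real data the count would be 58.

```perl
use strict;
use warnings;

# Hypothetical helper: read up to $count lines from $fh and return them
# chomped. Returns an empty list once the file is exhausted.
sub read_chunk {
    my ($fh, $count) = @_;
    my @lines;
    while (@lines < $count) {
        my $line = <$fh>;
        last unless defined $line;   # end of file
        chomp $line;
        push @lines, $line;
    }
    return @lines;
}

# Demo on an in-memory file; a real script would open the data file instead.
my $data = join '', map { "line $_\n" } 1 .. 5;
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
while (my @chunk = read_chunk($fh, 2)) {
    print scalar(@chunk), " line(s): @chunk\n";
}
close $fh;
```

The last chunk may be shorter than requested if the file does not divide evenly, so a real script would probably want to check scalar(@chunk) before indexing into it.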

If you need any other information please let me know.

UPDATE: I found a control character other than newline at the beginning of each record: a form feed. By setting local $/ = "\014"; I was able to read the data into a variable one record at a time (I had been looking for newlines or double newlines, not this character). I then split each record with my @lines = split /\n/, $record; into an array with one line per element.
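Put together, the approach in the update might look roughly like the sketch below. The helper name read_records is hypothetical, and the code assumes the control character sits at the very start of each record, as described.

```perl
use strict;
use warnings;

# Hypothetical helper: read "\014"-delimited records from $fh and return
# a list of array refs, one array of lines per record.
sub read_records {
    my ($fh) = @_;
    local $/ = "\014";                 # each read now ends at this character
    my @records;
    while (my $record = <$fh>) {
        chomp $record;                 # strips the trailing "\014", not "\n"
        next unless length $record;    # the leading delimiter gives one empty read
        # split drops trailing blank lines; pass -1 as a third
        # argument if those need to be kept.
        my @lines = split /\n/, $record;
        push @records, \@lines;
    }
    return @records;
}
```

Because the delimiter comes before each record rather than after it, the first read returns nothing but the delimiter itself, which is why empty records are skipped.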

So now I believe I can pass the array to a subroutine, perform the checks and changes I need on each line, write the result out to the CSV file, and then move on to the next record.
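A rough sketch of that last step follows, assuming each record arrives as an array ref of lines. The names csv_line, process_record and write_csv are hypothetical, and process_record only handles the return address. The quoting is done by hand to keep the example dependency-free; Text::CSV from CPAN would be the more robust choice.

```perl
use strict;
use warnings;

# Quote one row for CSV: wrap every field in double quotes and
# double any embedded quotes.
sub csv_line {
    my @fields = map { defined $_ ? $_ : '' } @_;
    s/"/""/g for @fields;
    return join(',', map { qq("$_") } @fields) . "\n";
}

# Hypothetical per-record routine: here it only pulls the return address
# (lines 8-11, columns 1-39); the other fields would be added the same way.
sub process_record {
    my ($lines) = @_;
    my @address = map { defined $_ ? substr($_, 0, 39) : '' } @{$lines}[7 .. 10];
    s/^\s+|\s+$//g for @address;
    return @address;
}

# Write a header row, then one CSV row per record, to the handle $out.
sub write_csv {
    my ($out, @records) = @_;
    print {$out} csv_line(qw(return1 return2 return3 return4));
    print {$out} csv_line(process_record($_)) for @records;
    return;
}
```

Calling write_csv($out, @records) with a handle opened for writing emits the whole file in one pass, so the main script only has to read the records and hand them over.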

In reply to Parsing a Formatted Text File by Pharazon
