Yes, that's what I've been trying to do. The orders are only separated by a blank line, but they all start with the "Order ID:" text, so I'm looking at using that as the separator.
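To make the separator idea concrete, here is a rough sketch that sets Perl's input record separator `$/` to "Order ID:", so each read returns one whole order. The sample text, the "Dist:" lines and the `@records` array are made up for illustration; the real report layout will differ, and any page header that falls inside an order would still end up in that order's chunk.

```perl
use strict;
use warnings;

# Hypothetical sample report; the real file layout is an assumption here.
my $report = "Order ID: 1001\nDist: A 5\n\nOrder ID: 1002\nDist: C 7\n";

# Read from an in-memory filehandle so the sketch is self-contained.
open my $fh, '<', \$report or die "Cannot open in-memory file: $!";

my @records;
{
    # Each read now ends at the next "Order ID:" instead of at a newline.
    local $/ = "Order ID:";
    my $first = 1;
    while (my $chunk = <$fh>) {
        chomp $chunk;    # chomp strips the trailing "Order ID:" separator
        if ($first) {    # text before the first order (e.g. a page header)
            $first = 0;
            next;
        }
        # Put the marker back so each record starts with "Order ID:" again.
        push @records, "Order ID:$chunk";
    }
}
close $fh;

print scalar(@records), " order records found\n";
```

Each element of `@records` can then be split on newlines and parsed on its own, without needing to look ahead for the next order.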
The report also spans multiple pages, with a header on each page, which complicates things that little bit more... but I'll worry about that later, once I have the logic for the full order sorted. The page header should be filtered out automatically by the regex as it stands anyway... I think.
One thing I *could* do with a suggestion on is how to handle breaking out of the loop at the end of each Order. About the only way I can think of to know when to stop processing distributions is to look for the start of the next Order record. To do that, though, the line containing the data I want has to be read in at the "end" of the loop for the previous Order... and then, back at the start of the outer loop, the next line of the file is read in, dropping the previous one, which contains (some of) the data I'm after.
Probably easier to show you what I mean in pseudocode to give a better idea:

    while <DATA> {
        if (start of record) {
            get order details
            while (not a new order) {
                get distribution details into a hash
            }
            print order details and distributions to Excel
        }
    }
So, from the above, the issue I am having is the two while loops... the inner one "eats" the order info of any Orders following the first. I'm sure I could put some post-while processing there to trap the data before it loops to the next line... but that just seems a bit... uncouth, for want of a better word. I can't help thinking it should be more elegant (not to mention less likely to fail) than that.
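For what it's worth, one shape that might avoid the problem is a single loop that remembers the order currently being built, and only "finishes" it when the next "Order ID:" line (or end of file) turns up, so no line is ever read and then thrown away. This is only a sketch: the "Dist:" line format, the `parse_orders` sub and the sample data are invented for illustration, and the `print` at the end stands in for the Excel output.

```perl
use strict;
use warnings;

# Hypothetical helper: read a report from a filehandle and return a list of
# orders, each a hashref of { id => ..., dists => { name => qty, ... } }.
sub parse_orders {
    my ($fh) = @_;
    my @orders;
    my $current;    # order being built; undef until the first "Order ID:"

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^Order ID:\s*(\S+)/) {
            # A new order starts: store the previous one, then begin afresh.
            push @orders, $current if $current;
            $current = { id => $1, dists => {} };
        }
        elsif ($current && $line =~ /^Dist:\s*(\S+)\s+(\d+)/) {
            # Assumed distribution line format: "Dist: <name> <qty>".
            $current->{dists}{$1} = $2;
        }
        # Anything else (blank lines, page headers) is simply skipped.
    }
    push @orders, $current if $current;    # the last order has no successor
    return @orders;
}

# Made-up sample data, read via an in-memory filehandle.
my $report = "Order ID: 1001\nDist: A 5\nDist: B 2\n\nOrder ID: 1002\nDist: C 7\n";
open my $fh, '<', \$report or die "Cannot open in-memory file: $!";
my @orders = parse_orders($fh);
close $fh;

for my $order (@orders) {
    my $d = $order->{dists};
    print "Order $order->{id}: ", join(", ", map { "$_=$d->{$_}" } sort keys %$d), "\n";
}
```

The trade-off is that the output step moves out of the middle of the loop: an order is only complete once the next one begins, so the final order has to be stored after the loop ends.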
In reply to Re^6: How best to strip text from a file?
by bobdabuilda
in thread How best to strip text from a file?
by bobdabuilda