Re^3: Parsing issue
by vitoco (Hermit) on Sep 11, 2009 at 14:43 UTC
I forgot to mention in my previous post that the if (pat1) {} elsif (pat2) {} elsif ... chain inside a while loop is useful when the fields of a data record do not always come in the same order. In this case, where your "output" format is fixed, it's better (faster) to parse each line with a single regexp (no while):
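A minimal sketch of the per-line approach. The sample lines and the field labels (Name, Start Time, Exit Code) are made up for illustration, so swap in the labels from your real data:

```perl
use strict;
use warnings;

# Hypothetical records with a fixed field order (not the real data).
my @lines = (
    'Name: backup job     Start Time: 12:30:05   Exit Code: 0',
    'Name: cleanup        Start Time: 13:00:00   Exit Code: 2',
);

my @rows;
for my $line (@lines) {
    # One anchored match per line; each (.*?)\s* captures a trimmed value.
    my @values = $line =~
        /^Name:\s+(.*?)\s*Start Time:\s+(.*?)\s*Exit Code:\s+(.*?)\s*$/;
    if (@values) {
        push @rows, [@values];
    }
    else {
        warn "Unparsed line: $line\n";
    }
}

print join("\t", @$_), "\n" for @rows;
```

Because the regexp is anchored at both ends, a line that does not follow the fixed layout is reported instead of being half-parsed.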
This way, you can control other things like key names:
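For example, a hash slice lets you pick the key names yourself. This is only a sketch: the key names (job, started, exit_code) and the sample line are my own assumptions, not taken from the real data:

```perl
use strict;
use warnings;

# Key names are ours to choose; they need not match the labels in the data.
my @keys = qw(job started exit_code);

# Hypothetical record line (not the real data).
my $line = 'Name: backup job     Start Time: 12:30:05   Exit Code: 0';

my %rec;
# Hash slice: the captures are assigned to the keys in order.
# The list assignment returns the number of captures, so a failed
# match (zero captures) triggers the die.
@rec{@keys} = $line =~
    /^Name:\s+(.*?)\s*Start Time:\s+(.*?)\s*Exit Code:\s+(.*?)\s*$/
    or die "Unparsed line: $line\n";

print "$_ => $rec{$_}\n" for @keys;
```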
Thinking a bit more about my first script, I realized that the string can also be modified in the following way: add a new delimiter just before whatever we detect as a field name (words separated by exactly one space, followed by a colon and a space), then split:
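Something along these lines. It is a sketch that assumes '|' never appears in the data and that every value is followed by at least two spaces (with a single space, the value would be taken as part of the next field name). The sample line is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical record line (not the real data).
my $line = 'Name: backup job     Start Time: 12:30:05   Exit Code: 0';

my $sep = '|';    # assumed never to occur in the data

# Work on a copy with trailing whitespace removed.
(my $marked = $line) =~ s/\s+\z//;

# A field name is a run of words separated by exactly one space and
# followed by ": ". Put the delimiter before it and in place of the ": ".
# The colons inside 12:30:05 are not followed by a space, so they survive.
$marked =~ s/(\w+(?: \w+)*): /$sep$1$sep/g;

# Split on the delimiter, trimming the spaces around it.
# The first element is the empty string before the leading delimiter.
my (undef, @parts) = split /\s*\Q$sep\E\s*/, $marked;

my %rec = @parts;    # name, value, name, value, ...

print "$_ => $rec{$_}\n" for sort keys %rec;
```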
I also changed the delimiter to another unused character, to distinguish it from the colon inside the time value when trimming extra spaces. BTW, this was fun!
Re^3: Parsing issue
by vitoco (Hermit) on Sep 10, 2009 at 19:30 UTC
Then, you can forget this trick and try some of the other ideas I gave, like parsing one row at a time and cutting each record at fixed columns, or using some other regexp to get the field values.
Here, I used "(.*?)\s*" to get a trimmed value for a field of any type, but you should change each of them to a specific pattern for dates, integers... Update: Forgot one field...
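To illustrate the difference, here is a sketch with typed patterns in place of the generic "(.*?)\s*". The field labels, the HH:MM:SS time format and the sample lines are assumptions of mine, so adjust them to the real data:

```perl
use strict;
use warnings;

# Hypothetical record lines (not the real data).
my $line = 'Name: backup job     Start Time: 12:30:05   Exit Code: 0';
my $bad  = 'Name: backup job     Start Time: noon       Exit Code: 0';

# Specific patterns instead of the generic (.*?)
my $time_re = qr/\d{2}:\d{2}:\d{2}/;    # assumed HH:MM:SS
my $int_re  = qr/\d+/;                  # non-negative integer

my $typed = qr/^Name:\s+(.*?)\s*Start Time:\s+($time_re)\s*Exit Code:\s+($int_re)\s*$/;

my ($name, $start, $code) = $line =~ $typed
    or die "Unparsed line: $line\n";
print "$name | $start | $code\n";

# A malformed value now makes the whole line fail to match,
# instead of being captured silently.
print "rejected: $bad\n" unless $bad =~ $typed;
```

The free-text field still uses "(.*?)\s*", since there is no tighter pattern for it.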