I have no problem with your regex approach for the timestamp.
And I have no problem with your parser approach for the timestamp. TIMTOWTDI. ☺
Did you happen to view Dave Rolsky's brief presentation of the slide titled "Don't Use a Parser"? He explains his qualified recommendation perfectly.
I agree that ISO 8601 format for the timestamps would be preferable.
Yep. Unless jasonl is required to maintain fidelity to the original representation of the timestamps in the logs for some peculiar reason, this is his best opportunity to improve the data by keeping the reformatted timestamps in ISO 8601 format.
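One practical payoff for a sorting problem like this one: ISO 8601 timestamps that share the same time zone and the same fixed-width form sort chronologically under a plain string comparison. Here is a minimal sketch; the timestamps are made up for illustration and are not taken from jasonl's logs.

```perl
use strict;
use warnings;

# Hypothetical timestamps, already in ISO 8601 format. This sketch assumes
# they all share one time zone and one fixed-width layout.
my @timestamps = (
    '2012-03-04T15:06:07',
    '2011-12-31T23:59:59',
    '2012-03-04T05:06:07',
);

# The default sort compares strings, which here is also chronological order.
my @sorted = sort @timestamps;

print "$_\n" for @sorted;
```

If the logs mix time zones or UTC offsets, this shortcut no longer holds and the timestamps would need to be normalized first.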
The OP didn't show real data for "<data1> <data2>". I suspect your '(\S+) (\S+)' may well be an oversimplification of what's really required; however, no more so than my splitting on whitespace. :-)
My '(\S+) (\S+)' was an intentional simplification, not an inadvertent oversimplification. Although I didn't explicitly state it, I was tacitly making the point that jasonl could parse the whole log record, including the timestamps, with a single regex. The pattern I used in my code snippet to match <data1> and <data2> was just a placeholder—one that happens to match the placeholder strings "<data1>" and "<data2>" literally. ☺ As you've pointed out, jasonl didn't include in his post any verisimilar example data besides just the timestamps, so we can't possibly know how to parse his actual log records properly.
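To make the single-regex point concrete, here is a rough sketch. The record layout is purely an assumption on my part (an ISO 8601 timestamp followed by two whitespace-free fields separated by single spaces), and the names `$record_re`, `timestamp`, `data1` and `data2` are hypothetical. The real pattern would have to be written against jasonl's actual log records. Named captures need Perl 5.10 or later.

```perl
use strict;
use warnings;

# ASSUMED record layout, not taken from the original logs:
#   <ISO 8601 timestamp> <field> <field>
my $record_re = qr/
    \A
    (?<timestamp> \d{4}-\d\d-\d\d T \d\d:\d\d:\d\d )
    [ ]
    (?<data1> \S+ )    # placeholder pattern, as discussed above
    [ ]
    (?<data2> \S+ )    # placeholder pattern, as discussed above
    \z
/x;

# Made-up sample records using the literal placeholder strings.
my @lines = (
    '2012-03-04T15:06:07 <data1> <data2>',
    '2011-12-31T23:59:59 <data1> <data2>',
);

my @records;
for my $line (@lines) {
    if ($line =~ $record_re) {
        # Copy the named captures into a plain hash for this record.
        push @records, {
            timestamp => $+{timestamp},
            data1     => $+{data1},
            data2     => $+{data2},
        };
    }
    else {
        warn "Unparsable record: $line\n";
    }
}

# Sort the parsed records by timestamp using string comparison.
my @sorted = sort { $a->{timestamp} cmp $b->{timestamp} } @records;

print "$_->{timestamp} $_->{data1} $_->{data2}\n" for @sorted;
```

One regex both validates the record and pulls out every field, so anything that doesn't fit the expected shape is reported instead of being silently mis-split.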
Jim
In reply to Re^6: sorting logfiles by timestamp
by Jim
in thread sorting logfiles by timestamp
by jasonl