This is intended as a (useful, I hope) pep-talk to others like me--newcomers to Perl.
The punchline: when the bishops, popes, and minor and major deities here stress
"there's more than one way to do it," and when they post a surprising range of
solutions when answering someone's questions, take the hint: try different solutions
yourself. Maybe lots of them. In time the payoff could well be markedly improved performance.
Case in point: yesterday I needed a script to extract only certain lines from a
plaintext file exported from a database (comma-separated values). If field 4
contained certain text, the record in question should be printed; otherwise,
skip the record and read the next one.
My first thought was to use split to create an array of each record's fields,
then compare the contents of the array's fourth element with the arg(s)
the user had provided on the command line (the script could test for
any of several strings the user might provide as filters). All lines would
have to be read; there's no predicting how (or if) these databases have been
sorted before they're saved to CSV format.
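A minimal sketch of that split-based approach, for anyone who wants to try it. The sub name and sample records are made up for illustration, and it assumes plain CSV with no quoted fields or embedded commas; it is not the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: true if field 4 of a CSV line contains any filter string.
# Assumes simple CSV (no quoted fields, no embedded commas).
sub field4_matches_split {
    my ($line, @filters) = @_;
    chomp $line;
    my @fields = split /,/, $line, -1;     # -1 keeps trailing empty fields
    return 0 unless defined $fields[3];    # record has fewer than four fields
    for my $filter (@filters) {
        return 1 if index($fields[3], $filter) >= 0;   # literal substring test
    }
    return 0;
}

# Made-up sample records standing in for the exported file.
my @records = (
    "1001,Smith,NY,widget order,open\n",
    "1002,Jones,CA,gadget return,closed\n",
    "1003,Brown,TX,service call,open\n",
);

# In the real script these would come from @ARGV.
my @filters = ('widget', 'service');

for my $record (@records) {
    print $record if field4_matches_split($record, @filters);
}
```

In a real script the loop would read from a filehandle (`while (my $line = <$in>)`) instead of an in-memory list.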
It all seemed very straightforward. The script, reading a 13,576-line file,
produced the desired results in about 3.5 seconds. I wasn't about to give myself
a Hero Of The People medal for that, but I could live with it. I figured I
was done. No, wait . . .
It occurred to me: what if I were to get the contents of field 4 by using
a regular expression, instead? The RE was a bit unpleasant-looking.
This sort of thing: /^[^,]+,[^,]*, etc. etc.
I looked at it a while and thought: Ridiculous. Long-ish regular expression--
big performance hit; "split" must be faster.
W r o n g. The routine using the RE ran in about 3/4 of a second--roughly 4.5
times faster. Surprised the hell out of me. And here I'd almost dumped the
second approach as "obviously" less efficient and "therefore" slower.
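For comparison, here is a sketch of the regular-expression version. The pattern is my own guess at the general shape, not the exact RE from the script, and the sub name and sample data are again hypothetical. Same assumption as before: no quoted fields or embedded commas.

```perl
use strict;
use warnings;

# Hypothetical helper: same test as a split-based filter, but field 4 is
# captured directly by a pattern instead of building an array of all fields.
sub field4_matches_regex {
    my ($line, @filters) = @_;
    # Skip three comma-terminated fields, then capture the fourth.
    my ($field4) = $line =~ /^(?:[^,]*,){3}([^,\n]*)/
        or return 0;                       # record has fewer than four fields
    for my $filter (@filters) {
        return 1 if index($field4, $filter) >= 0;      # literal substring test
    }
    return 0;
}

# Made-up sample records standing in for the exported file.
my @records = (
    "1001,Smith,NY,widget order,open\n",
    "1002,Jones,CA,gadget return,closed\n",
);

for my $record (@records) {
    print $record if field4_matches_regex($record, 'widget');
}
```

One plausible reason this can win: the pattern stops looking once it has field 4, while a plain `split /,/` breaks up the whole record. Giving split a limit (`split /,/, $line, 5`) may narrow the gap, so that is worth timing too.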
So my lesson for the day, reduced to one unscientific-sounding bromide,
was: Assume nothing; try stuff. Nirvana awaits (or, if not that, then
possible improvements in execution speed :).
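If you would rather measure than guess, the core Benchmark module makes "try stuff" cheap. This is only a sketch: the 20-field record is synthetic, the two subs are simplified stand-ins rather than the real script, and the numbers you get will depend on your data, your Perl, and your machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic record: 20 fields, with the text of interest in field 4.
my $line = join(',', map { "field$_" } 1 .. 20) . "\n";

# Candidate 1: split the whole record, then look at the fourth element.
sub via_split {
    my @fields = split /,/, $line;
    return index($fields[3], 'field4') >= 0;
}

# Candidate 2: capture the fourth field directly with a pattern.
sub via_regex {
    my ($field4) = $line =~ /^(?:[^,]*,){3}([^,\n]*)/;
    return index($field4, 'field4') >= 0;
}

# Run each candidate 200,000 times and print a comparison table.
cmpthese(200_000, {
    'split' => \&via_split,
    'regex' => \&via_regex,
});
```

Swap in a few real lines from your own file before trusting the table; short records and long records may well give different answers.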
Replies are listed 'Best First'.

(jcwren) RE: From one beginner to others . . .
by jcwren (Prior) on Jul 15, 2000 at 16:20 UTC
by gryng (Hermit) on Jul 15, 2000 at 17:48 UTC

(Ovid) RE: From one beginner to others . . .
by Ovid (Cardinal) on Jul 15, 2000 at 21:53 UTC
by greenhorn (Sexton) on Jul 16, 2000 at 01:54 UTC
by frankus (Priest) on Jul 18, 2000 at 17:55 UTC
by greenhorn (Sexton) on Jul 16, 2000 at 02:08 UTC
by greenhorn (Sexton) on Jul 15, 2000 at 22:22 UTC
by Ovid (Cardinal) on Jul 16, 2000 at 00:10 UTC
by Abigail (Deacon) on Jul 16, 2000 at 01:08 UTC
by Ovid (Cardinal) on Jul 16, 2000 at 04:08 UTC