Re^3: Prototype mismatch: sub main::trim: none vs ($)

Re^4: Prototype mismatch: sub main::trim: none vs ($) by haukex (Archbishop) on Feb 09, 2017 at 16:47 UTC
Hi Marshall, You're right, I didn't mention that because of what the FAQ says: "That might not matter to you, though". That attitude reminds me of the last time I read the Camel, when I realized that for a large part of the book no mention is made of performance at all. I found that quite enlightening, along the lines of "until performance becomes an issue, don't worry about it" (a friendlier version of "premature optimization is the root of all evil"). I'm not advocating ignoring performance entirely, but rather worrying less about it when one can afford to. Personally, I prefer the one-regex solution for its brevity, and so far I've usually been in a position where the performance difference doesn't matter. Of course, that can be a luxury, so thank you for mentioning the issue in case it does matter to the OP :-) Regards, -- Hauke D
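For readers who have not seen the two variants side by side, here is a minimal sketch of what "one-regex" and "two-regex" trimming usually look like. The sub names `trim_one_regex` and `trim_two_regex` are made up for illustration and do not come from the thread or the FAQ.

```perl
use strict;
use warnings;

# Hypothetical helper names, chosen only for this illustration.

# One substitution with alternation: compact, but /g makes the regex
# engine try both alternatives across the string.
sub trim_one_regex {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

# Two anchored substitutions: slightly longer, usually the faster of the two.
sub trim_two_regex {
    my ($s) = @_;
    $s =~ s/^\s+//;    # strip leading whitespace
    $s =~ s/\s+$//;    # strip trailing whitespace
    return $s;
}

print "[", trim_one_regex("  foo bar \n"), "]\n";    # prints [foo bar]
print "[", trim_two_regex("  foo bar \n"), "]\n";    # prints [foo bar]
```

Both return the same result; whether the speed difference is noticeable depends on how many strings are trimmed, which could be measured with the core Benchmark module if it ever matters.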
Re^5: Prototype mismatch: sub main::trim: none vs ($) by Marshall (Canon) on Feb 09, 2017 at 20:18 UTC
Hi Hauke D, Your comments are well taken. I certainly wouldn't say that there is anything wrong at all with using the single regex for trimming the lines. A large amount of the Perl that I write involves processing text files which come from various sources, sometimes cut-n-paste amalgamations generated by users. I can't think of any example where the "line trim" code or "skip blank line" code, e.g. `next if /^\s*$/;` played any significant performance role at all. Blank lines in a file are so common that I almost always add that fast regex (it's fast because of the anchors) to get rid of non-data lines.

Performance can be a very application-specific thing. I wrote one program that took 4 hours to run. I got complaints about the run time. I asked, "How many times per year do you run this program?" Answer: 4 times per year. I had used algorithms that made it easy for me to develop, debug, and track down any questionable decision(s), and also to come as close as I could to guaranteeing a correct result. Accuracy was my main goal. I never got an answer about why 4 hours mattered. New management ordered it to be recoded, with the goal of making it much faster at the expense of perhaps a 10% error rate. So the new version runs much faster, but makes more mistakes. Which is better? Depends upon what you want.

Algorithms and data structures make a lot more difference than this simple "what do I do with this single line" question, although I will admit to beating on a single critical regex line for an entire week to squeeze some more performance out of it - if you do it a million times, it can matter a lot.
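As a rough sketch of the "skip blank lines, then trim" pattern described above (the sub name `data_lines` and the sample input are invented for this example, not taken from any real program):

```perl
use strict;
use warnings;

# Hypothetical helper: returns the trimmed, non-blank lines from a list.
sub data_lines {
    my @out;
    for my $line (@_) {
        next if $line =~ /^\s*$/;             # skip empty or whitespace-only lines
        (my $t = $line) =~ s/^\s+|\s+$//g;    # trim a copy, leave the input alone
        push @out, $t;
    }
    return @out;
}

my @lines = data_lines("alpha\n", "\n", "   \t\n", "  beta  \n");
print join(",", @lines), "\n";    # prints alpha,beta
```

In a real script the same two statements would typically sit at the top of a `while (my $line = <$fh>)` loop; the list-based version here just keeps the example self-contained.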