in reply to There has to be an easier way...

Everyone else has said what the preferred solution is.

But nobody has explained why what you did was so slow.

Full details are in Mastering Regular Expressions, but the basic theory is that Perl does a recursive, backtracking search for a way to match your pattern against the string. The match proceeds from left to right through both the pattern and the string. So the first (.*) starts by grabbing everything up to the end of the string. Then the engine needs a pipe, and there is nothing left to match it against. So it backs off a character and tries again. With several (.*)s in a row, each later one repeats that grab-everything-then-back-off routine for every position the earlier ones try, so there are a lot of wrong partial matches to work through before the right one turns up.
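
To see the greedy behaviour for yourself, here is a minimal sketch. The sample record is made up, since the original data is not shown in this thread. With only one pipe in the pattern, the first (.*) ends up holding everything up to the last pipe, which is where the backing off finally stopped.

```perl
use strict;
use warnings;

# Hypothetical sample record; not the poster's actual data.
my $line = "alpha|beta|gamma|delta";

# The first (.*) grabs the whole string, then gives characters back
# one at a time until a pipe can match right after it.
my ($first, $rest) = $line =~ /^(.*)\|(.*)$/;

print "first='$first' rest='$rest'\n";
```

The first group comes back holding three of the four fields, pipes and all, because the engine only gave back as much as it had to.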

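If you change all of the (.*)s to (.*?)s, the RE will be faster, because each group then grows one character at a time and stops at the first pipe that lets the match continue, instead of overshooting and backing off. It would be safer still to change them to ([^|]*)s, since those can never run past a pipe (and the pipe needs no backslash inside a character class). Split is even faster, but as you learn REs, keep in mind the principle that ambiguity in the RE can result in unexpected slowdowns.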
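
A rough side-by-side of the four approaches, assuming a four-field, pipe-delimited record (the record and the variable names are invented for illustration). All four produce the same list on well-formed input; the difference is in how much work the engine does and what happens on malformed input.

```perl
use strict;
use warnings;

# Hypothetical record with exactly four fields.
my $line = "alpha|beta|gamma|delta";

# Greedy: correct here, but gets there by repeated backtracking.
my @greedy  = $line =~ /^(.*)\|(.*)\|(.*)\|(.*)$/;

# Non-greedy: each group stops at the first pipe that works.
my @lazy    = $line =~ /^(.*?)\|(.*?)\|(.*?)\|(.*?)$/;

# Negated character class: a group can never contain a pipe.
my @negated = $line =~ /^([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)$/;

# split: no pattern ambiguity at all; -1 keeps trailing empty fields.
my @split   = split /\|/, $line, -1;

print join(",", @$_), "\n" for \@greedy, \@lazy, \@negated, \@split;
```

One thing worth trying: feed it a line with five fields. The negated-class version simply fails to match, while the greedy and non-greedy versions still match and quietly leave a pipe inside one of the captures. That is the sense in which the character class is safer.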

Cheers,
Ben

PS A style point. Split your data into data structures early, and then access the data structures directly rather than passing formatted strings around and picking them apart again. In the long run I have found that to be faster, safer, and simpler.
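
As a sketch of what that can look like, assuming the same kind of pipe-delimited record: the field names and the sample line below are hypothetical, so substitute whatever your data actually holds.

```perl
use strict;
use warnings;

# Hypothetical field names and record; adjust to your real layout.
my @field_names = qw(id name email phone);
my $line        = '42|Ben|ben@example.com|555-0100';

# Parse once, up front, into a hash keyed by field name.
my %record;
@record{@field_names} = split /\|/, $line, -1;

# From here on, work with the hash instead of re-matching the string.
print "$record{name} <$record{email}>\n";
```

The string gets parsed exactly once, and every later use is a plain hash lookup by name rather than another regex over the raw line.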
