in reply to RE: (Ovid - Common regex error) RE: A two-liner for Backtracking for substitutions
in thread Backtracking for substitutions
    #!/usr/bin/perl -w
    use strict;
Those two little lines have saved me more trouble than you can possibly imagine, and I would strongly recommend that you incorporate them. Admittedly, you just posted what you did for testing purposes, but I still have this "knee jerk" reaction regarding anything without the -w switch or use strict.
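To make the point concrete, here is a minimal sketch of the kind of mistake use strict catches. The helper name compile_error and the snippet with the misspelled variable are made up for illustration; they are not from your code.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: compile and run a snippet under strict,
# returning the error text, or an empty string if it was fine.
sub compile_error {
    my ($code) = @_;
    my $ok = eval "use strict; $code; 1";
    return $ok ? '' : $@;
}

# Made-up snippet with a typo: $totl instead of $total.
my $error = compile_error('my $total = 5; $totl = $total + 1');

# Under strict the undeclared $totl is rejected at compile time.
print "strict caught: $error" if $error;
```

Without strict, the misspelled variable would quietly spring into existence as a new global, and the -w switch would at best give you a "used only once" warning.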
Your first regex can be made a bit more efficient (and accurate) by eliminating the .*, anchoring the match to the beginning of the line, and using the /m switch:
    $mydata =~ s/^([\w\s]+)\s([\w]+)\s(0000)/$1,$2,$3/mg;
I haven't actually benchmarked this, but I'd bet good money that this is the case. See Death to Dot Star! for information on why .* is problematic. The accuracy issue is probably a moot point if you have relatively clean data.
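Here is a small sketch of the anchored substitution at work. I don't have your actual data, so the sample records and the add_commas helper are invented for illustration; treat it as a rough check, not a description of your file.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: applies the anchored substitution to a copy of the input.
sub add_commas {
    my ($mydata) = @_;
    # ^ with /m anchors at the start of every line; /g repeats for each line.
    $mydata =~ s/^([\w\s]+)\s([\w]+)\s(0000)/$1,$2,$3/mg;
    return $mydata;
}

# Made-up sample records. The ':' in each timestamp stops [\w\s]+ from
# running on across the newline (\s matches "\n" too).
my $sample = "Alice Smith Monday 0000 12:30:45 0000\n"
           . "Bob Jones Tuesday 0000 09:15:00 0000\n";

print add_commas($sample);
```

With this sample it prints:

    Alice Smith,Monday,0000 12:30:45 0000
    Bob Jones,Tuesday,0000 09:15:00 0000

One thing to watch: because \s also matches a newline, [\w\s]+ can run across line boundaries if a record contains nothing but word characters and whitespace, so it is worth checking against real data.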
The second regex has two issues. First, you forgot to put parentheses around the \s0000. Those parentheses were supposed to capture this data and substitute it back using $2. I just changed it to the following:
    $mydata =~ s/(:\d\d)\s0000/$1, 0000/g;
The other problem is really just a minor efficiency issue: \d{2} is better written as \d\d (this is from MRE, Mastering Regular Expressions, so it may be out of date for newer regex engines). Basically, when you use \d{2}, the regex engine is forced to keep track of the number of instances of \d. This slows it down just a tad (which can be significant when iterating over a large amount of data). However, when the regex engine sees \d\d, it just matches each instance of \d, which is faster.
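Since the \d{2} versus \d\d claim may not hold on newer engines, the easiest thing is to measure it. Below is a rough sketch using the core Benchmark module; the test data and the two helper subs (with_quantifier, with_literal) are made up for the comparison, and the numbers will depend on your Perl version and your data.

```perl
#!/usr/bin/perl -w
use strict;
use Benchmark qw(cmpthese);

# Made-up test data: many copies of a timestamp followed by " 0000".
my $data = "12:30:45 0000\n" x 10_000;

# Hypothetical helpers: each works on a copy, so the input is reused intact.
sub with_quantifier { my ($s) = @_; $s =~ s/(:\d{2})\s0000/$1, 0000/g; return $s }
sub with_literal    { my ($s) = @_; $s =~ s/(:\d\d)\s0000/$1, 0000/g;  return $s }

# Both forms must produce the same text before timing means anything.
die "results differ" unless with_quantifier($data) eq with_literal($data);

# Run each for at least one CPU second and print a comparison table.
cmpthese(-1, {
    '\d{2}' => sub { with_quantifier($data) },
    '\d\d'  => sub { with_literal($data) },
});
```

If the two rows come out about even, the engine is treating both forms much the same and you can pick whichever reads better.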
Hope this helps!
Cheers,
Ovid
Join the Perlmonks Setiathome Group or just go to the link and check out our stats.