deprecated has asked for the wisdom of the Perl Monks concerning the following question:
This regex is being used:

    my ($ptid, $total, $used, $avail, $pct, $mp) = $element =~
        m!^(/dev/.+[0-9]+)     # which partition          $ptid
          \s+([0-9.]+[MGK])    # total size of partition  $total
          \s+([0-9.]+[MGK])    # used space               $used
          \s+([0-9.]+[MGK])    # available space          $avail
          \s+(\d{2})%          # percent usage            $pct
          \s+(.*)$!x;          # mounting point           $mp

to match this data:

    ---- Sat Feb 3 12:01:01 EST 2001
    Filesystem   Size  Used  Avail  Use%  Mounted on
    /dev/hda7    904M  261M  598M   30%   /
    /dev/hda12   852M  378M  474M   44%   /devel
    /dev/hda10   9.8G  9.6G  256M   97%   /home
    /dev/hda9    1.8G  1.6G  225M   88%   /home/dl
    /dev/hda5    768M  751M  17M    98%   /mnt/macos
    /dev/hda8    3.9G  3.4G  304M   92%   /usr
    /dev/hda6    387M  93M   275M   25%   /var
    /dev/hdb5    1008M 591M  365M   62%   /home/ftp
    /dev/hdb6    1008M 209M  748M   22%   /home/httpd
    /dev/hdb9    1.5G  1.1G  358M   75%   /mnt/build
    /dev/hdb8    640M  456M  151M   75%   /mnt/mp3

which is being repeated on an hourly cronjob. So this could easily turn into several megs (or even dozens of megs) of text. Therefore, speed will be an issue.
So I'm looking at this and see a pretty specific regex. I thought of substituting \S+ for .*. However, in Unix (NT compatibility, obviously, is not an issue here) mount points can include awful characters like *, \n, \a, and so on. So, basically, I see two flaws in the expression: first, the use of .* (and .+), and second, the three identical \s+([0-9.]+[MGK]) captures seem repetitive. Has anyone got some regex-tuning hints here?
thanks,
brother dep.
--
i am not cool enough to have a signature.
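
A minimal sketch of how the two points raised in the question could be approached, not a definitive answer. It assumes each `$element` is a single line of df output. The names `$size`, `$df_line`, `parse_df_line_regex` and `parse_df_line_split` are hypothetical and not from the original post. The first version factors the repeated size pattern into one `qr//` and replaces the greedy `.+` in the device name with `\S+`, so backtracking stays inside the first token; the trailing `(.*)` is kept because a mount point may contain spaces. It also widens `\d{2}` to `\d{1,3}`, which is an assumption on my part so that values such as 5% or 100% would match. The second version skips the regex and uses `split` with a field limit of 6, which leaves everything after the fifth field in the mount point.

```perl
use strict;
use warnings;

# The size pattern is written once and interpolated three times below.
my $size = qr/[0-9.]+[MGK]/;

# Same six captures as the original expression.
# \S+ keeps the device-name match inside the first whitespace-free token.
# \d{1,3} is an assumption (the original used \d{2}).
my $df_line = qr{
    ^(/dev/\S+[0-9])   # partition
    \s+($size)         # total size
    \s+($size)         # used space
    \s+($size)         # available space
    \s+(\d{1,3})%      # percent usage
    \s+(.*)\z          # mount point, may contain spaces
}x;

# Returns the six fields, or an empty list if the line does not match.
sub parse_df_line_regex {
    my ($line) = @_;
    chomp $line;
    return $line =~ $df_line;
}

# Alternative without a capturing regex: split on whitespace runs into at
# most six fields, so the last field keeps any embedded spaces.
sub parse_df_line_split {
    my ($line) = @_;
    chomp $line;
    my @f = split ' ', $line, 6;
    return unless @f == 6 && $f[0] =~ m{^/dev/};
    $f[4] =~ s/%\z//;          # drop the percent sign from the usage field
    return @f;
}

# Sample lines taken from the data in the question.
my @sample = (
    '---- Sat Feb 3 12:01:01 EST 2001',
    'Filesystem Size Used Avail Use% Mounted on',
    '/dev/hda7 904M 261M 598M 30% /',
    '/dev/hda10 9.8G 9.6G 256M 97% /home',
);

for my $line (@sample) {
    my @fields = parse_df_line_regex($line) or next;   # skip non-data lines
    print join(' | ', @fields), "\n";
}
```

Note that the split version does not validate the size fields, so it is looser than the regex. Which of the two is faster on dozens of megs of text is something to measure (for example with the Benchmark module) rather than assume.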
Re (tilly) 1: Getting rid of (.*) from a not-quite-complex regex.
by tilly (Archbishop) on Feb 03, 2001 at 23:01 UTC
by Anonymous Monk on Feb 05, 2001 at 08:16 UTC
by eg (Friar) on Feb 05, 2001 at 08:21 UTC

Re: Getting rid of (.*) from a not-quite-complex regex.
by lemming (Priest) on Feb 03, 2001 at 23:13 UTC

Re: Getting rid of (.*) from a not-quite-complex regex.
by dws (Chancellor) on Feb 03, 2001 at 23:45 UTC

Re: Getting rid of (.*) from a not-quite-complex regex.
by chipmunk (Parson) on Feb 04, 2001 at 22:43 UTC

Re: Getting rid of (.*) from a not-quite-complex regex.
by jeroenes (Priest) on Feb 04, 2001 at 14:55 UTC