ViceRaid has asked for the wisdom of the Perl Monks concerning the following question:
Hullo
I have a script that's nothing but a lot of simple =~ s///s. It's a redirect script for Squid; customers point their domain to our machine, and that box then forwards each request on to whichever of our servers should handle it. So, the script looks something like:
    while ( <> ) {
        ...
        elsif ( s|http://www.theirsite.com/\W|http://our.server1/theirsite/\n|i ){ }
        elsif ( s|http://www.theirsite.com/|http://our.server1/theirsite/|i ) { }
        elsif ( s|http://www.dummy.com/\W|http://our.server2/dummy/\n|i ) { }
        # .. ad nauseam
    }
A couple of questions. I'd love some ideas for how to make this work more straightforwardly, especially defining the rules more clearly than a long list of regexes. Maybe it could use a text config file which expresses the simple, similar regexes that should be compiled at start up?
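One way the config-file idea might look, as a minimal sketch rather than a finished redirector: each rule is a "source prefix, target prefix" pair, compiled once with qr// at start-up. The rule format, the inline `$config` text (standing in for a real file), and the sub names `compile_rules` and `rewrite` are all assumptions made for illustration. Unlike the original substitutions, the patterns here are anchored at the start of the URL and the dots are escaped with \Q...\E.

```perl
use strict;
use warnings;

# Hypothetical rule format: "<source prefix> <target prefix>", one rule per line.
# In a real script this text would be read from a config file at start-up.
my $config = <<'END';
# source prefix              target prefix
http://www.theirsite.com/    http://our.server1/theirsite/
http://www.dummy.com/        http://our.server2/dummy/
END

# Compile every rule once into a [ regex, replacement ] pair.
sub compile_rules {
    my ($text) = @_;
    my @rules;
    for my $line ( split /\n/, $text ) {
        next if $line =~ /^\s*(?:#|$)/;        # skip comments and blank lines
        my ( $from, $to ) = split ' ', $line;
        next unless defined $to;               # ignore malformed lines
        push @rules, [ qr/^\Q$from\E/i, $to ]; # \Q..\E escapes the dots
    }
    return \@rules;
}

# Apply the first rule that matches; return the URL unchanged if none does.
sub rewrite {
    my ( $rules, $url ) = @_;
    for my $rule (@$rules) {
        my ( $re, $to ) = @$rule;
        return $url if $url =~ s/$re/$to/;
    }
    return $url;
}

my $rules = compile_rules($config);
print rewrite( $rules, 'http://www.theirsite.com/index.html' ), "\n";
# prints http://our.server1/theirsite/index.html
```

Adding a customer then means adding one line to the rule text instead of another elsif branch.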
Secondly, since only one rule gets applied to each incoming URL, the most frequently used rules (which we can test against the logs) should go near the top, and the others near the bottom. However, it's a royal PITA to test and develop this. Any ideas on how to benchmark this painlessly, or a better algorithm - perhaps something B-Tree-ish - to order the rules?
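If every rule really is "this host goes to that prefix", the ordering problem might be avoided altogether by swapping the regex list for a hash keyed on host name, so each URL costs one lookup however many rules there are. This is a different technique from the B-Tree idea above, and only a sketch: the `%target_for` table and the `redirect` sub are hypothetical names, and it assumes the caller has already pulled the URL out of the line Squid passes in.

```perl
use strict;
use warnings;

# Hypothetical host-to-target table; it could equally be loaded from a config file.
my %target_for = (
    'www.theirsite.com' => 'http://our.server1/theirsite/',
    'www.dummy.com'     => 'http://our.server2/dummy/',
);

# Rewrite one URL with a single hash lookup on the lower-cased host name.
sub redirect {
    my ($url) = @_;
    if ( $url =~ m{^http://([^/:\s]+)/(\S*)}i ) {
        my ( $host, $path ) = ( lc $1, $2 );
        return $target_for{$host} . $path if exists $target_for{$host};
    }
    return $url;    # no rule for this host: pass the URL through unchanged
}

print redirect('http://www.dummy.com/images/logo.png'), "\n";
# prints http://our.server2/dummy/images/logo.png
```

To check whether this is actually faster on real traffic, the core Benchmark module's `cmpthese` could run both versions over a sample of URLs taken from the logs.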
thanks
ViceRaid
Update: Clarified as per diotalevi's nit
Replies are listed 'Best First'.
Re: Organising lots of simple regexes
by hardburn (Abbot) on Jan 27, 2004 at 14:23 UTC
by ViceRaid (Chaplain) on Jan 27, 2004 at 16:05 UTC
by hardburn (Abbot) on Jan 27, 2004 at 16:14 UTC

Re: Organising lots of simple regexes
by Abigail-II (Bishop) on Jan 27, 2004 at 13:53 UTC

Re: Organising lots of simple regexes
by BrowserUk (Patriarch) on Jan 27, 2004 at 16:27 UTC

Re: Organising lots of simple regexes
by ambrus (Abbot) on Jan 27, 2004 at 14:30 UTC

Re: Organising lots of simple regexes
by diotalevi (Canon) on Jan 27, 2004 at 15:56 UTC
by ViceRaid (Chaplain) on Jan 27, 2004 at 16:11 UTC
by diotalevi (Canon) on Jan 27, 2004 at 17:18 UTC

Re: Organising lots of simple regexes
by ysth (Canon) on Jan 27, 2004 at 18:53 UTC