Re: Performance Question
by Rhandom (Curate) on Feb 28, 2007 at 16:03 UTC
|
The first gut reaction is to say - use a template system such as Template::Toolkit, CGI::Ex::Template, or HTML::Template. However, that sometimes is overkill and it seems that it would be in your case.
The big thing that would help in your case is to simplify your system into something similar to the following:
my %swap = (
name_1 => $name_1,
name_2 => $name_2,
name_3 => $name_3,
name_4 => $name_4,
name_5 => $name_5,
name_6 => $name_6,
);
WHILE (<MYTEMPLATE>) {
s{<\?--(name_\d+)-->}{
defined($swap{$1}) ? $swap{$1} : ''
}eg;
}
The key thing that makes this update faster is that you are limiting yourself to a single regex pass through. Doing multiple regex passes is going to give you a speed hit.
my @a=qw(random brilliant braindead); print $a[rand(@a)];
| [reply] [d/l] |
A reply falls below the community's threshold of quality. You may see it by logging in. |
Re: Performance Question
by kyle (Abbot) on Feb 28, 2007 at 16:02 UTC
|
This is going to leave any <?--name_x--> item in your template unchanged if you don't have a defined value for it. Is that what you want?
You might look at using something like HTML::Template or Template Toolkit 2.
| [reply] [d/l] |
|
|
Yes, that's what I want, and definitely not any warning messages saying things like "Use of uninitialized value..."
| [reply] |
|
|
If you want to only swap in values that exist then use the following:
s{(<\?--(\w+)-->)}{
defined($swap{$2}) ? $swap{$2} : $1
}eg;
That will only replace values that match have a corresponding entry in %swap.
If you are brave you can use something like this:
s/<\?--(\w+)-->(??{ defined($swap{$^N}) ? '(?=)' : '(?!)' })/$swap{$1}
+/g;
It also does what you are looking for - but without the e modifier on the swap. You will probably need to read perlre to follow it fully - but essentially you are using the (??{}) block to check your regex as you go and then using ?= and ?! to report success or failure.
my @a=qw(random brilliant braindead); print $a[rand(@a)];
| [reply] [d/l] [select] |
Re: Performance Question
by davorg (Chancellor) on Feb 28, 2007 at 16:59 UTC
|
If you have a large number of variables that you want to access in the same way, then it makes sense to store them in some kind of aggregate variable. In this case I'd go for a hash.
use CGI ':cgi';
# list of parameter names
my @params = qw(city zip lastname country other obs);
# extract all parameters into a hash
my %param;
foreach (@params) {
$param{$_} = param($_);
}
# time passes...
while (<MYTEMPLATE>) {
foreach my $p (@params) {
s/<\?--$p-->/$param{$p}/g if defined $param{$p};
# you'll probably want to print it too
}
}
| [reply] [d/l] |
Re: Performance Question
by agianni (Hermit) on Feb 28, 2007 at 16:03 UTC
|
My understanding is that this won't make a difference. You won't get warnings if $name_x isn't defined because, despite the fact that the if statement comes after the substitution statement, it's executed first. I don't think there will be a significant enough performance difference (someone else might be able to answer this from an internal perspective?), but you could Benchmark it. | [reply] |
Re: Performance Question
by agianni (Hermit) on Feb 28, 2007 at 17:34 UTC
|
Also, if you want to talk about performance, the way in which you're looking for variables to change is going to be much more of a problem. Since you're going through each field on each line, this is roughly O(n^2) in complexity. That could get really expensive if you're running those regexes on a large number of fields/lines. Rather than checking for all of the fields on each iteration of the while loop, you might do something like this:
while(<MYTEMPLATE>){
# get the names of the fields on this line
my @variables = $_ =~ m/<\?--(\S+?)-->/g;
for my $variable ( @variables ){
# replace the tag with the value of the variable
# if there is one.
s/<\?--$variable-->/$$variable/g if $$variable;
}
}
That should yield something more like O(n), which should be noticeably faster in most cases (remember that O(n^2) smaller than O(n) for small numbers), less error prone and since it's dynamic, you won't have to update it when there's a new field to check.
| [reply] [d/l] |
|
|
| [reply] |
|
|
Right, I actually just updated my post (not sure if you read it before or after) but the point is that this alternate algorithm won't save you much with a small number of variables and a small file and could actually cost you more for very small numbers, like any sized file with no variables defined.
| [reply] |
|
|
|
|
my @variables = $line =~ /<\?--(\S+?)-->/g;
or
my @variables = /<\?--(\S+?)-->/g;
for my $variable ( @variables ){
# replace the tag with the value of the variable
# if there is one.
s/<\?--$variable-->/$$variable/g if $$variable;
}
One major nitpick: I would've ++'ed your node for mentioning another WTDI, complete of interesting considerations, but I --'ed it since you're unnecessarily using symrefs whereas we warn newbies all the time about the risk of doing so. It just amounts to using the symbol table as a generic hash, which is a good reason to use a specific one instead, e.g. %value. Failing to do so you should be prepared to cope with some corner cases: what if he as a <?--0--> or <?--$--> tag? Well, unprobable enough, granted, but even without that putting all values together in a separate structure is saner. The OP is clearly a newbie, so he may read your example as advocating symrefs and take them as a programming habit, that first or later may bite him in the neck.
Also, I would throw a defined in there for one may well want to replace say <?--cost--> with 0. | [reply] [d/l] [select] |
Re: Performance Question
by wfsp (Abbot) on Feb 28, 2007 at 18:28 UTC
|
I think there's some heavy weather being made of this. So here's my go. :-)
There's not a regex or any html to be seen. Neither is there any 'hard coded' field names. All the data is taken straight from the CGI.pm param method.
This is, imho, the most effiecient way to go. :-)
#!/usr/local/bin/perl
use strict;
use warnings;
use Data::Dumper;
use CGI;
use HTML::Template;
# create query object with some params
# (straight from the docs) :-)
my $q = new CGI(
{
'dinosaur' => 'barney',
'song' => 'I love you',
'friend' => 'george',
}
);
# create data structure for HTML::Template
my @param_loop = map {
{name => $_, value =>$q->param($_), }
} $q->param;
# get the template from DATA (this would normally be in a file)
my @tmpl = <DATA>;
# blast off!
my $t = HTML::Template->new(arrayref => \@tmpl);
$t->param(param_loop => \@param_loop);
print $t->output;
__DATA__
<html><head><title>template</title></head>
<body>
<TMPL_LOOP NAME=param_loop>
<p><TMPL_VAR NAME=name> -> <TMPL_VAR NAME=value></p>
</TMPL_LOOP>
</body>
</html>
output:
<html><head><title>template</title></head>
<body>
<p>friend -> george</p>
<p>song -> I love you</p>
<p>dinosaur -> barney</p>
</body>
update:
From the docs:
If a value is not given in the query string, as in the queries "name1=&name2=" or "name1&name2", it will be returned as an empty string.
So you won't get any pesky undefs either. | [reply] [d/l] [select] |