in reply to Performance Question

Also, if you want to talk about performance, the way you're looking for variables to change is going to be much more of a problem. Since you're checking every field on every line, the work grows as (number of fields) x (number of lines), which is roughly O(n^2). That could get really expensive if you're running those regexes on a large number of fields or lines. Rather than checking for all of the fields on each iteration of the while loop, you might do something like this:

while(<MYTEMPLATE>){
    # get the names of the fields on this line
    my @variables = $_ =~ m/<\?--(\S+?)-->/g;
    for my $variable ( @variables ){
        # replace the tag with the value of the variable
        # if there is one.
        s/<\?--$variable-->/$$variable/g if $$variable;
    }
}

That should yield something more like O(n), which should be noticeably faster in most cases (remember that an O(n^2) approach can be cheaper than an O(n) one for small inputs). It's also less error prone, and since it's dynamic, you won't have to update it when there's a new field to check.

Re^2: Performance Question
by Rhandom (Curate) on Feb 28, 2007 at 17:46 UTC
    ++ to you for more tries at solutions.

    The thing the OP doesn't state is how large the templates are. If the string is relatively short, then doing multiple regex passes is fine. The longer the template becomes, the more he needs to get the number of passes as low as he can if he wants it to be fast.

    The final question is whether this swapping bit is the bottleneck of the process. If the OP is running these as CGI processes, then it probably doesn't matter which algorithm he uses, because the startup time of the CGI is going to be costly, unless the templates being used are megabytes in size.
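
For what it's worth, the "as few passes as possible" idea might look roughly like the sketch below. It assumes the whole template fits in one string and that the values live in a hash, here called %value (a name and data made up for illustration, not something from the OP's code). Each tag is looked up at the moment it is matched, so the string is walked only once no matter how many fields there are.

```perl
use strict;
use warnings;

# Hypothetical data; the real values would come from the OP's program.
my %value = ( name => 'Widget', cost => 0 );

# Assumption: the whole template has been read into a single string.
my $template = "Item: <?--name-->\nCost: <?--cost-->\nNote: <?--missing-->\n";

# One pass over the string: each tag is looked up as it is matched.
# Tags without a defined value are put back unchanged.
(my $result = $template) =~ s{<\?--(\S+?)-->}{
    defined $value{$1} ? $value{$1} : "<?--$1-->"
}ge;

print $result;
```

Whether this actually beats the per-line loop depends on template size and field count, so it would be worth timing both on real data before committing to either.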

      Right, I actually just updated my post (not sure if you read it before or after), but the point is that this alternate algorithm won't save you much with a small number of variables and a small file, and it could actually cost you more for very small numbers, such as a file of any size with no variables defined.
        Right, I actually just updated my post (not sure if you read it before or after)

        But I don't see any indication in your post of you having done so. It is common and recommended practice here not to update posts silently, except for typos or other minor things, but to mark updates with visually distinctive "tags". Otherwise replies may cease to make sense, and the overall quality of the discussion may suffer.

Re^2: Performance Question
by blazar (Canon) on Mar 01, 2007 at 10:24 UTC
    my @variables = $_ =~ m/<\?--(\S+?)-->/g;

    One really minor nitpick: $_ is the topicalizer and as such is implicit in many operations, which makes for terse syntax. So one wants either an explicitly named variable like $line, and thus

    my @variables = $line =~ /<\?--(\S+?)-->/g;

    or

    my @variables = /<\?--(\S+?)-->/g;

    for my $variable ( @variables ){
        # replace the tag with the value of the variable
        # if there is one.
        s/<\?--$variable-->/$$variable/g if $$variable;
    }

    One major nitpick: I would've ++'ed your node for mentioning another WTDI, complete with interesting considerations, but I --'ed it since you're unnecessarily using symrefs, whereas we warn newbies all the time about the risk of doing so. It just amounts to using the symbol table as a generic hash, which is a good reason to use a specific one instead, e.g. %value. Failing to do so, you should be prepared to cope with some corner cases: what if he has a <?--0--> or <?--$--> tag? Well, improbable enough, granted, but even without that, putting all the values together in a separate structure is saner. The OP is clearly a newbie, so he may read your example as advocating symrefs and take them up as a programming habit that sooner or later may bite him in the neck.

    Also, I would throw a defined in there, for one may well want to replace, say, <?--cost--> with 0.
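
Putting the two nitpicks together, a minimal sketch of the loop with a dedicated hash and a defined test could look like this. The contents of %value and the sample lines are invented for illustration, and an array stands in for the MYTEMPLATE filehandle so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical values; note that cost is false but defined.
my %value = (
    name => 'Widget',
    cost => 0,
);

# Stand-in for the lines that would be read from MYTEMPLATE.
my @template = (
    "Item: <?--name-->\n",
    "Cost: <?--cost-->\n",
    "Note: <?--missing-->\n",
);

my @output;
for my $line (@template) {
    # Collect the tag names that actually appear on this line.
    my @variables = $line =~ /<\?--(\S+?)-->/g;
    for my $variable (@variables) {
        # Leave the tag untouched when there is no defined value for it.
        next unless defined $value{$variable};
        # \Q...\E stops odd tag names such as "$" being read as regex syntax.
        $line =~ s/<\?--\Q$variable\E-->/$value{$variable}/g;
    }
    push @output, $line;
}
print @output;
```

With a plain hash lookup, a tag like <?--0--> or <?--$--> is just another key that happens not to exist, so nothing in the symbol table is touched, and <?--cost--> is replaced by 0 because the test is defined rather than truth.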