mellin has asked for the wisdom of the Perl Monks concerning the following question:

I have an idea to write a script that produces HTML forms dynamically. It's primarily meant for a simple blogging (weblog) system whose users I cannot expect to know much about that sort of thing; they just want to get it done. Hopefully quickly, too.

Getting a few regular expressions right has been difficult. An element can have a definition and an alternative definition, both of which are shown on the form the script produces. Here's how I define the form elements inside the script:

# three text inputs, each with a size 30 tag
form('input', 'text', 'user_name:Name', '30', 'user_email:Email|format user@domain.com', '30', 'user_title:Topic', '30');
# textarea element, size 30 cols x 10 rows
form('textarea', 'user_message:Feedback', '30x10');
# select element, name 'city', four options
form('select:sort', 'city:City', 'Tampere', 'Helsinki', 'Espoo', 'Turku');
# input checkbox
form('input', 'checkbox', 'reply|waiting for answer?');
# input submit
form('input', 'submit', 'send');
So bits and pieces like this:

formElement:definition|Alternative definition

I cannot seem to write regular expressions that can separate this kind of string. The first one always returns true? I've tried it this way:

sub split {
    my %key;
    if ($_[0] =~ /^(.+?):(.+?)$/) {
        $key{'name'} = $1;
        $key{'definition'} = $2;
        $key{'altdefinition'} = '';
    } elsif ($_[0] =~ /^(.+?)\|(.+?)$/) {
        $key{'name'} = $1;
        $key{'altdefinition'} = $2;
        $key{'definition'} = '';
    } elsif ($_[0] =~ /^(.+?):(.+?)\|(.+?)$/) {
        $key{'name'} = $1;
        $key{'definition'} = $2;
        $key{'altdefinition'} = $3;
    }
    return %key;
}
What characters should I choose as delimiters, and how could this be made "foolproof"?

Replies are listed 'Best First'.
Re: Getting regular expression right
by blazar (Canon) on Feb 15, 2005 at 11:07 UTC
    sub split {
    Do you really want to call this split()? IMHO it's not a good idea; I recommend you try something along the lines of mysplit() instead.
    my %key; if ($_[0] =~ /^(.+?):(.+?)$/) {
    This may be largely a matter of personal preference, but I think you'd be better off shift()ing the argument into a more descriptive variable (or maybe a local()ised $_).
    $key{'name'} = $1; $key{'definition'} = $2; $key{'altdefinition'} = '';
    Also you may use the match operator's return value in list context. Maybe you knew and maybe you didn't. No harm done mentioning the possibility, IMHO.
    } elsif ($_[0] =~ /^(.+?)\|(.+?)$/) { $key{'name'} = $1; $key{'altdefinition'} = $2; $key{'definition'} = '';
    In my personal experience, cascaded elsifs are seldom really necessary in Perl, and in general they rarely add to clarity or conciseness.

    I may be wrong, but couldn't you choose just one separator? Also I think you'd be better off using a real split() on such a separator (and possibly do some checks on the return values, e.g. to avoid 'uninitialized' warnings)...

    What characters should I choose as delimiters, and how could this be made "foolproof"?
    You cannot make it foolproof, for as soon as you think you have, along will come a better fool!
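The single-separator idea above could look roughly like the following. This is only a sketch: mysplit() is the name suggested in the reply, while the choice of '|' as the one separator and the field order name|definition|altdefinition are assumptions made for illustration.

```perl
use strict;
use warnings;

# Sketch only: assumes '|' is the single separator and that the fields
# come in the order name|definition|altdefinition.
sub mysplit {
    my $spec = shift;    # shift into a descriptive variable instead of using $_[0]

    # A limit of 3 keeps any further '|' characters inside the last field.
    my ($name, $definition, $altdefinition) = split /\|/, $spec, 3;

    # Replace missing fields with '' to avoid 'uninitialized' warnings later.
    return (
        name          => defined $name          ? $name          : '',
        definition    => defined $definition    ? $definition    : '',
        altdefinition => defined $altdefinition ? $altdefinition : '',
    );
}

my %key = mysplit('user_email|Email|format user@domain.com');
print "$key{name} / $key{definition} / $key{altdefinition}\n";
```

With this layout a field that has only an alternative definition would be written with an empty middle part, for example 'reply||waiting for answer?'.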
Re: Getting regular expression right
by Anonymous Monk on Feb 15, 2005 at 10:58 UTC
    The first one is true whenever the string contains a colon with at least one character on each side of it (assuming no newlines in the string). Hence, the third alternative will never be selected: if it would match, the first alternative would have matched already.
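To see the effect of the ordering, the sketch below first applies the question's first pattern to a string that has both a colon and a pipe, then retries with the most specific pattern tested first. The name parse_field() and the choice to fall back to the whole string as the name are assumptions for illustration, not part of the original script.

```perl
use strict;
use warnings;

my $spec = 'user_email:Email|format user@domain.com';

# The first pattern from the question, used in list context: it already
# matches, and the pipe part ends up inside the definition.
my ($name, $definition) = $spec =~ /^(.+?):(.+?)$/;
print "name=$name definition=$definition\n";

# Hypothetical helper: same three patterns, most specific one first.
sub parse_field {
    my ($field) = @_;
    # Assumption: a string without any delimiter is treated as a bare name.
    my %key = (name => $field, definition => '', altdefinition => '');

    if ($field =~ /^(.+?):(.+?)\|(.+?)$/) {
        @key{qw(name definition altdefinition)} = ($1, $2, $3);
    }
    elsif ($field =~ /^(.+?):(.+?)$/) {
        @key{qw(name definition)} = ($1, $2);
    }
    elsif ($field =~ /^(.+?)\|(.+?)$/) {
        @key{qw(name altdefinition)} = ($1, $2);
    }
    return %key;
}

my %key = parse_field($spec);
print "name=$key{name} definition=$key{definition} alt=$key{altdefinition}\n";
```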

    I'd do something like:

    if (/$_[0] =~ /^([^:]+):([^|]+)(?:\|(.*))?$/ || $_[0] =~ /^([^|)+)\|(.*)$/) {
        @key{qw /name definition altdefinition/} = ($1, $2, defined $3 ? $3 : "")
    }
    It would even be simpler if you used one kind of delimiter, say the pipe. Then you could do:
    if ($_[0] =~ /\|/) {
        @key{qw /name definition altdefinition/} = split /\|/, $_[0];
        $key{altdefinition} = "" unless defined $key{altdefinition};
    }
    I certainly wouldn't call my sub 'split'.
      if (/$_[0] =~ /^([^:]+):([^|]+)(?:\|(.*))?$/ || $_[0] =~ /^([^|)+)\|(.*)$/) {
          @key{qw /name definition altdefinition/} = ($1, $2, defined $3 ? $3 : "")
      }
      I cannot get this example working? And should @key be %key?
        I cannot get this example working?

        If you say so. Why not? No match? Error? Match when it shouldn't? Wrong content of variables? Explain yourself.

        And should @key be %key?

        No.
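Some background on that "No": @key{...} is a hash slice, which assigns to several keys of %key at once, so the @ sigil is correct. Separately, the snippet as posted appears to contain two typos, a stray / right after "if (" and [^|) where [^|] was presumably intended, which would keep it from compiling. The sketch below is not the poster's code; it is one possible single-regex variant, with parse_spec() as a hypothetical name, in which both the :definition and |altdefinition parts are optional.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from the thread.
sub parse_spec {
    my ($spec) = @_;
    my %key;

    # name, then an optional ':definition', then an optional '|altdefinition'
    if ($spec =~ /^([^:|]+)(?::([^|]*))?(?:\|(.*))?$/) {
        # Hash slice: sets three keys of %key in a single assignment.
        @key{qw(name definition altdefinition)} =
            map { defined $_ ? $_ : '' } ($1, $2, $3);
    }
    return %key;
}

my %key = parse_spec('user_email:Email|format user@domain.com');
print join(' / ', @key{qw(name definition altdefinition)}), "\n";
```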

      Thanks. Calling the sub 'split' was just temporary insanity :) In the real script it's called 'extra'.
Re: Getting regular expression right
by Animator (Hermit) on Feb 15, 2005 at 11:03 UTC

    Your regexes are correct. (assuming formElement:definition|Alternative definition is the input.)

    What is incorrect is that you named your own subroutine split, which is also the name of a built-in function.

    Your own subroutine will only be called when you call it via &split($something); otherwise the built-in one will be called.

    Here are two suggestions:
    a) put a warn/print statement (for debugging purposes) in your function and see if it is called.
    b) enable warnings.
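The difference between the two call forms can be checked with a small experiment along these lines. It is a sketch with made-up return text; the sub here only reports that it was called.

```perl
use strict;
use warnings;

# A user-defined sub that shares its name with the built-in.
sub split {
    return 'user-defined split called';
}

# Plain call: Perl parses this as the built-in split (pattern, string).
my @parts = split /\|/, 'reply|waiting for answer?';

# Ampersand call: this form reaches the sub defined above.
my $result = &split('reply|waiting for answer?');

print "plain call returned ", scalar(@parts), " fields\n";
print "ampersand call returned: $result\n";
```

Renaming the sub, as suggested elsewhere in this thread, avoids the confusion altogether.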