in reply to a small regexp question

Use a regex with a negated character class to capture the pieces, then assemble the result with join. This has the advantage of collapsing runs of underscores (or whatever the character class excludes) into a single separator while leaving the original string untouched.
my $found = join ' ', $string =~ /([^_]+)/g;
print "$found\n";
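
To see the normalising in action, here is a small self-contained sketch. The sample value of $string is my own hypothetical input with doubled and tripled underscores; everything else is the same join-over-captures idea as above.

```perl
use strict;
use warnings;

# Hypothetical sample input with runs of underscores.
my $string = "a__sample___string";

# In list context, //g returns every capture; join glues them with one space.
my $found = join ' ', $string =~ /([^_]+)/g;

print "$found\n";     # a sample string
print "$string\n";    # a__sample___string (the original is not modified)
```

Because the match only reads $string, this also works when the input is a constant or otherwise read-only.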

Re^2: a small regexp question
by newroz (Monk) on Jul 29, 2005 at 11:52 UTC
    I'll explain why I need this weirdness. There is a web application
    that we use internally at our company for some purposes.
    We give it a pattern in one entry box, and another entry box accepts a regular expression.
    It uses the matched value (a single value, not a list of matched values!). In that situation I can't perform any substitution or transliteration;
    all I can do is get something into $1. I hope that clarifies things. N

      So... assuming this really is Perl, and not something else using PCRE, this seems to work:

      my $string = "a_sample_string";
      $string =~ m/(?{ tr|_| | })(.*)/;
      print $1;

      That said, this is terribly fragile and dangerous. I wouldn't actually do that. You are essentially reaching out from inside the regex and altering the string being matched before the rest of the pattern tries to match it. It does manage to meet your criterion, though.

      Update: Remember that this is actually a destructive match, though... you aren't just returning $1, but also altering the match input, wherever that comes from. This will fail if the match input is immutable.

        This won't work if the perl code on the back end works like this:
        my $string = "a_sample_string";
        my $re = q[(?{ tr|_| | })(.*)];
        # or: my $re = some CGI parameter extraction
        $string =~ m/$re/;
        print $1;
        Which is, you must admit, much more likely than forming a string and passing it to eval (but some people do some amazingly insecure things). Passing the string through the qr operator doesn't put us back in business either: qr applies the same run-time interpolation check, and only an already compiled qr object may carry a code block into another pattern. (I assume they aren't using the obviously insecure use re 'eval';)

        The lesson here for anyone writing CGI scripts that do regexp manipulation is this:
        Regexps can execute arbitrary code through (?{ ... }) blocks. DO NOT take arbitrary user input and compile it under use re 'eval', or paste it into a string that gets eval'ed, unless you mean to allow arbitrary code execution.
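
It is worth checking what your own perl does with this rather than taking anyone's word for it. The following is a minimal sketch: $re is a stand-in for whatever the form would submit (an assumption, not the real application), and the eval { } blocks are only there to trap a refusal so that both attempts get reported. On the perls I know of, both attempts are refused with an "Eval-group not allowed at runtime" error unless use re 'eval' is in effect.

```perl
use strict;
use warnings;

my $string = "a_sample_string";
my $re     = q[(?{ tr|_| | })(.*)];   # stand-in for the form input (assumption)

# Attempt 1: interpolate the plain string straight into a match.
my $matched     = eval { $string =~ m/$re/; 1 };
my $match_error = $@;

# Attempt 2: compile the same plain string with qr first.
my $compiled = eval { my $qr = qr/$re/; 1 };
my $qr_error = $@;

print "m/\$re/ : ", ($matched  ? "ran\n" : "refused: $match_error");
print "qr/\$re/: ", ($compiled ? "ran\n" : "refused: $qr_error");
```

If either line reports "ran", the code block in the user-supplied pattern was executed, and that is exactly the situation to avoid with untrusted input.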

        -- @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/