Wassercrats has asked for the wisdom of the Perl Monks concerning the following question:

I have several complex regular expressions that act as natural-language interpreters. They identify things such as nondescript terms in link text ("click here", "more", etc.), to which the title of the target page should be appended. For myself and for the user documentation, I'd like a list of the terms those regexes match. Some entries would need a description, for example when a term must be preceded by some number of characters, but those descriptions would be in English, and wherever practical a regex would be expanded into a list of the terms it could match. I'd probably edit the list for the documentation to remove nonsensical terms that wouldn't appear in link text.

No such script already out there, is there?

  • Comment on Regular expression to English translation

Re: Regular expression to English translation
by broquaint (Abbot) on Nov 27, 2003 at 12:18 UTC
    Perhaps japhy's YAPE::Regex::Explain is what you're after, e.g.:
    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain->new(qr/this.*(?:that)?(?!another)/)->explain;
    __output__
    The regular expression:

    (?-imsx:this.*(?:that)?(?!another))

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      this                     'this'
    ----------------------------------------------------------------------
      .*                       any character except \n (0 or more times
                               (matching the most amount possible))
    ----------------------------------------------------------------------
      (?:                      group, but do not capture (optional
                               (matching the most amount possible)):
    ----------------------------------------------------------------------
        that                     'that'
    ----------------------------------------------------------------------
      )?                       end of grouping
    ----------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
    ----------------------------------------------------------------------
        another                  'another'
    ----------------------------------------------------------------------
      )                        end of look-ahead
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
    HTH

    _________
    broquaint

      My fear is that that's the best that could be done. I haven't completely thought through how practical what I want is, but I would need something more like a list of expressions that match with no reference to regex code.
        Then the likes of Parse::RandGen are probably worth looking at. Here's some sample usage:
        use Parse::RandGen;
        my $p = Parse::RandGen::Regexp->new(qr/a(b|c)d?/);
        # pick returns one random matching string; print each on its own line
        print $p->pick, "\n" for 1 .. 5;
        __output__
        abd
        ab
        ac
        acd
        abd
        HTH

        _________
        broquaint

Re: Regular expression to English translation
by traveler (Parson) on Nov 27, 2003 at 16:49 UTC
Re: Regular expression to English translation
by Cody Pendant (Prior) on Nov 28, 2003 at 00:43 UTC

    There was a thread here a while ago about that, called something like "list of strings that match a regular expression" but I can't find it by searching, annoyingly.

    Of course with any use of star in regular expressions, the number of matches is infinite, right?
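
To make the point about star concrete, here is a minimal brute-force sketch, not something taken from any module mentioned in this thread. The helper name `matches_up_to`, the two-letter alphabet and the test regex `ab*` are all assumptions chosen for illustration. It lists every string up to a given length that the whole regex matches, so you can watch the list keep growing as the length cap is raised.

```perl
use strict;
use warnings;

# Hypothetical helper: return every string built from @$alphabet, at most
# $max_len characters long, that the regex matches from start to end.
sub matches_up_to {
    my ($regex, $alphabet, $max_len) = @_;
    my @found;
    my @level = ('');    # all candidate strings of the current length
    for my $len (0 .. $max_len) {
        push @found, grep { /\A$regex\z/ } @level;
        # Extend every candidate by one character for the next round.
        @level = map { my $head = $_; map { $head . $_ } @$alphabet } @level;
    }
    return @found;
}

# Raising the cap by one adds one more match for ab*, with no upper limit.
for my $max_len (1 .. 4) {
    my @hits = matches_up_to(qr/ab*/, ['a', 'b'], $max_len);
    printf "up to %d chars: %d match(es): %s\n",
        $max_len, scalar @hits, join(', ', @hits);
}
```

The candidate list doubles with every extra character here, so this only works as a demonstration on tiny alphabets and short lengths. Any practical listing tool has to cap or summarise the open-ended parts instead.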



    ($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss') =~y~b-v~a-z~s; print
      Infinitely long phrases are one of my concerns with Parse::RandGen, but I might be able to do something about that. I'd really like a module that would handle a regex like /Hello\s{1,3}world.*/ by outputting:

      Hello world [and any text that follows]
      Hello  world [and any text that follows]
      Hello   world [and any text that follows]

      Kind of a compromise between YAPE::Regex::Explain and Parse::RandGen. Then, I would just eliminate the phrases with multiple spaces and explain in a footnote that multiple spaces between words would match. I guess I have very specialized needs and I expect to have to customize the modules.
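
The compromise described above might be sketched roughly as follows. This is a toy, not the behaviour of YAPE::Regex::Explain or Parse::RandGen: the function name `expand_regex` is hypothetical, and it only understands a small assumed subset of regex syntax (literal characters, `\s` shown as a single space, flat `(a|b)` groups, the bounded quantifiers `?`, `{n}` and `{m,n}`, and `.*` or `.+` rendered as an English note). Anything else makes it die rather than guess.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a small regex subset into example phrases.
# The pattern is passed as a plain string, not as a compiled qr// object.
sub expand_regex {
    my ($pattern) = @_;
    my @phrases = ('');
    pos($pattern) = 0;

    while (pos($pattern) < length $pattern) {
        my @choices;    # the strings the current atom can stand for

        if ($pattern =~ /\G\((?:\?:)?([^()]*)\)/gc) {
            @choices = split /\|/, $1, -1;      # flat alternation group
            @choices = ('') unless @choices;
        }
        elsif ($pattern =~ /\G\\s/gc) {
            @choices = (' ');                   # show \s as one space
        }
        elsif ($pattern =~ /\G\.[*+]/gc) {
            # Open-ended tail: describe it in English instead of expanding.
            $_ .= ' [and any text that follows]' for @phrases;
            next;
        }
        elsif ($pattern =~ /\G([^\\().*+?{}|\[\]^\$])/gc) {
            @choices = ($1);                    # ordinary literal character
        }
        else {
            die "unsupported syntax at offset " . pos($pattern) . "\n";
        }

        # Optional bounded quantifier after the atom.
        my ($min, $max) = (1, 1);
        if ($pattern =~ /\G\?/gc) {
            ($min, $max) = (0, 1);
        }
        elsif ($pattern =~ /\G\{(\d+)(?:,(\d+))?\}/gc) {
            ($min, $max) = ($1, $2 // $1);
        }

        # Collect every repetition of the atom from $min to $max times.
        my @repeats;
        my @level = ('');
        for my $count (0 .. $max) {
            push @repeats, @level if $count >= $min;
            @level = map { my $head = $_; map { $head . $_ } @choices } @level;
        }

        # Append each repetition to each phrase built so far.
        @phrases = map { my $head = $_; map { $head . $_ } @repeats } @phrases;
    }
    return @phrases;
}

print "$_\n" for expand_regex('Hello\s{1,3}world.*');
```

For the regex in question this yields three phrases, one per allowed number of spaces, each ending in the bracketed note. Trimming the multi-space variants and adding a footnote would still be a manual step. Unbounded quantifiers on anything other than `.` are deliberately rejected, since they are exactly the case that cannot be listed.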