in reply to Re: Generating string that conforms to a regexp
in thread Generating string that conforms to a regexp

I am sorry, I should've been more specific...

It seems like I want the opposite of what was discussed in that thread...

Given:

/^(\w){3}(\d){5}(\w){3,5}?/

I need to generate x strings that match the given regexp, such as:

ABC00000ABC
ABC01231ABCD
AAA11231BDBDB

The letters/numbers can be random.

Re: Re: Re: Generating string that conforms to a regexp
by dvergin (Monsignor) on Feb 02, 2002 at 00:23 UTC
    This looks fun. And since it appears that you are proposing to use a rather limited subset of regex possibilities, not too hard.

    Here's a working roughed-out beginning toward a generic solution for a certain subset of regex possibilities along the lines you describe.

    #!/usr/bin/perl -w
    use strict;

    sub randchrs {
        my ($type, $length) = @_;
        $length ||= '1';
        my @w = ('A'..'Z', 'a'..'z', '0'..'9', '_');
        my @d = ('0'..'9');
        my @chrs;
        for ($type) {
            /w/ && do {@chrs = @w; last};
            /d/ && do {@chrs = @d; last};
            # other options here
        }
        if ( $length =~ /(\d+),(\d+)/ ) {
            $length = $1 + int(rand($2 - $1 + 1));
        }
        my $answer = '';
        for (1..$length) {
            $answer .= $chrs[int(rand(@chrs))];
        }
        return $answer;
    }

    while ( my $regex = <DATA> ) {
        chomp $regex;
        $regex =~ s/\(?\\([wd])\)?({([\d,]+)})?/randchrs($1,$3)/eg;
        print "$regex\n";
    }

    __DATA__
    STRING: (\w){3}(\d){5}(\w){3,5}
    PHONE: \d{3}-\d{3}-\d{4}
    SSN: \d{3}-\d{4}-\d{3}
    MORE: several word chars: \w{3,8}
    WA_DL: \w{7}\d{3}\w{2}
    STUFF: \w\d\w\d\w\d
    ANGRY: "\w{5,9}", he said, "\w{5,9}, \w{5,9}, and \w{5,9}."

    That might serve to get you started.

    Note in passing: your description of the problem is a bit broader than the solution examples you offer, so you may need to tweak things accordingly. You will certainly need to tweak this example anyway.

    Grins, David

    ------------------------------------------------------------
    "Perl is a mess and that's good because the
    problem space is also a mess." - Larry Wall

      I really like this solution, but could you take the regex as a passed-in parameter instead of reading it from __DATA__? I know this will entail a far more complicated procedure to interpret it, but I bet it can be done.

      Frankly, this is beyond me right now as I have way too much to do to do this myself, but I would love to see someone make this into a usable generalized module. I could see a real use.

      Just my hope that someone else will do the work to save me the time ;)


      I admit it, I am Paco.
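
On taking the pattern as a parameter: one possible sketch, assuming the same limited subset as the script above (only \w and \d, optional parentheses, {n} and {n,m} counts). The name string_from_pattern and the %POOL table are made up for illustration and are not part of the original script. The sketch also drops a leading ^, a trailing $, and a lazy ? after a count, since the pattern in the question uses them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Character pools for the two escapes the original script handles.
my %POOL = (
    w => [ 'A' .. 'Z', 'a' .. 'z', '0' .. '9', '_' ],
    d => [ '0' .. '9' ],
);

# Return random characters of class $type ('w' or 'd').
# $count is "n", "n,m", or undef (meaning a single character).
sub randchrs {
    my ($type, $count) = @_;
    $count = 1 unless defined $count;
    my ($min, $max) = split /,/, $count;
    $max = $min unless defined $max;
    my $length = $min + int rand($max - $min + 1);
    my @chrs   = @{ $POOL{$type} };
    return join '', map { $chrs[ rand @chrs ] } 1 .. $length;
}

# Hypothetical wrapper: expand one pattern string into one random string.
# Anything other than \w, \d and their counts is passed through literally.
sub string_from_pattern {
    my ($pattern) = @_;
    $pattern =~ s/\A\^//;    # drop a leading ^ anchor
    $pattern =~ s/\$\z//;    # drop a trailing $ anchor
    $pattern =~ s/\(?\\([wd])\)?(?:\{(\d+(?:,\d+)?)\}\??)?/randchrs($1, $2)/eg;
    return $pattern;
}

# Take the pattern from the command line, falling back to the example
# from the original question.
my $pattern = @ARGV ? $ARGV[0] : '^(\w){3}(\d){5}(\w){3,5}?';
print string_from_pattern($pattern), "\n" for 1 .. 3;
```

Pass the pattern in single quotes on the command line so the shell leaves the backslashes alone. A cheap sanity check is to match each generated string against the pattern it came from.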
      Wow, thanks David! Your code helped a lot! You're right, my examples were too simple, the requirements are a little broader... I forgot to mention that I might need [] as well as | in there too... But I'm working on it!
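
For the [] and | cases, here is a rough sketch of one way to go about it, not a finished module. It is a different technique from the single substitution above: a small hand-rolled scanner that picks one alternative at random and then walks it atom by atom. All the names (generate, split_alternatives, class_chars, pick) are invented for this sketch. It assumes non-negated classes, {n} and {n,m} counts only, and Perl 5.10 or later for the recursive (?1) subpattern. Other metacharacters such as . * + and a bare ? are not interpreted and come out as literal characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %ESC = (
    w => [ 'A' .. 'Z', 'a' .. 'z', '0' .. '9', '_' ],
    d => [ '0' .. '9' ],
);

# Return one random element of the argument list.
sub pick { my @c = @_; return $c[ rand @c ] }

# Expand the body of a non-negated [...] class into a list of characters.
sub class_chars {
    my ($body) = @_;
    my @chars;
    while (length $body) {
        if    ($body =~ s/\A\\([wd])//)  { push @chars, @{ $ESC{$1} } }
        elsif ($body =~ s/\A\\(.)//s)    { push @chars, $1 }
        elsif ($body =~ s/\A(.)-(.)//s)  { push @chars, map { chr($_) } ord($1) .. ord($2) }
        else                             { push @chars, substr($body, 0, 1, '') }
    }
    return @chars;
}

# Split on | characters that sit outside (...) and [...].
sub split_alternatives {
    my ($pattern) = @_;
    my @alts = ('');
    my ($depth, $in_class) = (0, 0);
    while ($pattern =~ /\G(\\.|.)/gs) {
        my $tok = $1;
        if    ($in_class)               { $in_class = 0 if $tok eq ']' }
        elsif ($tok eq '[')             { $in_class = 1 }
        elsif ($tok eq '(')             { $depth++ }
        elsif ($tok eq ')')             { $depth-- }
        elsif ($tok eq '|' && !$depth)  { push @alts, ''; next }
        $alts[-1] .= $tok;
    }
    return @alts;
}

# Produce one random string for the supported subset of patterns.
sub generate {
    my ($pattern) = @_;
    my $branch = pick(split_alternatives($pattern));
    my $out    = '';
    while (length $branch) {
        my $atom;    # code ref that yields one occurrence of the atom
        if ($branch =~ s/\A[\^\$]//) {
            next;    # anchors contribute no characters
        }
        elsif ($branch =~ s/\A\\([wd])//) {
            my $type = $1;
            $atom = sub { pick(@{ $ESC{$type} }) };
        }
        elsif ($branch =~ s/\A\[((?:\\.|[^\]\\])+)\]//s) {
            my @chars = class_chars($1);
            $atom = sub { pick(@chars) };
        }
        elsif ($branch =~ s/\A(\((?:\\.|[^()\\]|(?1))*\))//s) {
            my $inner = substr($1, 1, -1);    # strip the outer parens
            $atom = sub { generate($inner) }; # re-rolled on every repetition
        }
        else {
            $branch =~ s/\A\\?(.)//s;         # literal (possibly escaped) char
            my $lit = $1;
            $atom = sub { $lit };
        }
        my ($min, $max) = (1, 1);
        if ($branch =~ s/\A\{(\d+)(?:,(\d+))?\}\??//) {
            ($min, $max) = ($1, defined $2 ? $2 : $1);
        }
        my $n = $min + int rand($max - $min + 1);
        $out .= $atom->() for 1 .. $n;
    }
    return $out;
}

print generate($_), "\n" for
    '^(\w){3}(\d){5}(\w){3,5}?',
    '[A-F]{2}\d{4}|[x-z]{6}',
    '(foo|ba[rz]){2}-\d{3}';
```

Because a group is stored as its inner text and regenerated for each repetition, something like (foo|ba[rz]){2} can give a different choice each time around. Parentheses inside a character class would confuse the group matcher, so treat that as a known gap.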