Re: Re: Regex perplexity

Thanks. I wasn't entirely clear on the input format: each file has a number of those data elements in it (i.e. the example is one file), and I need to extract one of them by ID number. I have altered your code to this (where $id_flag is set once we come across the appropriate ID number):

sub FindPositions {
  my ($string, $id) = @_;

  my ($firstQ, $firstS);
  my ($lastQ, $lastS);
  my $id_flag;

  # read the string line by line through an in-memory filehandle
  # (shelling out to echo breaks as soon as the string contains a quote)
  open(my $string_fh, '<', \$string) or die "Can't read string: $!";

  # skip ahead to the requested ID, then take the first Query/Sbjct pair
  while (defined(my $line = <$string_fh>))
  {
    $id_flag = 1 if ($line =~ /<a name = $id>/);
    next unless $id_flag;
    ($firstQ, $lastQ) = ($1, $2) if ($line =~ /^Query:\s+?(\d+)[\sgcat]*(\d+)/);
    ($firstS, $lastS) = ($1, $2) if ($line =~ /^Sbjct:\s+?(\d+)[\sgcat]*(\d+)/);
    last if (defined($firstQ) && defined($firstS));
  }

  # keep overwriting the end values until the input runs out
  while (defined(my $line = <$string_fh>))
  {
    $lastQ = $2 if ($line =~ /^Query:\s+?(\d+)[\sgcat]*(\d+)/);
    $lastS = $2 if ($line =~ /^Sbjct:\s+?(\d+)[\sgcat]*(\d+)/);
  }
  return ($firstQ, $firstS, $lastQ, $lastS);
}

But I'm not sure how to make it grab the appropriate ending values. As is, it grabs the last ones in the file.

Thanks

Re: Re: Re: Regex perplexity by eweaverp (Scribe) on Jun 30, 2003 at 22:30 UTC
Never mind; I'm dumb. I just add this line:

  # slice out the appropriate part of the string
  ($string) = $string =~ /(><a name = $id>.*?<\/pre>)/s;

to the beginning of the subroutine and remove the $id_flag weirdness, and it works great. Thanks for the example code, and the multi-loop approach. Apparently there _are_ some things a regex can't do!

Cheers all,
Evan
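
For anyone following along, here is a minimal sketch of that slice-then-scan idea. The sample data is made up: the real file layout is not shown in this thread, so the two records below only assume that each one starts with `<pre><a name = ID>`, ends with `</pre>`, and holds alternating Query/Sbjct lines. The name find_positions_sliced is hypothetical too.

```perl
use strict;
use warnings;

# Hypothetical input (an assumption, not the poster's real data):
# two records, each opened by an anchor tag and closed by </pre>.
my $sample = join "\n",
  '<pre><a name = 11>',
  'Query: 1 gcatgcat 8',
  'Sbjct: 101 gcatgcat 108',
  'Query: 9 ttgacc 14',
  'Sbjct: 109 ttgacc 114',
  '</pre>',
  '<pre><a name = 22>',
  'Query: 5 acgt 8',
  'Sbjct: 55 acgt 58',
  'Query: 9 ggcc 12',
  'Sbjct: 59 ggcc 62',
  '</pre>',
  '';

sub find_positions_sliced {
  my ($string, $id) = @_;

  # Keep only the record for this ID: from its anchor to the first </pre> after it
  my ($record) = $string =~ /(><a name = $id>.*?<\/pre>)/s;
  return unless defined $record;

  my ($firstQ, $firstS, $lastQ, $lastS);
  for my $line (split /\n/, $record) {
    if ($line =~ /^Query:\s+?(\d+)[\sgcat]*(\d+)/) {
      $firstQ = $1 unless defined $firstQ;   # first start position wins
      $lastQ  = $2;                          # last end position wins
    }
    if ($line =~ /^Sbjct:\s+?(\d+)[\sgcat]*(\d+)/) {
      $firstS = $1 unless defined $firstS;
      $lastS  = $2;
    }
  }
  return ($firstQ, $firstS, $lastQ, $lastS);
}

my @first_record = find_positions_sliced($sample, 11);
print "@first_record\n";   # prints: 1 101 14 114
```

Because the slice stops at the first `</pre>` after the anchor, the scan can no longer run into the following records, which is what made the end values come from the last record in the file.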
Re: Re: Re: Re: Regex perplexity by eweaverp (Scribe) on Jun 30, 2003 at 22:50 UTC
And... that made me realize that this:

sub FindPositions {
  my $string = $_[0];
  my $id = $_[1];

  # slice out the appropriate part of the string
  ($string) = $string =~ /(><a name = $id>.*?<\/pre>)/s;

  my @positions = $string =~ m/<a name = $id>.*?Query: (\d+).*?Sbjct: +(\d+).*?<\/pre>/s;
  push (@positions, $string =~ m/<a name = $id>.*Query: \d+\s+[a-z]+ ( +\d+)\n.*\nSbjct: \d+\s+[a-z]+ (\d+)\n<\/pre>/s);
  return @positions;
}

works too. First and last. Hmm. Anyway...
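
A related variant, offered only as a sketch: instead of one regex for the first pair and another for the last, a global match in list context can collect every start/end number in the record, and the first and last elements can then be picked out of the lists. This is a swapped-in technique, not the code from the post. The sample data and the name find_positions_global are assumptions, with the same guessed record layout (`<pre><a name = ID>` ... `</pre>`).

```perl
use strict;
use warnings;

# Hypothetical input (an assumption, not the poster's real data)
my $records = join "\n",
  '<pre><a name = 11>',
  'Query: 1 gcatgcat 8',
  'Sbjct: 101 gcatgcat 108',
  'Query: 9 ttgacc 14',
  'Sbjct: 109 ttgacc 114',
  '</pre>',
  '<pre><a name = 22>',
  'Query: 5 acgt 8',
  'Sbjct: 55 acgt 58',
  'Query: 9 ggcc 12',
  'Sbjct: 59 ggcc 62',
  '</pre>',
  '';

sub find_positions_global {
  my ($string, $id) = @_;

  # Isolate the record for this ID, as in the post above
  my ($record) = $string =~ /(><a name = $id>.*?<\/pre>)/s;
  return unless defined $record;

  # In list context, //g returns the captures of every match in order:
  # (start1, end1, start2, end2, ...). /m lets ^ match at each line start.
  # A literal space is used instead of \s so a match cannot cross a newline.
  my @q = $record =~ /^Query: +(\d+)[ gcat]*(\d+)/mg;
  my @s = $record =~ /^Sbjct: +(\d+)[ gcat]*(\d+)/mg;
  return unless @q && @s;

  # First start and last end of each series
  return ($q[0], $s[0], $q[-1], $s[-1]);
}

my @second_record = find_positions_global($records, 22);
print "@second_record\n";   # prints: 5 55 12 62
```

The return order matches the original subroutine: first Query, first Sbjct, last Query, last Sbjct.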