in reply to matching every occurrence of a regex

If all you wanted to do was grab the middle number on each line, this worked for me:
    while (<DATA>) {
        # ID, whitespace, the number we want (captured), whitespace, four letters
        if (m/\w{1,12}\s+(\d{1,5})\s+[a-zA-Z]{4}/) {
            my $site = $1;
            print "$site\n";
        }
    }
    __DATA__
    BC001593 91 NPSL
    BC001593 260 NASS
    BC001593 293 NAST

    # the output
    91
    260
    293

As for applying this regex until there are no more lines, I think it depends on where the protein strings are stored. Either way, if you can get the lines into an array, you could use a foreach loop to touch every one. HTH
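
For the array case, here is a rough sketch of what I mean. The names extract_sites and @lines are made up for illustration, and the sample lines are just the ones from my test data; how you actually fill the array depends on where your strings live.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the middle number from every line that matches.
sub extract_sites {
    my @lines = @_;
    my @sites;
    foreach my $line (@lines) {
        # Same pattern as before: ID, whitespace, number (captured), whitespace, four letters
        if ($line =~ m/\w{1,12}\s+(\d{1,5})\s+[a-zA-Z]{4}/) {
            push @sites, $1;
        }
    }
    return @sites;
}

# Assumed sample input; in practice @lines would be read from a file or elsewhere.
my @lines = ('BC001593 91 NPSL', 'BC001593 260 NASS', 'BC001593 293 NAST');
print "$_\n" for extract_sites(@lines);
```

Lines that don't match the pattern are simply skipped, so stray headers or blank lines in the array shouldn't hurt.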