in reply to Generalizing Regex with Multiple Match
# 'g' doesn't seem to work
That doesn't work because there is only one pair of curly braces per line, so a pattern that includes the braces can match only once. Also, m//g in scalar context matches once and leaves the position for the next match at the end of that match (see pos).
    $_ = ">r7.1 |SOURCES={GI=162960844,bw,0-4;GI=162960844,bw,9025576-9025608}|";
    $_ =~ /GI=(\d+),(\w+),(\d+\-\d+)/g;
    print "$_\n";
    print "-" x pos(),"^\n";
    print pos(),"\n";
    __END__
    >r7.1 |SOURCES={GI=162960844,bw,0-4;GI=162960844,bw,9025576-9025608}|
    -----------------------------------^
    35
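To make the context difference concrete, here is a small sketch (the variable names `$str`, `@ends` and `@captures` are mine, not from the original question). In scalar context each evaluation of m//g finds one match and resumes from pos(); in list context all captures of all matches come back at once as a flat list.

```perl
use strict;
use warnings;

my $str = '>r7.1 |SOURCES={GI=162960844,bw,0-4;GI=162960844,bw,9025576-9025608}|';

# Scalar context: the while condition matches once per iteration,
# and each new attempt starts where the previous match ended.
my @ends;
while ($str =~ /GI=(\d+),(\w+),(\d+-\d+)/g) {
    print "scalar: $1 $2 $3 (pos is now ", pos($str), ")\n";
    push @ends, pos($str);
}

# List context: every capture group of every match, flattened into one list.
my @captures = $str =~ /GI=(\d+),(\w+),(\d+-\d+)/g;
print "list: @captures\n";
```

The while loop is usually the more convenient form when the captures belong together in groups, since `$1`, `$2` and `$3` are available per match.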
If you want to match the stuff inside the curlies and then build your structure from multiple matches, you need two passes - first isolate what's inside the curlies, then match with m//g:
    use Data::Dumper;
    my %all_entry;
    while (<DATA>) {
        chomp;
        next unless (/^>/);
        my ($line) = />.*\{((?:GI=\d+,\w+,\d+\-\d+;?)+)\}/;
        push @{ $all_entry{$1}{$2} }, $3 while $line =~ /GI=(\d+),(\w+),(\d+\-\d+)/g;
    }
    print Dumper \%all_entry;
    __DATA__
    >r7.1 |SOURCES={GI=162960844,bw,0-4;GI=162960844,bw,9025576-9025608}|
    >r6.1 |SOURCES={GI=152989753,bw,0-30;GI=152989753,bw,1877925-1877931}|
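If some header lines might lack a well-formed curly-brace block, the two passes can be wrapped in a helper that returns early in that case. This is only a sketch under that assumption; `parse_sources` is a hypothetical name, and it returns a per-line structure rather than filling the shared `%all_entry`.

```perl
use strict;
use warnings;

# Hypothetical helper: returns { GI => { tag => [ranges...] } } for one line.
sub parse_sources {
    my ($record) = @_;
    my %entry;

    # Pass 1: isolate the text between the curlies; give up if it is absent.
    my ($inside) = $record =~ /\{((?:GI=\d+,\w+,\d+-\d+;?)+)\}/
        or return \%entry;

    # Pass 2: walk every GI triple inside that text with m//g.
    while ($inside =~ /GI=(\d+),(\w+),(\d+-\d+)/g) {
        push @{ $entry{$1}{$2} }, $3;
    }
    return \%entry;
}

my $e = parse_sources('>r7.1 |SOURCES={GI=162960844,bw,0-4;GI=162960844,bw,9025576-9025608}|');
print join(", ", @{ $e->{162960844}{bw} }), "\n";
```

The early return avoids matching against an undefined value when the first pattern fails.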