Re: Character Text Delimiters

Others have noted that this is an inherently fragile data format (an example, I think, of the semipredicate problem). To see how fragile, swap two records in the test data below, or change 'AE(foo)' in record AD to '(fubar),AE(foo)': either change breaks the extraction. That said, here is one possible approach, which assumes the record tags appear in strictly ascending order (AA, AB, AC, ...):

>perl -wMstrict -le
"my $s = 'AA(Acme Widgets. 123 Coyote St. AZ(Ariz.),USA) ,'
       . 'AB(Your Name. 99 Some St. HI(Hawaii), USA), '
       . 'AC(Dep deAstro. Uni de Val. C/Dr. M 50, 461 Bur (Val), Sp),'
       . 'AD(AE(foo), approaching breaking point AD(bar)) , '
       . 'AE(optional trailing comma, spaces on last record)'
       ;
 ;;
 my $tag  = 'AA';
 my $stop = 'ZZ';
 ;;
 EXTRACT:
 for (++(my $after = $tag);  $tag le $stop;  ++$tag, ++$after) {
   my $pre  = qr{ \G $tag [(] }xms;
   my $post = qr{ [)] (?: \s* , \s* (?= $after) | \s* ,? \s* \z) }xms;
   ;;
   last EXTRACT unless $s =~ m{ $pre (.*?) $post }xmsg;
   my $extract = $1;
   print qq{'$tag': [[$extract]]};
   }
"
'AA': [[Acme Widgets. 123 Coyote St. AZ(Ariz.),USA]]
'AB': [[Your Name. 99 Some St. HI(Hawaii), USA]]
'AC': [[Dep deAstro. Uni de Val. C/Dr. M 50, 461 Bur (Val), Sp]]
'AD': [[AE(foo), approaching breaking point AD(bar)]]
'AE': [[optional trailing comma, spaces on last record]]
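For anyone who wants to poke at the failure modes from a script file rather than the command line, here is a rough sketch of the same idea wrapped in a sub. The name extract_records and the shortened test strings are my own inventions for illustration; the two regexes are the ones from the one-liner above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original one-liner: returns a list
# of [tag, body] pairs, walking tags from $tag up to $stop via Perl's
# magic string increment ('AA' -> 'AB' -> ...).
sub extract_records {
    my ($s, $tag, $stop) = @_;
    my @records;

    for (++(my $after = $tag); $tag le $stop; ++$tag, ++$after) {
        # A record must start exactly where the previous match ended (\G).
        my $pre  = qr{ \G $tag [(] }xms;
        # It ends at a ')' followed by a comma and the next tag,
        # or by an optional comma and the end of the string.
        my $post = qr{ [)] (?: \s* , \s* (?= $after) | \s* ,? \s* \z) }xms;

        last unless $s =~ m{ $pre (.*?) $post }xmsg;
        push @records, [ $tag, $1 ];
    }
    return @records;
}

# Made-up sample data: the first string is well-behaved, the second plants
# a ')' + comma + next-tag sequence inside record AB.
my $ok_data  = 'AA(one (x), two) , AB(AC(foo), three AB(bar)),AC(last) , ';
my $bad_data = 'AA(one (x), two) , AB((fubar),AC(foo), three AB(bar)),AC(last) , ';

for my $data ($ok_data, $bad_data) {
    print "input: $data\n";
    printf "  '%s': [[%s]]\n", @$_ for extract_records($data, 'AA', 'ZZ');
}
```

With the second string, record AB comes back as just '(fubar', because the first ')' that is followed by a comma and the next tag is taken as the end of the record. Swapping two records is worse: the \G anchor fails on the first tag and nothing is extracted at all.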

Update: Enhanced discussion, improved 'robustness' of extraction (for some definition of robust), added stress-test data records to example data.
