Your code has a couple of gotchas in it, even in the fixed version. If the <BIB/>s contain record separators (normally newlines), your matches will fail, twice over. First, reading the file line by line splits such a record across two passes of your while(<$INPUT_REF_FH>){} loop. Second, . does not match newlines in regular expressions by default; add the s flag to your regex so that it does. Also, .* quite happily matches nothing, so use .+ unless you really want empty <BIB/>s. The x flag is meaningless in your regex. I know it is sometimes recommended as a default, but to me it's distracting noise, akin to someone wasting your time with the code equivalent of "Made you look."
This is a little idiomatic, but it addresses the issues:
    use strictures;
    use open qw( :std :utf8 );

    my $corpus = do { local $/; <DATA> }; # Slurp the whole input at once.

    my @bibs;
    push @bibs, $corpus =~ m{<BIB>(.+?)</BIB>}sg;

    s/[^\S ]+/ /g for @bibs; # Normalize whitespace.

    if ( @bibs )
    {
        print "Found...\n";
        print "\t* $_\n" for @bibs;
    }
    else
    {
        print "No love.\n";
    }

    __DATA__
    In fact, <BIB>Falco (2012)</BIB> today Louise is hardly isolated. More than 5 million babies have been born using the procedure, which has become almost routine. And at the age of 28, Louise became a mother herself, giving birth to a baby boy name Cameron—conceived, by the way, in the old-fashioned way (<BIB>Falco, 2012</BIB>; <BIB>ICMRT, 2012</BIB>).
    Found...
        * Falco (2012)
        * Falco, 2012
        * ICMRT, 2012
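If it helps to see the gotchas side by side, here is a minimal sketch. The sample string and the extract() helper are made up for illustration (they are not from your script), and split /^/ only stands in for a line-by-line while(<$fh>){} read.

```perl
use strict;
use warnings;

# Hypothetical sample: the second <BIB> spans a line break, the third is empty.
my $text = "See <BIB>Falco (2012)</BIB> and <BIB>ICMRT,\n2012</BIB> and <BIB></BIB>.";

# Hypothetical helper: return every capture of the pattern found in the string.
sub extract {
    my ( $re, $str ) = @_;
    return $str =~ /$re/g;
}

# Without /s, "." stops at the newline, so the two-line record is missed.
my @no_s = extract( qr{<BIB>(.+?)</BIB>}, $text );

# With /s, "." also matches newlines, so both non-empty records are found.
my @with_s = extract( qr{<BIB>(.+?)</BIB>}s, $text );

# ".*?" also accepts the empty <BIB></BIB>; ".+?" needs at least one character.
my @star = extract( qr{<BIB>(.*?)</BIB>}s, $text );

# Matching one line at a time loses the split record, even with /s.
my @by_line = map { extract( qr{<BIB>(.+?)</BIB>}s, $_ ) } split /^/, $text;

printf "no /s: %d, /s: %d, .*?: %d, line by line: %d\n",
    scalar @no_s, scalar @with_s, scalar @star, scalar @by_line;
```

The slurped input together with /s is the only combination here that picks up the record containing the newline, which is why the code above does both.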
In reply to Re: Extract Data between Tags
by Your Mother
in thread Extract Data between Tags
by ppremkumar