http://qs1969.pair.com?node_id=11137695


in reply to Re^4: read/write delete duplicates/sort PROBLEM! - Use of uninitialized value in sprintf
in thread read/write delete duplicates/sort PROBLEM! - Use of uninitialized value in sprintf

That's a pretty strange format. It might be easier to work with if you do it in two steps, first matching everything that isn't a > with [^>]+ and then cleaning up the value:

use warnings; use strict; use Data::Dump qw/dd pp/; while ( my $tag = <DATA> ) { chomp($tag); next unless $tag =~ /\S/; # skip blank lines if ( my ($id) = $tag =~ /<endnote id=([^>]+)>/ ) { $id =~ s/\W+//g; print pp($tag)," -> ",pp($id),"\n"; } else { warn "Couldn't match ".pp($_) } } __DATA__ <endnote id=(1)>Text...</endnote> <endnote id=(2)>Text...</endnote> <endnote id=1)>Text...</endnote> <endnote id=2)>Text...</endnote> <endnote id=1.>Text...</endnote> <endnote id=2.>Text...</endnote> <endnote id=1a>Text...</endnote> <endnote id=2cb>Text...</endnote> <endnote id=a.1>Text...</endnote> <endnote id=a.2>Text...</endnote>

Note your examples aren't very consistent: Your regex so far only matches digits followed by [[:alpha:]], so it's unclear what you expect for id=a.1. You'll have to provide some representative sample input along with the expected output for that input if you want answers that actually adress your problem fully.

Btw, you'll probably want to have a look at perlretut and perlrequick.