PerlMonks  

Re^5: read/write delete duplicates/sort PROBLEM! - Use of uninitialized value in sprintf

by haukex (Archbishop)
on Oct 18, 2021 at 18:10 UTC (#11137695)


in reply to Re^4: read/write delete duplicates/sort PROBLEM! - Use of uninitialized value in sprintf
in thread read/write delete duplicates/sort PROBLEM! - Use of uninitialized value in sprintf

That's a pretty strange format. It might be easier to work with if you do it in two steps, first matching everything that isn't a > with [^>]+ and then cleaning up the value:

use warnings;
use strict;
use Data::Dump qw/dd pp/;

while ( my $tag = <DATA> ) {
    chomp($tag);
    next unless $tag =~ /\S/;    # skip blank lines
    if ( my ($id) = $tag =~ /<endnote id=([^>]+)>/ ) {
        $id =~ s/\W+//g;         # strip everything that isn't a word character
        print pp($tag), " -> ", pp($id), "\n";
    }
    else {
        warn "Couldn't match " . pp($tag);
    }
}

__DATA__
<endnote id=(1)>Text...</endnote>
<endnote id=(2)>Text...</endnote>
<endnote id=1)>Text...</endnote>
<endnote id=2)>Text...</endnote>
<endnote id=1.>Text...</endnote>
<endnote id=2.>Text...</endnote>
<endnote id=1a>Text...</endnote>
<endnote id=2cb>Text...</endnote>
<endnote id=a.1>Text...</endnote>
<endnote id=a.2>Text...</endnote>

Note that your examples aren't very consistent: your regex so far only matches digits followed by [[:alpha:]], so it's unclear what you expect for id=a.1. You'll have to provide some representative sample input along with the expected output for that input if you want answers that actually address your problem fully.
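One way to pin down "sample input along with the expected output" is a small table-driven test. The following is only a sketch: the helper name endnote_id is made up for illustration, and the expected values are simply what the two-step approach above produces, not necessarily what you want. In particular, the 'a1' expected for id=a.1 is a guess that you would need to confirm or change.

```perl
use warnings;
use strict;
use Test::More;

# Hypothetical helper wrapping the two-step extraction shown above.
sub endnote_id {
    my ($tag) = @_;
    if ( my ($id) = $tag =~ /<endnote id=([^>]+)>/ ) {
        $id =~ s/\W+//g;    # strip everything that isn't a word character
        return $id;
    }
    return undef;           # no endnote tag found
}

# Each entry pairs one input line with the ID expected for it.
# These expectations are assumptions; adjust them to your real requirements.
my @cases = (
    [ '<endnote id=(1)>Text...</endnote>', '1'   ],
    [ '<endnote id=2)>Text...</endnote>',  '2'   ],
    [ '<endnote id=1.>Text...</endnote>',  '1'   ],
    [ '<endnote id=2cb>Text...</endnote>', '2cb' ],
    [ '<endnote id=a.1>Text...</endnote>', 'a1'  ],
);

for my $case (@cases) {
    my ( $input, $expected ) = @$case;
    is( endnote_id($input), $expected, "ID for $input" );
}

done_testing;
```

Once the table states what you actually expect, any failing row shows exactly which input the current regex handles differently.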

Btw, you'll probably want to have a look at perlretut and perlrequick.


Replies are listed 'Best First'.
Re^6: read/write delete duplicates/sort PROBLEM! - Use of uninitialized value in sprintf
by afoken (Canon) on Oct 18, 2021 at 18:27 UTC
    That's a pretty strange format.

    With quotes around the ID attribute, it would be at least a valid XML fragment, and in that case, using a proper parser would be recommended. Without the quotes, it looks more like tagsoup HTML.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
