PerlMonks  

Re^5: read/write delete duplicates/sort PROBLEM! - Use of uninitialized value in sprintf

by bliako (Monsignor)
on Oct 18, 2021 at 18:45 UTC


in reply to Re^4: read/write delete duplicates/sort PROBLEM! - Use of uninitialized value in sprintf
in thread read/write delete duplicates/sort PROBLEM! - Use of uninitialized value in sprintf

I suspect you want to escape the outer brackets because you want them to match literal parentheses in your input rather than denote a capture group in the regex. For the same reason you will have to escape the dot (.) if you need its literal value (escapes: \( \) \.). So something like:

if ( $tag =~ m/<endnote id=\(?(\d*)([[:alpha:].]*)\)?>/ ) {

(note that the dot needs no escaping inside []). But that regex will fail on your last two cases, e.g. <endnote id=a.1>Text...</endnote>, because it only allows digits before the letters and dots, not after them. So why not something like:

if ( $tag =~ m/<endnote id=\(?([0-9[:alpha:].]*)\)?>/ )
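To see the difference between the two patterns above, here is a minimal sketch that runs both against a few tags. Only the a.1 form is quoted from your post; the other sample tags and the helper names id_split and id_single are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample tags; only the "a.1" form is quoted in the post above.
my @tags = (
    '<endnote id=(1)>',
    '<endnote id=12a>',
    '<endnote id=a.1>',
);

# First pattern: digits, then letters/dots, with optional literal parentheses.
my $re_split  = qr/<endnote id=\(?(\d*)([[:alpha:].]*)\)?>/;
# Second pattern: a single class mixing digits, letters and dots in any order.
my $re_single = qr/<endnote id=\(?([0-9[:alpha:].]*)\)?>/;

# Each helper returns the captured id, or undef when the tag does not match.
sub id_split  { my ($tag) = @_; return $tag =~ $re_split  ? "$1$2" : undef }
sub id_single { my ($tag) = @_; return $tag =~ $re_single ? $1     : undef }

for my $tag (@tags) {
    my $first  = id_split($tag);
    my $second = id_single($tag);
    printf "%-18s first=%-7s second=%s\n",
        $tag,
        defined $first  ? $first  : '(none)',
        defined $second ? $second : '(none)';
}
```

Testing the return value with defined before it reaches printf is also one way to avoid an "uninitialized value" warning when a tag does not match.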

But if I were you I would split the code into two subs: 1) one to clean the input and remove unwanted characters, and 2) one to parse only properly formatted input. If your input is badly formed XML, then perhaps invest in doing a proper (1) and let a proper XML parser do (2). It depends on your use case and how complex it may become in the future. To paraphrase a great writer: all good data are alike; each bad data set is bad in its own way.
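A rough sketch of that two-sub split, under assumptions: the cleaning rules (drop parentheses and stray whitespace around the id) and the names clean_tag and parse_id are hypothetical, so adjust them to whatever your real input needs.

```perl
use strict;
use warnings;

# Step 1 (hypothetical rules): strip parentheses and stray whitespace
# around the id so that step 2 only ever sees one canonical form.
sub clean_tag {
    my ($tag) = @_;
    $tag =~ s/<endnote\s+id\s*=\s*\(?\s*([^()\s>]*)\s*\)?\s*>/<endnote id=$1>/;
    return $tag;
}

# Step 2: accept only the canonical form; return the id, or undef otherwise.
sub parse_id {
    my ($tag) = @_;
    return $tag =~ m/\A<endnote id=([0-9[:alpha:].]+)>/ ? $1 : undef;
}

my @raw = (
    '<endnote id=( 12a )>Text...</endnote>',
    '<endnote id=a.1>Text...</endnote>',
    '<endnote id=>',    # malformed on purpose: empty id
);

for my $raw (@raw) {
    my $id = parse_id( clean_tag($raw) );
    if ( defined $id ) {
        printf "id=%s\n", $id;
    }
    else {
        warn "skipping malformed tag: $raw\n";
    }
}
```

Keeping the parser strict means that every new kind of bad input is handled in clean_tag (or rejected with a warning) instead of making the parsing regex more and more permissive.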

bw, bliako
