Re^2: how to strip XML into Plain Text file

by Fletch (Bishop)
on Jan 26, 2005 at 01:00 UTC

in reply to Re: how to strip XML into Plain Text file
in thread how to strip XML into Plain Text file

... <img alt="Next >>" src="../next_button.jpg" />*Boom*

And this is why you use a real parser, not just a regex . . .

Update: Just to clarify, the above is a pathological case. If you're reasonably sure it won't occur, go ahead and use the simple s///; but be aware that it's not bulletproof, and know where to find the right tool when the sledgehammer doesn't cut it any more.
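
To make the *Boom* concrete, here is a minimal sketch of what a tag-stripping substitution does to that element. The regex `s/<[^>]*>//g` is an assumption about what "the simple s///" looks like (the parent node's exact code isn't quoted here), and `strip_tags`, the `<testing>` wrapper and the `Hello` text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: naive tag stripping with a single substitution.
# The pattern is an assumed stand-in for "the simple s///".
sub strip_tags {
    my ($xml) = @_;
    # Remove everything from a '<' up to the NEXT '>', wherever that '>' is.
    (my $text = $xml) =~ s/<[^>]*>//g;
    return $text;
}

# The pathological element from above, wrapped in an illustrative document.
my $xml = '<testing><img alt="Next >>" src="../next_button.jpg" />Hello</testing>';

# The first '>' inside the alt attribute ends the match early, so the
# rest of the tag leaks into the "plain text".
print strip_tags($xml), "\n";
# prints: >" src="../next_button.jpg" />Hello
```

The regex has no notion of quoted attribute values, so it stops at the first `>` it sees. A real XML parser (for example one of the CPAN modules such as XML::LibXML or XML::Twig) tokenizes the attribute properly and would hand back only the text content.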

Replies are listed 'Best First'.
Re^3: how to strip XML into Plain Text file
by BUU (Prior) on Jan 26, 2005 at 07:55 UTC
    Since we're being pedantic about it, is '>' actually allowed inside attribute values in XML?

      Yes. Only < is not.

      Makeshifts last the longest.

      xmllint doesn't gripe about it:

      freebie:~ 677> cat foo.xml                         + 9:34:27
      <?xml version="1.0" encoding="utf8" ?>
      <testing>
      <img alt="Next >>" src="../next_button.jpg" />
      </testing>
      freebie:~ 678> xmllint --noout foo.xml             + 9:34:29
      freebie:~ 679>                                     + 9:34:35
