PerlMonks  

Convert backslash to slash using XML parsing

by j.goor (Acolyte)
on Nov 12, 2004 at 07:03 UTC ( [id://407286]=perlquestion )

j.goor has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
Problem: I use Firefox 1.0 (wheee!!!), and I want to get to my online O'Reilly books via http:\\ (not via file:\\\\).
But the very odd thing is: O'Reilly apparently did not read the W3C recommendations about 'how to construct a hyperlink'.
*All* hyperlinks (and other links as well) are in backslash notation, as if they were directory paths.

I want to write a script that:
1) slurps one page into an XML parser
2) converts the '\'s to '/'s
3) spits the result out to another place.

I do not have much experience with XML, but this seemed a nice way to fool around with it.

My questions are:
a) Do you recognize my problem regarding O'Reilly e-books?
b) Could you provide me with a working code snippet? (This should be easy for an experienced Perl/XML programmer.)
c) If XML parsing is overkill, what's the best RE I can use?

P.S. Internet Explorer works just fine, but then again: who really wants to use *that*?? ;-)

Thanks in advance!

Replies are listed 'Best First'.
Re: Convert backslash to slash using XML parsing
by Taulmarill (Deacon) on Nov 12, 2004 at 08:47 UTC
    If you only want to translate '\' to '/', why don't you just use tr!\\!/!;?

      Why would you assume that it is safe to convert every backslash in the entire Perl CD Bookshelf to a forward slash? This would blindly ruin every example in the suite of CD books! That's a case of throwing out the baby with the bathwater. Could you imagine reading the Camel book where someone has gone through it and changed every single backslash to a forward slash? The first hello world script would look like this:

      #!/usr/bin/perl -w
      print "Hello world!/n";

      ...and for the record, "/n" is not the same thing as "\n".

      No, the OP really does probably need a token parser like HTML::TokeParser, and a routine a little smarter than blind transliteration.
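To illustrate what "a little smarter than blind transliteration" could look like, here is a minimal sketch in core Perl only. It is not the HTML::TokeParser approach recommended above, which would be more robust. It is a regex stand-in that assumes the links live in quoted href or src attributes. The names fix_link_slashes and to_forward are hypothetical helpers invented for this example, and the sample page is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: flip every backslash in one string to a forward slash.
sub to_forward {
    my ($value) = @_;
    $value =~ tr{\\}{/};
    return $value;
}

# Hypothetical helper: rewrite backslashes only inside quoted href/src
# attribute values, so code samples such as "\n" in the page text survive.
# Assumption: attribute values are quoted and contain no embedded quotes.
sub fix_link_slashes {
    my ($html) = @_;
    $html =~ s{
        (\b(?:href|src)\s*=\s*)   # attribute name and equals sign
        (["'])                    # opening quote, single or double
        (.*?)                     # attribute value, as short as possible
        \2                        # the matching closing quote
    }{ $1 . $2 . to_forward($3) . $2 }gexis;
    return $html;
}

# Made-up sample page: one backslashed link, one code sample.
my $page = <<'HTML';
<a href="ch01\index.htm">Chapter 1</a>
<pre>print "Hello world!\n";</pre>
HTML

print fix_link_slashes($page);
```

On the sample above, only the href value changes; the "\n" inside the pre block is left alone. A code listing that itself contains something like href="..." would still be rewritten, which is one reason a real tokenizer is the safer choice.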

      One thing about the OP's post does bother me though. Does this conversion of file:\\\ to http:// mean that he's going to be making available ONLINE the entire Perl CD Bookshelf, in violation of O'Reilly's copyright?


      Dave

      There must be more to it than that; presumably the books contain other, non-URL backslashes also.

      --
      Snazzy tagline here
        Yeah, OK, that may be. But why XML?!?
        If I wanted to be clean, I would use HTML::Parser.
Re: Convert backslash to slash using XML parsing
by iburrell (Chaplain) on Nov 12, 2004 at 18:11 UTC
    What online O'Reilly books are you talking about? The ones at safari.oreilly.com use normal slashes in the URLs.
