
Then there should be a way to represent a web page content as we see it rather than as HTML.
You mean as a screenshot? That won't make your task easier, not at all.
There should be some possibility to do this in Perl.

Perl is Turing complete, so sure, you can do $this in Perl. But sometimes Perl is not the easiest way to do $this.

For example I do this with Quick Test Pro.

What is "this"? I didn't know what "Quick Test Pro" is, so I looked it up. It seems to be some kind of testing framework. Well, we have such things in Perl too, like Test::More and many other test modules. But I'm a bit confused, because I don't see how that relates to your original question.

Maybe you should just explain what you want to achieve in the end, not ask for steps on the way that you think are necessary to achieve your goal. See XY Problem.

Re^4: Parsing HTML question
by vit (Friar) on Jun 23, 2008 at 22:59 UTC
    Sorry I confused you with QTP. Yes, I use it for testing, but it has a wonderful feature that converts a web page to a string. This string is actually a screenshot. I can split this string into words, try to match words, use regular expressions, etc.
    Now, my purpose is not testing. What I need, again, is meaningful WORDS. If by some magic I can do this conversion to a string in Perl, I will be happy. Roughly, the result should be the same as if I:
    Open web page
    CTRL A
    CTRL C
    Paste clipboard content to xxxx.txt file
    The rest is details. So can we do this job in Perl?
      OK, so you need to convert an HTML page into plain text. The module I recommended earlier should do just that.
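
      For illustration, here is a minimal sketch of one way to get the visible text and the words out of an HTML string. It assumes the CPAN module HTML::Parser is installed, which is not necessarily the module recommended earlier in the thread. The helper name html_to_text and the sample markup are made up for this example, and the result only approximates what Ctrl+A / Ctrl+C gives you in a browser.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: return the visible text of an HTML string.
# Assumption: HTML::Parser (from CPAN) is available.
sub html_to_text {
    my ($html) = @_;
    my @chunks;

    my $parser = HTML::Parser->new(
        api_version => 3,
        # 'dtext' passes text with entities such as &amp; already decoded
        text_h => [ sub { push @chunks, $_[0] }, 'dtext' ],
    );
    $parser->unbroken_text(1);                    # do not split a text run
    $parser->ignore_elements(qw(script style));   # skip non-visible content
    $parser->parse($html);
    $parser->eof;

    # Simplification: every text chunk is separated by one space, so
    # inline markup inside a word ("w<b>or</b>d") would be split apart.
    my $text = join ' ', @chunks;
    $text =~ s/\s+/ /g;          # collapse runs of whitespace
    $text =~ s/^\s+|\s+$//g;     # trim both ends
    return $text;
}

my $html = <<'HTML';
<html><head><style>p { color: red }</style></head>
<body><h1>Hello</h1><p>Fish &amp; chips</p><script>var x = 1;</script></body></html>
HTML

my $text  = html_to_text($html);
my @words = grep { /\w/ } split ' ', $text;   # keep tokens with a word char

print "$text\n";
print join(', ', @words), "\n";
```

      To run this against a real page you would first fetch the HTML (for example with LWP::Simple's get) and pass the result to the helper. From there the words can be matched with regular expressions as usual.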