saran_techie has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
Using the browser's View Source option, we can see the names, IDs, and classes of the controls on a page. For example, from the code
"<input type="text" name="message" id="talkbox" size="15" maxlength="255" />" we can easily read off the details of the control.
Is there another way to get the control names, IDs, or classes without using View Source?
Can a Perl script fetch the same code from the server, even before the web page has completely loaded? Kindly assist...
Regards,
Saravanan.S

Re: How to get the control names or their IDs or their classes without viewing the source code
by Corion (Patriarch) on Feb 17, 2009 at 13:17 UTC

    There is no connection between control IDs in a browser and HTML elements.

    The usual approach to automating websites is through WWW::Mechanize or Win32::IE::Mechanize. If you need control IDs, you cannot correlate them with HTML elements.

    Also have a look at Selenium.

    Maybe it would help if you explained the larger picture of what problem you are trying to solve.

      I'm as confused as Corion. (Note to Corion: I think the "control IDs" the OP refers to are HTML id attributes, plus possibly class attributes.)

      If you want to retrieve a web page from a server programmatically, using Perl instead of a web browser, you can use LWP::UserAgent. It lets your Perl code retrieve HTML pages even when there are cookie and/or authentication requirements. You then have the HTML source of the page and can parse it with HTML::Tree or a similar HTML parsing module to get the ids and classes. Note that even if the goal is simply to list the ids/classes/elements in a page, you do NOT want to start down the path of using regular expressions to parse the HTML.

      But most importantly of all, you have to specify what you are trying to do (and not just "get the IDs and classes"). If you tell us what real-world problem you're trying to solve, you will get much more specific recommendations.
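
The LWP::UserAgent plus HTML::Tree approach described above could look roughly like the sketch below. It is only an illustration, not a tested solution for any particular site: the helper name list_controls, the set of tags treated as "controls", and the 10-second timeout are all assumptions made for the example, and the URL is whatever you pass on the command line. It uses HTML::TreeBuilder (part of the HTML::Tree distribution) to walk the parsed document instead of regular expressions.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::TreeBuilder;    # ships with the HTML::Tree distribution

# Hypothetical helper: return one hashref per form control in an HTML string.
# Which tags count as "controls" is an assumption made for this sketch.
sub list_controls {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @controls;
    for my $el ( $tree->look_down( _tag => qr/^(?:input|select|textarea|button)$/ ) ) {
        push @controls, {
            tag   => $el->tag,
            name  => $el->attr('name'),
            id    => $el->attr('id'),
            class => $el->attr('class'),
        };
    }
    $tree->delete;    # free the parse tree
    return @controls;
}

# Fetch and report only when a URL is supplied on the command line.
if ( my $url = shift @ARGV ) {
    my $ua  = LWP::UserAgent->new( timeout => 10 );
    my $res = $ua->get($url);
    die "Fetch failed: ", $res->status_line, "\n" unless $res->is_success;

    for my $c ( list_controls( $res->decoded_content ) ) {
        # Attributes that are absent come back undef; print '-' for those.
        printf "%-8s name=%s id=%s class=%s\n",
            $c->{tag}, map { $c->{$_} // '-' } qw(name id class);
    }
}
```

Run it with a page address as the only argument. Keep in mind that this sees only the HTML as the server sends it; any controls that the page creates later with JavaScript will not show up.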