oldwarrior32 has asked for the wisdom of the Perl Monks concerning the following question:
Hello Monks. I need some advice about parsing HTML with HTML::TreeBuilder.
I have some HTML, and I need some info that sits inside a table tag. There are, let's say, 20 table tags, but the info I need is in table 15.
How do I know that the required info is in table 15? Well, I search for the info in Notepad with Ctrl+F and then count the table tags from the beginning.
As you can see, the process is very tedious.
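The counting step described above could probably be scripted. Here is a minimal sketch, assuming a small made-up page in `$html` as a stand-in for the real one and "Device Name" as the text being searched for (both are placeholders). It uses `look_down`, `as_text` and `address` from HTML::Element to number every table and flag the one that contains the text:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical stand-in for the real page: three tables, the wanted one is the 2nd.
my $html = <<'HTML';
<html><body>
<table><tr><td>first</td></tr></table>
<table><tr><td>Device Name</td></tr></table>
<table><tr><td>third</td></tr></table>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# In list context, look_down returns every matching element in document order.
my @tables = $tree->look_down(_tag => 'table');

my @report;
for my $i (0 .. $#tables) {
    # Flag the table whose text contains the (assumed) search string.
    my $mark = $tables[$i]->as_text =~ /Device Name/
        ? ' <-- contains the search text'
        : '';
    push @report,
        sprintf 'table %d at @%s%s', $i + 1, $tables[$i]->address, $mark;
}
print "$_\n" for @report;

$tree->delete;
```

Nested tables are counted too, since `look_down` walks the whole tree, so the numbering matches what you would get by counting opening table tags by hand.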
HTML::TreeBuilder inherits a method from HTML::Element that dumps the parsed HTML. The dump looks like this:
    <td class="cuesTableFilterAreaTd"> @0.1.9.0.0.0.0.4.1.9.3
      <select id="searchField9" name="searchField9" onchange="getUtilityListValues(this, "PhoneFindListForm", updateUtilityList)" size="1"> @0.1.9.0.0.0.0.4.1.9.3.0
        <option selected value="device.name"> @0.1.9.0.0.0.0.4.1.9.3.0.0
          "Device Name"
        <option value="device.description"> @0.1.9.0.0.0.0.4.1.9.3.0.1
          "Description"
        <option value="numplan.dnorpattern"> @0.1.9.0.0.0.0.4.1.9.3.0.2
          "Directory Number"
        <option value="callingsearchspace.name"> @0.1.9.0.0.0.0.4.1.9.3.0.3
          "Calling Search Space"
        <option value="devicepool.name"> @0.1.9.0.0.0.0.4.1.9.3.0.4
          "Device Pool"
        <option value="TypeProduct.name"> @0.1.9.0.0.0.0.4.1.9.3.0.5
          "Device Type"
        <option value="pickupgroup.name"> @0.1.9.0.0.0.0.4.1.9.3.0.6
          "Call Pickup Group"
        <option value="TypeCertificateStatus.name"> @0.1.9.0.0.0.0.4.1.9.3.0.7
          "LSC Status"
        <option value="device.authenticationString"> @0.1.9.0.0.0.0.4.1.9.3.0.8
          "Authentication String"
        <option value="TypeDeviceProtocol.name"> @0.1.9.0.0.0.0.4.1.9.3.0.9
          "Device Protocol"
        <option value="securityprofile.name"> @0.1.9.0.0.0.0.4.1.9.3.0.10
          "Security Profile"
        <option value="commondeviceconfig.name"> @0.1.9.0.0.0.0.4.1.9.3.0.11
          "Common Device Configuration"
    <td class="cuesTableFilterAreaTd"> @0.1.9.0.0.0.0.4.1.9.4
      <select id="searchLimit9" name="searchLimit9" size="1"> @0.1.9.0.0.0.0.4.1.9.4.0
        <option selected value="beginsWith"> @0.1.9.0.0.0.0.4.1.9.4.0.0
          "begins with"
        <option value="contains"> @0.1.9.0.0.0.0.4.1.9.4.0.1
          "contains"
        <option value="endsWith"> @0.1.9.0.0.0.0.4.1.9.4.0.2
          "ends with"
        <option value="isExactly"> @0.1.9.0.0.0.0.4.1.9.4.0.3
          "is exactly"
        <option value="isEmpty"> @0.1.9.0.0.0.0.4.1.9.4.0.4
          "is empty"
        <option value="isNotEmpty"> @0.1.9.0.0.0.0.4.1.9.4.0.5
          "is not empty"
    <td class="cuesTableFilterAreaTd"> @0.1.9.0.0.0.0.4.1.9.5
      <input id="searchString9" name="searchString9" onkeypress="javascript:onEnterKey(event)" type="text" value="" /> @0.1.9.0.0.0.0.4.1.9.5.0
    <td class="cuesTableFilterAreaTd"> @0.1.9.0.0.0.0.4.1.9.6
    <td class="cuesTableFilterAreaTd"> @0.1.9.0.0.0.0.4.1.9.7
    <td class="cuesTableFilterAreaTd"> @0.1.9.0.0.0.0.4.1.9.8
    <td class="cuesTableFilterAreaTd"> @0.1.9.0.0.0.0.4.1.9.9
    <td class="cuesTableFilterAreaTd"> @0.1.9.0.0.0.0.4.1.9.10
You can see that each tag in the dump is followed by something like "@0.1.9.0.0.0.0.4.1.9.10". This tells you the position of that element in the tree.
So my question is: do you know a way to tell the module "hey, I want to work from @0.1.9.0.0.0.0.4.1.9.10", or another way to make the process I described simpler?
Thanks very much!
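For what it's worth, HTML::Element has an `address` method that works in both directions: called with no arguments it returns the dotted position that the dump prints after "@", and called with such a string it returns the node at that position. A minimal sketch, assuming a tiny made-up page in `$html` (the real page and its real addresses will differ):

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical stand-in for the real page.
my $html = <<'HTML';
<html><body>
<table><tr><td>skip me</td></tr></table>
<table><tr><td class="cuesTableFilterAreaTd">wanted cell</td></tr></table>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# In scalar context, look_down returns the first element matching the criteria.
my $cell = $tree->look_down(_tag => 'td', class => 'cuesTableFilterAreaTd');

# With no argument, address returns the dotted position shown after "@" in a dump.
my $addr = $cell->address;

# With an argument, address returns the node at that position, or undef if none.
my $same = $tree->address($addr);
print "address $addr -> ", $same->as_text, "\n";

# From here on, work only on that subtree.
$same->dump;

$tree->delete;
```

An address is purely positional, so it changes whenever the page gains or loses an element earlier in the tree. Matching on attributes with `look_down`, as in the first half of the sketch, is likely to survive page changes better than a hard-coded address.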
Re: Parse HTML using HTML::TreeBuilder
by daxim (Curate) on Oct 16, 2012 at 16:35 UTC
by Anonymous Monk on Oct 16, 2012 at 16:52 UTC

Re: Parse HTML using HTML::TreeBuilder
by Anonymous Monk on Oct 16, 2012 at 16:51 UTC

Re: Parse HTML using HTML::TreeBuilder
by oldwarrior32 (Sexton) on Oct 17, 2012 at 16:29 UTC