
Hi davido,

This is what I need to extract from the html file:

Relay Status Information
FillDB File Size Limit: 0.09% ( 2772 / 3145728 Bytes )
FillDB File Count Limit: 0.01% ( 1 / 10000 Files )

Here is a sample of the html file:

div><div class="settingsectionbody" style="display: none"><ul>
<li>_BESRelay_PostFile_ChunkSize: 0</li>
<li>_BESRelay_PostFile_ComputerFolderCount: 100</li>
<li>_BESRelay_PostFile_ThrottleKBPS: 0</li>
<li>_BESRelay_PostFile_TimeoutSeconds: 300</li>
<li>_BESRelay_UploadManager_BufferDirectoryMaxCount: 10000</li>
<li>_BESRelay_UploadManager_BufferDirectoryMaxSize: 1073741824</li>
<li>_BESRelay_UploadManager_CompressedFileMaxSize: 20971520</li>
<li>_BESRelay_UploadManager_ChunkSize: not applicable on root server</li>
<li>_BESRelay_UploadManager_ThrottleKBPS: not applicable on root server</li>
</ul></div></div><hr>
<div class="sectiontitle">Relay Status Information</div><br>
<div class="formline"><div class="formlabel">FillDB File Size Limit:</div><div class="forminput">0.0% ( 0 / 3145728 Bytes )</div></div>
<div class="formline"><div class="formlabel">FillDB File Count Limit:</div><div class="forminput">0.0% ( 0 / 10000 Files )</div></div>
<br><hr>
<div class="sectiontitle">Console User Information</div><br>
<a href="/data/login">

My work environment is very strict so I am very limited in what modules can be installed.

Thanks

Re^3: Parse html file
by tangent (Parson) on Sep 24, 2018 at 18:01 UTC
    I see you have HTML::TreeBuilder installed. This is one way you can use that:

    Update: changed slightly to avoid errors

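    use strict;
    use warnings;
    use HTML::TreeBuilder;

    # Assumption: the path to the HTML file is passed as the first argument.
    my $file = shift @ARGV;

    my $tree = HTML::TreeBuilder->new;
    $tree->parse_file($file);
    $tree->eof;

    my @divs = $tree->find_by_attribute('class','formline');
    for my $div (@divs) {
        # Skip any 'formline' div that lacks a label or input div.
        my $label_div = $div->look_down('class','formlabel') or next;
        my $label = $label_div->as_text;
        my $input_div = $div->look_down('class','forminput') or next;
        my $input = $input_div->as_text;
        print "$label $input\n";
    }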
      Thanks tangent, your solution pulled the data.

      Any idea what this error is referring to?

      Can't call method "as_text" on an undefined value

        There are probably some divs with class 'formline' but without the interior divs. For those, look_down finds nothing and returns undef, so the as_text call fails. I have updated the code above to deal with that.
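
As a minimal sketch of that failure mode (the HTML snippet and the @rows array are made up for illustration, and this assumes HTML::TreeBuilder is installed as mentioned above): the second 'formline' div below has no interior divs, so look_down returns undef for it and the guard skips it instead of dying.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample: the second 'formline' div has no label/input divs.
my $html = <<'HTML';
<div class="formline"><div class="formlabel">FillDB File Size Limit:</div><div class="forminput">0.0% ( 0 / 3145728 Bytes )</div></div>
<div class="formline"><br></div>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

my @rows;
for my $div ($tree->find_by_attribute('class', 'formline')) {
    # In scalar context look_down returns the first match, or undef if none.
    my $label_div = $div->look_down('class', 'formlabel');
    my $input_div = $div->look_down('class', 'forminput');

    # Without this guard, ->as_text on undef dies with the error quoted above.
    next unless defined $label_div && defined $input_div;

    push @rows, $label_div->as_text . ' ' . $input_div->as_text;
}

print "$_\n" for @rows;
$tree->delete;    # release the parse tree
```

Collecting the rows in an array rather than printing inside the loop is only a convenience here; it makes it easy to check how many divs were actually usable.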