in reply to Re^2: Fetching table from website using HTML::TableExtract
in thread Fetching table from website using HTML::TableExtract
Corion doesn't mean that you should fix the data in the web page itself, but that you should fix it after you have extracted it with HTML::TableExtract, which correctly inserts empty fields for data cells skipped due to colspan or rowspan. But you have multiline fields:
The second row of your $ts->rows is
Region,Level 31.03.2016,,Sanction/Renewal 01.04.2016 to 28.02.2017,,,Level 28.02.2017,,Sanction/Renewal During Current Month ,,,Level 26.03.2017,,Growth as on 26.03.2017,
After Level 31.03.2016 there's an empty field because of colspan="2". The next field is
Sanction/Renewal During Current Month
so all you have to do is remove trailing whitespace/newlines from each field:
    foreach my $ts ( $te->tables ) {
        print "Table (", join( ',', $ts->coords ), "):\n";
        foreach my $row ( $ts->rows ) {
            s/\s+\z// for @$row;      # <--- here: strip trailing whitespace/newlines
            # s/\n/ /g for @$row;     # uncomment if you want to convert
                                      # multiline fields into single line
            $OUT->print( join( ',', @$row ), "\n" );
        }
    }
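To see the effect in isolation, here is a minimal sketch that runs without HTML::TableExtract. The sample row is hypothetical: it only assumes that your cells look roughly like the ones above, with an embedded newline and trailing whitespace. The helper name clean_row is mine, not part of the module, and treating undefined cells as empty strings is an extra precaution I added.

```perl
use strict;
use warnings;

# Hypothetical sample row: one multiline cell with trailing whitespace,
# plus empty cells standing in for the ones skipped by colspan.
my @row = (
    "Region",
    "Level 31.03.2016",
    "",
    "Sanction/Renewal\nDuring Current Month \n",
    "",
);

# Return a cleaned copy of a row (clean_row is an illustrative name).
sub clean_row {
    my @cells = map { defined $_ ? $_ : '' } @_;   # treat undef cells as empty
    for (@cells) {
        s/\s+\z//;         # strip trailing whitespace and newlines
        s/\s*\n\s*/ /g;    # join multiline fields into a single line
    }
    return @cells;
}

print join( ',', clean_row(@row) ), "\n";
```

Working on a copy keeps the original row intact in case you still need the raw cell text elsewhere.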