in reply to HTML Parser strange Null Character in data

Instead of repeating $cell1 =~ s/^\s+|\s+//g; for every cell, I would suggest that you create a subroutine (or two) to strip the white space and any problematic characters, and then pass each value to that subroutine.

Have a look at the docs on Regular Expression Character Classes. For example, if the problem character is a control character (a null character is one), you can remove it with the [:cntrl:] POSIX character class, which has to be written inside a bracketed class as [[:cntrl:]].
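As a quick check of what that class matches, here is a minimal sketch. The sample string is made up for illustration and is not taken from your data. Note that tab, newline and carriage return are control characters too, so they are removed along with the null.

```perl
use strict;
use warnings;

# Made-up cell value with an embedded null character and a tab
my $cell = "ab\x{00}c\td";

# POSIX classes are only valid inside a bracketed class, hence [[:cntrl:]]
( my $clean = $cell ) =~ s/[[:cntrl:]]+//g;

print "$clean\n";    # prints "abcd"
```

If you want to keep tabs or newlines, you would need a narrower class than [[:cntrl:]].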

foreach my $row ($ts->rows) {
    # clean_text() changes its argument in place and returns the s/// result,
    # not the cleaned string, so loop over the cells rather than using map
    clean_text( $_ ) for @$row;
    #...
}
#...
my $cell1 = $te->rows->[$rowIndex][1];
clean_text( $cell1 );
#...

sub clean_text {
    # remove all control characters
    $_[0] =~ s/[[:cntrl:]]+//g;
    # remove white space at start/end
    $_[0] =~ s/^\s+//;
    $_[0] =~ s/\s+$//;
    # or, instead of the two lines above, remove all white space
    # $_[0] =~ s/\s+//g;
}
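If you would rather build the cleaned row with map, the subroutine has to return the cleaned string, because s/// returns the number of substitutions rather than the modified text. A minimal sketch of that variant follows. The name cleaned_text, the sample row, and the choice to turn undefined cells into empty strings are my assumptions, not something from your script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a cleaned copy and leaves its argument untouched
sub cleaned_text {
    my ($text) = @_;
    return '' unless defined $text;    # assumption: treat missing cells as empty
    $text =~ s/[[:cntrl:]]+//g;        # remove all control characters
    $text =~ s/^\s+//;                 # remove white space at the start
    $text =~ s/\s+$//;                 # remove white space at the end
    return $text;
}

# Made-up sample row standing in for one row from $ts->rows
my @row = ( "  Name\x{00} ", undef, "\tValue  " );
@row = map { cleaned_text($_) } @row;

print join( '|', @row ), "\n";         # prints "Name||Value"
```

Working on a copy costs a little more memory, but it makes the map form safe and keeps the original values available if you need to compare them later.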

Re^2: HTML Parser strange Null Character in data
by caind (Initiate) on Mar 30, 2015 at 18:29 UTC
    Thank you. I was going to do the subroutine a bit later, but I'll take your advice and do it now. Looking at that document, I believe it might be what I was missing. I'll let you know if this fixes it. Again, thank you.