ataX has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I have a file that I need to parse; it comes from the output of a DSLAM command.

When I open the file I find some weird chars (-^[[1D\^[[1D|^[[1D/^) in it, which I need to remove in order to parse the file properly.

This is the part of the file that has the special chars:
configure qos profiles cac FD_ONTUniVideo res-voice-bandwidth 0 max-mcast-bandwidth 24000 res-data-bandwidth 0
configure qos profiles marker d1p marker_hsi default-dot1p 2
configure qos profiles marker d1p FD_Marker_BE default-dot1p 0
configure qos profiles marker d1p FD_Marker_CL default-dot1p 3
configure qos profiles marker d1p FD_Marker_Voice default-dot1p 6
-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D-^[[1D\^[[1D|^[[1D/^[[1D ^[[1Dconfigure qos profiles policer Flash1d committed-info-rate 640 committed-burst-size 114000
configure qos profiles policer Flash1u committed-info-rate 320 committed-burst-size 114000
configure qos profiles policer Flash2d committed-info-rate 928 committed-burst-size 114000
configure qos profiles policer Flash2u committed-info-rate 320 committed-burst-size 114000
configure qos profiles policer Flash3d committed-info-rate 1280 committed-burst-size 114000
configure qos profiles policer Flash3u committed-info-rate 320 committed-burst-size 114000
configure qos profiles policer Flash4d committed-info-rate 1824 committed-burst-size 114000
configure qos profiles policer Flash4u committed-info-rate 320 committed-burst-size 114000
configure qos profiles policer p-neg2d committed-info-rate 480 committed-burst-size 114000
configure qos profiles policer p-neg2u committed-info-rate 480 committed-burst-size 114000
configure qos profiles policer p-neg4d committed-info-rate 640 committed-burst-size 114000
configure qos profiles policer p-neg4u committed-info-rate 640 committed-burst-size 114000
configure qos profiles policer p-neg6d committed-info-rate 1824 committed-burst-size 114000
configure qos profiles policer p-neg6u committed-info-rate 1824 committed-burst-size 114000
configure qos profiles policer p-res2d committed-info-rate 480 committed-burst-size 114000
configure qos profiles policer p-res2u committed-info-rate 480 committed-burst-size 114000
configure qos profiles policer Flash_5d committed-info-rate 320 committed-burst-size 114000

Here is a snippet of my Perl code that parses the file. I can't seem to find a way to strip those special chars from the file:

use strict;
use warnings;
use IO::File;

my $qosProfLog = "/svn/projects/DSLAMCheck/branches/fullcheck+multiprocess/work/qosprofile/qosprofilelist_log_ACST8H1-11283.log";
my $QOSPROFLIST_LOG_FH = IO::File->new();
$QOSPROFLIST_LOG_FH->open($qosProfLog,"r");
while (<$QOSPROFLIST_LOG_FH>) {
    if ( /configure qos profiles policer/ ) {
        chomp;
        tr/\040-\176//csd;
        print "$_\n";
    }
}

Thanks for your help.

Replies are listed 'Best First'.
Re: Removing special chars from line
by jwkrahn (Abbot) on Oct 17, 2011 at 20:47 UTC
    tr/\040-\176//csd;

    You can't use all three options with tr/// at the same time.    It looks like you only need:

    tr/\040-\176//cd;
      I removed the s modifier from tr, but the unprintables are still there. Is there a way to remove all non-printable chars and keep the whitespace at the same time?
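For reference, here is a minimal sketch of what `tr/\040-\176//cd` does to a line like these, assuming the file holds real ESC bytes. The sample string and variable names are made up for illustration. The ESC byte is deleted, but the printable remainder `[1D` and the spinner characters survive, which may be why the output still looks dirty. Adding `\t` to the search list is one way to let tabs through as well.

```perl
use strict;
use warnings;

# Hypothetical sample: two spinner frames, then the start of a real line.
my $line = "-\e[1D\\\e[1Dconfigure qos\tprofiles\n";

# Delete every character outside the range space..tilde (ESC, tab, newline go).
(my $cd = $line) =~ tr/\040-\176//cd;

# Same idea, but the tab is part of the kept set, so it survives.
(my $keep_tab = $line) =~ tr/\t\040-\176//cd;

print "$cd\n";
print "$keep_tab\n";
```

In both results the `[1D` fragments remain, because they are ordinary printable characters once the ESC byte is gone.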
        It looks like the characters that you want to remove are all cases of "backspace" (expressed as the four-character sequence 'esc [ 1 D', with the 'esc' character rendered by your terminal window display as two printable characters, "^" and "["), followed by one of:  / - \ |

        It's the ascii-art version of the rotating clock-hand. To get rid of that specific pattern, use a regex substitution -- I think it would go something like this (not tested):

        s{ (?: \e \[ 1 D [\\/|-] )+ }{}gx;
        The "\e" represents the escape character, and the open-square-bracket needs to be preceded by a backslash to be treated as a literal character. The four "rotating hand" characters are in a character class that follows "D". The hyphen needs to be either last or first in the class list, to avoid being interpreted as a range operator, and the backslash needs to be escaped, but "|" does not (and neither does "/", since I'm not using that as the regex delimiter).
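A minimal sketch of the substitution approach, adjusted to the ordering visible in the posted sample, where each spinner character (and one final blank) comes before its `ESC [ 1 D` rather than after it. The `strip_spinner` name is hypothetical. Accepting a literal `^[` pair in place of a real ESC byte is an assumption, in case the log was saved from a terminal rendering instead of the raw stream.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every "frame character + cursor-left" pair.
sub strip_spinner {
    my ($line) = @_;
    # One of - \ | / or a blank, then ESC (or the literal pair ^[), then "[1D".
    # The hyphen is first in the class, so it is not read as a range operator.
    $line =~ s{[-\\|/ ](?:\e|\^\[)\[1D}{}g;
    return $line;
}

# Made-up sample: four spinner frames, the trailing blank, then real content.
my $raw = "-\e[1D\\\e[1D|\e[1D/\e[1D \e[1Dconfigure qos profiles policer Flash1d";
print strip_spinner($raw), "\n";
```

Calling this on `$_` before the `/configure qos profiles policer/` test should leave the command text intact, since a blank only matches when it is immediately followed by the escape sequence.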