in reply to Parsing/Removing Characters from Text File

mizmaster22:

As was mentioned in your previous thread on the same subject, if you can tell the software that your terminal doesn't handle color, much of that may go away, unless it's cursor-positioning stuff. Either way, you can strip it out by writing a regular expression that recognizes those strings and removes them, something like:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $txt = "06:00\x1b[1;1H \x1b[24;0H \x1b[K 7 \x1b[1;1H "
            . "\x1b[0;7m list tvbs dnv 3334 06:00 \x1b[0m 8";

    # Remove strings like <esc>[<digits_or_semicolon><letter>
    $txt =~ s/\x1b\[[0-9;]+[A-Za-z]//g;

    print $txt;

All you need to do is figure out a good regular expression to match the bits you want to delete, and remove them as shown in the example. (The one I provided might delete too much, so be sure to test thoroughly.)
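As a rough sketch of what a more general pattern could look like: the example regex requires at least one digit or semicolon, so a bare sequence such as <esc>[K would be left behind. The version below follows the usual layout of a CSI sequence (ESC, "[", optional parameter bytes 0x30-0x3F, optional intermediate bytes 0x20-0x2F, one final byte 0x40-0x7E). The helper name strip_csi is made up for illustration, and it only handles CSI sequences, not other escape types, so test it against your real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove CSI sequences of the form
#   ESC [ <parameter bytes 0x30-0x3F>* <intermediate bytes 0x20-0x2F>* <final byte 0x40-0x7E>
sub strip_csi {
    my ($text) = @_;
    $text =~ s/\x1b\[[\x30-\x3f]*[\x20-\x2f]*[\x40-\x7e]//g;
    return $text;
}

# Sample data modeled on the example above, including a bare <esc>[K
my $txt = "06:00\x1b[1;1H \x1b[K 7 \x1b[0;7m list \x1b[0m 8";
print strip_csi($txt), "\n";
```

Because the parameter part is optional here, sequences with no digits at all are removed too, while ordinary text containing "[" without a preceding escape character is left alone.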

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Re^2: Parsing/Removing Characters from Text File
by mizmaster22 (Novice) on Jun 16, 2011 at 19:05 UTC
    roboticus, thank you so much. I have been racking my brain on this problem for a long time now. Me and regex do not get along very well at all. The output is nearly flawless now, except that I am getting a carriage return or newline character in front of each sentence. Other than that it looks great. Here is my extremely simple parsing code:

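        use warnings;
        use strict;
        use File::Slurp;

        # Read the whole file, strip <esc>[<digits_or_semicolon><letter> sequences,
        # and write the result back to the same file
        my $s = read_file("calldata.txt");
        $s =~ s/\x1b\[[0-9;]+[A-Za-z]//g;
        write_file("calldata.txt", $s);
        __END__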

    I looked those symbols up with a hex editor and it looks like they are 0d and 0a. Once again, thank you very much for the help.
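One possible way to deal with those 0d 0a bytes, sketched under the assumption that they are CR LF line endings that end up as stray empty lines once the escape sequences are gone. The helper name normalize_newlines is hypothetical, and collapsing every empty line may be more aggressive than you want, so check the result on a copy of the file first.

```perl
use strict;
use warnings;

# Hypothetical helper: turn CR LF (0x0d 0x0a) and lone CR into LF,
# then collapse runs of newlines so no empty lines remain.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;    # CR LF or bare CR -> LF
    $text =~ s/\n{2,}/\n/g;   # collapse consecutive newlines into one
    $text =~ s/\A\n//;        # drop a newline at the very start
    return $text;
}

print normalize_newlines("\r\nfirst call\r\n\r\nsecond call\r\n");
```

In the script above, this could be applied to $s right after the escape sequences are removed and before write_file is called.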