Moron has asked for the wisdom of the Perl Monks concerning the following question:

Although plenty of similar cases have been discussed here, this exact issue is thornier than it seems at first sight, and I couldn't find a close enough match. Simply stated, the problem is: given a string, determine whether it is safe to print without doing anything nasty to just about any screen or printer, on the assumption that it comes from a text file. (This matters especially when printing large files, where letting a typical binary file through would cause mayhem.) I refined the problem statement to this:

Using a regexp, filter out the following character types: \x08 \x09 \x0B \x0C \x20 and any symbolic character (punctuation, numbers, letters, standard symbols...), then detect whether anything is left over. I can conceive of an ugly way to do this using ranges of characters, but is that really the best?

Many thanks in advance for any suggestions,

-M

Free your mind

Replies are listed 'Best First'.
Re: Testing for unprintable characters
by blazar (Canon) on Nov 15, 2005 at 16:27 UTC
Re: Testing for unprintable characters
by davorg (Chancellor) on Nov 15, 2005 at 16:30 UTC

    You could use the POSIX [:print:] character class. Or, rather, its inverse.

    if ($string =~ /[^[:print:]]/) {
        print "Found unprintable characters\n";
    }
    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg
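
The question also wants \x08, \x09, \x0B and \x0C to pass, and [:print:] on its own rejects those control characters. A minimal sketch of one way to combine the two, assuming the allowed list is exactly the one given in the question; the helper name is_safe_to_print is hypothetical and not from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $s holds only printable characters plus
# the control characters listed in the question (\x08 \x09 \x0B \x0C).
# Assumption: \n and \r are NOT allowed, because the question's list
# omits them; add them to the class if the input still has line endings.
sub is_safe_to_print {
    my ($s) = @_;
    return $s !~ /[^[:print:]\x08\x09\x0B\x0C]/;
}

my @samples = ("foo; is ok", "tab\there", "foo\x03");
for my $i (0 .. $#samples) {
    printf "sample %d: %s\n", $i,
        is_safe_to_print($samples[$i]) ? "safe" : "unsafe";
}
```

The negated class keeps the whole check in a single pattern, so there is no need to spell out character ranges by hand.
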

Re: Testing for unprintable characters
by Roy Johnson (Monsignor) on Nov 15, 2005 at 16:34 UTC
    use POSIX;
    for ("foo; is ok", "foo\x3") {
        if (/[^[:print:]]/) {
            print "Contains unprintables\n";
        }
        else {
            print "$_ is safe\n";
        }
    }
    See POSIX isprint function.

    Caution: Contents may have been coded under pressure.

      Despite the name, you don't need to use the POSIX module in order to use POSIX character classes.

      --
      <http://dave.org.uk>

      "The first rule of Perl club is you do not talk about Perl club."
      -- Chip Salzenberg

Re: Testing for unprintable characters
by thundergnat (Deacon) on Nov 15, 2005 at 16:48 UTC

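    Or use a named Unicode property in the regex: \P{Print} matches any character that is not printable. Note that this strips the unprintable characters rather than just detecting them.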

    my $string = "\x07\x1b\x04Safe \x12to print\x03.\n";
    $string =~ s/\P{Print}//g;
    print $string if length $string;
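
Since the original question asks for detection rather than stripping, the same property can also be used to report where the offending characters are. A rough sketch, reusing the sample string above; the helper name find_unprintables is hypothetical. Note that \P{Print} also matches the trailing newline, so the substitution above removes it too.

```perl
use strict;
use warnings;

# Hypothetical helper: return one [offset, code point] pair for every
# character in $s that lacks the Print property.
sub find_unprintables {
    my ($s) = @_;
    my @hits;
    while ($s =~ /(\P{Print})/g) {
        # pos() points just past the one-character match
        push @hits, [ pos($s) - 1, ord $1 ];
    }
    return @hits;
}

my $string = "\x07\x1b\x04Safe \x12to print\x03.\n";
printf "offset %d: \\x%02X\n", @$_ for find_unprintables($string);
```

In boolean context the helper yields the number of hits, so `if (find_unprintables($string)) { ... }` works as a plain yes/no test.
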
Re: Testing for unprintable characters
by Moron (Curate) on Nov 15, 2005 at 16:45 UTC
    Thanks, all. It looks like the POSIX character class wraps the issue up nicely.

    -M

    Free your mind