in reply to Getting rid of non-standard ASCII characters

Transliterate: tr/\0-\037\177-\377//d; Btw, all characters with codes below 128 are ASCII. What you wanted to eliminate are the control characters, which include newline, carriage return and tab, along with everything from 128 up. 'Printable ASCII' describes what you want to keep.
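A minimal sketch of that transliteration in use, in case it helps. The helper names and the sample string are made up for illustration, and the second variant is only an assumption about what you might want if tab, newline and carriage return should survive:

```perl
use strict;
use warnings;

# Hypothetical helper: keep only printable ASCII (codes 32..126).
sub printable_ascii {
    my ($s) = @_;
    $s =~ tr/\0-\037\177-\377//d;    # delete control characters and bytes 128..255
    return $s;
}

# Hypothetical variant: also keep tab, newline and carriage return.
sub printable_plus_whitespace {
    my ($s) = @_;
    $s =~ tr/\t\n\r\040-\176//cd;    # c = complement the list, d = delete what matches
    return $s;
}

my $sample = "caf\xe9\tau lait\r\n\x07";
print printable_ascii($sample), "\n";        # prints "cafau lait"
print printable_plus_whitespace($sample);    # prints "caf<TAB>au lait" followed by CR LF
```

The complemented form lists what to keep rather than what to drop, so it also removes any character above 255 that might be in the string.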

Update: Ionizor, you're wrong. There are lots of extended 8-bit character sets and MS code pages, but none of them are ASCII, which is 7-bit.

Update2: pg, it compiles for me, but I did err in putting leading zeros on the escaped octals, which made them more than three octal digits long. Repaired, and thanks.
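For anyone wondering why the extra zero matters: an octal escape takes at most three digits, so a fourth digit becomes a separate literal character. A small sketch (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $three_digits = "\177";     # one character: DEL, code 127
my $four_digits  = "\0177";    # two characters: "\017" (code 15) followed by a literal "7"

printf "%d %d\n", length($three_digits), length($four_digits);    # prints "1 2"
print join(",", map { ord } split //, $four_digits), "\n";         # prints "15,55"
```

Inside tr the same rule applies, so a range written as \0177-\0377 does not start at 127 as intended.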

After Compline,
Zaxo

Re^2: Getting rid of non-standard ASCII characters
by Ionizor (Pilgrim) on Dec 19, 2002 at 02:29 UTC

    Technically ASCII is 8 bit, so all characters less than 256 are ASCII. I believe 128 - 255 are all printable, so "Printable 7-bit ASCII" is probably more accurate.

    Nitpick, nitpick, nitpick, I know...

    Update: I'm wrong. These people are too. An explanation of ISO 646 I found here pretty much sums it up: "ASCII uses only 7 bits and allows the most significant eighth bit to be used as parity bit, highlight bit, end-of-string bit (all of which are considered bad practice nowadays) or to include additional characters for internationalization (i18n for which we need 8bit-clean programs that do none of afore-mentioned silly tricks) but ASCII defined no standard for this and many manufacturers invented their own proprietary codepages." Sorry.

Re: Re: Getting rid of non-standard ASCII characters
by pg (Canon) on Dec 19, 2002 at 02:47 UTC
    You said: "tr/\0-\037\0177-\0377//d;"


    Maybe you didn't test your solution thoroughly :-) A quick fix, including a tester, could be:

    for (0..255) { $s .= chr($_); }
    $s =~ tr/\0-\37\177-\377//d; # fixed, no more leading zeroes
    print $s;
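    Rather than eyeballing the printed string, the tester could compare the result with what is expected to survive. This is only a sketch, assuming the goal is exactly the 95 printable ASCII characters (codes 32 to 126); the variable names are illustrative:

```perl
use strict;
use warnings;

# Build a string holding every character from 0 to 255.
my $s = join '', map { chr } 0 .. 255;

# Apply the corrected transliteration to a copy.
(my $kept = $s) =~ tr/\0-\37\177-\377//d;

# What should survive: space (32) through tilde (126).
my $expected = join '', map { chr } 32 .. 126;

my $verdict = $kept eq $expected ? "ok" : "not ok";
print "$verdict, ", length($kept), " characters kept\n";    # prints "ok, 95 characters kept"
```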