in reply to Re^2: Find what characters never appear
in thread Find what characters never appear

If you want to avoid potential issues with regex metacharacters, you can use a set of hash keys to track which characters have not been seen yet, quote them with \Q...\E, and rebuild the regex each time a new character turns up:

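#!/usr/bin/perl
use strict;
use warnings;

# Start with every candidate character as a hash key.
my %char_hash = ();
$char_hash{ chr($_) } = undef foreach (33 .. 127);

# Character class of everything not yet seen; \Q...\E escapes metacharacters.
my $chars = join "", keys %char_hash;
my $regex = "([\Q$chars\E])";

while (<DATA>) {
    while (/$regex/g) {
        # Seen it: drop the character and rebuild the regex without it.
        delete $char_hash{$1};
        $chars = join "", keys %char_hash;
        $regex = "([\Q$chars\E])";
    }
}

# Whatever is left never appeared in the input.
my @good_array = keys %char_hash;
print @good_array;

__DATA__
!"#$%&'()*+,-./01234567
89:;<=>?@ABCDE
FGHIJKLMOPQRSTUVWXYZ[\]^_`abcdefghijklmnop
qrstuvwxyz{|}~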

though I feel like there must be a simpler way of implementing this approach.
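One possibly simpler variant, offered as a sketch rather than a drop-in replacement: skip the regex entirely and delete each line's characters from the hash with a hash slice. The helper name unseen_chars, the in-memory sample text, and the 33..126 candidate range are my own assumptions for illustration. I have not benchmarked it against the regex version, so it may well be slower on very large files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the candidate
# characters that never occur in the lines read from $fh.
sub unseen_chars {
    my ($fh, @candidates) = @_;
    my %unseen = map { $_ => 1 } @candidates;
    while ( my $line = <$fh> ) {
        # Hash-slice delete: drop every character on this line in one step.
        # Deleting a key that is already gone is harmless.
        delete @unseen{ split //, $line };
        last unless %unseen;    # every candidate has been seen; stop early
    }
    return sort keys %unseen;
}

# Usage: candidates are printable ASCII 33..126 (an assumption),
# and the input is a small in-memory file standing in for real data.
my $text = "The quick brown fox\njumps over the lazy dog\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @never = unseen_chars( $fh, map { chr } 33 .. 126 );
print "Never seen: @never\n";
```

Because nothing is interpolated into a pattern, there are no metacharacters to quote, and the early exit avoids reading the rest of the file once every candidate has turned up.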

Re^4: Find what characters never appear
by Narveson (Chaplain) on Sep 05, 2009 at 13:35 UTC

    This ran in just a few minutes against my big 2GB file.

    All I had to do was change the printable range to 33..126, change <DATA> to <>, and for my own curiosity, add print "$1 seen on line $.\n"; after delete $char_hash{$1};