in reply to hex in regexp

The standard way to do this is something like the following:

$ perl -Mstrict -Mwarnings -E '
>     my $x = q{X-X!X?X};
>     $x =~ s/([[:^upper:]])/unpack q{H2}, $1/eg;
>     say $x;
> '
X2dX21X3fX

Note: I used upper instead of ascii for my example code.

There may be modules that also supply this functionality. If there are, I'm sure other monks will provide details.

-- Ken

Re^2: hex in regexp
by shamanoff (Initiate) on May 22, 2012 at 05:04 UTC

    Thank you, Ken! Your code almost worked. For some reason it didn't substitute the symbols with their hex codes; it just removed them from the string. Here is an example of the string I'm trying to modify: "çe quil Y a Yå". I am still testing to figure out why this happens. An additional question: how exactly can I wrap the hex code in <> brackets in this example?

      Changing upper to ascii and using your string, I get:

      $ perl -Mstrict -Mwarnings -E '
          my $x = q{çe quil Y a Yå};
          $x =~ s/([[:^ascii:]])/unpack q{H2}, $1/eg;
          say $x;
      '
      c383c2a7e quil Y a Yc383c2a5

      Check that you didn't make a typo when entering your code. If you are still having problems, please post your code - as it is, I can't reproduce your problem.

      To wrap the codes in < and >, or any other characters, you can just concatenate the characters at the beginning and end of the hex code:

      $ perl -Mstrict -Mwarnings -E '
          my $x = q{çe quil Y a Yå};
          $x =~ s/([[:^ascii:]])/q{<} . unpack(q{H2}, $1) . q{>}/eg;
          say $x;
      '
      <c3><83><c2><a7>e quil Y a Y<c3><83><c2><a5>

      -- Ken

        I double checked the code and haven't found any typo:
        #!/usr/bin/perl
        $file = @ARGV[0];
        $out_file = $file.".out";
        my $count = 0;
        print "Processing file: $file\n";
        print "Output file: $out_file\n";
        open FILE, "$file" or die "cannot open $file file";
        open OUT, ">>$out_file" or die "cannot create $out_file file";
        #go through the file rows
        while (<FILE>) {
            chomp;
            # if row contain non-ASCII symbols - it is counted
            if ($_ =~ /[[:^ascii:]]/) {
                print "String before modification:\n $_ \n";
                s/[[:^ascii:]]/q{<} . unpack(q{H2}, $1) . q{>}/eg;
                print "String after modification:\n $_ \n";
                print OUT "$_\n";
                $count++ ;
            }
            else {
                print OUT "$_\n";
            }
        }
        print "there are $count possibly corrupted rows\n";
        close FILE;
        close OUT;
        the contents of processed file:
        0,1,,,1,~=_<>}[]||$? £??§^`??¿ ?. Qu est çe quil Y a Yå,0,20120306101731
        0,1,,,1,Teá,0,20120314104403
        0,1,,,1,Q<8a>rUì ,0,20120306103345
        1,1,,,3,,0,20120331152610
        1,1,,,3,,0,20120331152612
        Thanks, Ken! I found the typo - forgot to put the search expression in brackets ([...]). Let me say that I am sorry for my impatience. :)