in reply to Hex-matching Regex pattern in scalar

You're not performing any substitution; you're only matching, since qr just returns a compiled regex object.

#!/usr/bin/perl --
use strict;
use warnings;
use Path::Tiny qw/ path /;

my $infile  = '...';
my $outfile = '...';
my $find    = VerifyHex( '\xe9' );
my $replace = VerifyHex( '\x65' );

my $IF = path( $infile )->openr_raw;
my $OF = path( $outfile )->openw_raw;
while( my $str = <$IF> ){
    $str =~ s{$find}{$replace}g;
    print $OF $str;
}
close $IF;
close $OF;

sub VerifyHex {
    my( $str ) = @_;
    if( $str =~ m/(\\[a-zA-Z0-9][a-zA-Z0-9])/ ){
        return "$1";
    }
    die "evil input $str";
}

Yes, you could use Path::Tiny's edit_lines_raw (https://metacpan.org/pod/Path::Tiny#edit_lines-edit_lines_utf8-edit_lines_raw), but I didn't want to change the code too much.

Re^2: Hex-matching Regex pattern in scalar ( substitution
by CliffG (Novice) on May 20, 2016 at 11:13 UTC
    Thanks, but this doesn't work for me. I know that qr returns a regex object; I was only using it because a straight substitution doesn't do the trick. I don't have Path::Tiny, so I amended your example:
    #!/usr/bin/perl --
    use strict;
    use warnings;

    my $infile  = 'C:\Scripts\Working2\Users_0.xml';
    my $outfile = 'C:\Scripts\Working2\Users2.xml';
    my $find    = VerifyHex( '\xe9' );
    my $replace = VerifyHex( '\x65' );

    open(IF, "<$infile") or die "Could not open $infile $!";
    open(OF, ">$outfile") or die "Could not open $outfile $!";
    binmode IF;
    binmode OF;
    while( my $str = <IF> ){
        $str =~ s{$find}{$replace}g;
        print OF $str;
    }
    close IF;
    close OF;

    sub VerifyHex {
        my( $str ) = @_;
        if( $str =~ m/(\\[a-zA-Z0-9][a-zA-Z0-9])/ ){
            return "$1";
        }
        die "evil input $str";
    }
    The input file contains this (edited from an extract from an IBM tool), which says it's UTF-8 but isn't:
    <?xml version="1.0" encoding="UTF-8" ?>
    <foundation Version="1.0.0">
      <contributor>
        <userId>C12760</userId>
        <name>Shilpaé Durgale</name>
      </contributor>
    </foundation>
    Sadly, the output is unchanged from the input: the é is not replaced with e. As you can tell, I'm no genius with Perl, and I'm sure I'm missing something fundamental. Thoughts?
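
One likely reason the output is unchanged: the pattern in VerifyHex captures a backslash plus only two characters, so for '\xe9' it returns \xe and drops the 9. As a regex, \xe matches the control character 0x0E, not é. The replacement would not work either, because the text \x65 held in a variable is inserted literally and is not re-read as an escape. Below is a minimal sketch of one way around both problems. It assumes the é in the file is the single byte 0xE9 (Latin-1, which fits "says it's UTF-8 but isn't"). hex_to_char is a hypothetical replacement for VerifyHex, not something from the original code.

```perl
#!/usr/bin/perl --
use strict;
use warnings;

# What the VerifyHex pattern from the thread captures for '\xe9':
# a backslash plus only two characters, so the '9' is dropped.
my ($captured) = '\xe9' =~ m/(\\[a-zA-Z0-9][a-zA-Z0-9])/;
print "VerifyHex would return: $captured\n";    # prints \xe

# hex_to_char is a hypothetical replacement for VerifyHex: it accepts
# exactly \xHH and returns the character with that code.
sub hex_to_char {
    my ($str) = @_;
    if ( $str =~ m/\A\\x([0-9a-fA-F]{2})\z/ ) {
        return chr hex $1;
    }
    die "evil input $str";
}

my $find    = hex_to_char('\xe9');
my $replace = hex_to_char('\x65');

# Sample line, assuming the é is stored as the single byte 0xE9.
my $line = "<name>Shilpa\xe9 Durgale</name>\n";
( my $fixed = $line ) =~ s{\Q$find\E}{$replace}g;
print $fixed;    # prints <name>Shilpae Durgale</name>
```

In the amended script, the same two changes would apply: build $find and $replace as real characters, and use \Q$find\E in the substitution so the character is matched literally. The binmode calls can stay as they are, since the file is being treated as raw bytes.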