in reply to Re: Hex-matching Regex pattern in scalar ( substitution
in thread Hex-matching Regex pattern in scalar
The input file contains this (edited from an extract from an IBM tool), which says it's UTF-8 but isn't:

```
<?xml version="1.0" encoding="UTF-8" ?>
<foundation Version="1.0.0">
  <contributor>
    <userId>C12760</userId>
    <name>Shilpaé Durgale</name>
  </contributor>
</foundation>
```

Here is the script:

```
#!/usr/bin/perl --
use strict;
use warnings;

my $infile  = 'C:\Scripts\Working2\Users_0.xml';
my $outfile = 'C:\Scripts\Working2\Users2.xml';
my $find    = VerifyHex( '\xe9' );
my $replace = VerifyHex( '\x65' );

open(IF, "<$infile") or die "Could not open $infile $!";
open(OF, ">$outfile") or die "Could not open $outfile $!";
binmode IF;
binmode OF;

while( my $str = <IF> ){
    $str =~ s{$find}{$replace}g;
    print OF $str;
}

close IF;
close OF;

sub VerifyHex {
    my( $str ) = @_;
    if( $str =~ m/(\\[a-zA-Z0-9][a-zA-Z0-9])/ ){
        return "$1";
    }
    die "evil input $str";
}
```

Sadly, the output is unchanged from the input: the é is not replaced with e. As you can tell, I'm no genius with Perl, and I'm sure I'm missing something fundamental. Thoughts?
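One possible cause, offered as a sketch rather than a definitive diagnosis: the pattern in `VerifyHex` captures a backslash plus only two characters, so `'\xe9'` comes back as `\xe` (which a regex reads as byte 0x0E), and the replacement string is inserted as literal text rather than being decoded. The snippet below assumes the é is stored as the single byte 0xE9 (Latin-1 or CP1252), which would fit a file that claims UTF-8 but isn't. `hex_escape_to_byte` is a hypothetical helper name, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# What the original pattern captures from '\xe9': only three characters.
if ( '\xe9' =~ m/(\\[a-zA-Z0-9][a-zA-Z0-9])/ ) {
    print "original capture: $1\n";    # prints: original capture: \xe
}

# Hypothetical helper (an assumption, not from the original script):
# accept exactly '\xHH' and return the single byte it names.
sub hex_escape_to_byte {
    my ($str) = @_;
    if ( $str =~ m/\A\\x([0-9a-fA-F]{2})\z/ ) {
        return chr( hex($1) );
    }
    die "evil input $str";
}

my $find    = hex_escape_to_byte('\xe9');    # byte 0xE9
my $replace = hex_escape_to_byte('\x65');    # byte 0x65, i.e. 'e'

# Sample line, assuming the file holds the accented letter as byte 0xE9.
my $line = "<name>Shilpa\xe9 Durgale</name>\n";

# \Q...\E quotes the byte so it is matched literally.
( my $fixed = $line ) =~ s{\Q$find\E}{$replace}g;
print $fixed;
```

If the file turns out to hold é as the two UTF-8 bytes 0xC3 0xA9 instead, the same approach would need a two-byte search string; a hex dump of the input would settle which case applies.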