mdunnbass has asked for the wisdom of the Perl Monks concerning the following question:
Scenario:
Big text file, maybe HTML, maybe plain text. Looking to replace all
instances of $foo with $stuff1.$foo.$stuff2.
$foo is read from <STDIN>, chomped, and uppercased, so it is only English letters. However, let's say $foo = 'GOAT'; the file I'm looking through doesn't have 'GOAT' anywhere, but it does have asdfGxxOxxAxxTqwerty and other permutations.
I want to make:
asdfGxxOxxAxxTqwerty
become:
'asdf'.$stuff1.'GxxOxxAxxT'.$stuff2.'qwerty'
And before I can mislead you or anything, I don't care if the xx's are xx or if the qwerty is qwerty. They could be anything matching [^A-Z]. I just want to ignore them and leave them undisturbed.
Essentially, I want to do this globally, and the only change I want to make to the text file is adding $stuff1 and $stuff2.
I've tried various things, all to abysmal and spectacular failure. I think it should be something along the following lines, but I know as is, this is wrong:
    # disclaimer - assume use strict and warnings are both in effect, and
    # variables are lexically named elsewhere.. this is a toss off bit o' code
    while (my $line = <FH>) {
        $test =~ join ( split ( /(?:[^A-Z])/, $line ) );
        # the previous line will excise any non uppercase, non letters from $line
        # now, we compare $line to our $foo
        while ($test =~ /$foo/g ) {
            $line =~ s/$1/(stuff1 . $1 . $stuff2)/e
        }
    }
    my $x = <STDIN>;
    chomp $x;
    $x = uc($x);
    my $regex = join('[^A-Z]*', split //, $x);
I just wasn't able to get the original suggestions to work for me, and I attributed it to me not describing the problem well enough.
Thanks for any insights
Matt
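For reference, here is a minimal sketch of how the join-built pattern from the question could be applied in a single global substitution. It is one possible reading of the request, not the approach taken in any reply. The helper name wrap_scattered and the '<b>' / '</b>' markers are hypothetical stand-ins for $stuff1 and $stuff2, and the sketch assumes the search word contains only English letters, as the question states.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): wrap every scattered
# occurrence of $word in $text with $before and $after.
sub wrap_scattered {
    my ($text, $word, $before, $after) = @_;

    # Letters of the word, each pair separated by any run of characters
    # that are not uppercase A-Z. quotemeta is only a precaution here,
    # since the word is assumed to be letters only.
    my $pattern = join '[^A-Z]*', map { quotemeta($_) } split //, uc $word;

    # Capture the whole scattered match and re-emit it between the markers,
    # leaving everything else in the text untouched.
    $text =~ s/($pattern)/$before$1$after/g;
    return $text;
}

print wrap_scattered('asdfGxxOxxAxxTqwerty', 'goat', '<b>', '</b>'), "\n";
# prints: asdf<b>GxxOxxAxxT</b>qwerty
```

Note that [^A-Z]* also matches newlines, so applying this to a whole slurped file lets a match span lines, while applying it line by line keeps each match within one line.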
Replies are listed 'Best First'.
Re: More Regexp Confusion
by GrandFather (Saint) on Feb 01, 2007 at 22:22 UTC
Re: More Regexp Confusion
by johngg (Canon) on Feb 01, 2007 at 23:54 UTC
by mdunnbass (Monk) on Feb 05, 2007 at 14:59 UTC
Re: More Regexp Confusion
by AltBlue (Chaplain) on Feb 01, 2007 at 22:15 UTC
by mdunnbass (Monk) on Feb 01, 2007 at 22:29 UTC