I posted a question similar to this one last week and got some very helpful answers. But since I didn't frame the question well, I didn't get the answers I was looking for.

Scenario:
A big text file, maybe HTML, maybe plain text. I'm looking to replace all instances of $foo with $stuff1.$foo.$stuff2.

$foo is read from STDIN, chomped, and uppercased with uc, and contains only English letters. However, let's say $foo = 'GOAT'; the file I'm looking through doesn't have 'GOAT' anywhere, but it does have asdfGxxOxxAxxTqwerty and other variations.

I want to make:

asdfG..O..A..Tqwerty

become:

'asdf'.$stuff1.'GxxOxxAxxT'.$stuff2.'qwerty'

To avoid misleading anyone: I don't care whether the xx's are xx or whether the qwerty is qwerty. They could be anything matching [^A-Z]. I just want to ignore them and leave them undisturbed.

Essentially, I want to do this globally, and the only change I want to make to the text file is adding $stuff1 and $stuff2.

I've tried various things, all to abysmal and spectacular failure. I think it should be something along the following lines, but I know as is, this is wrong:

# disclaimer - assume use strict and warnings are both in effect, and
# variables are lexically named elsewhere..  this is a toss off bit o' code

while (my $line = <FH>) {
  
  $test =~ join ( split ( /(?:[^A-Z])/, $line ) );
  
# the previous line will excise any non uppercase, non letters from $line
# now, we compare $line to our $foo
  
  while ($test =~ /$foo/g ) {
    $line =~ s/$1/($stuff1 . $1 . $stuff2)/e
    }
  
  }

Of course, I recognise that the $1 there corresponds to the GOAT and not the GxxOxxAxxT, but I am at a loss as to how to proceed. Any thoughts? I know that many people responded to my original post suggesting something like:

my $x = <STDIN>;
chomp $x;
$x = uc($x);
my $regex = join('[^A-Z]*', split //, $x);

..and that the method I proposed above is more of the inverse...

I just wasn't able to get the original suggestions to work for me, and I attributed it to me not describing the problem well enough.

Thanks for any insights
Matt
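
For reference, here is a minimal sketch of how the join/split suggestion quoted above might be applied to the wrapping problem. It assumes that capturing the whole match is enough to keep the filler characters in place. The helper name wrap_spaced and the placeholder values for $stuff1 and $stuff2 are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): wrap every spaced-out
# occurrence of $foo in $text with $stuff1 and $stuff2.
sub wrap_spaced {
    my ($text, $foo, $stuff1, $stuff2) = @_;

    # For 'GOAT' this builds G[^A-Z]*O[^A-Z]*A[^A-Z]*T.
    # quotemeta is only a precaution; $foo is assumed to hold letters.
    my $pattern = join '[^A-Z]*', map { quotemeta($_) } split //, uc $foo;

    # $1 holds the text exactly as it appears in the input, fillers
    # included, so nothing between the letters is disturbed.
    $text =~ s/($pattern)/$stuff1$1$stuff2/g;
    return $text;
}

# Placeholder values, chosen only for illustration.
print wrap_spaced('asdfGxxOxxAxxTqwerty', 'goat', '<b>', '</b>'), "\n";
# prints: asdf<b>GxxOxxAxxT</b>qwerty
```

Because the substitution runs against the original text rather than a stripped copy, there is no need to map positions back from a cleaned-up string. One caveat: [^A-Z]* also matches newlines, so a match that spans lines would only be found if the file is read as a single string rather than line by line.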

In reply to More Regexp Confusion by mdunnbass
