
I'm not sure I like the following solution, but how about turning your hash keys into regular expressions?

my $data = <<EOT;
This is just lines of text here, and also there.  Consider
this human readable text; it's full of letters and
punctuation.
EOT

my %table = ( '\b[Ll]ines of text\b' => "foo.html",
              '\b[Tt]his\b'          => "bar.html",
              '\b[Ii]t\'?s full\b'   => "foobar.html" );

$data =~ s/($_)/<a href="$table{$_}">$1<\/a>/g foreach keys %table;

print $data;

The above seems to work, but if you have 5000+ keys, the patterns are going to take a huge amount of time to write, and the loop will run about as fast as a one-legged dog, because every key means another full substitution pass over the text. There's also the problem that it's easy to write inefficient regexes.
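One way around the pass-per-key cost might be to fold all the keys into a single alternation and walk the text once. The following is only a sketch under a different assumption than the code above: the keys are literal, lowercase phrases rather than regexes, so the matched text can be lowercased and used directly as the hash lookup. The names $alternation, $re and $linked are mine, for illustration only.

```perl
use strict;
use warnings;

my $data = <<'EOT';
This is just lines of text here, and also there.  Consider
this human readable text; it's full of letters and
punctuation.
EOT

# Assumption: keys are literal, lowercase phrases, not regexes.
my %table = (
    'lines of text' => 'foo.html',
    'this'          => 'bar.html',
    "it's full"     => 'foobar.html',
);

# Escape each phrase and put longer phrases first, so a longer
# phrase wins over a shorter one that starts the same way.
my $alternation = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %table;

# One compiled, case-insensitive pattern for every phrase.
my $re = qr/\b($alternation)\b/i;

# Single pass: look up the lowercased match to find its link target.
(my $linked = $data) =~
    s{$re}{sprintf '<a href="%s">%s</a>', $table{lc $1}, $1}ge;

print $linked;
```

This gives up the per-key flexibility of real regexes (the optional apostrophe in the third key, for instance) in exchange for scanning the text once, so whether it fits depends on what the keys actually look like.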

The other idea that comes to mind is using the String::Approx module to try to match the keys to text. The problem with that is that it will be slower and more error prone :(

tedv wrote:

Of course, we link the initial This but not the this starting line 2.

Why "of course"? Can you explicitly state a rule? Are you only trying to match the first occurrence of each string? If so, take off the /g on the substitution and it will work much faster.
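To illustrate what dropping /g does, here is a small sketch that assumes the rule really is "link only the first occurrence of each key" (that is my guess at the rule, not something tedv has confirmed). It uses just one key from the table above, and $linked is a name I made up for the copy being modified.

```perl
use strict;
use warnings;

my $data = <<'EOT';
This is just lines of text here, and also there.  Consider
this human readable text; it's full of letters and
punctuation.
EOT

my %table = ( '\b[Tt]his\b' => 'bar.html' );

my $linked = $data;
for my $pattern (keys %table) {
    # No /g: the substitution stops after the first match of each pattern.
    $linked =~ s/($pattern)/<a href="$table{$pattern}">$1<\/a>/;
}

print $linked;
```

With the /g removed, the initial This is wrapped in a link and the this that starts line 2 is left alone.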

Cheers,
Ovid


In reply to (Ovid - regexes as hash keys)Re: Mass Text Replacement by Ovid
in thread Mass Text Replacement by tedv
