paulu has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I was wondering if you might help. I just inherited a database full of media titles with special characters. This snippet from my script successfully removes the characters and replaces the whitespace with underscores; however, on a few rare occasions I end up with a double underscore (__), and I wonder if someone has a better way of doing what I am trying here. I'm a Perl newb, but I'm coming up to speed as quickly as possible. Any suggestions are welcome.
while ( my ($vid, $tvid) = $sth1->fetchrow_array() ) {
    $tvid =~ tr/ \`\~\!\@\#\$\%\^\&\*\(\)\-\=\+\[\{\]\}\|\;\:\'\"\,\<\.\>\?\\\//_/d;
    $sth2->execute($tvid, $vid) or die $DBI::errstr;
}

Re: All special character substitution and/or drop from column names.
by Fletch (Bishop) on Feb 24, 2005 at 13:47 UTC

    Perhaps you want the /s flag on your tr/// to squash repeated transliterated characters? See perldoc perlop for more details.

    Update: And perhaps tr/A-Za-z/_/cs might accomplish what you're trying to do a bit more readably (not exactly the same as the original but probably close enough for this application).

      Thanks, I was just trying some of the other tr options, as I have mostly used s/// for this sort of thing.
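
As a rough sketch of the two suggestions above, the following compares /d alone, /d with /s, and the complemented form with /cs on the sample title that comes up later in the thread. Adding 0-9 to the kept range is an assumption made here so that digits in titles survive; it is not part of the original suggestion.

```perl
use strict;
use warnings;

# Sample title quoted later in the thread.
my $title = "Name & Something Else";

# /d alone: the space maps to "_" and "&" is deleted,
# which leaves two adjacent underscores.
(my $deleted = $title) =~ tr/ &/_/d;

# /d plus /s: same mapping, but a run of translated "_" is squeezed to one.
(my $squashed = $title) =~ tr/ &/_/ds;

# /c plus /s: every character NOT in the list becomes "_", runs squeezed.
# The 0-9 range is an assumption added for this sketch.
(my $complement = $title) =~ tr/A-Za-z0-9/_/cs;

print "$deleted\n";     # Name__Something_Else
print "$squashed\n";    # Name_Something_Else
print "$complement\n";  # Name_Something_Else
```

Note that /s only squeezes characters produced by the transliteration, so underscores already present in a title are left alone unless "_" is itself in the search list.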
Re: All special character substitution and/or drop from column names.
by chb (Deacon) on Feb 24, 2005 at 13:54 UTC
    You could also specify the characters you want to keep and replace everything else; this makes sure you don't miss any special characters you haven't thought of (yet). Untested:
    $tvid =~ s/[^a-zA-Z0-9_-]/_/g;

    Update: Fletch's solution using tr///cs would be more efficient.
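
A minimal sketch of the whitelist approach, for comparison. The helper names whitelist_per_char and whitelist_per_run are hypothetical, and the + quantifier in the second one is a variation introduced here (not in the post above) so that each run of unwanted characters collapses to a single underscore.

```perl
use strict;
use warnings;

# Hypothetical helper: the substitution as posted, one "_" per unwanted character.
sub whitelist_per_char {
    my ($s) = @_;
    $s =~ s/[^a-zA-Z0-9_-]/_/g;
    return $s;
}

# Hypothetical helper: same character class with "+", one "_" per run.
sub whitelist_per_run {
    my ($s) = @_;
    $s =~ s/[^a-zA-Z0-9_-]+/_/g;
    return $s;
}

my $title = "Name & Something Else";
print whitelist_per_char($title), "\n";   # Name___Something_Else
print whitelist_per_run($title),  "\n";   # Name_Something_Else
```

Because "_" and "-" are in the kept set, underscores and hyphens already present in a title pass through unchanged in both versions.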
Re: All special character substitution and/or drop from column names.
by dave_the_m (Monsignor) on Feb 24, 2005 at 14:14 UTC
    You're getting the occasional underscore because you include one in the replacement chars. Try changing tr/.../_/d to tr/...//d

    Dave.

      The _ was to replace the spaces. However, with "Name & Something Else" I ended up with "Name__Something_Else".
        Ah sorry, I didn't notice the double underscore. In that case, as other people have pointed out, the /s option is the way to go:
        $ perl -le'$_="Name & Something Else"; tr/ &/_/sd; print'
        Name_Something_Else

        Dave.
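
Putting the /s advice together with the character list from the question might look like the sketch below. The clean_title helper and the in-memory @rows array are hypothetical stand-ins; in the real script the rows would come from $sth1->fetchrow_array() and be written back with $sth2->execute().

```perl
use strict;
use warnings;

# Hypothetical helper: the question's character list with /s added to /d.
# Spaces become "_", every other listed character is deleted, and runs of
# the resulting "_" are squeezed to one.
sub clean_title {
    my ($tvid) = @_;
    $tvid =~ tr/ \`\~\!\@\#\$\%\^\&\*\(\)\-\=\+\[\{\]\}\|\;\:\'\"\,\<\.\>\?\\\//_/ds;
    return $tvid;
}

# Stand-in for the database rows: [ id, title ] pairs (made-up sample data).
my @rows = (
    [ 1, "Name & Something Else" ],
    [ 2, "What?! (Live)" ],
);

for my $row (@rows) {
    my ($vid, $tvid) = @$row;
    print "$vid: ", clean_title($tvid), "\n";
}
```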