Replacing characters in files

raj8 has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Replacing characters in files by tachyon (Chancellor) on Sep 03, 2003 at 04:08 UTC
Rename the files?

```perl
my @files = glob("*");
for my $filename (@files) {
    # Replace every character that is not a letter, digit or dot.
    (my $newname = $filename) =~ s/[^A-Za-z0-9.]/_/g;
    next if $newname eq $filename;    # nothing to change
    if ( -e $newname ) {
        warn "$newname already exists, skipping rename on $filename\n";
    }
    else {
        rename $filename, $newname
            or warn "Could not rename $filename to $newname: $!\n";
    }
}
```

If you want to change what is actually in the files, you can use an in-place edit with a suitable regex: `perl -pi.bak -e 's/[^\w.\t \n]//g' <files>`

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
Re: Replacing characters in files by Abigail-II (Bishop) on Sep 03, 2003 at 08:31 UTC
I hope I understood your problem right. I think your problem is that `$command` may contain characters that are special to the shell, and they cause the open to fail. Answers to these problems can be found in `man perlipc` and `perldoc -f exec`.

What you need is a "safe pipe open": opening a pipe without involving the shell. The way to do this is to fork your process, with a pipe from the child to the parent, and then exec `$command` in the child. Forking a child and opening a pipe between them can be done in a single Perl statement: `my $pid = open my $kid => "-|";`

This forks the program, returning the child's PID in the parent and 0 in the child, while opening a pipe from the child to the parent (the child's STDOUT feeds `$kid` in the parent). If the fork fails, `$pid` is undefined.

The next tricky thing is the `exec`. If we did a simple `exec $command`, Perl would call the shell if `$command` contained special characters, and that is what we are trying to avoid. If the command had arguments, we could supply `exec` with a list (of more than one element) and `exec` would avoid calling the shell, but we don't have that option. There is another way to make `exec` avoid the shell, and that is by giving it a block as first argument. The block names the program to execute, and the first element of the list is the name that program is told it was called by, so we can supply the command for both. With the command stored in `$file`, this gives us: `exec {$file} $file or die "exec() failed: $!\n";`

A complete program that does a safe pipe open:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $file = '....';   # Command with special characters.

my $pid = open my $kid => "-|";
die "fork() failed: $!\n" unless defined $pid;

unless ($pid) {
    # Child: STDOUT is the pipe; run the command without a shell.
    exec {$file} $file or die "exec() failed: $!\n";
}

# Parent: read the command's output.
while (<$kid>) {
    print;
}

__END__
```

Abigail
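For the case mentioned above where the command does have arguments, the list form of a pipe open (available since Perl 5.8) performs the fork and exec in one call and also bypasses the shell. The following is only a sketch under assumptions: `read_command` is a hypothetical helper name, and the child is simply the running perl (`$^X`) echoing back one argument, to show that shell metacharacters arrive untouched. Note that the shell is only avoided when at least one argument follows the program name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): run a program with arguments
# and return its output lines, without involving a shell.
sub read_command {
    my ($program, @args) = @_;

    # List-form pipe open: with a mode, a program and at least one argument,
    # Perl forks and execs the program directly.
    open(my $kid, '-|', $program, @args)
        or die "Cannot run $program: $!\n";
    my @lines = <$kid>;
    close($kid)
        or warn "$program exited with status $?\n";
    return @lines;
}

# An argument full of shell metacharacters, passed through verbatim.
my $tricky = q{odd; name with $pecial 'chars' & more};

# $^X is the path of the running perl; the child just echoes its argument.
my @output = read_command($^X, '-e', 'print "$ARGV[0]\n"', $tricky);
print @output;
```

The explicit fork-and-exec version remains the one to use when the whole command is a single string naming a program, as in the original question.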
Re: Replacing characters in files by esh (Pilgrim) on Sep 03, 2003 at 05:04 UTC
I don't think you've provided quite enough information to get a complete answer. Two aspects of your question seem vague to me:

1. You provide some samples of "special characters" but end the list with "etc..." Knowing exactly what you consider to be "acceptable characters" and what you consider to be "special characters" may change the answer a bit.
2. You say you want to "strip special characters" without "damaging the file information". I would need to know what the resulting output is going to be used for in order to determine whether the information is damaged in the stripping process.

On the second point, it may help to provide both a description of what the information is going to be used for and some examples of your input and desired output. If deleting the special characters damages the information, you may want to encode or escape them instead, but the way to do this is highly context dependent.

Without additional information, all I can offer is to add one line (the `tr///` line) to your sample code:

```perl
open(INFILE, "$command |");
print "$command -report\n";
while (<INFILE>) {
    $files = @f[15];
    # Delete special characters like ; ' $ ^
    $files =~ tr/;'$^//d;
    print OUTFILE "$SQL_insert ('$files');\n";
    ...
}
```

-- Eric Hammond
Re: Replacing characters in files by bart (Canon) on Sep 03, 2003 at 10:08 UTC
You probably don't want to strip them but escape them, because stripping them does damage the information. Whatever you do, do it by modifying `$file` before inserting it into that string. You can escape apostrophes (') and backslashes (\) this way: `$file =~ s/([\\'])/\\$1/g;`

If you do want to strip some characters, like semicolons and double quotes, the most straightforward way is: `$file =~ tr/";//d;`

But you may well want a smarter way of processing the data, using some clever `s///` trick. However, I have no idea what would be a generally acceptable format for all cases.

p.s. Please don't use `@f[15]`; use `$f[15]` instead. If you ran this script with warnings enabled, you'd get a warning about it. Perhaps it works, but my rule of thumb is that you should use the "@" syntax only in list context, and here it is used in scalar context. Perl takes a stricter view and never likes array slices (because that's what you used) of just one item, so that's where we disagree. :)
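To make the difference between the two approaches concrete, here is a small sketch that applies both expressions from the post above to the same value. The sample string is made up for illustration; it is not taken from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample containing an apostrophe, double quotes, a semicolon
# and a backslash.
my $file = q{it's a "test"; c:\tmp};

# Escaping keeps every character and prefixes ' and \ with a backslash.
(my $escaped = $file) =~ s/([\\'])/\\$1/g;

# Stripping deletes " and ; outright, so that information is lost.
(my $stripped = $file) =~ tr/";//d;

print "original: $file\n";
print "escaped:  $escaped\n";
print "stripped: $stripped\n";
```

The escaped form can be turned back into the original by removing the added backslashes, whereas nothing in the stripped form records where the deleted characters were.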
Re: Replacing characters in files by williamp (Pilgrim) on Sep 03, 2003 at 05:58 UTC
If the files are from a non-*nix platform, blank spaces will probably be a problem as well: `$file =~ s/ /_/g;`
Re: Re: Replacing characters in files by TomDLux (Vicar) on Sep 03, 2003 at 08:26 UTC
If the files are from a *nix platform, blank spaces will still be a problem. -- `TTTATCGGTCGTTATATAGATGTTTGCA`
Re: Replacing characters in files by Hutta (Scribe) on Sep 03, 2003 at 12:50 UTC
You should also look into using the DBI module to communicate with the database, which would avoid having to do any shell escapes on your data at all. You'd still have to quote database-special characters, but the DBI module provides a `quote()` method that handles it for you.
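As a rough sketch of what that could look like: the example below assumes the `DBD::SQLite` driver is installed and uses an in-memory database so that it is self-contained. The table name, column name and sample value are invented for illustration. Placeholders are not mentioned in the post above, but they are the usual DBI idiom and are shown next to `quote()` for comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available. The schema is made up.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });
$dbh->do('CREATE TABLE files (name TEXT)');

# Made-up value with characters that are awkward in hand-built SQL.
my $name = q{it's; a "tricky" $name};

# Placeholders: the value is passed separately from the SQL text,
# so no manual escaping is needed.
my $sth = $dbh->prepare('INSERT INTO files (name) VALUES (?)');
$sth->execute($name);

# quote() is for cases where a literal has to be embedded in the SQL text.
my $literal = $dbh->quote($name);
print "quoted literal: $literal\n";

# Read the row back to confirm the value was stored unchanged.
my ($stored) = $dbh->selectrow_array('SELECT name FROM files');
print "stored value:   $stored\n";

$dbh->disconnect;
```

Either way, the data reaches the database without passing through a shell or a hand-written escaping routine.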