remove line break from 1 line

djbryson has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to remove a line break that occurs within a meta tag in multiple documents. This code runs, but doesn't actually work. The regex matches, but the chomp doesn't remove the line break. And now that I think about it, if this did work it would remove all line breaks in the file. Should I be doing this line by line instead of slurping?

#!C:\Perl\bin\perl.exe 

use strict;
use File::Find;
use File::Slurp;
use Time::Local;

print "\nRunning find-slurp_search-within.pl... \n\n";

my $root = "C:/"; 
print "changing to $root\n";
chdir $root;       
my $no_switch=0;
my $no_files=0;
my @log = ();
my $dir;

# find (\&Wanted, "department", "managers", "mybranch", "mycity", "myinfo", "resources"); 
find (\&Wanted, "working");  #directories - comma delimited
sub Wanted {
  print "*Processing: $root$File::Find::name \n";
   if ($_ =~ /\.htm(l)?$/i) {
     my $file_slurp = read_file("$root$File::Find::name");  
   if ($file_slurp =~ m/(<meta[\s\r\n\t]+name="revision"[\s\r\n\t]+content=[\s\r\n\t]+)/i) { 
        chomp($file_slurp);   # remove line break 
        open(FILE, ">$root$File::Find::name") or die "ERROR: Can't open $root$File::Find::name";
        print FILE "$file_slurp";        
    close (FILE);
    print "file overwritten\n";
        push @log,"$root$File::Find::name \n\n"; 
        $no_switch++;
        }
   $no_files++;
   } #end if matches filetype

  else { print "file type not processed\n\n"; }  #add $_ if you want to see URL of file not processed.
} # end sub

Replies are listed 'Best First'.
Re: remove line break from 1 line by liverpole (Monsignor) on Jan 19, 2007 at 18:48 UTC
Hi djbryson,

"Should I be doing this line by line instead of slurp?"

Yes, I would do it line by line. For one thing, your `chomp` statement is only being applied to the single string that you read in (the whole file), so it removes at most the newline at the very end, rather than being applied to a list of lines. For another, your regular expression is only being applied once, rather than to all of the lines individually. But finally, if you want to debug what lines you're reading, and what they look like after any modifications, it's a LOT easier when you read them into an array.

s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re^2: remove line break from 1 line by Anonymous Monk on Jan 19, 2007 at 18:58 UTC
OK, I've switched to line by line... but it's still not removing the line breaks.

#!C:\Perl\bin\perl.exe

use strict;
use File::Find;
use File::Slurp;
use Time::Local;

print "\nRunning find-slurp_search-within.pl... \n\n";

my $root = "C:/";   # use forward slash, you can use mapped drives.
print "changing to $root\n";
chdir $root;
my $no_switches=0;
my $no_files=0;
my @log = ();
my $dir;

# find (\&Wanted, "department", "managers", "mybranch", "mycity", "myinfo", "resources");
find (\&Wanted, "working");  #directories - comma delimited
sub Wanted {
  print "*Processing: $root$File::Find::name \n";
  if ($_ =~ /\.htm(l)?$/i) {
    open(xFILE,$_) or die "ERROR: couldn't open file";
    my @file = <xFILE>;
    my $line;
    foreach $line (@file) {
      if ($line =~ m/(<meta[\s\r\n\t]+name="revision"[\s\r\n\t]+content=$)/im) {
        print $line;
        chomp($line);
        close (xFILE);
        $no_switches++;
      } # end if
    }
    $no_files++;
  } #end if matches filetype
  else { print "file type not processed\n\n"; }  #add $_ if you want to see URL of file not processed.
} # end sub

# add timestamp, # files, # matches to log
open(LOG, ">H:/Web/perl/log.txt") or die "ERROR: Can't open log.txt";
my $timestamp = localtime();
print LOG "$no_files files - $no_switches matches - $timestamp\n";
foreach(@log) { print LOG; }
close (LOG);
print "log file updated. $no_files files - $no_switches matches";
Re^3: remove line break from 1 line by liverpole (Monsignor) on Jan 19, 2007 at 19:02 UTC
Well, one thing that's immediately obvious is that you are only `chomp`ing if the regular expression matches. Why not just chomp all lines on input? ...

    chomp(my @file = <xFILE>);

Another useful idiom allows you to combine `my` with your `foreach` statement:

    # No need to do
    #    my $line;
    #    foreach $line (@file) {
    #
    foreach my $line (@file) {

So what do you get when you put a `print` of the lines after they're input?

s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
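A minimal sketch of the two idioms above, for anyone who wants to try them without touching real files. The sample HTML string and the in-memory filehandle are assumptions made for illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for one of the HTML files.
my $html = qq{<head>\n<meta name="revision" content=\n"1.2">\n</head>\n};

# Open a read handle on the in-memory string so no file on disk is needed.
open(my $in, '<', \$html) or die "ERROR: couldn't open in-memory file: $!";

# Chomp every line as it is read; the parentheses around the assignment are required.
chomp(my @file = <$in>);
close($in);

# Declare the loop variable inside the foreach statement itself.
foreach my $line (@file) {
    print "[$line]\n";   # the brackets make any leftover line ending visible
}
```

Note that chomping the array only changes the copy in memory. The lines still have to be written back out (with newlines restored wherever they are wanted) before the file on disk changes.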
Re^4: remove line break from 1 line by djbryson (Beadle) on Jan 19, 2007 at 19:23 UTC
Re^4: ugh, need help by djbryson (Beadle) on Jan 19, 2007 at 19:30 UTC
Re^5: RESOLVED!!! by djbryson (Beadle) on Jan 19, 2007 at 20:02 UTC
Some notes below your chosen depth have not been shown here
Re: remove line break from 1 line by jettero (Monsignor) on Jan 19, 2007 at 18:47 UTC
I'm pretty sure chomp only works on a single line... I'm definitely sure that reading the entire file into a single variable isn't what you want (particularly if you want to chomp). I would read it line by line and `while(<$in>) { chomp }` if I were you. But if you must slurp the entire file into a var (which is fine when the files are small), then I would probably choose something like `$entire_file =~ s/[\r\n]//sg`.

-Paul
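The `s/[\r\n]//sg` substitution above strips every line break in the slurped file, which is the side effect the original question worried about. As a hedged alternative (not something posted in this thread), the sketch below anchors the substitution to the meta tag so only the break after `content=` is removed. The sample string and the variable names `$flat` and `$fixed` are made up for illustration, and it assumes the break always falls right after `content=`, as the regex in the question implies.

```perl
use strict;
use warnings;

# Hypothetical slurped file contents with a line break inside the meta tag.
my $entire_file = qq{<html>\n<head>\n<meta name="revision" content=\n"1.2">\n</head>\n</html>\n};

# The blanket version: removes every CR and LF, collapsing the file to one line.
(my $flat = $entire_file) =~ s/[\r\n]//sg;

# A narrower version: keep the tag text up to content= and drop only the
# whitespace (including \r\n or \n) that follows it.
(my $fixed = $entire_file) =~ s/(<meta\s+name="revision"\s+content=)\s+/$1/gi;

print $fixed;
```

Because `\s` already matches spaces, tabs, `\r` and `\n`, the longer `[\s\r\n\t]` class used in the question can be shortened to `\s` without changing what it matches.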
Re^2: remove line break from 1 line by chargrill (Parson) on Jan 20, 2007 at 06:11 UTC
Actually... excerpted from `perldoc -f chomp`:

    If you chomp a list, each element is chomped, and the total number of characters removed is returned.

So, something like this will remove the locale equivalent of a newline from every element:

    my @input = <STDIN>;
    chomp( @input );

Now everything entered on STDIN separated by a newline has been chomped. Just be wary of precedence... Also from `perldoc -f chomp`:

    Note that parentheses are necessary when you're chomping anything that is not a simple variable. This is because "chomp $cwd = `pwd`;" is interpreted as "(chomp $cwd) = `pwd`;", rather than as "chomp( $cwd = `pwd` )" which you might expect. Similarly, "chomp $a, $b" is interpreted as "chomp($a), $b" rather than as "chomp($a, $b)".

--chargrill

`s*lil; $=join'',sort split q; s;.;grr; &&s+(.(.)).+$2$1+; $; = qq-$_-;s,.,ahc,;$,.=chop for split q,,,reverse;print for($,,$;,$,$/)`
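A small sketch of the two points quoted from the documentation: the return value of a list chomp, and the parenthesized form for chomping an assignment. The sample data is invented, and a plain string stands in for the `pwd` backticks so the snippet does not depend on an external command.

```perl
use strict;
use warnings;

# Hypothetical input; the last element deliberately has no trailing newline.
my @input = ("alpha\n", "beta\n", "gamma");

# Chomping a list chomps each element and returns the total characters removed.
my $removed = chomp(@input);
print "removed $removed characters\n";

# Chomping the result of an assignment needs the parentheses.
my $raw = "C:/working\n";      # stands in for the output of an external command
chomp(my $cwd = $raw);
print "[$cwd]\n";
```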