djbryson has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to remove a line break that occurs within a meta tag in multiple documents. This code runs, but doesn't actually work. The regex matches, but the chomp doesn't remove the line break. And now that I think about it, if this did work it would remove all line breaks in the file. Should I be doing this line by line instead of slurping?
    #!C:\Perl\bin\perl.exe
    use strict;
    use File::Find;
    use File::Slurp;
    use Time::Local;

    print "\nRunning find-slurp_search-within.pl... \n\n";
    my $root = "C:/";
    print "changing to $root\n";
    chdir $root;
    my $no_switch=0;
    my $no_files=0;
    my @log = ();
    my $dir;
    # find (\&Wanted, "department", "managers", "mybranch", "mycity", "myinfo", "resources");
    find (\&Wanted, "working"); #directories - comman delimited

    sub Wanted {
        print "*Processing: $root$File::Find::name \n";
        if ($_ =~ /\.htm(l)?$/i) {
            my $file_slurp = read_file("$root$File::Find::name");
            if ($file_slurp =~ m/(<meta[\s\r\n\t]+name="revision"[\s\r\n\t]+content=[\s\r\n\t]+)/i) {
                chomp($file_slurp); # remove line break
                open(FILE, ">$root$File::Find::name") or die "ERROR: Can't open $root$File::Find::name";
                print FILE "$file_slurp";
                close (FILE);
                print "file overwritten\n";
                push @log,"$root$File::Find::name \n\n";
                $no_switch++;
            }
            $no_files++;
        } #end if matches filetype
        else { print "file type not processed\n\n"; } #add $_ if you want to see URL of file not processed.
    } # end sub

Re: remove line break from 1 line
by liverpole (Monsignor) on Jan 19, 2007 at 18:48 UTC
    Hi djbryson,

        Should I be doing this line by line instead of slurp?

    Yes, I would do it line by line.  For one thing, your chomp statement is only being applied to the single string that you slurped, so it removes at most the one trailing newline, rather than being applied to a list of lines.

    For another, your regular expression is only being applied once, rather than to all of the lines individually.

    But finally, if you want to debug what lines you're reading, and what they look like after any modifications, it's a LOT easier when you read them into an array.
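A minimal sketch of the first point, assuming a made-up sample string in place of a real slurped file: chomp on one big string takes off only the final newline and leaves the line breaks inside the text alone.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the contents of a slurped HTML file.
my $slurped = "<meta name=\"revision\"\ncontent=\n\"1.2\">\n<p>text</p>\n";

my $before  = () = $slurped =~ /\n/g;   # number of newlines before chomp
my $removed = chomp($slurped);          # strips at most one trailing $/
my $after   = () = $slurped =~ /\n/g;   # number of newlines after chomp

print "newlines before: $before, removed: $removed, after: $after\n";
# prints "newlines before: 4, removed: 1, after: 3"
```

The line breaks inside the meta tag are still there afterwards, which matches the behaviour described in the question.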


    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
      OK, I've switched to line by line... but it's still not removing the line breaks.
      #!C:\Perl\bin\perl.exe
      use strict;
      use File::Find;
      use File::Slurp;
      use Time::Local;

      print "\nRunning find-slurp_search-within.pl... \n\n";
      my $root = "C:/"; # use forward slash, you can use mapped drives.
      print "changing to $root\n";
      chdir $root;
      my $no_switches=0;
      my $no_files=0;
      my @log = ();
      my $dir;
      # find (\&Wanted, "department", "managers", "mybranch", "mycity", "myinfo", "resources");
      find (\&Wanted, "working"); #directories - comman delimited

      sub Wanted {
          print "*Processing: $root$File::Find::name \n";
          if ($_ =~ /\.htm(l)?$/i) {
              open(xFILE,$_) or die "ERROR: couldn't open file";
              my @file = <xFILE>;
              my $line;
              foreach $line (@file) {
                  if ($line =~ m/(<meta[\s\r\n\t]+name="revision"[\s\r\n\t]+content=$)/im) {
                      print $line;
                      chomp($line);
                      close (xFILE);
                      $no_switches++;
                  } # end if
              }
              $no_files++;
          } #end if matches filetype
          else { print "file type not processed\n\n"; } #add $_ if you want to see URL of file not processed.
      } # end sub

      # add timestamp, # files, # matches to log
      open(LOG, ">H:/Web/perl/log.txt") or die "ERROR: Can't open log.txt";
      my $timestamp = localtime();
      print LOG "$no_files files - $no_switches matches - $timestamp\n";
      foreach(@log) { print LOG; }
      close (LOG);
      print "log file updated. $no_files files - $no_switches matches";
        Well, one thing that's immediately obvious is that you are only chomping if the regular expression matches.

        Why not just chomp all lines on input? ...

        chomp(my @file = <xFILE>);

        Another useful idiom allows you to combine my with your foreach statement:

        # No need to do
        #     my $line;
        #     foreach $line (@file) {
        #
        foreach my $line (@file) {

        So what do you get when you put a print of the lines after they're input?
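One possible line-by-line sketch, assuming the break always falls right after content= (as the second regex in the question suggests). The helper name join_broken_meta and the sample lines are made up for illustration; only the matching line loses its line ending, so the other line breaks survive when the result is written back.

```perl
use strict;
use warnings;

# Hypothetical helper: takes lines with their newlines intact and removes
# the line ending only from a line that stops right after content=
sub join_broken_meta {
    my @lines = @_;                       # work on a copy of the input
    for my $line (@lines) {
        if ($line =~ /<meta\s+name="revision"\s+content=\s*$/i) {
            $line =~ s/\s+\z//;           # drop trailing CR/LF and padding
        }
    }
    return join '', @lines;               # the next line now follows directly
}

# Made-up sample input.
my @sample = (
    "<head>\n",
    "<meta name=\"revision\" content=\n",
    "\"1.2\">\n",
    "</head>\n",
);

print join_broken_meta(@sample);
```

To actually change a file, the returned string would still have to be printed to a filehandle opened for writing; chomping a line in memory does not touch the file on disk.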


Re: remove line break from 1 line
by jettero (Monsignor) on Jan 19, 2007 at 18:47 UTC

    I'm pretty sure chomp only works on a single line...

    I'm definitely sure that reading the entire file into a single variable isn't what you want (particularly if you want to chomp). I would read it line by line and while(<$in>) { chomp } if I were you.

    But if you must slurp the entire file into a var (which is fine when the files are small), then I would probably choose something like  $entire_file =~ s/[\r\n]//sg.
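Note that a blanket s/[\r\n]//g strips every line break in the file. If only the revision meta tag should be touched, a narrower substitution on the slurped text is one option. The following is a rough sketch under that assumption; the helper names unbreak_revision_meta and tidy_tag and the sample text are hypothetical, and the pattern assumes the tag contains no '>' before its closing bracket.

```perl
use strict;
use warnings;

# Hypothetical helper: normalise whitespace inside a single tag.
sub tidy_tag {
    my ($tag) = @_;
    $tag =~ s/\s*[\r\n]+\s*/ /g;   # each line break (plus padding) becomes one space
    $tag =~ s/=\s+/=/g;            # drop whitespace left after an '='
    return $tag;
}

# Hypothetical helper: rewrite only the revision meta tag in slurped text.
sub unbreak_revision_meta {
    my ($html) = @_;
    $html =~ s{(<meta\s+name="revision"[^>]*>)}{ tidy_tag($1) }gie;
    return $html;
}

# Made-up sample standing in for a slurped file.
my $sample = "<html>\n<meta name=\"revision\"\ncontent=\n\"1.2\">\n<p>a\nb</p>\n";
print unbreak_revision_meta($sample);
```

The line break inside the paragraph is left alone because the outer pattern only captures the meta tag itself.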

    -Paul

      Actually... excerpted from perldoc -f chomp

      If you chomp a list, each element is chomped, and the total number of characters removed is returned.

      So, something like this will remove the trailing input record separator ($/, a newline by default) from every element:

      my @input = <STDIN>; chomp( @input );

      Now everything entered on STDIN separated by a newline has been chomped.

      Just be wary of precedence... Also from perldoc -f chomp:

      Note that parentheses are necessary when you're chomping anything that is not a simple variable. This is because "chomp $cwd = `pwd`;" is interpreted as "(chomp $cwd) = `pwd`;", rather than as "chomp( $cwd = `pwd` )" which you might expect. Similarly, "chomp $a, $b" is interpreted as "chomp($a), $b" rather than as "chomp($a, $b)".
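A small sketch of both points from the perldoc excerpt, using made-up values instead of STDIN or backticks so it runs anywhere: the return value of chomp on a list, and the parenthesised form for chomping an assignment.

```perl
use strict;
use warnings;

# Made-up input; the last element has no trailing newline.
my @input = ("one\n", "two\n", "three");
my $removed = chomp(@input);     # total characters removed across the list
print "removed $removed\n";      # prints "removed 2"

# With parentheses the assignment happens first, then the result is chomped.
chomp(my $line = "hello\n");
print "[$line]\n";               # prints "[hello]"
```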


      --chargrill
      s**lil*; $*=join'',sort split q**; s;.*;grr; &&s+(.(.)).+$2$1+; $; = qq-$_-;s,.*,ahc,;$,.=chop for split q,,,reverse;print for($,,$;,$*,$/)