http://qs1969.pair.com?node_id=540357

droog114 has asked for the wisdom of the Perl Monks concerning the following question:

Ok, got an easy one. I want to filter out all lines in a text file that contain any number of #'s or *'s. I just can't get it to work. Anyone want to provide any tips for me?

Re: Simple regexp question
by Tanktalus (Canon) on Mar 31, 2006 at 03:42 UTC

    "I just can't get it to work" implies there is something you've tried. What is it?

    /[#*]/ seems like the right idea to me, but perhaps that isn't fitting into the code you've got, so more context would help us help you.

      Hi, sorry for the lack of info, I am new at this. Anyway, my text file contains lines like the following, of varying lengths:
        # ****************************
      I basically just want to get rid of all #'s and *'s in the file.
        perl -pe "s/[#*]//g" infile > outfile
        or in-place:
        perl -i.bak -pe "s/[#*]//g" filename
        If you want to delete the entire line, use -n with an explicit print (under -p every line is printed, even after next):
        perl -ne "print unless /[#*]/" infile > outfile
        or in-place:
        perl -i.bak -ne "print unless /[#*]/" filename
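For anyone who would rather have a script than a one-liner, here is a minimal sketch of both operations as plain subroutines. The names strip_marks and drop_marked_lines are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Remove every '#' and '*' character, keeping the rest of each line.
sub strip_marks {
    my @lines = @_;          # work on copies, not on the caller's data
    s/[#*]//g for @lines;
    return @lines;
}

# Drop any line that contains at least one '#' or '*'.
sub drop_marked_lines {
    return grep { !/[#*]/ } @_;
}

my @sample = ("# ****\n", "keep me\n", "a*b\n");
print strip_marks(@sample);
print "---\n";
print drop_marked_lines(@sample);
```

The first sub corresponds to the s/[#*]//g one-liner, the second to deleting whole lines.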
Re: Simple regexp question
by Samy_rio (Vicar) on Mar 31, 2006 at 04:17 UTC

    Hi droog114, Try this,

    TIMTOWTDI

    use strict;
    use warnings;
    open(FILE, shift) || die($!);
    while(<FILE>){
        my $hah = $_ =~ s/\#/$&/g;
        my $ast = $_ =~ s/\*/$&/g;
        print '*' x 70, "\nLine No. $.\n",'*' x 70, "\n";
        $hah = 0 if (!($hah));
        $ast = 0 if (!($ast));
        print "Number of #'s\t$hah\n";
        print "Number of *'s\t$ast\n";
    }
    close(FILE);

    You should also read How (Not) To Ask A Question.

    Regards,
    Velusamy R.


    eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';



      TIMTOWTDI
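      I'd change my $ast = $_ =~ s/\*/$&/g; to my $ast = tr/*//; and do the same for the $hah variable. Or, to make it explicit that the default variable is left intact, my $ast = tr/*/*/; (both forms only count, and both return 0 rather than an empty string when nothing matches). To each their own.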

      Of course, the code doesn't "filter out lines with *'s and #'s" -- it prints a line of asterisks with the line number, and then prints the number of asterisks and number of hashes. I take it you simply intended to give the OP some code to work on without giving everything, while pointing out the question should have indicated some effort on his part.
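To make the tr suggestion concrete, here is a minimal sketch of the counting step on its own. The sub name count_marks and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Count '#' and '*' in a string without modifying it.
# With an empty replacement list and no /d flag, tr/// only counts.
sub count_marks {
    my ($line) = @_;
    my $hashes = ($line =~ tr/#//);
    my $stars  = ($line =~ tr/*//);
    return ($hashes, $stars);
}

my ($hah, $ast) = count_marks("# ***** text\n");
print "Number of #'s\t$hah\n";
print "Number of *'s\t$ast\n";
```

Because tr returns a plain 0 when nothing matches, the two "$hah = 0 if ..." style fix-ups are no longer needed.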
Re: Simple regexp question
by l.frankline (Hermit) on Mar 31, 2006 at 04:11 UTC

    Hi,

    I couldn't quite understand what you expect. If my guess is right, then try the following:

    open INFILE,"$filename" || die "cant open the input file";
    while (<INFILE>) {
        push (@filtered,$_) if (/[\#\*]+/);
    }
    print "@filtered";

    Regards
    Franklin

    Don't put off till tomorrow, what you can do today.

      • "$filename"
        doesn't need to be in quotes.
        $filename
        will do fine.

      • open INFILE, '<', $filename
        is less error-prone and safer than the two-arg version.

      • It's good to use local variables for file handles. Replace
        open INFILE, '<', $filename
        with
        open local *INFILE, '<', $filename
        or
        open my $fh_in, '<', $filename

      • open local *INFILE, '<', $filename || die "cant open the input file";
        is the same as
        open local *INFILE, '<', ($filename || die "cant open the input file");
        which is definitely not what you want. Use
        open(local *INFILE, '<', $filename) || die("Unable to open the input file: $!\n");
        or
        open local *INFILE, '<', $filename or die "Unable to open the input file: $!\n";

      • We don't need to know how many # and * we have in a row, so
        if (/[\#\*]+/)
        can be simplified to
        if (/[\#\*]/)
        And since # and * are not special in a character class, you don't need to escape them. The following is sufficient.
        if (/[#*]/)
        Also, the parens are optional on an if statement modifier, and just add clutter in this case. I'd remove them as follows:
        if /[#*]/

      • I think "to filter out lines" means "to remove lines". If so, replace
        if /[#*]/
        with
        unless /[#*]/

      • Your print is adding spaces. Replace
        print "@filtered";
        with
        print @filtered;

      We get the following:

      open(local *INFILE, '<', $filename)
          or die("Unable to open the input file: $!\n");
      my @filtered;
      while (<INFILE>) {
          push(@filtered, $_) unless /[#*]/;
      }
      print @filtered;
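The precedence point about || versus or is easy to check for yourself. This is a minimal sketch, assuming a path that does not exist (built here inside a fresh temporary directory); the sub names are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Each sub returns 1 if the open attempt died, 0 if the failure went unnoticed.

sub died_with_pipes {
    my ($path) = @_;
    # Parsed as: open my $fh, '<', ($path || die ...)
    # $path is a true string, so die is never evaluated.
    my $ok = eval { open my $fh, '<', $path || die "Unable to open: $!\n"; 1 };
    return $ok ? 0 : 1;
}

sub died_with_or {
    my ($path) = @_;
    # 'or' binds more loosely than open's argument list, so it tests open's result.
    my $ok = eval { open my $fh, '<', $path or die "Unable to open: $!\n"; 1 };
    return $ok ? 0 : 1;
}

my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such-file.txt";   # hypothetical file, never created

print "|| version died: ", died_with_pipes($missing), "\n";
print "or version died: ", died_with_or($missing), "\n";
```

Only the or version notices the failed open; the || version carries on with an unopened handle.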

      What follows is a more elegant but more memory intensive alternative:

      open(local *INFILE, '<', $filename)
          or die("Unable to open the input file: $!\n");
      my @filtered = grep { !/[#*]/ } <INFILE>;
      print @filtered;
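If memory is a concern, the lines can also be printed as they are read instead of being collected first. This is a minimal sketch using a lexical file handle; the sub name filter_stream and the script name in the usage comment are made up for illustration.

```perl
use strict;
use warnings;

# Copy lines from $in to $out, skipping any line that contains '#' or '*'.
# Returns the number of lines written.
sub filter_stream {
    my ($in, $out) = @_;
    my $kept = 0;
    while (my $line = <$in>) {
        next if $line =~ /[#*]/;
        print {$out} $line;
        $kept++;
    }
    return $kept;
}

# Usage: perl filter.pl infile > outfile   (filter.pl is a hypothetical name)
if (@ARGV) {
    my $filename = shift @ARGV;
    open(my $fh_in, '<', $filename)
        or die("Unable to open the input file: $!\n");
    filter_stream($fh_in, \*STDOUT);
    close($fh_in);
}
```

Only one line is held in memory at a time, at the cost of losing the single-expression grep.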