in reply to search and extract lines which contain a word

On a Unix system, using grep:
grep 'gateway' infile > outfile
On a Unix system, using Perl:
perl -ne'print if /gateway/' infile > outfile
On a Windows system:
perl -ne"print if /gateway/" infile > outfile

An example using file handles:

# Usage: script.pl infile outfile
#    or
# Usage: perl script.pl infile outfile

use strict;
use warnings;

my ($qfn_in, $qfn_out) = @ARGV;

open(my $fh_in, '<', $qfn_in)
    or die("Unable to read file \"$qfn_in\": $!\n");
open(my $fh_out, '>', $qfn_out)
    or die("Unable to create file \"$qfn_out\": $!\n");

while (<$fh_in>) {
    if (/gateway/) {
        print $fh_out $_;
    }
}

Update: Fixed copy & paste error mentioned in reply.
Update: Must have been asleep! Fixed missing my mentioned in reply.

Replies are listed 'Best First'.
Re^2: search and extract lines which contain a word
by cdarke (Prior) on Mar 26, 2008 at 12:36 UTC
    Slight typo in the file handles code:
    open($fh_out, '<', $qfn_out)
    Should read:
    open($fh_out, '>', $qfn_out)
Re^2: search and extract lines which contain a word
by rudder (Scribe) on Mar 26, 2008 at 14:52 UTC

    Note, when using strict, both $fh_in and $fh_out must have "my" in front of them in the calls to open.

      I would argue that my should be there either way, but that Perl only detects the error when use strict; is in effect.

      But that's debatable. Anyway, thanks. Fixed.
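A minimal sketch of the point about my and use strict, assuming the open() call is compiled through a string eval so that the failure can be caught instead of aborting the script. The file name out.txt is only a placeholder; because the snippet fails to compile, open() never runs and no file is created.

```perl
use strict;
use warnings;

# The open() call from the reply above, without "my" on the filehandle.
my $snippet = q{ open($fh_out, '>', 'out.txt'); };

# String eval compiles the snippet at run time. Under strict the undeclared
# $fh_out is a compile-time error, so eval returns undef and sets $@.
my $ok    = eval "use strict; $snippet 1;";
my $error = $@;

print "compiled: ", ($ok ? "yes" : "no"), "\n";
print "error: $error" unless $ok;
```

Changing the snippet to open(my $fh_out, ...) lets it compile, which is the fix applied in the update above.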

Re^2: search and extract lines which contain a word
by m@cky (Initiate) on Mar 27, 2008 at 06:38 UTC
    Hi,

        perl -ne "print if /gateway/" infile > outfile

    This method works great! But how can I search for multiple strings? E.g. I want to search for lines containing the strings 'gateway' and 'TIMESTAMP' and extract the entire line if found. Please advise, thank you!
      perl -ne "print if /gateway/ && /TIMESTAMP/" infile > outfile
      Hi Experts,

      Forget my last question. I've managed to get it done, though I'm not sure I did it 'intelligently'. =P

      Can anyone please advise on this instead? Based on the current code below, how do I EXCLUDE lines that contain any of the names defined in the array, even when the strings 'gateway' and 'timestamp' are found in that line? E.g.

      ## To exclude extracting lines with username 'andrew' ##
      dcndksckdsnckdc gateway jdscjscjsdh timestamp andrew dkcd
      print "Specify full directory of logfile.\n";
      $file = <STDIN>;
      chomp $file;
      print "\nIdentified file is $file\n";
      print "Is this correct? (y/n)\n";
      $confirm = <STDIN>;
      chomp $confirm;

      if ($confirm eq "y") {
          print "\nCommencing work on $file.\n";
          workings();
      }
      else {
          print "Terminating program...\n";
      }

      ## Subroutine to open & read logfile
      sub workings {
          print"In Subroutine workings now.\n";
          open (LOGFILE, $file);
          print "Opening logfile...\n";
          open (OUT, ">>", 'D:\Temp\test.txt') or die "$!";
          print "Opening output file...\n";
          print "Checking for string GATEWAY...\n";
          while (<LOGFILE>) {
              if (/gateway/) {
                  $count = $count + 1;
                  print OUT;
                  print "Extracting line ...\n";
              }
              elsif (/TIMESTAMP/) {
                  $count = $count + 1;
                  print OUT;
                  print "Extracting line ...\n";
              }
          }
          print "\nTotal of $count lines extracted.\n";
          close OUT;
          close LOGFILE;
      }

      ##Array to contain usernames
      @names = qw(mddinzam khairulz sawaikha caoyx jchew khamshae hartinib
                  teev xiaolh yettynm yussofyu leegm narayam karuppa doraira
                  edanor razaliha sambanr jwong lamks linwb rashid ongsp ooisc
                  mohdrosn);

        That prints out the lines containing either or both of "gateway" and "TIMESTAMP". Your if is equivalent to

        if (/gateway/ || /TIMESTAMP/) {
            $count = $count + 1;
            print OUT;
            print "Extracting line ...\n";
        }

        However, I was under the impression that you want only the lines containing both. In that case, your if would look like

        if (/gateway/ && /TIMESTAMP/) {
            $count = $count + 1;
            print OUT;
            print "Extracting line ...\n";
        }

        As for excluding the listed names, one way is to start by building a regex using one of the following two methods:

        my ($exclude_re) = map qr/$_/, join '|', map quotemeta, @names;

        or

        use Regexp::List qw( );
        my $exclude_re = Regexp::List->new()->list2re(@names);

        Then just make sure the line doesn't match that regexp:

        if (/gateway/ && /TIMESTAMP/ && !/$exclude_re/) {
            $count = $count + 1;
            print OUT;
            print "Extracting line ...\n";
        }
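The exclusion approach above can be sketched as a small self-contained script. This is only an illustration under a few assumptions: the three-name @names list and the sample lines are made up, wanted_line is a hypothetical helper name, and the \b word boundaries are an addition not in the reply (without them, "andrew" would also exclude a line that merely contains "andrews").

```perl
use strict;
use warnings;

# Hypothetical short list standing in for @names from the question.
my @names = qw(andrew rashid jwong);

# Join the escaped names into one alternation. The \b on each side is an
# assumption: it stops a name from matching inside a longer word.
my $alternation = join '|', map { quotemeta($_) } @names;
my $exclude_re  = qr/\b(?:$alternation)\b/;

# True when the line has both keywords and none of the excluded names.
sub wanted_line {
    my ($line) = @_;
    return $line =~ /gateway/
        && $line =~ /TIMESTAMP/
        && $line !~ $exclude_re;
}

# Made-up sample data in place of the log file.
my @sample = (
    "aaa gateway bbb TIMESTAMP andrew ccc\n",
    "aaa gateway bbb TIMESTAMP carol ccc\n",
    "aaa gateway bbb carol ccc\n",
);

print grep { wanted_line($_) } @sample;
```

Only the second sample line is printed: the first is excluded by name and the third lacks TIMESTAMP. Note that the matches are case-sensitive, so a lowercase 'timestamp' as in the question's example line would need the /i modifier.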
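      You can have multiple alternatives:

      print if /(gateway|TIMESTAMP)/

      Put parentheses around the collection of alternatives and a | between each alternative. Note that this matches lines containing either word, not only lines containing both.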
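To make the difference between alternation and the && form concrete, here is a minimal sketch that runs both tests over a few made-up lines. The sample data and the variable names are assumptions for illustration only.

```perl
use strict;
use warnings;

# Made-up sample lines covering each combination.
my @lines = (
    "gateway only\n",
    "TIMESTAMP only\n",
    "gateway and TIMESTAMP\n",
    "neither\n",
);

# Alternation: the line matches if it contains either word.
my @either = grep { /gateway|TIMESTAMP/ } @lines;

# Two separate matches joined with &&: the line must contain both words.
my @both = grep { /gateway/ && /TIMESTAMP/ } @lines;

printf "either: %d lines, both: %d lines\n", scalar @either, scalar @both;
```

With this data the script reports 3 lines for the alternation and 1 line for the && form.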