nvdierdo has asked for the wisdom of the Perl Monks concerning the following question:

I want to match two strings that sit on different lines of a file. Newlines are encoded as a <cr><lf> pair (Windows).

My text file is myfile.txt:

..some random text..name="Bob"..some more random text containing new lines..
..some random text..id="437"..yet some more text

I tried the following, which didn't work:

perl -ne "print \"$1\n$2\n\" while /name=\"(.*)\".*id=\"(.*)\"/gs" myfile.txt

What is wrong with this?

Re: how to deal with newline
by Corion (Patriarch) on Oct 27, 2013 at 15:27 UTC

    -n reads the file line by line. Have you looked at what $_ contains for each of your checks?
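
To make the point concrete, here is a minimal sketch comparing what the regex sees line by line with what it sees when the whole file is in one string. The sample text is modelled on the question, and the sub names match_per_line and match_slurped are made up for this illustration. The captures use [^"]* rather than .* so that each one stops at the closing quote.

```perl
use strict;
use warnings;

# Sample text modelled on the question, with Windows <cr><lf> line endings.
my $text = qq{..some random text..name="Bob"..some more random text\r\n}
         . qq{..some random text..id="437"..yet some more text\r\n};

# Line-by-line view (what -n provides): each line is tested on its own,
# so a pattern needing both name= and id= never sees them together.
sub match_per_line {
    my ($data) = @_;
    my @hits;
    for my $line (split /^/m, $data) {
        push @hits, [$1, $2] if $line =~ /name="([^"]*)".*id="([^"]*)"/s;
    }
    return @hits;
}

# Whole-file view: with /s the dot also matches newline characters,
# so the pattern can run from the name= line to the id= line.
sub match_slurped {
    my ($data) = @_;
    return $data =~ /name="([^"]*)".*?id="([^"]*)"/s ? ($1, $2) : ();
}

printf "per line: %d match(es)\n", scalar match_per_line($text);
my ($name, $id) = match_slurped($text);
print "slurped: name=$name id=$id\n";
```

On the command line, the corresponding switch is -0777, which makes -n read the whole file into $_ in one go. With the quoting style of the original question that would presumably be: perl -0777 -ne "print \"$1\n$2\n\" if /name=\"([^\"]*)\".*?id=\"([^\"]*)\"/s" myfile.txt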

Re: how to deal with newline
by NetWallah (Canon) on Oct 27, 2013 at 16:38 UTC
    Try this:
    perl -ne '$h{$1}=$2 for /(\w+)="([^"]*)"/g}{print qq($_\t= $h{$_}\n) for qw(id name)' myfile.txt
    id = 437
    name = Bob
    On Windows, the internal quotes will have to be escaped and the outer quotes changed to double quotes.
    But you already knew that....
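
One way to sidestep the shell quoting altogether is to put the code in a script file. The following is only a sketch of that idea, not the one-liner itself: the sub name collect_pairs and the in-memory sample lines are assumptions for illustration, and it uses a while loop with /g (instead of the for in the one-liner) so that several pairs on one line are each stored.

```perl
use strict;
use warnings;

# Collect every key="value" pair from the given lines into a hash.
# The while loop with /g visits each pair in turn, so $1 and $2
# always belong to the pair currently being stored.
sub collect_pairs {
    my @lines = @_;
    my %h;
    for my $line (@lines) {
        $h{$1} = $2 while $line =~ /(\w+)="([^"]*)"/g;
    }
    return %h;
}

# Stand-in for the contents of myfile.txt.
my @sample = (
    qq{..some random text..name="Bob"..some more random text\r\n},
    qq{..some random text..id="437"..yet some more text\r\n},
);
my %h = collect_pairs(@sample);

# This corresponds to the code after "}{" in the one-liner:
# it runs once, after all input has been processed.
print "$_\t= $h{$_}\n" for qw(id name);
```

To read a real file instead of the sample, collect_pairs(<>) would take the lines from the file named on the command line, e.g. perl script.pl myfile.txt (script.pl being a hypothetical file name).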

                 When in doubt, mumble; when in trouble, delegate; when in charge, ponder. -- James H. Boren

Re: how to deal with newline
by Laurent_R (Canon) on Oct 27, 2013 at 16:55 UTC

    Besides the line-by-line issue already identified, another problem with your regex is greedy matching. Suppose your line contains something like this:

    'some random text..name="Bob"..some more random text surname="Dylan" some more text'

    the name=\"(.*)\" part of your regex will match as much as it can between quotes, i.e.:

    Bob"..some more random text surname="Dylan

    You have to either make the quantifier non-greedy by adding the ? modifier, name=\"(.+?)\", or match only characters that are not quotes, name=\"([^"]+)\" (I also changed * to + because it does not seem to make much sense to match an empty string between quotes).
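
The difference between the three forms can be checked with a short script. This is a minimal sketch using the example line from above; the variable names are chosen for illustration only.

```perl
use strict;
use warnings;

my $line = 'some random text..name="Bob"..some more random text surname="Dylan" some more text';

# Greedy: .* runs to the end of the line, then backs off to the LAST quote.
my ($greedy)  = $line =~ /name="(.*)"/;

# Non-greedy: .+? stops at the first quote that lets the match succeed.
my ($lazy)    = $line =~ /name="(.+?)"/;

# Negated character class: the capture cannot contain a quote at all.
my ($negated) = $line =~ /name="([^"]+)"/;

print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
print "negated: $negated\n";
```

The negated class is usually the more robust of the two fixes, since it cannot cross a quote even when the rest of a longer pattern forces backtracking.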

Re: how to deal with newline
by Lennotoecom (Pilgrim) on Oct 27, 2013 at 20:02 UTC
    perl -ne 'print $1,"\n" foreach /(?:name|id)="(\w+)/' myfile.txt
    Update: on second thought,
    perl -ne 'print "$1\n" if /(?:name|id)="(\w+)/' myfile.txt
    would be enough.
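
A note on the syntax for "name or id": alternation between whole words needs a group such as (?:name|id), whereas square brackets define a character class, so [name|id] matches one single character out of n, a, m, e, |, i, d. The sketch below shows the difference on a made-up line; the size attribute and the \b word boundary are additions for this illustration.

```perl
use strict;
use warnings;

# Hypothetical input with an attribute that merely ends in one of the letters.
my $line = 'size="9" name="Bob"';

# Character class: any ONE of the listed characters followed by ="
# The final "e" of "size" qualifies, so the wrong value is captured.
my ($from_class) = $line =~ /[name|id]="(\w+)/;

# Alternation: the whole word name or id, starting at a word boundary.
my ($from_group) = $line =~ /\b(?:name|id)="(\w+)/;

print "character class captured: $from_class\n";
print "alternation captured:     $from_group\n";
```

On the file from the question both forms happen to give the same result, because "name" ends in e and "id" ends in d; the difference only shows up with other attributes on the line.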