Re^5: Help please
by graff (Chancellor) on Jun 29, 2005 at 21:08 UTC
    (sigh)

    There can be cases in your data where a single line contains multiple matches (that is, "it" followed by stuff followed by end-of-sentence punctuation could occur more than once on one line) -- this would certainly be true if your text file contains no line-breaks anywhere in the middle of the long text string.

    That's why the most recent code I suggested in a previous reply went like this:

    while (<>) { while ( /\bit (.*?)[.?!]/ig ) { print "\n$1\n"; } }

    (apologies to others for repeating that; but it seems like everyone else has already abandoned this thread anyway)

    Note the second "while" loop, and the "g" flag on the regex match. This is a way of looking for the same pattern repeatedly in a single string value, and performing the same operations (inside the loop) on every match. Also note the "?" that follows the ".*" inside the parens -- this modifier makes the quantifier "non-greedy", so each match stops at the first end-of-sentence punctuation it finds instead of running on to the last one on the line, which is very important here.
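    To make the difference concrete, here is a minimal sketch that runs the same pattern with and without the "?" on one string. The sample sentence and the variable names ($line, @lazy, @greedy) are made up for illustration; they are not from your data.

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample: a single line holding two "it ... <punctuation>" sentences.
    my $line = "I think it works well. Then it stopped again!";

    # Non-greedy (.*?): each match ends at the first [.?!] after "it ".
    my @lazy;
    while ( $line =~ /\bit (.*?)[.?!]/ig ) {
        push @lazy, $1;
    }

    # Greedy (.*): the match runs on to the last [.?!] on the line.
    my @greedy;
    while ( $line =~ /\bit (.*)[.?!]/ig ) {
        push @greedy, $1;
    }

    print "non-greedy: $_\n" for @lazy;
    print "greedy:     $_\n" for @greedy;
    ```

    With that sample line, the non-greedy loop collects two separate captures ("works well" and "stopped again"), while the greedy loop collects a single capture that swallows both sentences.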

    I actually tested that one-liner myself, using your sample text and the sort of command line that you reported using, and it most certainly does work -- it produced the following output:

    manually till April 1996 was happily feeding modules through to the CPAN archive sites made sense for the module listing part of the Module List to be built from that database

    If it doesn't work for you this time, then you have to start looking at some non-perl issues, like:

    Are you running this inside a command-line shell window? Because if you're on an MS Windows system, and you type that command line into the "Run..." type-in box from the "Start" menu, then you probably won't get to see anything -- you have to start up an "MS-DOS Prompt" window, and run that command at the DOS prompt in that window.