in reply to Grabbing the words

Have you tried anything so far? Please show us some code so we can judge the problem better.

Also, have you tried hacking up your algorithm through the use of $` and $'? They hold the part of the string to the left and to the right of a successful regex match, respectively. But they are pretty heavy performance-wise, and Perl doesn't set them up unless your code actually uses them. And if you use them even once, Perl will make them available for all regex matches in your code. *That's* pretty heavy....
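
For what it's worth, here is a minimal sketch of getting the same left and right context without touching $` or $'. It relies on the built-in @- and @+ arrays, which hold the start and end offsets of the last successful match. The helper names context_around and neighbour_words, the sample line, and the simplified regex are all made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text before and after the first match
# of $re in $text, using the match offsets instead of $` and $'.
sub context_around {
    my ($text, $re) = @_;
    return unless $text =~ $re;
    my $before = substr($text, 0, $-[0]);   # what $` would hold
    my $after  = substr($text, $+[0]);      # what $' would hold
    return ($before, $after);
}

# Hypothetical helper: the last word before the match and the first
# word after it (either may be undef if there is no such word).
sub neighbour_words {
    my ($text, $re) = @_;
    my ($before, $after) = context_around($text, $re) or return;
    my ($prev) = $before =~ /(\S+)\s*\z/;
    my ($next) = $after  =~ /\A\s*(\S+)/;
    return ($prev, $next);
}

# Made-up sample line, loosely modelled on the kind of report text
# discussed in this thread.
my $line = 'Other Expenses 0.26 TOTAL FUND OPERATING 1.25% EXPENSES Fund as a shareholder';
my ($prev, $next) = neighbour_words($line, qr/TOTAL\s+FUND\s+OPERATING/);
print "$prev | $next\n";
```

The two words returned could then be added to either end of the pattern, if growing the pattern one word at a time is really what is needed.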
Manav

Re^2: Grabbing the words
by agynr (Acolyte) on Mar 22, 2005 at 12:18 UTC
    I want to get the next word from the pattern. To increase the size of the pattern I have to get the next word from both the left and the right. Suppose in the beginning we have
    $pattern="(\s*TOTAL\s*FUND\s*OPERATING\s*)(\S.{0,21}?\S)(\s*\s*EXPENSES\s*F\s*)"
    When the pattern occurs too many times in the document I am searching, I have to increase the size of the search pattern. After increasing its size, the pattern becomes
    $pattern="(ther\s*Expenses\s*0\s*2600\s*\s*TOTAL\s*FUND\s*OPERATING\s*)(\S.{0,21}?\S)(\s*\s*EXPENSES\s*Fund\s*as\s*a\s*shareholder\s*in\s*underlying\s*fund\s*indirectly\s*bears\s*pro\s*rata\s*)"
    I hope this will help you get a better picture of the problem.

      I'm confused as to why you have to 'increase the size of the search' (by which I'm not sure whether you mean you're just changing the allowed range within the second set of capturing parentheses, or something else).

      Could you also give some sample input, and what you would like as the output? I think that would help me understand the problem (assuming the data that you're working with isn't confidential for some reason, which it might be if it's financial reports, judging by the headers).

      Update: okay, I should've looked at the two regexes more closely -- he's adding to the first and third capturing sets. (I still don't understand why that logic is being used, though.)