I've tried out all the solutions suggested in this thread, but there's still one problem I'm struggling with. It turns out that, as well as ignoring the data inside comments, there are cases where I need to run the regexp against whatever is left of the record after the comments have been stripped, e.g.
asdfgh|kjkhg|poioiu|ytr|kkk|aaa /* vbfew */ kkkwwwqqqsss
In this case the comment is complete within the record but I still need to check the rest of the record (aaa kkkwwwqqqsss) for any regexp matches.
In fact this issue applies throughout the data sets I'm trying to deal with. It was my mistake not to explain this properly when asking for assistance, which by the way has been excellent.

Re^4: while loop logic
by BrowserUk (Patriarch) on Jan 16, 2006 at 10:29 UTC

    This seems to handle both of the scenarios you've outlined:

    #! perl -slw
    use strict;

    while( <DATA> ) {
        chomp;
        if( m[/\*] ) {
            $_ .= <DATA> until m[\*/];
            s[\s?/\* .+? \*/\s?][]smg;
        }
        print join '-', split '\|';
    }

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Sadly it doesn't. I'm not sure if things are complicated by the fact that the text field can contain asterisks that are not part of a comment, e.g.
      case1
      aaa|bbb|ccc|ddd|eee|fff * hhh /*xyzxyz*/ abc
      or case2
      aaa|bbb|ccc|ddd|eee|fff * hhh /*xyz
      sss|ddd|ggg|hhh|jjj|xyz*/ abc
      or case3
      aaa|bbb|ccc|ddd|eee|fff * hhh /*xyz
      jjj|kkk|lll|ppp|ooo|blah blah blah
      sss|ddd|ggg|hhh|jjj|xyz*/ abc
      Sometimes the comment block is in the same record; at other times it spans one or more records. So in these examples I would still need to check (fff * hhh and abc) to see if they match my pattern, because they're outside the comment block. Having had only limited Perl exposure, this is really taxing me.

        If you remove the spaces from the left hand side of the substitution, or add the /x modifier, it will deal with all of those also. Not sure why I omitted the /x.
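        To illustrate why the /x matters, here is a minimal sketch using the case1 record from above. The variable names ($record, $literal, $extended) are just for illustration. Without /x the spaces inside the pattern are literal characters, so a comment like /*xyzxyz*/ with no space after the opening delimiter is never matched.

```perl
use strict;
use warnings;

my $record = 'aaa|bbb|ccc|ddd|eee|fff * hhh /*xyzxyz*/ abc';

# Without /x the spaces in the pattern are literal, so the comment must
# have a space after "/*" and before "*/" for the pattern to match.
( my $literal = $record ) =~ s[\s?/\* .+? \*/\s?][]smg;

# With /x the whitespace in the pattern is ignored, so any comment matches.
( my $extended = $record ) =~ s[ \s? / \* .+? \* / \s? ][]smgx;

print "$literal\n";     # record unchanged, comment still present
print "$extended\n";    # comment stripped
```

        Note that the \s? on each side also removes one whitespace character before and after the comment, so the text on either side ends up joined together (hhhabc here). The lone asterisk in "fff * hhh" is left alone because it is not preceded by a slash.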

        However, depending where the end of the comment comes, it can sometimes leave an extra newline, so an additional chomp is called for:

        while( <DATA> ) {
            chomp;
            if( m[/\*] ) {
                $_ .= <DATA> until m[\*/];
                s[ \s? / \* .+? \* / \s? ][]smgx;
                chomp;
            }
            print join '-', split '\|';
        }
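        For anyone who wants to try this outside the thread, here is a self-contained sketch of the same loop wrapped in a sub. The name strip_comments, the in-memory sample data, and the end-of-input guard are my own assumptions and not part of the snippet above. The guard is there because the "until" loop as posted would keep appending forever if a comment is never closed.

```perl
use strict;
use warnings;

# Hypothetical helper: read records from a filehandle, strip /* ... */
# comments (which may span several records), and return the cleaned records.
sub strip_comments {
    my ($fh) = @_;
    my @records;
    while ( defined( my $rec = <$fh> ) ) {
        chomp $rec;
        if ( $rec =~ m[/\*] ) {
            # Append following records until the comment is closed.
            # Assumption: stop at end of input instead of looping forever
            # on an unterminated comment.
            until ( $rec =~ m[\*/] ) {
                my $next = <$fh>;
                last unless defined $next;
                $rec .= $next;
            }
            $rec =~ s[ \s? / \* .+? \* / \s? ][]smgx;
            chomp $rec;    # the last appended record may leave a newline
        }
        push @records, $rec;
    }
    return @records;
}

# Sample data taken from case1 and case2 earlier in the thread.
my $sample = <<'END';
aaa|bbb|ccc|ddd|eee|fff * hhh /*xyzxyz*/ abc
aaa|bbb|ccc|ddd|eee|fff * hhh /*xyz
sss|ddd|ggg|hhh|jjj|xyz*/ abc
END

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my @cleaned = strip_comments($fh);
print join( '-', split /\|/ ), "\n" for @cleaned;
```

        Both sample records reduce to the same cleaned text, since everything from the opening /* to the closing */ is dropped, including the continuation record's leading fields. Whatever pattern match is needed can then be run against each element of @cleaned.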

        Any other late breaking 'funnies' you need to deal with?

