in reply to small change to Text::ParseWords - evil consequences?

Hmm... maybe I'm being dumb here, but I think that if this is doing anything at all, it is doing the wrong thing. Let me start nit-picking:
  1. you drop $delimiter into a /x regexp, but it is elsewhere de-/x'ed as (?-x:$delimiter)... so if someone passes in a $delimiter containing whitespace, it will be interpreted differently by your addition than elsewhere.
  2. you capture your new $delimiter, but you aren't storing the value? Odd, you could just drop the parentheses or make them (?:$delimiter) or, as per the previous point, (?-x:$delimiter)
  3. the other primary sub-regexes are anchored at the beginning of the line, so they can tell what came before the ending sub-regex, but your new one is not.
  4. excepting, of course, for the /x-or-not-/x question surrounding $delimiter, anything caught by your new regex should have been caught by the regex for unquoted stuff (unless, of course, it had an open quote with no matching close quote)
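
To make point 1 concrete, here is a minimal sketch of the /x issue. It is not the Text::ParseWords code itself; the single-space $delimiter and the sample $line are made up for illustration.

```perl
use strict;
use warnings;

my $delimiter = ' ';      # hypothetical delimiter: one literal space
my $line      = 'a b';    # hypothetical input

# Under /x the interpolated space counts as insignificant whitespace,
# so this pattern is effectively /\Aab\z/ and does not match 'a b'.
my $plain = ($line =~ /\A a $delimiter b \z/x) ? 1 : 0;

# (?-x:...) turns /x off for the delimiter only, so the space is literal.
my $protected = ($line =~ /\A a (?-x:$delimiter) b \z/x) ? 1 : 0;

print "plain=$plain protected=$protected\n";   # plain=0 protected=1
```

A delimiter passed as the two characters '\t' is not affected, because the regex engine sees an escape sequence rather than whitespace. A delimiter holding a real tab or space character is.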

Can you give me the calling context (the value of $delimiter and $line)?


------------
:Wq
Not an editor command: Wq

Re: Re: small change - evil consequences?
by selena (Acolyte) on Oct 22, 2003 at 04:34 UTC

    OK, let me start by saying that I did not completely understand the original regexes, and your explanation helped me there. I changed the $delimiter section as you suggested.

    My goal was to capture all fields implied in a tab-delimited line, as in something that would do the right thing when given this:

    aabel Allyn Abel EL0 20030612182307 10.5.4.166
    abeeman Ali Beeman EL1 ajens@ttsd.k12.or.us 200309 +02191509 10.8.0.219

    There are actually 8 fields in both lines, based on the tabbing.

    The problem with the previous code (sans the '| ($delimiter)') was that when I specified '\t' as a field separator, the regex did not recognize multiple tabs correctly.

    Ugh. And now that I go off and test this with the original module (because I was going to show you the problem exactly...), it's working with the original regex. *sigh*

    Perhaps reading Text::ParseWords caused me to clean up some code elsewhere that was causing the issue.

    Thanks for taking the time to read through this.

      Maybe I'm not understanding your problem, but wouldn't a split(/\t+/, $line) get the tab-delimited fields?

      Arjen

        Using \t+ as the delimiter regex wouldn't work for him because he wants to see null fields. That is, if he's got "a\tb\t\td" he wants to split it into ('a','b','','d').
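
        A quick sketch of the difference, using the same sample string (the variable names are just for illustration):

        ```perl
        use strict;
        use warnings;

        my $line = "a\tb\t\td";

        # \t+ treats a run of tabs as one separator, so the empty field vanishes.
        my @collapsed = split /\t+/, $line;       # ('a', 'b', 'd')

        # A single \t keeps the empty field; the -1 limit also keeps trailing ones.
        my @kept = split /\t/, $line, -1;         # ('a', 'b', '', 'd')

        print scalar(@collapsed), " vs ", scalar(@kept), "\n";   # 3 vs 4
        ```

        Without the -1 limit, split silently drops empty fields at the end of the line, which matters if the last columns of a record can be blank.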

        I think that the original issue he was having was something different.


        ------------
        :Wq
        Not an editor command: Wq