in reply to Re^3: How can I replace the pattern in the 6 th field?
in thread How can I replace the pattern in the 6 th field?

... split on the dash ...

That seems to depend on the eighth or other subsequent field being a negative number. In the following, doesn't Sender get its parens zapped?

c:\@Work\Perl\monks>perl -wMstrict -le
"print 'perl version: ', $];
 ;;
 my $rec = 'Jun 12 10 mail (sender@sender.com) - (recip1@domain.com),(recip2@domain.com) 1.889 25623, queued_as: B67837C0052 Subject goes here Sender(sender@sender.com)';
 print qq{'$rec'};
 ;;
 my @F = split /-/, $rec;
 $F[1] =~ s/[()]//g;
 my $fixed = join '-', @F;
 print qq{'$fixed'};
 "
perl version: 5.008009
'Jun 12 10 mail (sender@sender.com) - (recip1@domain.com),(recip2@domain.com) 1.889 25623, queued_as: B67837C0052 Subject goes here Sender(sender@sender.com)'
'Jun 12 10 mail (sender@sender.com) - recip1@domain.com,recip2@domain.com 1.889 25623, queued_as: B67837C0052 Subject goes here Sendersender@sender.com'
But it has no dependence on any Perl version above 5.8 — nor indeed, I think, above 5.0!


Give a man a fish:  <%-{-{-{-<

Re^5: How can I replace the pattern in the 6 th field?
by Laurent_R (Canon) on Jun 18, 2018 at 21:11 UTC
    That seems to depend on the eighth or other subsequent field being a negative number. In the following, doesn't Sender get its parens zapped?
    Yeah, you're right++, the solution really depends on the actual detailed data format, which we don't really know.

    Splitting the input on the first dash only (using the split /PATTERN/,EXPR,LIMIT syntax with an appropriate LIMIT; I guess this should be 2, I can't test right now) should probably yield the desired result.
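A minimal sketch of the LIMIT idea, assuming a LIMIT of 2 and a record whose later field is a negative number. The helper name fix_after_first_dash and the sample record with -0.075 are made up for illustration; they are not from the thread's data.

```perl
use strict;
use warnings;

# Hypothetical helper: split on the first dash only, then strip parens
# from everything after it.
sub fix_after_first_dash {
    my ($rec) = @_;
    my ($head, $tail) = split /-/, $rec, 2;   # LIMIT 2: at most two pieces
    return $rec unless defined $tail;         # no dash found: leave record alone
    $tail =~ s/[()]//g;
    return join '-', $head, $tail;
}

# Assumed sample: the later dash in -0.075 no longer splits the record.
my $rec = 'Jun 12 09 mail (sender@sender.com) - (recip1@domain.com) -0.075 9387';
print fix_after_first_dash($rec), "\n";
```

With LIMIT 2 the later dash stays inside the second piece, so the join puts the record back together unchanged apart from the removed parens. Every paren after the first dash is still removed, whichever field it is in.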

      ... the solution really depends on the actual detailed data format, which we don't really know.

      Well, we know what I would consider to be a second approximation to the actual, rather messy data format (see My ACTUAL data...) — with more to come I shouldn't be surprised.

      Splitting the input on the first dash ... with an appropriate LIMIT ...

      But a data record like
          Jun 12 09 mail (sender@sender.com) - (recip1@domain.com) 0.075 9387, queued_as: C77837C0050 Subject goes here Sender(sender@sender.com)
      has multiple parenthesized fields/substrings after the first dash, not all of which have to be fixed up, so it seems we're stuck with split-ing on whitespace.


      Give a man a fish:  <%-{-{-{-<

        Yes, sure: if the parens should be removed only from the sixth (or actually seventh) field, with the field separator deemed to be one space (or possibly several consecutive spaces), and not thereafter, then we have to split on whitespace and make the substitution on the one field (the seventh) that needs to be changed. This requirement was not in the OP (which is what I was really answering) and was added at a later point.
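A rough sketch of the whitespace-split approach, assuming the recipient list is always the seventh whitespace-separated field and contains no embedded spaces. The helper name fix_field is hypothetical, and the field number may need adjusting for the real data.

```perl
use strict;
use warnings;

# Hypothetical helper: strip parens from one whitespace-separated field only.
# $n is the 1-based field number (7 for the recipient list in the sample).
sub fix_field {
    my ($rec, $n) = @_;
    my @F = split ' ', $rec;      # awk-style split on runs of whitespace
    return $rec if @F < $n;       # too few fields: leave record alone
    $F[$n - 1] =~ s/[()]//g;
    return join ' ', @F;          # note: runs of spaces collapse to one
}

my $rec = 'Jun 12 09 mail (sender@sender.com) - (recip1@domain.com) 0.075 9387, queued_as: C77837C0050 Subject goes here Sender(sender@sender.com)';
print fix_field($rec, 7), "\n";
```

Here only the recipient field loses its parens; the sender in field five and the trailing Sender(...) keep theirs. Joining on a single space does not preserve the original spacing, which may or may not matter for the real records.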

        The problem is still that the data seems to be almost free format (or at least to have a very poorly defined format), so I would not be surprised either if we discover new requirements.