#!/usr/local/bin/perl -w
use strict;

$/ = "\'\n";                    # set the input record separator ($/)
while (<DATA>) {
    chomp;                      # chomp strips the record separator
    s/\?\'/\'/g;                # fix the "quoted" ' marks
    my @fields = split /\+/;
    print "[$_] $fields[$_]\n" for 0..$#fields;
}
__DATA__
0010+2+O'Reilly'
023++++234+35+White+++17+'
g?'day mate+++'

# output
[0] 0010
[1] 2
[2] O'Reilly
[0] 023
[1] 
[2] 
[3] 
[4] 234
[5] 35
[6] White
[7] 
[8] 
[9] 17
[0] g'day mate

Update: possibly a bit more elegant, for certain values of elegance.

$/ = "\'\n";
while (<DATA>) {
    chomp;
    s/\?\'/\'/g;
    my $i;
    print "[", $i++, "] $_\n" for split /\+/;
}

I keep getting voted down for this node, so I think I had better explain myself. I am not doing it all in a regex as the OP wanted, but the OP says the data is in a file, one record per line, each terminated with a '. He has to read each line in anyway, and we can guess from the example string in the lead post that he is also chomping it, so why not set the input record separator ($/) to "'\n" and let chomp deal with the terminal ' efficiently? Once you have reached this point, fixing the "?'" escapes and calling split is surely more efficient and maintainable than some confusing regex.
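For anyone who wants to reuse the idea outside a one-off script, here is a minimal sketch of the same three steps (record separator, unescape, split) wrapped in subs. The names parse_record and parse_all and the $keep_trailing flag are my own inventions for illustration, not anything from the OP. The flag is there because a plain split silently drops trailing empty fields, which is why the third sample record above shows only one field; passing a limit of -1 to split keeps them, if that matters for your data.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split one record,
# already stripped of its "'\n" terminator, into fields.
sub parse_record {
    my ($record, $keep_trailing) = @_;
    $record =~ s/\?'/'/g;                 # turn the escaped ?' back into '
    # A limit of -1 tells split to keep trailing empty fields.
    return $keep_trailing
        ? split(/\+/, $record, -1)
        : split(/\+/, $record);
}

# Hypothetical helper: read every record from a string through an
# in-memory filehandle, using "'\n" as the input record separator.
sub parse_all {
    my ($text, $keep_trailing) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    local $/ = "'\n";                     # restored when the sub returns
    my @records;
    while (my $line = <$fh>) {
        chomp $line;                      # removes the "'\n" terminator
        push @records, [ parse_record($line, $keep_trailing) ];
    }
    close $fh;
    return @records;
}

# Same sample data as above, this time keeping trailing empty fields.
my $data = "0010+2+O'Reilly'\n023++++234+35+White+++17+'\ng?'day mate+++'\n";
for my $fields (parse_all($data, 1)) {
    print scalar(@$fields), " fields: ", join('|', @$fields), "\n";
}
```

With the flag set, the third record comes back as four fields (the text plus three empty strings) rather than one. Using local on $/ keeps the odd separator from leaking into the rest of the program.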

Cheers,
R.


In reply to Re: Improved regexp sought by Random_Walk
in thread Improved regexp sought by myomancer
