To avoid breaking on embedded newlines (or whatever record separator you use), I read the file like this:

    {
        # local restores $/ when this block exits, even if confess() dies.
        local $/ = $self->record_delimiter;
        open(my $delimfile, '<', $filename)
            or Carp::confess("Cannot open file [$filename]: $!");
        while (defined(my $record = <$delimfile>)) {
            chomp $record;
            # An odd number of double quotes (") means a quoted field is still
            # open, i.e. the field contains a record separator. Keep reading
            # until the quotes balance, putting back the separator that chomp
            # removed so the field value survives intact.
            while (($record =~ tr/"//) % 2 == 1) {
                my $continuation = <$delimfile>;
                last unless defined $continuation;    # unterminated quote at EOF
                chomp $continuation;
                $record .= $/ . $continuation;
            }
            push(@{$ar_returnvalue}, ReadRecord($self, $record));
        }
        close($delimfile);
    }
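
The quote-counting idea can be tried without a file on disk. This is a minimal sketch under stated assumptions: `read_logical_records` is a hypothetical helper (it is not part of the module quoted in this post), and an in-memory filehandle stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): read logical records from
# a filehandle, joining physical lines while a quoted field is still open.
sub read_logical_records {
    my ($fh, $separator) = @_;
    local $/ = $separator;    # restored automatically when the sub returns
    my @records;
    while (defined(my $record = <$fh>)) {
        chomp $record;
        # An odd number of quotes so far means a quoted field is still open.
        while (($record =~ tr/"//) % 2 == 1) {
            my $more = <$fh>;
            last unless defined $more;    # unterminated quote at end of input
            chomp $more;
            $record .= $separator . $more;
        }
        push @records, $record;
    }
    return @records;
}

# In-memory filehandle standing in for a real file (sample data is made up).
my $data = qq{a,"line one\nline two",c\nd,e,f\n};
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @records = read_logical_records($fh, "\n");
close $fh;

print scalar(@records), " logical records\n";
```

Counting quotes over the whole accumulated record gives the same stopping point as looking for a second line with an odd count, and it keeps the loop to a single condition.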

And ReadRecord uses a regex to consume the string field by field:

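    my $field_value;
    # quotemeta keeps delimiters such as ^ or ] safe inside a character class.
    my $delimiter = quotemeta $self->field_delimiter;
    # length, not truth: a plain while ($inputstring) stops on a final "0" field.
    # Note: a trailing delimiter (empty last field) still ends the loop early.
    while (length $inputstring) {
        undef $field_value;
        if ($inputstring =~ /^"/) {
            # Quoted field; * rather than + so an empty quoted field ("") parses.
            if ($inputstring =~ /^"((?:[^"]|"")*)"(?:[$delimiter]|$)/p) {
                ($field_value, $inputstring) = ($1, ${^POSTMATCH});
                # Unescape escaped quotes
                $field_value =~ s/""/"/g;
            } else {
                Carp::confess("Parsing error with remaining data [$inputstring]");
            }
        } else {
            # Unquoted field: everything up to the next delimiter.
            if ($inputstring =~ /^([^$delimiter"]*)(?:[$delimiter]|$)/p) {
                ($field_value, $inputstring) = ($1, ${^POSTMATCH});
            } else {
                # Without this, a stray quote in an unquoted field loops forever.
                Carp::confess("Parsing error with remaining data [$inputstring]");
            }
        }
        # $field_value now holds the next field of the record.
    }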

This follows the quoting rules of RFC 4180 (quoted fields, doubled quotes as escapes, embedded record separators) :)
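
For anyone who wants to experiment with the field-by-field regex on its own, here is a self-contained sketch. It is not the ReadRecord from this post: `split_record` is a hypothetical name, it assumes a single-character delimiter, and it swaps `${^POSTMATCH}` for `\G` with `/gc` so that a trailing empty field and a final `0` are kept.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split one logical record
# into fields, assuming a single-character delimiter.
sub split_record {
    my ($record, $delimiter) = @_;
    my $d = quotemeta $delimiter;
    my @fields;
    pos($record) = 0;
    while (1) {
        if ($record =~ /\G"((?:[^"]|"")*)"(?=$d|\z)/gc) {
            # Quoted field: doubled quotes stand for one literal quote.
            (my $value = $1) =~ s/""/"/g;
            push @fields, $value;
        }
        elsif ($record =~ /\G([^"$d]*)(?=$d|\z)/gc) {
            # Unquoted field: may be empty, may not contain a quote.
            push @fields, $1;
        }
        else {
            die "Parsing error at offset " . pos($record) . " in [$record]\n";
        }
        # Consume the delimiter; if there is none, the record is finished.
        last unless $record =~ /\G$d/gc;
    }
    return @fields;
}

# Made-up sample covering the cases discussed in this thread.
my @fields = split_record(qq{a,"b,1","say ""hi""",,"x\ny",0}, ',');
print join('|', map { "[$_]" } @fields), "\n";
```

The loop always reads one field and then decides whether to continue based on the delimiter, which is why an empty last field is still reported.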

In reply to Re^3: split string by comma by Neighbour
in thread split string by comma by Anonymous Monk
