iamnewbie has asked for the wisdom of the Perl Monks concerning the following question:

my $testb = "hello,'world, yo',matt";
print $_, "\n" for split ',', $testb;

For example, my input is hello,'world, yo',matt and I want the output below:

hello

'world, yo'

matt

Re: How to use split() in perl to ignore white space and ','
by Athanasius (Archbishop) on Feb 11, 2015 at 06:49 UTC

    Hello iamnewbie, and welcome to the Monastery!

    It looks as though your data is in CSV (comma separated values) format, in which case the best approach is to use a dedicated CSV module. For example:

    #! perl
    use strict;
    use warnings;
    use Text::CSV_XS;

    my $testb = "hello,'world, yo',matt";
    my $csv   = Text::CSV_XS->new({ keep_meta_info => 1, quote_char => "'" });
    my @records;

    if ($csv->parse($testb))
    {
        my @fields = $csv->fields;
        for my $col (0 .. $#fields)
        {
            if ($csv->is_quoted($col))
            {
                push @records, $csv->{quote_char} . $fields[$col] . $csv->{quote_char};
            }
            else
            {
                push @records, $fields[$col];
            }
        }
    }
    else
    {
        warn "parse() failed on argument: ", $csv->error_input, "\n";
        $csv->error_diag;
    }

    print "$_\n\n" for @records;

    Output:

    16:47 >perl 1156_SoPW.pl
    hello

    'world, yo'

    matt

    16:47 >

    (Code adapted from the documentation for Text::CSV_XS.)

    Hope that helps,

    Athanasius <°(((>< contra mundum
    Iustus alius egestas vitae, eros Piratica,

      Hi Athanasius, thanks for the warm welcome. Can't we optimize it using regex memory?
        Sure, if you can tell exactly what the regex should accomplish:
        1. split commas
        2. except commas between quotes
        3. except it's not between any quotes, but the quotes must be balanced
        4. perhaps the quotes are nested
        5. escaped quotes (\') have to be exempt
        6. the same conditions for double quotes
        7. them mixed and matched
        8. what about escaped commas (\,)?
        9. other special cases I forgot to think of?
        I do use split sometimes, but only if it's guaranteed to be commas only. As soon as a special case becomes visible on the horizon, I use Text::CSV (which in turn uses Text::CSV_XS if possible).
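The first two points in the list above can be seen in a small sketch. It compares a plain split with quotewords() from the core Text::ParseWords module. That module is not mentioned anywhere in this thread; it is used here only as an assumption-free stand-in for "something quote-aware" and is not a full CSV parser.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);   # core module, not from the thread

my $line = "hello,'world, yo',matt";

# Plain split cuts at every comma, including the one inside the quotes.
my @naive = split /,/, $line;

# quotewords() splits on the delimiter but leaves quoted text intact;
# a true second argument keeps the quote characters in the result.
my @aware = quotewords(',', 1, $line);

print scalar(@naive), " fields from split: ",      join('|', @naive), "\n";
print scalar(@aware), " fields from quotewords: ", join('|', @aware), "\n";
```

The remaining points in the list (nested quotes, escaped commas and so on) are where a dedicated CSV module earns its keep.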
        Can't we optimize it

        Please explain in what way you find the code sub-optimal.

        using regex memory

        I'm sorry, I don't understand. What is this 'regex memory' you speak of?

        "...Can't we optimize it using regex memory?"

        I'm not sure what you mean, but I guess you mean something like this:

        my $string = "hello,'world, yo',matt";
        my @result = $string =~ /(.+),('.+'),(.+)/;
        print qq($_\n) for @result;

        See also Is guessing a good strategy for surviving in the IT business? and perlretut.

        But I'm sure that this is not really an optimization. I think some of the solutions already given are probably better.
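For lines with a varying number of fields, a global match is one way to avoid hard-coding three capture groups. The following is only a sketch: fields_by_regex is a hypothetical helper name, and it assumes single quotes only, no escaped or nested quotes, and no empty fields (an empty field is silently skipped).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the fields of one line.
sub fields_by_regex {
    my ($line) = @_;
    # Each field is either a single-quoted run or a run of non-comma characters.
    # In list context, /g returns every capture from every match.
    return $line =~ /('[^']*'|[^,]+)/g;
}

print "$_\n" for fields_by_regex("hello,'world, yo',matt");
```

As soon as any of those assumptions fails, this pattern gives wrong answers without any warning, which is the argument for a CSV module made earlier in the thread.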

        Edit: I tried to be more precise in judgement...

        Best regards, Karl

        «The Crux of the Biscuit is the Apostrophe»

Re: How to use split() in perl to ignore white space and ','
by choroba (Cardinal) on Feb 11, 2015 at 08:55 UTC
    Crossposted on StackOverflow. It's considered polite to mention crossposting, so that people who don't follow all the sites don't waste their effort hacking up a solution to a problem already solved at the other end of the Internet.
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: How to use split() in perl to ignore white space and ','
by perloHolic() (Beadle) on Feb 11, 2015 at 09:43 UTC

    Welcome iamnewbie

    For something this simple, and given that you've supplied only limited information, I wouldn't advise going so far as to pull in a CSV parsing module just for that, assuming of course that your input is in fact a comma-delimited (.csv) file.

    I would use split with a small regex to break your input at the commas, along the lines of:

    while (<>) {
        chomp;                                  # drop the trailing newline
        my @arr = split(/,/, $_);
        printf("%s\n", $arr[0]);
        printf("%s,%s\n", $arr[1], $arr[2]);    # rejoin the quoted field, restoring its comma
        printf("%s\n", $arr[3]);
    }

    The output prints the first element of the array on one line, the second and third on a second line, and the last on another. This is, of course, one possible solution to your specific question, and it assumes a great deal (for instance, that every line has exactly three commas and that the quoted field is always the second one). Apologies if this doesn't solve your problem or is unclear in any way.
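    If the position of the quoted field is not fixed, the same split-then-rejoin idea can be sketched without hard-coded indexes. The helper name rejoin_quoted is hypothetical, and the sketch assumes single quotes only, with every quoted field starting and ending exactly at a comma boundary and no escaped quotes.

```perl
use strict;
use warnings;

# Hypothetical helper: split on commas, then glue back together the pieces
# that belong to one single-quoted field.
sub rejoin_quoted {
    my ($line) = @_;
    my (@fields, $pending);
    for my $piece (split /,/, $line, -1) {      # -1 keeps trailing empty fields
        if (defined $pending) {
            $pending .= ",$piece";              # restore the comma split removed
            if ($piece =~ /'$/) {               # closing quote reached
                push @fields, $pending;
                undef $pending;
            }
        }
        elsif ($piece =~ /^'/ && $piece !~ /^'.*'$/s) {
            $pending = $piece;                  # quoted field opened, not yet closed
        }
        else {
            push @fields, $piece;
        }
    }
    push @fields, $pending if defined $pending; # unbalanced quote: keep what we have
    return @fields;
}

print "$_\n" for rejoin_quoted("hello,'world, yo',matt");
```

    An unbalanced quote is passed through rather than reported, so this is only suitable for input you already trust.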

    Hope this helps - All the best