albascura has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone. I have a problem I can't understand. I have a text file that looks like this:

    planet;World|Earth
    planet;celestial body|moon
    psychology;therapy|sociology
    psychology;humanity|sociology

I need to split every line using ";" as a delimiter, and I do the following:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = $ARGV[0] or die "Need to get CSV file on the command line\n";
    my $sum = 0;
    open(my $data, '<', $file) or die "Could not open '$file' $!\n";
    while (my $line = <$data>) {
        chomp $line;
        print $line;
        my ($target,$variables) = split (/;/ , $line);
        print $target."\n";
        print $variables."\n";
    }

But as output I get just the following:

    planet
    World|Earth
    planet

which is not what I was expecting since I thought this code would work on the whole file, not just the first row and a half. What am I doing wrong?

Thanks in advance for any help

Replies are listed 'Best First'.
Re: Parse csv file line by line
by davido (Cardinal) on May 05, 2013 at 22:23 UTC

    The code you posted doesn't produce the output you posted. However, it does seem to do what you want. So you haven't posted the portion of your code that is producing the problem. The following code was modified only in the formatting of its output, and where we're taking input from. Aside from that it's your code:

        use strict;
        use warnings;

        my $sum = 0;
        while (my $line = <DATA>) {
            chomp $line;
            print "Original data:[$line]\n";
            my ($target,$variables) = split (/;/ , $line);
            print "\t$target\n";
            print "\t$variables\n";
        }
        __DATA__
        planet;World|Earth
        planet;celestial body|moon
        psychology;therapy|sociology
        psychology;humanity|sociology

    ...produces...

        Original data:[planet;World|Earth]
                planet
                World|Earth
        Original data:[planet;celestial body|moon]
                planet
                celestial body|moon
        Original data:[psychology;therapy|sociology]
                psychology
                therapy|sociology
        Original data:[psychology;humanity|sociology]
                psychology
                humanity|sociology

    Where's the problem?


    Dave

      This is exactly what I am not getting. I posted the whole code, but it only works on the first line of the file and stops when it matches the ";" on the second line. I really don't understand why it is not working.

        Post the output of perl -V
Re: Parse csv file line by line
by thomas895 (Deacon) on May 06, 2013 at 00:13 UTC

    I would like to introduce to you a good friend of mine, Text::CSV_XS.
    Many spreadsheets we have parsed together. Perhaps you would like to give him a try?
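
    A minimal sketch of how that might look for the semicolon-delimited data in the question, assuming Text::CSV_XS is installed from CPAN. The helper name parse_fields and the sample line are illustrative only; sep_char, binary and auto_diag are constructor options of the module.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# One parser object, configured for ';' as the field separator.
# auto_diag makes the module report parse errors itself.
my $csv = Text::CSV_XS->new({ sep_char => ';', binary => 1, auto_diag => 1 });

# parse_fields() is a hypothetical helper, not part of Text::CSV_XS:
# it parses one line and returns its fields as a list.
sub parse_fields {
    my ($line) = @_;
    $csv->parse($line) or die "Cannot parse line: $line\n";
    return $csv->fields;
}

# Sample line taken from the question.
my ($target, $variables) = parse_fields('planet;World|Earth');
print "$target\n";
print "$variables\n";
```

    For a real file, the same parser object can read records straight from a filehandle with its getline method instead of parsing lines by hand.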

    ~Thomas~ 
    "Excuse me for butting in, but I'm interrupt-driven..."
Re: Parse csv file line by line
by karlgoethebier (Abbot) on May 06, 2013 at 07:25 UTC

    You can also try hexdump -c file.csv to see if there is something weird in it.

    Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

Re: Parse csv file line by line
by hdb (Monsignor) on May 06, 2013 at 07:16 UTC

    Perhaps there are some funky, non-printable characters in your text file. Comparing your output with your code, the print $line; statement does not seem to have produced anything either. Can you look at your file in a hex editor to see whether there are extra characters? Also, try switching off buffering to see what happens (put $|=1; at the beginning of your code).
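
    If no hex editor is at hand, a rough equivalent can be done from Perl itself. The following is only a sketch: visible() is a made-up helper name, and the sample string with a trailing carriage return is an assumption about what the file might contain, not something confirmed in this thread.

```perl
use strict;
use warnings;

# visible() is an illustrative helper: it returns a copy of the string
# in which every character outside printable ASCII is replaced by a
# visible \x{..} escape.
sub visible {
    my ($text) = @_;
    (my $copy = $text) =~ s/([^\x20-\x7e])/sprintf('\\x{%02x}', ord $1)/ge;
    return $copy;
}

$| = 1;    # switch off output buffering on STDOUT

# Hypothetical sample: a line that still carries a carriage return.
print visible("planet;World|Earth\r"), "\n";
# prints: planet;World|Earth\x{0d}
```

    Calling visible($line) inside the while loop of the original script, right after reading the line, would show any stray characters that the terminal otherwise hides.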

Re: Parse csv file line by line
by jaredor (Priest) on May 06, 2013 at 14:42 UTC

    Are your results for a script different from what others get when running the same script? Do your results stay the same when you change the script? Then one thing you can do is check that the script you think is running is actually the script you are running, e.g., put a print statement at the top of the script that has to show up the next time you run it. If you are on Linux, the quick way to ensure you are running the script (let's call it "myscript.pl") in the current working directory is to prefix it with "./" when you call it, i.e., "./myscript.pl".

    Another thing to check is that if you are working in a Linux environment with a Windows generated text file, that you use "dos2unix" on the file before manipulating it. This debugging check is related to the "non-printing character" comments already made above.
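
    The line-ending check can also be done inside the script. The sketch below makes an assumption that is not confirmed anywhere in this thread: that the file uses bare carriage returns (\r) as line terminators. In that case a read with the default "\n" separator returns the whole file as a single line, and the original split would yield "planet" and "World|Earth\rplanet", which resembles the output posted in the question. split_lines() is an illustrative helper name, not a built-in.

```perl
use strict;
use warnings;

# split_lines() is a hypothetical helper: it breaks raw file content
# into lines, accepting Windows (\r\n), bare carriage-return (\r) and
# Unix (\n) line endings.
sub split_lines {
    my ($content) = @_;
    return split /\r\n|\r|\n/, $content;
}

# Assumed sample content with bare \r as the line terminator.
my $content = "planet;World|Earth\rplanet;celestial body|moon\r";

for my $line (split_lines($content)) {
    next unless length $line;    # skip empty lines
    my ($target, $variables) = split /;/, $line;
    print "$target\n";
    print "$variables\n";
}
```

    With the assumed content this prints both records in full (planet, World|Earth, planet, celestial body|moon), one value per line.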

    Your dialog with the other commenters made me think the above may be worth mentioning.

    HTH