juanito23 has asked for the wisdom of the Perl Monks concerning the following question:

Hello team, I have a very mis-formatted string exported to csv file from oracle db table (around 60 columns table) Seems each line is very badly formatted so I got some perl code to format the same. As example a single original line from csv file when I print in a Linux terminal:

""6357445" + "1349947" + + "0" + + "1" + + "3" + + "2" + + "1" + "1" + + "1" + + "1" + + "-2" + + "1394531830" + + "1394531830" + "14 +15599200" + "0" + + "0" + + "0" + + "0" + + "196" + +"29240" + "378" + + "1394531846" + + "0" + + "0" + + "0" + + "8201" + "0" + + "64" + + "0" + + "2" + + "89799" + + "8201" + "8980 +5" + "-1" + + "-1" + + "Local Cell id=2, Cell Name=21842C11" + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + "0" + +"1394531057" + "1394531057 +" + "1415599200" + + "1394531092" + + "" + + + + + + + + + + + "0" + + "RAT_INFO=GL, AFFECTED_RAT=L" + + + + + + + + + + + + + + + + "" + + "" + + + + + + "10.35.201.146" + + + + + + + + + + + "195;" + + + + + + + + + + + "0" + " +5705354" + "0" + + "" + + "" + + + "0" + + "0" + "Cell""

This looks odd, but that's only 1 single line, splitted in some weird characters. After some perl code I got this:

my $file = `filename`; chomp($file); open(my $data, '<', $file) or die "Couldn't open '$file'\n "; while (my $line = <$data>){ $line =~ s/\s+/ /g ; $line =~ s/\" \"/\"\"/g ; $line =~ s/\n+//g ; $line =~ s/\s+$//; if ($line !~ /select/ && $line !~ /spool/ && $line !~ /\|\|/ + ){ my @array = $line; my $result; for (@array){ chomp; $result .= $_ . " "; } $result =~ s/\s+//g; print "$result\n";
Result of this is
""6357445""1349947""0""1""3""2""1""1""1""1""-2""1394531830""1394531830 +""1415599200""0""0""0""0""196""29240""378""1394531846""0""0""0""8201" +"0""64""0""2""89799""8201""89805""-1""-1""LocalCellid=2,CellName=2184 +2C11""0""1394531057""1394531057""1415599200""1394531092""""0""RAT_INF +O=GL,AFFECTED_RAT=L""""" "10.35.201.146""195;""0""5705354""0""""""0""0""Cell""
Ok, so I have one line still splited for some reason. Here each field is separated by double quote "" character, so we can expect empty fields as well. My target is to have this in a single line in order to access to each element, even the NULL ones!

Hope this is clear, any thoughts?

Thanks!

Replies are listed 'Best First'.
Re: extract string columns with perl
by LanX (Saint) on Apr 15, 2014 at 11:12 UTC
    > Hope this is clear,

    not really, and not CSV (comma separated)

    That's what you want?

    DB<107> $str =~ s/^""//; # remove start => 1 DB<108> $str =~ s/""$//; # remove end => 1 DB<109> @result = split /" "/, $str # separate quotepairs => ( 6357445, 1349947, 0, 1, 3, 2, 1, 1, 1, 1, -2, 1394531830, 1394531830, 1415599200, 0, 0, 0, 0, 196, 29240, 378, 1394531846, 0, 0, 0, 8201, 0, 64, 0, 2, 89799, 8201, 89805, -1, -1, "Local Cell id=2, Cell Name=21842C11", 0, 1394531057, 1394531057, 1415599200, 1394531092, "", 0, "RAT_INFO=GL, AFFECTED_RAT=L", "", "", "10.35.201.146", "195;", 0, 5705354, 0, "", "", 0, 0, "Cell", )

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    update

    the OP was updated with code-tags, such that now the fields are separated by many whitespaces. to make my code work use split /"\s+"/ now.

      Hi, this looks to be close to what I need, so I will work on it. I think the issue is with the data extraction from csv file, since if I assign the whole string to a variable in perl, I will get much better result. Thanks!
Re: extract string columns with perl
by Laurent_R (Canon) on Apr 15, 2014 at 17:04 UTC
    Would it not be simpler to configure Oracle and/or the SQL query or PL/SQL extraction script to get the data in the proper format in the first place?

    I don't usually get such a mess from Oracle databases, so there must be something wrong in your configuration (although I don't know what).

      Hi, I tried to format already the sql, but I am a bit limited, since I need to get the columns in a certain order like: select column2, column4, column1, etc... from table; and then spool this into a csv file

      configurations in sql session before spooling are as this: set feedback off

      set heading on

      set FEEDBACK off

      set verify off

      set numwidt 20

      set echo off

      SET TERMOUT OFF

      set linesize 10000

      set pagesize 0

      Not sure if I can do anything else to help with the format? Thanks!
        Well, I am using something similar (although not sure for every detail). One additional statement you could use to get a CSV is:
        set colsep ,
        Also, you may want to use the trim function in your select statement.