PerlMonks  

Re: Pre-process csv files before using

by sk (Curate)
on Aug 06, 2005 at 18:19 UTC ( #481529=note )


in reply to Pre-process csv files before using

Do you know which column numbers you want to keep? If so, you can take an array slice containing just the required columns and use that for the average calculation.

perl -e '@array = qw(hello world 1 2 3); @req = qw(2 3 4); print join(",", @array[@req]), $/;'
__END__
1,2,3

In the above code I have an array called @array and a second array, @req, holding the (0-based) indices of the required columns. The slice @array[@req] returns a list with the values from those columns.
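As a slightly fuller sketch of the same idea, the slice can feed straight into the average calculation. The helper names keep_columns and average are made up for illustration, and the row is assumed to be already split into fields:

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the fields at the given 0-based indices.
sub keep_columns {
    my ($fields, $wanted) = @_;
    return @{$fields}[@$wanted];    # array slice through a reference
}

# Hypothetical helper: arithmetic mean of a list of numbers.
sub average {
    my @nums = @_;
    return unless @nums;            # nothing to average
    my $sum = 0;
    $sum += $_ for @nums;
    return $sum / @nums;            # @nums in numeric context is the count
}

my @row  = qw(hello world 1 2 3);   # same sample data as above
my @req  = (2, 3, 4);               # 0-based indices of the columns to keep
my @kept = keep_columns(\@row, \@req);

print join(",", @kept), "\n";       # prints 1,2,3
print average(@kept), "\n";         # prints 2
```

Passing references keeps the two lists separate inside the helper; for a one-off script the plain @array[@req] slice is probably all you need.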

Regarding the removal of "." from the header: at what point do you want to do it? Before you start reading the CSV into DBI, or when you write out? Doing it before DBI uses the file would require an in-place edit of the file. Writing out the corrected header (without periods) should be very easy: a simple regex over your "variable column" array.

perl -e '@wperiod = qw(hi.1 hi.2 hi.3); print join(",", map {s/\.//g; $_} @wperiod), $/;'
__END__
hi1,hi2,hi3
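One thing to be aware of: inside map, $_ is an alias for the original element, so the s/// above also changes @wperiod itself. If the original names are still needed (for example to look up the columns under their old names), a copying variant along these lines may be safer. The helper name strip_periods is only an illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return copies of the names with every "." removed,
# leaving the caller's array untouched.
sub strip_periods {
    return map { (my $name = $_) =~ s/\.//g; $name } @_;
}

my @wperiod = qw(hi.1 hi.2 hi.3);
my @clean   = strip_periods(@wperiod);

print join(",", @clean), "\n";      # prints hi1,hi2,hi3
print join(",", @wperiod), "\n";    # prints hi.1,hi.2,hi.3 (unchanged)
```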

-SK

Replies are listed 'Best First'.
Re^2: Pre-process csv files before using
by DrAxeman (Scribe) on Aug 06, 2005 at 18:40 UTC
    Is there any way that I can add a line after
        shift @cols;
    that will look at the @cols array and then strip the column names that match a specific regular expression?
      Would something like this do?

      perl -e '@cols = qw(need1 dont1 need2 dont2 need3); for (@cols) { push(@req, $_) unless /dont.*/; } print join(",", @req), $/;'
      __END__
      need1,need2,need3

      You mentioned a regex, so the above should help you. However, if the names you want to choose are sent as input, you might want to construct a hash from the list and then drop the columns that should be excluded by checking the hash.
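A rough sketch of the hash approach, assuming the input is a list of exact column names to exclude (the helper name drop_excluded and the sample lists are invented for the example):

```perl
use strict;
use warnings;

# Hypothetical helper: drop every column whose name appears in the
# exclusion list.
sub drop_excluded {
    my ($cols, $exclude) = @_;
    my %skip = map { $_ => 1 } @$exclude;   # hash for exact-name lookup
    return grep { !$skip{$_} } @$cols;      # keep names not in the hash
}

my @cols    = qw(need1 dont1 need2 dont2 need3);
my @exclude = qw(dont1 dont2);              # assumed to arrive as input
print join(",", drop_excluded(\@cols, \@exclude)), "\n";   # prints need1,need2,need3
```

The hash only helps when the names are known exactly; for pattern-style rules a regex is still the better fit.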

        The problem is that I don't necessarily know what all the column names are. I only know that I don't want any column whose name ends with "Bandwidth" or contains "MSTCPLoopback".

        I'm trying to approach this from:
        shift @cols;    # Remove the first column
        for (@cols) { delete $cols[$regex here] }
        Then start my sql stuff.
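One possible way to fill in that step is grep with a negated match rather than delete, since delete on an array element leaves a hole instead of shrinking the array. This is only a sketch: the helper name filter_columns and the sample column names are made up, and the pattern is built from the two rules stated above.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the column names that do not match the
# unwanted pattern.
sub filter_columns {
    my ($unwanted, @names) = @_;
    return grep { $_ !~ $unwanted } @names;
}

# Ends with "Bandwidth", or contains "MSTCPLoopback" anywhere.
my $unwanted = qr/Bandwidth$|MSTCPLoopback/;

# Made-up column names for illustration only.
my @cols = ('Time', 'CPU Usage', 'NIC Current Bandwidth',
            'MSTCPLoopback Bytes Total', 'Disk Reads');

shift @cols;                               # remove the first column
@cols = filter_columns($unwanted, @cols);  # strip the unwanted names
print join(",", @cols), "\n";              # prints CPU Usage,Disk Reads
```

Because the names never have to be listed in advance, this works even when you don't know all the columns in the file.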
