in reply to Re: Pre-process csv files before using
in thread Pre-process csv files before using

Is there any way that I can add a line after
<code>shift @cols;</code> that will look at the @cols array and then strip column names that match a specific regular expression?

Re^3: Pre-process csv files before using
by sk (Curate) on Aug 06, 2005 at 18:51 UTC
    Would something like this do?

    perl -e'@cols = qw(need1 dont1 need2 dont2 need3); for (@cols) { push(@req, $_) unless /dont.*/; } print join(",", @req), $/;'
    __END__
    need1,need2,need3

    You mentioned a regex, so the above should help you. However, if the names you want to exclude are sent as input, then you might want to construct a hash from that list and drop the columns that should be excluded by checking the hash.
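A minimal sketch of the hash approach, assuming the exclusion list arrives as a plain list of exact column names. The names in @cols and @exclude are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical column names and exclusion list, for illustration only.
my @cols    = qw(need1 dont1 need2 dont2 need3);
my @exclude = qw(dont1 dont2);

# Build a lookup hash so each membership test is a single hash access.
my %skip = map { $_ => 1 } @exclude;

# Keep only the columns that are not in the exclusion hash.
my @req = grep { !$skip{$_} } @cols;

print join(",", @req), "\n";   # prints need1,need2,need3
```

This only works for exact names; when the names are known only by pattern, a regex is still the better fit.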


      The problem is that I don't necessarily know what all the column names are. I do know that I don't want any column whose name ends with "Bandwidth" or has "MSTCPLoopback" in it.

      I'm trying to approach this from:
      shift @cols; #Remove the first column
      for (@cols) {
          delete $cols[$regex here]
      }
      Then start my sql stuff.
        You can just grep out the ones you want.
        my @cols = ...;
        shift @cols;  # blindly throw away first column
        # exclude names ending with "Bandwidth" or containing "MSTCPLoopback"
        @cols = grep( $_ !~ /Bandwidth$|MSTCPLoopback/ , @cols);
        Re: your code, note that delete is meant for hashes. If you wanted to modify your code, you would use the splice function to remove array elements, but grep is much more powerful and Perl-ish.
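For comparison, here is a rough sketch of both routes side by side, using the rule stated earlier in the thread (names ending in "Bandwidth" or containing "MSTCPLoopback"). The header names are hypothetical, and $unwanted is just a name chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical header names, for illustration only.
my @cols = qw(Time MemoryPages CurrentBandwidth MSTCPLoopbackBytes BytesTotal);
my $unwanted = qr/Bandwidth$|MSTCPLoopback/;

# grep version: builds a new list of the names that do not match.
my @kept = grep { !/$unwanted/ } @cols;

# splice version: edits a copy of the array in place. Walk backwards so
# that removing an element does not shift the indices still to be visited.
my @in_place = @cols;
for my $i (reverse 0 .. $#in_place) {
    splice(@in_place, $i, 1) if $in_place[$i] =~ $unwanted;
}

print join(",", @kept), "\n";       # prints Time,MemoryPages,BytesTotal
print join(",", @in_place), "\n";   # prints Time,MemoryPages,BytesTotal
```

Both give the same result; the grep form is shorter and avoids index bookkeeping.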
        I am not going to do Re:Re:Re now as it makes it hard to read. This reply is for this node Re^8: Pre-process csv files before using

        Could you please wrap your column header info inside code tags? Long lines don't wrap otherwise. Thanks!

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $str = <DATA>;
        chomp $str;                    # drop the trailing newline
        my @origcols = split /,/, $str;
        my @cols = ();
        foreach (@origcols) {
            s/\.//g;                   # strip periods from the column name
            push(@cols, $_) unless /Bandwidth.*|MSTCPLoop.*/;
        }
        print $_, $/ for @cols;        # one column name per line
        __DATA__
        PDHCSV40EasternDaylightTime.240,ERWW.COMMUNITIES.MemoryPagesPER.sec,ERWWCOMMUNITIESNetworkInterfaceEthernetAdapterModuleBytesTotalPERsec,ERWWCOMMUNITIESNetworkInterfaceEthernetAdapterCurrentBandwidth


        PDHCSV40EasternDaylightTime240
        ERWWCOMMUNITIESMemoryPagesPERsec
        ERWWCOMMUNITIESNetworkInterfaceEthernetAdapterModuleBytesTotalPERsec

        As you can see, the periods are gone and there are no warnings. Also, the column with Bandwidth is not listed. Can you make sure your origcols array is correctly populated?
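One quick way to check how the array is populated is to dump it before filtering. This is only a sketch: the header line in $str is hypothetical and stands in for whatever your script actually reads from the CSV file.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical header line; the real one would come from the CSV file.
my $str = "Time.240,Memory.Pages,CurrentBandwidth\n";
chomp $str;
my @origcols = split /,/, $str;

# Print the element count and each element in brackets, so stray
# whitespace or a line that was never split is easy to spot.
printf "%d columns\n", scalar @origcols;
print "[$_]\n" for @origcols;

# Data::Dumper (a core module) shows the whole structure at once.
print Dumper(\@origcols);
```

If the count is 1, the line was probably not split on the delimiter you expected.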


        The code I gave you above should do what you are looking for.

        perl -e'@cols = qw(field1 fieldband field3 fieldMSTCP okfield5); for (@cols) { push(@req, $_) unless /band.*|MSTCP.*/; } print join(",", @req), $/;'
        __END__
        field1,field3,okfield5

        Here I am excluding field names that contain "band" or "MSTCP". Since I don't have your actual field names, I am just hard-coding some here.

        -SK