Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
I have a file that has to be parsed, and I need to remove a value from a certain location. Here is an example of the file:
DHCP Administrators,2,"Enterprise Admins,Users,DC=INTERNETNET,DC=com
Domain Admins,Users,DC=INTERNETNETCOM,DC=com",,
DnsAdmins,2,"Enterprise Admins,Users,DC=INTERNETNET,DC=com
Domain Admins,Users,DC=INTERNETCOM,DC=com",,
AS400 Query Database,2,"CN=Joe Car,OU=Systems/Operations,OU=MIS,OU=User Accounts,DC=INTERNETNET,DC=com
Ricrad Tallar,OU=Systems/Operations,OU=MIS,OU=User Accounts,DC=INTERNETNETCOM,DC=com",,
Deptpar Access,8,"John Class,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
Judy Lipa,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
George Grey,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
Artur More,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
Raimun Sirilo,OU=Executive,OU=User Accounts,DC=INTERNETNET,DC=com
Amilcar Ove,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
Daniel Santos,OU=Executive,OU=User Accounts,DC=INTERNETNET,DC=com
Paula Corte,OU=Executive,OU=User Accounts,DC=INTERNETNET,DC=com"
Human_Resources,3,"CN=Katarine Gilly,OU=Executive,OU=User Accounts,DC=INTERNETNET,DC=com
Chris Head,OU=Human Resources,OU=Finance & Administration,OU=User Accounts,DC=INTERNETNET,DC=com
Susany Cadru,OU=Human Resources,OU=Finance & Administration,OU=User Accounts,DC=INTERNETNET,DC=com"

I am using "split", but I am having problems with the " characters.
I just need to reformat the file: look inside the quoted part for "CN=andanamehere", get rid of the rest, and end up with lines like:
Human_Resources,3,"CN=Katarine Gilly"
AS400 Query Database,2,"CN=Joe Car"

If there is no CN= entry, just ignore that record.
Any help?

Re: CSV File Question
by Tanktalus (Canon) on Jan 13, 2005 at 00:36 UTC

    I'd start with Text::CSV (well, actually I use DBD::CSV, which uses Text::CSV, but that's probably overkill here). It handles the quotes just fine. Then you can look at the third field, split it on commas, find the part that matches /^CN=/, put that back as the third field, and write it all out again.
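A minimal sketch of that approach, assuming Text::CSV is installed from CPAN (it is not a core module). The helper name extract_cn_records and the abbreviated sample data are made up for illustration. Two choices here are assumptions rather than part of the suggestion above: the third field is split on newlines as well as commas, because the quoted field spans several lines, and the output line is assembled by hand so that only the CN value is quoted.

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, not part of core Perl

# Hypothetical helper: returns one output line per record whose
# third field contains a CN= entry; other records are skipped.
sub extract_cn_records {
    my ($text) = @_;

    # binary => 1 lets quoted fields contain embedded newlines.
    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });

    # Read the sample through an in-memory filehandle.
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

    my @lines;
    while (my $row = $csv->getline($fh)) {
        my ($group, $count, $members) = @$row;
        next unless defined $members;

        # Split on commas and on the newlines between members,
        # then keep the first piece that starts with CN=.
        my ($cn) = grep { /^CN=/ } split /[,\n]/, $members;
        next unless defined $cn;

        push @lines, qq{$group,$count,"$cn"};
    }
    close $fh;
    return @lines;
}

# Abbreviated sample in the same shape as the posted file.
my $sample = <<'END';
DnsAdmins,2,"Enterprise Admins,Users,DC=INTERNETNET,DC=com
Domain Admins,Users,DC=INTERNETCOM,DC=com",,
AS400 Query Database,2,"CN=Joe Car,OU=MIS,OU=User Accounts,DC=INTERNETNET,DC=com
Ricrad Tallar,OU=MIS,OU=User Accounts,DC=INTERNETNETCOM,DC=com",,
Human_Resources,3,"CN=Katarine Gilly,OU=Executive,OU=User Accounts,DC=INTERNETNET,DC=com
Chris Head,OU=Human Resources,OU=User Accounts,DC=INTERNETNET,DC=com"
END

print "$_\n" for extract_cn_records($sample);
```

Because the parser treats each quoted multi-line field as a single value, the continuation lines never have to be matched separately, and records without a CN= entry simply produce no output.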

Re: CSV File Question
by nedals (Deacon) on Jan 13, 2005 at 00:50 UTC
    use strict;
    use warnings;

    while (<DATA>) {
        chomp;
        # Only lines that contain a CN= entry are printed; all others are skipped.
        if (/CN=\w+/) {
            # Keep everything up to the first comma after "CN=, then close the quote.
            s/^(.+?"CN=.+?),.+$/$1"/;
            print "$_\n";
        }
    }
    __DATA__
    DHCP Administrators,2,"Enterprise Admins,Users,DC=INTERNETNET,DC=com
    Domain Admins,Users,DC=INTERNETNETCOM,DC=com",,
    DnsAdmins,2,"Enterprise Admins,Users,DC=INTERNETNET,DC=com
    Domain Admins,Users,DC=INTERNETCOM,DC=com",,
    AS400 Query Database,2,"CN=Joe Car,OU=Systems/Operations,OU=MIS,OU=User Accounts,DC=INTERNETNET,DC=com
    Ricrad Tallar,OU=Systems/Operations,OU=MIS,OU=User Accounts,DC=INTERNETNETCOM,DC=com",,
    Deptpar Access,8,"John Class,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
    Judy Lipa,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
    George Grey,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
    Artur More,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
    Raimun Sirilo,OU=Executive,OU=User Accounts,DC=INTERNETNET,DC=com
    Amilcar Ove,OU=Marketing,OU=User Accounts,DC=INTERNETNET,DC=com
    Daniel Santos,OU=Executive,OU=User Accounts,DC=INTERNETNET,DC=com
    Paula Corte,OU=Executive,OU=User Accounts,DC=INTERNETNET,DC=com"
    Human_Resources,3,"CN=Katarine Gilly,OU=Executive,OU=User Accounts,DC=INTERNETNET,DC=com
    Chris Head,OU=Human Resources,OU=Finance & Administration,OU=User Accounts,DC=INTERNETNET,DC=com
    Susany Cadru,OU=Human Resources,OU=Finance & Administration,OU=User Accounts,DC=INTERNETNET,DC=com"
      Almost there. I just need to get rid of the rest of the stuff inside the quotes that isn't CN.
        ..I just need to get rid of the rest of the stuff inside the quotes that isn't CN

        As requested, the above returns..
        AS400 Query Database,2,"CN=Joe Car"
        Human_Resources,3,"CN=Katarine Gilly"

        What do you need it to return?