Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have two files of double-quoted values with a key in common, which I merge into a new file with redundant values removed.

dataa.txt
"state","location","name","school"
"Connecticut","New Haven","Jones, Jenny","Yale University"
"Massachusetts","Boston","Jones, James","Harvard University"
"New York","Ithaca","Smith, John","Cornell University"
"New York","New York","Williams, David","Columbia University"

datab.txt
"name","birth","birthplace"
"Jones, James","1954","Springfield"
"Jones, Jenny","1950","Middletown"
"Smith, John","1953","Albany"
"Williams, David","1954","Pittsfield"

resulting in datac.txt


name^state^location^school^birth^birthplace
Jones, Jenny^Connecticut^New Haven^Yale University^1950^Middletown
Jones, James^Massachusetts^Boston^Harvard University^1954^Springfield
Smith, John^New York^Ithaca^Cornell University^1953^Albany
Williams, David^New York^New York^Columbia University^1954^Pittsfield

My code is as follows:


use strict;
use warnings;
use Text::CSV;
use Data::Dumper;

my $csv = Text::CSV->new ({
    quote_char          => '"',
    escape_char         => '"',
    sep_char            => ',',
    eol                 => $\,
    always_quote        => 1,
    quote_space         => 1,
    quote_null          => 1,
    binary              => 0,
    keep_meta_info      => 1,
    allow_loose_quotes  => 0,
    allow_loose_escapes => 0,
    allow_whitespace    => 0,
    blank_is_undef      => 0,
    empty_is_undef      => 0,
    verbatim            => 0,
    auto_diag           => 0,
});

open (OUTPUT,'>datac.txt') or die $!;

my (%hash1);

my $file1 = 'dataa.txt';
open (CSV1,"<",$file1) or die $!;
while (<CSV1>) {
    if ($csv->parse($_)) {
        my @fields1 = $csv->fields();
        my $key1 = $fields1[2];
        splice(@fields1,2,1);
        my $line1 = join ( '^' , @fields1);
        %hash1 = ($key1 => $line1);
    } else {
        my $err = $csv->error_input;
        print "Failed to parse line: $err";
    }

    my $file2 = 'datab.txt';
    open (CSV2,"<",$file2) or die $!;
    while (<CSV2>) {
        if ($csv->parse($_)) {
            my @fields2 = $csv->fields();
            my $key2 = $fields2[0];
            splice(@fields2,0,1);
            my $line2 = join ( '^' , @fields2);
            my $new = ();
            if (exists $hash1{$key2}) {
                my $new = join ('^' , $key2 , $hash1{$key2} , $line2);
                print OUTPUT $new,"\n";
            }
        } else {
            my $err = $csv->error_input;
            print "Failed to parse line: $err";
        }
    }
}
close CSV1;
close CSV2;

Replies are listed 'Best First'.
Re: Need doubled quoted values
by ikegami (Patriarch) on Mar 02, 2015 at 02:22 UTC
    $csv->print(\*OUTPUT, \@fields);

    my @output_cols = qw( name state location school birth birthplace );

    my $csv_in = Text::CSV->new({
        auto_diag => 1,
        binary    => 1,
    });

    # Add always_quote => 1 here if every output field should be quoted.
    my $csv_out = Text::CSV->new({
        auto_diag => 1,
        binary    => 1,
        eol       => "\n",
        sep_char  => "^",
    });

    # Index the rows of dataa.txt by name.
    my %school_by_name;
    {
        open(my $fh_in, '<', 'dataa.txt') or die($!);
        $csv_in->column_names( $csv_in->getline($fh_in) );
        while (my $row = $csv_in->getline_hr($fh_in)) {
            $school_by_name{ $row->{name} } = $row;
        }
    }

    open(my $fh_in,  '<', 'datab.txt') or die($!);
    open(my $fh_out, '>', 'datac.txt') or die($!);

    $csv_in->column_names( $csv_in->getline($fh_in) );
    $csv_out->print($fh_out, \@output_cols);

    # getline_hr (not getline) is needed so that $row is a hash reference.
    while (my $row = $csv_in->getline_hr($fh_in)) {
        my $name = $row->{name};
        my $school_row = $school_by_name{$name}
            or die("Can't find school for $name\n");
        %$row = ( %$school_row, %$row );
        $csv_out->print($fh_out, [ @$row{@output_cols} ]);
    }
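The `$csv->print` suggestion above relies on the CSV object doing the quoting. As a minimal sketch of that idea, assuming Text::CSV is installed, the snippet below builds one output line with `combine` and `string` instead of writing to a file. The sample values come from the question's data; the variable names are only illustrative.

```perl
use strict;
use warnings;
use Text::CSV;

# Writer object: '^' as separator, every field wrapped in double quotes.
my $csv = Text::CSV->new({
    binary       => 1,
    sep_char     => '^',
    always_quote => 1,
}) or die "Cannot create Text::CSV object";

# combine() builds the line, string() returns it (no eol set, so no newline).
$csv->combine('Jones, Jenny', 'Connecticut', 'New Haven', '1950')
    or die "combine failed: " . $csv->error_input;
my $line = $csv->string;

print "$line\n";   # "Jones, Jenny"^"Connecticut"^"New Haven"^"1950"
```

Letting the module quote also means an embedded double quote is escaped by doubling it, which a hand-rolled join would not do.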
Re: Need doubled quoted values
by Tux (Canon) on Mar 02, 2015 at 09:11 UTC
    use Text::CSV_XS "csv";

    my $n = csv (in => "pm1118344b.csv", key => "name");
    csv (in => csv (
            in      => "pm1118344a.csv",
            headers => "auto",
            on_in   => sub { $_[1]{$_} = $n->{$_[1]{name}}{$_} for "birth", "birthplace" }),
         sep     => "^",
         headers => [qw(name state location school birth birthplace)],
         );
    =>
    name^state^location^school^birth^birthplace
    "Jones, Jenny"^Connecticut^"New Haven"^"Yale University"^1950^Middletown
    "Jones, James"^Massachusetts^Boston^"Harvard University"^1954^Springfield
    "Smith, John"^"New York"^Ithaca^"Cornell University"^1953^Albany
    "Williams, David"^"New York"^"New York"^"Columbia University"^1954^Pittsfield

    Enjoy, Have FUN! H.Merijn
      So "csv()" is slurping the files into memory?

        Or parts of it (using fragment and/or filter), or not at all. Depends on how you call it, but basically, yes.


        Enjoy, Have FUN! H.Merijn
Re: Need doubled quoted values
by Anonymous Monk on Mar 02, 2015 at 01:24 UTC
    what is your question?
      My question is, how do I double quote the individual values in datac.txt?
        Use map{"\"$_\""} before you use join

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)

        PS: Je suis Charlie!

        Update
        Added escaped double quotes

        update
        Fixed lost underscore
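The map-before-join suggestion above can be sketched in plain Perl as follows. The helper name `quote_fields` is hypothetical, and doubling embedded quotes is an addition beyond the one-liner in the reply (it follows the usual CSV convention); drop the substitution if the data never contains double quotes.

```perl
use strict;
use warnings;

# Wrap each field in double quotes, doubling any embedded double quote.
# (quote_fields is an illustrative helper, not part of any module.)
sub quote_fields {
    my @fields = @_;
    return map { my $f = $_; $f =~ s/"/""/g; qq{"$f"} } @fields;
}

my @fields = ('Jones, Jenny', 'Connecticut', 'New Haven', 'Yale University');
my $line   = join '^', quote_fields(@fields);

print "$line\n";   # "Jones, Jenny"^"Connecticut"^"New Haven"^"Yale University"
```

In the original script this would mean quoting `@fields1`, `@fields2`, and the key before each `join`.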