'outer' join is perhaps not the best term to use; what you describe is a full join (or full outer join; but the word 'outer' is redundant).

The easiest way is probably indeed with DBI/DBD-CSV but I wanted to run this through postgres using foreign data wrapper 'file_fdw':

cat pm-1123847.sh #!/bin/sh # # File_Deal - the key here is parent_cusp,deal,tranche # echo "parent_cusp|cusp|isin|deal|tranche|det_date|col_type 38375U|36182D|36182D1|HMAG|HMBSWEE|20150416|mortgage 383333|361333|3618333|HABS|HABSDDE|20150330|mortgage2" > /tmp/file_dea +l.txt # # File_ATT - the key here is Vendor, deal, tranche # echo "Vendor|visp|barnembly|deal|tranche|Fund|subamt|colamt|basamt 38375U|3DD82D|36FF333|HMAG|HMBSWEE|9010|765423|364633|46566 38EE33|361DD3|36LLE33|H99S|HAOOODE|2330|377233|347433|34488" > /tmp/fi +le_att.txt echo " drop foreign table if exists t1 cascade; drop foreign table if exists t2 cascade; drop server if exists s cascade; " | psql -X echo " create server s foreign data wrapper file_fdw; create foreign table t1 ( parent_cusp text, cusp text, isin text, deal text, tranche text, det_date text, col_type text ) server s options (filename '/tmp/file_deal.txt', format 'csv', delimiter E'|', +header 'true'); create foreign table t2 ( vendor text, visp text, barnembly text, deal text, tranche text, fund text, subamt text, colamt text, basamt text ) server s options (filename '/tmp/file_att.txt', format 'csv', delimiter E'|', h +eader 'true'); select * from t1; select * from t2; select coalesce(t1.parent_cusp, t2.vendor) as parent_cusp , t1.cusp , t1.isin , coalesce(t1.deal, t2.deal) as deal , coalesce(t1.tranche, t2.tranche) as tranche , t1.det_date , t1.col_type , t2.fund , t2.subamt , t2.colamt , t2.basamt from t1 full join t2 on (t1.parent_cusp, t1.deal, t1.tranche) = (t2.vendor, t2.deal, t2.tranche) ; " | psql -X echo "File_Output parent_cusp|cusp|isin|deal|tranche|det_date|col_type|Fund|subamt|colam +t|basamt 38375U|36182D|36182D1|HMAG|HMBSWEE|20150416|mortgage|9010|765423|36463 +3|46566 383333|361333|3618333|HABS|HABSDDE|20150330|mortgage2|||| 38EE33|||H99S|HAOOODE|||2330|377233|347433|34488 "
$ ./pm-1123847.sh DROP FOREIGN TABLE DROP FOREIGN TABLE DROP SERVER CREATE SERVER CREATE FOREIGN TABLE CREATE FOREIGN TABLE parent_cusp | cusp | isin | deal | tranche | det_date | col_type -------------+--------+---------+------+---------+----------+--------- +-- 38375U | 36182D | 36182D1 | HMAG | HMBSWEE | 20150416 | mortgage 383333 | 361333 | 3618333 | HABS | HABSDDE | 20150330 | mortgage +2 (2 rows) vendor | visp | barnembly | deal | tranche | fund | subamt | colamt + | basamt --------+--------+-----------+------+---------+------+--------+------- +-+-------- 38375U | 3DD82D | 36FF333 | HMAG | HMBSWEE | 9010 | 765423 | 364633 + | 46566 38EE33 | 361DD3 | 36LLE33 | H99S | HAOOODE | 2330 | 377233 | 347433 + | 34488 (2 rows) parent_cusp | cusp | isin | deal | tranche | det_date | col_type + | fund | subamt | colamt | basamt -------------+--------+---------+------+---------+----------+--------- +--+------+--------+--------+-------- 38375U | 36182D | 36182D1 | HMAG | HMBSWEE | 20150416 | mortgage + | 9010 | 765423 | 364633 | 46566 383333 | 361333 | 3618333 | HABS | HABSDDE | 20150330 | mortgage +2 | | | | 38EE33 | | | H99S | HAOOODE | | + | 2330 | 377233 | 347433 | 34488 (3 rows) File_Output parent_cusp|cusp|isin|deal|tranche|det_date|col_type|Fund|subamt|colam +t|basamt 38375U|36182D|36182D1|HMAG|HMBSWEE|20150416|mortgage|9010|765423|36463 +3|46566 383333|361333|3618333|HABS|HABSDDE|20150330|mortgage2|||| 38EE33|||H99S|HAOOODE|||2330|377233|347433|34488

'Export' is easy with

copy ( <some SQL> ) to '/tmp/file_output.txt' with (format csv, delimiter '|', header true)

I've not decided for myself how this compares to a variant with DBI.

I guess I'll go make such a DBI/DBD-CSV version now...

UPDATE: added the actual output file generation; some small text edits.


In reply to Re: Outer Join on 2 files by erix
in thread Outer Join on 2 files by healingtao

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.