comment on

'outer' join is perhaps not the best term to use; what you describe is a full join (or full outer join; but the word 'outer' is redundant).

The easiest way is probably indeed with DBI/DBD-CSV but I wanted to run this through postgres using foreign data wrapper 'file_fdw':

cat pm-1123847.sh
#!/bin/sh
#
# File_Deal - the key here is parent_cusp,deal,tranche
#
echo "parent_cusp|cusp|isin|deal|tranche|det_date|col_type
38375U|36182D|36182D1|HMAG|HMBSWEE|20150416|mortgage
383333|361333|3618333|HABS|HABSDDE|20150330|mortgage2" > /tmp/file_dea
+l.txt

#
# File_ATT - the key here is Vendor, deal, tranche 
#
echo "Vendor|visp|barnembly|deal|tranche|Fund|subamt|colamt|basamt
38375U|3DD82D|36FF333|HMAG|HMBSWEE|9010|765423|364633|46566
38EE33|361DD3|36LLE33|H99S|HAOOODE|2330|377233|347433|34488" > /tmp/fi
+le_att.txt

echo "
drop foreign table if exists t1 cascade;
drop foreign table if exists t2 cascade;
drop server        if exists s  cascade;
" | psql -X

echo "
create server s foreign data wrapper file_fdw;

create foreign table t1 (
    parent_cusp text,
    cusp text,
    isin text,
    deal text,
    tranche text,
    det_date text,
    col_type text
) server s
options (filename '/tmp/file_deal.txt', format 'csv', delimiter E'|', 
+header 'true');

create foreign table t2 (
    vendor text,
    visp text,
    barnembly text,
    deal text,
    tranche text,
    fund text,
    subamt text,
    colamt text,
    basamt text
) server s
options (filename '/tmp/file_att.txt', format 'csv', delimiter E'|', h
+eader 'true');

select * from t1;

select * from t2;

select 
    coalesce(t1.parent_cusp, t2.vendor) as parent_cusp
  , t1.cusp
  , t1.isin
  , coalesce(t1.deal, t2.deal)          as deal
  , coalesce(t1.tranche, t2.tranche)    as tranche
  , t1.det_date
  , t1.col_type
  , t2.fund
  , t2.subamt
  , t2.colamt
  , t2.basamt

from t1 full join t2 
  on (t1.parent_cusp, t1.deal, t1.tranche) =
     (t2.vendor, t2.deal, t2.tranche)
;
" | psql -X 

echo "File_Output

parent_cusp|cusp|isin|deal|tranche|det_date|col_type|Fund|subamt|colam
+t|basamt
38375U|36182D|36182D1|HMAG|HMBSWEE|20150416|mortgage|9010|765423|36463
+3|46566
383333|361333|3618333|HABS|HABSDDE|20150330|mortgage2||||
38EE33|||H99S|HAOOODE|||2330|377233|347433|34488
"
[download]


$ ./pm-1123847.sh
DROP FOREIGN TABLE
DROP FOREIGN TABLE
DROP SERVER
CREATE SERVER
CREATE FOREIGN TABLE
CREATE FOREIGN TABLE
 parent_cusp |  cusp  |  isin   | deal | tranche | det_date | col_type
-------------+--------+---------+------+---------+----------+---------
+--
 38375U      | 36182D | 36182D1 | HMAG | HMBSWEE | 20150416 | mortgage
 383333      | 361333 | 3618333 | HABS | HABSDDE | 20150330 | mortgage
+2
(2 rows)

 vendor |  visp  | barnembly | deal | tranche | fund | subamt | colamt
+ | basamt
--------+--------+-----------+------+---------+------+--------+-------
+-+--------
 38375U | 3DD82D | 36FF333   | HMAG | HMBSWEE | 9010 | 765423 | 364633
+ | 46566
 38EE33 | 361DD3 | 36LLE33   | H99S | HAOOODE | 2330 | 377233 | 347433
+ | 34488
(2 rows)

 parent_cusp |  cusp  |  isin   | deal | tranche | det_date | col_type
+  | fund | subamt | colamt | basamt
-------------+--------+---------+------+---------+----------+---------
+--+------+--------+--------+--------
 38375U      | 36182D | 36182D1 | HMAG | HMBSWEE | 20150416 | mortgage
+  | 9010 | 765423 | 364633 | 46566
 383333      | 361333 | 3618333 | HABS | HABSDDE | 20150330 | mortgage
+2 |      |        |        |
 38EE33      |        |         | H99S | HAOOODE |          |         
+  | 2330 | 377233 | 347433 | 34488
(3 rows)

File_Output

parent_cusp|cusp|isin|deal|tranche|det_date|col_type|Fund|subamt|colam
+t|basamt
38375U|36182D|36182D1|HMAG|HMBSWEE|20150416|mortgage|9010|765423|36463
+3|46566
383333|361333|3618333|HABS|HABSDDE|20150330|mortgage2||||
38EE33|||H99S|HAOOODE|||2330|377233|347433|34488
[download]

'Export' is easy with

copy ( <some SQL> ) to '/tmp/file_output.txt'
with (format csv, delimiter '|', header true)
[download]

I've not decided for myself how this compares to a variant with DBI.

I guess I'll go make such a DBI/DBD-CSV version now...

UPDATE: added the actual output file generation; some small text edits.

In reply to Re: Outer Join on 2 files by erix
in thread Outer Join on 2 files by healingtao

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.