Since I am a Perl newbie, I thought I would seek wisdom before inadvertently turning my original problem into an X-Y problem!

I have a master list with names in the 1st column, followed by 2 more columns of numbers, the 1st number smaller than the 2nd. A name can appear more than once in this list, and the rows are not sorted in any order. When a name is repeated, the two numbers on each of its rows may or may not differ. Like so:

Alex 3 44
Barry 2 44
James 6 45
Drew 9 43
Alex 124 175
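One way to read such a list without losing repeated names or the original row order is to keep the rows in an array of records rather than a hash. The following is only a minimal sketch: the sub names `parse_line` and `read_records` are made up for illustration, and the data is read from an in-memory string instead of a real file.

```perl
use strict;
use warnings;

# Parse one "name start end" line into a record.
# Returns nothing for blank or malformed lines.
sub parse_line {
    my ($line) = @_;
    my ($name, $start, $end) = split ' ', $line;
    return unless defined $end && $start =~ /^\d+$/ && $end =~ /^\d+$/;
    return { name => $name, start => $start, end => $end };
}

# Read all records from a filehandle, keeping file order and repeated names.
sub read_records {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        my $rec = parse_line($line) or next;
        push @records, $rec;
    }
    return \@records;
}

# Demo on the master data from the post, using an in-memory filehandle.
my $master_text = "Alex 3 44\nBarry 2 44\nJames 6 45\nDrew 9 43\nAlex 124 175\n";
open my $fh, '<', \$master_text or die "cannot open in-memory file: $!";
my $master = read_records($fh);
close $fh;

printf "%s %d %d\n", @{$_}{qw(name start end)} for @$master;
```

Because the records stay in an array, both Alex rows survive as separate entries.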

Though it may be obvious: there is only ONE master list, and its file name is the first element of @ARGV (the first argument given on the bash command line).

Then I have multiple secondary files (anywhere from one to several; I don't know how many a priori), in the same format as the master file. They contain names that can be the same as those in the master list or, more commonly, a subset of them. So these files also have 3 columns: a name in the 1st column, followed by 2 columns of numbers, the 1st smaller than the 2nd. For example, the 1st secondary file's contents could be, in no particular alphabetical or numerical order:
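The argument handling described here could look roughly like the sketch below. The sub name `split_args` and the file names are hypothetical; in a real script the list passed in would be @ARGV.

```perl
use strict;
use warnings;

# Split the argument list: the first entry is the master file,
# everything after it is a secondary file.
sub split_args {
    my @args = @_;
    die "usage: $0 MASTER SECONDARY [SECONDARY ...]\n" if @args < 2;
    my ($master, @secondaries) = @args;
    return ($master, \@secondaries);
}

# Hypothetical file names, standing in for @ARGV.
my ($master_file, $secondary_files) = split_args('master.txt', 'sec1.txt', 'sec2.txt');

print "master: $master_file\n";
printf "secondary #%d: %s\n", $_, $secondary_files->[$_ - 1] for 1 .. @$secondary_files;
```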

James 1 22
Alex 89 120
Alex 134 155
Barry 12 24

While the 2nd secondary file's contents could be likewise:

Alex 154 174
James 29 45
Drew 19 54
Drew 139 154
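Since a name can occur several times in a secondary file, one plausible in-memory layout is a hash of arrays, mapping each name to every range seen for it. This is a sketch only; `index_by_name` is an illustrative name, and the lines are passed in directly rather than read from a file.

```perl
use strict;
use warnings;

# Build name => [ [start, end], ... ] from "name start end" lines,
# so that every occurrence of a repeated name is kept.
sub index_by_name {
    my (@lines) = @_;
    my %ranges;
    for my $line (@lines) {
        my ($name, $start, $end) = split ' ', $line;
        next unless defined $end;    # skip blank or short lines
        push @{ $ranges{$name} }, [ $start, $end ];
    }
    return \%ranges;
}

# First secondary file from the post.
my $sec1 = index_by_name('James 1 22', 'Alex 89 120', 'Alex 134 155', 'Barry 12 24');

for my $name (sort keys %$sec1) {
    my @pairs = map { join '-', @$_ } @{ $sec1->{$name} };
    print $name, ': ', join(', ', @pairs), "\n";
}
```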

My final output needs to contain the following information in grid form:

For each name from the primary (master) file: IF the name is present in a secondary file, AND that secondary numerical range is equal to or within the primary's numerical range, indicate it as present and include the secondary range's start and end numbers.

Otherwise, the fields for that name should be marked as absent, with the range start and end filled with zeroes or just left empty.
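The two rules above can be sketched as a containment test plus a lookup that yields the three output fields. The names `is_within` and `match_fields` are made up here. The post does not say what to report when several secondary ranges qualify, so returning the first one found is an assumption.

```perl
use strict;
use warnings;

# True when the secondary range [s, e] equals or lies inside the primary range [ps, pe].
sub is_within {
    my ($s, $e, $ps, $pe) = @_;
    return $s >= $ps && $e <= $pe;
}

# Given one primary range and the list of secondary ranges for the same name
# (or undef when the name is missing), return (status, start, end).
# Assumption: the first contained range is the one reported.
sub match_fields {
    my ($ps, $pe, $ranges) = @_;
    for my $r (@{ $ranges || [] }) {
        return ('present', @$r) if is_within(@$r, $ps, $pe);
    }
    return ('absent', 0, 0);
}

print join(' ', match_fields(2,   44,  [ [12, 24] ])),             "\n";
print join(' ', match_fields(124, 175, [ [89, 120], [134, 155] ])), "\n";
print join(' ', match_fields(9,   43,  undef)),                     "\n";
```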

Based on the rules above, my output should look like the table below, with some sort of informative headers for the output columns (these are ones I casually made up):

name   1'start  1'end  file#1   #1start  #1end  file#2   #2start  #2end
Alex   3        44     absent   0        0      absent   0        0
Barry  2        44     present  12       24     absent   0        0
James  6        45     present  1        22     present  29       45
Drew   9        43     absent   0        0      absent   0        0
Alex   124      175    present  134      155    present  154      174
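Putting the pieces together, a minimal sketch of the whole comparison might look like the following. It works on in-memory strings instead of files, the names `parse_rows` and `build_grid` are made up, and it reports the first contained range per file (an assumption). Note that under the strict "equal to or within" rule as stated, James 1 22 from the first secondary file starts before 6 and so comes out as absent in this sketch, unlike the sample table.

```perl
use strict;
use warnings;

# Parse "name start end" lines into a list of [name, start, end].
sub parse_rows {
    my ($text) = @_;
    return map { [ split ' ' ] } grep { /\S/ } split /\n/, $text;
}

# Build the grid: one output row per master row, three fields per secondary file.
sub build_grid {
    my ($master_text, @secondary_texts) = @_;

    # One name => [ [start, end], ... ] index per secondary file.
    my @indexes;
    for my $text (@secondary_texts) {
        my %by_name;
        push @{ $by_name{ $_->[0] } }, [ @{$_}[1, 2] ] for parse_rows($text);
        push @indexes, \%by_name;
    }

    my @grid;
    for my $row (parse_rows($master_text)) {
        my ($name, $ps, $pe) = @$row;
        my @out = ($name, $ps, $pe);
        for my $index (@indexes) {
            # First secondary range contained in this master range, if any.
            my ($hit) = grep { $_->[0] >= $ps && $_->[1] <= $pe }
                        @{ $index->{$name} || [] };
            push @out, $hit ? ('present', @$hit) : ('absent', 0, 0);
        }
        push @grid, \@out;
    }
    return @grid;
}

my $master = "Alex 3 44\nBarry 2 44\nJames 6 45\nDrew 9 43\nAlex 124 175\n";
my $sec1   = "James 1 22\nAlex 89 120\nAlex 134 155\nBarry 12 24\n";
my $sec2   = "Alex 154 174\nJames 29 45\nDrew 19 54\nDrew 139 154\n";

my @grid = build_grid($master, $sec1, $sec2);
print join("\t", @$_), "\n" for @grid;
```

Each master row is checked only against the ranges stored under its own name, and against every one of them, so the two Alex rows are resolved independently.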

Dear Monks - How should I go about doing this? This problem is a little too tricky for me because names can repeat, and each occurrence of a repeated name can have a different numerical range. This means that I might mistakenly try to match the wrong secondary range to the primary range and conclude that a match does NOT exist, when in reality I have compared ranges that should NOT have been compared, and should instead have looked at the numerical ranges of the other instance(s) of the name. Does that sort of make sense? Perhaps I am obfuscating by typing more than I should....
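The worry described above can be made concrete with a small sketch. The two Alex rows are taken from the first secondary file but listed in a hypothetical order chosen to trigger the problem, and all variable names are illustrative.

```perl
use strict;
use warnings;

# Same name, two ranges (order chosen for the demo).
my @secondary = ([ 'Alex', 134, 155 ], [ 'Alex', 89, 120 ]);
my ($ps, $pe) = (124, 175);    # master row "Alex 124 175"

# Pitfall: a plain hash keeps only the last range seen for each name.
my %last_only;
$last_only{ $_->[0] } = [ @{$_}[1, 2] ] for @secondary;
my $r     = $last_only{Alex};
my $naive = ($r->[0] >= $ps && $r->[1] <= $pe) ? 'present' : 'absent';

# Keeping every range and testing each one avoids the false "absent".
my %all;
push @{ $all{ $_->[0] } }, [ @{$_}[1, 2] ] for @secondary;
my @hits     = grep { $_->[0] >= $ps && $_->[1] <= $pe } @{ $all{Alex} };
my $thorough = @hits ? 'present' : 'absent';

print "plain hash: $naive, hash of arrays: $thorough\n";
```

The plain hash ends up comparing 89 120 against 124 175, which is exactly the "wrong range" comparison, while the hash of arrays still finds 134 155.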

Thanks in advance for your advice, have a nice weekend!

In reply to multi column multi file comparison by onlyIDleft
