Dear monks, I have a list of lines in a file. Each line is compared with every other line. If one line shares a substring (an element) with another line, the shorter of the two is deleted and only the longer line is kept.
Input file:

    mylist_1 sublist_153 sublist_87 sublist_876 sublist_78
    mylist_6 sublist_8
    mylist_2 sublist_12 sublist_34 sublist_09
    mylist_3 sublist_87 sublist_09
    mylist_7 sublist_8 sublist_9
    mylist_9 sublist_56

In the above example, line 2 shares a substring (sublist_8) with line 5. Since line 5 is longer than line 2, only line 5 is taken for the results. Another example: line 4 shares a substring (sublist_09) with line 3; since line 3 is longer, I take only that for the results.

Another example:

    apple orange cake juice
    apple fruits
    car van bus jeep sumo hat
    people van car

In the above example, apple is found in 2 lines, but I keep only the line which has more elements than the other. car is found in the last line and the 3rd line, but I take only the 3rd line as a hit because it has more elements than the last line. So my result would be:

    apple orange cake juice
    car van bus jeep sumo hat

My program:

    #!/usr/bin/perl
    # Attempt so far: this only drops exact duplicate lines;
    # it does not yet compare lines for shared elements.
    open(FH, "input_file.txt") or die "can not open input file";
    while ($line = <FH>) {
        @collect = split(/\s+/, $line);
        push(@aoa, join("#", @collect));
    }
    my %h;
    for (@aoa) {
        # keep a line only the first time it is seen
        push(@uaoa, $_) if !$h{$_}++;
    }
    foreach (@uaoa) {
        print "$_\n";
    }

The desired output for this problem:

    mylist_1 sublist_153 sublist_87 sublist_876 sublist_78
    mylist_7 sublist_8 sublist_9
    mylist_2 sublist_12 sublist_34 sublist_09
    mylist_9 sublist_56
please help :(
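
One possible reading of the rule described above, sketched in Perl. It is a minimal sketch and rests on assumptions that the post does not state: "elements" are whitespace-separated tokens, a line is dropped when any other line sharing at least one element has strictly more elements, lines of equal length are both kept, and output follows input order (the desired output above lists the lines in a different order). The name `keep_longest` is hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the lines that
# survive the pairwise comparison, in input order.
sub keep_longest {
    my @lines = @_;

    # Split each line into whitespace-separated elements.
    my @elems = map { [ split ' ', $_ ] } @lines;

    # Map each element to the indices of the lines that contain it.
    my %lines_with;
    for my $i (0 .. $#elems) {
        my %seen;
        for my $e (@{ $elems[$i] }) {
            push @{ $lines_with{$e} }, $i unless $seen{$e}++;
        }
    }

    my @kept;
    LINE: for my $i (0 .. $#elems) {
        next LINE unless @{ $elems[$i] };    # skip blank lines
        for my $e (@{ $elems[$i] }) {
            for my $j (@{ $lines_with{$e} }) {
                # Assumption: drop line $i if a line sharing an element has
                # strictly more elements; equal-length lines are both kept.
                next LINE if @{ $elems[$j] } > @{ $elems[$i] };
            }
        }
        push @kept, $lines[$i];
    }
    return @kept;
}

# The second example from the post, used as sample data.
my @sample = (
    'apple orange cake juice',
    'apple fruits',
    'car van bus jeep sumo hat',
    'people van car',
);
print "$_\n" for keep_longest(@sample);
# prints:
#   apple orange cake juice
#   car van bus jeep sumo hat
```

To run it on a file instead, the lines could be read with something like `chomp(my @lines = <$fh>)` on an opened filehandle and passed to `keep_longest`. Indexing lines by element avoids comparing every pair of lines directly, which may matter if the file is large.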

In reply to how to find the unique lines in a file? by patric
