Hi Guys, I have this file...
2009-01-08 09:29:19 ABCDEF 943973 MS08-011 Security Update for Microsoft Works Suite 2005 (KB943973)
2009-01-08 09:29:19 ABCDEF 943973 MS08-011 Security Update for Microsoft Works Suite 2005 (KB943973)
2009-01-08 09:29:19 ABCDEF 951944 MS08-055 Security Update for the 2007 Microsoft Office System (KB951944)
2009-01-08 09:29:19 ABCDEF 953432 Update for Microsoft Office Outlook 2003 (KB953432)
2009-01-08 09:29:19 ABCDEF 954038 MS08-051 Security Update for 2007 Microsoft Office System (KB954038)
2009-01-08 09:29:19 ABCDEF 954326 MS08-052 Security Update for the 2007 Microsoft Office System (KB954326)
2009-01-08 09:29:19 ABCDEF 956391 Cumulative Security Update for ActiveX Killbits for Windows 2000 (KB956391)
2009-01-08 09:29:20 ABCDEF 956828 MS08-072 Security Update for the 2007 Microsoft Office System (KB956828)
2009-01-08 09:29:20 ABCDEF 956828 MS08-072 Security Update for the 2007 Microsoft Office System (KB956828)
2009-01-08 09:29:20 ABCDEF 957832 Update for Microsoft Office Outlook 2003 Junk Email Filter (KB957832)
2009-01-08 09:29:22 ABCDEF 958439 MS08-074 Security Update for the 2007 Microsoft Office System (KB958439)
2009-01-08 09:29:22 ABCDEF 958439 MS08-074 Security Update for the 2007 Microsoft Office System (KB958439)

I want to remove the duplicate lines so the output looks like this:

953432 Update for Microsoft Office Outlook 2003 (KB953432) ABCDEF 2009-01-08
956391 Cumulative Security Update for ActiveX Killbits for Windows 2000 (KB956391) ABCDEF 2009-01-08
957832 Update for Microsoft Office Outlook 2003 Junk Email Filter (KB957832) ABCDEF 2009-01-08
MS08-011 943973 Security Update for Microsoft Works Suite 2005 (KB943973) ABCDEF 2009-01-08
MS08-051 954038 Security Update for 2007 Microsoft Office System (KB954038) ABCDEF 2009-01-08
MS08-052 954326 Security Update for the 2007 Microsoft Office System (KB954326) ABCDEF 2009-01-08
MS08-055 951944 Security Update for the 2007 Microsoft Office System (KB951944) ABCDEF 2009-01-08
MS08-072 956828 Security Update for the 2007 Microsoft Office System (KB956828) ABCDEF 2009-01-08
MS08-074 958439 Security Update for the 2007 Microsoft Office System (KB958439) ABCDEF 2009-01-08

But I am not getting the last line, "MS08-074 958439 Security Update for the 2007 Microsoft Office System (KB958439) ABCDEF 2009-01-08", in my output. This is the output I am getting:

953432 Update for Microsoft Office Outlook 2003 (KB953432) ABCDEF 2009-01-08
956391 Cumulative Security Update for ActiveX Killbits for Windows 2000 (KB956391) ABCDEF 2009-01-08
957832 Update for Microsoft Office Outlook 2003 Junk Email Filter (KB957832) ABCDEF 2009-01-08
MS08-011 943973 Security Update for Microsoft Works Suite 2005 (KB943973) ABCDEF 2009-01-08
MS08-051 954038 Security Update for 2007 Microsoft Office System (KB954038) ABCDEF 2009-01-08
MS08-052 954326 Security Update for the 2007 Microsoft Office System (KB954326) ABCDEF 2009-01-08
MS08-055 951944 Security Update for the 2007 Microsoft Office System (KB951944) ABCDEF 2009-01-08
MS08-072 956828 Security Update for the 2007 Microsoft Office System (KB956828) ABCDEF 2009-01-08

This is the code I am using:

#!/usr/local/bin/perl
open (MYFILE, 'file.txt');
@file = <MYFILE>;
close (MYFILE);
print (" - Found (" . scalar ( @file) . ")\n");
foreach $line (@file) {
    chomp ($line);
    @split=split(/\t/, $line);
    @date=split(/\s+/, $split[0]);
    push (@sort ,"@split[1]\t@split[2]\t@split[3]\t@split[4]\t@date[0]");
}
@sorted = sort (@sort);
foreach $Endpoint (@sorted) {
    $Endpoint =~ s/\s*$//;
    print "FIRST - $Endpoint\n";
}
undef (@sort);
print ("Found (" . scalar (@sorted) . ")\n");
print ("Remove duplicate lines\n");
$prev="";
$index=0;
foreach $line (@sorted) {
    $index++;
    if ("$prev" eq ""){
        $prev = $line;
    }else {
        if ($prev eq $line) {
        } else {
            push (@filtered,$prev);
        }
    }
    if ($index == scalar(@softed)){
        push (@filtered,$line);
    }
    $prev = $line;
}
@sorted = sort (@filtered);
undef (@filtered);
print ("Found (" . scalar (@sorted) . ")\n");
print ("format each line so that its formated as BulletinID,KBID,Title,Endpointname,Date\n");
foreach $line(@sorted){
    @split=split(/\t/, $line);
    print "LINEPRINT - $line\n";
    push (@sort ,"@split[2]\t@split[1]\t@split[3]\t@split[0]\t@split[4]");
}
@sorted = sort (@sort);
undef (@sort);
print ("Found (" . scalar (@sorted) . ")\n");
foreach $Endpoint (@sorted) {
    $Endpoint =~ s/\s*$//;
    print "$Endpoint\n";
}
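
A note on the duplicate-removal loop above: the end-of-list check compares $index with scalar(@softed), which looks like a typo for @sorted. Without "use strict", Perl quietly treats @softed as a new, empty array, so that check never fires and the final group (here MS08-074) is never pushed. Below is a minimal sketch of a hash-based alternative that needs no end-of-list special case. The sub name unique_lines and the sample records are hypothetical, made up for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the input lines with duplicates removed, keeping first-seen order.
# %seen counts how many times each line has appeared so far; grep keeps a
# line only on its first appearance (when the count is still 0).
sub unique_lines {
    my @lines = @_;
    my %seen;
    return grep { !$seen{$_}++ } @lines;
}

# Hypothetical tab-separated sample records, shaped like the sorted lines
# in the script above (endpoint, KB ID, bulletin ID, title, date).
my @sorted = (
    "ABCDEF\t956828\tMS08-072\tTitle A\t2009-01-08",
    "ABCDEF\t956828\tMS08-072\tTitle A\t2009-01-08",
    "ABCDEF\t958439\tMS08-074\tTitle B\t2009-01-08",
    "ABCDEF\t958439\tMS08-074\tTitle B\t2009-01-08",
);

my @filtered = unique_lines(@sorted);
print "Found (" . scalar(@filtered) . ")\n";
print "$_\n" for @filtered;
```

Because every line is checked against the hash as it is seen, the last group is kept like any other. Running the original script under "use strict" would also report undeclared variables such as @softed at compile time.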

In reply to Removing duplicate lines from a file by green_lakers
