However, the IP address and Action part of each line may contain duplicates; it's these duplicates I want to remove while still keeping the output in time order.

Now, if "1.2.3.4 PowerOff" occurs today at 08:18 and again today at 10:20, do you want to keep the first record and delete the later one, or vice versa?
If you keep the first and delete later repeats, you just keep the IP/Action data as hash keys, and assuming the data are being read in chronological order, only output lines whose IP/Action are not yet in the hash.
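A minimal sketch of that keep-the-first approach. The field layout (`<date> <time> <ip> <action>`) and the sample data are assumptions for illustration, not from the original post:

```perl
use strict;
use warnings;

# Keep the FIRST occurrence of each IP/Action pair, assuming the
# lines arrive in chronological order.
sub keep_first {
    my (@lines) = @_;
    my (%seen, @out);
    for my $line (@lines) {
        # assumed layout: "<date> <time> <ip> <action>"
        my (undef, undef, $ip, $action) = split ' ', $line;
        # skip the line if this IP/Action key is already in the hash
        push @out, $line unless $seen{"$ip $action"}++;
    }
    return @out;
}

print "$_\n" for keep_first(
    '2024-01-05 08:18 1.2.3.4 PowerOff',
    '2024-01-05 09:00 5.6.7.8 PowerOn',
    '2024-01-05 10:20 1.2.3.4 PowerOff',
);
```

Because the hash only records whether a key has been seen, the input can be streamed line by line with no sorting step at the end.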
In order to delete earlier occurrences and keep only the latest one, you have to store Date/Time as the value for each IP/Action key, and after you've read the whole input stream, sort the hash by its values in order to print each "hash_value hash_key" in chronological order.
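And a sketch of the keep-the-latest variant. Here the assumption is a `YYYY-MM-DD HH:MM` timestamp, which sorts chronologically as a plain string; the field layout is the same guess as above:

```perl
use strict;
use warnings;

# Keep only the LATEST occurrence of each IP/Action pair: store the
# timestamp as the hash value, then sort by value after reading everything.
sub keep_last {
    my (@lines) = @_;
    my %stamp;
    for my $line (@lines) {
        my ($date, $time, $ip, $action) = split ' ', $line;
        # later lines simply overwrite earlier timestamps for the same key
        $stamp{"$ip $action"} = "$date $time";
    }
    # sort the hash by its values and emit "hash_value hash_key"
    return map  { "$stamp{$_} $_" }
           sort { $stamp{$a} cmp $stamp{$b} } keys %stamp;
}

print "$_\n" for keep_last(
    '2024-01-05 08:18 1.2.3.4 PowerOff',
    '2024-01-05 09:00 5.6.7.8 PowerOn',
    '2024-01-05 10:20 1.2.3.4 PowerOff',
);
```

Unlike the keep-first version, this one cannot print as it reads: the whole input must be consumed before the final order is known.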
In reply to Re: Removing duplicate entries in a file which has a time stamp on each line by graff, in thread Removing duplicate entries in a file which has a time stamp on each line by lambden