Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have a file whose contents look like this:

pet="cat"|hate="rat"|like="dog"
hate="rat"|like="dog"|pet="cat"
hate="rat"|like="horse"|pet="cat"
pet="cow"|hate="rat"|like="dog"
hate="rat"|like="dog"|pet="cow"

The output I'm looking for is:

pet="cat"|hate="rat"|like="dog"
pet="cow"|hate="rat"|like="dog"

Whenever I find the first instance of a unique value for pet and hate, I want to print that line.

Thanks

2018-05-16 Athanasius added code and paragraph tags

Replies are listed 'Best First'.
Re: make a search pattern and remove duplicate from the file
by jimpudar (Pilgrim) on May 11, 2018 at 04:24 UTC

    You should use HTML tags to format your post. It's very difficult to tell what you are asking the way you posted it!

    I'm assuming you meant this:

    Hi,

    I have a file whose contents look like this:

    pet="cat"|hate="rat"|like="dog"
    hate="rat"|like="dog"|pet="cat"
    hate="rat"|like="horse"|pet="cat"
    pet="cow"|hate="rat"|like="dog"
    hate="rat"|like="dog"|pet="cow"

    The output I'm looking for is:

    pet="cat"|hate="rat"|like="dog"
    pet="cow"|hate="rat"|like="dog"

    Whenever I find the first instance of a unique value for pet and hate, I want to print that line.

    Thanks

    You should have tried your hand at writing a solution and posted the code, but since I'm feeling generous, here is a solution to which you can pipe your input file:

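    #! /usr/bin/env perl
    use strict;
    use warnings;

    my %pet_hate;    # first record seen for each pet/hate pair
    my @order;       # pet/hate keys in order of first appearance

    while (<>) {
        # Skip any line that is missing one of the three fields.
        my ($pet)  = /pet="(\w+)"/  or next;
        my ($hate) = /hate="(\w+)"/ or next;
        my ($like) = /like="(\w+)"/ or next;

        my $key = "$pet:$hate";
        next if exists $pet_hate{$key};    # keep only the first occurrence
        push @order, $key;
        $pet_hate{$key} = { pet => $pet, hate => $hate, like => $like };
    }

    # Print in input order; "values %pet_hate" returns records in arbitrary order.
    foreach (@pet_hate{@order}) {
        printf(qq{pet="%s"|hate="%s"|like="%s"\n},
            $_->{pet}, $_->{hate}, $_->{like});
    }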

    Best,

    Jim

Re: make a search pattern and remove duplicate from the file
by NetWallah (Canon) on May 11, 2018 at 06:36 UTC
    Here is a one-liner that generates the structure you need to analyze this info:
    $ perl -MData::Dumper -lanF\\\| -e 'for (@F){@x=split /=/;$h{$x[0]}=$x[1]};$pet_hate{$h{pet}}{HATE}{$h{hate}}++;$pet_hate{$h{pet}}{LIKE}{$h{like}}++; }{print Dumper \%pet_hate' your-file.txt
    $VAR1 = {
              '"cow"' => {
                           'HATE' => {
                                       '"rat"' => 3
                                     },
                           'LIKE' => {
                                       '"dog"' => 3
                                     }
                         },
              '"cat"' => {
                           'LIKE' => {
                                       '"horse"' => 1,
                                       '"dog"' => 2
                                     },
                           'HATE' => {
                                       '"rat"' => 3
                                     }
                         }
            };
    I did not understand your specification well enough to figure out how you arrived at the output you want, but you can loop through this structure (or something close to it) to generate that output.
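
    A rough sketch of such a loop, for illustration only. The hash is copied from the Data::Dumper output above, and pairs_from_structure is a hypothetical helper name, not part of the one-liner.

```perl
use strict;
use warnings;

# Copied from the Data::Dumper output above. The keys keep their literal
# double quotes because the one-liner splits on "=" without stripping them.
my %pet_hate = (
    '"cow"' => { HATE => { '"rat"' => 3 }, LIKE => { '"dog"' => 3 } },
    '"cat"' => { HATE => { '"rat"' => 3 }, LIKE => { '"horse"' => 1, '"dog"' => 2 } },
);

# Hypothetical helper: one string per pet/hate pair, listing every
# "like" value recorded for that pet.
sub pairs_from_structure {
    my ($data) = @_;
    my @out;
    for my $pet (sort keys %$data) {
        my @likes = sort keys %{ $data->{$pet}{LIKE} };
        for my $hate (sort keys %{ $data->{$pet}{HATE} }) {
            push @out, "pet=$pet|hate=$hate|like=" . join(',', @likes);
        }
    }
    return @out;
}

print "$_\n" for pairs_from_structure(\%pet_hate);
```

    The pairs come out in sorted order rather than input order, and the like values are pooled per pet, so this cannot tell which like value belonged to the first line seen.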

                    Memory fault   --   brain fried

      I think what he is asking for is to print out only the first instance of any line which has a unique value of the pet, hate tuple.

      Any other lines should be ignored.

      Since the structure you are building does not keep track of which line came first, looping over it will never bring you to the answer he is looking for.

      If you really wanted to do this with a one-liner (and I'm definitely not saying you should), I would do it like this, which prints exactly the result he asked for:

      perl -wlF'\|' -e 'for (@F) { /(\w+)="(\w+)"/; $r{$1} = $2 } $x="$r{pet}:$r{hate}"; $s{$x} ? next : ++$s{$x} && print' <input
      pet="cat"|hate="rat"|like="dog"
      pet="cow"|hate="rat"|like="dog"

      Best,

      Jim

        Your comment propagates the OP's usage of the word "unique" - and that causes confusion because your implementation reports the "first" usage of the tuple, which may not necessarily be a "unique" occurrence.

        In order to find "unique", you will need to do that after all records have been ingested, and post-process, as my code does.
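
        To make the distinction concrete, here is a minimal sketch of the strict reading of "unique": count every pet/hate pair first, then keep only records whose pair occurred exactly once. The helper name unique_pet_hate and the extra emu/fox record are hypothetical, added only so the example has something to print.

```perl
use strict;
use warnings;

# Hypothetical helper: return the records whose pet/hate pair occurs
# exactly once in the whole input (two passes: count, then filter).
sub unique_pet_hate {
    my @lines = @_;
    my (%count, @records);
    for my $line (@lines) {
        my ($pet)  = $line =~ /pet="(\w+)"/  or next;
        my ($hate) = $line =~ /hate="(\w+)"/ or next;
        my $key = "$pet:$hate";
        $count{$key}++;
        push @records, [ $key, $line ];
    }
    return map { $_->[1] } grep { $count{ $_->[0] } == 1 } @records;
}

my @input = (
    'pet="cat"|hate="rat"|like="dog"',
    'hate="rat"|like="dog"|pet="cat"',
    'hate="rat"|like="horse"|pet="cat"',
    'pet="cow"|hate="rat"|like="dog"',
    'hate="rat"|like="dog"|pet="cow"',
    'pet="emu"|hate="fox"|like="dog"',    # hypothetical extra record
);

print "$_\n" for unique_pet_hate(@input);
```

        On the OP's five sample lines alone this prints nothing, because cat/rat occurs three times and cow/rat twice. Only the added emu/fox record survives, which is why the expected output in the question implies "first occurrence" rather than "occurs only once".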

        Anyway - this only illustrates non-specific specifications - and these nits are not worth picking.

        Cheers.

                        Memory fault   --   brain fried