GeorgMN has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

First off, I searched the archives here. What I found did not really compute (in my head at least), or was too far removed from what I need to accomplish.

Here are the surrounding criteria. My goal is to take several values from a file, line by line. The items in these lines have an important relation to one another that needs to be kept intact. I then need to compare some of these elements to elements from another file. If some of the values on a given line match (the MAC address in my case), I need to splice the lines from both files together. The match between the two files is the scalar "$mac".

From what I have read so far, I should be using a hash reference for this. My problem: I am not a programmer and I am not sure how to use hashes correctly. I cannot seem to push the scalars into the hash.

my $ERRORCOUNT=0;
my %HASH;
my %OTHERHASH;
...
foreach my $LINE (@LINES) {
    # the sub-routine works and is returning the desired output
    my ($vlan, $mac, $int, $counter) = linesifter($LINE);
    if ($error eq 1) {
        ERRORCOUNT++;
    } else {
        my @VLANS, @MACS, @INTS);
        push $HASH{$MACS[$mac]}, [$VLANS[$vlan], [$INTS[$int]];
    }
    # here will follow similar code from another file
    # compare $HASH[$MACS[keymatch]] to $OTHERHASH[$MACS[$keymatch]]
    # then do *magic* and line up all items that are referenced by matching MAC key in both files onto one line. Store in a file.

sub linesifter {
    #...
    #...omitted
    # various regex'es for grep'ing data
    #...
    return ($vlan, $mac, $int, $error);
}

I am sure I got the hash declarations wrong. Any input on how I can compare two hashes or arrays, or the lines within each, to one another and interpolate the data is very welcome. Thank you as always.

#!/usr/bin/perl
#
#
use strict;
use warnings;
use File::Glob ':globally';

my @SHOW_ARP = <~/Documents/show_outputs/*arp*.txt>;
my @SHOW_MAC = <~/Documents/show_outputs/*mac*.txt>;
#my $CLEAN_ARP_REC = '~/Documents/clean_arp.tmp';
#my $CLEAN_MAC_REC = '~/Documents/clean_mac.tmp';
my $CLEAN_ARP_REC = '/home/gtreptow/Documents/show_outputs/clean_arp.tmp';
open (OUTPUT, "> $CLEAN_ARP_REC") or die "Could not open $CLEAN_ARP_REC.";

# my global hash that should contain all my matches
#my %ARP_HASH = ();
my %MAC_HASH;
my (@VLANS, @MACS, @INTS);

foreach my $FILES (@SHOW_MAC) {
    open (INFILE, '<', $FILES);
    print "using file: $FILES\n";
    my $FAILCOUNTER = 0;
    my $COUNTER = 0;
    chomp (my @LINES = <INFILE>);
    foreach my $LINE (@LINES) {
        my ($VLAN, $MAC, $INT, $COUNTER) = &mac_linesifter($LINE);    # Call sub-routine "linesifter" and pass every line to it
        if ($COUNTER eq 1) {
            $FAILCOUNTER++;
            #print "Saw an INCOMPLETE\n";    # Test each line
        } else {
            #print "$VLAN \t $MAC \t $INT \t $COUNTER \n";
            #$MAC_HASH = {};
            #$MAC_HASH->{MAC}= "$MAC";
            #$MAC_HASH->{VLAN}="$VLAN";
            #$MAC_HASH->{INT}="$INT";
            my ($VLANS, @MACS, @INTS);
            push @{ $MAC_HASH{$MACS[$MAC]} }, [$VLANS[$VLAN] ,$INTS[$INT]];
        }
    }
    # we also need to get the hostname
    print "!!! $FAILCOUNTER ARP records ignored due to incomplete entries !!!\n";
    print "\n";
    print "Contents of Hash ARP_HASH:\n";
    foreach my $VALUE (sort keys %MAC_HASH) {
        print "$VALUE, $ARP_HASH{$VALUE}[0],$ARP_HASH{$VALUE}[1],$ARP_HASH{$VALUE}[2]\n";
    }
}
close (OUTPUT);

### BEGIN SUB-ROUTINE 2 ###
sub mac_linesifter {
    my ($LINE) = @_;
    my $VLAN;
    my $MAC;
    my $INT;
    my $COUNTER = 0;
    if ($LINE !~ m/^\*/)    # skip if line does not start with "*"
    {
        $COUNTER=1;
        no warnings "exiting";
        next;
    }
    if ($LINE =~ m/^\s*$/)    # skip empty lines
    {
        $COUNTER=1;
        no warnings "exiting";
        next;
    }
    if ($LINE =~ m/static/)    # skip static entries
    {
        $COUNTER=1;
        no warnings "exiting";
        next;
    }
    if ($LINE =~ m/(^\*\ [0-9]*)/)    # match VLAN identifier
    {
        # remove leading "* "
        $VLAN = substr($1,2);
    }
    if ($LINE =~ m/([0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})/)    # match MAC
    {
        $MAC = $1;
    }
    # match either - example Eth101/0/48", Eth12/1, Po12 - add regex here for additional interface type matches
    if ($LINE =~ m/([A-Z]{1}[a-z]{2}[0-9]*\/[0-9]*\/[0-9]*|[A-Z]{1}[a-z]{2}[0-9]*|[Po]{2}[0-9]{2})/)
    {
        $INT = $1;
        #print "int match: $1\n";
    }
    return ($VLAN, $MAC, $INT, $COUNTER);    # return these values from sub-routine
}
### END SUB-ROUTINE 2 ###

Replies are listed 'Best First'.
Re: Merging hashes at key match
by Anonymous Monk on Mar 22, 2016 at 15:12 UTC

    Please see How do I post a question effectively? and http://sscce.org/ - it would be much easier to answer your question if you could provide sample input data, a minimal code example that compiles, and some expected output corresponding to the example input.

    Anyway, it seems like the line you are asking about is push $HASH{$MACS[$mac]}, [$VLANS[$vlan], [$INTS[$int]]; - it looks like you're trying to push an array reference ([$VLANS[$vlan], $INTS[$int]]) onto the value of $HASH{$MACS[$mac]}, but what is unclear from your code is what kind of a data structure that contains. Since you're using push, I'm going to guess it's an array reference. In that case, you need to dereference the array reference first via push @{...}, ...;, or in your case:

    push @{ $HASH{$MACS[$mac]} }, [$VLANS[$vlan], $INTS[$int]];

    But whether that'll get you the data structure you want depends on what you want your output data structure to look like, which you haven't shown us.

    The document perldsc is a nice cookbook of code examples for Perl data structures that will very likely help you.
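To make the push @{ ... } form concrete, here is a minimal sketch of a hash of arrays keyed by MAC address. The name %by_mac and the inline record list are assumptions made for illustration; the values are borrowed from the sample output posted further down the thread.

```perl
use strict;
use warnings;

my %by_mac;   # hypothetical name: maps MAC => list of [vlan, interface] pairs

# Inline stand-in for parsed lines; values borrowed from the thread's sample output.
my @records = (
    [ '2491', 'd89d.67f4.495c', 'Eth101/1/15' ],
    [ '2271', '0050.56b0.7e6e', 'Eth101/1/5'  ],
    [ '2280', '0050.56b0.7e80', 'Po21'        ],
);

for my $rec (@records) {
    my ($vlan, $mac, $int) = @$rec;
    # @{ ... } dereferences the hash value as an array. Perl creates the
    # array on first use (autovivification), so no initialisation is needed.
    push @{ $by_mac{$mac} }, [ $vlan, $int ];
}

# Walk the structure: each hash value is an array of two-element arrays.
for my $mac (sort keys %by_mac) {
    for my $pair (@{ $by_mac{$mac} }) {
        print "$mac: vlan=$pair->[0] int=$pair->[1]\n";
    }
}
```

Note that the MAC string itself is the hash key here; there is no need for intermediate @MACS, @VLANS or @INTS arrays.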

      Hello - I updated the code field. Maybe you can take another look. Thank you - Georg
Re: Merging hashes at key match
by FreeBeerReekingMonk (Deacon) on Mar 22, 2016 at 20:52 UTC
    my @VLANS, @MACS, @INTS);
    push $HASH{$MACS[$mac]}, [$VLANS[$vlan], [$INTS[$int]];

    Here, I assume you want to store all the VLANs and MAC addresses you find. Instead of pushing them as blobs, do this:

    In each iteration you get a new $mac; just add one to the hash value for that $mac: $MACS{$mac}++;

    This way, you can get a list back later, like so: @ALL_MACS = sort keys %MACS

    AND, you can check if that MAC address came by more than once by checking the value $MACS{$mac}

    But instead of that, one trick I often use is or-ing, like so:

    When found in mac.txt: $MACS{$mac} |= 1;

    When found in arp.txt: $MACS{$mac} |= 2;

    This way, when $MACS{$mac} == 3 it is in both. And it can be multiple times in either one of them, the value is still 3.
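A small sketch of the or-ing trick, under the assumption that the MAC addresses have already been extracted from each file. The variable names ($IN_MAC, $IN_ARP, %seen) and the inline lists are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical flag values: one bit per source file.
my ($IN_MAC, $IN_ARP) = (1, 2);

# Stand-ins for MACs read from each file (values from the thread's samples).
my @from_mac = ('d89d.67f4.495c', '0050.56b0.7e80', 'd89d.67f4.495c');
my @from_arp = ('d89d.67f4.495c', '0008.e3ff.fd90');

my %seen;
$seen{$_} |= $IN_MAC for @from_mac;   # set bit 1, however often the MAC appears
$seen{$_} |= $IN_ARP for @from_arp;   # set bit 2

for my $mac (sort keys %seen) {
    my $where = $seen{$mac} == ($IN_MAC | $IN_ARP) ? 'both files'
              : $seen{$mac} == $IN_MAC             ? 'mac file only'
              :                                      'arp file only';
    print "$mac: $where\n";
}
```

Because the flags are combined with a bitwise OR rather than added, a MAC that shows up twice in the same file still ends up with the same value.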

    Storing your data.

    # make the data a string
    $value = join("|", $VLAN, $MAC, $INT);
    # put them into a 1-dimensional hash (however, servers with the same mac overwrite each other, although you can check)
    if( defined $HASH{$mac} ) {
        warn "Two servers with the same $mac";
    }else{
        $HASH{$mac}=$value;
    }

    This way, you can iterate later over your $mac with:

    for my $mac (keys %HASH){
        my $value = $HASH{$mac};
        # split takes a pattern, so the | has to be escaped
        my ($VLAN, $MAC, $INT) = split(/\|/, $value);
    }

    # or put them in a multi-dimensional hash.
    $HASH{$mac}{$value}++;

    This way, you can iterate later over your $mac with:

    for my $mac (keys %HASH){
        for my $value (keys %{$HASH{$mac}}){
        }
    }
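For comparison, here is a rough sketch of the two storage styles side by side: a flat string per MAC, and a nested hash with named fields. The second form is a variation on the multi-dimensional idea (named fields instead of a counter), and the names %flat and %nested are assumptions for illustration only.

```perl
use strict;
use warnings;

# One record, values borrowed from the thread's sample output.
my ($vlan, $mac, $int) = ('2491', 'd89d.67f4.495c', 'Eth101/1/15');

# Option 1: flatten the record into a single string keyed by MAC.
my %flat;
$flat{$mac} = join('|', $vlan, $mac, $int);

# split takes a pattern, so the "|" delimiter has to be escaped;
# an unescaped | is regex alternation and would split on every character.
my ($v, $m, $i) = split /\|/, $flat{$mac};
print "flat:   vlan=$v mac=$m int=$i\n";

# Option 2: keep named fields in a hash of hashes; no delimiter to worry about.
my %nested;
$nested{$mac} = { VLAN => $vlan, INT => $int };
print "nested: vlan=$nested{$mac}{VLAN} int=$nested{$mac}{INT}\n";
```

The nested form is usually easier to extend later, for example by adding an IP field once the second file has been read.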

    Now post some example mac.txt and arp.txt so we can really help you.

      Thank you all kindly for your input! Georg

      Content of the files I am trying to combine, or at least to tie the IP address to the physical interface:

      [gtreptow@nostromo show_outputs]$ cat *arp* | grep 10\.60
      ARPA Vlan2010
      Internet 10.60.198.59 1 d8d3.85a5.e3c4 ARPA Vlan198
      Internet 10.60.226.31 0 0050.56b0.3d17 ARPA Vlan2260
      Internet 10.60.227.25 1 0050.56b0.7e6e ARPA Vlan2270
      Internet 10.60.226.24 0 0050.56b0.3580 ARPA Vlan2260
      Internet 10.60.228.31 2 0050.56b0.3569 ARPA Vlan2280
      Internet 10.60.226.25 0 0050.56b0.7fb0 ARPA Vlan2260
      Internet 10.60.249.1 1 0009.0f09.0a01 ARPA Vlan2490
      Internet 10.60.198.62 3 18a9.0559.c862 ARPA Vlan198
      Internet 10.60.197.61 1 0015.1766.1cc8 ARPA Vlan1970
      Internet 10.60.227.26 0 0050.56b0.7f8a ARPA Vlan2270
      Internet 10.60.197.60 2 047d.7b4a.70ee ARPA Vlan1970
      Internet 10.60.226.27 0 0050.56b0.19a3 ARPA Vlan2260
      Internet 10.60.250.28 4 001b.90f0.67c1 ARPA Vlan250
      Internet 10.60.249.31 0 d89d.67f4.495c ARPA Vlan2490
      Internet 10.60.227.5 - 0008.e3ff.fd90 ARPA Vlan2270

      [gtreptow@nostromo show_outputs]$ cat hcs51-4_sh_mac_add.txt
      HCS51-4# sh mac address-table | ex Po24
      Legend:
      * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
      age - seconds since last seen,+ - primary entry using vPC Peer-Link
      VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID
      ---------+-----------------+--------+---------+------+----+------------------
      * 3920 000c.2999.7b8c dynamic 0 F F Eth101/1/33
      * 3920 000c.29b5.f5b3 dynamic 0 F F Eth101/1/33
      * 3920 0014.a998.c962 dynamic 0 F F Eth102/1/27
      * 2491 0050.56b0.3858 dynamic 0 F F Eth101/1/5
      * 2491 0050.56b0.5c3e dynamic 20 F F Eth101/1/5
      * 2491 0050.56b0.608e dynamic 0 F F Eth101/1/5
      * 2491 0050.56b0.66fc dynamic 0 F F Eth101/1/5
      * 2491 0050.56b0.7d2b dynamic 0 F F Po21
      * 2491 0050.56b0.7e40 dynamic 0 F F Eth101/1/5
      * 2491 d89d.67f4.495c dynamic 0 F F Eth101/1/15
      * 2280 0050.56b0.52e8 dynamic 0 F F Po21
      * 2280 0050.56b0.57df dynamic 0 F F Eth101/1/14
      * 2280 0050.56b0.7258 dynamic 0 F F Po21
      * 2280 0050.56b0.7e80 dynamic 0 F F Po21
      * 2280 0050.56b0.7f99 dynamic 0 F F Po21
      * 2280 0050.56b0.7f9a dynamic 0 F F Po21
      * 2280 0050.56b0.7fe5 dynamic 0 F F Po21
      * 2271 0050.56b0.1186 dynamic 0 F F Eth101/1/5
      * 2271 0050.56b0.1eee dynamic 10 F F Eth101/1/5
      * 2271 0050.56b0.24f1 dynamic 0 F F Eth101/1/14
      * 2271 0050.56b0.7e6e dynamic 0 F F Eth101/1/5
      * 2271 0050.56b0.7f8a dynamic 0 F F Eth101/1/5
      * 2261 0050.56b0.221d dynamic 0 F F Eth101/1/14
      * 2261 0050.56b0.357f dynamic 0 F F Eth101/1/14
      * 2261 0050.56b0.7e59 dynamic 0 F F Eth101/1/14
      * 2261 0050.56b0.7f8e dynamic 0 F F Eth101/1/14
      * 2260 0050.56b0.19a3 dynamic 0 F F Eth101/1/14
      * 2260 0050.56b0.7de5 dynamic 0 F F Eth101/1/14
      * 2240 0050.56b0.0010 dynamic 0 F F Eth101/1/14
      * 2240 0050.56b0.0012 dynamic 0 F F Eth101/1/14
      * 2240 0050.56b0.2556 dynamic 0 F F Eth101/1/14
      * 2240 0050.56b0.6261 dynamic 0 F F Eth101/1/14
      * 2040 0050.56b0.002f dynamic 100 F F Eth101/1/14
      [gtreptow@nostromo show_outputs]$
      ... and so on....
        How about this? (I tried to make it simple and readable instead of terse. Ask if you do not understand a part, or do not know how to extend it to what you need.)

        #!/usr/bin/perl
        use strict;
        use warnings;

        my @FILENAMES_ARP = <./*arp*.txt>;
        my @FILENAMES_MAC = <./*mac*.txt>;
        # use File::Slurp; this way this works: my @LINES =read_file($FILE);
        my %MACS;
        my $FILE;
        my $CLEAN_ARP_REC = './clean_arp.tmp';
        open (OUTPUT, ">", $CLEAN_ARP_REC) or die "Could not open $CLEAN_ARP_REC because $!";

        for $FILE (@FILENAMES_MAC){
            print "processing $FILE...\n";
            if( open(FH, "<", $FILE) ){
                while(<FH>){
                    chomp;
                    next unless /^\*/;   # skip if line does not start with "*", for example, empty lines
                    next if /static/;    # skip static entries
                    my @F = split(/\s+/, $_);
                    my $VLAN = $F[1];
                    my $MAC = $F[2];
                    my $INT = $F[7];
                    next unless $MAC=~/^[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}$/;
                    if($MACS{$MAC}){
                        warn "$MAC already defined, skipping entry at line $.\n";
                    }else{
                        $MACS{$MAC}{VLAN}=$VLAN;
                        $MACS{$MAC}{INT}=$INT;
                    }
                }
                close FH;
            }else{
                warn "Could not open $FILE, skipping. Errormsg=$!\n";
            }
        }

        for $FILE (@FILENAMES_ARP){
            print "Processing $FILE...\n";
            if( open(FH, "<", $FILE) ){
                while(<FH>){
                    chomp;
                    my @F = split(/\s+/, $_);
                    my $MAC = $F[3];
                    my $VLAN = $F[5];
                    my $IP = $F[1];
                    next unless $IP;   # must be defined?
                    if($MACS{$MAC}){
                        $MACS{$MAC}{IP}=$IP;   # add this extra data
                        $MACS{$MAC}{_}=1;      # have to "merge"
                    }
                }
                close FH;
            }else{
                warn "Could not open $FILE, skipping. Errormsg=$!\n";
            }
        }

        for my $MAC (sort keys %MACS){
            if( defined $MACS{$MAC}{_} ){
                print OUTPUT $MAC . ", "
                    . $MACS{$MAC}{IP} . ", "
                    . $MACS{$MAC}{VLAN} . ", "
                    . $MACS{$MAC}{INT} ."\n";
            }else{
                # normal line, do not print ?
            }
        }
        close OUTPUT;
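To try the merge idea without any files on disk, the following cut-down sketch runs the same two passes over a few lines copied from the sample output in this thread. It is only an illustration under those assumptions: the inline arrays, the lowercase names (%macs, @merged) and the use of split ' ' are choices made for this sketch, not part of the script above.

```perl
use strict;
use warnings;

# A few lines copied from the sample output posted in this thread.
my @mac_lines = (
    '* 2491 d89d.67f4.495c dynamic 0 F F Eth101/1/15',
    '* 2271 0050.56b0.7e6e dynamic 0 F F Eth101/1/5',
    '* 2280 0050.56b0.7e80 dynamic 0 F F Po21',
);
my @arp_lines = (
    'Internet 10.60.249.31 0 d89d.67f4.495c ARPA Vlan2490',
    'Internet 10.60.227.25 1 0050.56b0.7e6e ARPA Vlan2270',
    'Internet 10.60.227.5 - 0008.e3ff.fd90 ARPA Vlan2270',
);

# Pass 1: index the MAC table by MAC address.
my %macs;
for my $line (@mac_lines) {
    my (undef, $vlan, $mac, undef, undef, undef, undef, $int) = split ' ', $line;
    $macs{$mac} = { VLAN => $vlan, INT => $int };
}

# Pass 2: attach the IP from the ARP table where the MAC is already known.
for my $line (@arp_lines) {
    my (undef, $ip, undef, $mac) = split ' ', $line;
    next unless defined $mac && exists $macs{$mac};
    $macs{$mac}{IP} = $ip;
}

# Output: only MACs that were found in both tables.
my @merged;
for my $mac (sort keys %macs) {
    my $rec = $macs{$mac};
    next unless defined $rec->{IP};
    push @merged, join(', ', $mac, $rec->{IP}, $rec->{VLAN}, $rec->{INT});
}
print "$_\n" for @merged;
```

With these six input lines, only the two MAC addresses present in both lists are printed; the Po21 entry has no ARP match and the last ARP line has no MAC-table match, so both are left out.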