Whew, I'm not sure where to begin.

https://en.wikipedia.org/wiki/Hosts_(file)
The hosts file contains lines of text consisting of an IP address in the first text field followed by one or more host names. Each field is separated by white space – tabs are often preferred for historical reasons, but spaces are also used. Comment lines may be included; they are indicated by a hash character (#) in the first position of such lines. Entirely blank lines in the file are ignored.

First, there is no rule that the first four lines of the hosts file will be comments.

Hosts files may have blank lines.

Any line may have a comment; any text after the # is taken to be a comment. A comment may make the line otherwise blank if there is no IP/names before it.
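Here is a minimal sketch of that comment rule on its own. The helper name strip_comment is made up for illustration (it is not from any module), and the sample lines are just examples:

```perl
use strict;
use warnings;

# Hypothetical helper: split one hosts line into (data, comment).
# Everything from the first '#' onward is the comment.
sub strip_comment {
    my ($line) = @_;
    my ($data, $comment) = split /#/, $line, 2;
    $data = '' unless defined $data;    # an empty line yields no fields at all
    $data =~ s/^\s+|\s+$//g;            # trim surrounding white space
    return ($data, $comment);
}

for my $line ('# just a comment', '', '192.168.2.1 wifi.mylan # my router') {
    my ($data, $comment) = strip_comment($line);
    printf "data=[%s] comment=[%s]\n", $data, defined $comment ? $comment : '';
}
```

A line whose data part comes back empty is either blank or comment-only, so it can be skipped.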

Not every IP in the hosts file has to be 127.0.0.1. Mine has lines like

192.168.2.1 wifi.mylan
so I can access things on my local net by name.

Notice "followed by one or more host names": multiple names are allowed on one line for the same IP address.
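As a small sketch of the "one IP, many names" rule, assuming the comment has already been removed from the line: the first white-space-separated field is the IP and everything after it is a name. The helper name parse_entry is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return (ip, names...) for one hosts entry,
# or an empty list if the line has no IP or no names.
sub parse_entry {
    my ($data) = @_;
    # split ' ' skips leading white space and treats runs of
    # spaces/tabs as a single separator
    my ($ip, @names) = split ' ', $data;
    return unless defined $ip && @names;
    return ($ip, @names);
}

my ($ip, @names) = parse_entry("192.168.1.234\tlxle0   lxle0.mylan");
print "$ip => @names\n";    # prints "192.168.1.234 => lxle0 lxle0.mylan"
```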

I think this will correctly read a hosts file, and do what you are after.

    use strict;
    use warnings;

    my $file_read='C:/WINDOWS/system32/drivers/etc/hosts';
    my $ha=[];      # one record per input line, in file order
    my $names={};   # host name => ip
    my $ips={};     # ip => [host names]

    #open (my $hf,'<',$file_read) or die "Can't open $file_read: $!";
    my $hf=\*DATA;
    while (my $line=<$hf>) {
      chomp $line;
      # print $line."\n";
      my $h={};
      $h->{line}=$line;
      # anything after the first '#' is a comment
      my ($pre,$comment)=split('#',$line,2);
      $h->{comment}=$comment if ($comment);
      if ($pre) {
        # split ' ' (rather than /\s+/) so leading white space
        # does not produce an empty first field
        my @parts=split(' ',$pre);
        if (scalar(@parts)>1) {
          my $ip=shift @parts;
          $h->{ip}=$ip;
          push @{$ips->{$ip}},@parts;
          $h->{names}=[@parts];
          for my $name (@parts) {$names->{$name}=$ip; }
        } # parts
      } # pre
      push @$ha,$h;
    } #line

    use Data::Dumper;
    print Dumper($ha);
    print Dumper($ips);
    print Dumper($names);

    #open($out, '>', $file_write)|| die "\n error opening file $file_write \n";
    my $out=\*STDOUT;
    print $out "#Hosts file\n";
    print $out "#Last Modified -> ". localtime() . "\n";
    print $out "# \n";
    print $out "# localhost: Needs to stay like this to work\n";
    print $out "127.0.0.1\t localhost\n";
    print $out "# \n";
    delete $names->{localhost} if ($names->{localhost});

    # sort by ip (as a string), then by name
    my @ksort=sort { $names->{$a} cmp $names->{$b} or $a cmp $b } keys(%$names);
    for my $name (@ksort) {
      print $out $names->{$name}."\t".$name."\n";
    }
    __DATA__
    # Copyright (c) 1993-1999 Microsoft Corp.
    #
    # This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
    #
    # This file contains the mappings of IP addresses to host names. Each
    # entry should be kept on an individual line. The IP address should
    # be placed in the first column followed by the corresponding host name.
    # The IP address and the host name should be separated by at least one
    # space.
    #
    # Additionally, comments (such as these) may be inserted on individual
    # lines or following the machine name denoted by a '#' symbol.
    #
    # For example:
    #
    # 102.54.94.97 rhino.acme.com # source server
    # 38.25.63.10 x.acme.com # x client host
    127.0.0.1 localhost ads.pointroll.com scanner2.malware-scan.com localhost adsys.townnews.com adimages.townnews.com ad.doubleclick.net pagead2.googlesyndication.com ad.yieldmanager.com view.atdmt.com ads.revsci.net servedby.advertising.com jeffcity30.autochooser.com performanceoptimizer.com cache.fimservecdn.com pixel.quantserve.com ads.yimg.com this.content.served.by.adshuffle.com img-cdn.mediaplex.com cache.fimservecdn.com adserving.cpxinteractive.com pixel.quantserve.com s0.2mdn.net
    127.0.0.1 www.zip2save.com d1.openx.org c3.openx.org partner.googleadservices.com media.ljworld.com everythingmidmo.com www.everythingmidmo.com edge.quantserve.com pixel.quantserve.com ad-g.doubleclick.net ads.yimg.com ad.wsod.com s0.2mdn.net s0.2mdn.net
    192.168.1.1 nat.mylan
    192.168.1.100 dhcp1.nat.mylan
    192.168.2.1 wifi.mylan
    192.168.2.100 dhcp1.wifi.mylan
    192.168.1.234 lxle0 lxle0.mylan
    192.168.1.200 me me.mylan
    192.168.1.200 me me.mylan
    192.168.254.251 wan.mylan

last part of result
    #Hosts file
    #Last Modified -> Fri Mar 24 01:39:59 2017
    #
    # localhost: Needs to stay like this to work
    127.0.0.1 localhost
    #
    127.0.0.1 ad-g.doubleclick.net
    127.0.0.1 ad.doubleclick.net
    127.0.0.1 ad.wsod.com
    127.0.0.1 ad.yieldmanager.com
    127.0.0.1 adimages.townnews.com
    127.0.0.1 ads.pointroll.com
    127.0.0.1 ads.revsci.net
    127.0.0.1 ads.yimg.com
    127.0.0.1 adserving.cpxinteractive.com
    127.0.0.1 adsys.townnews.com
    127.0.0.1 c3.openx.org
    127.0.0.1 cache.fimservecdn.com
    127.0.0.1 d1.openx.org
    127.0.0.1 edge.quantserve.com
    127.0.0.1 everythingmidmo.com
    127.0.0.1 img-cdn.mediaplex.com
    127.0.0.1 jeffcity30.autochooser.com
    127.0.0.1 media.ljworld.com
    127.0.0.1 pagead2.googlesyndication.com
    127.0.0.1 partner.googleadservices.com
    127.0.0.1 performanceoptimizer.com
    127.0.0.1 pixel.quantserve.com
    127.0.0.1 s0.2mdn.net
    127.0.0.1 scanner2.malware-scan.com
    127.0.0.1 servedby.advertising.com
    127.0.0.1 this.content.served.by.adshuffle.com
    127.0.0.1 view.atdmt.com
    127.0.0.1 www.everythingmidmo.com
    127.0.0.1 www.zip2save.com
    192.168.1.1 nat.mylan
    192.168.1.100 dhcp1.nat.mylan
    192.168.1.200 me
    192.168.1.200 me.mylan
    192.168.1.234 lxle0
    192.168.1.234 lxle0.mylan
    192.168.2.1 wifi.mylan
    192.168.2.100 dhcp1.wifi.mylan
    192.168.254.251 wan.mylan

YMMV


In reply to Re: parsing a terrible /etc/hosts by huck
in thread parsing a terrible /etc/hosts by f77coder
