onkar has asked for the wisdom of the Perl Monks concerning the following question:

I am parsing a GNU/Linux configuration file to automate the process.
The script prints:

    CONFIG_TULIP_NAP => y
    CONFIG_USB_SISUSBVGA_CO => y
    CONFIG_PC30 => m
    CONFIG_USB_TES => m
    CONFIG_I2C_NFORCE => m
    CONFIG_MTD_SBC_GX => m
    CONFIG_NTFS_DEBUG => Z
    CONFIG_DRM_SI => m
    CONFIG_SCSI_ISCSI_ATTR => m
    CONFIG_ACPI_SLEE => y
    CONFIG_DLCI_MA => 8
    CONFIG_SCSI_IPR_DUMP => Z
    CONFIG_ECONET_NATIV => y
    CONFIG_BRIDG => m
    CONFIG_IP_ADVANCED_ROUTE => y
    CONFIG_AIC79XX_REG_PRETTY_PRIN => y
    CONFIG_IF => m

Some of the keys are chopped off, like CONFIG_BRIDG. There are many entries in the config file that start with this name:

    CONFIG_BRIDGE_NF_EBTABLES=m
    CONFIG_BRIDGE_EBT_BROUTE=m
    CONFIG_BRIDGE_EBT_T_FILTER=m
    CONFIG_BRIDGE_EBT_T_NAT=m
    CONFIG_BRIDGE_EBT_802_3=m
    CONFIG_BRIDGE_EBT_AMONG=m
    CONFIG_BRIDGE_EBT_ARP=m
    CONFIG_BRIDGE_EBT_IP=m
    CONFIG_BRIDGE_EBT_LIMIT=m
    CONFIG_BRIDGE_EBT_MARK=m
    CONFIG_BRIDGE_EBT_PKTTYPE=m
    CONFIG_BRIDGE_EBT_STP=m
    CONFIG_BRIDGE_EBT_VLAN=m
    CONFIG_BRIDGE_EBT_ARPREPLY=m
    CONFIG_BRIDGE_EBT_DNAT=m
    CONFIG_BRIDGE_EBT_MARK_T=m
    CONFIG_BRIDGE_EBT_REDIRECT=m
    CONFIG_BRIDGE_EBT_SNAT=m
    CONFIG_BRIDGE_EBT_LOG=m
    CONFIG_BRIDGE_EBT_ULOG=m

Similarly, CONFIG_AIC79XX_REG_PRETTY_PRINT=y is shown only as CONFIG_AIC79XX_REG_PRETTY_PRIN => y

I am maintaining a hash with the configuration name as the key and y, m, or Z as the value.

I am puzzled as to why some keys are getting chopped off while others are not?

====== code =>

    #!/usr/bin/perl
    # This part of the script takes the old config
    # file and sets only those options into the new
    # config files which are set.
    open CONF1 ,"<","config-2.6.18-custom" or die $!;
    open CONF2 ,"<","config-2.6.16.27-custom" or die $!;
    my @lines1= <CONF1>;
    my @lines2= <CONF2>;

    sub trim{
        my $string = shift;
        $string =~ s/^\s+|\s+$//g;
        return $string;
    }

    sub get_config_info_in_mem {
        my %hash_config = ();
        my ($lref) = @_;
        my (@lines) = @$lref;
        for $line (@lines) {
            if (substr($line,0,6) eq "CONFIG" ) {
                $i=index($line,"=");
                $val=substr($line,$i+1,length($line)-1-$i);
                $conf=substr($line,0,$i-1);
                # print "$conf=$val\n";
                $conf=trim($conf);
                $val=trim($val);
                $hash_config{$conf} = $val;
            }
            else {
                if(substr($line,0,8) eq "# CONFIG") {
                    $i=index($line,"CONFIG_");
                    $i_space=index($line," ",$i+1);
                    $conf=substr($line,$i,$i_space-$i);
                    $val="Z";
                    # print "$conf=$val\n";
                    $conf=trim($conf);
                    $val=trim($val);
                    $hash_config{$conf} = $val;
                }
            }
        }
        return %hash_config;
    }

    sub print_hash_in_mem {
        my ($hc) = @_;
        my %hash_config =%$hc;
        while( my($conf,$val) = each(%hash_config)) {
            print "$conf => $val\n";
        }
    }

    my %hash_config1=get_config_info_in_mem(\@lines1);
    #print $hash_config1."\n";
    print_hash_in_mem(\%hash_config1);
    # my $hash_config2=get_config_info_in_mem(@lines2);
    # print_hash_in_mem(%hash_config2);

Replies are listed 'Best First'.
Re: Hash , Keys strangely chopped !!
by Transient (Hermit) on Jun 19, 2009 at 17:22 UTC
    As ikegami said above, you need to use $i, not $i-1, in the substr call. Running the data you listed on my machine resulted in every key being chopped (probably because my copy has no spaces before the "=").

    I would suggest, however, using split instead of index and substr. It's simpler and cleaner with no messy index residue!
    my ( $conf, $val ) = split /=/, $line, 2;
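To illustrate the split suggestion, here is a minimal sketch assuming lines of the form NAME=value. The helper name parse_line and the CONFIG_CMDLINE sample line are made up for this example. It also shows why the limit of 2 is worth keeping: a value that itself contains "=" stays in one piece.

```perl
use strict;
use warnings;

# Hypothetical helper: split one "NAME=value" line into its two parts.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    # Limit of 2: split only at the first "=", keep any later "=" in the value.
    my ($conf, $val) = split /=/, $line, 2;
    return ($conf, $val);
}

# Sample lines for illustration only.
for my $line ("CONFIG_BRIDGE=m\n", "CONFIG_CMDLINE=\"root=/dev/sda1\"\n") {
    my ($conf, $val) = parse_line($line);
    print "$conf => $val\n";
}
```

Note that plain split does not strip whitespace around the "=", so a trim step is still needed if the input may contain spaces there.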
Re: Hash , Keys strangely chopped !!
by ikegami (Patriarch) on Jun 19, 2009 at 17:15 UTC

    some of the keys are chopped off

    $conf=substr($line,0,$i-1);
    should be
    my $conf=substr($line,0,$i);
    or better yet
    my $conf = substr($line, 0, $i);

    If the "=" is at index $i==4 (the 5th character), you want to extract 4 characters (indexes 0 to 3), not three.
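The arithmetic can be checked on one of the lines from the question. This is only a small sketch; the variable names $chopped and $full are chosen for the example.

```perl
use strict;
use warnings;

my $line = "CONFIG_BRIDGE=m";     # sample line, no space before "="
my $i    = index($line, "=");     # 13: position of "=", which is also the key's length

my $chopped = substr($line, 0, $i - 1);   # 12 characters: drops the final "E"
my $full    = substr($line, 0, $i);       # 13 characters: the whole key

print "$chopped\n";   # CONFIG_BRIDG
print "$full\n";      # CONFIG_BRIDGE
```

The third argument of substr is a length, not an end index, so the position of the "=" is exactly the number of characters in front of it.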

    I am puzzled as to why the some keys are getting chopped off while others are not ?

    They're all getting chopped. Some of them probably have spaces before the "=", so you end up chopping one of the spaces instead of a visible character.
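A quick way to see this masking effect, as a sketch: buggy_key below is a hypothetical helper that reproduces the original extraction (bug included), and the second sample line with spaces around the "=" is invented for the comparison.

```perl
use strict;
use warnings;

sub trim {
    my $string = shift;
    $string =~ s/^\s+|\s+$//g;
    return $string;
}

# Hypothetical helper reproducing the original extraction, bug included.
sub buggy_key {
    my ($line) = @_;
    my $i = index($line, "=");
    return trim(substr($line, 0, $i - 1));
}

print buggy_key("CONFIG_BRIDGE=m"),   "\n";   # CONFIG_BRIDG  (the "E" is lost)
print buggy_key("CONFIG_BRIDGE = m"), "\n";   # CONFIG_BRIDGE (only the space is lost)
```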

    Finally,
    $val=substr($line,$i+1,length($line)-1-$i);
    can be written more simply as
    $val=substr($line,$i+1);
    and more readably as
    $val = substr($line, $i+1);
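As a small check that the two forms agree, the following sketch compares them on a sample line. The names $explicit and $implicit are chosen for the example.

```perl
use strict;
use warnings;

my $line = "CONFIG_BRIDGE=m";   # sample line, already without a newline
my $i    = index($line, "=");

# Three-argument form: the length is spelled out by hand.
my $explicit = substr($line, $i + 1, length($line) - 1 - $i);
# Two-argument form: substr runs to the end of the string.
my $implicit = substr($line, $i + 1);

print "same\n" if $explicit eq $implicit;
print "$implicit\n";
```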

    Update: Of course, the following is simpler yet:

    my ($conf, $val) = map { trim($_) } split(/=/, $line, 2);
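Putting the pieces together, the parsing routine from the question could look roughly like the sketch below. The name parse_config_lines and the sample lines are made up, and the "# CONFIG_X is not set" form of disabled options is an assumption based on how the original script scans for "# CONFIG".

```perl
use strict;
use warnings;

sub trim {
    my $string = shift;
    $string =~ s/^\s+|\s+$//g;
    return $string;
}

# Sketch of a replacement for get_config_info_in_mem: takes a reference
# to an array of lines and returns a hash of option name => value.
sub parse_config_lines {
    my ($lines) = @_;
    my %config;
    for my $line (@$lines) {
        if ($line =~ /^CONFIG/) {
            my ($conf, $val) = map { trim($_) } split(/=/, $line, 2);
            next unless defined $val;    # skip lines without an "="
            $config{$conf} = $val;
        }
        elsif ($line =~ /^# (CONFIG_\S+)/) {
            # Assumed "# CONFIG_X is not set" form: record it as "Z",
            # as the original script does.
            $config{$1} = 'Z';
        }
    }
    return %config;
}

# Sample lines for illustration only.
my @sample = (
    "CONFIG_BRIDGE=m\n",
    "CONFIG_AIC79XX_REG_PRETTY_PRINT = y\n",
    "# CONFIG_NTFS_DEBUG is not set\n",
);
my %config = parse_config_lines(\@sample);
print "$_ => $config{$_}\n" for sort keys %config;
```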

    Update: Doh! Transient posted a split solution at the same time as my update.

    Update: Removed the recommendation to use chomp. It doesn't help, since the user already trims trailing whitespace.

      Thanks, my ($conf, $val) = map { trim($_) } split(/=/, $line, 2); worked well for me.
Re: Hash , Keys strangely chopped !!
by Marshall (Canon) on Jun 20, 2009 at 19:51 UTC
    I don't think that the string functions are the way to go here. I would use a regex to build two hashes and then merge or adjust them in whatever way you need. Here is some example code:
    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    open (my $config_file_main ,"<","config18.txt" )or die $!;
    open (my $config_file_update ,"<","config27.txt" )or die $!;

    my %file1 = read_cfg ($config_file_main);
    my %file2 = read_cfg ($config_file_update);

    # this combines both hashes into one !!!!
    # A simple "merge" operation.
    # More complex things can be done
    %file1 = (%file1, %file2);

    # Usually it is better to print in
    # a human readable format with sort
    #
    foreach my $key (sort keys %file1)
    {
        print "$key = $file1{$key}\n";
    }

    # This returns a list instead of list ref
    # Of course using a ref is more efficient, but here I
    # do something simple to illustrate points above..
    #
    sub read_cfg
    {
        my $file_handle = shift;
        my %hash;
        while (<$file_handle>)
        {
            next if /^\s*$/;  #skip blank lines
            my ($name,$value) = $_ =~ /^\s*(\S+)\s*=>\s*(\S+)/;
            $hash{$name} = $value;
        }
        close ($file_handle);
        return %hash;
    }
    config27:
        CONFIG_BRIDGE_NF_EBTABLES => m
        CONFIG_BRIDGE_EBT_BROUTE => y
        CONFIG_BRIDGE_EBT_T_FILTER => z
    config18:
        CONFIG_BRIDGE_NF_EBTABLES => m
        CONFIG_BRIDGE_EBT_BROUTE => m
        CONFIG_BRIDGE_EBT_T_FILTER => m
        CONFIG_BRIDGE_EBT_T_NAT => m
        CONFIG_BRIDGE_EBT_802_3 => m
        CONFIG_BRIDGE_EBT_AMONG => m
        CONFIG_BRIDGE_EBT_ARP => m
        CONFIG_BRIDGE_EBT_IP => m
        CONFIG_BRIDGE_EBT_LIMIT => m
        CONFIG_BRIDGE_EBT_MARK => m
        CONFIG_BRIDGE_EBT_PKTTYPE => m
        CONFIG_BRIDGE_EBT_STP => m
        CONFIG_BRIDGE_EBT_VLAN => m
        CONFIG_BRIDGE_EBT_ARPREPLY => m
        CONFIG_BRIDGE_EBT_DNAT => m
        CONFIG_BRIDGE_EBT_MARK_T => m
        CONFIG_BRIDGE_EBT_REDIRECT => m
        CONFIG_BRIDGE_EBT_SNAT => m
        CONFIG_BRIDGE_EBT_LOG => m
        CONFIG_BRIDGE_EBT_ULOG => m
    Prints:
        CONFIG_BRIDGE_EBT_802_3 = m
        CONFIG_BRIDGE_EBT_AMONG = m
        CONFIG_BRIDGE_EBT_ARP = m
        CONFIG_BRIDGE_EBT_ARPREPLY = m
        CONFIG_BRIDGE_EBT_BROUTE = y
        CONFIG_BRIDGE_EBT_DNAT = m
        CONFIG_BRIDGE_EBT_IP = m
        CONFIG_BRIDGE_EBT_LIMIT = m
        CONFIG_BRIDGE_EBT_LOG = m
        CONFIG_BRIDGE_EBT_MARK = m
        CONFIG_BRIDGE_EBT_MARK_T = m
        CONFIG_BRIDGE_EBT_PKTTYPE = m
        CONFIG_BRIDGE_EBT_REDIRECT = m
        CONFIG_BRIDGE_EBT_SNAT = m
        CONFIG_BRIDGE_EBT_STP = m
        CONFIG_BRIDGE_EBT_T_FILTER = z
        CONFIG_BRIDGE_EBT_T_NAT = m
        CONFIG_BRIDGE_EBT_ULOG = m
        CONFIG_BRIDGE_EBT_VLAN = m
        CONFIG_BRIDGE_NF_EBTABLES = m
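The merge step on its own can be sketched as follows, using a few of the entries from the listings as sample data. The hash names %old, %new and %merged are chosen for this example. When two hashes are flattened into one list, a key that appears twice takes the value of the later pair, which is why the update file wins.

```perl
use strict;
use warnings;

# Sample data for illustration, a subset of the entries shown in this reply.
my %old = (
    CONFIG_BRIDGE_EBT_BROUTE   => 'm',
    CONFIG_BRIDGE_EBT_T_FILTER => 'm',
    CONFIG_BRIDGE_EBT_T_NAT    => 'm',
);
my %new = (
    CONFIG_BRIDGE_EBT_BROUTE   => 'y',
    CONFIG_BRIDGE_EBT_T_FILTER => 'z',
);

# Both hashes flatten into one list; for a repeated key the later pair wins.
my %merged = (%old, %new);

print "$_ = $merged{$_}\n" for sort keys %merged;
```

Keys present only in the first hash, such as CONFIG_BRIDGE_EBT_T_NAT here, are carried over unchanged.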