james28909 has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow Monks!
I am in a conundrum again (yes, again!). I am parsing a file that has specific parameters inside it. I have completed the script, and it works on most of the files in question. The problem is that these files sometimes contain parameters like "parameter_00", "parameter_01", "parameter_02", etc. I will never know in advance how many there are (there could be hundreds of these entries in any given file). In the hash I predeclare each parameter and its read length, so if you're wondering what those values are, that's what they are.

So, my question is: do I need to go ahead and add all of these to the hash manually, or is there a way I can do it on the fly?
-attached files for testing at the bottom-
use strict;
use warnings;

my %hash = (
    ACCOUNT_ID          => '16',
    ACCOUNTID           => '16',
    ANALOG_MODE         => '4',
    APP_VER             => '8',
    ATTRIBUTE           => '4',
    BOOTABLE            => '4',
    CATEGORY            => '4',
    CONTENT_ID          => '48',
    DETAIL              => '1024',
    GAMEDATA_ID         => '32',
    ITEM_PRIORITY       => '4',
    LANG                => '4',
    LICENSE             => '512',
    NP_COMMUNICATION_ID => '16',
    NPCOMMID            => '16',
    PADDING             => '8',
    PARAMS              => '1024',
    PARAMS2             => '12',
    PARENTAL_LEVEL      => '4',
    PARENTAL_LEVEL_A => '4', PARENTAL_LEVEL_B => '4', PARENTAL_LEVEL_C => '4',
    PARENTAL_LEVEL_D => '4', PARENTAL_LEVEL_E => '4', PARENTAL_LEVEL_F => '4',
    PARENTAL_LEVEL_G => '4', PARENTAL_LEVEL_H => '4', PARENTAL_LEVEL_I => '4',
    PARENTAL_LEVEL_J => '4', PARENTAL_LEVEL_K => '4', PARENTAL_LEVEL_L => '4',
    PARENTAL_LEVEL_M => '4', PARENTAL_LEVEL_N => '4', PARENTAL_LEVEL_O => '4',
    PARENTAL_LEVEL_P => '4', PARENTAL_LEVEL_Q => '4', PARENTAL_LEVEL_R => '4',
    PARENTAL_LEVEL_S => '4', PARENTAL_LEVEL_T => '4', PARENTAL_LEVEL_U => '4',
    PARENTAL_LEVEL_V => '4', PARENTAL_LEVEL_W => '4', PARENTAL_LEVEL_X => '4',
    PARENTAL_LEVEL_Y => '4',
    PARENTAL_LEVEL_     => '4',
    PARENTALLEVEL       => '4',
    PATCH_FILE          => '32',
    PS3_SYSTEM_VER      => '8',
    REGION_DENY         => '4',
    RESOLUTION          => '4',
    SAVEDATA_DETAIL     => '1024',
    SAVEDATA_DIRECTORY  => '64',
    SAVEDATA_FILE_LIST  => '3168',
    SAVEDATA_LIST_PARAM => '8',
    SAVEDATA_PARAMS     => '128',
    SAVEDATA_TITLE      => '128',
    SOUND_FORMAT        => '4',
    SOURCE              => '4',
    SUB_TITLE           => '128',
    TARGET_APP_VER      => '8',
    TITLE               => '128',
    TITLE_ID            => '16',
    TITLE_00 => '128', TITLE_01 => '128', TITLE_02 => '128', TITLE_03 => '128',
    TITLE_04 => '128', TITLE_05 => '128', TITLE_06 => '128', TITLE_07 => '128',
    TITLE_08 => '128', TITLE_09 => '128', TITLE_10 => '128', TITLE_11 => '128',
    TITLE_12 => '128', TITLE_13 => '128', TITLE_14 => '128', TITLE_15 => '128',
    TITLEID001 => '16', TITLEID002 => '16', TITLEID003 => '16', TITLEID004 => '16',
    TITLEID005 => '16', TITLEID006 => '16', TITLEID007 => '16', TITLEID008 => '16',
    TITLEID009 => '16', TITLEID010 => '16', TITLEID011 => '16', TITLEID012 => '16',
    TITLEID013 => '16', TITLEID014 => '16', TITLEID015 => '16', TITLEID016 => '16',
    VERSION             => '8',
    XMB_APPS            => '4'
);

open my $param, '<', shift;
binmode($param);

my @params;
my @array;
my @other_array;
my @split;
my @read_array;

seek $param, 0x08, 0;
read $param, my $buf, 0x02;
seek $param, 0x0C, 0;
read $param, my $rev, 0x02;

my $temp1     = pack( "v*", unpack( "n*", $buf ) );
my $param_loc = unpack( 'H*', $temp1 );
my $temp2     = pack( "v*", unpack( "n*", $rev ) );
my $key_table = unpack( 'H*', $temp2 );

seek $param, hex($param_loc), 0 or die;
while ( $buf !~ /\x00{2}/ ) {
    read $param, $buf, 2;
    push @array, $buf;
}
for (@array) {
    @other_array = join( '', @array );
}
for (@other_array) {
    @params = split( /\x0/, $_ );
}
while ( my ( $temp_parameter, $max ) = each %hash ) {
    for my $elem (@params) {
        if ( $temp_parameter =~ /^$elem$/ ) {
            push( @split, "$temp_parameter $max" );
        }
    }
}
for ( sort @split ) {
    my ( $temp, $read_length ) = split( / /, $_ );
    push( @read_array, "$temp $read_length" );
}
seek( $param, hex($key_table), 0 );
for my $value (@read_array) {
    my ( $parameter, $read ) = split( /\s/, $value );
    chomp($read);
    read( $param, my $value, $read );
    print "=============\n$parameter : $value\n=============\n";
}
Please never mind the syntax; I know I could have named things better. If you look inside the working and non-working files, you will see what I mean: it is the entries "title_00", "title_01", etc. (the same as the ones in the hash that end with "x", "xx", or "_xx").
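A side note on the two offset reads in the script above: the pack/unpack/hex chain amounts to decoding a little-endian 16-bit integer, which unpack can do in one step with the 'v' template. A minimal sketch, using a made-up two-byte buffer instead of a real file (the buffer contents are an assumption for illustration):

```perl
use strict;
use warnings;

# Hypothetical two bytes as they might sit at offset 0x08 (little-endian 0x0194).
my $buf = "\x94\x01";

# Route used in the script: swap the byte order, hex-dump, then hex().
my $swapped    = pack( 'v*', unpack( 'n*', $buf ) );
my $roundabout = hex( unpack( 'H*', $swapped ) );

# Direct route: 'v' decodes an unsigned 16-bit little-endian value.
my $direct = unpack( 'v', $buf );

print "$roundabout $direct\n";
```

Both routes give the same number, so the direct form could replace the intermediate $temp1/$temp2 variables and the later hex() calls.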

The main question is: do I have to go ahead and declare a bunch of them inside my hash?
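One possible way to avoid listing every numbered entry is to keep exact names in a small hash and describe each numbered family with a single pattern. This is only a sketch: the helper name read_length_for and the choice of families are assumptions, and the lengths are copied from the hash in the question.

```perl
use strict;
use warnings;

# Fixed names keep an exact entry; numbered families get one pattern each.
# The lengths are copied from the hash in the question.
my %exact = ( TITLE => 128, TITLE_ID => 16, PARENTAL_LEVEL => 4 );
my @families = (
    [ qr/^TITLE_\d+$/            => 128 ],
    [ qr/^TITLEID\d+$/           => 16 ],
    [ qr/^PARENTAL_LEVEL_[A-Z]$/ => 4 ],
);

# Hypothetical helper: return the read length for a name, or undef if unknown.
sub read_length_for {
    my ($name) = @_;
    return $exact{$name} if exists $exact{$name};
    for my $family (@families) {
        my ( $pattern, $length ) = @$family;
        return $length if $name =~ $pattern;
    }
    return;
}

for my $name (qw( TITLE_00 TITLE_137 TITLEID016 PARENTAL_LEVEL_Q BOGUS )) {
    my $length = read_length_for($name);
    printf "%-18s %s\n", $name, defined $length ? $length : 'unknown';
}
```

With this shape, a file containing TITLE_137 needs no new hash entry, and a name that matches nothing is reported instead of being silently skipped.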
Test files

Replies are listed 'Best First'.
Re: Declaring Hash entries
by james28909 (Deacon) on Jan 05, 2015 at 03:55 UTC
Re: Declaring Hash entries
by Anonymous Monk on Jan 05, 2015 at 01:57 UTC
    Does this answer your question?
    my %hash = GetThemPairs( $file );
      Hmmm, not really. I can try to explain further, though. The hash I have is based on a list of parameters; I just scan the files for any of these parameters, and they are at a certain offset. So I really don't know which parameters are in the file until I load it. Then the code above spits out the parameters and their associated data/values.
      Ah OK, so you're saying that the hash will need to be predeclared in order to be able to parse specific data?

      EDIT: actually, I just figured out that there is an index table inside this file, so I don't even need a hash :l

        haha Re: Iteratively unpack structure from binary file ( ReadBytes, ReadFloat, ReadInt32 ), http://vitadevwiki.com/index.php?title=System_File_Object_%28SFO%29_%28PSF%29#Header_SFO

        #!/usr/bin/perl --
        use strict;
        use warnings;
        use Carp ();
        use Data::Dump qw/ dd /;

        Main( @ARGV );
        exit( 0 );

        sub ReadBytes {
            my( $fh, $bytes ) = @_;
            $bytes or Carp::croak 'Usage: ReadBytes( $filehandle, $bytes ) ';
            my $readed = read $fh, my($data) , $bytes;
            $readed == $bytes
              or Carp::carp "Only read($readed) but wanted($bytes): $! ## $^E ";
            $data;
        }

        use constant CAN_PACK_QUADS => !! eval { my $f = pack 'q'; 1 };

        sub Int8   { unpack 'c',  $_[-1] }
        sub UInt8  { unpack 'C',  $_[-1] }
        sub Int16  { unpack 's<', $_[-1] }
        sub UInt16 { unpack 'S<', $_[-1] }
        sub Int32  { unpack 'l<', $_[-1] }
        sub UInt32 { unpack 'L<', $_[-1] }
        sub Int64  { unpack( ( CAN_PACK_QUADS ? 'q<' : 'a8' ), $_[-1] ) }
        sub UInt64 { unpack( ( CAN_PACK_QUADS ? 'Q<' : 'a8' ), $_[-1] ) }

        sub ReadInt8   { Int8(   ReadBytes( $_[-1], 8 /8 ) ); }
        sub ReadUInt8  { UInt8(  ReadBytes( $_[-1], 8 /8 ) ); }
        sub ReadInt16  { Int16(  ReadBytes( $_[-1], 16/8 ) ); }
        sub ReadUInt16 { UInt16( ReadBytes( $_[-1], 16/8 ) ); }
        sub ReadInt32  { Int32(  ReadBytes( $_[-1], 32/8 ) ); }
        sub ReadUInt32 { UInt32( ReadBytes( $_[-1], 32/8 ) ); }
        sub ReadInt64  { Int64(  ReadBytes( $_[-1], 64/8 ) ); }
        sub ReadUInt64 { UInt64( ReadBytes( $_[-1], 64/8 ) ); }

        sub Float     { unpack 'f', $_[-1] }
        sub ReadFloat { Float( ReadBytes( $_[-1], 32/8 ) ); }

        #~ perlpacktut says
        #~ f A single-precision float in native format.
        #~ d A double-precision float in native format.
        #~ see perlport
        sub Double     { unpack 'd', $_[-1] }
        sub ReadDouble { Double( ReadBytes( $_[-1], 64/8 ) ); }

        #~ sub Main {
        #~     my $file = shift;
        #~     use autodie qw/ open /;
        #~     open my($fh), '<:raw', $file;
        #~     seek $fh, 0x14, 0;
        #~     my %stuff;
        #~     $stuff{key_table_offset} = ReadBytes( $fh , 2);
        #~     #~ $stuff{key_table_offset} = ReadUInt16( $fh );
        #~     dd( \%stuff );
        #~
        #~ }

        sub Main {
            for my $file ( @_ ){
                Stuffing( $file );
            }
        }

        sub Stuffing {
            my $file = shift;
            use autodie qw/ open /;
            open my( $fh ), '<:raw', $file;
            my %stuff;
            $stuff{header}{magic} = ReadBytes( $fh, 4 );
            #~ $stuff{header}{version} = ReadInt64( $fh );
            #~ $stuff{header}{key_table_start_offset} = ReadInt64( $fh );
            #~ $stuff{header}{data_table_start_offset} = ReadInt64( $fh );
            #~ $stuff{header}{number_entries} = ReadInt64( $fh );
            $stuff{header}{version}                 = UInt32( ReadBytes( $fh, 4 ) );
            $stuff{header}{key_table_start_offset}  = UInt32( ReadBytes( $fh, 4 ) );
            $stuff{header}{data_table_start_offset} = UInt32( ReadBytes( $fh, 4 ) );
            $stuff{header}{number_entries}          = UInt32( ReadBytes( $fh, 4 ) );

            seek $fh, 0x14, 0; ##
            $stuff{index_table}{key_table_offset} = ReadUInt16( $fh );
            $stuff{index_table}{param_fmt} = ReadUInt16( $fh );
            #~ $stuff{index_table}{param_fmt} = ReadBytes( $fh, 2 );
            $stuff{index_table}{param_format} = {
                4    => 'utf-8-special',
                516  => 'utf-8-charstring-nul',
                1028 => 'uint32',
            }->{ $stuff{index_table}{param_fmt} };
            #~ 04 00 Little Endian utf-8 Special Mode Used in contents generated by the system (e.g.: save data)
            #~ 04 02 Little Endian utf-8 Character string, NULL finished (0x00)
            #~ 04 04 Little Endian integer 32 bits unsigned
            #~ fail
            #~ perl -le " print pack qw/ L< /, $_ for qw/ 0400 0402 0404 /; "
            #~ fail
            #~ "\4\2", 516
            #~ fail
            #~ perl -le" die pack 'H*', 516 "
            #~
            #~ $ perl -le" binmode STDOUT; print qq/\4\0\4\2\4\4/ " |hexdump
            #~ 00000000: 04 00 04 02 04 04 0A - |       |
            #~ 00000007;
            #~
            #~ { open my($fh), '<:raw', \qq/\4\0/; dd( ReadUInt16( $fh ) ); } ## 4
            #~ { open my($fh), '<:raw', \qq/\4\2/; dd( ReadUInt16( $fh ) ); } ## 516
            #~ { open my($fh), '<:raw', \qq/\4\4/; dd( ReadUInt16( $fh ) ); } ## 1028
            $stuff{index_table}{param_length}      = ReadUInt32( $fh );
            $stuff{index_table}{param_max_length}  = ReadUInt32( $fh );
            $stuff{index_table}{data_table_offset} = ReadUInt32( $fh );
            #~ typedef struct{
            #~     u16 keyOffset;   //offset of keytable + keyOffset
            #~     u16 param_fmt;   //enum (see below)
            #~     u32 paramLen;
            #~     u32 paramMaxLen;
            #~     u32 dataOffset;  //offset of datatable + dataOffset
            #~ } indexTableEntry_t;
            dd( \%stuff );
        } ## end sub Stuffing

        sub StuffingBytes {
            my $file = shift;
            use autodie qw/ open /;
            open my( $fh ), '<:raw', $file;
            my %stuff;
            $stuff{header}{magic}                   = ReadBytes( $fh, 4 );
            $stuff{header}{version}                 = ReadBytes( $fh, 4 );
            $stuff{header}{key_table_start_offset}  = ReadBytes( $fh, 4 );
            $stuff{header}{data_table_start_offset} = ReadBytes( $fh, 4 );
            $stuff{header}{number_entries}          = ReadBytes( $fh, 4 );
            $stuff{index_table}{key_table_offset}   = ReadBytes( $fh, 2 );
            $stuff{index_table}{param_fmt}          = ReadBytes( $fh, 2 );
            $stuff{index_table}{param_length}       = ReadBytes( $fh, 4 );
            $stuff{index_table}{param_max_length}   = ReadBytes( $fh, 4 );
            $stuff{index_table}{data_table_offset}  = ReadBytes( $fh, 4 );
            dd( \%stuff );
        } ## end sub StuffingBytes

        __END__
        $ perl keytable.pl nonworking.file working.file
        {
          header => {
            data_table_start_offset => 684,
            key_table_start_offset => 404,
            magic => "\0PSF",
            number_entries => 24,
            version => 257,
          },
          index_table => {
            data_table_offset => 0,
            key_table_offset => 0,
            param_fmt => 516,
            param_format => "utf-8-charstring-nul",
            param_length => 6,
            param_max_length => 8,
          },
        }
        {
          header => {
            data_table_start_offset => 372,
            key_table_start_offset => 228,
            magic => "\0PSF",
            number_entries => 13,
            version => 257,
          },
          index_table => {
            data_table_offset => 0,
            key_table_offset => 0,
            param_fmt => 516,
            param_format => "utf-8-charstring-nul",
            param_length => 6,
            param_max_length => 8,
          },
        }
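The script above decodes only the first index-table entry. Going by the indexTableEntry_t layout quoted in its comments, every entry could be walked in a loop, which is what makes a predeclared hash unnecessary. The following is a rough sketch under that assumption: parse_sfo is a hypothetical helper, and the two-entry image is synthetic, built in memory only so the example runs without the test files.

```perl
use strict;
use warnings;

# Parse an SFO image held in a string; returns a list of [ name, format, value ].
sub parse_sfo {
    my ($sfo) = @_;
    my ( $magic, $version, $key_start, $data_start, $count ) =
      unpack 'a4 V V V V', $sfo;
    die "not an SFO image\n" unless $magic eq "\0PSF";

    my @entries;
    for my $i ( 0 .. $count - 1 ) {
        # Each index entry is 16 bytes and follows the 20-byte header.
        my ( $key_off, $fmt, $len, $max_len, $data_off ) =
          unpack 'v v V V V', substr( $sfo, 0x14 + 16 * $i, 16 );

        # Key names are NUL-terminated; 'Z*' stops at the first NUL.
        my $name = unpack 'Z*', substr( $sfo, $key_start + $key_off );
        my $raw  = substr( $sfo, $data_start + $data_off, $len );

        # 0x0404 marks a 32-bit unsigned integer, 0x0204 a NUL-terminated string.
        my $value = $fmt == 0x0404 ? unpack( 'V', $raw )
                  : $fmt == 0x0204 ? unpack( 'Z*', $raw )
                  :                  $raw;
        push @entries, [ $name, $fmt, $value ];
    }
    return @entries;
}

# Synthetic two-entry image, built here only so the sketch runs on its own.
my $keys  = "TITLE_00\0TITLE_01\0";
my $data  = pack( 'a8 a8', "Alpha", "Beta" );
my $index = pack( 'v v V V V', 0, 0x0204, 6, 8, 0 )
          . pack( 'v v V V V', 9, 0x0204, 5, 8, 8 );
my $key_start  = 0x14 + length $index;
my $data_start = $key_start + length $keys;
my $sfo = pack( 'a4 V V V V', "\0PSF", 0x0101, $key_start, $data_start, 2 )
        . $index . $keys . $data;

printf "%s : %s\n", $_->[0], $_->[2] for parse_sfo($sfo);
```

For a real file, the same function could be fed the whole file contents read in raw mode; the names and lengths then come from the file itself, however many TITLE_xx entries it holds.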