
What you are doing is sound. To test this, I created two files called cp_123456 and cp_123457, modified your snippet to use them, and used Data::Dumper to view the resulting data structure:

use strict;
use warnings;
use Data::Dumper;

my %city_data = ();
foreach my $filename ( 'cp_123456', 'cp_123457' ) {
    my ( $city, $id ) = split /_/, $filename;
    # Three-argument open with a lexical filehandle
    open my $fh, '<', $filename
        or die "Can't open $filename for reading: $!";
    chomp( my @tmp = <$fh> );
    close $fh;
    push @{ $city_data{$city}{$id} }, @tmp;
}

print Dumper( \%city_data );

The output was the following:

$VAR1 = {
          'cp' => {
                    '123456' => [
                                  'this is some data',
                                  'this is a test'
                                ],
                    '123457' => [
                                  'this is the second file',
                                  'this is another line',
                                  'this is the third line'
                                ]
                  }
        };

Whenever I create a complex data structure, I first decide how I will need to get at the data. One important goal is to avoid iterating where a direct lookup would do: as these structures get larger, scanning them for a value can kill your performance.
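To make the point about iteration concrete, here is a minimal sketch that reuses the city/id layout from the output above. The key xx is made up for the example; it only stands in for a city that is not in the hash.

```perl
use strict;
use warnings;

# Same shape as the structure above: city => id => [ lines ].
my %city_data = (
    cp => {
        123456 => [ 'this is some data', 'this is a test' ],
        123457 => [ 'this is the second file' ],
    },
);

# Direct lookup: one hash access per level, no loop needed.
my $lines = $city_data{cp}{123456};
print scalar(@$lines), " lines for cp/123456\n";

# exists() checks for a key without iterating over the others.
# Testing the outer key first avoids autovivifying $city_data{xx}.
my $found = exists $city_data{xx} && exists $city_data{xx}{123456};
print "xx/123456 is ", ( $found ? "present" : "absent" ), "\n";
```

If you find yourself looping over every id to locate one record, that is usually a hint that the value you are searching on should have been a hash key.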

After I have the basic idea of the structure laid out, I use Data::Dumper to output a subset of the structure so I can see that the results are what I expect. Using the debugger and entering the command x \%city_data has essentially the same effect.
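Something along these lines is what I mean by dumping a subset. The data is just the sample from above, and the two Data::Dumper settings are optional conveniences rather than anything the snippet requires.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order makes dumps easier to compare
$Data::Dumper::Indent   = 1;    # more compact indentation

my %city_data = (
    cp => {
        123456 => [ 'this is some data', 'this is a test' ],
        123457 => [ 'this is the second file' ],
    },
);

# Dump only the branch under inspection instead of the whole hash.
my $subset = Dumper( $city_data{cp}{123456} );
print $subset;
```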

However, I find that complex data structures in Perl are, for me, similar to regexes in that at times I tend to use them inappropriately. Often, as the amount or complexity of the data increases, a complex data structure can become unmanageable. Using a database to handle that data can solve some serious headaches if you're concerned about scalability issues.

Another question to ask here is, what do you do if you wind up with two files with the same name? If someone accidentally copies a file to another folder that you also happen to read from, do you have duplicate data? You may wish to test for this.
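One way to test for this, as a rough sketch: key a %seen hash on the bare file name and skip any path whose name has already turned up. The directory names data and backup are invented for the example, and whether to skip, warn, or die on a duplicate is a decision for your application.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical list of paths; the second directory holds a stray copy.
my @paths = ( 'data/cp_123456', 'data/cp_123457', 'backup/cp_123456' );

my ( %seen, @to_read, @duplicates );
for my $path (@paths) {
    my $name = basename($path);
    if ( $seen{$name}++ ) {
        push @duplicates, $path;    # same file name already queued
        next;
    }
    push @to_read, $path;
}

warn "Skipping duplicate: $_\n" for @duplicates;
```

This only catches files with identical names. Two differently named files with the same contents would need a comparison of the data itself.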

Also, is there any data validation? I realize that you just posted a snippet, but don't forget to test for this. Plus, if there is any chance that another process will be accessing the files while you are reading them, consider using flock.
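For the flock suggestion, a minimal sketch might look like the following. The sub name read_locked is my own invention, and keep in mind that flock is advisory: it only helps if the other processes touching the files also call flock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: read a file's lines while holding a shared lock.
sub read_locked {
    my ($filename) = @_;
    open my $fh, '<', $filename
        or die "Can't open $filename for reading: $!";
    # Shared lock: other readers may proceed, cooperating writers wait.
    flock $fh, LOCK_SH
        or die "Can't lock $filename: $!";
    chomp( my @lines = <$fh> );
    close $fh;    # closing the handle releases the lock
    return @lines;
}
```

In the loop above, the open/chomp/close lines could then be replaced with a single call such as my @tmp = read_locked($filename);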

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

In reply to (Ovid) Re: Confused about complex data structures. by Ovid
in thread Confused about complex data structures. by r.joseph
