PerlMonks  

Open Flat File

by BBQ (Curate)
on May 05, 2000 at 07:28 UTC ( [id://10310]=CUFP )

Back in the days when I didn't have access to databases, I used to write everything to flat files. This sub opens a delimited flat file, takes the column names from the first line (which gets chopped off), then rolls through the remaining lines using the first column as a "primary key". Each column ends up in a global hash named after its header and keyed by that first column.

An example of what a database file might be (tab delimited):
    PEOPLE  CARS         CAR_COLORS  AGES
    John    Volks        avocado     26
    Mira    Fiat (yuck)  black       25
    Mom     Twingo       cherry      50

I haven't used it for a long time, but maybe someone has comments or even has a use for it...

    my @columns = OpenFlatFile('TEST')
        or warn('file missing, inaccessible or empty');

    foreach (keys %{$columns[0]}) {
        # might as well have said 'keys %PEOPLE'
        print "$PEOPLE{$_} drives a $CAR_COLORS{$_} $CARS{$_} and is $AGES{$_} years old.\n"
    }

    sub OpenFlatFile {
        my ($datfile) = $_[0];
        my $delimiter = "\t";
        my $val;
        # return an empty list on failure so the caller's 'or warn' fires
        open(FILE, $datfile) or return;
        my @tmp = <FILE>;
        close(FILE);
        my $head = shift(@tmp);
        chomp $head;
        # symbolic reference: a global array named after the file
        @{$datfile} = split($delimiter, $head);
        foreach (@tmp) {
            my $x = 0;
            chomp;
            my @line = split($delimiter, $_);
            foreach $val (@{$datfile}) {
                # symbolic reference: one global hash per column,
                # keyed by the value in the first column
                if ((! defined ${$val}{$line[0]}) && ($line[$x] ne '')) {
                    ${$val}{$line[0]} = $line[$x];
                }
                ++$x;
            }
        }
        return(@{$datfile});
    }

Replies are listed 'Best First'.
RE: Open Flat File
by perlmonkey (Hermit) on May 08, 2000 at 11:33 UTC
    Thanks for the post BBQ, the routine is fast and functional. But, personally, I would not use it because of the namespace pollution. If I were reading your code and did not know that %PEOPLE was dynamically defined in OpenFlatFile, it might take me a long time to figure out where it was defined. Also, with your method you can't very well 'use strict'.

    Here is a design I would use. It is a tiny bit slower, but you would not notice for files under 5000 lines, and if the file is a greater size, then use a database:

        use strict;
        use Data::Dumper;

        my $data = OpenFlatFile("flatfile.txt", "\t");
        my @PEOPLE     = @{$data->{'PEOPLE'}};
        my @CAR_COLORS = @{$data->{'CAR_COLORS'}};
        my @CARS       = @{$data->{'CARS'}};
        my @AGES       = @{$data->{'AGES'}};

        for(my $i=0; $i < $data->{'ROWCOUNT'}; $i++) {
            print "$PEOPLE[$i] drives a $CAR_COLORS[$i] $CARS[$i] ";
            print "and is $AGES[$i] years old.\n"
        }
        print Data::Dumper->Dump([$data]), "\n";

        sub OpenFlatFile {
            my ($datafile, $delimeter) = @_;
            open(FILE,$datafile) or return {};
            chomp (my @tmp = <FILE>);
            close FILE;
            my @headers = split /\Q$delimeter\E/, shift @tmp;
            my $data = {};
            $data->{'HEADERS'}  = [@headers];
            $data->{'ROWCOUNT'} = scalar(@tmp);
            $data->{'COLCOUNT'} = scalar(@headers);
            foreach my $line (@tmp) {
                my @a = split /\Q$delimeter\E/o, $line;
                push @{$$data{$_}}, shift @a foreach (@headers);
            }
            return $data;
        }

    This code is basically the same as yours, but it creates only one variable to encapsulate the data. For my data structure I chose a hash of arrays, which easily preserves the file order, but you could implement a hash of hashes using the first row as the second-level hash keys. The downside of this approach is that the code becomes a tad more verbose, but for me the verbosity adds clarity, which is far more valuable in my opinion.

    For a tidbit of clarity this is what the Data::Dumper produces:

        $VAR1 = {
                  'AGES' => [ 26, 25, 50 ],
                  'CARS' => [ 'Volks', 'Fiat (yuck)', 'Twingo' ],
                  'ROWCOUNT' => 3,
                  'HEADERS' => [ 'PEOPLE', 'CARS', 'CAR_COLORS', 'AGES' ],
                  'PEOPLE' => [ 'John', 'Mira', 'Mom' ],
                  'CAR_COLORS' => [ 'avocado', 'black', 'cherry' ],
                  'COLCOUNT' => 4
                };

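
The hash-of-hashes alternative mentioned above could look roughly like the sketch below. It is only an illustration: the name OpenFlatFileHoH and the in-memory sample are assumptions, not part of the posted code, and it takes an open filehandle rather than a file name so that it runs without a file on disk.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a hash of hashes,
# keyed first by the value in the first column, then by column header.
sub OpenFlatFileHoH {
    my ($fh, $delimiter) = @_;
    chomp(my @lines = <$fh>);
    my @headers = split /\Q$delimiter\E/, shift @lines;
    my %rows;
    foreach my $line (@lines) {
        my @fields = split /\Q$delimiter\E/, $line;
        my $key = $fields[0];
        @{ $rows{$key} }{@headers} = @fields;   # hash slice: header => field
    }
    return \%rows;
}

# Sample data from the original post, read from an in-memory filehandle.
my $sample = join "\n",
    "PEOPLE\tCARS\tCAR_COLORS\tAGES",
    "John\tVolks\tavocado\t26",
    "Mira\tFiat (yuck)\tblack\t25",
    "Mom\tTwingo\tcherry\t50";
open(my $fh, '<', \$sample) or die "cannot open in-memory file: $!";
my $rows = OpenFlatFileHoH($fh, "\t");
close $fh;

foreach my $who (sort keys %$rows) {
    my $r = $rows->{$who};
    print "$who drives a $r->{CAR_COLORS} $r->{CARS} and is $r->{AGES} years old.\n";
}
```

With this layout a single cell is addressed as $rows->{'John'}{'CARS'}, at the cost of losing the file order that the hash of arrays preserves.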
      Wow! That is so cool! I've always wanted to put a row number variable in there somewhere, but the approach I had thought of was to generate a $ROWNUM{$key} variable.

      About the namespace littering, that has always been one of my concerns, but I've noticed that it is very hard to be flexible if you don't allow the variables to be "public". You see, one of the reasons why I was never running this sub under strict was that I had it maintaining a forum, and some of the file headers got dynamically allocated depending on the poster's UID, message ID, etc. Under strict, I couldn't flood the process with hashes whose names I didn't know beforehand, and that made my code impossible to run. It gets complicated if you know that one of your tables contains a hash of the headers used in other tables.

      I've never managed to overcome that, but I would like to (maybe by moving everything out of main::(?)). Maybe someday when I have free time, I'll look into it just for the hell of it. Nowadays I'm using DBMSs, so I don't worry about it that much.
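
One possible way around the problem described above is to keep every table in a single lexical hash, so that a name known only at run time becomes an ordinary hash key instead of a package variable. This is a sketch under that assumption; %tables, set_cell, and the UID and message values are made up for illustration.

```perl
use strict;
use warnings;

# All tables live in one lexical hash instead of in main::,
# so a name that is only known at run time is just a hash key.
my %tables;

# Hypothetical helper: store one value under a table name, column name
# and row key that are all chosen at run time.
sub set_cell {
    my ($table, $column, $key, $value) = @_;
    $tables{$table}{$column}{$key} = $value;
}

# Names built at run time, e.g. from a poster's UID (made-up values).
my $uid   = 42;
my $table = "MESSAGES_$uid";
set_cell($table, 'SUBJECT', 'msg1', 'Open Flat File');
set_cell($table, 'AUTHOR',  'msg1', 'BBQ');

foreach my $column (sort keys %{ $tables{$table} }) {
    print "$table.$column: $tables{$table}{$column}{msg1}\n";
}
```

Because nothing is created through a symbolic reference, this runs under 'use strict', and the list of tables is available as keys %tables.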

      Thanks for the facelift!
      Wow. What a find. Fits my application perfectly for tracking a few dozen things in a table.

      Now, I've tried several versions of a SaveFlatFile sub, but this being my first try at references, it just doesn't click/work. Can someone help me here? Pretty please with perl on top??

      Thanks, Doug
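
A SaveFlatFile for the hash-of-arrays structure above might be sketched as follows. This is not the SaveFlatFile mentioned elsewhere in the thread (that one was never posted here), it does no file locking, and the argument order is an assumption.

```perl
use strict;
use warnings;

# Hypothetical counterpart to OpenFlatFile: writes the HEADERS line, then
# one delimited line per row, taking the i-th element of each column array.
sub SaveFlatFile {
    my ($data, $datafile, $delimiter) = @_;
    my @headers = @{ $data->{'HEADERS'} };
    open(my $out, '>', $datafile) or return 0;
    print {$out} join($delimiter, @headers), "\n";
    for my $i (0 .. $data->{'ROWCOUNT'} - 1) {
        print {$out} join($delimiter, map { $data->{$_}[$i] } @headers), "\n";
    }
    close($out) or return 0;
    return 1;
}

my $data = {
    'HEADERS'    => [ 'PEOPLE', 'CARS', 'CAR_COLORS', 'AGES' ],
    'ROWCOUNT'   => 3,
    'COLCOUNT'   => 4,
    'PEOPLE'     => [ 'John', 'Mira', 'Mom' ],
    'CARS'       => [ 'Volks', 'Fiat (yuck)', 'Twingo' ],
    'CAR_COLORS' => [ 'avocado', 'black', 'cherry' ],
    'AGES'       => [ 26, 25, 50 ],
};

# Written to an in-memory "file" here so the sketch has no side effects;
# pass a real path instead to write to disk.
my $saved = '';
SaveFlatFile($data, \$saved, "\t") or die "could not save: $!";
print $saved;
```

The key idea is that $data->{$_}[$i] picks row $i out of the column named $_, so mapping over the headers rebuilds one line of the file.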

RE: Open Flat File
by Simplicus (Monk) on May 05, 2000 at 18:11 UTC
    pretty nifty, and a good bit simpler than what I have
    written to read flat files. (I use them to save a few
    settings (like users and passwords) between scripts in web-
    based reporting tools for our computer repair
    center.) May I have your permission to shamelessly
    reuse your code?
    Simplicus
      Thanks for the kind words, and yes, you may indeed! That's the reason why I posted it in the first place! :o)

      The reason why I wrote this was because I wanted to have a way to be able to say $CARS{'John'},$CARS{'Mom'} or do averages on %AGES. On the downside, depending on what your 1st column is (the pseudo-primary key), you get a hash that has values like

          $PEOPLE{'John'} eq 'John'
          $PEOPLE{'Mira'} eq 'Mira'
          $PEOPLE{'Mom'}  eq 'Mom'

      and I always thought that was a very silly waste of memory. I guess one could easily modify the sub to make it more memory/disk efficient, but then again, this is not intended as a database substitute, nor is it intended to run on files with 5000+ lines. So I guess that shouldn't be a worrying issue.
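
As a rough illustration of the lookups and averages described above, here is a small sketch. The column hashes are written out by hand as lexicals (an assumption made so it runs under strict); in the posted code OpenFlatFile creates them as globals.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Column hashes keyed by the first column, written out by hand here;
# in the original code OpenFlatFile creates them as globals.
my %CARS = ( John => 'Volks', Mira => 'Fiat (yuck)', Mom => 'Twingo' );
my %AGES = ( John => 26, Mira => 25, Mom => 50 );

# Direct lookup by "primary key".
print "John drives a $CARS{'John'}, Mom drives a $CARS{'Mom'}\n";

# Average of a whole column.
my $average_age = sum(values %AGES) / scalar(keys %AGES);
printf "average age: %.2f\n", $average_age;
```

The keys of any one column hash already list every value of the first column, which is why a separate %PEOPLE hash adds no information.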

      I also have a SaveFlatFile() (with a home-made "file locking system") which may be of interest. But I think it's a bit too long to post here, since I had to strip the comments on OpenFlatFile just so that it would fit. If someone finds this could also be of any use, I'll clean it up and post it later on as a link or something...
Re: Open Flat File
by prmartin (Initiate) on Jul 19, 2004 at 04:14 UTC
    Dear BBQ,

    Praise upon you!

    This is just what a beginner like me requires: it's simple enough to see through without much prior knowledge. With your permission I'll modify it for my specific requirements.

    Yours,

    P
