Re: Learning Exercise

I have something to say. Abstraction. Wait, let me repeat that a few times.
Abstraction
Abstraction
RRRAAAGGHHH :)

Ok, got that out of my sysAbstractiontem.

You do not have one problem (a piece of software that allows you to make entries in an address book); you have two.

1. A piece of software that allows me to persistently store rows of structured data.
2. A piece of software that provides a user interface to 1.

The reason for breaking it up into concepts like that (a large application will have tens or hundreds of such "problems") is that you can then go through and replace entire sections wholesale with already-available applications.

It doesn't take a genius to realise that 1. is A Database. That's exactly what you want, a database. Now, maybe for convenience you'd rather not install MySQL or PostgreSQL or DB2 just to run your address book, and with the specs as they stand I wouldn't blame you (I already have MySQL everywhere so I'd use that if I were solving the same problem). However, that doesn't mean you can't take advantage of other, simpler database systems, notably the DBM family, including Berkeley DB (see perldoc DB_File, perldoc NDBM_File and others suggested in these comments).
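To give a feel for how little code a DBM-style backend needs, here is a minimal sketch. It assumes SDBM_File, purely because that module ships with core Perl; DB_File and NDBM_File are tied in much the same way. The database name and the tab-separated packing of fields are my own choices for illustration, not anything the original program requires.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # ships with core Perl
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # scratch directory for the demo
my $path = "$dir/addressbook";      # hypothetical database name

# After the tie, %book reads from and writes to the files on disk.
tie(my %book, 'SDBM_File', $path, O_RDWR | O_CREAT, 0666)
    or die "Could not tie $path: $!";

# DBM values are flat strings, so the fields are packed by hand here.
$book{1} = join("\t", 'Phi', 'RatE', 'this@is.a.test.com');

my ($first, $last, $email) = split(/\t/, $book{1});
print "$first $last <$email>\n";

untie %book;
```

Note that SDBM limits the size of each record, so it suits small rows like address entries rather than large blobs.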

Ok, so, for the sake of argument/learning, let's say you decide to implement both 1 and 2 yourself. The important thing to understand is that this doesn't mean they're no longer 1 and 2; they're still best implemented as separate entities. An example interface would be this:

sub list_entries {
    my ($filename) = @_;
    my %entries;
    ...;    # read $filename and fill %entries
    return \%entries;    # { $key => { ... }, $key => { ... }, ... }
}

sub add_entry {
    my ($filename, $key, %data) = @_;
    ...;
}

sub delete_entry {
    my ($filename, $key) = @_;
    ...;
}

sub update_entry {
    my ($filename, $key, %data) = @_;
    ...;
}

Where $key represents the unique identifier for that record. You might consider using first name + last name; I tend to use numeric IDs under such circumstances, since duplicate names are all too common. Although to be honest, I think it would work out better as an object or tie.
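If you go the numeric route, picking a fresh key can be as simple as the sketch below. The helper name next_key is hypothetical, and it assumes the entries are held in a hash keyed by ID, as in the interface above.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: return the next unused numeric key.
# Non-numeric keys are ignored; an empty database starts at 1.
sub next_key {
    my (%entries) = @_;
    my @ids = grep { /^\d+$/ } keys %entries;
    return @ids ? max(@ids) + 1 : 1;
}

my %entries = (
    1 => { 'first name' => 'Phi' },
    5 => { 'first name' => 'Another' },
);
my $key = next_key(%entries);
print "next key: $key\n";
```

Keys freed by deleted entries are not reused unless they were the highest, which is usually what you want for an identifier.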

Once you have defined that interface, a lot of the pain goes away. You have a nicely defined set of places where, if you so chose, you could convert the code to NDBM or MySQL without screwing up unrelated interface code. You also have an obvious set of logical operations, so it is easy to see which of them may require locking or equivalent to work properly. And further abstraction of the file opening is made simple by the common interface.

You could then wipe your mind clean of everything to do with low-level locks, file operations, seek() etc., and write the user interface. Trust me when I say that one of the best qualities a good coder can have is the ability to write code in such a way that, writing or reading, it requires as little in-brain memory as possible. Abstraction, OO, structured programming, all these are essentially attempts at 1) reusing code and 2) allowing the programmer to forget things.

If I were writing this same app, under these circumstances, I would write it like this:

package MyDatabase;

use strict;
use warnings;

sub new {
    my $class = shift;
    my $self = {};
    bless $self, $class;
    return $self;
};
  
sub set_schema {
    my ($self, @fields) = @_;
    $self->{'fields'} = [@fields];
};

sub set_filename {
    my ($self, $filename) = @_;
    $self->{'filename'} = $filename;
};

sub read_entries {
    my ($self) = @_;
    $self->{'data'} = {};
    my $filename = $self->{'filename'};
    open(my $fh, '<', $filename) or die "Could not open $filename to read: $!";

    while (my $line = <$fh>) {
        chomp($line);    # otherwise the last field keeps its newline
        # A limit of -1 keeps trailing empty fields.
        my ($key, @field_values) = split(/:/, $line, -1);
        for (@{$self->{'fields'}}) {
            $self->{'data'}{$key}{$_} = shift(@field_values);
        }
    }

    close($fh);
}

sub write_entries {
    my ($self) = @_;
    my $filename = $self->{'filename'};
    open(my $fh, '>', $filename) or die "Could not open $filename for writing: $!";

    for my $key (keys %{$self->{'data'}}) {
        # Prints out the key then the data fields in order.
        print {$fh} join(':', $key, map { $self->{'data'}{$key}{$_} } @{$self->{'fields'}}), "\n";
    }

    close($fh) or die "Could not close $filename: $!";
}

sub list_entries {
    my ($self) = @_;
    $self->read_entries();
    return %{$self->{'data'}};
}

sub set_entry {
    my ($self, $key, %data) = @_;
    # Drop exceptions from the read on set: if the db doesn't exist, we'll create it.
    eval { $self->read_entries(); };
    $self->{'data'}{$key} = \%data;
    $self->write_entries();
}

sub delete_entry {
    my ($self, $key) = @_;
    $self->read_entries();
    delete $self->{'data'}{$key};
    $self->write_entries();
}

Holy Smoke! you say, that's almost as long as my entire program previously! And to that I say, well, yes. But it works. Not only does it work, it works in the general case of wanting to store rows of data, not just the trivial case of wanting to store address data. Further, you can update a given element correctly, and you don't have globals flying around all over the place risking life and sanity as they conflict with other parts of the code. It is trivial to modify to use more efficient backends, such as an RDBMS or NDBM. No code is duplicated: we open for reading in only one place and for writing in only one place, so locking would be a trivial addition. AND you can just plug it in to any other application with similar requirements.
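As a rough idea of what that locking addition might look like, here is a sketch using flock from core Perl. The helper name with_lock and the sidecar "$filename.lock" file are assumptions of mine, not part of the class above. I use a separate lock file because opening the data file with ">" would truncate it before the lock is held.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);           # LOCK_SH, LOCK_EX
use File::Temp qw(tempdir);

# Hypothetical helper: run $code while holding a lock on "$filename.lock".
# If $code dies, $lock goes out of scope and the lock is released anyway.
sub with_lock {
    my ($filename, $mode, $code) = @_;
    open(my $lock, '>>', "$filename.lock")
        or die "Could not open $filename.lock: $!";
    flock($lock, $mode)
        or die "Could not lock $filename.lock: $!";
    my $result = $code->();
    close($lock);               # closing the handle releases the lock
    return $result;
}

# Usage sketch: the whole write happens under one exclusive lock.
my $file = tempdir(CLEANUP => 1) . "/test.db";
with_lock($file, LOCK_EX, sub {
    open(my $out, '>>', $file) or die "Could not open $file: $!";
    print {$out} "testentry:Phi:RatE\n";
    close($out);
});
```

Readers would take LOCK_SH the same way. For operations that read, modify and write back, the entire cycle should sit inside a single LOCK_EX call, otherwise two writers can still overwrite each other's changes.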

One objective of good programming is to avoid ever writing the same thing twice. Unfortunately, this has to be balanced against the objective of getting things done now. Obsessive generalisation results in you writing a Turing machine :). This kind of problem, however, is absolutely ripe for effective generalisation, as I have demonstrated.

I recommend considering any programming problem in this light. How much more time will it take you to make it work for the general case? It took me 9 minutes and 25 seconds to write and debug that code (I counted :), longer perhaps than if I had simply integrated it all into the GUI and done both at once, but now I can add locking in a couple of lines, add NDBM support with a couple of minor changes, etc. Time saved in the future is often far more valuable than time saved now. As always, careful when you bet :)

Regarding the general style of your code, I highly recommend reading one of the many style guides on the 'net. I of course prefer my own (http://phirate.exorsus.net/programming/index.php?item=style_guide) but that's not to say there aren't many valid styles which promote readable, maintainable code. Any exercise in learning programming should be accompanied by an exercise in making the code readable and usable. Add to that concepts like automated test-suites and in-code documentation (w00t! POD) when you can, for extra bonus points and feelings of pride that you just can't share with anyone who isn't a programmer (:/).
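For a taste of what tests and POD look like side by side, here is a small sketch using Test::More from core Perl. The function format_row is hypothetical, loosely modelled on the line format the class above writes; it is not part of that class.

```perl
use strict;
use warnings;
use Test::More;

=head2 format_row($key, \%record, @fields)

Returns one colon-separated line: the key, then each field in schema
order. Missing fields become empty strings.

=cut

sub format_row {
    my ($key, $record, @fields) = @_;
    return join(':', $key, map { $record->{$_} // '' } @fields);
}

is(format_row('k1', { name => 'Phi', email => 'a@b' }, 'name', 'email'),
   'k1:Phi:a@b', 'fields come out in schema order');
is(format_row('k2', {}, 'name'), 'k2:', 'missing fields become empty strings');

done_testing();
```

Running perldoc on the file shows the documentation, and running the file itself runs the tests.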

Most of all, enjoy yourself. Programming is an art as well as a science; making something beautiful is all in the knowing and the taking the time.

(bugs in the above class are possible, although naturally I like to think myself above such things :)

(test)

my $db = MyDatabase->new();

$db->set_filename("test.db");
$db->set_schema("first name","last name","email");

$db->set_entry("testentry",(
    "first name" => "Phi",
    "last name" => "RatE", 
    "email" => 'this@is.a.test.com'));
 
my %entries = $db->list_entries();

for (keys %entries) {
    for my $field (keys %{$entries{$_}}) {
        print "$field -> ".$entries{$_}{$field}."\n";
    }
}
