Re: Design elegance : How to best design this simple program ? (modular)

But well, anyway, i would like to try and process the data "on the go" (while reading it) if possible, as it represents a kind of "design challenge".

So, you realize that processing the data "on the go" will make for a complicated design. For me, that means "bad design" not "design challenge".

You don't get better at design by flexing your design muscles trying to do a really good job at figuring out a design for an overly complex problem.

You get better at design by learning how to recognize the most simple and cohesive parts of a problem and how to factor them out so that the parts left to be designed are fewer and/or smaller until you get to the point that everything is simple to design.

When I see something that is complicated to design, my first reaction is to assume that things have been factored badly. Reconsidering how they have been factored is the best next step, because factoring them "better" may leave me with a much simpler design problem.

Complex design problems that don't get factored into a bunch of simple design problems with minimal interconnectedness turn into complex designs and complex software and repeated failures.

That is, the correct design step when given the design challenge you gave to yourself is to say "Doing calculations 'on the fly' makes things complicated and makes it harder to separate concerns and so makes for less modular design. Is there a fundamental reason that calculations have to be done 'on the fly'? If not, we should just drop that idea from the design."

Congratulations, you will have successfully solved your design challenge when you drop it.
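To make the separation concrete, here is a minimal sketch of the "read everything, then process" shape. The file format (one "timestamp value" pair per line), the sensor names, and the sub names read_data_file and rows_by_time are all assumptions made up for illustration; the real data layout isn't given in this thread.

```perl
use strict;
use warnings;

# Step 1: read one data file completely into a hash keyed by timestamp.
# Assumed (hypothetical) format: one "timestamp value" pair per line.
sub read_data_file {
    my ($fh) = @_;
    my %value_at;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless $line =~ /\S/;    # skip blank lines
        my ( $time, $value ) = split ' ', $line;
        $value_at{$time} = $value;
    }
    return \%value_at;
}

# Step 2: work only on data that is already in memory; no I/O in here.
# Turns { sensor => { time => value } } into { time => { sensor => value } }.
sub rows_by_time {
    my ($data_of_sensor) = @_;
    my %row_at;
    for my $sensor ( keys %$data_of_sensor ) {
        my $value_at = $data_of_sensor->{$sensor};
        $row_at{$_}{$sensor} = $value_at->{$_} for keys %$value_at;
    }
    return \%row_at;
}

# Demo with in-memory "files" so the sketch runs without any real data.
my %text_of = (
    temperature => "0 20.5\n1 21.0\n",
    light       => "0 100\n1 110\n",
);

my %data_of_sensor;
for my $sensor ( keys %text_of ) {
    open my $fh, '<', \$text_of{$sensor}
        or die "Cannot open in-memory file: $!";
    $data_of_sensor{$sensor} = read_data_file($fh);
    close $fh;
}

my $row_at = rows_by_time( \%data_of_sensor );
for my $time ( sort { $a <=> $b } keys %$row_at ) {
    my $row = $row_at->{$time};
    print "$time: ", join( ', ', map {"$_=$row->{$_}"} sort keys %$row ), "\n";
}
```

The reading sub knows nothing about aggregation and the aggregation sub knows nothing about files, so each can be tested and changed on its own.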

- tye

Re^2: Design elegance : How to best design this simple program ? (modular) by mascip (Pilgrim) on Jun 20, 2012 at 19:22 UTC
Hi all, thank you for the interesting, helpful answers.

I should have said earlier what the program is about, although it wouldn’t have helped with my problem or my learning process. It’s a program for processing data from experiments. I got the data from biologists who grow phytoplankton in a controlled environment (I’ve got data from more than 100 experiments). There is data from different sensors, each of them recording one or more measurements. And I need to do fairly complex operations to determine parameters for mathematical models of population dynamics.

So first, I need to preprocess the data in order to get “aggregated information” for each line (each line corresponds to a given moment/timestamp). Then I will estimate the parameters. But I’m only talking about the information aggregation here. Estimating parameters will come later, and I’m not sure I’ll do it with Perl. That’s another question.

Now, back to my design questions:

1. I don’t make the spreadsheets myself. But I could put all this data in a database and get help from MySQL queries. I hadn’t thought about this. What I don’t like about it is that:
   - I need to do one extra operation (organise and transfer my data to the database)
   - I need to learn how to use MySQL again. It’s not complex, I know, but I haven’t used it in 6 years.
   And I’m thinking that what MySQL can do, Perl can do. Am I wrong in thinking this?

2. I don’t use globs for finding files, but the File::Find::Rule CPAN module. Are globs better?

3. Thank you for the inspiring comments:
   - Design is about learning to recognize the most simple and cohesive parts of a problem. When the parts are smaller, everything is simple to design.
   - Write tests to... test your understanding of the problem. => That’s what I discovered recently by writing my first tests before coding. Thank you for the phrasing, it was good to read.
   - First, write a quick and dirty proof of concept (well, after having written a test for the feature I’m implementing): this is the best way to learn about the problem. Only then refactor and redesign.

4. I agree with your various comments: my design problem will be solved when I drop it (nice phrasing, tye). I can do it all with a script. For each experiment, I could just read each of the spreadsheets, store the data in a hash, and then process it all. That IS making things easier, and thus better design.

Nonetheless, yesterday I wrote stuff down on paper and came up with a rough idea of how to process stuff “on the fly”. I’m going to write “dirty code” here and now (not compiled, and it probably won’t compile: I’m on holiday without my computer), copying directly from the paper to the forum. I’m not going into the details of the implementation though, just the rough idea.

The key for “reading on the go” was to create data Readers, and then pass them all to a method that reads lines simultaneously. The objects I would create are:
- Reader::Data, which reads data from a given file with a get_next_data() method
- Experiment, which knows the directory path for the experiment, and the paths of its data files too
- possibly some objects for representing the data, or data sets, with methods to do some calculations on them. But that’s another story.

And no Roles needed, as I just don’t need them now.

Here is what I would have done (but won’t, thanks to what you all said), for those who are interested:

    # - - - process_bio_data_in_directory.pl
    # Responsibility : process global results for all experiments.

    # Where to find the experiment directories and data files
    my $DATA_DIR = 'C:/bio_data/';
    my $experiment_dir_regex = qr/Exp/;
    my $data_file_regex = qr/data_/;
    ### Comment : I lined up these three = signs, but the different font of
    ### the <code> block didn't keep them aligned. That is very annoying.

    # 1. Find all the experiment directories, and their related data files
    my $list_experiments = find_experiments_and_their_data_files_in({
        dir                   => $DATA_DIR,
        with_experiment_regex => $experiment_dir_regex,
        with_data_file_regex  => $data_file_regex,
    });
    ### Comment for the reader :
    ### an Experiment object will have a path, and a list of data_files.
    ### I feel that this class makes my code more readable, and I can use
    ### $data_file_regex here, and not have to think about it again.

    # 2. Process the data for each experiment
    my %global_results;
    EXPERIMENT:
    foreach my $experiment ( @{$list_experiments} ) {
        my $hash_aggr_infos = calculate_aggr_infos_for_experiment($experiment);
        process_global_results_with( $hash_aggr_infos, \%global_results );
    }

Then, the “on the go” data processing is organized in calculate_aggr_infos_for_experiment().

    # - - - calculate_aggr_infos.pm
    package Calculate::AggrInfos;
    # Responsibility : calculate the aggregate information for one experiment.

    sub calculate_aggr_infos_for_experiment {
        my $experiment = shift;

        # 1. Initialize Readers for each data file
        my $hash_data_reader_of_file
            = initialize_data_file_readers_for_experiment($experiment);

        # 2. Calculate aggregate information
        my $hash_aggr_infos
            = calculate_aggr_infos_with_readers($hash_data_reader_of_file);

        return $hash_aggr_infos;
    } # - - - end sub calculate_aggr_infos_for_experiment()

And next, the calculate_aggr_infos_with_readers() subroutine, which reads the files in parallel and processes them.

    # in the same file and package as before
    sub calculate_aggr_infos_with_readers {
        my $hash_data_reader_of_file = shift;
        my @data_files = keys %{$hash_data_reader_of_file};

        DATA:
        while ( my $hash_data_now
            = get_next_data_for_all_readers($hash_data_reader_of_file) )
        {
            # check that the time value is the same for each set of data
            check_if_time_is_the_same_for_all_data($hash_data_now);

            # calculate aggregated information
            my $hash_aggr_infos = calculate_aggr_infos_from_data($hash_data_now);
            # it's a bit more complex, as I need a bit of "past data history"
            # to calculate the aggregated information
        } # end while (DATA)
    } # - - - end sub calculate_aggr_infos_with_readers()

That’s it! I won’t go into more detail. Sorry for posting code that doesn’t work and must contain many mistakes. It must not be nice to read. Any comments are very welcome if you had the courage to read all this, even just on code layout or the way I name my variables. I’d like to improve this for readability, too.
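For anyone curious about the Reader idea itself, here is a small runnable sketch of it. It is only an illustration under assumptions: the "timestamp value" line format, the in-memory demo files, and the bodies of Reader::Data, get_next_data_for_all_readers() and check_if_time_is_the_same_for_all_data() are hypothetical stand-ins, not the real implementation.

```perl
use strict;
use warnings;

# Hypothetical minimal Reader::Data: wraps a filehandle and hands back one
# parsed record per call. Assumed line format: "timestamp value".
package Reader::Data;

sub new {
    my ( $class, $fh ) = @_;
    return bless { fh => $fh }, $class;
}

# Returns { time => ..., value => ... }, or undef at end of file.
sub get_next_data {
    my ($self) = @_;
    my $line = readline( $self->{fh} );
    return unless defined $line;
    chomp $line;
    my ( $time, $value ) = split ' ', $line;
    return { time => $time, value => $value };
}

package main;

# Read one record from every reader; stop as soon as any reader runs dry.
sub get_next_data_for_all_readers {
    my ($reader_of_file) = @_;
    my %data_now;
    for my $file ( sort keys %$reader_of_file ) {
        my $data = $reader_of_file->{$file}->get_next_data;
        return unless defined $data;
        $data_now{$file} = $data;
    }
    return \%data_now;
}

# Die if the records read in one step do not all share one timestamp.
sub check_if_time_is_the_same_for_all_data {
    my ($data_now) = @_;
    my %seen = map { $_->{time} => 1 } values %$data_now;
    die "Timestamps differ between data files\n" if keys(%seen) > 1;
    return 1;
}

# Demo with in-memory "files" standing in for the real data files.
my %text_of = (
    data_temperature => "0 20.5\n1 21.0\n",
    data_light       => "0 100\n1 110\n",
);

my %reader_of_file;
for my $file ( keys %text_of ) {
    open my $fh, '<', \$text_of{$file}
        or die "Cannot open in-memory file: $!";
    $reader_of_file{$file} = Reader::Data->new($fh);
}

my @steps;
while ( my $data_now = get_next_data_for_all_readers( \%reader_of_file ) ) {
    check_if_time_is_the_same_for_all_data($data_now);
    push @steps, $data_now;
}
print scalar(@steps), " time steps read in lockstep\n";
```

Note that if one file is shorter than the others, the loop stops with some readers already advanced by one record. That is one of the complications that reading everything into a hash first avoids.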
Re^3: Design elegance : How to best design this simple program ? (modular) by mascip (Pilgrim) on Jun 20, 2012 at 19:23 UTC
Arg, I should have replied to the last message on the page.