What is better: Static input data in separate file or embedding static input data in code.

Perl300 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I am using perl v5.24.1
I have to run the same code repeatedly, once for each set of 4 values. Each set of 4 values is different, but I know them all before running the program.

So I have created a text file and use it as the input file for the program.
The filename is stored in a variable in the program, and that variable is used each time the input file needs to be accessed. I add the list of input values to this file in space-separated format, so the input file looks like:

contents of input.txt:
text1 string1 word1 number1
text2 string2 word2 number2
text3 string3 word3 number3
text4 string4 word4 number4

In the code I read input.txt one line at a time, split the line on whitespace, and store each value in a variable. Then I run the program for that list of variables, so my program runs as many times as there are lines in input.txt.

It is working fine, but I have a few questions:
Q-1) Is this approach fine or is there any better way to do it?
Q-2) Will it be better to embed the data in input.txt in the code itself rather than keeping it in a separate file?
Q-3) If YES to question 2 then what data structure should I use that is easy to edit in future as well as easy to loop over?

open (my $fh, '<', $input_file) or die "Can't read from '$input_file': $!";

while (<$fh>)
{
    my ($var1, $var2, $var3, $var4) = split (' ', $_);
    # Call to subroutine1 (by passing $var1, $var2, $var3, $var4), which calls subroutine2, which calls subroutine3.
}

Replies are listed 'Best First'.
Re: What is better: Static input data in separate file or embedding static input data in code. by stevieb (Canon) on Nov 02, 2017 at 20:45 UTC
I always prefer to have a separate data file, unless it's a test or something. What I would do in your case is default to a file name, but allow the user to specify an alternate on the command line if they don't want to use the default. Try this code without an argument (i.e. `perl script.pl`), then create a second valid file and run it with the argument (e.g. `perl script.pl input2.txt`). Then pass an invalid file name (e.g. `perl script.pl not_exist.txt`); in that case it warns and falls back to the default (`input.txt`):

    use warnings;
    use strict;

    my $file = 'input.txt';

    if (@ARGV){
        if (! -e $ARGV[0]){
            warn "$ARGV[0] is not a valid file, using default\n";
        }
        else {
            $file = $ARGV[0];
        }
    }

    open my $fh, '<', $file or die $!;

    while (<$fh>){
        print;
    }
Re: What is better: Static input data in separate file or embedding static input data in code. by Discipulus (Canon) on Nov 02, 2017 at 21:05 UTC
Hello Perl300, personally I (ab)use the `__DATA__` token a lot. It's useful because it can be used like any other filehandle (even if it is a bareword). It's free because it magically becomes valid as soon as the token is encountered in the file. I use it when the data is small, or when I need to test some data exceptions, so I put the worst cases there, mixed with some sane data and eventually some invalid rows. When I'm happy I just add a line opening an external file and put the new (now lexical!) filehandle into the loop.

Also, I hate Excel. More than the program itself, I hate the fact that many people look at you badly if you do not declare yourself an Excel wizard. So when some Excel task is given to me, I export to a comma-separated plain file and put that blob under the `__DATA__` token. Then I excel.. In such rare cases I keep the data in the program file, because neither makes sense without the other.

But generally, yes, it seems a good principle to keep data separated from logic. For a default I sometimes use `$ARGV[0]//='input.txt'` or `$file=$ARGV[0]//='input.txt'` in programs where only one argument is accepted. If there are more, I always use Getopt::Long.

For small hacks I (ab)use one-liners: `perl -lane ...` does a lot of useful work for you for nothing. For example, it automatically `chomp`s lines, something your example above does not do explicitly (although `split ' '` already discards the trailing newline) ;=)

PS: another useful thing for data is heredocs. Sometimes I use one or more of them to test some XML parser logic extracted from a huge unreadable file.

L*
There are no rules, there are no thumbs.. Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
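A minimal sketch of the `__DATA__` workflow, assuming the four-column layout from the question. `process_row` and `run_rows` are hypothetical names, with `process_row` standing in for the poster's subroutine1. Because the loop only sees a filehandle, moving from the embedded block to an external file later should only change the argument passed to `run_rows`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the poster's subroutine1.
sub process_row {
    my ($text, $string, $word, $number) = @_;
    return "$text|$string|$word|$number";
}

# Read rows from any filehandle, so the loop does not care
# whether the data is embedded or comes from a file on disk.
sub run_rows {
    my ($fh) = @_;
    my @results;
    while (my $line = <$fh>) {
        next unless $line =~ /\S/;      # skip blank lines
        my @fields = split ' ', $line;  # splits on whitespace runs; the trailing newline is dropped
        push @results, process_row(@fields);
    }
    return @results;
}

# While testing: read the block embedded below.
print "$_\n" for run_rows(\*DATA);

# Later, with an external file (the file name is an assumption):
#   open my $fh, '<', 'input.txt' or die "Can't read input.txt: $!";
#   print "$_\n" for run_rows($fh);

__DATA__
text1 string1 word1 number1
text2 string2 word2 number2
```

Everything after the `__DATA__` line is data, not code, so it has to stay at the very end of the script.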
Re: What is better: Static input data in separate file or embedding static input data in code. by hippo (Archbishop) on Nov 03, 2017 at 10:37 UTC
Q-1) Is this approach fine or is there any better way to do it?

It's fine so long as your dataset is regular and well-formed. If it starts to become less so there are other options: use a different format (XML, JSON, YAML, etc.), use a module (say Text::xSV or Text::CSV_XS), pre-process it, etc.

Q-2) Will it be better to embed the data in input.txt in the code itself rather than keeping it in a separate file?

Almost invariably no. Do you use a version control system? Hopefully the answer is "yes", in which case you would be committing deltas every time you change the dataset if it were embedded in the source. Equally, anyone else using your code would have to fire up an editor and know enough Perl to be able to change the data without messing up the script syntax. Far, far better to have the data separated from the code.

The only time it is a good idea to embed data in code is for testing (including SSCCE). Your test data may not change (usually should not change), as if it did your test results might not be conclusive or reproducible.

All IMHO, of course.
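As a rough illustration of the "different format" option, here is a sketch that uses JSON through the core module JSON::PP. The field names (`text`, `string`, `word`, `number`) and the helper `load_rows` are assumptions made up for this example, and the JSON is held in a heredoc only to keep the sketch self-contained; a real script would read it from the data file instead.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical JSON equivalent of input.txt: one object per run.
# Unlike the space-separated format, values may contain spaces.
my $json_text = <<'END_JSON';
[
  { "text": "text1", "string": "string1", "word": "word1", "number": 1 },
  { "text": "text2", "string": "string 2 with spaces", "word": "word2", "number": 2 }
]
END_JSON

# Decode the JSON into an array reference of hash references.
sub load_rows {
    my ($json) = @_;
    my $rows = decode_json($json);
    die "expected a JSON array\n" unless ref $rows eq 'ARRAY';
    return $rows;
}

for my $row (@{ load_rows($json_text) }) {
    # A real script would call subroutine1 here instead of printing.
    printf "%s, %s, %s, %s\n", @{$row}{qw(text string word number)};
}
```

Named fields make each record self-describing, at the cost of a file that is slightly more awkward to edit by hand than one line per record.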
Re: What is better: Static input data in separate file or embedding static input data in code. by Anonymous Monk on Nov 02, 2017 at 23:03 UTC
If your data is small and changes with each run, you might want to consider just putting it on the command line:

    $ myprogram.pl 'text1 string1 word1 number1,text2 string2 ...

    # then process @ARGV
    for ( split /,/, $ARGV[0] ){
        subroutine1( split/ /, $_ );
    }
Re: What is better: Static input data in separate file or embedding static input data in code. by Anonymous Monk on Nov 03, 2017 at 13:26 UTC
Unless you can demonstrate that the time required to process the external file would make a difference ... which I daresay you can't ... then I would leave the data in a file because you can very easily edit such a file without mucking-up your program's source code. You also retain the possibility of being able to specify one of several data-files, should you choose.
Re: What is better: Static input data in separate file or embedding static input data in code. by Perl300 (Friar) on Nov 03, 2017 at 18:38 UTC
Thank you stevieb, Discipulus, Anonymous Monk, and hippo for all your input. This is not for testing, so I won't put the data in the code and will keep it in a separate file as it is now. Having a default file and giving the user the option to supply a different filename is pretty useful, but in this particular case the script has to run from cron. I have not used __DATA__ much until now but will explore it.