I often find myself in the same situation, needing to read a script's configuration from a file. This is my experience:

Firstly, I decided to separate configuration data from code (edit: rephrased that for clarity). The configuration is read at start time, either from a file named by a CLI option to the script (e.g. --config myapp.conf) or passed as an argument to subroutines/constructors. In the latter case I am flexible: I pass either a configuration filename or a configuration Perl data structure which was created earlier by reading the configuration file (as a means of caching the configuration data, which I assume is static).
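A minimal sketch of that arrangement, assuming a JSON configuration file and using only core modules. The names load_config and config_from_args, and the default filename, are made up for illustration; JSON::PP stands in here for JSON/JSON::XS.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);
use JSON::PP;

# Hypothetical helper: accept a filename or an already-parsed hashref.
sub load_config {
    my ($source) = @_;
    return $source if ref($source) eq 'HASH';   # already parsed, reuse it
    open my $fh, '<:raw', $source
        or die "Cannot open config file '$source': $!\n";
    my $text = do { local $/; <$fh> };          # slurp the whole file
    close $fh;
    return JSON::PP->new->utf8->decode($text);  # dies on malformed JSON
}

# Hypothetical helper: pick the config file from a --config option.
sub config_from_args {
    my @args = @_;
    my $file = 'myapp.conf';                    # assumed default name
    GetOptionsFromArray(\@args, 'config=s' => \$file)
        or die "Usage: $0 [--config FILE]\n";
    return load_config($file);
}

# In a script:  my $config = config_from_args(@ARGV);
# Elsewhere:    my $same   = load_config($config);   # hashref passes through
```

Passing the parsed hashref around means the file is read once and every later call gets the cached structure back unchanged.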

There are many choices for the configuration file format. As a rule (mine), I avoid storing the configuration as a Perl data structure and then reading and eval()'ing that code, because it is a wide-open door for your script to execute unknown/injected user-specified code pretending to be data. (I know that you said your script is only for you and the data is static, living somewhere in the distribution's homedir.)
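To illustrate the risk, here is a small sketch with two made-up configuration strings. The Perl-syntax one runs whatever code it contains when eval()'ed; the JSON one carrying the same text is only ever a string. JSON::PP (core) is used as a stand-in for any JSON parser.

```perl
use strict;
use warnings;
use JSON::PP;

my $executed = 0;

# A "configuration" written as Perl: looks like data, but the do-block is code.
my $perl_text = q|+{ name => 'demo', level => do { $executed++; 3 } }|;
my $from_eval = eval $perl_text;
die "eval failed: $@" if $@;
# The embedded code has now run: $executed is 1.

# The same text inside JSON is just a string value; nothing is executed.
my $json_text = q|{"name":"demo","level":"do { $executed++; 3 }"}|;
my $from_json = JSON::PP->new->decode($json_text);
# $executed is still 1, and $from_json->{level} holds the literal text.
```

Here the injected code only bumps a counter; in a real file it could be anything the script's user is allowed to do.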

Re: Storable: it is an interesting alternative to directly eval()'ing Perl data structures from separate data/configuration files (which can easily become arbitrary Perl code!). Unfortunately, it comes with a security warning about loading untrusted Storable-based data, even with default settings. And they know better than I do.

At this point, I should mention that you can invent your own format. But since what you want is pretty standard, what's the point? Additionally, I often have Unicode content in my configuration files, and this is correctly handled by the modules I am mentioning here (and tested by them, ouch!). And that's a vote against writing your own.

So, my quest ended with a choice of YAML, JSON, or the so-called "windows INI" config files (you know, the [Section1] thingy). Surely there must be others; excuse my oversight. INI can be read/written with Config::Tiny, but M$ may decide to put a copyright on the file format in the future - who knows? And direct you to an online .NET service, hardwired with ChatGPT data collection and captchas, perhaps biometric, just for reading your files.

So, for me, the options narrowed down to YAML or JSON.

My choice is JSON. Mainly because I try to avoid any programming tool which uses and counts spaces as part of the code. I find space-counting (space is the only invisible ASCII character > 31) irritating. I detest these products, personally, as I was never a fan of the de Sade inflection (edit: nor of von Masoch's). Memo-to-self: create a format which utilises the audible bell instead of space. Hey! Why not backspace?

And so JSON then. This can be read/written easily with JSON/JSON::XS.
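For example, a round trip might look like the sketch below. It uses the core JSON::PP module, whose interface is compatible with JSON::XS; the helper names write_config and read_config are invented for this example.

```perl
use strict;
use warnings;
use JSON::PP;

# utf8: encode/decode UTF-8 bytes; pretty + canonical: stable, editable output.
my $json = JSON::PP->new->utf8->pretty->canonical;

# Hypothetical helper: serialise a Perl data structure to a JSON file.
sub write_config {
    my ($file, $config) = @_;
    open my $fh, '>:raw', $file or die "Cannot write '$file': $!\n";
    print {$fh} $json->encode($config);
    close $fh or die "Cannot close '$file': $!\n";
    return;
}

# Hypothetical helper: read a JSON file back into a Perl data structure.
sub read_config {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Cannot read '$file': $!\n";
    my $text = do { local $/; <$fh> };   # slurp the whole file
    close $fh;
    return $json->decode($text);
}

# Usage, with a Unicode value (Greek alpha, beta, gamma):
# write_config('myapp.conf', { name => 'myapp', greeting => "\x{3b1}\x{3b2}\x{3b3}" });
# my $config = read_config('myapp.conf');
```

The files are opened in raw mode because the utf8 option already makes the JSON object produce and consume encoded bytes.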

JSON has disadvantages for readability: no multi-line strings and no comments are allowed, and double quotes inside strings must be escaped. So readability is bad, especially for long strings, as in my case (multiline bash scripts), and manual editing of them can be tedious. Additionally, on parsing errors, JSON/JSON::XS print the location as the number of characters from the start, plus just a tiny bit of the faulty section (a few characters which invariably end up being only spaces, tabs and newlines), which makes it very difficult for me to pinpoint the error. So, huge frustration for me there.
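One possible workaround for the offset-only error location is to translate it into a line and column yourself. The sketch below assumes the error message contains the text "character offset N", which is what I see from JSON::PP and JSON::XS; decode_with_location is a made-up name, and the column may be off by one depending on how the parser counts.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical wrapper: decode JSON text and, on failure, report the
# error position as line/column instead of a bare character offset.
sub decode_with_location {
    my ($text) = @_;
    my $data;
    my $ok = eval { $data = JSON::PP->new->decode($text); 1 };
    return $data if $ok;

    my $err = $@;
    # Assumption: the parser's message includes "character offset N".
    if ($err =~ /character offset (\d+)/) {
        my $offset = $1;
        my $before = substr($text, 0, $offset);
        my $line   = 1 + ($before =~ tr/\n//);            # newlines seen so far
        my $col    = $offset - rindex($before, "\n");     # approximate column
        die "JSON error near line $line, column $col: $err";
    }
    die $err;   # unknown message format: pass it through unchanged
}
```

The text is decoded as a character string (no utf8 option) so that the reported offset and substr() count in the same units.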

That said, and to be fair, YAML supports both comments and multi-line strings. But, alas, it has the dreaded space as king! (naked!)

Shameless Plug: As I said, I make heavy use of configuration files. I started with plain JSON, but I wanted to allow for comments, multiline strings/verbatim/heredoc sections and template-style variables. I eventually called all the above enhancements "Enhanced JSON" (ad hoc term) and whipped up a module to read and write these files, with the existing JSON doing all the heavy lifting. The module is Config::JSON::Enhanced. But for what you presented here, plain JSON is just enough. Or YAML.

bw, bliako

In reply to Re: A good way to input data into a script w/o an SQL database by bliako
in thread A good way to input data into a script w/o an SQL database by ObiPanda
