dgaramond2 has asked for the wisdom of the Perl Monks concerning the following question:

Is there a YAML module which can dump YAML in its compact form? I.e. instead of emitting:

key1: val1
key2:
  - elem1
  - 2
  - 3

it emits

{key1: val1, key2: (elem1, 2, 3)}
?

(Sorry, don't know how to escape braces).

(Btw, I've resorted to JSON for solving my problem, but still wonder why there is no equivalent for this in the YAML modules.)

Re: Dumping compact YAML
by ELISHEVA (Prior) on Jul 21, 2009 at 12:44 UTC

    You have posted two questions today that seem concerned with "heaviness": this node and Counting line of code. Unless you are working with a large number of records (100K and up) and have very tight memory, disk-space, bandwidth, or interprocess communication constraints (e.g. realtime applications), saving a few whitespace characters is not going to make much difference to the speed or efficiency of your program. In the example above you saved a grand total of 10 characters, or (N*4)-2 in general, where N is the number of items in your list.

    Perhaps you could share with us the reason this is an issue for you? It might help us do a better job of suggesting solutions.

    Even if you have a very good reason for being concerned about space or format, I caution you about using JSON in place of YAML. Despite claims that JSON is a subset of YAML there are some important and potentially significant differences between the two. Please see Re: Caching or using values across multiple programs for details.

    Best, beth

      Thanks for the responses.

      For my application, I am storing database row diffs in a TEXT column. For each change (UPDATE, INSERT, DELETE) a diff will be stored to allow undo/redo to previous versions. In many cases the change is expected to be an UPDATE of only one or two columns (out of several or many). I will be storing the changes as a hash (or, technically, a hashref in Perl) of column names and values.
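
      A minimal sketch of what such a per-row diff could look like, assuming old and new rows are available as hashrefs. The helper name row_diff and the sample column names are hypothetical, and this version records only the new values; an undo log would need the old values stored the same way.

```perl
use strict;
use warnings;

# Hypothetical helper: return a hashref holding only the columns whose
# values differ between the old and new versions of a row.
sub row_diff {
    my ($old, $new) = @_;
    my %diff;
    for my $col (keys %$new) {
        my ($o, $n) = ($old->{$col}, $new->{$col});
        # undef stands in for SQL NULL; two NULLs count as unchanged.
        next if !defined $o && !defined $n;
        next if defined $o && defined $n && $o eq $n;
        $diff{$col} = $n;
    }
    return \%diff;
}

# Sample rows (made-up columns): only "price" changes.
my $old  = { id => 7, name => 'foo', price => 10, note => undef };
my $new  = { id => 7, name => 'foo', price => 12, note => undef };
my $diff = row_diff($old, $new);
```

      The resulting hashref is what would then be serialized into the TEXT column.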

      YAML or JSON interests me because of the balance between readability and compactness. Of course, shaving off whitespace will only save a few bytes in this case and in my particular example. But compacting a deep data structure obviously saves a bit more. For example, compare the compact YAML representation of [1, [2, [3, [4, [5, [6, [7, [8, [9, 10]]]]]]]]] (49 bytes) with its non-compact one (230 bytes). That is roughly a 4.7x compression ratio.

        Well, if you want to bring compression into the picture, the difference is back down to around 14-20 bytes, which again is most likely chump change in the big picture. :)

        $ l {foo,bar}.yml*
        -rw-r--r-- 1 fletch fletch 275 Jul 21 11:28 bar.yml
        -rw-r--r-- 1 fletch fletch  91 Jul 21 11:28 bar.yml.bz2
        -rw-r--r-- 1 fletch fletch  80 Jul 21 11:29 bar.yml.gz
        -rw-r--r-- 1 fletch fletch 110 Jul 21 11:29 foo.yml
        -rw-r--r-- 1 fletch fletch  77 Jul 21 11:29 foo.yml.bz2
        -rw-r--r-- 1 fletch fletch  61 Jul 21 11:29 foo.yml.gz
        $ for i in {foo,bar}.yml ; { print $i ; cat $i }
        foo.yml
        ---
        a: [1, [2, [3, [4, [5, [6, [7, [8, [9, [10]]]]]]]]]]
        b: [1, [2, [3, [4, [5, [6, [7, [8, [9, [10]]]]]]]]]]
        bar.yml
        ---
        a:
        - 1
        - - 2
          - - 3
            - - 4
              - - 5
                - - 6
                  - - 7
                    - - 8
                      - - 9
                        - - 10
        b:
        - 1
        - - 2
          - - 3
            - - 4
              - - 5
                - - 6
                  - - 7
                    - - 8
                      - - 9
                        - - 10

        Of course, if you're compressing before tossing blobs into your DB you've lost at least immediate readability (but then, on the other other hand, you're just a short helper utility away from getting it back in hyoomon-readable, nicely indented form).

        As another suggestion, if you've got a (relatively) small class of input data you might just roll your own serialize routine which spits out a more compact YAML representation rather than using one of the off-the-shelf modules.
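
        A rough sketch of such a roll-your-own routine, assuming the data consists only of hashrefs, arrayrefs and plain scalars. The name to_flow_yaml is made up for illustration and is not part of any YAML module. It emits YAML flow style and double-quotes anything that is not a simple word or integer. Known limitations: a string such as "2" comes out as the integer 2, and control characters other than newline and tab are not escaped.

```perl
use strict;
use warnings;

# Hypothetical helper: serialize nested hashrefs/arrayrefs/scalars as
# single-line YAML flow style.
sub to_flow_yaml {
    my ($data) = @_;

    if (ref $data eq 'HASH') {
        my @pairs;
        for my $key (sort keys %$data) {    # sorted for stable output
            push @pairs, to_flow_yaml($key) . ': ' . to_flow_yaml($data->{$key});
        }
        return '{' . join(', ', @pairs) . '}';
    }
    if (ref $data eq 'ARRAY') {
        return '[' . join(', ', map { to_flow_yaml($_) } @$data) . ']';
    }
    die "unsupported reference type: " . ref($data) if ref $data;

    return '~' unless defined $data;        # YAML null

    # Plain integers pass through unquoted.
    return $data if $data =~ /\A-?(?:0|[1-9][0-9]*)\z/;

    # Simple words pass through unless they look like a YAML boolean/null.
    return $data
        if $data =~ /\A[A-Za-z_][A-Za-z0-9_]*\z/
        && $data !~ /\A(?:null|true|false|yes|no|on|off|y|n)\z/i;

    # Everything else is double-quoted with minimal escaping.
    (my $quoted = $data) =~ s/(["\\])/\\$1/g;
    $quoted =~ s/\n/\\n/g;
    $quoted =~ s/\t/\\t/g;
    return qq{"$quoted"};
}

print to_flow_yaml({ key1 => 'val1', key2 => [ 'elem1', 2, 3 ] }), "\n";
# prints: {key1: val1, key2: [elem1, 2, 3]}
```

        Anything outside that small class of input (objects, code refs, odd scalars) is better left to one of the real modules.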

        The cake is a lie.

Re: Dumping compact YAML
by Your Mother (Archbishop) on Jul 21, 2009 at 18:08 UTC

    JSON::XS is, in my view, the best serialization module we've got. Unless you're serializing objects or code or something that JSON can't cover, I'd pick it over YAML. I use YAML for its human readability and ability to dump regular expressions and other special structures. JSON(::XS) is the best choice for plain data structures.
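
    For plain data the compact output looks like this. The sketch uses the core JSON::PP module so that it runs without installing anything; JSON::XS provides the same new/canonical/encode/decode calls used here. The sample data is the structure from the original question.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, same interface as JSON::XS for these calls

my $data = { key1 => 'val1', key2 => [ 'elem1', 2, 3 ] };

# canonical() sorts hash keys so the output is stable across runs.
my $json = JSON::PP->new->canonical->encode($data);
print "$json\n";
# prints: {"key1":"val1","key2":["elem1",2,3]}

# Round trip back into a Perl structure.
my $back = JSON::PP->new->decode($json);
```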

Re: Dumping compact YAML
by Fletch (Bishop) on Jul 21, 2009 at 13:43 UTC

    If you're that worried about 5 bytes perhaps YAML (and/or Perl) aren't really what you should be using.

    $ wc
    key1: val1
    key2:
      - elem1
      - 2
      - 3
          5       9      39
    $ wc
    {key1: val1, key2: (elem1, 2, 3)}
          1       6      34

    (Granted, your inlined form isn't valid YAML, and the non-inline sequence will eat up a couple of extra characters per additional item, but the point remains.)


Re: Dumping compact YAML
by leocharre (Priest) on Jul 21, 2009 at 15:32 UTC
    YAML helps solve a set of problems. These do not specifically include a) compactness of config options or b) speed of loading config options.

    You don't want YAML to have these things. You do not want YAML to be able to be written the way you want it.