Re^2: fastest file processing Config file format

I agree 100% with the C solutions :). "split" is only reliable if the key cannot contain spaces; otherwise CSV might be a lot easier (and more portable).
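To make the split caveat concrete, here is a minimal sketch using only core Perl. The helper name `parse_pair` and the sample lines are made up for illustration; the split call itself is the one used in the slurp benchmark below.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one "key value" line with the same
# split call the slurp benchmark uses.
sub parse_pair {
    my ($line) = @_;
    chomp $line;
    my ($key, $value) = split " ", $line, 2;
    return ($key, $value);
    }

# A key without spaces parses as intended
my @ok  = parse_pair ("timeout 30\n");
# A key with a space ("log file") is cut at the first whitespace
my @bad = parse_pair ("log file /var/log/app.log\n");

print "ok:  key='$ok[0]' value='$ok[1]'\n";
print "bad: key='$bad[0]' value='$bad[1]'\n";
```

The second line comes back with key 'log' and value 'file /var/log/app.log', which is probably not what the config author meant. A format with quoting, such as CSV, avoids that ambiguity.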

I also don't think 300 is a *big* config file. The benchmark below uses 3001 entries instead.

use strict;
use warnings;

use Benchmark qw(timethese cmpthese);
use YAML qw();
use YAML::Syck qw();
use XML::Simple qw();
use Config::Fast qw();
use Text::CSV_XS;

my $x = "x" x 100;
my %bigHash = map { $_ => "Str ".substr $x, int rand 100 } map { "x$_" } 0..3000;

YAML::DumpFile ("delme.yml", \%bigHash);

open my $out, ">", "delme.xml" or die "Can't create delme.xml: $!\n";
print $out XML::Simple::XMLout (\%bigHash);
close $out;

open $out, ">", "delme.fst" or die "Can't create delme.fst: $!\n";
print $out "$_ $bigHash{$_}\n" for keys %bigHash;
close $out;

my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\n" });
open $out, ">", "delme.csv" or die "Can't create delme.csv: $!\n";
$csv->print ($out, [ $_, $bigHash{$_} ]) for keys %bigHash;
close $out;

cmpthese (
    -3, {
    YAML => sub { my $newHash = YAML::LoadFile           ("delme.yml"); },
    Syck => sub { my $newHash = YAML::Syck::LoadFile     ("delme.yml"); },
    fast => sub { my $newHash = Config::Fast::fastconfig ("delme.fst"); },
    XML  => sub { my $newHash = XML::Simple::XMLin       ("delme.xml"); },
    slurp => sub {
        local @ARGV = "delme.fst";
        # Values keep their trailing newline; good enough for timing
        my %newHash = map { split " ", $_, 2 } <>;
        },
    csv   => sub {
        open my $fh, "<", "delme.csv" or die "Can't read delme.csv: $!\n";
        my %newHash;
        while (my $row = $csv->getline ($fh)) {
            $newHash{$row->[0]} = $row->[1];
            }
        },
    });

        Rate  YAML   XML  fast   csv  Syck slurp
YAML  1.39/s    --  -10%  -54%  -97%  -98%  -99%
XML   1.54/s   11%    --  -49%  -97%  -98%  -99%
fast  3.02/s  117%   96%    --  -94%  -96%  -98%
csv   47.2/s 3289% 2959% 1462%    --  -43%  -65%
Syck  82.2/s 5804% 5228% 2622%   74%    --  -39%
slurp  136/s 9638% 8689% 4389%  187%   65%    --
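
For anyone not used to reading a cmpthese chart: each cell says how much faster (or slower) the row is than the column, as a percentage of the column's rate. A small sketch of that arithmetic follows. The helper name `relative_pct` is made up, and because the rates printed above are rounded, recomputed cells can be off by a point from the chart.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage by which $row_rate exceeds $col_rate,
# which is how a chart cell relates its row to its column.
sub relative_pct {
    my ($row_rate, $col_rate) = @_;
    return sprintf "%.0f", 100 * $row_rate / $col_rate - 100;
    }

# Rates (runs per second) copied from the chart above
my %rate = (csv => 47.2, Syck => 82.2, slurp => 136);

printf "slurp vs Syck: %s%%\n", relative_pct ($rate{slurp}, $rate{Syck});
printf "Syck  vs csv:  %s%%\n", relative_pct ($rate{Syck},  $rate{csv});
```

This prints 65% and 74%, matching the corresponding cells in the chart.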

Enjoy, Have FUN! H.Merijn


Re^3: fastest file processing Config file format
by GrandFather (Saint) on Sep 29, 2009 at 19:08 UTC

I chose 300 for two reasons: it's a reasonable guess at what the OP might mean by 'some hundreds', and even the slowest configuration technique I tried meets the time criteria the OP gave for a configuration file of that size.

The slurp solution wasn't intended as a reliable way to handle configuration information, but as an indicative upper limit for a Perl solution to the problem. It's interesting to note, however, that fastconfig uses the same file format and has the same potential issues as the slurp solution.
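
For contrast, a slightly more defensive reader for the same "key value" format might look like the sketch below. It is an illustration only, not code from the benchmarks: the name `read_config`, the `#` comment convention and the sample data are all assumptions, and it still cannot cope with spaces in keys.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: read "key value" lines from a
# filehandle, skip blank lines and #-comments (an assumed convention), and
# die on a line that has a key but no value.
sub read_config {
    my ($fh) = @_;
    my %conf;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ m/^\s*(?:#|$)/;
        my ($key, $value) = split " ", $line, 2;
        defined $value or die "Line $.: no value for key '$key'\n";
        $conf{$key} = $value;
        }
    return \%conf;
    }

# Sample data read through an in-memory filehandle
my $text = "# sample\nname  Str xxx\n\nport 8080\n";
open my $fh, "<", \$text or die "Can't open in-memory file: $!\n";
my $conf = read_config ($fh);
close $fh;

print "$_ => $conf->{$_}\n" for sort keys %$conf;
```

The extra checks cost time per line, so a reader like this would be expected to land somewhat below the slurp figure.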

True laziness is hard work
