I agree 100% with the C-Solutions :). "split" is only reliable if the key cannot contain spaces; if it can, CSV might be a lot easier (and more portable).
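To illustrate the problem with keys that contain spaces, here is a minimal sketch using core Perl only. The config lines and key names are made up for the example; the parsing uses the same `split " ", $_, 2` idiom as the "slurp" entry in the benchmark.

```perl
use strict;
use warnings;

# Hypothetical config lines (not from the thread); the second key has a space.
my @lines = (
    "timeout 30\n",
    "log file /var/log/app.log\n",
);

my %config;
for my $line (@lines) {
    chomp (my $copy = $line);
    # Split on whitespace into at most two fields: everything after the
    # first run of whitespace is taken as the value.
    my ($key, $value) = split " ", $copy, 2;
    $config{$key} = $value;
}

# "timeout" parses as intended, but "log file" is cut at its first space:
# the key becomes "log" and the value becomes "file /var/log/app.log".
printf "%-8s => %s\n", $_, $config{$_} for sort keys %config;
```

A format with explicit field separators and quoting, such as CSV, does not have this ambiguity, because the key is delimited by something other than the first space.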

I also don't think 300 is a *big* config file.

    use strict;
    use warnings;

    use Benchmark qw(timethese cmpthese);
    use YAML qw();
    use YAML::Syck qw();
    use XML::Simple qw();
    use Config::Fast qw();
    use Text::CSV_XS;

    # Test data: keys x0 .. x3000, values of random length
    my $x = "x" x 100;
    my %bigHash = map { $_ => "Str ".substr $x, int rand 100 } map { "x$_" } 0..3000;

    # Write the same data in every format under test
    YAML::DumpFile ("delme.yml", \%bigHash);

    open my $out, ">", "delme.xml" or die "Can't create delme.xml: $!\n";
    print $out XML::Simple::XMLout (\%bigHash);
    close $out;

    open $out, ">", "delme.fst" or die "Can't create delme.fst: $!\n";
    print $out "$_ $bigHash{$_}\n" for keys %bigHash;
    close $out;

    my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\n" });
    open $out, ">", "delme.csv" or die "Can't create delme.csv: $!\n";
    $csv->print ($out, [ $_, $bigHash{$_} ]) for keys %bigHash;
    close $out;

    # Time reading each file back into a hash, at least 3 CPU seconds per entry
    cmpthese ( -3, {
        YAML  => sub { my $newHash = YAML::LoadFile ("delme.yml"); },
        Syck  => sub { my $newHash = YAML::Syck::LoadFile ("delme.yml"); },
        fast  => sub { my $newHash = Config::Fast::fastconfig ("delme.fst"); },
        XML   => sub { my $newHash = XML::Simple::XMLin ("delme.xml"); },
        slurp => sub {
            local @ARGV = "delme.fst";
            my %newHash = map { split " ", $_, 2 } <>;
            },
        csv   => sub {
            open my $fh, "<", "delme.csv" or die "Can't open delme.csv: $!\n";
            my %newHash;
            while (my $row = $csv->getline ($fh)) {
                $newHash{$row->[0]} = $row->[1];
                }
            },
        });

=>

            Rate  YAML   XML  fast   csv  Syck slurp
    YAML  1.39/s    --  -10%  -54%  -97%  -98%  -99%
    XML   1.54/s   11%    --  -49%  -97%  -98%  -99%
    fast  3.02/s  117%   96%    --  -94%  -96%  -98%
    csv   47.2/s 3289% 2959% 1462%    --  -43%  -65%
    Syck  82.2/s 5804% 5228% 2622%   74%    --  -39%
    slurp  136/s 9638% 8689% 4389%  187%   65%    --
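For anyone reading the table for the first time: as far as I understand cmpthese, each cell is the percentage by which the row's rate is faster (positive) or slower (negative) than the column's rate. A small sketch of that arithmetic, using two rates copied from the table; the helper name `relative_pct` is mine, not part of Benchmark.

```perl
use strict;
use warnings;

# Rates (iterations per second) copied from the table above.
my %rate = (csv => 47.2, Syck => 82.2);

# Percentage by which $row is faster (positive) or slower (negative)
# than $col. Hypothetical helper, written only for this illustration.
sub relative_pct {
    my ($row, $col) = @_;
    return sprintf "%.0f%%", 100 * $rate{$row} / $rate{$col} - 100;
}

print "Syck vs csv: ", relative_pct ("Syck", "csv"), "\n";   # prints 74%
print "csv vs Syck: ", relative_pct ("csv", "Syck"), "\n";   # prints -43%
```

Recomputing other cells from the printed rates can be off by a percentage point, since the rates in the table are themselves rounded.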

Enjoy, Have FUN! H.Merijn

Re^3: fastest file processing Config file format
by GrandFather (Saint) on Sep 29, 2009 at 19:08 UTC

    I chose 300 for two reasons: it's a reasonable guess at what the OP might mean by 'some hundreds', and even the slowest configuration technique I tried meets the time criteria the OP gave for a configuration file of that size.

    The slurp solution wasn't intended as a reliable way to handle configuration information, but as an indicative upper limit for a Perl solution to the problem. It's interesting to note, however, that fastconfig uses the same file format and so has the same potential issues as the slurp solution.


    True laziness is hard work