Re: Performance problems on splitting long strings

Re^2: Performance problems on splitting long strings by Laurent_R (Canon) on Jan 30, 2014 at 19:40 UTC
Thank you. I know how to use the Benchmark module, so that is not the problem. What I am looking for is other ideas on how to split my data more efficiently, so that I can benchmark them.
Re^3: Performance problems on splitting long strings by hdb (Monsignor) on Jan 30, 2014 at 20:12 UTC
Why then did you not bother to write a few lines like: `use strict; use warnings; use Benchmark 'cmpthese'; my $string = map { ('a'..'z')[rand 26] } 1..30; my @sub_fields; cmpthese( -1, { regex1 => sub { @sub_fields = $string =~ /\w{5}/g }, regex2 => sub { @sub_fields = $string =~ /.{5}/g }, unpack => sub { @sub_fields = unpack '(A4)*', $string }, substr => sub { @sub_fields = map { substr $string, 5*$_, 5 } 0..length( $string )/5-1 }, });` that already shows that the regex idea is vastly inferior:
`             Rate substr unpack regex1 regex2`
`substr   696486/s     --   -57%   -94%   -94%`
`unpack  1603093/s   130%     --   -85%   -86%`
`regex1 10731041/s  1441%   569%     --    -4%`
`regex2 11165392/s  1503%   596%     4%     --`
Re^4: Performance problems on splitting long strings by Cristoforo (Curate) on Jan 30, 2014 at 20:42 UTC
The $string variable contains '30'. I think you meant `my $string = join '', map { ('a'..'z')[rand 26] } 1..30;` With this correction, unpack is faster. :-)
`           Rate regex1 regex2 substr unpack`
`regex1 225055/s     --    -1%    -4%   -53%`
`regex2 228189/s     1%     --    -3%   -53%`
`substr 235177/s     4%     3%     --   -51%`
`unpack 481548/s   114%   111%   105%     --`
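The point about `$string` containing '30' comes down to context: assigning `map` to a scalar yields the number of elements it produced, while `join` puts `map` in list context. A minimal sketch of the difference (the variable name `$count` is introduced here only for illustration):

```perl
use strict;
use warnings;

# map in scalar context returns how many elements it generated,
# not the elements themselves.
my $count = map { ('a'..'z')[rand 26] } 1..30;

# join evaluates map in list context and concatenates the letters.
my $string = join '', map { ('a'..'z')[rand 26] } 1..30;

print "scalar context: $count\n";                 # prints 30
print "list context:   $string\n";                # 30 random lowercase letters
print "length:         ", length($string), "\n";  # prints 30
```

With the scalar-context version, every candidate in the benchmark was splitting the two-character string '30', which is why the first set of timings is not comparable with the second.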
Re^5: Performance problems on splitting long strings by hdb (Monsignor) on Jan 30, 2014 at 20:53 UTC
Re^4: Performance problems on splitting long strings by Not_a_Number (Prior) on Jan 30, 2014 at 20:57 UTC
Probably of minor importance to your benchmark, but your `unpack` template should be: `unpack '(A5)*', $string # Not '(A4)*'`
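To see what the field width and the group repeat change in practice, here is a small sketch on a fixed 30-character sample string (the sample string and the array names are assumptions made for the illustration, not taken from the thread):

```perl
use strict;
use warnings;

# Fixed 30-character sample: 'a'..'z' followed by 'a'..'d'.
my $string = join '', 'a'..'z', 'a'..'d';

my @one   = unpack '(A5)',  $string;  # no repeat: only the first field
my @four  = unpack '(A4)*', $string;  # 4-char fields: 7 full ones plus a 2-char remainder
my @five  = unpack '(A5)*', $string;  # 5-char fields: 6 fields, same split as the regex
my @regex = $string =~ /.{5}/g;

print scalar(@one),  " field:  @one\n";
print scalar(@four), " fields: @four\n";
print scalar(@five), " fields: @five\n";
print "unpack '(A5)*' matches /.{5}/g\n" if "@five" eq "@regex";
```

Note that `A` strips trailing spaces and nulls from each field when unpacking; that makes no difference for purely alphabetic data like this, but it could for space-padded records.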
Re^4: Performance problems on splitting long strings by Laurent_R (Canon) on Jan 30, 2014 at 22:35 UTC
"Why then did you not bother to write a few lines like:..."

Thank you for your answer, hdb. I think I said quite clearly in the original post that I intended to do a benchmark and that I was really looking for ideas on possibly more efficient ways of doing the splitting, in order to benchmark them along with the ideas I explained: possibly a Perl function unknown to me, a use I had not thought of for a function I do know, or a module I don't know about.

As for the `unpack` function, I have used it about 5 times in the last 10 years; I had forgotten about the '*' option and missed it when I looked at the documentation (which, in my humble opinion, could be clearer). Lacking that option, working my way around it was possible, but would have made the benchmark less significant because of the added penalty of the workaround.

I will benchmark all the options that have been proposed here and publish the results later in this post.
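The post does not say which workaround was considered. One possibility, shown here purely as an assumption, is to compute an explicit repeat count from the string length and build the template from it; the `*` repeat makes that extra step unnecessary. The names `$width`, `$n`, `@star`, and `@counted` are hypothetical:

```perl
use strict;
use warnings;

my $string = 'abcdefghijklmnopqrstuvwxyzabcd';  # 30-character sample
my $width  = 5;

# With the '*' repeat, unpack keeps applying the group until the data runs out.
my @star = unpack "(A$width)*", $string;

# Hypothetical workaround without '*': compute the count and build the template.
my $n       = int( length($string) / $width );
my @counted = unpack "(A$width)$n", $string;

print "star:    @star\n";
print "counted: @counted\n";
```

Both calls return the same six fields here. The counted variant has to compute a length and interpolate a new template on every call, which is the kind of overhead that would blur a timing comparison.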
Re^5: Performance problems on splitting long strings by hdb (Monsignor) on Jan 31, 2014 at 07:33 UTC
Re^6: Performance problems on splitting long strings by Laurent_R (Canon) on Jan 31, 2014 at 18:41 UTC
Re^5: Performance problems on splitting long strings by AnomalousMonk (Archbishop) on Jan 31, 2014 at 22:57 UTC
Re^6: Performance problems on splitting long strings by Laurent_R (Canon) on Feb 01, 2014 at 00:55 UTC