Re^2: Best way to store/sum multiple-field records?

Re^3: Best way to store/sum multiple-field records? by GrandFather (Saint) on Dec 23, 2014 at 01:34 UTC
Never code for "efficiency". Instead code for clarity and maintainability. Compared with almost anything else your code does, declaring variables takes no time at all. Even if it took a huge amount of time like 1/1000 of a second, that is still tiny compared to the time it takes to read a line from disk and process it. And even if it represented a large portion of the time for each loop iteration, unless you are processing thousands of lines that overhead just isn't noticeable. In practice the overhead is likely to be much less than 1 millionth of a second and nothing to worry about ever. Just remember: premature optimization is the root of all evil.

Perl is the programming world's equivalent of English
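To make the "clarity first" advice concrete, here is a minimal sketch of a per-user summing loop with the variables declared inside the loop, in the narrowest scope that needs them. The field layout (user ID, amount, name, pipe-separated) and the names `sum_by_user` and `%total` are assumptions for illustration; they are not taken from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: sum the second field per user ID.
# Assumes records look like "USERID1|2215|Jones|" (ID, amount, name).
sub sum_by_user {
    my @records = @_;
    my %total;
    for my $record (@records) {
        # Declared per iteration, so no value from a previous record
        # can leak into the next one.
        my ( $id, $amount, $name ) = split /\|/, $record;
        $total{$id}{name} = $name;
        $total{$id}{sum} += $amount;
    }
    return \%total;
}

my $totals = sum_by_user(
    'USERID1|2215|Jones|',
    'USERID1|1000|Jones|',
    'USERID3|1495|Dole|',
);

# Print one line per user: ID, name, summed amount.
printf "%s %s %d\n", $_, $totals->{$_}{name}, $totals->{$_}{sum}
    for sort keys %$totals;
```

Keeping `$id`, `$amount` and `$name` inside the loop body means a reader never has to check whether they are used, or still hold old data, after the loop ends.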
Re^3: Best way to store/sum multiple-field records? by Anonymous Monk on Dec 23, 2014 at 01:36 UTC
> I would have thought that declaring variables over and over would be less efficient than declaring them once at the start - that was my reasoning for declaring them before the loop, anyways...

Not only 'premature optimization is the root of all evil'; not only such a microoptimization is completely meaningless; but it's actually the other way around... declaring variables inside a loop is quite a bit faster. I guess due to Perl's own optimizations...

    use strict;
    use warnings;
    use Benchmark qw( cmpthese );

    my @strings = qw(
        USERID1|2215|Jones|
        USERID1|1000|Jones|
        USERID3|1495|Dole|
        USERID2|2500|Francis|
        USERID2|1500|Francis|
    );

    cmpthese(
        1_000_000,
        {
            outside => sub {
                my ( $x, $y, $z );
                for (@strings) {
                    ( $x, $y, $z ) = split /\|/;
                }
            },
            inside => sub {
                for (@strings) {
                    my ( $x, $y, $z ) = split /\|/;
                }
            }
        }
    );

result:

                Rate outside  inside
    outside 109890/s      --    -38%
    inside  176678/s     61%      --
Re^4: Best way to store/sum multiple-field records? by choroba (Cardinal) on Dec 23, 2014 at 02:10 UTC
To speed up split, specify the number of elements: `($x, $y, $z) = split /\|/, $_, 3;`

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
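One caveat worth checking before adopting an explicit limit: with a positive LIMIT, the last field keeps whatever remains of the string, including any trailing separator. The sketch below uses one record from the benchmark data; the `fields` helper is hypothetical. Note also that `perldoc -f split` says that when the result is assigned to a list of scalars and no limit is given, Perl already applies a limit one larger than the number of variables, so the explicit form may gain little here.

```perl
use strict;
use warnings;

# Hypothetical helper: split a pipe-delimited record, optionally with a LIMIT.
sub fields {
    my ( $record, $limit ) = @_;
    return defined $limit
        ? split( /\|/, $record, $limit )
        : split( /\|/, $record );
}

my $record = 'USERID1|2215|Jones|';

# Join with commas so the field boundaries are visible.
print join( ',', fields($record) ),      "\n";    # no limit: trailing empty field dropped
print join( ',', fields( $record, 3 ) ), "\n";    # third field keeps the trailing '|'
print join( ',', fields( $record, 4 ) ), "\n";    # fourth field is an empty string
```

For records that end in a separator, as these do, a limit of 3 therefore changes the third value from `Jones` to `Jones|`, so the result should be compared before the timing is.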
Re^4: Best way to store/sum multiple-field records? by BrowserUk (Patriarch) on Dec 23, 2014 at 02:52 UTC
> Not only 'premature optimization is the root of all evil'; not only such a microoptimization is completely meaningless; but it's actually the other way around... declaring variables inside a loop is quite a bit faster.

Really? Go figure. (It's (much) more complicated than that!):

    use strict;
    use warnings;
    use Benchmark qw( cmpthese );

    my @strings = qw(
        USERID1|2215|Jones|
        USERID1|1000|Jones|
        USERID3|1495|Dole|
        USERID2|2500|Francis|
        USERID2|1500|Francis|
    );

    cmpthese(
        -1,
        {
            outside => sub {
                my ( $x, $y, $z );
                for (@strings) {
                    ( $x, $y, $z ) = split /\|/;
                }
            },
            outside2 => sub {
                my ( $x, $y, $z );
                for (@strings) {
                    ( $x, $y, $z ) = split /\|/, 3;
                }
            },
            inside => sub {
                for (@strings) {
                    my ( $x, $y, $z ) = split /\|/;
                }
            },
            inside2 => sub {
                for (@strings) {
                    my ( $x, $y, $z ) = split /\|/, 3;
                }
            },
        }
    );
    __END__
    C:\test>junk
                 Rate  outside   inside  inside2 outside2
    outside   58201/s       --     -38%     -71%     -73%
    inside    93659/s      61%       --     -53%     -57%
    inside2  197610/s     240%     111%       --     -10%
    outside2 218802/s     276%     134%      11%       --

When you can explain that; then you may pontificate on the subject.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^5: Best way to store/sum multiple-field records? (carte blanche) by tye (Sage) on Dec 23, 2014 at 05:39 UTC
> When you can explain that; then you may pontificate on the subject.

Explanation: You wrote broken code.

    $_ = 'USERID1|2215|Jones|';
    my( $x, $y, $z ) = split /\|/;
    print "( $x, $y, $z )\n";
    ( $x, $y, $z ) = split /\|/, 3;
    print "( $x, $y, $z )\n";
    __END__
    ( USERID1, 2215, Jones )
    ( 3, , )

Thanks for the permission. (:

Update: Oh, and the explanation for the other part of the "(much) more complicated" mystery:

                 Rate  outside   inside  inside2 outside2
    outside   58201/s       --     -38%     -71%     -73%
    inside    93659/s      61%       --     -53%     -57%
    inside2  197610/s     240%     111%       --     -10%
    outside2 218802/s     276%     134%      11%       --

That is, why is "inside" faster than "outside" while "outside2" is faster than "inside2"? Well, that's the classic point I try to get people to remember all the time: "11%" is simply "noise". Whether "inside2" or "outside2" will "win" depends on mostly random stuff (which one gets run first being the least random contributor that I've noticed).

- tye
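The bug is easiest to see side by side: the second argument to `split` is the string to split, so a limit can only be passed as the third argument. A small sketch, using the hypothetical helper names `split_wrong` and `split_right`:

```perl
use strict;
use warnings;

# split PATTERN, EXPR, LIMIT: the second argument is the string to split.
sub split_wrong {
    local $_ = shift;
    return split /\|/, 3;        # splits the string "3"; $_ is never read
}

sub split_right {
    local $_ = shift;
    return split /\|/, $_, 3;    # splits $_ into at most 3 fields
}

my $record = 'USERID1|2215|Jones|';
my @wrong  = split_wrong($record);
my @right  = split_right($record);

print scalar(@wrong), " field(s): @wrong\n";
print scalar(@right), " field(s): @right\n";
```

The first call returns a single field, `3`, whatever the record contains. That would be consistent with the "2" variants in the benchmark looking so much faster: they were splitting a one-character string rather than the record.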
Re^6: Best way to store/sum multiple-field records? (carte blanche) by Laurent_R (Canon) on Dec 23, 2014 at 19:02 UTC
Re^7: Best way to store/sum multiple-field records? ("significant") by tye (Sage) on Dec 23, 2014 at 19:49 UTC
Re^6: Best way to store/sum multiple-field records? (carte blanche) by BrowserUk (Patriarch) on Dec 23, 2014 at 10:33 UTC
Re^5: Best way to store/sum multiple-field records? by GotToBTru (Prior) on Dec 23, 2014 at 21:42 UTC
I feel better about this now ;)

1 Peter 4:10
Re^6: Best way to store/sum multiple-field records? by BrowserUk (Patriarch) on Dec 23, 2014 at 21:49 UTC