in reply to Re^5: Fast Replacement (0.01 seconds)
in thread Fast Replacement
Wait, please tell me I'm reading the results wrong.... In my benchmarks yours was faster. But in your benchmarks, "a", which is your algorithm, is taking 5.xx seconds per iteration, whereas "b", which is mine, is taking 0.17-0.9 seconds per iteration. Your benchmark seems to be showing the regexp approach winning by a landslide.
Dave
Re^7: Fast Replacement (0.01 seconds)
by BrowserUk (Patriarch) on Jun 14, 2013 at 22:13 UTC
Using the eval subroutines was a step too far. Whilst much better than eval for every line, the additional subroutine call still has a substantial impact. Go back to hardcoded tr operations and you get the picture we were both expecting:
Updated benchmark code:
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
by davido (Cardinal) on Jun 14, 2013 at 22:21 UTC
When I have a moment I want to check to see if this is a faster way of finding the 50k-th "!":
It *could* be, because it keeps all the work inside of Perl's internals. If Perl had a native function that said, "find_nth($string, '!')", it would beat any regexp solution, but it doesn't (at least not as a built-in).
Update: "Quantifier in {,} bigger than 32766 in regex; marked by <-- HERE in m/(?:[^!]*!){ <-- HERE 49999}[^!]*!/ at mytest10.pl line 27." (I forgot about that.)
Update 2: Assuming we're only dealing with ASCII, this is trivial using Inline::C. Walking the string and making the change up to 50k times will be trivial, and fast. If I get around to it I'll post an example.
Dave
by davido (Cardinal) on Jun 16, 2013 at 16:52 UTC
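For illustration only, here is a minimal C++ sketch of the "walk the string" idea from Update 2. It is not the code from this thread: the name find_nth_char, its signature, and the choice to count occurrences from 1 are all assumptions made for the example, and it deals in bytes only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the thread): return the 0-based byte offset
// of the n-th occurrence of `needle` in `haystack` (n counts from 1), or -1
// if the haystack holds fewer than n occurrences. Bytes only; no Unicode.
long find_nth_char(const char* haystack, std::size_t len, char needle, long n) {
    assert(haystack != nullptr);
    if (n < 1) return -1;
    for (std::size_t i = 0; i < len; ++i) {
        // Count down on each hit; the n-th hit is the one we want.
        if (haystack[i] == needle && --n == 0) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

A single pass like this has no quantifier limit to run into, so asking for the 50,000th "!" is just a larger n.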
Just for fun... I created an nth_index in C++ that sorta mimics the behavior of index. "Sorta", in that it returns -1 if the nth occurrence isn't found, treats an offset greater than the haystack length as the haystack size-1, and treats an offset less than zero as zero. Where it most obviously falls short is that it can only deal with standard "byte" sized chars; no Unicode.
The results:
This goes back to the original benchmark where I just made copies. I understood the xor-flag method, but I already had this written and didn't bother to rewrite.
There's a gross inefficiency in the C++ version that, if fixed, would greatly improve the speed of that solution: I'm copying the C strings that Perl passes into the subroutine into C++ "string" objects. This is a linear operation, and really totally unnecessary if I were motivated enough to forgo the "string::find" method and roll my own instead. If I were to do that I could just deal directly with C strings and avoid making another copy. But I wanted to play with C++ strings just to see how it would look.
Nevertheless, the results are predictably good. Don't try this on Unicode strings though. And if the needle string has a length greater than one, be aware that my method will detect needles that overlap each other. To avoid that, the C++ function should add the length of the needle to "pos" after each match.
The exercise was beneficial, as I discovered a bug in Inline::CPP's documentation on how to declare user typemaps, which I'll be able to fix in an upcoming release.
Dave
by BrowserUk (Patriarch) on Jun 16, 2013 at 18:39 UTC
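As a rough sketch of the copy-free variant described above (not the code that was benchmarked in this thread), the search can run directly on the caller's buffers. Everything here is an assumption made for illustration: the signature, the explicit length parameters, counting n from 1, and the allow_overlap flag that switches between stepping by one byte and stepping by the needle length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch: find the n-th occurrence (n counts from 1) of `needle`
// in `hay`, starting at `offset`, without copying either buffer.
// Returns the 0-based byte offset of the match, or -1 if there is none.
long nth_index(const char* hay, std::size_t hay_len,
               const char* needle, std::size_t needle_len,
               long n, long offset = 0, bool allow_overlap = true) {
    assert(hay != nullptr && needle != nullptr);
    if (n < 1 || needle_len == 0 || hay_len < needle_len) return -1;

    // Clamp the offset: below zero becomes zero, past the end becomes size-1.
    std::size_t pos = 0;
    if (offset > 0) {
        pos = static_cast<std::size_t>(offset);
        if (pos > hay_len - 1) pos = hay_len - 1;
    }

    const std::size_t last_start = hay_len - needle_len;
    while (pos <= last_start) {
        if (std::memcmp(hay + pos, needle, needle_len) == 0) {
            if (--n == 0) return static_cast<long>(pos);
            // Step one byte to allow overlapping matches, or skip the whole
            // needle to count only non-overlapping ones.
            pos += allow_overlap ? 1 : needle_len;
        } else {
            ++pos;
        }
    }
    return -1;
}
```

With the haystack "aaaa" and the needle "aa", the second match is at offset 1 when overlaps are allowed and at offset 2 when they are not. The naive byte-by-byte comparison is chosen for clarity, not speed.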
I came up with a similar I::C implementation. I'd run it against yours but yours doesn't compile on my machine at the moment:
Got the C++ to compile (it was missing a newline at the end of the file). These are the results:
And the additional test:
Re^7: Fast Replacement (0.01 seconds)
by BrowserUk (Patriarch) on Jun 14, 2013 at 21:52 UTC
Holy crap! You're right! (I saw what I was expecting to see :( ) Unless there is some bug I haven't spotted, ...