
Re^2: Rosetta Code: Long List is Long (faster)

by eyepopslikeamosquito (Archbishop)
on Dec 27, 2022 at 21:38 UTC ( [id://11149141]=note )


in reply to Re: Rosetta Code: Long List is Long (faster)
in thread Rosetta Code: Long List is Long

I was shocked to notice this thread is almost a month old already ... it's better to publish this at long last, be it "final" optimized version or not (I'm sure it can be improved a lot), before the thread is dead cold and whoever participated have to make effort to read their own code because of time elapsed

Thanks for posting. I suggest you just take your time and enjoy it without worrying too much about how old the thread is. This is actually one of my favourite features of Perl Monks, so much so that I wrote a meditation about it: Necroposting Considered Beneficial. :)

Your reply motivated me into finally getting around to installing Linux on my newish Windows laptop. Being lazy, I did this with a single wsl --install command to install the default Ubuntu distribution of Linux from the Microsoft Store. AFAIK, the main alternative is to install VMware, followed by multiple different Linux distros.

Anyways, after doing that I saw similar performance of both my fastest Perl and C++ versions.

For llil2grt.cpp on Windows 11:

llil2grt start
get_properties CPU time : 4.252 secs
emplace set sort CPU time : 1.282 secs
write stdout CPU time : 1.716 secs
total CPU time : 7.254 secs
total wall clock time : 7 secs

On Ubuntu WSL2 Linux (5.15.79.1-microsoft-standard-WSL2), running on the same hardware, compiled with the identical g++ -o llil2grt -std=c++11 -Wall -O3 llil2grt.cpp, we see it runs slightly faster:

llil2grt start
get_properties CPU time : 3.63153 secs
emplace set sort CPU time : 1.09085 secs
write stdout CPU time : 1.41164 secs
total CPU time : 6.13412 secs
total wall clock time : 6 secs
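For anyone wanting to collect comparable numbers, here is a minimal sketch of one way to measure CPU time and wall clock time around a piece of work. It is not the timing code from llil2grt.cpp; the names Timings and time_call are hypothetical, and it assumes std::clock for CPU time and std::chrono::steady_clock for elapsed time. The resolution and exact meaning of std::clock vary by platform, so treat cross-OS comparisons with some care.

```cpp
#include <chrono>
#include <ctime>

// Hypothetical result type: seconds of CPU time and of elapsed (wall clock) time.
struct Timings {
    double cpu_secs;
    double wall_secs;
};

// Run work() once and report how long it took by both measures.
template <typename F>
Timings time_call(F&& work) {
    std::clock_t c0 = std::clock();
    std::chrono::steady_clock::time_point w0 = std::chrono::steady_clock::now();

    work();

    std::chrono::steady_clock::time_point w1 = std::chrono::steady_clock::now();
    std::clock_t c1 = std::clock();

    Timings t;
    t.cpu_secs = static_cast<double>(c1 - c0) / CLOCKS_PER_SEC;
    t.wall_secs = std::chrono::duration<double>(w1 - w0).count();
    return t;
}
```

Wrapping each phase (reading, sorting, writing) in its own time_call would give a per-phase breakdown similar in shape to the output above.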

Sadly, it's becoming increasingly obvious that to make this run significantly faster, I'll probably have to change the simple and elegant:

hash_ret[word] -= count;
into something much uglier, possibly containing "emplace_hint" or other std::map claptrap ... and I just can't bring myself to do that. :) More attractive is to leave the simple and elegant one-liner alone and instead try to inject a faster custom std::map memory allocator (I have no idea how to do that yet).
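A rough sketch of what the allocator idea might look like, for discussion only. Everything here (Arena, ArenaAllocator, CountMap, add_count, the long long value type, the 1 MB block size) is a hypothetical stand-in and not taken from llil2grt.cpp. The idea is a bump-pointer arena that hands out map nodes from large blocks and frees them all at once, so the one-liner stays untouched. Whether it is actually faster would need measuring, and very old standard libraries may want more allocator members than this minimal C++11 form provides.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bump-pointer arena: carves allocations out of large blocks
// and releases everything at once when the arena is destroyed.
class Arena {
public:
    explicit Arena(std::size_t block_size = 1u << 20)
        : block_size_(block_size), capacity_(0), used_(0) {}
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // align must be a power of two no larger than alignof(std::max_align_t).
    void* allocate(std::size_t bytes, std::size_t align) {
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (blocks_.empty() || start + bytes > capacity_) {
            // Current block is full: start a new one (oversized requests get their own).
            capacity_ = bytes > block_size_ ? bytes : block_size_;
            blocks_.push_back(std::unique_ptr<char[]>(new char[capacity_]));
            start = 0;
        }
        used_ = start + bytes;
        return blocks_.back().get() + start;
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    std::size_t block_size_;
    std::size_t capacity_;  // size of the current block
    std::size_t used_;      // bytes consumed in the current block
    std::vector<std::unique_ptr<char[]>> blocks_;
};

// Minimal C++11 allocator that forwards to an Arena.
template <typename T>
struct ArenaAllocator {
    typedef T value_type;
    Arena* arena;

    explicit ArenaAllocator(Arena* a) : arena(a) {}
    template <typename U>
    ArenaAllocator(const ArenaAllocator<U>& other) : arena(other.arena) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(arena->allocate(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) {}  // no-op: memory is freed with the arena
};

template <typename T, typename U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return a.arena == b.arena;
}
template <typename T, typename U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return !(a == b);
}

// Assumed map type; the real key/value types in llil2grt.cpp may differ.
typedef std::map<std::string, long long, std::less<std::string>,
                 ArenaAllocator<std::pair<const std::string, long long>>> CountMap;

// The simple and elegant one-liner, unchanged.
void add_count(CountMap& hash_ret, const std::string& word, long long count) {
    hash_ret[word] -= count;
}
```

To use it, create the Arena first, build an ArenaAllocator from its address, and pass that to the CountMap constructor. The arena must outlive the map. Only the map nodes come from the arena; any heap memory owned by the std::string keys still goes through the default allocator.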

Conversely, my fastest Perl solution llil2grt.pl runs slightly slower on Ubuntu:

llil2grt start
get_properties : 10 secs
sort + output : 22 secs
total : 32 secs
compared to 10, 20, 30 secs on Windows 11. Perl v5.34.0 on Linux vs Strawberry Perl v5.32.0 on Windows.

Update: this shortened version, llil2cmd-long.pl, runs a bit faster on Ubuntu (but not on Windows):

llil2cmd-long start
get_properties : 7 secs
sort + output : 21 secs
total : 28 secs

The injection of CR into output lines is only required on Windows (actually, not required at all)

Yes, you're right, Windows nowadays seems perfectly happy with text files terminated with \n rather than the traditional DOS \r\n. By default, Perl and C++ both output text files with "\n" on Unix and "\r\n" on Windows. I'll update my test file generator to generate identical "\n" terminated files on both Unix and Windows.
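On the C++ side, one way to get identical "\n" terminated files on both systems is to open the stream in binary mode, which switches off the "\n" to "\r\n" translation that Windows text streams perform (roughly the counterpart of Perl's binmode). A minimal sketch, with hypothetical helper names write_lf_lines and read_lines; the reader also tolerates files that still carry "\r\n":

```cpp
#include <fstream>
#include <string>
#include <vector>

// Write each line terminated by a bare '\n' on every platform.
// std::ios::binary disables newline translation on Windows.
bool write_lf_lines(const std::string& path, const std::vector<std::string>& lines) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    for (const std::string& line : lines) {
        out << line << '\n';
    }
    return static_cast<bool>(out);
}

// Read lines in binary mode, accepting either "\n" or "\r\n" endings.
std::vector<std::string> read_lines(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // strip the CR left over from a "\r\n" ending
        }
        lines.push_back(line);
    }
    return lines;
}
```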

Update: Test file generators updated here: Re^3: Rosetta Code: Long List is Long (Test File Generators). Curiously, \n seems to be slower than \r\n on Windows if you don't set binmode! I am guessing that chomp is slower with \n than with \r\n on a Windows text stream.

Update: Are you running Strawberry Perl on Windows? Which version? (Trying to understand why your Windows Perl seems slower than mine).

Update: The processor and SSD disk (see Novabench top scoring disks) on my HP laptop:

Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz, 1501 Mhz, 4 Core(s), 8 Logical Processor(s)
Disk (238 GB SSD): Intel Optane+238GBSSD (ave sequential read 513 MB/s; ave sequential write 280 MB/s) - score 73

Updated: Noted that llil2grt.pl runs slightly slower on Ubuntu Linux than on Windows, along with detailed timings. Clarified that I'm running Windows 11. Added more detail on my laptop hardware. Mentioned llil2cmd-long.pl, developed later.

Re^3: Rosetta Code: Long List is Long (faster)
by Anonymous Monk on Dec 28, 2022 at 13:09 UTC

    By "same PC" to run a test under Windows vs. Linux, I meant classic dual boot with GRUB, as I have, but perhaps virtualization is no longer a hindrance, especially not in this case. Since I compiled llil2grt.cpp in both OSes (with the command line and options you advised), here's what I got:

    $ ./llil2grt big1.txt big2.txt big3.txt >out.txt
    llil2grt start
    get_properties CPU time : 3.67015 secs
    emplace set sort CPU time : 1.19724 secs
    write stdout CPU time : 1.52074 secs
    total CPU time : 6.3882 secs
    total wall clock time : 6 secs

    >llil2grt.exe big1.txt big2.txt big3.txt >out.txt
    llil2grt start
    get_properties CPU time : 5.577 secs
    emplace set sort CPU time : 1.675 secs
    write stdout CPU time : 2.484 secs
    total CPU time : 9.736 secs
    total wall clock time : 10 secs

    I observe the same relative difference running the Perl script. So it looks like it's not an issue with Perl or the Strawberry version you asked me about (which is the latest available, "5.32.1.1-64bit-PDL", BTW). Further, I compiled llil2grt.cpp using the MinGW shell and g++ that came with the older 5.26 Strawberry, and got the same 10 secs.

    I'm clueless as to why this PC is slow with Windows. Perhaps either MS or Dell issued a patch in recent years to address a Kaby Lake CPU (mine is i5-7500T) "vulnerability"? I vaguely remember it was said that performance would suffer if such a patch, guarding against mythical attackers, were applied. Just a guess, sorry if it's ridiculous. On the other hand, the J script performs the same in both OSes.

    Speaking of which, to return to "interpreted language is faster than compiled C++" -- of course it's dataset bias to a large extent. I ran a test with "long" files from another test file generator you suggested previously:

    $ ./llil2grt long1.txt long2.txt long3.txt >out.txt
    llil2grt start
    get_properties CPU time : 0.559273 secs
    emplace set sort CPU time : 0.004393 secs
    write stdout CPU time : 0.003275 secs
    total CPU time : 0.567 secs
    total wall clock time : 1 secs

    $ jconsole llil.ijs long1.txt long2.txt long3.txt out_j.txt
    Read and parse input: 1.50791
    Classify, sum, sort: 0.70953
    Format and write output: 0.00430393
    Total time: 2.22175

    $ diff out.txt out_j.txt
    $

    (NUM_LENGTH changed to 10, otherwise same code)
