in reply to Re^3: external sort performance improved?
in thread external sort performance improved?

Thanks BrowserUk for your response.

1> I am getting the warning "Warning: the specified memory size is being reduced to the available paging memory". The paging file size of this system is set to 12284 MB. (This happens even when I run it on a w2k8 64-bit system with 12 GB RAM.) Let me know if I need to change the paging file size.

2> The output of the sort you mentioned and the output of my code differ when the date and time are the same. With the sort option you provided, the entire line is considered for sorting, so the text after the date and time also gets sorted. I don't want the text lines to be sorted when the date and time are the same. Is there a way to do that? I have attached the output of both sorts.

Your sort:

2012/12/13 @ 19:00:27,792 @ ,, at com.
2012/12/13 @ 19:00:27,792 @ ,, at com.
2012/12/13 @ 19:00:27,792 @ ,, at com.
2012/12/13 @ 19:00:27,792 @ ,, at java.lang.reflect.Method.invoke(Method.java:597)
2012/12/13 @ 19:00:27,792 @ ,, at java.lang.Thread.run(Thread.java:662)
2012/12/13 @ 19:00:27,792 @ ,, at sun.reflect.GeneratedMethodAccessor1387.invoke(Unknown Source)
2012/12/13 @ 19:00:27,792 @ ,, at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
2012/12/13 @ 19:00:27,792 @ ,, at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
2012/12/13 @ 19:00:27,792 @ ,, at sun.rmi.transport.Transport$1.run(Transport.java:159)

My sort:

2012/12/13 @ 19:00:27,792 @ ,, at com.
2012/12/13 @ 19:00:27,792 @ ,, at com.
2012/12/13 @ 19:00:27,792 @ ,, at com.
2012/12/13 @ 19:00:27,792 @ ,, at sun.reflect.GeneratedMethodAccessor1387.invoke(Unknown Source)
2012/12/13 @ 19:00:27,792 @ ,, at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
2012/12/13 @ 19:00:27,792 @ ,, at java.lang.reflect.Method.invoke(Method.java:597)
2012/12/13 @ 19:00:27,792 @ ,, at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
2012/12/13 @ 19:00:27,792 @ ,, at sun.rmi.transport.Transport$1.run(Transport.java:159)
2012/12/13 @ 19:00:27,792 @ ,, at java.lang.Thread.run(Thread.java:662)

Re^5: external sort performance improved?
by BrowserUk (Patriarch) on Apr 17, 2012 at 10:22 UTC
    I am getting the warning as "Warning: the specified memory size is being reduced to the available paging memory". The paging file size of this system is set to 12284 MB. (Even when I run this on w2k8 64 bit 12GB RAM system)

    That is just a warning, it doesn't prevent the sort from working. I'm not sure if it is a bug in the way the program determines the amount of memory available; or if the "paging memory" it talks of is some specialised subset of the available memory.

    Either way, when you get that warning, it means the program will use the maximum amount it thinks it can use.

    The output of sort which you have mentioned and output of my code differs ...

    That's unfortunate. sort.exe doesn't have a way to restrict the key length.

    The next fastest solution would be to download GNU CoreUtils and either put the entire package in your path, or put just sort.exe (and its dependencies, libintl3.dll and libiconv2.dll) somewhere in your path, and use the command:

    sort -S 3G -k 1,26 dataf -o dataf.sorted

    (Note: This sort utility is a pre-compiled 32-bit binary, so 3 GB is the maximum it can handle.)

    The sort will be substantially slower than with the Windows-supplied sort, but should be quicker than your Perl script.
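The behaviour being asked for is a stable sort keyed only on the timestamp prefix: lines are ordered by their first 26 characters, and lines whose prefixes are equal keep their original relative order. The following is a minimal in-memory sketch of that semantics in Python, not an external sort for multi-gigabyte files. The function name and the `KEY_WIDTH` constant are illustrative assumptions, and the width of 26 is taken from the key length used in the command above. When mapping this onto GNU sort, it is worth checking the documentation for `-k`, which counts fields (with optional `.character` offsets), and for `-s` (`--stable`), without which ties are broken by comparing whole lines.

```python
# Assumed width of the "YYYY/MM/DD @ HH:MM:SS,mmm " prefix used as the sort key.
KEY_WIDTH = 26


def sort_by_timestamp_prefix(lines, key_width=KEY_WIDTH):
    """Sort lines by their first `key_width` characters only.

    Python's sorted() is stable, so lines with identical prefixes
    keep the order in which they appeared in the input.
    """
    return sorted(lines, key=lambda line: line[:key_width])


if __name__ == "__main__":
    sample = [
        "2012/12/13 @ 19:00:27,792 @ ,, at sun.rmi.transport.Transport$1.run(Transport.java:159)",
        "2012/12/13 @ 19:00:27,792 @ ,, at java.lang.Thread.run(Thread.java:662)",
        "2012/12/13 @ 19:00:27,791 @ ,, an earlier line",
    ]
    for line in sort_by_timestamp_prefix(sample):
        print(line)
```

With this key, the two lines stamped 19:00:27,792 stay in input order (stack trace order is preserved), while the 19:00:27,791 line moves ahead of them.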


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      I have installed coreutils and added the path entry as mentioned, then ran these commands:
      <1> sort -S 3G -k 1,26 sort_input.txt
      <2> sort -S 3G -k 1,26 sort_input.txt -o sort_output.txt
      but this results in the error "Input file specified two times".

        Sounds like the Windows sort utility is appearing first in your path. You might consider renaming the coreutils sort.exe to (say) gnusort.exe.



        Use absolute paths or manage your %PATH%; Windows already comes with a sort utility.
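To see which sort executable wins the lookup, it can help to list every match on the search path in the order the directories are searched. The following is a small sketch using Python's standard `shutil.which`; the helper name `executables_on_path` is an assumption made for illustration. On Windows, `shutil.which` may also consider the current directory, which is why duplicates are filtered out.

```python
import os
import shutil


def executables_on_path(name, path=None):
    """Return every executable called `name` on the search path, in lookup order.

    `path` is a string of directories joined by os.pathsep; it defaults
    to the PATH environment variable of the current process.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        # Ask shutil.which to search this single directory only.
        found = shutil.which(name, path=directory)
        if found and found not in matches:
            matches.append(found)
    return matches


if __name__ == "__main__":
    # The first entry printed is the one a bare "sort" command would run.
    for position, exe in enumerate(executables_on_path("sort"), start=1):
        print(position, exe)
```

If the Windows sort.exe is listed before the coreutils one, either reorder the path, call the coreutils binary by its full path, or rename it as suggested above.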