PerlMonks  

Good use for PDL?

by sherab (Scribe)
on Aug 01, 2010 at 00:05 UTC

sherab has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I am working on a project to analyze stock purchases. It's been an ongoing process and I have occasionally had to ask advice of the monks who have been extremely helpful, thanks!

Specifically,.....
X person makes a stock purchase on given date
X person makes another stock purchase of the same stock on given date
X person sells Z number of shares on a given date.......
......
......

Generally the output would need to look like this (CSV):


stock,transDate,action,price,numberShares,closingPrice,gain,cumulativeVal,portfolioShares,avgPriceShare,portfolioPctChange
LNUX,1/1/10,Bought,$1.00,1000,$1.00,"-$1,000.00","$1,000.00",1000,$1.00,0.00%
LNUX,1/15/10,Bought,$1.50,1000,$1.50,"-$1,500.00","$3,000.00",2000,$1.25,16.67%
LNUX,1/31/10,Sell,$1.75,500,$2.00,"$1,000.00","$3,000.00",1500,$1.25,62.50%

Items in the above example that would need to be calculated include things like avgPriceShare and portfolioPctChange. These represent average price per share and the amount of the portfolio change since the last calculation.
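
To make the calculated columns concrete, here is a minimal Perl sketch that reproduces portfolioShares, avgPriceShare and cumulativeVal for the three sample rows above. It assumes average-cost accounting (a sale leaves the average price per share unchanged) and that cumulativeVal is the shares held times the closing price; both are inferred from the sample figures, not stated rules. The name `running_totals` is hypothetical. portfolioPctChange is left out because the sample does not pin down its formula.

```perl
use strict;
use warnings;

# Sample rows from the question: [date, action, price, shares, closing price].
my @transactions = (
    [ '1/1/10',  'Bought', 1.00, 1000, 1.00 ],
    [ '1/15/10', 'Bought', 1.50, 1000, 1.50 ],
    [ '1/31/10', 'Sell',   1.75,  500, 2.00 ],
);

# Walk the transactions in date order, carrying running totals.
# Assumption: average-cost accounting, so a sale removes shares at the
# current average cost and leaves avgPriceShare unchanged.
sub running_totals {
    my @rows = @_;
    my ( $shares, $cost_basis ) = ( 0, 0 );
    my @out;
    for my $row (@rows) {
        my ( $date, $action, $price, $count, $close ) = @$row;
        if ( $action eq 'Bought' ) {
            $cost_basis += $price * $count;
            $shares     += $count;
        }
        else {
            my $avg = $shares ? $cost_basis / $shares : 0;
            $cost_basis -= $avg * $count;
            $shares     -= $count;
        }
        push @out, {
            date            => $date,
            portfolioShares => $shares,
            avgPriceShare   => $shares ? $cost_basis / $shares : 0,
            # Assumption: value is shares held times the closing price.
            cumulativeVal   => $shares * $close,
        };
    }
    return @out;
}

my @results = running_totals(@transactions);
printf "%s,%d,%.2f,%.2f\n",
    @{$_}{qw(date portfolioShares avgPriceShare cumulativeVal)}
    for @results;
```

For the sample data this gives 1000, 2000 and 1500 shares, an average price of 1.00, 1.25 and 1.25, and a value of 1000.00, 3000.00 and 3000.00, which matches the corresponding columns in the CSV.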

We need to calculate this on about 25,000 accounts so any sort of solution would need to lend itself to batch processing (running on Linux). It seems like this sort of matrix calculation might be child's play for MatLab, maybe even overkill, given what I've seen about it. It may even be possible in Excel but I don't want to use a Microsoft solution if I don't have to since we don't use Microsoft anywhere in the enterprise.

Other monks have suggested that PDL may be the way to go. At this point I need to make a recommendation on what kind of expert to hire. We are storing all of our data in MySQL.

Many thanks for this and for previous advice I have gotten on this question.

J

Replies are listed 'Best First'.
Re: Good use for PDL?
by BrowserUk (Patriarch) on Aug 01, 2010 at 10:04 UTC

    I don't think people have asked the right questions yet.

    You say 25,000 accounts, but you don't mention how big their portfolios are, or how frequently you will be repeating these calculations.

    PDL is generally a way of speeding up complex math on very large numbers of similar calculations.

    But 25,000 is not a large number. And averaging is not a complex calculation. To give a feel for it, averaging 25k sets of 10 floats using straight Perl takes less than 1/20th of a second:

    [0] Perl> @a = map [ map{ rand( 1000 ) } 1 .. 10 ], 1 .. 25e3;;
    [0] Perl> $t = time; push @$_, sum( @$_ ) / @$_ for @a; printf "Took %.6f\n", time() - $t;;
    Took 0.045407
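
The REPL session above relies on `sum` and a sub-second `time` already being loaded. As a rough standalone sketch, assuming `sum` comes from List::Util and `time` from Time::HiRes (the original does not say which modules were in use), the same timing test could be written as:

```perl
use strict;
use warnings;
use List::Util  qw(sum);
use Time::HiRes qw(time);

# Build 25,000 "accounts", each holding 10 random floats.
my @accounts = map [ map { rand(1000) } 1 .. 10 ], 1 .. 25_000;

my $start = time;

# Append each account's mean to the end of its own array.
# The divisor @$_ is evaluated before the push happens, so it is still 10.
push @$_, sum(@$_) / @$_ for @accounts;

printf "Took %.6f\n", time() - $start;
```

The elapsed time will vary by machine, so no particular figure should be expected beyond the order of magnitude quoted above.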

    So, the questions you need to ask yourself are:

    1. Do you need the speed?

      Is the need sufficient to warrant the extra complexity?

    2. Do the actual calculations lend themselves to PDL's strengths?

      That's harder to answer, but from your description, I don't think they do.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Good use for PDL?
by ahmad (Hermit) on Aug 01, 2010 at 00:29 UTC

    I would suggest staying with MySQL and using it for the calculation.

    You can calculate everything on the fly, like this:

    select *,(`price`+`closingPrice`) / 2 AS `avgshareprice` from `tableName`

    Your question doesn't give me a clue about where you're stuck at the moment, or about your MySQL table structure (notice I've used the field names from the CSV file you gave).

    And what's the goal of this analysis ?

      And what's the goal of this analysis ?
      Well, he is analyzing stock purchases and keeping track of who sold what and when, probably to derive some trading strategies for himself.

      All in all it's probably a safe bet that the ultimate goal is the furthering of some humanitarian cause.

Re: Good use for PDL?
by stefbv (Curate) on Aug 01, 2010 at 09:42 UTC

    I believe that a Data warehouse implementation is more appropriate.

    Regards, Stefan

Re: Good use for PDL?
by aufflick (Deacon) on Aug 02, 2010 at 06:49 UTC

    You might want to consider R (a GNU version of S+) which will also make any graphing you want to do easy.

    r-project.org

    If you want to batch R calculations from Perl, you can use one of the various R-related modules on CPAN.

Re: Good use for PDL?
by etj (Deacon) on May 28, 2022 at 22:20 UTC
