in reply to mapping coordinates- suggestion needed

PS: the maximum for the coordinates in both cases is 20000000

Just to confirm: you have two datasets of 20e6 items each, where each item is a pair of coordinates, and each coordinate pair can range from '1,2 >tag' through '19999999,20000000 >tag'?

Re^2: mapping coordinates- suggestion needed
by baxy77bax (Deacon) on Oct 14, 2010 at 16:58 UTC
    Yes, that is exactly the case. The ">tag" is something that should be connected to the coordinate, so that I can say, after mapping, that tagX is connected to the coordinate associated with unameY.

    And the coordinates do not necessarily need to be two consecutive numbers, but they can be.

      Let's see. If the "coordinates" of the time intervals are in seconds, 20e6 covers about 8 months, which doesn't really tally with your "those datasets have piled up over the years".

      But if they were in minutes, then it represents about 38 years. Why would anyone care what user ran what job 38 years ago? They've an above average chance of being dead already.
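
      The two estimates above can be checked with a quick calculation. This is only a sketch: the function names and the 365.25-day year are assumptions made for illustration, not anything stated in the thread.

```python
# Rough check of how much wall-clock time a maximum coordinate of 20e6
# would cover if the unit were seconds or minutes.
# Assumption: an average year of 365.25 days, an average month of year/12.
MAX_COORD = 20_000_000
DAYS_PER_YEAR = 365.25


def months_if_seconds(n: int) -> float:
    """Interpret n as seconds and return the span in average months."""
    days = n / (24 * 60 * 60)
    return days / (DAYS_PER_YEAR / 12)


def years_if_minutes(n: int) -> float:
    """Interpret n as minutes and return the span in years."""
    days = n / (24 * 60)
    return days / DAYS_PER_YEAR


print(round(months_if_seconds(MAX_COORD), 1))  # 7.6
print(round(years_if_minutes(MAX_COORD), 1))   # 38.0
```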

      And then there is the "rules are that even if only part of the job_id_interval crosses the uname interval, this should be reported" bit. How can a userid be responsible for a particular job if they didn't log on until the last (second|minute|lesser known time unit) of the job's life?
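
      For reference, the quoted rule (report any partial overlap) can be sketched as follows. This is a minimal illustration, not the poster's code: the tuple layout, the function names, and the choice to treat intervals as closed (so that touching endpoints count as an overlap) are all assumptions.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record layout: (start, end, tag), closed interval, start <= end.
Interval = Tuple[int, int, str]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def map_overlaps(jobs: Iterable[Interval],
                 unames: Iterable[Interval]) -> Iterator[Tuple[str, str]]:
    """Yield (job_tag, uname_tag) for every partially or fully overlapping pair.

    Both lists are sorted by start, then swept once, so that each job is
    compared only against uname intervals that can still overlap it.
    """
    sorted_unames = sorted(unames)
    active: List[Interval] = []
    i = 0
    for job in sorted(jobs):
        # Admit every uname interval that starts at or before this job's end.
        while i < len(sorted_unames) and sorted_unames[i][0] <= job[1]:
            active.append(sorted_unames[i])
            i += 1
        # Drop uname intervals that ended before this job starts; jobs are
        # sorted by start, so those can never match a later job either.
        active = [u for u in active if u[1] >= job[0]]
        for u in active:
            if overlaps(job, u):
                yield job[2], u[2]


jobs = [(1, 5, "job1"), (10, 12, "job2")]
unames = [(5, 9, "userA"), (11, 20, "userB"), (30, 40, "userC")]
print(list(map_overlaps(jobs, unames)))
# [('job1', 'userA'), ('job2', 'userB')]
```

      With many long intervals the active list grows and the sweep degrades towards comparing every pair, so for 20e6 items per dataset an interval tree or a bucketed index may be a better fit.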

      And then there's that coincidence between there being 20e6 logons, 20e6 jobs, and 20e6 intervals during which those logons and job runs occurred?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        ok,

        I tried to make the problem as simple as possible. The lines (coordinates for the job IDs and unames) are fair-share time coefficients.

        At one point I was so pissed off with the SGE fair-share system, priority list and job dispatcher that I wrote a similar engine for myself, and this thing has now been running for 1.5 years.

        The reason I posted this question is that you guys always manage to come up with a more elegant solution (and I'm not looking for exact code, just a suggestion).

        And then there's that coincidence between there being 20e6 logons, 20e6 jobs, and 20e6 intervals during which those logons and job runs occurred?

        Yes, a logical observation, and the answer would be obvious if I explained exactly how the grid engine works. But there would be too many ifs and thens if I explained everything. As I said, I tried to simplify the problem as best I could.

        cheers