PerlMonks  

Re^2: Improving on MS-SQL over DBI and DBD::ODBC

by bprew (Monk)
on Nov 24, 2004 at 16:33 UTC [id://410163]


in reply to Re: Improving on MS-SQL over DBI and DBD::ODBC
in thread Improving on MS-SQL over DBI and DBD::ODBC

I would agree with mpeppler. In my experience, you can usually push more of the work onto the database itself, which reduces the amount of data you have to move around, especially over an ISDN line.

Also, I don't know how quickly you can connect to your database, but I've seen new connections to an Oracle database take up to 20 seconds.

So if your program opens multiple connections, it might be worthwhile to "cache" the database handle somewhere and connect only once. Unfortunately, I don't know much about the application overall, so it's hard to suggest architectural changes when one doesn't know the architecture to begin with :).
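A minimal sketch of that handle-caching idea (DBI also ships a real `connect_cached` method that does this for you). Here a string and a counter stand in for the expensive `DBI->connect` call, just to show the pattern:

```perl
use strict;
use warnings;

my $connect_count = 0;   # stand-in for "how many real connections we made"
my $cached_dbh;

sub get_dbh {
    # Real code would also check the handle is alive:
    #   return $cached_dbh if $cached_dbh && $cached_dbh->ping;
    return $cached_dbh if defined $cached_dbh;
    $connect_count++;
    # Stand-in for: DBI->connect($dsn, $user, $pass, { RaiseError => 1 })
    $cached_dbh = "fake-handle";
    return $cached_dbh;
}

# Five callers, one connection.
get_dbh() for 1 .. 5;
print "connections made: $connect_count\n";
```

With real DBI you can get the same effect with `DBI->connect_cached(...)`, which returns an existing handle when one with the same arguments is still alive.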

Finally, most of the time I find myself attempting to optimize something that isn't actually a problem. Do you have hard numbers showing that the ODBC driver is the sole cause of your performance issues? I'm not saying you're wrong; I just want to make sure you're not trying to solve the wrong problem.
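One cheap way to get those hard numbers is to bracket each phase (connect, execute, fetch) with `Time::HiRes` timestamps. A sketch: the `timed` helper is hypothetical, and the summing loop below is just a stand-in workload so the example runs without a database; the commented lines show where the real DBI calls would go:

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a chunk of work, print how long it took.
sub timed {
    my ($label, $code) = @_;
    my $t0 = [gettimeofday];
    my @result = $code->();
    printf "%s: %.3fs\n", $label, tv_interval($t0);
    return @result;
}

# In the real program, time each phase separately, e.g.:
#   my ($sth)  = timed('execute', sub { my $s = $dbh->prepare($sql); $s->execute; $s });
#   my ($rows) = timed('fetch',   sub { $sth->fetchall_arrayref });
my @out = timed('demo', sub { my $x = 0; $x += $_ for 1 .. 100_000; $x });
print "sum: $out[0]\n";
```

If "execute" is cheap but "fetch" dominates, the bottleneck is moving data over the wire, not the driver. DBI also has built-in profiling (`DBI::Profile`, enabled via the `DBI_PROFILE` environment variable) if you want a breakdown without instrumenting by hand.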


--
Ben

Replies are listed 'Best First'.
Re^3: Improving on MS-SQL over DBI and DBD::ODBC
by mpeppler (Vicar) on Nov 24, 2004 at 16:52 UTC
    Also, I don't know how quickly you can connect to your database, but I've seen new connections to an Oracle database take up to 20 seconds.
    MS-SQL (and Sybase) is pretty good in that respect: creating a new connection is a relatively fast operation. That doesn't invalidate your point, though; reconnecting over and over is not a good thing, especially over a slow link.

    Michael

Re^3: Improving on MS-SQL over DBI and DBD::ODBC
by radiantmatrix (Parson) on Nov 24, 2004 at 19:00 UTC

    Hm, thank you for the SQL construction suggestions. I have tried to push as much as possible onto the server, but I'm not an SQL expert; perhaps I should find one to take a look at my statements. Unfortunately, I'm not sure there's a whole lot of optimization I can do in this manner, as a large portion of my work is to get DB records and compare them with text files or other data sources.

    I don't ever open multiple connections, except when there are multiple instances of my application (which can't be helped), so that isn't the cause of my woes. One db handle is enough, thank you. ;-)

    I can't prove that the ODBC driver is the problem. I do know that an identical query through a third-party SQL tool is much faster, even when I account for connection times. I also know that native access methods are often faster than ODBC, so I'm hoping to find a DBD driver that talks to MS-SQL natively. Then I can at least benchmark the two.

    Finally, while my performance is "acceptable" to my users and my employer, I consider it borderline; I want to refactor before adding yet more DB work.

    I wish I could just post the code, but NDA and all that... so please ask me anything you need to help you formulate a starting point for me to research.


    radiantmatrix
    require General::Disclaimer;
    Perl is

      Fetching 50k rows over a slow connection is bound to be slow, no matter what you do. I just tried it: pulling 50k rows from a pretty fast server (4-way Xeon, relatively lightly loaded) over my DSL line, which achieves approximately 80KB/second, using a pure C program and a SQL query with no WHERE clause and no ORDER BY, takes almost two minutes.

      In this case I think your major issue is the amount of data that you have to pull over the slow connection. I'd try to find ways to reduce that: maybe some client-side caching if the same data can be re-used, or a better, more specific WHERE clause to avoid fetching data that isn't relevant.
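      A sketch of that last suggestion, with made-up table and column names: select only the columns you actually compare, and let the server filter by a bound parameter instead of fetching everything and filtering client-side:

```perl
use strict;
use warnings;

# Hypothetical schema: only rows changed since $since come over the wire,
# and only the three columns the comparison actually needs.
my $since = '2004-11-01';
my $sql = q{
    SELECT id, name, updated_at
    FROM   accounts
    WHERE  updated_at >= ?
};
# With a live handle:
#   my $rows = $dbh->selectall_arrayref($sql, { Slice => {} }, $since);
my $placeholders = ($sql =~ tr/?/?/);
print "placeholders: $placeholders\n";   # bound at execute time
```

      Binding `$since` as a placeholder (rather than interpolating it) also lets the server reuse the query plan across runs.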

      Michael

      Hmm... I have found in the past that you can make some good strides by looking at not only the SQL statement, but how it's used in the context of the system.

      The real question is, where are you spending your time? Is it in the SQL query itself, in the record fetching, in the comparison? If you're making a lot of small queries, perhaps there is a way to combine them into a few larger queries.
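      Combining small queries is often just a matter of building one IN list. A sketch, with hypothetical table and column names; the commented line shows the real DBI call:

```perl
use strict;
use warnings;

# Instead of four one-row lookups, one round trip with a placeholder per id.
my @ids = (101, 102, 103, 104);
my $in  = join ', ', ('?') x @ids;
my $sql = "SELECT id, balance FROM accounts WHERE id IN ($in)";
# With a live handle:
#   my $rows = $dbh->selectall_arrayref($sql, undef, @ids);
print "$sql\n";
```

      On a high-latency link, each eliminated round trip saves a full network delay, which usually matters more than the query cost itself.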

      It sounds like you're fetching a fair number of rows (60k+). Is there a way to reduce the number of rows you're grabbing, perhaps by caching the data on the client side, so you don't have to fetch all of it every time? Or by letting the user filter the data better, so they don't have to look at as much of it?

      Without knowing some of the business requirements and more about the environment, it's hard to make pointed suggestions, but hopefully this gives you something to think about.


      --
      Ben
