in reply to XSL transformation vs. parsing of html - Performance benefits

I basically want to know how much better performance I can get by using, for example, XML::XSLT

Who says it will be faster? ;) Although I'm biased towards doing things the XML way, i.e. using XSLT to transform an XML document, I sometimes encounter performance issues with XSLT. How complex is the resultset, and what kind of transformation are you aiming for? The 78 kb doesn't sound too scary to me. I suppose you can also limit the size of the resultset, much like you can limit the number of rows returned by a MySQL query? I also wonder why MySQL was the bottleneck and what product you use now. Please provide some more info, and maybe an example of a (representative) resultset and the kind of transformation you want to do.
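For reference, this is roughly how XML::XSLT is driven (a minimal sketch; stylesheet.xsl and data.xml are placeholder file names):

    use strict;
    use warnings;
    use XML::XSLT;

    # Parse the stylesheet once; warnings => 1 reports unsupported constructs
    my $xslt = XML::XSLT->new('stylesheet.xsl', warnings => 1);

    $xslt->transform('data.xml');   # apply the stylesheet to the document
    print $xslt->toString;          # serialize the result tree
    $xslt->dispose();               # free the DOM trees when done

Keep in mind that XML::XSLT is a pure-Perl implementation, so numbers you get from it say little about XSLT processors in general.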

Cheers

Harry


Re^2: XSL transformation vs. parsing of html - Performance benefits
by DreamT (Pilgrim) on Jun 11, 2010 at 08:46 UTC
    Ok. Let's see...

    The XML result consists of 'articles', where each row starts with an 'article' element containing 36 attributes (there is some header data as well):
    <response>
      <serv>
        <host>xxx</host>
        <app>yyy</app>
        <port>12345</port>
      </serv>
      <function>
        <query>
          <func>aaa</func>
          <parameters>
            <parameter> <name>bbb</name> <value>1</value> </parameter>
            <parameter> <name>ccc</name> <value>10</value> </parameter>
            <parameter> <name>filter</name> <value>ggg</value> </parameter>
          </parameters>
        </query>
        <result>
          <articles>
            <!-- article starts here -->
            <article>
              <attribute1>..</attribute1>
              ...
              <attribute36>..</attribute36>
            </article>
          </articles>
        </result>
        <time>13</time>
      </function>
    </response>

    Every article entry varies from about 1.4 to 2.6 kb in size, which gives an average of roughly 2 kb per article. The number of articles varies from 1 to 36 (it's hard to give an average here, since it varies a lot), so using the average article size the resultset varies between 2 and 72 kb; let's say 1-100 kb to be on the safe side.
    So it's not really that much data to process. As you said, I can limit the size of the resultset, but I want to calculate with these values, also to be on the safe side.

    The MySQL database is a bottleneck because it holds so much more data than just the article information. The new data source (which I can't reveal) is designed for exactly this kind of data and will provide better performance. So I'm basically trying to make the parsing as fast as possible compared to the "query".

      I'm not convinced of the business case for using XML. It's hard to believe that a relational database is outperformed by the new data source, but then again the data source is a bit of a mystery:) I have worked with different types of databases, including native XML databases, and in my experience it is hard to beat an RDBMS in terms of retrieval speed, unless you have some exotic data structure and need many joins to pull the data together. You can often de-normalize data to fix that, though. The size of the database is hardly ever a problem in my experience.

      I agree that it's not much data to process, but XML is (very) verbose, so you would probably have even less data to process with an RDBMS. From what I can derive from your example, the data structure is quite simple, i.e. rows in a table with a variable number of columns (your attributes), so I guess the transformation is also not too complex. I would not name the attributes attribute1 .. attribute36, though, but give them meaningful names.
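      For a structure like yours, I'd expect something along these lines to do the job (just a sketch; I'm assuming you want an HTML table, and attribute1/attribute2 stand in for your real element names):

        <xsl:stylesheet version="1.0"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

          <!-- skip the header data, go straight to the articles -->
          <xsl:template match="/">
            <table>
              <xsl:apply-templates
                  select="response/function/result/articles/article"/>
            </table>
          </xsl:template>

          <!-- one row per article, one cell per attribute you care about -->
          <xsl:template match="article">
            <tr>
              <td><xsl:value-of select="attribute1"/></td>
              <td><xsl:value-of select="attribute2"/></td>
              <!-- ... -->
            </tr>
          </xsl:template>
        </xsl:stylesheet>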

      It's not too difficult to write some test cases and Benchmark the stuff to gain confidence in the solution.
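      Something like this would give you a first impression (a sketch; articles.xml, articles.xsl and the regex stand in for your real data, stylesheet and current parsing code):

        use strict;
        use warnings;
        use Benchmark qw(cmpthese);
        use XML::XSLT;

        # slurp one representative resultset so file I/O stays out of the timing
        my $xml = do {
            local $/;
            open my $fh, '<', 'articles.xml' or die $!;
            <$fh>;
        };

        cmpthese( -5, {    # run each candidate for at least 5 CPU seconds
            xslt => sub {
                # note: this includes re-parsing the stylesheet every iteration
                my $xslt = XML::XSLT->new('articles.xsl');
                $xslt->transform($xml);
                my $html = $xslt->toString;
                $xslt->dispose();
            },
            parse => sub {
                # placeholder for your current parsing approach
                my @rows = $xml =~ m{<article>(.*?)</article>}gs;
            },
        } );

      cmpthese prints the rates side by side, so you can see immediately which approach dominates for your data.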

      Cheers

      Harry