in reply to OT: Scalable web application architecture

You should describe your application in more detail to get some reliable answers, imho... Is it CGI or pure mod_perl? Apache 1 or 2? How many rows are in the db tables and how complex are the queries? How many requests per second? How much memory and what CPU? What response time would count as a success for you?
I would recommend using a profiler to see where the bottleneck really is. Perhaps it is not the SQL engine...
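For example, outside Apache you could profile a single run with Devel::DProf (just an illustration; the script name is a placeholder, adjust it to your own entry point):

  perl -d:DProf handle_search.pl   # writes profiling data to tmon.out
  dprofpp tmon.out                 # summarizes where the time was spent, per subroutine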

Re^2: OT: Scalable web application architecture
by badaiaqrandista (Pilgrim) on Dec 07, 2005 at 12:13 UTC
    You should describe your application in more detail to get some reliable answers, imho... Is it CGI or pure mod_perl? Apache 1 or 2?

    It is a web application running on mod_perl 1 with Apache 1.3. But the classes that implement the business objects basically don't care whether they run under mod_perl or from the command line. They only need a database with a certain schema.
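    The idea is roughly this (a hypothetical sketch to illustrate the point; the class, method and connection details are made up, not the real code):

        use strict;
        use warnings;
        use DBI;

        package Booking::Search;

        # The business object only knows about a database handle,
        # not about Apache or mod_perl.
        sub new {
            my ($class, %args) = @_;
            return bless { dbh => $args{dbh} }, $class;
        }

        sub available_rooms {
            my ($self, $arrival, $departure, $guests) = @_;
            # ... run the availability query against $self->{dbh} ...
        }

        package main;

        # The same object can be driven from a command-line script ...
        my $dbh    = DBI->connect('dbi:mysql:booking', 'user', 'password');
        my $search = Booking::Search->new(dbh => $dbh);

        # ... or from a mod_perl handler that passes in its own $dbh.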

    How many rows are in the db tables and how complex are the queries?

    There could be more than a million records in the cache table. The operations on the table basically look like this (in Perl-like pseudocode):

    sub search {
        my ($arrival_date, $departure_date, $number_of_guests) = @_;

        validate_search_cache($arrival_date, $departure_date);

        my @search_results = ... do SELECT query to get a list of available
            room-package combinations between $arrival_date and $departure_date
            for $number_of_guests ...

        return @search_results;
    }

    sub validate_search_cache {
        my ($arrival_date, $departure_date) = @_;

        my @validated_dates = ... do SELECT query that returns a list of dates
            between $arrival_date and $departure_date having valid combinations
            of room-package ...

        my @invalid_dates = ... look for dates between $arrival_date and
            $departure_date that don't exist in @validated_dates ...

        foreach (@invalid_dates) {
            initialize_search_cache($_);
        }
    }

    my @field_list = (
        {
            name => 'valid_price',
            op   => sub { ... computation to check if the price is valid on a certain date ... },
        },
        {
            name => 'available_room_count',
            op   => sub { ... calculation to get the available room count on a certain date ... },
        },
        ...
    );

    sub initialize_search_cache {
        my ($date) = @_;

        my (@keys, @values);
        foreach (@field_list) {
            push @keys,   $_->{name};
            push @values, $_->{op}->($date);
        }

        # table name 'search_cache' is assumed here
        my $sth = $dbh->prepare(
              "INSERT INTO search_cache (" . (join ',', @keys) . ")"
            . " VALUES (" . (join ',', map { '?' } @values) . ")"
        );
        $sth->execute(@values);
    }

    I hope that explains how the code works. The queries are mostly simple, except when doing the search for availability.

    badaiaqrandista

      Your caching via a database table has a serious problem: it is simultaneously queried via SELECTs, modified via UPDATEs, and refreshed via INSERT INTO. These operations have very different indexing needs to perform well, and those needs are practically impossible to meet simultaneously. Not to speak of the locking issues.

      A better architectural idea would be to cache some data in a Perl structure, update it there, and push it back into the database after processing is done (see the sketch below). This solution requires some amount of bookkeeping, in particular if it is a distributed system.
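      Something along these lines (a minimal sketch, not the actual code; the table and column names, and the 'dirty' flag, are made up for illustration):

          # Pull the rows for the requested date range into a plain hash once ...
          my %cache;
          my $rows = $dbh->selectall_arrayref(
              "SELECT stay_date, available_room_count
                 FROM search_cache
                WHERE stay_date BETWEEN ? AND ?",
              { Slice => {} },
              $arrival_date, $departure_date,
          );
          $cache{ $_->{stay_date} } = $_ for @$rows;

          # ... do all availability checks and updates against %cache in memory,
          # marking changed entries with a 'dirty' flag ...

          # ... then write the changed rows back in one transaction at the end.
          $dbh->begin_work;
          my $upd = $dbh->prepare(
              "UPDATE search_cache SET available_room_count = ? WHERE stay_date = ?"
          );
          $upd->execute( $_->{available_room_count}, $_->{stay_date} )
              for grep { $_->{dirty} } values %cache;
          $dbh->commit;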

      If it is heavily distributed (several geographical locations), think about replication.

      Please check the indexing of the underlying tables. A few million rows are easily processed if the indexing is right.
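      For example (a hypothetical index, assuming the cache table is called search_cache and the availability SELECT filters on a date column named stay_date):

          # An index that matches the WHERE clause of the availability query.
          $dbh->do(q{
              CREATE INDEX idx_search_cache_stay_date
                  ON search_cache (stay_date)
          });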

      How about the data architecture? Have you separated the static data from the dynamic data? The property, package and room data seem to be static, while the reservation data is dynamic; separate these into different tables.
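      Roughly what I mean (a hypothetical split; the table and column names are only illustrative):

          # Static data: changes rarely, read-mostly, can be cached aggressively.
          $dbh->do(q{
              CREATE TABLE room_type (
                  room_type_id INT PRIMARY KEY,
                  property_id  INT NOT NULL,
                  name         VARCHAR(100),
                  base_price   DECIMAL(10,2)
              )
          });

          # Dynamic data: written on every booking, kept narrow and well indexed.
          $dbh->do(q{
              CREATE TABLE reservation (
                  reservation_id INT PRIMARY KEY,
                  room_type_id   INT NOT NULL REFERENCES room_type (room_type_id),
                  arrival_date   DATE NOT NULL,
                  departure_date DATE NOT NULL,
                  guests         INT NOT NULL
              )
          });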

        Aren't there applications that simultaneously read and write to a table? How do applications in banks or airline reservation systems work? What's the difference between their applications and mine? What makes them tick?

        Anyway, I agree with your idea of separating dynamic and static data. Thanks for your suggestion.

        badaiaqrandista