After using Class::DBI, DBIx::Class (I tested Hibernate, etc.), and all the other ORM mappers to do database work,
I kept thinking that there is something broken with this approach (OK :) not broken, but it always seemed a little off).
On the other hand, using SQL directly is not that good either! And thinking in SQL always pulls you off the real target, because you have to think about how to structure your data in non-language-domain terms.
How much easier it would be to use arrays, hashes, or LoLs directly, without bothering with all the conversion details, connection pooling, preparing statements, etc.
Not to mention the increase in programmer productivity, and in speed, that should be possible if a good way to do this were available.
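(Something like this minimal sketch is what I have in mind, using Storable from the core distribution; the file name and data here are made up.)

    use strict;
    use warnings;
    use Storable qw(nstore retrieve);

    # An ordinary Perl structure -- no schema, no mapping layer.
    my %config = (
        hosts => [ 'alpha', 'beta' ],
        ports => { alpha => 8080, beta => 8081 },
    );

    nstore \%config, 'config.stor';            # persist the hash as-is
    my $restored = retrieve('config.stor');    # get it back, no SQL in sight
    print $restored->{ports}{alpha}, "\n";     # prints 8080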
Imagine my surprise to find, a couple of weeks ago, that such a thing is available now, though not for Perl ;( !!
What are your opinions on something similar for Perl?
For Erlang:
http://www.bluishcoder.co.nz/2005/11/mnesia.html
For Java and C#:
http://www.db4o.com/about/productinformation/

PS> Another advantage of Mnesia is that Erlang allows cheaper concurrency, unparalleled by most of today's languages. And that matters, with the boom of multi-core processors today!!

Concurrency and Mnesia would be my favorite features for inclusion in Perl 6 ;))

Replies are listed 'Best First'.
Re: Integrated non-relational databases ?
by perrin (Chancellor) on Sep 26, 2007 at 03:42 UTC

    People have been pushing things like this for years. Seriously, it was an old idea a decade ago, when the first round of major object database vendors were going bankrupt.

    Why don't they replace RDBMSes? My guess would be that part of it has to do with an exaggeration of the performance and scalability they offer. We've developed a very good understanding of how to do concurrency and data safety in large-scale RDBMSes, and I don't believe the object database vendors when they claim they've duplicated all of that.

    A lot of it though is probably the thing you suggest is a weakness: SQL. With SQL, many ad hoc reporting tasks don't require the help of a programmer at all. Remember, SQL was invented so that business people could write their own reports, and some of them do. Some minimally trained HTML jockeys do too. When you lock up the data behind a Java or Erlang API, you lose something valuable.

    By the way, there have been interfaces for Perl over the years to things like AceDB and ObjectStore.

      But from what I read in recent news, many of the big sites are increasingly abandoning RDBMSes, in most cases in favor of hand-made solutions, sometimes completely RDBMS-less, sometimes a mix.
      What I'm saying is that current RDBMSes can't handle very large data sets in a real-time environment.
      For example, I was recently experimenting with a very simple table of 10,000,000 records that fit into memory. The moment I used something other than a simple lookup, say a GROUP BY, execution time was minutes instead of milliseconds.

      That is why I was thinking that if you do this lifting in the domain of language structures, it would be easier, I think, to devise more efficient caching schemes, sharding, and similar techniques, so you could stay in the millisecond range more easily even for very large datasets.
      Mind you, this is just a thought, not some conclusion about which is best :). It is very hard to test such things at large scale, and of course every app's requirements are different.
      http://radar.oreilly.com/archives/2006/04/database_war_stories_5_craigsl.html
      Look at the links at the end of the article, too.
        Yeah, I read that series when it came out. I don't see how it leads to the conclusion that sites are not using RDBMSes. Most people interviewed said they use MySQL and have figured out how to scale it. Google wrote something custom for some of their data, and one guy used Berkeley DB, but most of them use RDBMSes for most things. Even Google makes heavy use of MySQL.
        RDBMSes can and do handle huge data sets. Was your GROUP BY grouping by an indexed column? Which RDBMS were you using for that? What kind of hardware were you using? That really seems too drastic a change, although I rarely have tables with more than 100,000 records. Is that a single GROUP BY, or is that the effect when you change a whole class of overlapping queries to use it? More servers and database replication are often the answer. Was it memory bound, processor bound, or IO bound?
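        For what it's worth, here's a sketch of the first thing I'd try via DBI; the table and column names are hypothetical, and the DSN is whatever your setup needs:

            use strict;
            use warnings;
            use DBI;

            # Hypothetical table 'widgets' with a 'category' column;
            # adjust the DSN and credentials to your own setup.
            my $dbh = DBI->connect( 'dbi:mysql:database=test', 'user', 'pass',
                                    { RaiseError => 1 } );

            # With an index on the grouped column, the engine can read it
            # in index order instead of sorting every row on each query.
            $dbh->do('CREATE INDEX idx_category ON widgets (category)');

            my $counts = $dbh->selectall_arrayref(
                'SELECT category, COUNT(*) FROM widgets GROUP BY category'
            );
            printf "%s: %d\n", @$_ for @$counts;

        If it's still minutes with the index in place, the query plan (EXPLAIN in MySQL and PostgreSQL) will tell you far more than guessing will.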

        The problem with handmade solutions, and with anything tied closely to a certain language, is that you're giving up large amounts of flexibility. SQL was designed specifically so that different programs in different languages could communicate with the same database and use the same data manipulation routines on the same data. You lose that if you build it in some specialized database language that has no other support. While in some cases it's worthwhile to forgo convention and flexibility for performance, you have to be sure of what you're losing and what you're gaining. Being sure requires a lot more than a bit of ad hoc testing on one example without accounting for possible machine deficiencies.

Re: Integrated non-relational databases ?
by mr_mischief (Monsignor) on Sep 25, 2007 at 23:22 UTC
    I'm not sure what exactly it is you're wanting. The mnesia examples don't look very impressive compared to some of the things I've seen done with Perl. So I may be missing your point here, but I'll mention some options for you to consider besides DBI and the DBIx modules.

    If you don't need it to be relational, what's wrong with tied hashes, DBM::Deep, Persistent, or Storable? Is Rosetta more what you'd like?
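    For the simplest cases, even a tied hash from the core distribution gets you persistence with plain hash syntax. A minimal sketch (the file name is made up):

        use strict;
        use warnings;
        use Fcntl;
        use SDBM_File;

        # SDBM_File ships with Perl; the tied hash lives on disk, so
        # %counts keeps its values across runs.
        tie my %counts, 'SDBM_File', 'counts.db', O_RDWR|O_CREAT, 0666
            or die "Couldn't tie: $!";
        $counts{visits}++;                      # ordinary hash operations
        print "visits so far: $counts{visits}\n";
        untie %counts;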

    After a quick look at CPAN's tools for data persistence that don't seem to involve SQL or RDBMS software, XML::Simple, Data::Table, AsciiDB::TagFile, DBM::DBass, DBM::Any, and Data::CTable look interesting, too.

    Tie::LDAP gives you directory service access instead of a database interface, which could come in quite handy for data with complex schemata that get updated infrequently.

      Mnesia can handle a much bigger DB; it is ACID and transaction-based, is replicated, and can span several computers. I didn't know DBM::Deep could handle so many records... but I suspect that over 1 million it won't be that fast.
        DBM::Deep has ACID transactions and can handle data as large as your Perl can address. If you have a 32-bit Perl, you're limited to files of 2G. If you have a 64-bit Perl, then you have 64 bits of addressability.

        Beyond just having ACID transactions, it's PurePerl and handles Perl datastructures in a way that nothing else does. The point is that the rest of your program doesn't even know that it's working with a DBM::Deep datastructure vs. a normal Perl datastructure.
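        A minimal sketch of that transparency (the file name and data are made up):

            use strict;
            use warnings;
            use DBM::Deep;

            # Everything past the constructor is ordinary Perl; the nested
            # structure is transparently written to demo.db as it changes.
            my $db = DBM::Deep->new( 'demo.db' );

            $db->{users}{alice}{roles} = [ 'admin', 'editor' ];
            push @{ $db->{users}{alice}{roles} }, 'auditor';

            print scalar @{ $db->{users}{alice}{roles} }, " roles\n";  # 3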


        My criteria for good software:
        1. Does it work?
        2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
        Much bigger than which?

        Tie::LDAP, for one, can do all that. It's non-SQL. LDAP, including OpenLDAP, can span about as many computers as you can afford for your data center. It uses structured, schema-restricted data that can form a tree or a graph, not just a table. Using Tie::LDAP, you access it from within the language.
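        If memory serves, Tie::LDAP wraps Net::LDAP in a tied hash; the underlying access looks roughly like this (the host and base DN are made up):

            use strict;
            use warnings;
            use Net::LDAP;

            # Hypothetical host and base DN; substitute your own directory.
            my $ldap = Net::LDAP->new('ldap.example.com') or die "$@";
            $ldap->bind;    # anonymous bind

            my $mesg = $ldap->search(
                base   => 'ou=People,dc=example,dc=com',
                filter => '(objectClass=person)',
            );
            print $_->get_value('cn'), "\n" for $mesg->entries;

            $ldap->unbind;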

        PostgreSQL is ACID, transaction based, and can be replicated, but it does use SQL.

        Most importantly, where's the patch?

Re: Integrated non-relational databases ?
by doom (Deacon) on Sep 30, 2007 at 02:10 UTC
    Well, what I think is that the RDBMS people have been working on the problem of data storage for a long time now, and it's really silly that a bunch of newbie programmers keep trying to ignore what they've come up with.

    What's so scary about SQL?