Re: Integrated non-relational databases ?
by perrin (Chancellor) on Sep 26, 2007 at 03:42 UTC
|
People have been pushing things like this for years. Seriously, it was an old idea a decade ago, when the first round of major object database vendors were going bankrupt.
Why don't they replace RDBMSes? My guess would be that part of it has to do with an exaggeration of the performance and scalability they offer. We've developed a very good understanding of how to do concurrency and data safety in large-scale RDBMSes, and I don't believe the object database vendors when they claim they've duplicated all of that.
A lot of it though is probably the thing you suggest is a weakness: SQL. With SQL, many ad hoc reporting tasks don't require the help of a programmer at all. Remember, SQL was invented so that business people could write their own reports, and some of them do. Some minimally trained HTML jockeys do too. When you lock up the data behind a Java or Erlang API, you lose something valuable.
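To make the ad hoc reporting point concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The `orders` table and its rows are made up for illustration. It shows the same report written once as a single declarative query and once as the kind of loop a programmer would have to write against a record-at-a-time API.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data, invented for this illustration.
rows = [
    ("north", 120),
    ("south", 80),
    ("north", 40),
    ("east", 200),
    ("south", 15),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders (region, amount) VALUES (?, ?)", rows)

# The report as one declarative statement: no program logic required.
report_sql = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM orders GROUP BY region ORDER BY total DESC"
).fetchall()

# The same report when the data is only reachable record by record:
# someone has to write the aggregation and the sorting by hand.
totals = defaultdict(int)
for region, amount in rows:
    totals[region] += amount
report_by_hand = sorted(totals.items(), key=lambda item: item[1], reverse=True)

print(report_sql)
print(report_by_hand)
```

Both versions produce the same rows, but only the first can be typed into a query tool by someone who is not a programmer.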
By the way, there have been interfaces for Perl over the years to things like AceDB and ObjectStore.
| [reply] |
But from what I read in recent news, most of the big sites are increasingly abandoning RDBMSes, in most cases in favor of hand-made solutions. Sometimes completely RDBMS-less, sometimes a mix.
What I'm saying is that current RDBMSes can't handle very large data sets in a real-time environment.
For example, I was recently experimenting with a very simple table of 10_000_000 records, which fit into memory. The moment I used something other than a simple lookup, let's say GROUP BY, the execution time became minutes instead of milliseconds.
That is why I was thinking that if you do this heavy lifting in the domain of "language structures", it would be easier to come up with more efficient caching schemes, sharding, and similar techniques, so you can stay in the "millisecond range" more easily even for very large datasets.
Mind you, this is just a thought, not a conclusion about which is best :). It is very hard to test such things at large scale, and of course every app's requirements are different.
http://radar.oreilly.com/archives/2006/04/database_war_stories_5_craigsl.html
Look at the links at the end of the article too.
| [reply] |
Yeah, I read that series when it came out. I don't see how it came to the conclusion that sites are not using RDBMSes. Most people interviewed said they use MySQL and have figured out how to scale it. Google wrote something custom for some of their data, and one guy used Berkeley DB, but most of them use RDBMSes for most things. Even Google makes heavy use of MySQL.
| [reply] |
RDBMSes can and do handle huge data sets. Was your GROUP BY grouping by an indexed column? Which RDBMS were you using for that? What kind of hardware were you using? That really seems a bit too drastic of a change, although I rarely have tables with more than 100,000 records. Is that a single GROUP BY, or is that the effect when you change a whole class of queries that overlap to use it? More servers and database replication are often the answer. Was it memory bound, processor bound, or IO bound?
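As a rough illustration of why the indexed-column question matters, here is a small sketch in Python with the standard-library sqlite3 module. SQLite is only a stand-in, since the thread never says which RDBMS was tested, and the `events` table and its data are invented. It asks the planner how it would run the same GROUP BY before and after an index exists on the grouping column.

```python
import sqlite3

# Hypothetical table and query, not the poster's actual schema.
QUERY = "SELECT category, COUNT(*) FROM events GROUP BY category ORDER BY category"


def plan(conn):
    """Return SQLite's query-plan detail text for QUERY as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, category INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO events (category, payload) VALUES (?, ?)",
    ((i % 7, "x") for i in range(10_000)),
)

# Without an index the engine has to sort or bucket every row itself.
plan_without_index = plan(conn)
counts_without_index = conn.execute(QUERY).fetchall()

conn.execute("CREATE INDEX idx_events_category ON events (category)")

# With the index, rows already arrive in category order.
plan_with_index = plan(conn)
counts_with_index = conn.execute(QUERY).fetchall()

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The results are identical either way; only the plan changes. Other engines expose the same information through their own EXPLAIN output, and a toy table like this says nothing about timings at 10_000_000 rows.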
The problem with handmade solutions and with anything tied closely to a certain language is that you're giving up large amounts of flexibility. SQL was designed specifically so that different programs in different languages could communicate with the same database and use the same data manipulation routines on the same data. You lose that if you're building it in some specialized database language that has no other support. While in some cases it's worthwhile to forgo convention and flexibility for performance, you have to be sure of what you're losing and what you're gaining. To be sure requires a lot more than a bit of ad hoc testing on one example without accounting for possible machine deficiencies.
| [reply] |
Re: Integrated non-relational databases ?
by mr_mischief (Monsignor) on Sep 25, 2007 at 23:22 UTC
|
I'm not sure what exactly it is you're wanting. The mnesia examples don't look very impressive compared to some of the things I've seen done with Perl. So I may be missing your point here, but I'll mention some options for you to consider besides DBI and the DBIx modules.
If you don't need it to be relational, what's wrong with tied hashes, DBM::Deep, Persistent, or Storable? Is Rosetta more what you'd like?
After a quick look at CPAN's tools for data persistence that don't seem to involve SQL or RDBMS software, XML::Simple, Data::Table, AsciiDB::TagFile, DBM::DBass, DBM::Any, and Data::CTable look interesting, too.
Tie::LDAP gives you directory service access instead of a database interface, which could come in quite handy for data with complex schemata that get updated infrequently.
| [reply] |
Mnesia can handle a much bigger DB and is ACID, transaction based, replicated, and can span several computers.
I didn't know DBM::Deep could handle so many records... but I suspect that over 1 million it won't be that fast.
| [reply] |
Much bigger than which?
Tie::LDAP for one can do all that. It's non-SQL. LDAP, including OpenLDAP, can span about as many computers as you can afford for your data center. It uses structured and schema-restricted data that can form a tree or a graph and not just a table. Using Tie::LDAP, you access it from the language.
PostgreSQL is ACID, transaction based, and can be replicated, but it does use SQL.
Most importantly, where's the patch?
| [reply] |
Re: Integrated non-relational databases ?
by doom (Deacon) on Sep 30, 2007 at 02:10 UTC
|
| [reply] |