in reply to X-prize: A Knowledge Protocol
in thread X-prize software challenge?
Re^2: X-prize: A Knowledge Protocol
by BrowserUk (Patriarch) on Oct 15, 2004 at 22:43 UTC
I've read a few bits on the semantic web efforts. The problem I see with it is that it is fairly typical of the latest specifications coming out of the W3C and similar bodies: all-encompassing, over-engineered, heavyweight. The remarkable thing about most of the early protocols upon which the internet is based is just how lightweight and simple they are. You can connect to a telnet server from a WAP-enabled mobile phone and do everything you could from a fully-fledged terminal. You can connect to a POP3 server and do everything from a command line. The same goes for FTP, SFTP, SMTP and almost all of the other basic protocols. All the bells and whistles that a good email program, terminal program etc. layer on top are nice-to-haves, but the underlying protocols that drive the network are simple.

What I've seen of the semantic web talks about using XML (already too complicated), XPath (worse) and the Resource Description Framework (hugely complicated). Layers upon layers, complications on top of complications. The simple, fundamental principles that stood the early protocols in good stead have been forgotten or ignored.

Question: What makes XML so difficult to process? Answer: You have to read to the end to know where anything is.

The MIME protocol followed early transmission-protocol practices: each part or sub-part is preceded by its length. That way, when you're reading something, you know how much you need to read, and you can choose to stop once you have what you want. XML, on the other hand, forces you to read to the end. You can never be sure that you have anything at all until you have read the closing tag of the first element you received. That's what makes processing XML as a stream such a pain. XML::Twig and similar tools let you pretend that you can process bite-sized chunks, but if at the final step the last close tag is wrong, corrupted or missing, then all bets are off: according to the XML standard it isn't a "well-formed document", and the standard provides for no recovery or partial interpretation. Any independent mechanism, like XML::Twig or the way browsers handle imperfectly formed HTML, is outside the specification and therefore not subject to any rules. That is why different browsers present the same ill-formed HTML in different ways.

A transmission protocol that didn't provide for error detection and error recovery would be laughed out of court. It's my opinion that any data communication protocol that says "everything must be perfect or we aren't going to play" should equally be laughed out of court. The sad thing is that XML could be fixed in this regard quite easily.
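To make that contrast concrete, here is a minimal sketch in Perl. The length-prefixed framing is a toy of my own devising rather than actual MIME syntax, but it shows the principle: a reader that always knows how many bytes come next can stop at any record boundary and keep what it has, while a conforming XML parser refuses a truncated document outright.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# --- A toy length-prefixed framing: "<bytecount>:<payload>" ---
# The reader always knows how many bytes come next, so it can stop at
# any record boundary and keep everything it has read so far.
my $framed = "5:hello11:ignore-this5:world";
open my $fh, '<', \$framed or die $!;

my $c;
while ( read( $fh, $c, 1 ) ) {
    my $len = '';
    while ( $c ne ':' ) {
        $len .= $c;
        read( $fh, $c, 1 ) or die "truncated length prefix\n";
    }
    read( $fh, my $payload, $len ) == $len or die "truncated payload\n";
    print "framed record: $payload\n";
    last if $payload eq 'hello';    # got what we wanted; stop early
}
close $fh;

# --- XML: nothing is certain until the final close tag arrives ---
# The root element below is never closed, so the document is not
# well-formed. The per-element handlers do fire as the stream goes by,
# but the parser still dies at the end: the "all bets are off" problem.
my $truncated = '<records><rec>hello</rec><rec>world</rec>';
my $twig = XML::Twig->new(
    twig_handlers => { rec => sub { print "rec seen: ", $_->text, "\n" } },
);
eval { $twig->parse( $truncated ) };
print "XML parse failed: $@" if $@;
```

The first half can bail out after the first record with everything read so far still valid; the second half watches both records go past and is then told, at the very end, that the whole document is invalid.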
I think that continuing the tradition of:

has a lot of merit. I also think that de-centralising the information-provider directory has huge merit. The problem with what I've read of the semantic web is that either every client has to individually contact every possible information provider to gather information, or it has to contact a central information-provider directory service, which requires large volumes of storage and processor power and therefore will need to be funded. Once you get a paid-for (by the client) service, the responses from the service are controlled by commercial interests, and are then open to paid-for (by the information providers) prioritisation. Once again the clients--you, me and the rest of Joe Public--end up getting to see only what some commercial service provider is paid the most to show us.