Re: Transforming axml into hyperlinks
by roboticus (Chancellor) on Apr 15, 2007 at 13:18 UTC
simonodell:
Sorry this is totally unrelated to your question, but I've seen a few of your aXML posts over the last few days, and thought I'd make a comment. (Note: I haven't read your aXML stuff in detail, so this suggestion might prove unsuitable for your purposes.)
Languages and parsing can have a few tricky corners for the unwary, and I suspect you may be digging an ugly hole for yourself. The XML grammar, as simple as it is, may interact with yours in odd ways unless you thoroughly determine what you want. So far, it feels like you're flying by the seat of your pants, so it's quite possible you may find yourself coding around in circles: fixing one language bug, only to have a new special case arise that you'll have to fix in turn. An ugly merry-go-round.
Looking at what you want, I'd suggest you go a slightly different way: Create a namespace for your keywords, and then build your processor (similar to an XSLT processor) that contains your plugin architecture. You could use an attribute to control processing order, rather than new syntax elements. This could give you some standard XML similar to:
<link action="..."><aX:qd procorder="after">ref</aX:qd></link>
This will give you a couple of advantages:
You'll still be able to fly by the seat of your pants, but you won't have the potential spectre of grammar interactions.
You'll be able to use standard XML parsers to do some of the heavy lifting, and you'll be able to concentrate on the unique bits of your idea.
Since it's a standard XML file, you can use a standard XSLT processor for pre/post processing your files. (Nice for your users when your tool is one of many in a chain of operations. For example, a user might use an XSLT stylesheet to insert some aXML syntax into documents for use with your tool.)
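A minimal sketch of the attribute-driven approach, assuming the CPAN module XML::LibXML is installed. The namespace URI, the query data, the expand_ax and rank_of helpers, and the before/after ranking are all made up for illustration; none of them come from aXML itself. The xmlns:aX declaration is also an addition: it binds the aX prefix so that XPath can find the elements.

```perl
use strict;
use warnings;
use XML::LibXML;    # CPAN module, not core: assumed to be installed

my $AX_NS = 'http://example.org/aXML';    # hypothetical namespace URI

# Processing rank taken from the procorder attribute (hypothetical values);
# unmarked tags run in between.
my %rank = (before => 0, after => 2);

sub rank_of {
    my ($node) = @_;
    my $order = $node->getAttribute('procorder') // '';
    return $rank{$order} // 1;
}

# Replace every aX:qd element with the query value its text names.
sub expand_ax {
    my ($xml, $query) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(aX => $AX_NS);

    my @nodes = sort { rank_of($a) <=> rank_of($b) } $xpc->findnodes('//aX:qd');
    for my $node (@nodes) {
        my $value = $query->{ $node->textContent } // '';
        $node->replaceNode($doc->createTextNode($value));
    }
    return $doc->documentElement->toString;
}

my $xml = qq{<link xmlns:aX="$AX_NS" action="someaction">}
        . qq{<aX:qd procorder="after">ref</aX:qd></link>};

print expand_ax($xml, { ref => 'a link' }), "\n";
```

The parser does all the tokenizing here; the only aXML-specific logic left is the ordering rule and the lookup.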
...roboticus
Re: Transforming axml into hyperlinks
by graff (Chancellor) on Apr 15, 2007 at 04:40 UTC
This thread is my first introduction to "aXML". I presume the "definitive" reference would be here, which makes it look like a commercial product (first warning signal) with an obvious MS/ASP/.NET focus (another yellow flag). If this is one of those "extensions" to XML syntax -- intended to give the buyer "special powers" not conferred on those who insist on compliance with openly agreed-upon standards -- it is a misguided pursuit.
That said, how is it possible that angle brackets can be embedded in an attribute value, without being converted to &lt; &gt; ? Is that prudent? And why would a "qd" entity be embedded immediately inside another "qd" entity? And why would there be up to three different ways of bracketing this "qd" sort of thing (whatever it is)?
And what does this have to do with Perl?
Update: Sorry -- I over-reacted there. Seeing the perl part of the question:
axml string:
<link action="<qd><qd arg="(qd)ref3(/qd)">ref2</qd></qd>">[qd]ref[/qd]</link>
How could one obtain the result:
<a href="action.pl?action=someaction">a link</a>
using conventional cpan modules?
I guess Text::Balanced might be a way to start, but really, you might just have to go straight to Parse::RecDescent. But I'm not sure you've given enough of the "motivating principles" to clarify what it is you really need. Based on what I understand from the OP, this would do it:
s{<link action="<qd>.*?</link>}{<a href="action.pl?action=someaction">a link</a>};
That goes against the PM grain of using XML parsing modules when parsing XML, but this doesn't really look like XML, so why worry about it?
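For anyone who wants to try that substitution, here is a small runnable sketch using only core Perl. The axml_to_link wrapper is a made-up name, and the pattern includes the opening quote of the action attribute so that it matches the sample string as posted. The replacement is hard-coded, so this shows only the mechanics, not any real lookup of "someaction".

```perl
use strict;
use warnings;

# Hypothetical helper: swap the whole <link ...>...</link> element
# for a fixed HTML anchor. Nothing is derived from the input.
sub axml_to_link {
    my ($axml) = @_;
    (my $html = $axml) =~
        s{<link action="<qd>.*?</link>}{<a href="action.pl?action=someaction">a link</a>};
    return $html;
}

my $axml = '<link action="<qd><qd arg="(qd)ref3(/qd)">ref2</qd></qd>">[qd]ref[/qd]</link>';
print axml_to_link($axml), "\n";
# prints: <a href="action.pl?action=someaction">a link</a>
```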
No, the term aXML is used by others with different meanings, and there is no definitive reference. Sorry, I didn't think about people who haven't been watching my sorry display for the last few days...
My aXML is completely open source, but my code is way too ANSI C, and not Perl prose enough, for most people's liking. Anyway, I'm looking into doing it right, and to do so I need information. The problem is that I don't know what information I need, hence I'm trying to explain what aXML does in the hope that someone on the other side can shed light on the path for its future development.
UPDATE: Looked at Text::Balanced. Very interesting; I will study it properly and look into using it for the next version.
Okay... I think you're getting ahead of yourself a bit with the embedding and ordering. First off, you're using "special" tag delimiters (parens and square brackets) that have some greater-than-zero probability of showing up as actual text data in a given chunk of typical content.
How are you going to know (decide) which parens (brackets) are wrapped around your special "axml" symbols, and which are (normal) chunks of actual text data? And what's going to happen when the text content itself -- independent of any added axml markup -- contains unbalanced parens (which happens more than you might want to admit)?
Aren't you running a risk, of indeterminate but non-zero probability, that "name space" collisions will occur? That is, the names being assigned to your special replacement placeholders are going to have to be distinct from whatever might occur as actual data (e.g. tokens that might occur between parens or square brackets).
And that's all distinct from the problem of the complexity imposed on the content when you start talking about embedding/nesting replacement strings, and forcing precedence relations based on parens vs. square brackets vs. angle brackets. I can't get my head all the way around it, but I can foresee clashes in store... an angle-bracket style directive inside a paren-style directive, etc.
The task you're trying to accomplish with this mechanism can't be all that complicated. It's not just the fact that it never would be; I think it's also a matter that if it were that complicated, you'd need a different mechanism to make the control of it comprehensible to humans.
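To make the unbalanced-parens worry concrete, here is a minimal core-Perl sketch. The paren_balance helper and the sample strings are invented for illustration; the point is only that ordinary prose trips a bracket matcher before any axml markup is involved.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 0 when parens balance, a positive count of
# unclosed "(" otherwise, or -1 as soon as a ")" appears with nothing open.
sub paren_balance {
    my ($text) = @_;
    my $depth = 0;
    for my $ch (split //, $text) {
        $depth++ if $ch eq '(';
        if ($ch eq ')') {
            return -1 if --$depth < 0;
        }
    }
    return $depth;
}

for my $sample (
    'Choose a) red or b) blue',
    'Nice work :-) (qd)ref(/qd)',
    'plain (balanced) text',
) {
    my $state = paren_balance($sample) == 0 ? 'balanced' : 'unbalanced';
    printf "%-28s => %s\n", $sample, $state;
}
```

The first two samples come out unbalanced even though the second contains a well-formed (qd)...(/qd) pair, which is the ambiguity the parser would have to resolve somehow.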
aXML allows for 2 types of non-standard bracket delimiters, ( ) and [ ], which mean respectively, process this tag before all others, and process this tag after all others.
Please don't take it as a personal offense, but is this a wise choice to begin with? Personally I see problems with it:
- clashes with parens and squares in the text, as already pointed out by graff;
- creation of something that is somewhat like XML, but in fact is not.
Now, as far as the second point goes, if it were compellingly necessary, I don't think it would be a problem, although it would still leave you with the need to invent a wheel only slightly different from those that are already available, and thus to reinvent many wheels. Or to take existing tools, understand how they work and modify them to suit your needs, which would save you considerable time and give you more guarantees of doing the Right Thing™ but would still be less trivial than one may naively expect. Even without that, the approach is somewhat inelegant. XML is not exactly "simple", and I'm not really a big fan of it, but it has an elegance of its own. Having three different breeds of tags strikes me as breaking that. If the tags still have to nest correctly, you could achieve the same thing with attributes. Granted, it would be more verbose, but since you're dealing with "a sort of XML" anyway, that should not be a concern. Otherwise you would be using a lwml instead. If you go with attributes, then you will have regular XML and you could use one out of many already existing tools to parse XML.
Re: Transforming axml into hyperlinks
by Cody Pendant (Prior) on Apr 15, 2007 at 06:42 UTC
How can one derive the string "someaction" from another string which doesn't contain it?
By magic, maybe?
Seriously, if "someaction" was contained anywhere in that weird XML string, it would be pretty trivial to get it out. But it isn't.
So, what exactly did you mean?
If you want to create a link based on the content of the string, say for instance action.pl?action=ref3 it's very simple. But the question as asked is pretty much unanswerable, as far as I can tell.
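As a rough illustration of the "very simple" case, here is a core-Perl sketch that builds the link purely from text present in the string. The link_from_content helper is a made-up name, and the choice of the (qd) content as the action and the [qd] content as the label is an assumption about what was meant.

```perl
use strict;
use warnings;

# Hypothetical helper: take the action from the (qd)...(/qd) pair and the
# label from the [qd]...[/qd] pair, using only what the string contains.
sub link_from_content {
    my ($axml) = @_;
    my ($action) = $axml =~ m{\(qd\)(.*?)\(/qd\)};
    my ($label)  = $axml =~ m{\[qd\](.*?)\[/qd\]};
    return unless defined $action && defined $label;
    return qq{<a href="action.pl?action=$action">$label</a>};
}

my $axml = '<link action="<qd><qd arg="(qd)ref3(/qd)">ref2</qd></qd>">[qd]ref[/qd]</link>';
print link_from_content($axml), "\n";
# prints: <a href="action.pl?action=ref3">ref</a>
```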
($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
=~y~b-v~a-z~s; print
OK, let me explain... first, a quick recap of the data available:
action.pl?ref=a%20link&ref2=ref3&ref3=someaction
and the axml string:
<link action="<qd><qd arg="(qd)ref3(/qd)">ref2</qd></qd>">[qd]ref[/qd]</link>
<qd> refers to a plugin which handles query data and returns values from the query data hash, so if a qd command is given:
<qd>ref3</qd>
the result returned will be "someaction".
As you can see from the example, the qd command is invoked several times before the link command, due to the hierarchy of the tag types and their nesting. By the time the link command runs, all the qd's have done their thing, and the link command looks like:
<link action="someaction">a link</link>
The example contained various levels of tags to try to illustrate how they can be used in conjunction with each other to build data on the fly, an ability which I use extensively (maybe abuse) in my work. I just noticed an error in the example which may well have obfuscated what I was trying to get at... bah!
Perhaps a better example would be just:
<link action="<qd>ref3</qd>"><qd>ref</qd></link>
which doesn't do anything funky with brackets, but at least it works...
the processing order would be;
<qd>ref3</qd> ...
result ; <link action="someaction"><qd>ref</qd></link>
<qd>ref</qd> ...
result ; <link action="someaction">a link</link>
<link ...
result ; <a href="action.pl?action=someaction">a link</a>
I know this example is rather unexciting, but when you consider that any tag can become a handle for a plugin if a plugin is present with the same name, and that plugins can have any names then perhaps the idea starts to make some sense in a crazy naive way :)
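A minimal core-Perl sketch of that processing order, not the actual aXML code: the %plugin table, the expand function and the handler signatures are invented for illustration, and only angle-bracket tags are handled (the paren and square-bracket forms are left out). It repeatedly expands any plugin tag whose body contains no further tags, so inner tags run before the tags that enclose them.

```perl
use strict;
use warnings;

# Query data from action.pl?ref=a%20link&ref2=ref3&ref3=someaction
my %query = (ref => 'a link', ref2 => 'ref3', ref3 => 'someaction');

# Hypothetical plugin table: tag name => handler(\%attributes, $body)
my %plugin = (
    qd   => sub { my ($attr, $body) = @_; $query{$body} // '' },
    link => sub {
        my ($attr, $body) = @_;
        return qq{<a href="action.pl?action=$attr->{action}">$body</a>};
    },
);

sub expand {
    my ($text) = @_;
    my $names = join '|', map { quotemeta } sort keys %plugin;
    # Keep going until no plugin tag with a tag-free body and
    # tag-free attribute values is left.
    1 while $text =~ s{<($names)((?:\s+\w+="[^"<>]*")*)\s*>([^<>]*)</\1>}{
        my ($name, $attr_src, $body) = ($1, $2, $3);
        my %attr = $attr_src =~ /(\w+)="([^"<>]*)"/g;
        $plugin{$name}->(\%attr, $body);
    }ge;
    return $text;
}

print expand('<link action="<qd>ref3</qd>"><qd>ref</qd></link>'), "\n";
# prints: <a href="action.pl?action=someaction">a link</a>
```

Because the link tag cannot match until its attribute value and body are free of angle brackets, the two qd calls are forced to run first, which reproduces the order listed above.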
Anyway, I have dozens of plugins covering a range of server-side tasks, from reading and writing XML files to logging users in and out, retrieving session variables and so on, all embeddable in an HTML document. The tag types might be confusing to others, but they seem so clear to me. Maybe I'm just looking at it from a skewed perspective as its author :)