Before getting into details, what is the over-riding reason for mucking with the existing 'int main()' interface?
DrWhy writes:
"This code is mostly in one big long 'int main(int argc, char **argv)'. I just turned this into a 'regular' function and modified the argument list to contain slots for all the important formerly-command line options. Then I'm writing serveral different XSUBs that will call this with different arguments representing the major different modes of functionality provided."
I don't know why you need separate XSUBs. Couldn't you have pure perl subs for the different "modes" and have those call a single XS interface?
"These XSUBs need to return differnt data depending on whether they are called in scalar or array context, so they have PPCODE sections that manipulate the stack directly. That somewhat lengthy code that sets up the return values on the stack is identical in all the XSUBs. "
I have not problem with duplicating code in XSUBs, this usually happens. I don't see where you're losing anything.
"So I'm wondering, why not just have the central function (which is not going to be an XSUB itself -- it's in the .xs file above the MODULE line) do the stack manipulation itself. "
You got a little ahead of me here. The XSUBs, once run through xsubpp do more that just stack manipulation. ( This , I think, is the breeze to your house of cards. Nevertheless, indulge me while I continue... ) Otherwise I think that you're talking about writing a native perl extension. While there is no technical reason why you couldn't do this I think there are practical reasons why you shouldn't do this.
For maintainability I would want the new code to be in perl as far as possible, unless it meant standing on my head. This to minimize the amount of time spent thinking in both C and perl. I would also want to keep code-that-someone-else wrote untouched. If you didn't touch it then you can't be the one that broke it.
Yes, it would have been nice if the author had provided multiple interfaces, but he didn't. Trying to patch the code so that it has them seem to be going in the wrong direction. "Sewing new cloth on old clothes." In addition the new code would be neither C++ nor perl. It would no longer be testable as a standalone.
The effort spent on modifing the existing C interface would be better spent defining a new, testable, perl interface. Then backing in functions written in pure perl.
And finaly, refering to perlxs:
Of course, one could write such glue code directly in C. However, this would be a tedious task, especially if one needs to write glue for multiple C functions, and/or one is not familiar enough with the Perl stack discipline and other such arcana.
While you may be familiar with "such arcane" the poor guy maintaining the code after you wont be. Have pitty.
Nevertheless, Thank you for your question. It really got the gray matter churning. I hope I wasn't to harsh. I also appologize if I mis-understood what you were trying to do.
s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s
|-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,,
$|=1,select$,,$,,$,,1e-1;print;redo}
| [reply] [d/l] |
To append somewhat to what starbolin noted and to more directly answer your questions (though I mostly aggree, and will partially re-iterate what starbolin has already said:
1) You can manipulate the Perl stack anywhere you want, really; as long as you do it right.
2) I would aggree with starbolin in that this design might not be the greatest. If you really want this module to be in XS, I would try to stick to a couple philosophies I follow (take it as a grain of salt, though): white your XS like you would write your Perl. In this case, you wouldn't want to write a PP module that would impicitly push onto a list to be returned so that another routine would be able to return it. Although it would work, it would be confusing and tough to debug and/or modify at a future date.
Everything considered, it is a little hard to say what the 'right' way to do it would be without a little more knowledge on the problem scope. Perhaps dropping some psudo code as a comment might better explain the module and what you are trying to do with it, therefore allowing for better analysis. Otherwise, I hope this addendum helps out a little and good luck with your module!
| [reply] [d/l] |
[write] your XS like you would write your Perl
My XS philosophy is "write XS code like C code, and provide a C-friendly interface". I'm not sure my philosophy actually conflicts with yours, even though it sounds somewhat opposite. I find that the more Perl-like code you put into XS, the more fragile the results are (in multiple ways).
So I'd write the wrapper functions in Perl and write one XS function that takes C-friendly arguments and returns C-friendly results and let the Perl code check for list vs. scalar context etc. That usually results in cleaner, simpler, easier to maintain and extend and more powerful code.
| [reply] |
Thanks starbolin, wazzuteke, and, tye. All of your posts are very helpful. For simplicity, I'm going to respond to all of them at once.
First a little more about the code I'm adapting. It's code to calculate measures of 'translation quality' for human language translations. They can be used for actual human performed translations, but we use them to test machine translation systems.
(In case you are interested, there are six measures to choose from, only one is done on each run of the original program: IBM BLEU, NIST, NIST's variation on IBM BLUE, PER -- position-independent error rate, WER -- Levenshtien-distance-based error rate, and the average of WER and PER. Each of these measures is quite compute intensive -- a pure Perl version we have here at work takes like 30 minutes at least to run the IBM BLEU calculation. Thus the motivation to make a Perl extension in C++. I would have used C, but I'm not very experienced with C/C++ and we already have this C++ implementation.)
As for abandoning the int main() interface, 1) it needs to return double, not int -- when called in scalar context, 2),
I didn't want to mess with the typemap headache of translating an array ref on the perl side into a char ** to feed to main() -- that is not a straightforward proposition, and 3) making slots in the interface for each kind of information makes for a much cleaner or at least easier to grok interface, than just having a char ** with random stuff in it.
Wazzuteke's second point about grokability of manipulating the stack too far down is exactly what I was worried about, but hadn't been able to think through clearly enough on my own. I definitely agree with your point that setting the return values too far down is a bad idea. I was just worried about the alternate maintenance difficulty of managing six copies of the same code. Which one is worse? Then I looked more closely at my six wrapper XSUBs and realized that three were nearly identical to each other and the other three were also nearly identical to each other so I realized I could take advantage of XS'a ALIAS keyword here. and reduce the number of XSUBs to two. Since the prospect of having to maintain two copies of the same thing is much less daunting I've definitely moved away from my original idea. I now have a design very similar to the pseudo-example below. Actually the XS part is nearly the same -- the non-XS part is greatly simplified, but I've preserved enough so I think you can get an idea of what's going on.
double calculateStuff(int calculationType, char * translationFile, AV
+*referenceFiles, AV* report, char *extra1, int extra2) {
double score = 0.0;
ifstream trans(translationFile);
vector<ifstream> refs;
numRefs = av_len(referenceFiles);
for (I32 i = 0; i < numRefs; i++)
files[i] = new ifstream(SvPV(*(av_fetch(refFiles, i, 0))));
// figure out what the score should be
switch calculationType {
case 0:
// do calculation type number 0
case 1:
// do calculation type 1
case ...
// yadayada up through type 5
default:
// complain I don't know what to do
}
if (report) {
av_push(report,newSVpv('score', PL_na));
av_push(report,newSVnv((NV)score);
av_push(report,newSVpv('variance', PL_na));
av_push(report,newSVnv((NV)variance);
av_push(report,newSVpv('length penalty', PL_na));
av_push(report,newSVnv((NV)lengthPenalty);
//etc., etc.
}
return score;
}
MODULE = blah PACKAGE = blah
void
type1(char *hypFile, AV *AV_refFiles, int extra = 0)
ALIAS:
type5 = 5
type4 = 4
PREINIT:
AV *report = 0;
bool wantarray = false;
double score;
PPCODE:
wantarray = GIMME_V == G_ARRAY ? true : false;
if (wantarray) report = newAV();
score = computeScore((ix ? ix : 1), hypFile, refFiles, report,
+ 0, extra);
if (wantarray) {
EXTEND(SP,av_len(report));
for (I32 i = 0; i < av_len(report); i++)
PUSHs(sv_2mortal(*(av_fetch(report, i, false))));
}
else {
PUSHs(sv_2mortal(newSVnv((NV)score)));
}
void
type0(char *hypFile, AV *AV_refFiles, char * extra = 0)
ALIAS:
type2 = 2
type3 = 3
PREINIT:
AV *report = 0;
bool wantarray = false;
double score;
PPCODE:
wantarray = GIMME_V == G_ARRAY ? true : false;
if (wantarray) report = newAV();
score = computeScore(ix, hypFile, refFiles, report, extra, 0);
if (wantarray) {
EXTEND(SP,av_len(report));
for (I32 i = 0; i < av_len(report); i++)
PUSHs(sv_2mortal(*(av_fetch(report, i, false))));
}
else {
PUSHs(sv_2mortal(newSVnv((NV)score)));
}
I like the point that pretty much everyone made one way or another that you should do Perl stuff with Perl and other stuff in XS/C/C++. Trying to do too much Perl stuff in C/XS is kind of taking the scenic route, and forcing the next guy to look at your code to take the scenic route too, and as for doing hard/slow stuff in Perl, well, that's the whole point of XS.
Please feel free to comment on the design above, I'm still not sure I've set things up the best way, especially in terms of separating Perl and C++/XS the best way.
Thanks again,
--DrWhy
"If God had meant for us to think for ourselves he would have given us brains. Oh, wait..."
| [reply] [d/l] |