perl -COE and, for example, CGI::Application

isync has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I have a script that uses the CGI::Application framework to serve utf-8 webpages. Today I added a feature that reads a binary image file from disk and streams it out to the web user. While incorporating CGI::Application::Plugin::Stream for that, I found that my image was broken unless I switched STDOUT into binary mode just before print()-ing the raw image data. But why?

I checked every step of the output process and finally arrived at the switches on my perl shebang line:
#!/usr/bin/perl -COE

So far this set of switches, in combination with decode_utf8() on user-supplied data, has served me quite well. Until today, when removing the switches magically solved my problems with streaming out the raw data. It seems that -COE breaks binary data output with print().

Now, should I keep it this way, without -COE? As said, I generally serve utf-8 webpages with this script and it handles binary file prints as well.

I am aware that I need to decide whether I use -COE on the perl command line and a binmode ':raw' on the raw output, or the other way round: encode all non-raw output to utf-8 before printing and leave the raw data alone. Is there a best practice? The fact that CGI::Application::Plugin::Stream uses no fancy STDOUT commands seems to indicate that it's better to encode all text output before print, or it may just be that the module is not utf-8 aware. Any suggestions?

It would be great if a knowledgeable monk could outline which combination (switches, encode, decode) is best to use 1. on the shebang, 2. on form data and 3. on output.

BTW: in CGI::Fast mode, should I switch STDOUT back to what it was before, after my print, or will CGI::Application magically clear the binmode on the next cycle? See:

# ... read $fh in binmode etc.
binmode STDOUT, ':raw';
print $buffer;
print '';
close ( $fh );
binmode STDOUT, ':utf8';   # :utf8 is the equivalent of -CO, right?
return;
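
For what it's worth, the switch-and-restore pattern above can be wrapped in a small helper. This is only a sketch using core Perl: `print_binary` is a hypothetical name, not part of CGI::Application or any plugin, and it assumes the handle should go back to `:utf8` afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: send raw bytes through a handle that normally
# carries a :utf8 layer, then put the text layer back for later output.
sub print_binary {
    my ($out, $bytes) = @_;
    binmode $out, ':raw';    # clear :utf8 so the bytes pass through unchanged
    print {$out} $bytes;
    binmode $out, ':utf8';   # text output is encoded as UTF-8 again
    return;
}
```

In a runmode this would be called as `print_binary(\*STDOUT, $buffer);`. Restoring the layer explicitly means a long-running FastCGI process does not depend on anything else resetting STDOUT between requests.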


Re: perl -COE and, for example, CGI::Application by karavelov (Monk) on Sep 23, 2008 at 00:13 UTC
What I usually use for output encoding is just: `binmode STDOUT, ':utf8';`

If you run it under FastCGI, put this line in cgiapp_prerun. In the runmode where you output the binary data, insert `binmode STDOUT,':raw';`

If you expect unicode data from forms, you could put this code in your cgiapp_prerun stage:

my $vars = $self->query->Vars;
while ( my($k,$v) = each %$vars ){
    next unless defined $v;
    next if $self->query->upload($k);   # uploads are binary
    next if $k eq 'auth_password';      # else MD5 crashes
    $self->query->param( -name=>$k, -value=>Encode::decode_utf8($v) );
}

Using applications written with CGI::Application under FastCGI is a little bit troublesome. The method the documentation recommends for programmatically switching runmodes, "prerun_mode", leaves the object unusable afterwards. So I find the plugins that use this method (Forward, Redirect, Session, Authentication, Authorization) unusable in a FastCGI environment.
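
The decoding loop above can be tried outside of CGI::Application on a plain hash. A minimal sketch, assuming the form values arrive as UTF-8 octets; `decode_form_values` and its skip list are hypothetical names introduced for illustration only.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode every UTF-8 form value in place, leaving
# undefined values and the named fields (e.g. passwords) untouched.
sub decode_form_values {
    my ($vars, @skip) = @_;
    my %skip = map { $_ => 1 } @skip;
    for my $key (keys %$vars) {
        next if $skip{$key};
        next unless defined $vars->{$key};
        # Octets in, Perl character string out.
        $vars->{$key} = Encode::decode_utf8($vars->{$key});
    }
    return $vars;
}
```

After `decode_form_values(\%vars, 'auth_password')`, a value that arrived as the two octets `"\xc3\xa9"` is a single character, while `auth_password` is still the original octets.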
Re^2: perl -COE and, for example, CGI::Application by isync (Hermit) on Sep 23, 2008 at 10:29 UTC
So, you would advocate not using any switches on the perl shebang, right? Putting `binmode STDOUT, ':utf8';` in cgiapp_prerun and doing `binmode STDOUT,':raw';` only in runmodes where I output binary data is a very elegant solution, I think, as it resets STDOUT to :utf8 on every new cycle. Right?

Regarding the automatic `decode_utf8($v)` on form input with detection of uploads, I am conservative. I had a bit of trouble with it, so I fell back to doing it on a per-runmode basis. Are your experiences more consistent (would this be production code)?

I can't share your last comment: it might be that I do things less efficiently, but I have my FastCGI loop far enough around everything that I can use the normal runmode switching facility, and all the mentioned plugins work under CGI::Fast.
Re^3: perl -COE and, for example, CGI::Application by karavelov (Monk) on Sep 23, 2008 at 15:18 UTC
I think that it is better to do this with "binmode" rather than with switches. One reason is that with "binmode" you can run the same cgi-app module under mod_perl.

I use the decode_utf8() in the prerun stage approach in production code. Initially I had some trouble caused by the interaction of utf8 and MD5, but I skip that case. Note that "auth_login/auth_password" is what I use for authentication; the default in CGI::Application::Plugin::Authentication is different.

Now I see why you do not have problems with CGI::Fast: you create a new object for every request. My initial experience was with CGI::Application::FastCGI, where the run handler is:

sub run {
    my $self = shift;
    my $request = FCGI::Request();
    $self->fastcgi($request);
    while ($request->Accept >= 0) {
        $self->reset_query;
        $self->SUPER::run;
    }
}

In your approach you have "new,run,new,run,new,run...". In my case I have "new,run,run,run...". The difference in performance is actually not that big, but I usually have some heavy initialization in the init stage that runs only once (on new object creation), so I see some benefit in this approach.
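
The post does not say what exactly went wrong with MD5. One plausible cause, offered here as an assumption, is that Digest::MD5 only accepts bytes and croaks with "Wide character" when a decoded string contains code points above 255. Encoding back to octets just before hashing avoids that; `md5_of_text` is a hypothetical helper name.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Hypothetical helper: hash a (possibly decoded) character string by
# turning it into UTF-8 octets first, since Digest::MD5 works on bytes.
sub md5_of_text {
    my ($text) = @_;
    return md5_hex(encode_utf8($text));
}
```

With this, skipping the password field during decoding becomes optional, because a decoded value can be passed to `md5_of_text` instead of directly to `md5_hex`.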
Re: perl -COE and, for example, CGI::Application by wol (Hermit) on Sep 23, 2008 at 10:44 UTC
I've not seen -C options before today, but they're all documented here quite nicely: http://perldoc.perl.org/perlrun.html. I might find a use for them myself...

The option to use UTF-8 for STDOUT breaks down when trying to output anything other than text: the 'U' is for Unicode, which implies text data. Image files aren't text (as usual, there are some esoteric exceptions, but in general...) so putting them through a UTF-8 encoder is like running a gamma correction on a perl script.

In this case, I'd suggest that you have two reasonable options: (1) stick with your current approach (default to UTF-8 with the -COE options, and switch to ":raw" when you need to output non-text), or (2) remove the -C option completely, and defer selection of ":utf8" or ":raw" until the point at which your code is able to identify what it's going to be outputting.
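
To make the damage visible, here is a small sketch that writes the same bytes through a `:raw` and a `:utf8` layer into in-memory buffers and compares the results. It uses only core Perl; `write_through` is a hypothetical helper, and the first eight bytes of a PNG file stand in for real image data.

```perl
use strict;
use warnings;

# Hypothetical helper: write $data through the given PerlIO layer into an
# in-memory buffer and return the bytes that actually arrived there.
sub write_through {
    my ($layer, $data) = @_;
    my $out = '';
    open my $fh, ">$layer", \$out or die "cannot open in-memory handle: $!";
    print {$fh} $data;
    close $fh or die "close failed: $!";
    return $out;
}

my $png_magic = "\x89PNG\r\n\x1a\n";   # signature at the start of a PNG file
my $raw  = write_through(':raw',  $png_magic);
my $utf8 = write_through(':utf8', $png_magic);
printf "raw: %d bytes, utf8: %d bytes\n", length $raw, length $utf8;
```

The `:utf8` copy comes out one byte longer, because the single byte 0x89 is treated as the character U+0089 and written as the two-byte sequence 0xC2 0x89. Every byte above 0x7F in an image is expanded the same way, which is why the file arrives broken.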
Re^2: perl -COE and, for example, CGI::Application by isync (Hermit) on Sep 23, 2008 at 11:12 UTC
Although I am proud of myself for once using something, the -C options, that is new to you ;-) I have now reverted to not using them after the discussion with karavelov. As shown, my updated approach is to set :utf8 at the start of each script cycle, and in the few cycles where my script outputs image/binary data I tell it to switch to :raw on a per-case basis. So I agree with your conclusion!
Re: perl -COE and, for example, CGI::Application by isync (Hermit) on Sep 23, 2008 at 11:20 UTC
Going slightly off-topic: may I draw your attention to the code comments in my first response? Should I move the modification of the $q->header to charset utf-8, which currently sits here:

while( my $q = new CGI::Fast ){
    $q->header(-charset => 'utf-8');   # as I now set :utf8 in cgiapp

into the cgiapp_prerun() stage of my CGI::Application App.pm? Currently I only have it at this very early app.cgi position, as I am unclear about when CGI::Application creates the CGI.pm / CGI::Fast object, and I am unsure whether the charset declaration is taken into account when this object is initialized...