in reply to Cool uses for path_info

I would just fix path_info() so that it untaints its value, and be done with it. After all, user input may slip through as long as it is valid and does no harm. I haven't looked through the entire module, so I can only guess that requests fail elsewhere with a 404 if path_info() doesn't provide anything useful to a component relying on it.

But then, you probably do untaint both the environment and user input as early as possible, don't you? If not, you should have a very good reason.

perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'

Re^2: Cool uses for path_info
by Dallaylaen (Chaplain) on Nov 24, 2016 at 14:29 UTC

    Thanks for your reply.

    Currently, a customizable 404 page is returned if (1) the URI doesn't match any route configured in the application, or (2) the user calls die 404; (or its longer analog) in the handler. Cookies and parameters have a signature like $request->param( name => qr/.../ );. Sorry for not explaining this in the question.
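A minimal sketch of how a regex-validated accessor in that style could behave. The helper name `validated` and the sample parameters are made up for illustration and are not the module's actual code; the only thing taken from the thread is the idea of passing a regex together with the name.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value only if the WHOLE string
# matches $re, otherwise undef.
sub validated {
    my ($raw, $re) = @_;
    return undef unless defined $raw;
    # Returning the capture ($1) instead of $raw is what untaints
    # the value when running under -T.
    return $raw =~ /\A($re)\z/ ? $1 : undef;
}

# Made-up input, standing in for request parameters.
my %params = ( id => '42', name => 'foo; rm -rf /' );

my $id   = validated( $params{id},   qr/\d+/ );   # passes validation
my $name = validated( $params{name}, qr/\w+/ );   # rejected, so undef
```

Anchoring with \A and \z rather than ^ and $ keeps a trailing newline from slipping through.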

    And yes, changing path_info() to the "untaint" style was my first thought. However, after trying it out I noticed that only a few paths in an actual application require path_info, and in those that don't I keep writing boilerplate along the lines of

    die 404 if $request->path_info(qr/.*/);

    Consider something like

    /questions
    /questions/tagged/\w+
    

    I would like to get a 404 upon requesting /questions/foobar automatically, without having to specify anything in the handler.

    Also if there's something like

    /history/\d{4}/\d{2}

    the path is likely to be processed by a further regexp that extracts specific values, so why not do that for the user up front and return the captured values?
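To make the idea concrete, here is a rough sketch of matching the whole path against a route pattern and handing back the captures. The route table, the route names, and `match_route` are hypothetical, and the second group of the history route is assumed to be \d{2}; an empty return stands in for the 404 case.

```perl
use strict;
use warnings;

# Hypothetical route table: each pattern must match the whole path.
my @routes = (
    [ questions => qr{/questions} ],
    [ tagged    => qr{/questions/tagged/(\w+)} ],
    [ history   => qr{/history/(\d{4})/(\d{2})} ],
);

# Returns (route name, captured values...) or an empty list,
# which the caller would turn into a 404.
sub match_route {
    my ($path) = @_;
    for my $route (@routes) {
        my ($name, $re) = @$route;
        my @captures = $path =~ /\A$re\z/ or next;
        # A pattern without groups returns (1) on success; drop that.
        @captures = () unless $#+;
        return ($name, @captures);
    }
    return;
}

my ($name, $year, $month) = match_route('/history/2016/11');
my @miss = match_route('/questions/foobar');   # empty: would be a 404
```

With this shape, /questions/foobar fails every pattern without the handler having to say anything, and the history handler receives the year and month already extracted.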

    That's why I'm thinking of going for a more convoluted API and deprecating path_info() altogether. Complex APIs are evil, but so is boilerplate code and unneeded repetition.

Re^2: Cool uses for path_info
by Dallaylaen (Chaplain) on Nov 28, 2016 at 12:53 UTC
    I ended up adding a path_info_regex parameter to the path handler definition. It untaints path_info for later use and results in a 404 if path_info doesn't match. The current behavior is deprecated and will be phased out in future versions (in fact, the regex will just get a default value equal to ^$). What I originally came up with was clearly overengineered. Thanks again for the discussion!
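For readers following along, a small sketch of the behavior described here, written as a standalone function. `checked_path_info` and the way the option is passed are assumptions made for illustration; only the path_info_regex name, the 404 on mismatch, and the ^$ default come from the post.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the framework's check.
sub checked_path_info {
    my ($path_info, %opt) = @_;
    # Default from the post: ^$ means no extra path is allowed.
    my $re = $opt{path_info_regex} // qr/^$/;
    $path_info =~ /\A($re)\z/ or die "404\n";
    return $1;   # the captured copy is untainted under -T
}

# Handler that expects no trailing path: only '' is accepted.
my $empty = checked_path_info('');

# Handler that declares what it accepts.
my $date = checked_path_info( '2016/11', path_info_regex => qr{\d{4}/\d{2}} );

# Anything else dies, which a framework would render as the 404 page.
my $ok = eval { checked_path_info('foobar'); 1 };
```

The point of the default is that a handler which never mentions path_info rejects unexpected trailing paths without any boilerplate.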