I'm using Perl more and more to perform Web Automation. We support an operation that uses a largish application here that supports a Web Interface and I use the Web Interface + LWP to script interactions. The largish application provides no other scriptable interface and we're not really allowed to touch it, except as users.

So far, I've been doing one-off scripts. I've been trying to genericize them, but at this point, I'm the only one with enough background in this to use these scripts. I'm the only one in the shop who does any Perl, but people are getting interested now that I'm showing them how I can automate things that are painful and error-prone when done manually. If it's done right, perhaps I can get some programming help.

There are any number of applications I can imagine based on this technology but I can't be the only one using these things. I have to provide a User Interface so that the Help Desk people can use the fruits here. The individual applications will typically be quite small, but there'll potentially be a lot of them.

I'm thinking TT and CGI::Application. Any alternate suggestions here?

I've still not decided on the platform to support this. I'll have root on a Linux box I'll be setting up. Is this important? I could also put them on the same HPUX box that runs the largish application. I won't have root access, but I should have a fairly free hand otherwise. Lastly, I could use some Windows servers, which would probably allow me to more easily provide these to potential users outside of our organization (currently, the Windows servers are the ones that are allowed to be accessed externally... It's a crazy, mixed up world, eh?). There isn't _much_ reason to offer these applications externally, and I can tunnel in to access these apps from home no problem, so I'm not sure this is a win.

I'm leaning toward the Linux box for initial deployment and maybe if I do it right I can set up the same environment over on the HPUX machines (and possibly the Windows machines) later. Oh, what Linux distribution do you recommend for a machine that might not have any Internet access? I might be blocked completely by the local firewall due to paranoia about the Linux box. I'm good with that, but I think Debian, which I've used before, is oriented to network installation and may not be the best choice here. SuSE? Because the standard distro gives you so much on the CDs?

Wise monks, please offer me guidance. What's most important? Should I focus on developing a good working set of modules for the common operations and build from there? Or should I dive in with TT and CGI::Application, getting some stuff working as quickly as possible?

What about application logging? How much logging is necessary? Should I log everything, or should I set it up so there are logging levels? Should I make these logs available to the users so they can just provide the offending log along with their experiences when something doesn't go right?

I'm concerned that if I don't show fruit soon, my working on it will get too much scrutiny. That's a political issue, not really a Perl issue, but one with which I think many here would be quite familiar.

Maybe I should just start on the HPUX box to not spend time with admin and setup of the Linux box. OTOH, I might be able to be more productive in the Linux box that I can control. Adding modules to the HPUX Perl might be messier. In your experience, how much is development speeded by having root on the development platform? In some ways, I could see not having root helping to enforce good security practices.

The things I won't gloss over will be:

I've done a lot of project work in the past, but always either in groups where many of the development issues were decided amongst the group, or it was maintenance work and I just modified things in an environment that was already set up. I've never attempted to develop a suite of applications by myself and it's daunting. Oh, did I mention that I probably can only shake free about 25% of my working hours + any off-hours time I can scrounge to do all this? Any advice on how to be productive on development in an environment with lots of interruptions? I'm actually hoping that if I'm successful, I'll be chartered to work on this more and other tasks less. Also, these should actually be big time savers to the Help Desk, so they'll be able to take on more work. This is the dream.

Lots of rambling, lots of questions. I'm sure some of you have dealt with many of these issues in the past. What's worked really well?

Re: Developing a Suite of CGI applications
by Corion (Patriarch) on Sep 15, 2002 at 18:45 UTC

    A broad field you meditate on. I don't know much about Linux distributions - I do my development (at home) on a Win2k box, at work I use Solaris and Windows NT. I can offer three answers to your questions :

    First, plan to throw away everything you'll write in your first two generations. In my experience, you need to have implemented a task at least three times before you know what the important details are, and what approach will work in the end. Don't invest in a large scale object model or too detailed planning, as it will tie you to the bad ideas you had at the start. You may already have gathered some experience in your field with your one-off scripts, but that experience won't reach far enough for a help-desk application. In my opinion, you should first write all the one-off scripts as separate scripts, and watch for which parts can be abstracted into modules, and which parts simply aren't worth the bother. Remember that when you change a module, every program that uses it needs retesting, but when you change a copy of the module, only the program you're working on needs retesting.

    From my experience, you can't have enough logging. Period. Rotate the logfiles daily, if disk space should be of any concern, but log everything. Don't make it configurable or somebody will switch it off. Logs aren't there for when everything works as planned, but they are there for you to stare at after something unexpected happened. Plan for an easily parseable format, and for logfile viewers (CGI maybe) that allow you to specify different levels. But don't throw away the log data.
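As a rough illustration of the "log everything, filter when viewing" idea above, here is a minimal sketch using only core Perl. The level names, the `webauto-YYYY-MM-DD.log` file naming, and the function names `log_path`, `log_event` and `read_log` are all assumptions made up for this example; they are not anything from the thread.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical level names; a higher number means more severe.
my %LEVEL = (DEBUG => 0, INFO => 1, WARN => 2, ERROR => 3);

# Build the logfile name from the date, so a new file starts each day
# (a simple form of daily rotation; directory and prefix are assumptions).
sub log_path {
    my ($dir, $time) = @_;
    return sprintf "%s/webauto-%s.log", $dir, strftime("%Y-%m-%d", localtime $time);
}

# Append one tab-separated record: timestamp, level, message.
# Every level is always written; nothing is filtered at write time.
sub log_event {
    my ($dir, $level, $message) = @_;
    die "unknown level '$level'" unless exists $LEVEL{$level};
    my $now = time;
    $message =~ s/[\t\r\n]+/ /g;    # keep one record per line
    open my $fh, '>>', log_path($dir, $now) or die "cannot open log: $!";
    print {$fh} join("\t", strftime("%Y-%m-%dT%H:%M:%S", localtime $now), $level, $message), "\n";
    close $fh or die "cannot close log: $!";
    return;
}

# Viewer side: return only the records at or above a minimum level.
sub read_log {
    my ($path, $min_level) = @_;
    die "unknown level '$min_level'" unless exists $LEVEL{$min_level};
    open my $fh, '<', $path or die "cannot read $path: $!";
    my @records;
    while (my $line = <$fh>) {
        chomp $line;
        my ($stamp, $level, $message) = split /\t/, $line, 3;
        next unless defined $message && exists $LEVEL{$level};
        push @records, { time => $stamp, level => $level, message => $message }
            if $LEVEL{$level} >= $LEVEL{$min_level};
    }
    close $fh;
    return @records;
}
```

A CGI log viewer could call `read_log($path, 'WARN')` for a quick look and `read_log($path, 'DEBUG')` when digging into a problem, while the file on disk always keeps the full detail.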

    From what you describe, you'll be working alone - this is a bad idea. You need someone to discuss your thoughts with, and your boss will also need somebody to take your place when the bus runs you over. A lot of bad design ideas get killed by explaining them to somebody else.

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
    $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
    ($c = $d->accept())->get_request(); $c->send_response( new #in the
    HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
      Thank you for what sounds like good advice. Advice won from hard experience is what I'm looking for.

      My one-off scripts are using a lot of cut and paste reuse, which from what you are saying, is not necessarily a bad thing at this point. I was concerned.

      I have to agree that logging is very important. I'll take great care here to log everything and rotate the logs. I'm already doing this on a non-Perl application I support and I have to say that I've never regretted it. I even compress the old logs and more than once have been asked to research something that happened weeks ago and was happy I had them.

        From what you describe, you'll be working alone - this is a bad idea. You need someone to discuss your thoughts with, and your boss will also need somebody to take your place when the bus runs you over. A lot of bad design ideas get killed by explaining them to somebody else.

      Unfortunately, this will be a given. I can bounce some ideas off of some other programmers here, but they won't have very much experience in Perl or the areas that I'm working in. I may be making heavy use of the Monastery!

Re: Developing a Suite of CGI applications
by Ryszard (Priest) on Sep 15, 2002 at 19:04 UTC
    wow, what a post!

    I've done something similar where I work. I've used CGI::Application and HTML::Template (as C::A provides hooks to it) for the application framework. I've written some extra mechanics to handle sessions, logging, administration etc. It's all OO based, and all relatively abstracted.

    This has really worked very well for me and the team that uses it. It's *very* easy to develop new applications, and plug them in, all with implicit session management, and access control. In fact it's worked so well we've got a list of projects we're developing for other depts in the company. Like you we're automating manual tasks, and building new systems... what you've mentioned is perfect for this.

    One thing you haven't mentioned is persistent storage, or state preservation - we use an RDBMS, Oracle specifically, however if you write SQL92 compliant code, it doesn't really matter which engine you use.

    Should I focus on developing a good working set of modules for the common operations and build from there?

    If you're starting out cold, I would build a nice extensible framework, then once you have the framework in place, building the applications is easy.. :-) The way I work is to design the application, then think about abstracting pieces of it into a set of modules you can use later. Typically I use a two tier model. The top tier is the web interface, the bottom tier is the interface to the physical infrastructure (OS, Network, Database etc). I've found this to work pretty well.
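To make the two-tier split above concrete, here is a minimal sketch using CGI::Application. The package names `Backend::Accounts` and `HelpDesk::Unlock`, the `unlock_account` method and the "unlock an account" task are all hypothetical, invented for illustration; the bottom tier is just a stub standing in for the real LWP conversation with the application.

```perl
# --- bottom tier: talks to the infrastructure, knows nothing about CGI ---
package Backend::Accounts;    # hypothetical module name
use strict;
use warnings;

# Stub standing in for the real work (e.g. an LWP session with the application).
sub unlock_account {
    my ($class, $user) = @_;
    die "no user given\n" unless defined $user && length $user;
    return "Account '$user' unlocked";
}

# --- top tier: run modes that only handle the web side ---
package HelpDesk::Unlock;     # hypothetical module name
use strict;
use warnings;
use base 'CGI::Application';

sub setup {
    my $self = shift;
    $self->start_mode('form');
    $self->run_modes(
        form   => 'show_form',
        unlock => 'do_unlock',
    );
}

# First screen: a form whose hidden "rm" field selects the next run mode.
sub show_form {
    my $self = shift;
    return <<'HTML';
<form method="post">
  <input type="hidden" name="rm" value="unlock">
  User: <input type="text" name="user">
  <input type="submit" value="Unlock">
</form>
HTML
}

# Second screen: hand the request to the bottom tier and report the outcome.
sub do_unlock {
    my $self   = shift;
    my $q      = $self->query;
    my $user   = $q->param('user');
    my $result = eval { Backend::Accounts->unlock_account($user) };
    return defined $result
        ? '<p>' . $q->escapeHTML($result) . '</p>'
        : '<p>Failed: ' . $q->escapeHTML($@) . '</p>';
}

package main;

# The instance script (say, unlock.cgi) would then only need:
#   HelpDesk::Unlock->new->run;
```

Because the bottom tier never touches CGI, the same `Backend::Accounts` call could be exercised from a command-line script, which may suit a transition from existing one-off scripts.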

    What about application logging?

    We have the framework logging access to the applications, and each application logging all the "important" things. Rule of thumb: Log enough to provide an audit trail for accountability. Some may not agree with this, however if someone finds a bug in your code, you should be replicating it in a stage (or integration) environment, where you can add whatever level of debug code you like...

    Platform?

    Personally, I'd go for whatever you have two, or three of... one for production, one for integration to production, and one for devel. You have the advantage of a spare if your production one fails... If you need to ramp up, grab a layer 4 switch and put 2..n machines behind it. That type of design will scale right up.. :-) (assuming your backend can keep up).

    Devel time and root?

    Depends on how much support you get from admins. I've worked in places where the admins are fantastic and provide all the support you could ever want, and worked in places where you almost need to hack the box to get anything done..

    Any advice on how to be productive ..

    Plan, design and document. Put your thoughts down on paper, review, revise then code. I can tell you from experience, if you're designing and coding on the fly, you can end up in all sorts of unscalable and unmaintainable hell. If you need to cut and paste your code into another application - my rule is to abstract it out, define the appropriate class, and bung it into a module. There is nothing worse than having to come back and re-learn your old code... having pod to explain the interface, and inline comments on bits of logic is great. Also, a very handy bit of doco to have is why you *haven't* done things. Many, many hours have been lost to rethinking solutions...

    Hey, you've got a fun project here, and the time you spend upfront in designing and building a framework to house your applications will be time well spent, when you start churning out application after application after application ...

        One thing you haven't mentioned is persistent storage, or state preservation - we use an RDBMS, Oracle specifically, however if you write SQL92 compliant code, it doesn't really matter which engine you use.

      The only thing I can imagine that I'll need to be storing, at least for the present applications I'm envisioning, would be session information. Also, there are maybe 10 people total who would use these, with a maximum of 5 at once, typically. I'm actually thinking files or dbm would be good enough for my persistent storage needs.
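For a handful of users, a dbm-backed session store along these lines might be enough. This is only a sketch using the core SDBM_File module; the function names, the session ID and the `key=value` flattening are assumptions for illustration, and it does no file locking, which would need adding before several users write at once.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use SDBM_File;

# Open (or create) the session store. The path is chosen by the caller.
sub open_sessions {
    my ($path) = @_;
    tie my %store, 'SDBM_File', $path, O_RDWR | O_CREAT, 0600
        or die "cannot tie $path: $!";
    return \%store;
}

# Flatten a small hash of session fields into one string value.
# SDBM limits each key/value pair to roughly 1 KB, so keep sessions small;
# field values must not contain newlines in this simple format.
sub save_session {
    my ($store, $id, %fields) = @_;
    $store->{$id} = join "\n", map { "$_=$fields{$_}" } sort keys %fields;
    return;
}

# Return the stored fields as a key/value list, or nothing if the ID is unknown.
sub load_session {
    my ($store, $id) = @_;
    my $raw = $store->{$id};
    return unless defined $raw;
    return map { split /=/, $_, 2 } split /\n/, $raw;
}
```

Usage would look something like `my $store = open_sessions('/some/dir/sessions'); save_session($store, $id, user => $login);` with the path and field names picked to suit the application.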

        Plan, design and document. Put your thoughts down on paper, review, revise then code. I can tell you from experience, if you're designing and coding on the fly, you can end up in all sorts of unscalable and unmaintainable hell. If you need to cut and paste your code into another application - my rule is to abstract it out, define the appropriate class, and bung it into a module. There is nothing worse than having to come back and re-learn your old code... having pod to explain the interface, and inline comments on bits of logic is great. Also, a very handy bit of doco to have is why you *haven't* done things. Many, many hours have been lost to rethinking solutions...

      I like the advice I received from Corion in 198061 above that I shouldn't modularize too early because it creates testing headaches. I also think he's correct in saying that I'll be throwing a lot away. This is especially appealing because I will be working alone and I'll only have user testing to give me direction.

      Ahh, many paths to mastery...

      Thanks for the good advice, though. I like your point about documenting and planning. Yes, I hate the feeling of relearning your code and coming up with future enhancements that you've already considered and forgot.

        I guess the modularisation thing is an agree to disagree thing.. :-)

        One thing I personally hate doing is finding I like a feature in a "standard" bit of code, then having to update it in different places...

        One good part about OO is that you can keep the interface to your module the same, but change the guts of it in any way you like.

Re: Developing a Suite of CGI applications
by perrin (Chancellor) on Sep 15, 2002 at 19:07 UTC
    At this point, there are a number of mature application frameworks for Perl which you should consider using. Take a look at the list here. Most have been discussed on this site, so do a little searching.
      Good call, however the path chosen will depend on the motivation. For example, I built my framework to extend my knowledge of design and abstraction in coding..
        It's not as if there is no coding left to do when you use a framework. At best, it makes some of the boring repetitive parts go away and lets you work on the part that is specific to your application. One reason I recommend looking at these frameworks is that they are all open source and you can learn a lot about application design by looking at their code and seeing how they did it. I learned many things that way.
Re: Developing a Suite of CGI applications
by TGI (Parson) on Sep 17, 2002 at 01:16 UTC

    I've got the same sort of thing going where I work. I've been using Mason as my templating system/framework. Do a quick super search for Mason, I've posted a few times regarding what I like about it (argument handling is a biggy!).

    As to whether you should build a Linux box or use an established HPUX system, I built a Linux box. This has worked out pretty well. Having ownership of the system means that I can use whatever tools I feel like (as long as they don't cost anything ;). If some other person or group has control of the system, you may find yourself having to wage protracted political warfare to get something basic that you need. (Hypothetical case: The friendly HP-UX admin who helped you get set up moves on to a new job, and is replaced by a BOFH.)

    Don't worry about getting more time for your project, if it is useful more and more of your time will be allotted to it. I'm scheduled for 75% of my work time now, when I started this my (previous) manager expressly forbade me from working on it.

    I wonder how common this is becoming. It seems to be a natural outgrowth of LAMP techniques and the addictiveness of Open Source software.


    TGI says moo

      Mason looks interesting, but I'm looking for something with less learning curve and infrastructure. Ideally, I'd like to be able to turn over parts of it to other programmers who will be relative Perl newbies.

      This is why I'm attracted by CGI::Application. I'm also leaning toward HTML::Template. Looks like this combination is fairly simple and has been used together a lot, which means I can count on lots of examples and help.

      One question about templating systems I have. Is anyone aware of templates that can be maintained with standard WYSIWYG HTML editors like FrontPage? We have some FrontPage users in our group. If they could maintain the templates with no help or interaction from me, that'd be a big plus. Of course, they would have to interact with the CGI programmer on the characteristics of form inputs.

        Yes, I edit the .tmpl files with FP very easily.

        There is a good CGI::Application howto - tutorial - slide show with a clear example (and no it's not the Search Form example) completely developed and explained here: http://uniforum.chi.il.us/slides/cgiapplication/ . It deserves a look. It helps a lot for getting started. And the modules it uses are PerlMonk compatibles ;) .
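One reason HTML::Template tends to coexist with WYSIWYG editors is that its tags may be written as HTML comments, which an editor should leave alone. The sketch below assumes a made-up "tickets" page and a hypothetical `render_tickets` helper; the template would normally live in a .tmpl file rather than in a here-document.

```perl
use strict;
use warnings;
use HTML::Template;

# Template text as a FrontPage user might save it. The TMPL tags are written
# as HTML comments so a WYSIWYG editor treats them as ordinary comments.
my $template_text = <<'TMPL';
<html><body>
<h1>Open tickets for <!-- TMPL_VAR NAME="user" ESCAPE="HTML" --></h1>
<ul>
<!-- TMPL_LOOP NAME="tickets" -->
  <li><!-- TMPL_VAR NAME="id" -->: <!-- TMPL_VAR NAME="summary" ESCAPE="HTML" --></li>
<!-- /TMPL_LOOP -->
</ul>
</body></html>
TMPL

# Fill the template: one scalar parameter and one loop of hashrefs.
sub render_tickets {
    my ($user, @tickets) = @_;
    my $template = HTML::Template->new(scalarref => \$template_text);
    $template->param(
        user    => $user,
        tickets => \@tickets,    # one hashref per loop iteration
    );
    return $template->output;
}

# Made-up sample data, purely for illustration.
print render_tickets(
    'helpdesk1',
    { id => 101, summary => 'Password reset' },
    { id => 102, summary => 'Unlock account' },
);
```

The programmer and the template maintainer then only need to agree on the parameter names (`user`, `tickets`, `id`, `summary` here) and on the names of any form inputs.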

Re: Developing a Suite of CGI applications
by Solo (Deacon) on Sep 15, 2002 at 23:44 UTC
    Check out http://www.screen-scraper.com. It's Java, it's not done, I couldn't find the download, but its author is developing it under GNU GPL, it looks like the generalized version of what you want, and at the very least, you might be able to share experience. (There are screenshots, so the author's got something of a working app.)

    And say, this largish application wouldn't be SAP R/3 would it?

    --
    May the Source be with you.

    You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.

      I just looked at it. Looks like it would have to go a long way to be what I need.

      This is not primarily a screen scraper, but rather, scraping screens to get information (following next page links if necessary), then taking that information and issuing Post or Get Forms based on it. I need to perform audits on every page, making sure the application structure hasn't changed in some subtle way, and I want to be able to integrate confirmation dialogues.
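The per-page audit described above could be sketched roughly as follows, using HTML::Form to parse the page. The function name `audit_form`, the shape of the expectation hash and the idea of comparing field names are assumptions for illustration, not a description of the actual scripts; in practice the HTML would come from an LWP response rather than a string.

```perl
use strict;
use warnings;
use HTML::Form;

# Compare the forms on a fetched page against what the script was written for.
# Returns a list of human-readable problems; an empty list means the page
# still looks the way the script expects.
sub audit_form {
    my ($html, $base_uri, %expect) = @_;
    my @forms = HTML::Form->parse($html, $base_uri);

    # Pick the form whose action URL ends with the expected path.
    my ($form) = grep { $_->action =~ /\Q$expect{action}\E$/ } @forms;
    return ("no form posting to $expect{action}") unless $form;

    my @problems;
    push @problems, "method is " . $form->method . ", expected $expect{method}"
        unless uc($form->method) eq uc($expect{method});

    # Named inputs actually present on the page.
    my %seen = map { ($_->name, 1) } grep { defined $_->name } $form->inputs;
    for my $field (@{ $expect{fields} }) {
        push @problems, "missing field '$field'" unless $seen{$field};
    }

    # Fields the script does not know about may signal a changed application.
    my %wanted = map { $_ => 1 } @{ $expect{fields} };
    for my $field (sort keys %seen) {
        push @problems, "unexpected field '$field'" unless $wanted{$field};
    }
    return @problems;
}
```

A script could refuse to submit anything when `audit_form` returns a non-empty list, and log the problems so that a subtle change in the application is noticed before it causes damage.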

      I do the equivalent of all of that, today, with command-line scripts. The most complex one is about 300 lines long and that includes a bunch of cut-and-paste subroutines that I could modularize. All I need to do is wed this kind of thing to CGI applications, an area that Perl shines in, and I'm there.

      Why should I take this detour into the interminable infrastructure monster that is Java development, when Perl makes this stuff so easy?

        And say, this largish application wouldn't be SAP R/3 would it?

      No. I wouldn't call SAP R/3 largish, I'd call it humongous.