in reply to Testing a GUI application

Testing GUIs is one of the most challenging tasks a tester can undertake. Automated testing of GUIs is an order of magnitude harder. You're dealing with user logic, rather than program logic, and no matter how twisted a program's logic is, you can at least inspect it. Users are not so easily tied down, and they are apt to see your GUI in completely different ways to you.

About the only reasonable way to automate GUI tests is to use some sort of external record and playback mechanism. (See Win32::GuiTest if you're running on that platform. I have no knowledge of Prima.)

Attempting to emulate the user from within the script under test forces you to do things like duplicating state that is normally maintained within, and queried from, the GUI objects themselves, and replicating it in alternate storage (as you mentioned).

This is filled with dire consequences. You're duplicating state and it will inevitably get out of sync. You're adding complexity within your application and completely changing the dynamics of the code. If you only use the duplicated state for testing and query the data from the GUI objects for use, then you are not testing your real code.

A mouse/keystroke recorder that lets you replay previously recorded, manually driven test sessions and then compare the states (preferably at each intermediate step as well as the final state) is the only reasonably successful method I have seen.

Mouse actions should be recorded relative to the application window(s), not in absolute screen coordinates, to avoid differences caused by where the system displays the windows.
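
A minimal sketch of that coordinate handling, in Python purely as a neutral illustration. The `Rect` type and both helper functions are hypothetical names, not part of Win32::GuiTest or any other recorder; a real tool would get the window rectangle from the window manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical window rectangle in absolute screen coordinates."""
    left: int
    top: int
    width: int
    height: int


def to_window_relative(screen_x, screen_y, window):
    """Convert a screen point to an offset from the window's top-left corner."""
    return (screen_x - window.left, screen_y - window.top)


def to_screen(rel_x, rel_y, window):
    """Convert a recorded window-relative offset back to a screen point."""
    return (window.left + rel_x, window.top + rel_y)


# Recording: the window happened to sit at (100, 200) when the user clicked.
recorded = to_window_relative(150, 260, Rect(100, 200, 640, 480))

# Playback: the system placed the window at (400, 50) this time.
replayed = to_screen(*recorded, Rect(400, 50, 640, 480))
```

Only `recorded` is stored in the test session, so the click lands on the same spot inside the window wherever the window is displayed on replay.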

The state comparison should be in terms of the text displayed in the controls and their state (highlighted, greyed etc.), and it should be recorded in a manner that allows it to be viewed and edited manually.

Not as a binary file. And not as bitwise comparisons of bitmaps of window or screen captures. Screens vary in size. Windows are usually movable and sizeable. Users can configure their desktops with different colours, fonts and reposition/hide screen elements like toolbars.
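
One possible shape for such a text snapshot, sketched under assumptions: the tab-separated line layout, the "parent/child" ID strings and the function names are all invented for illustration and do not describe the format of any particular tool.

```python
def format_snapshot(controls):
    """Render controls as one editable text line each: id, states, text."""
    lines = []
    for c in sorted(controls, key=lambda c: c["id"]):
        states = ",".join(sorted(c["states"])) or "-"
        lines.append(f'{c["id"]}\t{states}\t{c["text"]}')
    return "\n".join(lines)


def diff_snapshots(expected, actual):
    """Return (lines missing from actual, lines unexpected in actual)."""
    exp, act = set(expected.splitlines()), set(actual.splitlines())
    return sorted(exp - act), sorted(act - exp)


# Hypothetical state captured while recording.
recorded = format_snapshot([
    {"id": "1001/2001", "states": {"enabled"}, "text": "OK"},
    {"id": "1001/2002", "states": {"enabled", "focused"}, "text": "Cancel"},
])

# Hypothetical state captured during playback: the OK button is greyed.
replayed = format_snapshot([
    {"id": "1001/2001", "states": {"greyed"}, "text": "OK"},
    {"id": "1001/2002", "states": {"enabled", "focused"}, "text": "Cancel"},
])

missing, unexpected = diff_snapshots(recorded, replayed)
```

Nothing in the snapshot mentions pixels, positions, colours or fonts, so moving or resizing the window or changing the desktop theme does not produce a difference.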

That's just a description of the best way (I know of) to do your testing, and some of the pitfalls to watch out for. It doesn't really help you with your quest to write tests, but maybe it will be helpful.

Good luck.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Testing a GUI application
by rcseege (Pilgrim) on Nov 04, 2005 at 20:37 UTC
    This is filled with dire consequences.

    Do you mind expanding on this a bit more? As you say, testing GUIs can be extremely difficult, but some toolkits provide a way to generate events and send them to individual components simulating the user interaction.

    I've had some success doing this in Perl/Tk, for example using the $widget->eventGenerate(...) method, though I've learned several painful lessons taking that approach. The test code can be non-trivial to write, and fragile because it depends on the application internals, but effective. It has influenced the way I design Perl/Tk apps (I tend to avoid the everything-in-one-script approach), so that I can create and test individual sections of the application more easily.

    Prima looks as though it supports a similar capability with the $component->notify(...) method, but I just wanted to better understand your point regarding simulating the user.

      First, let me quote a small, but important part of my post from the paragraph before the one you quoted:

      Attempting to emulate the user from within the script under test ...

      That's important, because embedded tests will change the script under test.

      As I said, I know nothing of Prima, and little more of Tk, but your examples, $widget->eventGenerate(...) and $component->notify(...), both suggest that for you to programmatically trigger UI events, you need handles to the components. The problem then becomes: how do you obtain those handles?

      If the test script is an integral part of the program under test, it can, through judicious use of globals, gain access to the handles of the UI components--but as I described, then the test code is modifying the code under test, with all the possibilities for problems that creates.

      Alternatively, if your test program is a separate executable that uses the API's message queue to effect IPC between the test code (TC) and the code under test (CUT), as is certainly possible with Win32::GuiTest as it was with OS/2 Presentation Manager (PM), and I believe is possible with Tk, then the first thing the TC has to do is query the CUT in order to obtain those handles. The problem is: how does it specify the object whose handle it needs?

      With Win32, you can specify an application-determined integer identifier on most component creation calls. This can be used by GuiTest's FindWindowLike() function to retrieve the handle to a specified window. It takes a certain discipline on the part of the GUI writer to ensure that all his windows have unique identifiers, because the API only insists upon them being unique among the peers of a single parent. That means, for example, that applications using MDI interfaces can often have identical IDs within each subframe of the main application. Luckily, or by design, the calls underlying FindWindowLike() also allow the specification of a parent window handle from which to start the traversal of the subhierarchy.
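
      The ambiguity, and the way a starting parent resolves it, can be sketched on a toy window tree. This is a Python model only: `Window` and `find_by_id` are hypothetical names and do not reproduce the Win32::GuiTest interface or its search order.

```python
from dataclasses import dataclass, field


@dataclass
class Window:
    """Toy stand-in for a window: a handle, a control ID, and child windows."""
    handle: int
    ctrl_id: int
    children: list = field(default_factory=list)


def find_by_id(root, ctrl_id):
    """Return the handles of every descendant of root with the given ID."""
    found = []
    for child in root.children:
        if child.ctrl_id == ctrl_id:
            found.append(child.handle)
        found.extend(find_by_id(child, ctrl_id))
    return found


# Two MDI-style subframes, each containing a control with ID 100.
doc_a = Window(handle=10, ctrl_id=1, children=[Window(handle=11, ctrl_id=100)])
doc_b = Window(handle=20, ctrl_id=2, children=[Window(handle=21, ctrl_id=100)])
main = Window(handle=1, ctrl_id=0, children=[doc_a, doc_b])

ambiguous = find_by_id(main, 100)   # searched from the top: two candidates
unique = find_by_id(doc_b, 100)     # searched from one subframe: one candidate
```

      IDs that are unique only among siblings are enough, provided the search always starts from a known parent.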

      Maybe Prima has a similar set of IDs and APIs to discover handles from them, but as far as I am aware, Tk doesn't. So, at least in the Tk environment, you are still left with the problem of how to obtain handles to the components within the CUT from the TC.

      GuiTest also allows this discovery via a regex that matches the text of the window. This again can be combined with a parent window handle from which to start the traversal, but it still has limitations:

      • Not all windows in an application have "window text".
      • Even among those that do, the text of some can change over the lifetime of the window: edit fields, or labels used for informational purposes.
      • It is also possible to have the same text appear in windows at multiple places within the hierarchy. It's not that unusual to find "save" as an option below two different top-level menu items, for example.
      • The text can vary between different versions of the application. As, for example, with internationalised applications.
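
      A small Python sketch of how text-based discovery breaks in the cases above. The `(handle, text)` pairs and `find_by_text` are made up for illustration; the German strings simply stand in for an internationalised build.

```python
import re


def find_by_text(windows, pattern):
    """Return handles of windows whose text matches; windows without text never match."""
    rx = re.compile(pattern)
    return [handle for handle, text in windows if text and rx.search(text)]


# Hypothetical (handle, window text) pairs; handle 5 has no window text at all.
english = [(1, "File"), (2, "Save"), (3, "Project"), (4, "Save"), (5, None)]
german = [(1, "Datei"), (2, "Speichern"), (3, "Projekt"), (4, "Speichern"), (5, None)]

same_text_twice = find_by_text(english, r"^Save$")   # two "Save" items match
other_language = find_by_text(german, r"^Save$")     # nothing matches
```

      The same pattern finds too many windows in one build and none in the other, and the window with no text can never be found this way.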

      All of these possibilities make writing and maintaining a TC that relies on either the window text for handle discovery or intimate knowledge of the UI component hierarchy, perhaps with different versions of the TC for different versions of the application, an expensive nightmare. Small changes to the UI through bug fixes or feature enhancements mean that the whole regression test suite must be rewritten.

      It also makes writing the TC a complex and expensive business in the first place. Working out the relationships and IDs required to allow you to obtain the handles that you need to drive the application remotely is an error-prone process when done manually.

      The alternative I suggest, using a recorder, can easily perform this discovery--at least with Win32 or PM this is the case--as it can use the API WindowFromPoint() to do so. That is, when you click the mouse, the point at which you click is routinely translated from screen coordinates into target application coordinates and thence, via the display manager code and the bounding boxes of the visible windows, into the handle of the window in which you clicked. This is done every time you click; it is how window events are directed to their recipient window procedures.
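
      The bounding-box lookup can be modelled in a few lines of Python. This is a simplified model of the idea, not how any window manager implements it: the `Win` type is hypothetical, and it assumes children are drawn on top of their parent and later siblings on top of earlier ones.

```python
from dataclasses import dataclass, field


@dataclass
class Win:
    """Toy window with a bounding box in screen coordinates."""
    handle: int
    left: int
    top: int
    right: int
    bottom: int
    children: list = field(default_factory=list)

    def contains(self, x, y):
        return self.left <= x < self.right and self.top <= y < self.bottom


def window_from_point(win, x, y):
    """Return the handle of the deepest, topmost window containing (x, y), or None."""
    if not win.contains(x, y):
        return None
    # Assumption: later siblings are higher in the z-order, so check them first.
    for child in reversed(win.children):
        hit = window_from_point(child, x, y)
        if hit is not None:
            return hit
    return win.handle


frame = Win(1, 0, 0, 800, 600, [
    Win(2, 10, 10, 110, 40),   # e.g. a button
    Win(3, 10, 50, 110, 80),   # e.g. an edit field
])

on_button = window_from_point(frame, 20, 20)
on_edit = window_from_point(frame, 20, 60)
on_frame = window_from_point(frame, 500, 500)
outside = window_from_point(frame, 900, 10)
```

      The recorder never has to be told which component it is dealing with; the click position alone yields the handle.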

      Having obtained the window handle of the component you clicked, it can then use this to query the IDs of both that component and, from that, its parent. This combination of parent and child IDs should, in any given GUI app, uniquely identify the component. And do so in a way that even if extra elements and layers are added to the application, and in the face of internationalised versions, will still allow the playback tool to (re)discover the handle(s) of the appropriate components when the tests are being run.
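
      A sketch of recording and later resolving such a parent/child ID pair, again on a toy tree in Python. `Node`, `record_key` and `resolve_key` are hypothetical names, and the sketch assumes the pair really is unique within the application, as described above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy component: a run-specific handle, a stable ID, and children."""
    handle: int
    ctrl_id: int
    children: list = field(default_factory=list)


def record_key(root, handle, parent_id=None):
    """Return (parent ID, own ID) for the node with this handle, or None."""
    if root.handle == handle:
        return (parent_id, root.ctrl_id)
    for child in root.children:
        key = record_key(child, handle, root.ctrl_id)
        if key is not None:
            return key
    return None


def resolve_key(root, key, parent_id=None):
    """Return handles of all nodes whose (parent ID, own ID) pair equals key."""
    found = [root.handle] if (parent_id, root.ctrl_id) == key else []
    for child in root.children:
        found.extend(resolve_key(child, key, root.ctrl_id))
    return found


# Recording run: the user clicked the component with handle 102.
recorded_app = Node(100, 1, [Node(101, 50, [Node(102, 7)])])
key = record_key(recorded_app, 102)

# Playback run: handles differ, and a new layer (ID 40) and a new sibling
# (ID 8) have been added since the recording was made.
replayed_app = Node(900, 1, [Node(901, 40, [Node(902, 50, [Node(903, 7), Node(904, 8)])])])
target = resolve_key(replayed_app, key)
```

      Handles change from run to run, so only the ID pair goes into the recording; the handle is rediscovered at playback time.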

      It is that combination of the automation of the TCs and a high degree of independence from changes within the CUT, that makes the recorder/playback approach so effective. It allows you to generate TCs more quickly, reuse them more frequently, and so test more thoroughly for much lower cost.

      Combine that with a feature to take frequent snapshots of the UI state, again in a position independent way of querying hierarchies of IDs, component states (grayed/highlighted etc.) and texts, and you have a quick, reliable, automated mechanism for exercising GUIs.

      If the form of the recordings is such that you can view and edit them with a standard text editor, it allows small corrections to be made easily, and for internationalised applications you can quickly replicate the generated tests for other languages by editing the texts recorded. If your internationalisation follows the good practice of using application atoms to specify window texts, then this step can either be easily automated, or even be completely unnecessary if the atom numbers remain static and the associated texts are selected at runtime.

      Having the playback again take a snapshot of the state of the CUT whenever the recorder did so, and compare IDs, states and text atoms (or preferably their associated atom numbers), renders the tests independent of screen sizes, color schemes, font selections etc.
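
      Why comparing atom numbers rather than displayed text helps can be shown with a short Python sketch. The string tables, control IDs and function names are invented for the example; real atom tables would come from the application's resources.

```python
# Hypothetical string tables: the atom numbers are stable, the texts are per language.
EN = {1: "Save", 2: "Cancel"}
DE = {1: "Speichern", 2: "Abbrechen"}


def snapshot_by_text(controls, strings):
    """Snapshot of (control ID, state, displayed text) for one language."""
    return [(cid, state, strings[atom]) for cid, state, atom in controls]


def snapshot_by_atom(controls):
    """Snapshot of (control ID, state, atom number), independent of language."""
    return [(cid, state, atom) for cid, state, atom in controls]


# (control ID, state, text atom) as seen in the recording run and in the replay run.
recorded_controls = [(2001, "enabled", 1), (2002, "greyed", 2)]
replayed_controls = [(2001, "enabled", 1), (2002, "greyed", 2)]

# Recorded on the English build, replayed on the German one.
text_match = snapshot_by_text(recorded_controls, EN) == snapshot_by_text(replayed_controls, DE)
atom_match = snapshot_by_atom(recorded_controls) == snapshot_by_atom(replayed_controls)
```

      With text snapshots the recording has to be edited for each language; with atom numbers the same recording can be replayed unchanged.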

      There are many tools available for performing this type of testing; the trick is selecting the good ones.

      How much of this is applicable to Prima I am not sure, but if the test tool is querying the state of the GUI via the native window manager's APIs rather than the GUI toolkit's, it should be fairly independent of the latter. Of course, you then need different tools for each platform on which you expect to run, but then there is no free lunch.

