ASTEC project: Automated Testing
Bengt Jonsson, Uppsala University
Participating Research Groups:
Uppsala University, Dept. Computer Systems.
Participating Companies:
Primarily Validation AB, but also Telelogic AB and Volvo Teknisk Utveckling AB
The overall philosophy is to develop
techniques for automating the
testing embedded systems, communication protocols, and telecom services.
The current particular goals of the project are:
- Techniques to support testing of WWW Services.
The purpose of the project is to extend existing testing techniques to
the domain of WWW services. The main problems addressed are to define
and generate modeling structures suitable for test generation, and techniques
for test generation from such structure.
- Techniques for API Testing.
This subproject, which in some sense is a continuation of the previous, aims
at developing techniques for generating tests for systems with a significant
internal state component, e.g., a database or a file system. A major
problem is to use abstract and simple-to-generate models of the internal
state of such a system. Notions of
coverage etc. will be adapted to this situation.
In current software development technology, testing is the dominant form
of verification of software products. Testing occurs late
in the development process, and it is often there that errors and mistakes
are discovered.
Therefore, techniques that can support the generation and evaluation of
tests by automated tools are important.
Presently,
industry performs most testing activities with a rather low degree of
automation. Testing includes the following activities (among others):
- Producing a specification on some form. This is,
of course, far from trivial in practical situations.
- From the specification one generates test purposes.
These can be seen as reformulations of requirements on a form which
is suitable for testing. Each test purpose should represent a specific
specification item, which can be tested separately.
- Generation of actual tests, in the form of test suites,
or test scripts from test
purposes.
- Execution of tests, either manually or using a tool for test execution.
The quality of a test process can be measured by different notions of
coverage, which measure how much of the specification or of the
implementation has been tested and found correct.
Tools and techniques have been developed to support different parts of
this process. In this section, we consider some particular aspects.
Testing theory starts in a classical theory for testing the input-output
behavior of sequential procedures. Central problems in this area are:
how to generate test sets that achieve a certain coverage, and how
to detect whether the output in a test case is correct or incorrect.
Work on achieving coverage is usually based on a model of the control
flow of the procedure, and can very abstractly be
seen as one instance of a very general principle in testing,
formulated by Beizer [Bei90, p. 121, bottom] as
"testing consists of defining useful graph models and
covering them". In the classical theory of testing, the control
flow of a program is modeled as a graph, and measures of coverage
are defined on this graph. In testing of reactive systems,
a graph can model the control aspect of a system's behavior, and
test purposes can be translated into sets of paths in this graph.
References include [MPS99]
In Protocol Testing, there is a long tradition of research on
deriving test suites from a State Machine Specification of a protocol
entity. Assume that a specification of an entity is given as an
EFSM (Extended Finite State Diagram) state
transition diagram, where nodes represent (major) control states,
and where transitions are annotated with guards and operations on
data variables. In the years around 1990, considerable research concerned
generation of test suites to achieve a certain coverage, e.g., executing
all state transitions. Some references are
[SL89,VCI90,Hol91].
Current widely distributed tools for generating test suites from
state machine specifications, e.g., in SDL, often do not implement
these techniques. They seem not so concerned with coverage, maybe for
the reason that high coverage would result in too large test suites.
Instead, test sequences are derived from a combination of a test purpose,
which could be e.g., an MSC or an observer which specifies a
test objective, and a state machine specification in SDL.
The realization of this technique in Autolink and TestComposer is
described in, e.g.,
[SEG$^$98,Gra93].
Research papers describing related techniques are
A commonly used technique in practice for generating test suites is
to exercise them manually, and record them so that next time they
can be automatically executed.
This is the idea of so-called
Capture/Replay tools.
During manual testing, the tool records the sequences of
interactions in test sequences. These sequences can thereafter be
replayed to the system under test in order to repeat the test,
possibly in another context. The capture is recorded as a script,
and some tools offer support for manipulating scripts, e.g., to
change parameters in commands. It should be noted that a C/R tool
is basically a tool to assist in manual definition of test sequences,
it does not assist in achieving coverage, etc.
Tools for Capture/Replay testing include WinRunner, and tools from
Rational. These tools work on the level of user interface in a
Windows environment.
they have the limitation that
during playback, the tested system should be configured in
the same way as during recording, e.g., that the same objects have been
created on the screen of the user.
Tools for Capture/Replay testing in the WWW domain include
CAPBAK/Web from Software Research Inc.
http://www.soft.com/Products/Web/CAPBAK/capbakweb.html
This tool is based on a
browser (Microsoft Explorer), with added capabilities for recording HTTP
commands, and for navigation and measurement. It can be user as a Web
spider, or to measure web page latency, etc.
A problem in any testing is
to check that the output of the system conforms to its requirements,
in particular safety requirements. This problem is often referred to as the
oracle problem.
A common approach to the oracle problem is to select certain
combinations of input values, and combine them with corresponding
correct output values to form test cases or test sequence.
Besides the problem of finding the correct output values, this approach
assumes that the system under test is deterministic, i.e.,
that output values are uniquely determined by input values
and their occurrence in time, an assumption that is not valid in general for
distributed embedded systems.
An alternative approach is to
generate a test oracle from the safety requirements.
A test oracle is a separate module, which observes both the input and
the output of a component, and is able to determine whether the
observed sequence of inputs and outputs conforms to the (safety)
requirements for the system. This approach does not assume
unique output values, and solves the oracle problem in one construction.
A potential problem is that it may not be trivial
to construct a test oracle for an arbitrary requirement.
Work on the generation of test oracles include work
in the framework of the
language LUSTRE. Ouabdesselam et al [OP94,PO96]
and Raymond et al [RNHW98] expresses requirements
as synchronous observers in LUSTRE. These can be seen as test oracles
that are expressed in a high-level executable language.
Raymond et al [RNHW98] also
present methods for generating relevant inputs in
test sequences
O'Malley et al. [ORD96] present a technique for automatically
generating
test oracles from formulas in GIL (Graphical Interval Logic), an interval-based
temporal logic. The translation resembles the generation of deterministic
finite automata from regular expressions.
The oracle generation
is part of the TAOS (Testing with Analysis and Oracle Support) toolkit. It is
described by Richardson [Ric94], where plans to produce
oracles from the real-time logic RTIL are mentioned.
The Temporal Rover [Dru00] is a tool which generates
executable code in, e.g., C, C++, or Java, from temporal logic assertions
that are included as comments in the source file. The Temporal Rover can
be seen as a test oracle generator.
Generation of test oracles have also been considered in the context of
simulation, where they are usually called simulation checkers. An
example is FoCs [ABGaYW00], developed at IBM Haifa Research Laboratory,
which generates simulation checkers from propositional temporal logic
specifications.
Approaches to generating oracles for non-reactive procedures include
Peters and Parnas [PP98], who derive oracles for input-output
specifications from a formal
notation based on boolean expressions and bounded quantification.
There are many tools
that allow automated execution of test sequences:
- Testing from TTCN using TAU/Autolink [SEGHK 98] from Telelogic.
- Playback tools that use various scripting languages.
- Any scripting language can be used for automated test execution.
Previously, the project has consisted of a number of different activities
that have run in parallel, mostly as M.Sc. thesis projects.
For the continuation, we will focus on one major thread, viz. testing of
Web Services, and give less effort to the other tracks.
The purpose of the project is to contribute to the technological basis for
testing, and to do this in a framework which is relevant to
development of telecommunication services. It seems that the most
sensible framework to choose is that of WWW based services, since this
world comes with standards and a multitude of support tools that can be
used for the practical work. Care must be taken not to merely repeat what
already exists in the commercial world (e.g., for Capture/Playback tools.
Research work in this area includes work by Ricca and Tonella
[RT01] who describe a system for creating a graph of the
WWW pages at a web site. The graph is generated by a web spider (called
ReWeb). The approach works for static HTML pages, maybe also for
dynamically created static pages. Based on the graph, a tool
can perform static analysis of the graph structure, e.g.,
detecting dead pages by comparing with the pages that are present in the file
system, or describing which pages must be traversed in order to reach
a certain page.
Another tool (TestWeb) can use the graph to produce test suites
according to some given coverage criterion.
The project, then will focus on the following problems, given in the
order that they will be addressed
The aim of his work to extend the work of the ReWeb tool developed by
Ricca and Tonella. A web spider collects pages by navigating the Web site.
The pages are organized into a graph. For pages which are static, and
have a very simple frame structure, this is not so difficult. We propose
here to address the following problems in this context.
- Generating graphs of dynamic pages with varying data values
For dynamically generated pages, where the "same" page may appear
with varying contents, e.g., data which depends on user input,
database content, time of day, etc., it may be difficult to
create a map of the web site, since it may be difficult to
know when to identify two variants of the same page.
We propose to attack this by defining a suitable part of the HTML
page as "control", and the rest as "data". When constructing the graph,
the tool will identify nodes with the same "control", and if possible try
to present the "data" part in a structured format.
This approach is under investigation in an M.Sc. project
in collaboration with Validation AB, due to be finished by 01-06.
A description of the M.Sc. project can be found
at http://www.docs.uu.se/astec/astec-testing/Telia/mscproj-00.html.
Abstractly seen, this subproject will develop techniques for generating
state machines from observed sequences of
interaction between a system and its environment.
In order to be able to build an extended finite-state model of the
system, it must be possible to define and observe a "control state"
of the system.
- Developing appropriate modeling techniques for web sites with complex frame structure
The naive approach of describing a web site as a single graph is
appropriate when the web site consists of entire pages, which are loaded
one by one into the browser. In the case of frames, the picture is less
straight-forward: by performing an interaction with one page, the contents
of another page may change (this is very common: e.g., having one frame with
navigation buttons which change the content of another frame). A more
accurate description in this case will describe the site as a
hierarchically structured graph, maybe similar to a StateChart, where
transitions in one component synchronize with transitions in other
components (it could be an advantage to use UML in this context).
The purpose of this project is to develop a technique for
modeling the interactive aspect of a Web Site as a hierarchically structured
graph. The initial effort will be to describe how to extract
such information from the HTML code. Thereafter the techniques
developed in the preceding subproject will also be adapted to this
formalism.
- Extending the above by static analysis of CGI scripts.
A continuation of the project may complement the previous
efforts by techniques for static analysis of CGI scripts etc., with the
purpose of extracting control flow information which assists in building
a structure on which to base test generation.
.
In analogy with other testing and static analysis efforts, techniques
will be developed for generating tests, and for
performing various analyses of the graph structure. The project will adapt
existing notions of coverage, possibly extended with other notions that are
developed in the Secondary Track, as a basis for test generation.
For static analysis, one might possibly
envisage a connection to the MetaFrame tool, where such analyses are
formulated in a generic logic (mu-calculus), and thereafter compiled
into analysis code.
This track has long-term goals, and aims to increase the understanding
of fundamental issues in testing. Testing of sequential procedures
is a rather well-researched topic. In comparison, much less is
understood about how to test reactive systems with a significant
internal state, e.g., the API of a file system, or a WWW service with
a database component, etc. Any meaningful notion of
coverage when testing these system must be based on some model of the
systems internal (persistent) state. As an example, the theory of
protocol conformance testing, described previously,
uses a state machine model of the protocol
and defines coverage in terms of this machine.
For a database application, one would like to model the internal
database state. Very abstractly, this state could be viewed as
an input of each operation, and classical testing theory could be
used to partition the possible internal states into equivalence classes.
For instance, for a file system such equivalence classes could be:
file system empty, full, almost empty, almost full, etc.
API testing, using a model of the internal states of the system has been
done, e.g., for GUI testing [MPS00]. There are many other
examples. A major problem in this context is of course to create the
internal model in the first place. In protocol testing, the protocol is
often defined as a state machine, so that the model is available from
the start. In more general domains, such models are usually not created.
A related line of work is concerned with the use of "reference models"
in testing, i.e., when an independent implementation of the system
is available. This is the case, e.g., when a system has been ported to
a new language or a new platform; the old implementation can then
serve as reference model when testing the new implementation.
The protocol will address the problem to model internal state
in API testing. The major issue will be the use of how to create
crude, abstract models of a system's internal state, and how to use
these in system testing. Consider for example the testing of a file system.
A model of the file system can be created at various level of abstraction.
Very crudely, there may be (say) only three states: "empty", "full", and
"neitheremptynorfull". More elaborate models may model aspects of
storage, etc. at some level of detail.
Similar reasoning can be made in testing of database systems, etc.
To summarize the previous paragraph, the project will address the following
work items:
- Development of theory and techniques for using abstract models of a systems
internal state, and how to use these for
- Generating sequences
- Defining coverage
- having the abstract models as test oracles.
- Experimental evaluation and development of techniques.
The obvious choice of experimentation platform is to use one
in conjunction with
We foresee to develop the above techniques within the context of
testing systems with WWW-based interfaces. One advantage of this
approach is that generic tools can be developed, that standard
data formats are defined, and that there are
many generic tools available. For instance, the M.Sc. project which is
now in progress, has as part of its work developed a simple
Capture/Replay tool from components already available in Java.
http://www.docs.uu.se/astec/astec-testing/report-01-01.html
has a progress report from the project for 2000.
- For embedded systems in automotive applications, a technique
for generation of test oracles from formal requirements in a
real-time temporal logic has been developed and implemented.
Real-time logics are suitable for specifying requirements for embedded
systems and safety-critical applications. The technique uses the connection
between temporal logic and automata theory, which is well-established in
model-checking, extended with some techniques for handling data.
An M.Sc. project at
Volvo Technical Development and Uppsala University
has developed and implemented
a technique for translating formal requirements written in
TRIO, a real-time temporal logic,
into FIL (Fault Injection Language), a language
used for programming test equipment at Volvo Technical Development.
As example requirements, the project uses
safety requirements for an idealied cruise controller developed by Volvo
Technical Development. In an earlier M.Sc. thesis by Johan Nielsen
at Prover Technology AB, these requirements have been formalized
in TRIO. This work is reported in John Haakansson's M.Sc. thesis
[Håk00].
A slight generalization of the work, for temporal logics in general is
reported in a draft paper [HJL99,HJL00].
Currently, the work is continued in a new M.Sc. project at Volvo
Technical Development, in which a simulation model for distributed
systems using the CAN bus is developed in Simulink. The oracle generation
algorithm will be used to derive simulated testers in this environment.
- For telecom applications, a similar technique is being developed for
generating test oracles from requirements formulated
in the MSC (Message Sequence Charts) notation, according to the
MSC 2000 standard by ITU-T.
An M.Sc. project at Telelogic AB and Uppsala University
is developing an execution model for the MSC 2000 standard.
This execution model can directly be used to generate test oracles
from requirements written in MSC, and can also be used to generate
test sequences. A basic motivation for this M.Sc. project is
to simplify the generation of test sequences from MSCs, which
is performed by the tool Tau.
In the current version of Tau, MSCs do not carry enough information
to generate test sequences. For instance, MSCs do not speficy the
values of message data parameters, and do not specify initialization
(preamble) that precedes a test sequence. The new standard MSC2000
contains facilities for specification of data parameters, and for
specifying high-level control structures.
This work is reported in the M.Sc. thesis of Gerardo Padilla,
and in [JP01].
- The main track of the project proposal can be seen as a continuation of
previous work in the project.
An M.Sc. project at Validation AB [Löf00] has produced
a proposal for constructing functional specifications of
WWW applications in a graphical structure which can, but need not, be
similar to the linked structure of HTML pages used for the application.
An specification of a small WWW-based booking
service has been carried out in this
format, together with test scripts.
Currently, an M.Sc. project is developing and implementing techniques
for constructing graph-structured functional specifications of services with
WWW-based interfaces, automatically or
semi-automatically, from observed interactions with the service. The main idea
of the project is to record sequences of HTML pages, and from these
sequences attempt to construct a graph which can generate such sequences.
The project is planned to finish
by 01-05, and to be reported in the M.Sc. thesis by J. Boustedt.
The project is expected to run from 01-01-01 and onwards. The
activities will be carried out by the following people
- Senior Staff:Bengt Jonsson UU, 20
senior researcher 25
- Ph.D. Student:One Ph.D. student will be recruited,
employed by Uppsala University and (potentially) at Validation AB,
starting summer 2001.
A second Ph.D. student will be recruited at the beginning of 2002.
- M.Sc. Students: Several M.Sc. Projects will be carried out in
association with the project, performing work in the context of different
companies, such as
Telelogic AB, Validation AB and Volvo Teknisk Utveckling AB
- ABGaYW00
-
Y. Abarbanel, I. Beer, L. Gluhovsky, and S. Keidar an Y. Wolfsthal.
FoCS: Automatic generation of simulation checkers from formal
specifications.
In Emerson and Sistla, editors, Proc. Int. Conf. on
Computer Aided Verification, volume 1855 of Lecture Notes in Computer
Science, pages 538-542, 2000.
- Bei90
-
B. Beizer.
Software Testing Techniques.
Van Nostrand Reinhold, New York, 1990.
- Dru00
-
D. Drusinsky.
The remporal rover and the ATG rover.
In K. Havelund, editor, SPIN Model Checking and Software
Verification, Proc. 7th SPIN Workshop, volume 1885 of Lecture Notes in
Computer Science, pages 323-330, Stanford, California, 2000. Springer
Verlag.
- Gra93
-
J. Grabowski.
The generation of TTCN test cases from MSCs.
Technical Report IAM-93-010, University of Berne, Institute for
Informatics, April 1993.
- Håk00
-
J. Håkansson.
Automated generation of test scripts from temporal logic
specifications.
Master's thesis, Uppsala University, 2000.
- HJL99
-
J. Håkansson, B. Jonsson, and O. Lundqvist.
Automated generation of test scripts from temporal logic
specifications.
In 11th Nordic Workshop on Programming Theory, Uppsala,
Sweden., Oct. 1999.
- HJL00
-
J. Håkansson, B. Jonsson, and O. Lundqvist.
Generating on-line test oracles from temporal logic specifications,
2000.
Manuscript, submitted.
- Hol91
-
G.J. Holzmann.
Design and Validation of Computer Protocols.
Prentice Hall, 1991.
- JP01
-
B. Jonsson and G. Padilla.
An execution semantics for NSC2000.
In Proc. 10th SDL Forum, Lecture Notes in Computer Science,
Copenhagen, Denmark, June 2001. Springer Verlag.
- Löf00
-
D. Löf.
Automated testing of WWW applications.
Master's thesis, Uppsala University, 2000.
- MPS99
-
A.M. Memon, M.E. Pollack, and M.L. Soffa.
Using a goal-driven approach to generate test cases for GUIs.
In Proc. 21st Int. Conf. on Software Engineering, pages
257-266, May 1999.
- MPS00
-
A.M. Memon, M.E. Pollack, and M.L. Soffa.
Automated test oracles for GUIs.
In Proc. FSE-8, 8th Int. Symp. on the Foundations of
Software Engineering, San Diego, CA, Nov. 2000.
- OP94
-
F. Ouabdesselam and I. Parissis.
Testing synchronous critical software.
In Proc. 5th Int. Symp. on Software Reliability Engineering,
pages 239-248, 1994.
- ORD96
-
T.O. O'Malley, D.J. Richardson, and L.K. Dillon.
Efficient specification-based test oracles for critical systems.
In Proc. 1996 California Software Symposium, April 1996.
- PO96
-
I. Parissis and F. Ouabdesselam.
Specification-based testing of synchronous software.
In Proc. 4th ACM SIGSOFT Symp. on Foundations of Software
Engineering, pages 127-134, 1996.
- PP98
-
D.K. Peters and D.L. Parnas.
Using test oracles generated from program documentation.
IEEE Transactions on Software Engineering, 24(3):161-173,
March 1998.
- Ric94
-
D. J. Richardson.
TAOS: Testing with analysis and oracle support.
In Proc. ACM SIGSOFT International Symposium on Software
Testing and Analysis, Aug. 1994.
- RNHW98
-
P. Raymond, X. Nicollin, N. Halbwachs, and D. Weber.
Automatic testing of reactive systems.
In RTSS98, 1998.
- RT01
-
F. Ricca and P. Tonella.
Building a tool for the analysis and testing of web applications:
Problems and solutions.
In T. Margaria and W. Yi, editors, Proc. TACAS '01,
Int. Conf. on Tools and Algorithms for the Construction and Analysis of
Systems, Lecture Notes in Computer Science, 2001.
- SEG$^$98
-
M. Schmitt, A. Ek, J. Grabowski, D. Hogrefe, and B. Koch.
Autolink - putting sdl-based test generation into practice.
In 11th Int. Workshop on Testing of Communicating Systems
(IWTCS'98), Tomsk, Russia, Sept. 1998.
- SL89
-
D.P. Sidhu and T.K. Leung.
Formal methods in protocol testing, a detailed study.
IEEE Trans. on Software Engineering, SE-15(4):413-426, 1989.
- VCI90
-
S.T. Vuong, W.Y.L. Chan, and M.R. Ito.
The UIOv-method for protocol test sequence generation.
In Proc. 2nd Int. Workshop on Protocol Test Systems, pages
161-176. North-Holland, 1990.