Statistical Testing of Web Applications Paper Comments

Statement of the Problem/Goals:

Overall problem

Websites used to be static but are now dynamically generated. That makes them hard to test, even though testing them and getting them running correctly is really important.

Goals

Use a dynamic analysis technique to make testing web applications easier by modeling the application (not necessarily a goal to automate the entire process, though). Be able to “properly model” “both dynamic and static pages.” (p. 104)

Contribution to state-of-the-art

The beginnings of a new way to test web applications. Ideas for how to approach this new problem - model the web application (we sort of do this …).

Technical Approach:

Key insights

Use access logs that are automatically recorded by the Web server to generate a realistic model of the web app that can be “interpreted as a Markov chain on which statistical testing can be conducted.” (p. 104)

Navigation is probably one of the most important factors for the model.

Statistical testing (maybe not a new insight) – the parts of the application that are used more often should be most thoroughly tested (because it's hard/impossible to test the whole thing).

Overall approach/strategy

Use access logs to create a Markov model of the web application based on visited URLs (navigation). Use this to estimate the reliability of the Web app and prioritize the execution of test cases … and to help decide when to stop testing (p. 115).
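To make the approach concrete for myself, here is a minimal sketch of how I picture it. This is my own illustration, not the authors' tool. It assumes the access log has already been split into per-user sessions (lists of visited URLs), and the `<start>`/`<end>` states, the function names, and the example URLs are all made up.

```python
import random
from collections import Counter, defaultdict

START, END = "<start>", "<end>"  # hypothetical entry/exit states

def build_markov_model(sessions):
    """Estimate transition probabilities from per-user URL sequences."""
    counts = defaultdict(Counter)
    for urls in sessions:
        path = [START, *urls, END]
        # Count every observed edge (src -> dst) along the session.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    # Normalize edge counts into probabilities per source page.
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }

def sample_path(model, rng, max_steps=50):
    """Random walk from START; frequently used edges get picked more often."""
    state, path = START, []
    for _ in range(max_steps):
        targets = list(model[state])
        weights = [model[state][t] for t in targets]
        state = rng.choices(targets, weights=weights)[0]
        if state == END:
            break
        path.append(state)
    return path

# Made-up sessions, as if already extracted from an access log.
sessions = [
    ["/index", "/search", "/item"],
    ["/index", "/search", "/search", "/item"],
    ["/index", "/login"],
    ["/index", "/search"],
]
model = build_markov_model(sessions)
print(model["/index"])                        # /search: 0.75, /login: 0.25
print(sample_path(model, random.Random(0)))   # one navigation path to test
```

Each sampled path is only a navigation sequence. The inputs to type into forms along the way are not in the model, which is the gap discussed further down.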

Discussion/Critique:

How did they evaluate their efforts?

“The number of failures occurring in such test sessions can be used to estimate reliability, since the behavior of users is stochastically reproduced in the test cases.” (I don't actually totally get this, and I think this might be their strategy for evaluating the web app they're trying to test, not their own methods …)
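One way I can make sense of the quote: if test sessions are drawn the way real users navigate, then the fraction of sessions that run without a failure is an estimate of how often a real session succeeds. The sketch below uses that simple failure-free-fraction estimate. That formula is my assumption, not something I have confirmed is the paper's exact formula, and the numbers are invented.

```python
def estimate_reliability(failed_sessions, total_sessions):
    """Fraction of statistically sampled test sessions that did not fail."""
    if total_sessions <= 0:
        raise ValueError("need at least one test session")
    if not 0 <= failed_sessions <= total_sessions:
        raise ValueError("failed_sessions must be between 0 and total_sessions")
    return 1 - failed_sessions / total_sessions

# Invented numbers: 3 failing sessions out of 200 sampled from the usage model.
reliability = estimate_reliability(failed_sessions=3, total_sessions=200)
print(f"estimated reliability: {reliability:.3f}")
```

Read this way, the number describes the web app under test, not the quality of the modeling technique itself.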

They looked at the number of failures they got when they actually tested the apps with these models. BUT is this actually a good way to evaluate? The input values weren't in their generated model, so how did they decide what to plug in, and how do they know the inputs they chose aren't what's determining the failure counts?

Conclusions from evaluation results

It's really hard to test web applications. When you make these Markov models you get an almost fully connected graph. The occurrence of a failure depends on the followed path and inserted inputs … so even though they seem to think their model is a good jumping-off point (which it is), they acknowledge that the values need to be addressed.

What application/useful benefit do the researchers/you see for this work?

A way to generate models that make it easier to test web applications. A jumping-off point for making it even easier … they didn't seem to be thinking about the automation of the testing process, but this is definitely a part of what we do.

Limitations mentioned

Failure depends on navigation and values (see above). Used a complex web application for one of their case studies, so it's hard to tell how things are working in there.

Back, forward, go-to commands from web browser not considered in edge traversal counts.

Additional limitations

Too many manual steps (p. 113).

Questions!

p. 103 – Why is advertising the motivation for learning to make testing web applications easier?

p. 116 – “The probability that the user exercises a path not seen during testing is 0.22%” –> This sounds really really good, but is it? I'm not sure exactly what that means.
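One possible reading of that number (my guess, not checked against the paper's derivation): under the Markov model every path has a probability equal to the product of its edge probabilities, so the probability of hitting an untested path is one minus the total probability of the distinct paths that were tested. A toy example with an invented model and invented tested paths:

```python
from math import prod

def path_probability(model, path):
    """Probability of one complete path = product of its edge probabilities."""
    return prod(model[src][dst] for src, dst in zip(path, path[1:]))

# Invented toy model: every session starts at /index, then goes one of two ways.
model = {
    "start": {"/index": 1.0},
    "/index": {"/search": 0.75, "/login": 0.25},
    "/search": {"end": 1.0},
    "/login": {"end": 1.0},
}

# Distinct complete paths that were exercised during testing (hypothetical).
tested = {("start", "/index", "/search", "end")}

covered = sum(path_probability(model, p) for p in tested)
print(f"P(user follows an untested path) = {1 - covered:.2%}")  # 25.00%
```

If that reading is right, 0.22% would mean the tested paths cover almost all of the observed usage, but it says nothing about input values, only about navigation.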

p. 116 – Why do they start off with such a complex web app? We're still using pretty simple ones … that seems weird to me.

p. 123 – “For Web applications with an internal state having dependencies on the specific path followed, this results in a limited efficacy of the test phase. It is thus preferable to limit as much as possible the dependencies of the internal state on the path followed. Ideally, only the immediate predecessor should affect the internal state of the application.” – If I am reading this correctly, this is hilarious. Do they really expect that people are going to make Web applications that are simpler and not as cool just so that they can be tested more easily?
