failing Evosuite test cases #1

martinezmatias · 2018-03-08T18:39:10Z

@SophieHYe
Evosuite generates test cases that fail on the version used for generating them.
A test case reproduces the bug

I ran the Evosuite command on the correct version of the QuixBug repository (i.e., without our infrastructure) and the problem persists.
The command used was: java -jar <local_path>/quixbugs-experiment/test_executors_java/./libs/evosuite-master-1.0.6-SNAPSHOT.jar -class correct_java_programs.POSSIBLE_CHANGE -projectCP <local_path_to_quixbug>/target/classes -base_dir <local_path_to_quixbug>/src/main/java/correct_java_programs_test -Dglobal_timeout 120 -seed 1 -Drandom_seed 1 -Dsearch_budget 60 -Dstopping_condition MaxTime -Dsandbox false -Dno_runtime_dependency true -mem 2000 -Djunit_check true

SophieHYe · 2018-03-09T10:46:54Z

@martinezmatias
Hi Matias,

I looked at the possible_change tests generated by Evosuite. I think the problem is that our "correct" version is not exactly correct. Note that the failing tests have no assertions in the test method. Further, these failures are caused not by an assertion error but by a StackOverflowError.

In the correct version, two issues are not handled well, and Evosuite exposes both of them:
(1) First, the tests fail with a StackOverflowError. The correct version is missing some conditions to stop the recursive method calls.
test2/test4/test5 from seed 1 expose this problem: when coins is not null and the total always stays a positive integer, the recursion does not stop.
(2) Second, test7 passes, but it catches a NullPointerException, which is not handled well in the correct version. (This is my fault in writing the correct Java version according to their correct Python version.)

The correct version passed all the test cases we generated from the JSON tests because those tests did not expose these problems, which made us think it was the correct version.

To test it, I translated Evosuite test2/test4 to JSON ([[[-2124], 1], 20]; [[[0, 0, 1161, -2685, 639], 639], 20]) to test the correct Python programs, and they failed as well with similar errors:

[[[-2124], 1], 20]
Correct Python: (<class 'RecursionError'>, RecursionError('maximum recursion depth exceeded in comparison',), <traceback object at 0x10163fe08>)

[[[0, 0, 1161, -2685, 639], 639], 20]
Correct Python: (<class 'RecursionError'>, RecursionError('maximum recursion depth exceeded in comparison',), <traceback object at 0x10178d888>)
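
The two failures above can be reproduced with a minimal Python sketch. It assumes the "correct" Python program has the usual recursive shape (reuse the first coin, or drop it) with the guard `total < 0 or not coins`. The function body and the helper `hits_recursion_limit` are reconstructions for illustration, not code copied from the repository. Only the coins and the total are taken from each JSON entry.

```python
def possible_change(coins, total):
    """Count the ways to make `total` from `coins` (assumed shape of the "correct" version)."""
    if total == 0:
        return 1
    # Assumed guard: stop on a negative total or an empty coin list.
    if total < 0 or not coins:
        return 0
    first, *rest = coins
    # Either use `first` again, or drop it and continue with the remaining coins.
    return possible_change(coins, total - first) + possible_change(rest, total)


def hits_recursion_limit(coins, total):
    """Hypothetical helper: True if the call exhausts Python's recursion limit."""
    try:
        possible_change(coins, total)
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    print(possible_change([1, 5, 10, 25], 11))
    print(hits_recursion_limit([-2124], 1))
    print(hits_recursion_limit([0, 0, 1161, -2685, 639], 639))
```

Under these assumptions, a first coin that is zero or negative means `total - first` never decreases, so the first recursive call never reaches either base case. That is consistent with the RecursionError reported for both inputs.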

martinezmatias · 2018-03-09T14:40:56Z

Hi @SophieHYe @monperrus

"In the correct version, two issues are not handled well, and Evosuite exposes both of them:"

For us, this situation helps motivate our research: the original inputs are not enough and we need more inputs/outputs. The previous correct version could help us motivate the problem.

"To test it, I translated Evosuite test2/test4 to JSON ([[[-2124], 1], 20]; [[[0, 0, 1161, -2685, 639], 639], 20]) to test the correct Python programs, and they failed as well with similar errors:"

Great work. Does this mean that the correct Python version from Quixfix is incorrect?

@SophieHYe The previous correct version modifies the if condition by adding a new term. Do you have in mind a correct version for that buggy "possible_change" program?

monperrus · 2018-03-09T16:09:34Z

It means that the correct version is correct and we don't have to change anything on it.

@martinezmatias during the postprocessing of evosuite tests, have you removed try/catch blocks or @expected annotations?

SophieHYe · 2018-03-09T16:37:21Z

Hi @martinezmatias @monperrus

"Does this mean that the correct Python version from Quixfix is incorrect?"
Yes, the correct Python version is incorrect.

"The previous correct version modifies the if condition by adding a new term. Do you have in mind a correct version for that buggy "possible_change" program?"

The buggy program is "if (total < 0)" and the "correct" program is "if (total < 0 || coins.length == 0)".
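
To illustrate what the added term changes, here is a small Python sketch in which a flag switches between the two conditions. The recursive body, the `guard_empty` flag, and the `run` helper are assumptions made for illustration. In this sketch an empty coin list shows up as a ValueError from unpacking; the behaviour of the Java version is not reproduced here.

```python
def possible_change(coins, total, guard_empty):
    """Sketch of possible_change; `guard_empty` toggles the term added by the fix."""
    if total == 0:
        return 1
    # Buggy condition: total < 0. "Correct" condition: total < 0 or no coins left.
    if total < 0 or (guard_empty and len(coins) == 0):
        return 0
    first, *rest = coins  # raises ValueError when coins is empty
    return (possible_change(coins, total - first, guard_empty)
            + possible_change(rest, total, guard_empty))


def run(coins, total, guard_empty):
    """Hypothetical helper: return the count, or the name of the exception raised."""
    try:
        return possible_change(coins, total, guard_empty)
    except (ValueError, RecursionError) as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print(run([5], 3, guard_empty=False))   # condition without the empty-list term
    print(run([5], 3, guard_empty=True))    # condition with the empty-list term
    print(run([-2124], 1, guard_empty=True))
```

The added term handles the case where the coins run out, but it does not bound the recursion when a coin is zero or negative, which is the case the Evosuite tests hit.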

martinezmatias · 2018-03-09T18:04:38Z

"During the postprocessing of Evosuite tests, have you removed try/catch blocks or @expected annotations?" @monperrus
No, nothing.

"Yes, the correct Python version is incorrect."

So we should probably open an issue on the QuixBug repo.

"The buggy program is "if (total < 0)" and the "correct" program is "if (total < 0 || coins.length == 0)"."

We need to find the new correct version for Python/Java and open the PR.
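
As a starting point for that discussion, one possible direction is to validate the inputs before recursing. The sketch below is only a hypothetical candidate, not a version agreed on in this thread. The name `possible_change_checked`, the helper `_count`, and the choice to raise ValueError for a missing coin list or for non-positive coins are all assumptions.

```python
def possible_change_checked(coins, total):
    """Hypothetical variant: validate the inputs, then count the combinations."""
    if coins is None:
        raise ValueError("coins must not be None")
    if any(c <= 0 for c in coins):
        # Zero or negative coins are what keep `total` from shrinking.
        raise ValueError("every coin must be a positive integer")
    return _count(list(coins), total)


def _count(coins, total):
    """Assumed recursive core: same shape as the "correct" version."""
    if total == 0:
        return 1
    if total < 0 or not coins:
        return 0
    first, *rest = coins
    return _count(coins, total - first) + _count(rest, total)


if __name__ == "__main__":
    print(possible_change_checked([1, 5, 10, 25], 11))
    try:
        possible_change_checked([-2124], 1)
    except ValueError as exc:
        print("rejected:", exc)
```

With only positive coins, every recursive call either lowers the total or shortens the coin list, so the recursion terminates. Very large totals could still reach the recursion limit in this recursive form.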

martinezmatias · 2018-03-14T19:59:10Z

Hi @SophieHYe
I see you created an input generator according to issue #2. That's perfect. So I imagine that experiment could replace the one that uses Evosuite, which has some problems, as we have previously discussed in this issue.
To complement the information in this issue (whether or not we use Evosuite in the future), the programs with at least one seed that yields at least one failing Evosuite test case on the correct program version are: flatten, hanoi, knapsack, levenshtein, longest_common_subsequence, next_permutation, possible_change, sqrt, subsequences, to_base, wrap.

monperrus · 2018-03-15T17:35:53Z

"So I imagine that experiment could replace the one that uses Evosuite, which has some problems, as we have previously discussed in this issue."

We'll discuss that tomorrow over Skype.
