This repository is part of my MSc thesis titled "Test Case Generation from User Stories in Requirement Engineering using Generative AI Techniques with LLM Models: A Comparative Analysis." The research explores the application of Large Language Models (LLMs) in automating the generation of test cases from user stories within software requirement engineering. By comparing different Generative AI techniques and LLMs, the thesis aims to identify the most effective approach for improving the accuracy, completeness, and efficiency of test case generation.
The core idea behind this thesis is to leverage advanced Generative AI techniques and LLMs to automate the traditionally manual and time-consuming process of generating test cases from user stories. User stories, typically written in natural language, are an integral part of the Agile software development process, serving as a source for deriving test cases that validate the functionality of software features. The thesis investigates multiple prompting techniques and LLM models to assess their ability to generate relevant and comprehensive test cases, ultimately providing insights into the best practices for integrating AI into requirement engineering workflows.
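To make the idea concrete, the minimal sketch below shows how a single user story could be turned into test cases through one chat-completion call. It is an illustrative example only: the OpenAI Python client, the model name, and the prompt wording are assumptions and are not necessarily what the experiments in this repository use.

```python
# Minimal sketch: prompting an LLM to derive test cases from one user story.
# The model name and prompt wording are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

user_story = (
    "As a registered user, I want to reset my password via email "
    "so that I can regain access to my account."
)

prompt = (
    "You are a QA engineer. Derive test cases from the user story below.\n"
    "For each test case give: ID, title, preconditions, steps, expected result.\n\n"
    f"User story: {user_story}"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0.2,      # low temperature for more deterministic output
)

print(response.choices[0].message.content)
```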
This repository is structured to provide a detailed and organized view of the experiments conducted as part of the thesis research. Each folder within the repository corresponds to a specific experiment or set of experiments and includes the following components:
- Test case PDFs
  - Content: Each experiment folder contains a PDF document with the test cases generated by the selected LLM models and prompting techniques. These test cases are the basis for evaluating the models on their accuracy, completeness, and relevance to the user stories provided.
  - Purpose: The PDFs serve as a tangible output of the experiments, demonstrating the practical application of the models in generating test cases.
- Metrics spreadsheets
  - Content: Accompanying each experiment is an Excel file that documents all the key metrics and scores calculated during the experiment, including the number of input data samples, accuracy scores, completeness scores, and other performance indicators.
  - Purpose: The Excel sheets provide a comprehensive analysis of each experiment, enabling detailed comparisons across different models and prompting techniques.
- Result visualizations
  - Content: Within each experiment folder, an "images" subfolder contains graphs and charts that illustrate the results of the experiments.
  - Purpose: These visualizations offer an intuitive view of performance trends, comparisons between models, and the overall effectiveness of the techniques employed. They are essential for quickly grasping key insights and drawing conclusions from the data.
- Code
  - Content: A dedicated folder contains all the original code used during the experiments, including scripts for data preprocessing, model prompting, test case generation, and performance analysis (a simplified preprocessing sketch follows this list).
  - Purpose: This folder allows users to explore and run the code that was integral to the research, ensuring reproducibility and transparency of the experiments.
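For orientation, a preprocessing step of the kind listed above might look like the following sketch, which loads user stories and draws fixed-size samples for the scalability experiments. The file name, column name, and helper functions are hypothetical; the actual pipeline lives in the Code folder.

```python
# Minimal sketch of a preprocessing step: load user stories and draw
# fixed-size samples (e.g. 100 and 500) for the scalability experiments.
# The file name and column name are hypothetical placeholders.
import pandas as pd

def load_user_stories(path: str = "user_stories.csv") -> pd.DataFrame:
    """Read the raw dataset and keep only non-empty user stories."""
    df = pd.read_csv(path)
    return df.dropna(subset=["user_story"]).reset_index(drop=True)

def sample_stories(df: pd.DataFrame, n: int, seed: int = 42) -> pd.DataFrame:
    """Draw a reproducible sample of n user stories."""
    return df.sample(n=n, random_state=seed).reset_index(drop=True)

if __name__ == "__main__":
    stories = load_user_stories()
    for n in (100, 500):
        subset = sample_stories(stories, n)
        subset.to_csv(f"user_stories_{n}.csv", index=False)
```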
The experiments documented in this repository are designed to fulfill several key objectives within the thesis:
- Comparative Analysis: Evaluate and compare the effectiveness of different LLM models and prompting techniques in generating test cases from user stories.
- Tree of Thoughts (ToT) Framework: Integrate and test the ToT framework to enhance the logical reasoning capabilities of the LLMs so that they generate more accurate test cases (a simplified ToT sketch follows this list).
- Scalability Testing: Conduct experiments with varying input data sizes (100 and 500 samples) to assess the scalability and robustness of the models.
- Performance Metrics: Analyze the generated test cases using a range of metrics, including accuracy, completeness, and relevance, to determine the best-performing models and techniques.
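The sketch below illustrates the general shape of a Tree of Thoughts loop as applied to this task: several candidate reasoning steps are generated per level, each partial chain is scored, and only the most promising chains are expanded further. It is a simplified, assumed implementation (the llm() helper, prompts, scoring scale, and model name are placeholders), not the exact code used in the thesis.

```python
# Simplified Tree of Thoughts (ToT) sketch: expand several candidate reasoning
# steps per level, score each partial chain, and keep only the best ones.
# The llm() helper, prompts, scoring scheme, and model name are illustrative.
from openai import OpenAI

client = OpenAI()

def llm(prompt: str, temperature: float = 0.7) -> str:
    """Single chat-completion call; model name is a placeholder."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    return resp.choices[0].message.content

def tree_of_thoughts(user_story: str, breadth: int = 3, depth: int = 2, keep: int = 2) -> str:
    """Breadth-first ToT: grow partial test designs, prune to the top `keep` each level."""
    frontier = [""]  # partial reasoning chains, starting empty
    for _ in range(depth):
        # Expansion: propose `breadth` next steps for every surviving chain.
        candidates = []
        for partial in frontier:
            for _ in range(breadth):
                thought = llm(
                    f"User story: {user_story}\n"
                    f"Reasoning so far: {partial or '(none)'}\n"
                    "Propose the next step toward a complete set of test cases."
                )
                candidates.append(partial + "\n" + thought)
        # Evaluation: score each candidate chain and keep the most promising.
        scored = []
        for chain in candidates:
            score_text = llm(
                "Rate 1-10 how promising this reasoning is for deriving "
                f"complete test cases. Reply with the number only.\n{chain}",
                temperature=0.0,
            )
            try:
                score = float(score_text.strip().split()[0])
            except (ValueError, IndexError):
                score = 0.0
            scored.append((score, chain))
        frontier = [c for _, c in sorted(scored, key=lambda x: x[0], reverse=True)[:keep]]
    # Turn the best reasoning chain into the final test cases.
    return llm(
        f"User story: {user_story}\nReasoning:{frontier[0]}\n"
        "Write the final test cases with steps and expected results."
    )
```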
To make use of the repository:
- Explore the Generated Test Cases: Navigate through the PDFs in each folder to review the test cases produced by different models and techniques. These documents are key to understanding the practical outcomes of the research.
- Analyze the Metrics: Open the Excel files to explore the detailed metrics and scores for each experiment. These files provide a deep dive into the performance of the models across various dimensions (a short analysis sketch follows this list).
- Visualize the Results: Check the "images" folder within each experiment directory for visual representations of the data. These graphs are designed to help users quickly understand the results and identify trends.
- Run the Code: Explore the "Code" folder to view or execute the original scripts used to carry out the experiments. This is essential for reproducibility and further experimentation.
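As an example of the kind of analysis the Excel sheets support, the short sketch below loads one metrics file with pandas and averages scores per model and prompting technique. The file name, sheet layout, and column names are assumptions; adjust them to the actual files in each experiment folder.

```python
# Minimal sketch: summarize experiment metrics from an Excel sheet with pandas.
# File and column names are hypothetical placeholders.
import pandas as pd

metrics = pd.read_excel("experiment_1_metrics.xlsx")  # requires openpyxl

# Average accuracy and completeness per model/prompting-technique combination.
summary = (
    metrics.groupby(["model", "prompting_technique"])[["accuracy", "completeness"]]
    .mean()
    .sort_values("accuracy", ascending=False)
)
print(summary)
```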
The content of this repository is provided for academic and research purposes only. The results and conclusions presented are based on specific models and techniques as detailed in the thesis. While every effort has been made to ensure the accuracy of the data and findings, variations may occur depending on the context and application of these methods. Users are advised to apply the information contained in this repository at their own discretion and risk.
© 2024 Akshat Mehta. All rights reserved. Unauthorized use of the materials contained in this repository without permission is strictly prohibited.