
Commit 8ac9129 — Add public Replication Package (initial commit, 0 parents)

436 files changed

Lines changed: 397311 additions & 0 deletions


.dockerignore

Lines changed: 3 additions & 0 deletions
.git
.idea
Dockerfile

.gitignore

Lines changed: 1 addition & 0 deletions
.idea

Dockerfile

Lines changed: 11 additions & 0 deletions
FROM maven:3-eclipse-temurin-21
WORKDIR /replication
COPY . .
# Download all sources and go offline
RUN mvn -B compile test-compile && mvn -B dependency:go-offline

ENV OPENAI_API_KEY=sk-DUMMY
ENV OPENAI_ORG_ID=""
ENV OLLAMA_HOST=http://localhost:11434

ENTRYPOINT bash -c "cat README.md && bash"

INSTALL.md

Lines changed: 18 additions & 0 deletions
# Installation Instructions
This file guides you through setting up everything you need to run the replication package.

Everything was tested on Linux (amd64) and macOS (arm64). Windows should work as well, but we did not test it.

## Hardware / Service Requirements
* We recommend execution on a system with at least 16 GB of RAM.
* If you want to run the LLMs with **new** projects, you need ...
  * an [ollama](https://ollama.com/) instance capable of running LLAMA 3.1 70b.
  * an OpenAI access token and an organization id.
* For the projects we used in the paper, we provide the cached LLM responses. Thus, you don't need access to an ollama instance or an OpenAI access token.

## Prerequisites (Docker Image with all bundled dependencies)
* Docker

## Prerequisites (Local)
* Java JDK 21
* Maven 3

LICENSE.md

Lines changed: 21 additions & 0 deletions
MIT License

Copyright (c) 2020-2024 ArDoCo

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 187 additions & 0 deletions
# Replication Package for "Enabling Architecture Traceability by LLM-based Architecture Component Name Extraction"

by Dominik Fuchß, Haoyu Liu, Tobias Hey, Jan Keim, and Anne Koziolek

## Requirements
The requirements are defined in [INSTALL.md](INSTALL.md).

## Quickstart
If you just want to run the evaluation, you can run:

### Docker
```bash
docker run -it --rm ghcr.io/ardoco/icsa25
mvn -q test -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TraceLinkEvaluationSadSamViaLlmCodeIT
```

### Local
```bash
mvn -q test -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TraceLinkEvaluationSadSamViaLlmCodeIT
```
## Overview of the Repository
* The most important class is the test class [TraceLinkEvaluationSadSamViaLlmCodeIT](tests/integration-tests/tests-tlr/src/test/java/edu/kit/kastel/mcse/ardoco/tlr/tests/integration/TraceLinkEvaluationSadSamViaLlmCodeIT.java).
  * This class is responsible for the evaluation of the trace links generated by our approach using TransArC.
  * Set the environment variables `OPENAI_API_KEY` and `OLLAMA_HOST` to the OpenAI access token and the Ollama host, respectively. If you just want to replicate the results, you can use the provided Docker container or locally set `OPENAI_API_KEY` to `sk-DUMMY` and `OLLAMA_HOST` to `http://localhost:11434`.
  * After setting the environment variables, just run the test to get the results of the paper.
  * To run the test, you can run `mvn -q test -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TraceLinkEvaluationSadSamViaLlmCodeIT`. The output of the test execution is described below.
* For our in-depth analysis, we used [TraceLinkEvaluationSadSamViaLlmIT](tests/integration-tests/tests-tlr/src/test/java/edu/kit/kastel/mcse/ardoco/tlr/tests/integration/TraceLinkEvaluationSadSamViaLlmIT.java). Here, we run the SAD-SAM TLR based on the extracted component names (see, e.g., [mediastore_gpt4o_from_docs.txt](tests/integration-tests/tests-tlr/src/test/resources/mediastore/mediastore_gpt4o_from_docs.txt)).
* [cache-llm](tests/integration-tests/tests-tlr/cache-llm) contains the LLM requests and responses for the evaluation in JSON format. They will be used for replication, so you don't need to run the LLMs again. You need to remove the directory if you want to send new requests to the LLMs.
* The important logic for the extraction of the component names is located in [LLMArchitectureProviderInformant](stages-tlr/model-provider/src/main/java/edu/kit/kastel/mcse/ardoco/tlr/models/informants/LLMArchitectureProviderInformant.java) and connected classes.
* [results](results) contains the results of the evaluation in human-readable logging format.
* **The calculation of results (using the cached LLM responses) takes roughly 25-30 minutes (on a MacBook Air M2).** Since the best configuration uses only the documentation to generate the SAMs, we made this the default configuration. If you want to change the configuration, you can directly modify `TraceLinkEvaluationSadSamViaLlmIT`.
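To send new requests to the LLMs instead of replaying the cache, the cache directory mentioned above has to be deleted. A minimal sketch, assuming you run it from the repository root:

```shell
# Delete the cached LLM responses so fresh requests are sent
# (afterwards you need a real OPENAI_API_KEY / reachable OLLAMA_HOST)
rm -rf tests/integration-tests/tests-tlr/cache-llm
```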
### Environment variables
* To set an environment variable, you can use the export command (Linux and macOS): `export OPENAI_API_KEY=YOUR_API_KEY`.
* The following environment variables are important:
  * `OPENAI_API_KEY`: The OpenAI API key to access the LLMs. (Docker Default: `sk-DUMMY`)
  * `OPENAI_ORG_ID`: The OpenAI organization id. (Docker Default: empty string)
  * `OLLAMA_HOST`: The host of the Ollama instance. (Docker Default: `http://localhost:11434`)
  * (optional) `OLLAMA_USER`: The user (authorization) for the Ollama instance. (Docker Default: not set)
  * (optional) `OLLAMA_PASSWORD`: The password (authorization) for the Ollama instance. (Docker Default: not set)
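For a local replication run with the cached responses, the Docker defaults listed above can be exported in one go. This is only a convenience sketch of the exports already described; no real credentials are needed because the cached LLM responses are replayed:

```shell
# Replication defaults (cached LLM responses; dummy credentials suffice)
export OPENAI_API_KEY=sk-DUMMY
export OPENAI_ORG_ID=""
export OLLAMA_HOST=http://localhost:11434
```

Afterwards, run the Maven command from the Quickstart section in the same shell.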
### Results
The results are structured as follows:
* Each evaluation starts with the configuration: e.g., `Evaluating project MEDIASTORE with LLM 'GPT_4_O_MINI'`.
* Also, a line regarding the used prompts is included: e.g., `Prompts: DOCUMENTATION_ONLY_V1, null, null` (prompt for the documentation, prompt for the source code, prompt for the aggregation).
* Then the outputs from the LLM for the different steps are printed in the log (e.g., `LLMArchitectureProviderInformant - Initial Response` or `LLMArchitectureProviderInformant - Response:`).
* In the end, the evaluation results are printed in the log.
```
2024-09-30 10:14:30:985 +0200 [main] INFO TraceLinkEvaluationSadSamViaLlmCodeIT - Evaluating project MEDIASTORE with LLM 'GPT_4_O_MINI'
2024-09-30 10:14:30:985 +0200 [main] INFO TraceLinkEvaluationSadSamViaLlmCodeIT - Prompts: DOCUMENTATION_ONLY_V1, null, null
[... More Logs ...]
2024-09-30 10:14:37:140 +0200 [main] INFO LLMArchitectureProviderInformant - Initial Response: Based on the provided software architecture documentation for the Media Store, we can identify several high-level components that make up the system. Here’s a breakdown of these components along with their responsibilities:

### 1. **Facade Component**
- **Role**: Acts as the server-side web front end.
- **Responsibilities**:
  - Delivers web pages to users.
  - Manages user sessions.
  - Provides registration and login functionalities.
  - Allows users to browse, download, and upload audio files.

### 2. **UserManagement Component**
- **Role**: Handles user registration and authentication.
- **Responsibilities**:
  - Processes registration requests.
  - Manages user login and authentication.
  - Implements password hashing and salting for security.

### 3. **UserDBAdapter Component**
- **Role**: Encapsulates database access for user data.
- **Responsibilities**:
  - Queries the database for user-related information.
  - Creates queries based on user requests.

### 4. **MediaManagement Component**
- **Role**: Central business logic component.
- **Responsibilities**:
  - Coordinates communication between various components.
  - Processes download requests and fetches audio files.
  - Forwards audio files to users after processing.

### 5. **TagWatermarking Component**
- **Role**: Responsible for watermarking audio files.
- **Responsibilities**:
  - Re-encodes audio files.
  - Applies digital watermarks to ensure copyright protection.

### 6. **ReEncoder Component**
- **Role**: Handles audio file re-encoding.
- **Responsibilities**:
  - Converts audio files to different bit rates.
  - Reduces file sizes as necessary.

### 7. **Packaging Component**
- **Role**: Manages the packaging of multiple audio files.
- **Responsibilities**:
  - Archives several audio files into a single compressed file for easier downloading.

### 8. **MediaAccess Component**
- **Role**: Manages access to audio files and their metadata.
- **Responsibilities**:
  - Stores uploaded audio files at a predefined location.
  - Fetches a list of available audio files.
  - Retrieves associated metadata from the database for download requests.

### 9. **AudioAccess Component**
- **Role**: Facilitates querying of audio files.
- **Responsibilities**:
  - Creates queries to list all available audio files from the database.

### 10. **Database Component**
- **Role**: Represents the persistence layer.
- **Responsibilities**:
  - Stores user information and metadata of audio files (e.g., name, genre).
  - Executes queries created by the UserDBAdapter and AudioAccess components.
  - Stores salted hashes of passwords.

### 11. **DataStorage Component**
- **Role**: Manages the physical storage of audio files.
- **Responsibilities**:
  - Stores audio files in a dedicated file server or local disk.
  - Decouples audio file storage from the database.

### Summary
The architecture of the Media Store is composed of several interrelated components, each with specific roles and responsibilities. The Facade component serves as the entry point for users, while the MediaManagement component orchestrates the core business logic. User management, file handling, and data storage are handled by dedicated components, ensuring a modular and maintainable system design.
2024-09-30 10:14:37:141 +0200 [main] INFO LLMArchitectureProviderInformant - Response: - Facade
- UserManagement
- UserDBAdapter
- MediaManagement
- TagWatermarking
- ReEncoder
- Packaging
- MediaAccess
- AudioAccess
- Database
- DataStorage
2024-09-30 10:14:37:142 +0200 [main] INFO LLMArchitectureProviderInformant - Component names:
AudioAccess
DataStorage
Database
Facade
MediaAccess
MediaManagement
Packaging
ReEncoder
TagWatermarking
UserDBAdapter
UserManagement
2024-09-30 10:14:37:142 +0200 [main] INFO LLMArchitectureProviderAgent - Finished LLMArchitectureProviderAgent - LLMArchitectureProviderInformant in 0.003 s
[... More Logs ...]
MEDIASTORE (SadSamViaLlmCodeTraceabilityLinkRecoveryEvaluation):
Precision: 0.49 (min. expected: 1.00)
Recall: 0.52 (min. expected: 0.52)
F1: 0.50 (min. expected: 0.68)
Accuracy: 0.99 (min. expected: 0.99)
Specificity: 0.99 (min. expected: 1.00)
Phi Coef.: 0.50 (min. expected: 0.72)
Phi/PhiMax: 0.51 (Phi Max: 0.97)
P & R & F1 & Acc & Spec & Phi & PhiN
0.49 & 0.52 & 0.50 & 0.99 & 0.99 & 0.50 & 0.51
--- Evaluated project MEDIASTORE with LLM 'GPT_4_O_MINI' ---
```
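As a quick sanity check on such a log, F1 is the harmonic mean of the printed precision and recall. The MEDIASTORE values above can be recomputed in the shell (awk is used here purely for illustration):

```shell
# Recompute F1 = 2*P*R/(P+R) from the logged precision (0.49) and recall (0.52)
awk 'BEGIN { p = 0.49; r = 0.52; printf "%.2f\n", 2 * p * r / (p + r) }'
# → 0.50, matching the F1 line in the log
```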
## Future extension scenarios
Here, we provide documentation for future extension scenarios.

### New LLMs
If you want to provide new LLMs, you need to add them to the enum [LargeLanguageModel](stages-tlr/model-provider/src/main/java/edu/kit/kastel/mcse/ardoco/tlr/models/informants/LargeLanguageModel.java).

### New Projects
If you want to evaluate new projects, you need to add them to the enum [CodeProject](tests/integration-tests/tests-base/src/main/java/edu/kit/kastel/mcse/ardoco/core/tests/eval/CodeProject.java).

Alternatively, you can run the ArDoCo pipeline [ArDoCoForSadSamViaLlmCodeTraceabilityLinkRecovery](pipeline-tlr/src/main/java/edu/kit/kastel/mcse/ardoco/tlr/execution/ArDoCoForSadSamViaLlmCodeTraceabilityLinkRecovery.java):
* Instantiate the class with the project's name
* Run `setUp(...)` by providing ...
  * the input text (text file),
  * input code (code directory),
  * additional configs (typically an empty map),
  * the output directory,
  * the selected LLM,
  * the prompt to extract component names from documentation (can be null),
  * the prompt to extract component names from code (can be null),
  * the selected code features (can be null),
  * the prompt to aggregate the component names (can be null)
* Invoke `run()` to execute the pipeline
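The steps above can be sketched in Java roughly as follows. This is only an illustrative sketch: the parameter types and their order are assumptions derived from the bullet list, and the file paths (`input/documentation.txt`, `input/code`, `output`) are hypothetical placeholders; check the linked class for the actual `setUp(...)` signature before use.

```java
// Hypothetical sketch; verify against the actual class in pipeline-tlr.
var pipeline = new ArDoCoForSadSamViaLlmCodeTraceabilityLinkRecovery("MyProject");
pipeline.setUp(
        new File("input/documentation.txt"), // input text (text file)
        new File("input/code"),              // input code (code directory)
        Map.of(),                            // additional configs (typically empty)
        new File("output"),                  // output directory
        LargeLanguageModel.GPT_4_O_MINI,     // selected LLM (value seen in the logs above)
        documentationPrompt,                 // prompt for documentation (can be null)
        null,                                // prompt for code (can be null)
        null,                                // selected code features (can be null)
        null                                 // aggregation prompt (can be null)
);
pipeline.run();                              // execute the pipeline
```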
