<h1id="artifacts-for-continuous-model-validation-using-reference-attribute-grammars">Artifacts for "Continuous Model Validation Using Reference Attribute Grammars"</h1>
<h3id="introduction">Introduction</h3>
<p>The paper discusses the utilization of reference attribute grammars (RAGs) for model validation and presents two specific contributions. First, the differences between models and trees specified by reference attribute grammars, specifically non-containment references, are discussed and a manual, yet optimised method to efficiently overcome these differences is presented. Secondly, an extension of RAG grammar specifications is proposed to model noncontainment references automatically. The proposed modelling techniques are compared to state-of-the-art modelling tools utilizing a benchmarking framwork for continuous model validation, the <em>Train Benchmark</em>.</p>
<h3id="structure-of-the-supplementary-artifacts">Structure of the Supplementary Artifacts</h3>
<p>The artifacts are structured in three parts:</p>
<ul>
<li>Full collection of all measurement data and diagrams mentioned in the paper</li>
<li>Benchmark code to reproduce the measurements, including all relevant source codes</li>
<li>A standalone example of non-containment references preprocessor</li>
</ul>
<h3id="general-remarks-on-the-presented-listings-and-measurements">General Remarks on the presented Listings and Measurements</h3>
<p>For reasons of readability and simplicity, there are some minor differences in naming in the source codes and the measured resuting data. Most importantly, the names of the three presented JastAdd implementation variants are different in the code and the diagrams.</p>
<p>The following table shows the relation of the terminology used in the paper and in the code.</p>
<h3id="structure-of-the-train-benchmark">Structure of the Train Benchmark</h3>
<p>The benchmark is able to measure different scenarios specified by configurations with several kinds of parameters:</p>
<olstyle="list-style-type: decimal">
<li><strong>Input Data:</strong> There are two types of input data used in the benchmark, the <code>inject</code> and the <code>repair</code> data set. The former contains <em>valid</em> models, i.e., models, which do not contain any of the faults that are supposed to be found by the presented queries. The latter, <code>repair</code>, contains models already containing faults.</li>
<li><strong>Queries:</strong> The queries are used to find the aforementioned faults. For each fault, there are two queries: <em>repair</em>, to find the fault, and <em>inject</em>, to find places where a fault can be injected.</li>
<li><strong>Transformations:</strong> The transformations performed by the benchmark are, again, two sets: <em>inject</em> and <em>repair</em> transformations.</li>
<li><strong>Transformation Strategies:</strong> The benchmark does not perform the operation on all matches. The strategy <em>fixed</em> performs the transformation on a given number of matches, while the <em>proportional</em> strategy performes them on a given percentage of all matches.</li>
</ol>
<p>These settings are defined in a <em>benchmark scenario</em>, which can be edited before running the benchmark.</p>
<h3id="measurement-data">Measurement Data</h3>
<p>The result data are stored in the directory <ahref="paper-results/"class="uri">paper-results/</a>. This directory contains two subdirectories</p>
<ul>
<li><ahref="paper-results/measurements">measurements</a> contains two directories. The <ahref="paper-results/measurements/inject">inject</a> subdirectory contains the measurements for the <em>inject</em> scenario, which is also included in <ahref="paper-results/measurements/inject/BenchmarkScript.groovy">inject/BenchmarkScript.groovy</a>. The <ahref="paper-results/measurements/repair">repair</a> subdirectory contains the same data for the <em>repair</em> scenario in <ahref="paper-results/measurements/repair/BenchmarkScript.groovy">repair/BenchmarkScript.groovy</a>. Both directories contain files with time measurement data (starting with <code>times</code>) and the numbers of matches (starting with <code>matches</code>). Each file name contains information on the tool used, the query, and the size of the model.</li>
<li><ahref="paper-results/diagrams">diagrams</a> contains the same subdirectories, both containing diagrams with the respective measurements. The diagrams are generated from the same data as in the paper, but enlarged for better readability. In particular, the six diagrams presented in the paper are
<ul>
<li><ahref="paper-results/diagrams/repair/Read-and-Check-RouteSensor.pdf">Fig. 7a. Read and Check for RouteSensor (repair)</a></li>
<li><ahref="paper-results/diagrams/repair/Read-and-Check-ConnectedSegments.pdf">Fig. 7b. Read and Check for ConnectedSegments (repair)</a></li>
<li><ahref="paper-results/diagrams/inject/Transformation-and-Recheck-RouteSensor.pdf">Fig. 7c. Transformation and Recheck for RouteSensor (inject)</a></li>
<li><ahref="paper-results/diagrams/inject/Transformation-and-Recheck-ConnectedSegments.pdf">Fig. 7d. Transformation and Recheck for ConnectedSegments (inject)</a></li>
<li><ahref="paper-results/diagrams/repair/Transformation-and-Recheck-RouteSensor.pdf">Fig. 7e. Transformation and Recheck for RouteSensor (repair)</a></li>
<li><ahref="paper-results/diagrams/repair/Transformation-and-Recheck-ConnectedSegments.pdf">Fig. 7f. Transformation and Recheck for ConnectedSegments (repair)</a></li>
</ul></li>
</ul>
<p><strong>Please Note:</strong> The measurements were conducted using a timeout for the whole run. If a run was not completed, no individual times of the steps appear in the measurements and diagrams. Thus, some tools do not have measurements for all problem sizes.</p>
<h3id="the-source-code">The Source Code</h3>
<p>For this publication, we tried to modify the source code of the benchmark itself as little as possible. Therefore, unfortunately, the code base is rather large and confusing. The following section tries to point to the parts relevant for this paper.</p>
<p>The benchmark is structures in modules, some of which form the code of the benchmark, some are provided by the contesting tools, and some are related to required model serializations. There are some naming conventions: - Tool-related modules are in directories starting with <code>trainbenchmark-tool</code>. - Model serialization-related modules start with <code>trainbenchmark-generator</code>. - All other modules are core modules of the bechmark.</p>
<p>Since the JastAdd-based solutions use a preprocessor to generate Java files, for the presented variant, it is even more compolicated. Each JastAdd configuraration must be presented to the benchmark as a separate tool. Thus there are two directories for each variant, one for the bacht processing mode and one for the incremental mode. Because these two modes share almost all the source code, a third directory is used to store this shared code. Finally, there is a directory for code shared between all JastAdd variants. These are the important directories:</p>
<ul>
<li><ahref="trainbenchmark/trainbenchmark-tool-jastadd-namelookup-base">JastAdd with Name Lookup</a>
<h3id="reproducing-the-measurements">Reproducing the Measurements</h3>
<p><strong>Please Note: Reproducing the graphs as presented in the paper and supplied here takes a very long time depending on the utilized hardware. It is strongly suggested to run the benchmark with a smaller maximum problem size, less repetitions, and a shorter timeout.</strong> Most results of the benchmark are observable with more restricted setup as well. In the following, we will provide a suggested way to run the benchmark in different sizes.</p>
<p>To reproduce the measurements, there are several options. We provide a prepared Docker image that can be run directly. Alternatively, it is, on course, also possible to simply run the provided gradle build scripts. However, since there are some software requirements imposed by the benchmark, particularly for creating the diagrams using R. We stronly suggest running the Docker variant.</p>
<h4id="running-the-benchmark-with-docker">Running the Benchmark with Docker</h4>
<h4id="running-the-benchmark-directly">Running the Benchmark directly</h4>
# Artifacts for "Continuous Model Validation Using Reference Attribute Grammars"
### Introduction
The paper discusses the use of reference attribute grammars (RAGs) for model validation and presents two specific contributions.
First, the differences between models and the trees specified by reference attribute grammars, specifically non-containment references, are discussed, and a manual, yet optimised, method to overcome these differences efficiently is presented.
Second, an extension of RAG grammar specifications is proposed to model non-containment references automatically.
The proposed modelling techniques are compared to state-of-the-art modelling tools using a benchmarking framework for continuous model validation, the *Train Benchmark*.
### Structure of the Supplementary Artifacts
The artifacts are structured in three parts:
- Full collection of all measurement data and diagrams mentioned in the paper
- Benchmark code to reproduce the measurements, including all relevant source codes
- A standalone example of the non-containment references preprocessor
### General Remarks on the Presented Listings and Measurements
For reasons of readability and simplicity, there are some minor differences in naming between the source code and the resulting measurement data.
Most importantly, the names of the three presented JastAdd implementation variants differ between the code and the diagrams.
The following table shows how the terminology used in the paper relates to that used in the code.
### Structure of the Train Benchmark
The benchmark can measure different scenarios, specified by configurations with several kinds of parameters:
1. **Input Data:** There are two types of input data used in the benchmark, the `inject` and the `repair` data set.
   The former contains *valid* models, i.e., models that do not contain any of the faults that are supposed to be found by the presented queries.
   The latter, `repair`, contains models that already contain faults.
2. **Queries:** The queries are used to find the aforementioned faults. For each fault, there are two queries: *repair*, to find the fault, and *inject*, to find places where a fault can be injected.
3. **Transformations:** The transformations performed by the benchmark again come in two sets: *inject* and *repair* transformations.
4. **Transformation Strategies:** The benchmark does not perform the operation on all matches.
   The strategy *fixed* performs the transformation on a given number of matches, while the *proportional* strategy performs it on a given percentage of all matches.
These settings are defined in a *benchmark scenario*, which can be edited before running the benchmark.
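To make the four parameter kinds concrete, the following is a minimal, purely illustrative sketch in Groovy (the language of the `BenchmarkScript.groovy` scenario files). It does not use the actual Train Benchmark scenario API; all keys and values are hypothetical placeholders.

```groovy
// Purely illustrative: the four kinds of parameters a benchmark scenario fixes.
// This is NOT the actual Train Benchmark configuration API.
def scenario = [
    inputData             : 'repair',                             // 'inject' (valid models) or 'repair' (faulty models)
    queries               : ['RouteSensor', 'ConnectedSegments'], // faults to look for
    transformations       : 'repair',                             // which transformation set to apply
    transformationStrategy: [kind: 'fixed', amount: 10]           // or [kind: 'proportional', percentage: 5]
]
println scenario
```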
### Measurement Data
The result data are stored in the directory [paper-results/](paper-results/).
This directory contains two subdirectories:
- [measurements](paper-results/measurements) contains two directories.
  The [inject](paper-results/measurements/inject) subdirectory contains the measurements for the *inject* scenario, which is also included in [inject/BenchmarkScript.groovy](paper-results/measurements/inject/BenchmarkScript.groovy).
  The [repair](paper-results/measurements/repair) subdirectory contains the same data for the *repair* scenario in [repair/BenchmarkScript.groovy](paper-results/measurements/repair/BenchmarkScript.groovy).
  Both directories contain files with time measurement data (starting with `times`) and the numbers of matches (starting with `matches`).
  Each file name contains information on the tool used, the query, and the size of the model (see the illustrative sketch after this list).
- [diagrams](paper-results/diagrams) contains the same subdirectories, both containing diagrams with the respective measurements.
  The diagrams are generated from the same data as in the paper, but enlarged for better readability.
  In particular, the six diagrams presented in the paper are:
  - [Fig. 7a. Read and Check for RouteSensor (repair)](paper-results/diagrams/repair/Read-and-Check-RouteSensor.pdf)
  - [Fig. 7b. Read and Check for ConnectedSegments (repair)](paper-results/diagrams/repair/Read-and-Check-ConnectedSegments.pdf)
  - [Fig. 7c. Transformation and Recheck for RouteSensor (inject)](paper-results/diagrams/inject/Transformation-and-Recheck-RouteSensor.pdf)
  - [Fig. 7d. Transformation and Recheck for ConnectedSegments (inject)](paper-results/diagrams/inject/Transformation-and-Recheck-ConnectedSegments.pdf)
  - [Fig. 7e. Transformation and Recheck for RouteSensor (repair)](paper-results/diagrams/repair/Transformation-and-Recheck-RouteSensor.pdf)
  - [Fig. 7f. Transformation and Recheck for ConnectedSegments (repair)](paper-results/diagrams/repair/Transformation-and-Recheck-ConnectedSegments.pdf)
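As a purely hypothetical illustration of how these pieces might be combined in a file name (the actual pattern used by the benchmark may differ):

```
times-<tool>-<query>-<model size>     # time measurements for the individual steps
matches-<tool>-<query>-<model size>   # numbers of matches
```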
**Please Note:** The measurements were conducted using a timeout for the whole run. If a run did not complete, no individual step times appear in the measurements and diagrams. Thus, some tools do not have measurements for all problem sizes.
### The Source Code
For this publication, we tried to modify the source code of the benchmark itself as little as possible.
Therefore, unfortunately, the code base is rather large and difficult to navigate. The following section points to the parts relevant for this paper.
The benchmark is structured into modules: some form the code of the benchmark itself, some are provided by the competing tools, and some are related to the required model serializations.
There are some naming conventions:
- Tool-related modules are in directories starting with `trainbenchmark-tool`.
- Model serialization-related modules start with `trainbenchmark-generator`.
- All other modules are core modules of the benchmark.
Since the JastAdd-based solutions use a preprocessor to generate Java files, the setup for the presented variants is even more complicated.
Each JastAdd configuration must be presented to the benchmark as a separate tool. Thus, there are two directories for each variant, one for the batch processing mode and one for the incremental mode.
Because these two modes share almost all the source code, a third directory is used to store this shared code.
Finally, there is a directory for code shared between all JastAdd variants.
These are the important directories (an illustrative layout sketch follows the list):
- [JastAdd with Name Lookup](trainbenchmark/trainbenchmark-tool-jastadd-namelookup-base)
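As a rough orientation, the directory layout for one variant might look as follows. Apart from the linked `trainbenchmark-tool-jastadd-namelookup-base` directory, the names below are hypothetical placeholders for the batch, incremental, and shared directories described above.

```
trainbenchmark/
  trainbenchmark-tool-jastadd-namelookup-batch/         # batch processing mode (hypothetical name)
  trainbenchmark-tool-jastadd-namelookup-incremental/   # incremental mode (hypothetical name)
  trainbenchmark-tool-jastadd-namelookup-base/          # code shared by the two modes of this variant
  trainbenchmark-tool-jastadd-base/                     # code shared by all JastAdd variants (hypothetical name)
```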
### Reproducing the Measurements
**Please Note: Reproducing the graphs as presented in the paper and supplied here takes a very long time, depending on the utilized hardware. It is strongly suggested to run the benchmark with a smaller maximum problem size, fewer repetitions, and a shorter timeout.** Most results of the benchmark are observable with a more restricted setup as well. In the following, we provide a suggested way to run the benchmark at different problem sizes.
To reproduce the measurements, there are several options. We provide a prepared Docker image that can be run directly.
Alternatively, it is, of course, also possible to simply run the provided Gradle build scripts.
However, since the benchmark imposes some software requirements, particularly for creating the diagrams using R, we strongly suggest running the Docker variant.
#### Running the Benchmark with Docker
#### Running the Benchmark directly