<h1 id="artifacts-for-continuous-model-validation-using-reference-attribute-grammars">Artifacts for “Continuous Model Validation Using Reference Attribute Grammars”</h1>
<p><em>Note: There is a variant of this submission including a docker image (provided as a link) and one without it (uploaded in HotCRP). We encourage using the one including the image, since building the image takes a long time.</em></p>
<p>The paper discusses the utilization of reference attribute grammars (RAGs) for model validation and presents two specific contributions. First, the differences between models and trees specified by reference attribute grammars, specifically non-containment references, are discussed, and a manual yet optimised method to efficiently overcome these differences is presented. Second, an extension of RAG grammar specifications is proposed to model non-containment references automatically. The proposed modelling techniques are compared to state-of-the-art modelling tools utilizing a benchmarking framework for continuous model validation, the <em>Train Benchmark</em>.</p>
<h3 id="structure-of-the-supplementary-artifacts">Structure of the Supplementary Artifacts</h3>
<p>The artifacts are structured in four parts:</p>
<ul>
<li>A standalone example of the non-containment references preprocessor (relational-rags-0.2.3.zip)</li>
<li>Benchmark code to reproduce the measurements, including all relevant source codes
<ul>
<li>as a zip file (ModelValidationWithRAGs.zip)</li>
<li>as a docker container (trainbenchmark-docker.tar)</li>
</ul></li>
<li>Full collection of all measurement data and diagrams mentioned in the paper (paper-results.zip)</li>
</ul>
<h3 id="general-remarks-on-the-presented-listings-and-measurements">General Remarks on the Presented Listings and Measurements</h3>
<p>For reasons of readability and simplicity, there are some minor differences in naming between the source code and the resulting measurement data. Most importantly, the names of the three presented JastAdd implementation variants differ between the code and the diagrams.</p>
<p>The following table shows the relation of the terminology used in the paper and in the code.</p>
<table>
<thead>
<tr class="header">
<th style="text-align: left;">Name used in paper and result data</th>
<th style="text-align: left;">Name used in source code</th>
</tr>
</thead>
</table>
<p>To transform the grammar extension, we provide a preprocessor for JastAdd. The preprocessor, including its source code, is provided in the <code>preprocessor</code> subdirectory.</p>
<p>Its usage is:</p>
<ul>
<li>Build the preprocessor:
<ul>
<li><code>./gradlew build jar</code></li>
<li>Copy the jar: <code>cp build/libs/relational-rags-0.2.3.jar relast-compiler.jar</code></li>
</ul></li>
<li>Run the preprocessor on the train benchmark (output written to standard output):</li>
</ul>
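<p>For convenience, the build steps above can be combined into one shell session. This is a sketch only: the archive layout and the final invocation on a grammar file named <code>TrainBenchmark.relast</code> are assumptions, not documented commands.</p>

```shell
# Unpack the standalone preprocessor example (assumed archive layout)
unzip relational-rags-0.2.3.zip
cd relational-rags-0.2.3

# Build the preprocessor and copy the resulting jar (documented steps)
./gradlew build jar
cp build/libs/relational-rags-0.2.3.jar relast-compiler.jar

# Hypothetical invocation on a grammar file; the preprocessor writes to
# standard output, so redirect the result into a file
java -jar relast-compiler.jar TrainBenchmark.relast > Generated.ast
```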
<h3 id="structure-of-the-train-benchmark">Structure of the Train Benchmark</h3>
<p>These settings are defined in a <em>benchmark scenario</em>, which can be edited before running the benchmark.</p>
<h3 id="measurement-data">Measurement Data</h3>
<p>The result data is stored in the directory <a href="paper-results/">paper-results/</a>. This directory contains two subdirectories:</p>
<ul>
<li><a href="paper-results/measurements">measurements</a> contains two directories. The <a href="paper-results/measurements/individual">individual</a> subdirectory contains the measurements for individual queries for both the <em>inject</em> and <em>repair</em> scenarios. The <a href="paper-results/measurements/all-queries">all-queries</a> subdirectory contains the same data for a run including all queries in sequence. Both directories contain files with time measurement data (starting with <code>times</code>) and the numbers of matches (starting with <code>matches</code>). Each file name contains information on the tool used, the query, and the size of the model.</li>
<li><a href="paper-results/diagrams">diagrams</a> contains the same subdirectories, containing diagrams with the respective measurements. The diagrams are generated from the same data as in the paper, but enlarged for better readability.</li>
</ul>
<p><strong>Please Note:</strong> The measurements were conducted using a timeout for the whole run. If a run was not completed, no individual times of the steps appear in the measurements and diagrams. Thus, some tools do not have measurements for all problem sizes.</p>
*Note: please use the HTML version of this README.*
### Authors
- Jesper Öqvist <jesper.oqvist@cs.lth.se>
- Uwe Aßmann <uwe.assmann@tu-dresden.de>
The benchmark is able to measure different scenarios specified by configurations with several kinds of parameters:

1. **Input Data:** There are two types of input data used in the benchmark, the `inject` and the `repair` data set. The former contains *valid* models, i.e., models which do not contain any of the faults that are supposed to be found by the presented queries. The latter, `repair`, contains models already containing faults.
2. **Queries:** The queries are used to find the aforementioned faults. For each fault, there are two queries: *repair*, to find the fault, and *inject*, to find places where a fault can be injected.
3. **Transformations:** The transformations performed by the benchmark are, again, two sets: *inject* and *repair* transformations.
4. **Transformation Strategies:** The benchmark does not perform the operation on all matches. The strategy *fixed* performs the transformation on a given number of matches, while the *proportional* strategy performs them on a given percentage of all matches.

These settings are defined in a *benchmark scenario*, which can be edited before running the benchmark.
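The *fixed* and *proportional* strategies can be sketched in a few lines of Java. This is illustrative only, not code from the benchmark; the class, method, and parameter names are made up:

```java
public class StrategyDemo {
    /**
     * Returns how many of the current matches a transformation step
     * operates on, depending on the configured strategy.
     */
    static int matchesToTransform(String strategy, int parameter, int totalMatches) {
        if ("fixed".equals(strategy)) {
            // transform a fixed number of matches (at most all of them)
            return Math.min(parameter, totalMatches);
        }
        // "proportional": transform a percentage of all matches
        return totalMatches * parameter / 100;
    }

    public static void main(String[] args) {
        // e.g., a query found 250 matches:
        System.out.println(matchesToTransform("fixed", 10, 250));        // 10
        System.out.println(matchesToTransform("proportional", 10, 250)); // 25
    }
}
```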
### The Source Code
For this publication, we tried to modify the source code of the benchmark itself as little as possible. Therefore, unfortunately, the code base is rather large and confusing. The following section tries to point to the parts relevant for this paper.

The benchmark is structured in modules, some of which form the code of the benchmark, some are provided by the contesting tools, and some are related to required model serializations. There are some naming conventions:

- Tool-related modules are in directories starting with `trainbenchmark-tool`.
- Model serialization-related modules start with `trainbenchmark-generator`.
- All other modules are core modules of the benchmark.

The JastAdd-based solutions use a preprocessor to generate Java files for the presented variant. Each JastAdd configuration must be presented to the benchmark as a separate tool. Thus, there are two directories for each variant, one for the batch processing mode and one for the incremental mode. Because these two modes share almost all the source code, a third directory is used to store this shared code. Finally, there is a directory for code shared between all JastAdd variants. These are the important directories:

- [JastAdd with Name Lookup](trainbenchmark/trainbenchmark-tool-jastadd-namelookup-base)
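To give an intuition for the name-lookup variant listed above, here is a hedged plain-Java sketch. It is not the actual JastAdd implementation (there, lookup would be an attribute on generated tree classes); all names are illustrative. The idea: a non-containment reference is stored as a name token instead of a child node, and resolved by searching the containment tree.

```java
import java.util.ArrayList;
import java.util.List;

public class NameLookupDemo {
    static class Sensor {
        final String name;
        Sensor(String name) { this.name = name; }
    }
    static class Route {
        // Non-containment reference, stored as a name rather than a child node
        final List<String> monitoredSensors = new ArrayList<>();
    }
    // Containment tree: the model owns its sensors and routes.
    static class Model {
        final List<Sensor> sensors = new ArrayList<>();
        final List<Route> routes = new ArrayList<>();

        // Manual name lookup: resolve a cross-tree reference on demand
        Sensor findSensor(String name) {
            for (Sensor s : sensors) {
                if (s.name.equals(name)) return s;
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Model model = new Model();
        model.sensors.add(new Sensor("sensor1"));
        Route route = new Route();
        route.monitoredSensors.add("sensor1");
        model.routes.add(route);
        // The route does not own the sensor; the reference is resolved by lookup
        System.out.println(model.findSensor(route.monitoredSensors.get(0)).name);
    }
}
```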
**<span style="color:red">Please Note: Reproducing the graphs as presented in the paper and supplied here takes a very long time depending on the utilized hardware. It is strongly suggested to run the benchmark with a smaller maximum problem size, fewer repetitions, and a shorter timeout.</span>** Most results of the benchmark are observable with a more restricted setup as well. In the following, we will provide a suggested way to run the benchmark in different sizes. Note that running the benchmark requires a significant amount of disk space (up to 10GB when running the full benchmark).

To reproduce the measurements, there are several options. We provide a prepared Docker image that can be run directly. Alternatively, it is, of course, also possible to simply run the provided gradle build scripts. However, there are some software requirements imposed by the benchmark, particularly for creating the diagrams using R, so we strongly suggest running the Docker variant.
#### Running the Benchmark with Docker
- Variant 1 (*recommended*): Load the provided docker image
    - Prerequisites: An installation of Docker in the `PATH`
    - Steps:
        - Unpack the provided archive and open a terminal in the extracted directory
        - `docker load --input trainbenchmark-docker.tar`
- Variant 2: Build the docker image from the provided Dockerfile
    - Prerequisites: An installation of Docker in the `PATH`
    - Steps:
        - Unpack the provided archive and open a terminal in the extracted directory
        - `docker build -t trainbenchmark .`
##### Running the Docker Image
- `docker run -it -v "$PWD"/docker-results:/trainbenchmark/results:Z -v "$PWD"/docker-diagrams:/trainbenchmark/diagrams:Z trainbenchmark`
- This makes the results and diagrams available outside the container in the directories `docker-results` and `docker-diagrams`, respectively
- Once running, a command prompt is opened and some information is displayed
- Follow the instructions below
#### Running the Benchmark directly
- For running a standard run, use one of the following commands:
| Name | Command | Minimum size | Maximum size | Timeout | Runs |
| --- | --- | --- | --- | --- | --- |