D1.3.4 Functional and Technical Architecture Specifications

EUROPEAN COMMISSION
DIRECTORATE-GENERAL FOR COMMUNICATIONS NETWORKS, CONTENT AND TECHNOLOGY
Future Network
Cloud and Software

Document Control Information

Settings	Value
Document Title:	D1.3.4 Functional and Technical Architecture Specifications
Project Title:	SC1
Document Author:	Sovereign-X
Doc. Version:	0.1
Sensitivity:	Limited
Date:	24 Dec 2025

Document history:

The Document Author is authorised to make the following types of changes to the document without requiring that the document be re-approved:

Editorial, formatting, and spelling
Clarification

To request a change to this document, contact the Document Author or Owner.

Changes to this document are summarised in the following table in reverse chronological order (latest version first).

Revision	Date	Created by	Short Description of Changes
0.1	31 Dec 2025	Sovereign-X	SfR M24 first Iteration - Submitted for Review

1. Introduction

1.1. Scope of this document

The purpose of this document is to describe the functional and technical architecture of Simpl-Open, following the approach further described in the Architecture Approach section. It includes the following content (non-exhaustive list):

A high-level overview of Simpl-Open architecture vision;
A description of Architecture Principles, Assumptions and Decisions that drive the Simpl-Open architecture;
A description of the Simpl-Open architecture from a business, application, data and technology perspective, each of them described using appropriate diagrams (BPMN, ArchiMate, UML);
A description of the Simpl-Open security architecture.

1.2. Target Audience

The intended audience of this document comprises people involved in the architecture, design, integration, testing and maintenance of Simpl-Open.

It mainly targets architects, but can also be helpful for developers, testers and other stakeholders involved in Simpl-Open, as well as stakeholders involved in Simpl-Live or other data spaces interested in integration of Simpl-Open.

1.3. Changes with respect to the previous version

1.3.1. 06 Mar 2026

Updated “ACV Static - Data Orchestration Service” to include the auth proxy and made the asset orchestrator part of release.
Updated “TCV Static - Data Orchestration Service” to include the auth proxy and made the asset orchestrator part of release.

1.3.2. 13 Feb 2026

Updated “LDM - Domain 1 - Access Control & Trust” to include the Role entity in Users & Roles component.
Updated “PDM - Domain 1 - Access Control & Trust” to include the Role table in Users & Roles component.
Updated “ACV Dynamic - BP 03B – Participant User and Roles Configuration” to include enable/disable functionality
Updated “ACV Dynamic - BP 03C - End User Role Request” to include Role Request review functionality in the frontend

1.3.3. 23 Jan 2026

Update “APIs” to include new Onboarding v2 API and Authentication Provider synchronization API

1.3.4. 19 Dec 2025

Update “ACV Static - Tier 1 Authentication Service” to include the external identity provider
Update “TCV Static - Tier 1 Authentication Service” to include the external identity provider
Update “ACV Dynamic - BP 03B – Participant User and Roles Configuration” to include users and roles management
Update “TCV Dynamic - BP 03B - Participant User and Roles Configuration” to include users and roles management
Added “ACV Dynamic - BP 03C - End User Role Request” for end user role request
Added “TCV Dynamic - BP 03C - End User Role Request” for end user role request
Update “APIs” to include new Onboarding and Users&Roles API description
Update “CDM - Domain 1 - Onboarding & IAA” to include roles request in Users&Roles data model
Update “LDM - Domain 1 - Onboarding & IAA” to include roles request in Users&Roles data model
Update “PDM - Domain 1 - Onboarding & IAA” to include roles request in Users&Roles data model
Update “ACV Static - Catalogue Client Service” to remove EDC Connector adapter
Update “TCV Static - Catalogue Client Service” to remove EDC Connector adapter
Update “ACV Static - Resource Offering Service” to remove EDC Connector adapter
Update “TCV Static - Resource Offering Service” to remove EDC Connector adapter
Update “ACV Static - Connector Service” to add EDC Connector adapter
Update “TCV Static - Connector Service” to add EDC Connector adapter
Update “User Interfaces” to mark Identity Provider frontend as a released component
Update “ACV Static - Tier 1 Authentication Service” to include the authenticator plugin
Update “TCV Static - Tier 1 Authentication Service” to include the authenticator plugin
Update “APIs” to include Users And Roles v2 APIs
Fixed descriptions of entities and attributes “LDM - Domain 1 - Onboarding & IAA”
Updated “Simpl-Open Application Architecture” with Data Orchestration Service
Update “ACV - Domain 2 - Publish and consume resources” to include the Sync Schema Adapter, Schema Management Service and Data Orchestration Service
Update “ACV Static - Schema Management Service” to include more description and the schema synch adapter
Added the anonymisation services to “ACV Static - Data Orchestration Service”
Update “ACV Dynamic - BP 06 – Consumer searches resources in data space catalogues” to include the Sync Schema Adapter
Update “ACV Dynamic - BP 05B - Provider manages resource descriptions” to include the Sync Schema Adapter
Updated “User Interfaces” to add Schema Management UI and Data Orchestration UI
Updated “APIs” to add Data Orchestration Interfaces and update Schema Management and Synch Schema Service
Updated “Technology Deployment View” to include the Schema Management Service
Updated “Identification, Authentication & Authorisation” to reflect Roles & Identity Attributes for Schema Management and Data Orchestration
Updated “Simpl-Open Technology Choices” to reflect Apache Fuseki and Dagster
Updated “Custom Components Data Model” to reflect Conceptual, Logical and Physical Data Models for the Sync Schema Adapter
Updated “Simpl-Open Application Architecture” to explain the difference between mandatory and optional services
Updated “Simpl-Open Functional Architecture” to rename domain 1 into Access Control & Trust
Updated “ACV - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “CDM - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “LDM - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “PDM - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “TCV - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “LDM - Domain 2 - Publish and consume resources” to update the Infrastructure Provider Storage
Update “APIs” to include Infrastructure Provider API description
Updated “High-Level Architecture” to reflect the new capabilities map structure and removed “Annex 1 - Architecture Building Blocks” as this is now described in the high-level architecture itself
Updated “Simpl-Open Application Architecture” to refer to the website for NFRs instead of the legacy annex and removed “Annex 3 - Non-Functional Requirements”
Updated “Assumptions and Architecture Decisions” to include missing decisions
Updated “ACV - Domain 3 - Management/Operation of Data Space” with new overview diagram using application services and individual service static views
Updated “TCV - Domain 3 - Management/Operation of Data Space” with new structure and individual service static views
Added “TCV Dynamic - BP 02C - Manage resource description schemas”
Added “TCV Static - Schema Management Service”
Updated “ACV Static - Schema Management Service” with description
Added “ACV Dynamic - BP 02C - Manage resource description schemas”
Updated “CDM - Domain 2 - Publish and consume resources” to include the Schema Synch Service
Updated “LDM - Domain 2 - Publish and consume resources” to include the Schema Synch Service
Updated “PDM - Domain 2 - Publish and consume resources” to include the Schema Synch Service
Added “ACV Static - Schema Synch Service”
Added “TCV Static - Schema Synch Service”
Added Schema Management to “Detailed Technical Specifications”
Updated “User Interfaces” to add Infrastructure UI for deployment script VM templates

1.3.5. 07 Nov 2025

Update “ACV Dynamic - BP 12C – Credentials actions by the Governance Authority” to include the Identity Provider Frontend as officially available
Update “ACV Static - Identity Provider Service” to include Identity Provider Frontend as officially available
Update “CDM - Domain 1 - Onboarding & IAA” to update the data model of Identity Provider and Authentication Provider
Update “LDM - Domain 1 - Onboarding & IAA” to update the data model of Identity Provider and Authentication Provider
Update “PDM - Domain 1 - Onboarding & IAA” to update the data model of Identity Provider and Authentication Provider
Update “APIs” to include the new auto renewal APIs for the authentication provider component

1.3.6. 26 Sep 2025

Update section “Simpl-Open Application Architecture” to reflect the new structure of the section
Update section “ACV - Domain 2 - Publish and consume resources” with new overview diagram using application services
Add section “ACV - Domain 2 - Publish and consume resources - Static Views” containing individual service static views
Add section “ACV - Domain 2 - Publish and consume resources - Dynamic Views” to reorganise the already existing dynamic views
Update section “Simpl-Open Technology Architecture” to reflect the new structure of the section
Update section “TCV - Domain 1 - Onboarding & IAA” to reflect the new structure of the section
Add section “TCV - Domain 1 - Onboarding & IAA - Static Views” containing individual service static views
Add section “TCV - Domain 1 - Onboarding & IAA - Dynamic Views” to reorganise the already existing dynamic views
Update section “TCV - Domain 2 - Publish and consume resources” to reflect the new structure of the section
Add section “TCV - Domain 2 - Publish and consume resources - Static Views” containing individual service static views
Add section “TCV - Domain 2 - Publish and consume resources - Dynamic Views” to reorganise the already existing dynamic views
Removed hyperlinks from section “User Interfaces”
Update section “LDM - Domain 1 - Onboarding & IAA” to update the data model for the security attributes provider and authentication provider components
Update section “PDM - Domain 1 - Onboarding & IAA” to update the data model for the security attributes provider and authentication provider components
Update section “APIs” to update Security Attributes Provider Tier1 and Tier2 v2 APIs, Identity Provider Tier1 and Tier2 v2 APIs, Identity Provider Tier1 v2 APIs
Update section “ACV Static - Tier2 Authentication Service” to include the communication with Security Attributes Provider and Identity Provider

1.3.7. 05 Sep 2025

Update section “APIs” to include Security Attributes Provider Tier1 and Tier2 v2 APIs, Identity Provider Tier1 and Tier2 v2 APIs, Identity Provider Tier1 v2 APIs
Update section “User Interfaces” to include participant management functionalities in the Onboarding frontend
Update section “User Interfaces” to include credential renewal functionalities in the participant utility frontend
Update section “CDM - Domain 1 - Onboarding & IAA” to update the data model for identity provider and authentication provider components
Update section “LDM - Domain 1 - Onboarding & IAA” to update the data model for identity provider, security attributes provider and authentication provider components
Update section “PDM - Domain 1 - Onboarding & IAA” to update the data model for identity provider, security attributes provider and authentication provider components
Update section “ACV - Domain 1 - Onboarding & IAA” to include credential renewal flows
Create section “ACV Dynamic - BP 12C – Credentials actions by the Governance Authority”
Update section “TCV - Domain 1 - Onboarding & IAA” to include credential renewal flow
Create section “TCV Dynamic - BP 12C – Credentials actions by the Governance Authority”
Update section “ACV Dynamic - BP 07 - Consumer and Provider establish a usage contract for selected catalogue items” to include traceability to the business process on the diagram
Update section “ACV Dynamic - WF 12B - Local Node Logging and Monitoring” to include traceability to the business process on the diagram
Update section “ACV Dynamic - BP 09A - Consumer consumes a data resource from a Provider” to include traceability to the business process on the diagram
Update section “ACV Dynamic - BP 09B - Consumer receives a data processing service on a data resource via an application” to include traceability to the business process on the diagram
Update section “Application Components Views” to reflect the new structure of the section
Update section “ACV - Domain 1 - Onboarding & IAA” with new overview diagram using application services
Add section “ACV - Domain 1 - Onboarding & IAA - Static Views” containing individual service static views
Add section “ACV - Domain 1 - Onboarding & IAA - Dynamic Views” to reorganise the already existing dynamic views
Added first version of orchestration platform to “ACV - Domain 2 - Publish and consume resources”
Update section “ACV Dynamic - BP 08 - Consumer consumes an infrastructure resource from a Provider” to add traceability to BPs

1.3.8. D1.3.2 → D1.3.3

Update section “PDM - Domain 1 - Onboarding & IAA”
Update section “LDM - Domain 1 - Onboarding & IAA”
Update section “CDM - Domain 1 - Onboarding & IAA”
Update section “ACV - Domain 1 - Onboarding & IAA” to include Document Validation and Hashicorp Vault technology
Update section “TCV - Domain 1 - Onboarding & IAA” to include Document Validation and Hashicorp Vault technology
Update section “TCV Dynamic - BP 03A - Onboarding of a participant - Tier II”
Update section “TCV Dynamic - BP 03B - Onboarding Tier 1 - Organisation Local IDP(Directory) Connection/Mapping” to include Document Validation and Hashicorp Vault technology
Update section “ACV Dynamic - BP 03A - Onboard a Participant” to include ArchiMate Refactoring, traceability with BP03A and Document Validation Service
Update section “ACV Dynamic - BP 03B - Connect/map Organisation Local IDP (Directory)” to include ArchiMate cleanup and traceability with BP03B
Update section “APIs” to include IAA OpenAPI definition, AsyncAPI definition, API descriptions
Update section “Data Space Concepts” to include the new “Anatomy of a Simpl-Open service” section
Update section: “ACV - Domain 2 - Publish and consume resources” with new components & APIs to include EDC Connector Adapter, Validation Service and Contract Consumption Adapter
Update section: “ACV Dynamic - BP 05 - Add or Update Resource (Publish) on Catalogue”
Update section: “ACV Dynamic - BP 09A - Consumer consumes a data resource from the provider”
Update section: “ACV Dynamic - BP 09B - Consumer receives data processing service over a dataset via an Application”
Update section: “TCV - Domain 2 - Publish and consume resources” with new components & APIs to include EDC Connector Adapter, Validation Service and Contract Consumption Adapter
Update section: “TCV Dynamic - BP 05 - Add or Update Resource (Publish) on Catalogue”
Update section: “TCV Dynamic - BP 09A - Consumer consumes a data resource from the provider”
Update section: “TCV Dynamic - BP 09B - Consumer receives data processing service over a dataset via an Application”
Update section: “APIs” to include EDC Connector Adapter, Validation Service and Contract Consumption Backend
Update section: “ACV Dynamic - BP 08 - Consumers select and use an Infrastructure Catalogue Resource from the Infrastructure Provider”
Update section: “TCV Dynamic - BP 08 - Consumers select and use an Infrastructure Catalogue Resource from the Infrastructure Provider”
Update section: “Technology Deployment View”
Add “Architecture Patterns” section into the “Architecture Framework” and remove “Annex 4 - Architecture Patterns”
Remove section “List of Business Processes” and refer to the website instead
Update section “Self-Description Tooling” to remove the Flow Diagram (duplication with ACV Dynamic)
Update section “Simpl-Open Application Architecture”
Update section: “APIs” to include post-configuration and decommissioning
Update section: “Simpl-Open Technology Choices” to include Terraform related technologies
Update section: “User Interfaces” with the Infrastructure Deployment Script Management UI
Update section: “Infrastructure Provisioning”
Updated section “Annex 2 - Mapping between functional requirements and components” with latest list of requirements
Update section: “Open-Source Components Data Model” to include OpenTofu
Add section “Digital Identities integration with EU Digital Identity Framework - eIDAS” in the “Data Spaces Concepts” section
Update section: “Simpl-Open Security Architecture” with updated diagrams
Update section: “LDM - Domain 2 - Publish and consume resources” with updated Infrastructure Provider fields
Update section: “PDM - Domain 2 - Publish and consume resources” with updated Infrastructure Provider fields
Update “Architecture Decisions Record” with latest decisions
Updated “ACV Dynamic - BP 05B - Publish and consume” according to BP and renamed to “ACV Dynamic - BP 05B - Manage resources”
Removed (and replaced by “ACV Dynamic - BP 05B - Manage resources” ) sections:
- “ACV Dynamic - BP 05B - Request sd”
- “ACV Dynamic - BP 05B - Retrieve status”
- “ACV Dynamic - BP 05B - Update SD status”

1.3.9. D1.3.1 → D1.3.2

Add section “User Interfaces” into the “Simpl-Open Application Architecture”
Add section “Custom Components Data Model” into the “Simpl-Open Data Architecture”
“Simpl-Open Security Architecture” enhanced
“UI/UX Style Guide” referenced in the UIs section
Added Health checks and Tracing for Monitoring component

2. Simpl-Open High-Level Overview

2.1. Simpl-Open Description

With the ongoing exponential growth of data, there is a pressing need within the European Union to provide access to resilient and competitive data storage and processing capacities for both the private and the public sector. In particular, the European Commission aims to address the need for more data sharing and decentralised data processing closer to the user (at the edge). It is also critical to deploy EU data services in the public and private sectors to grant Europe a leading status as a data-driven society and improve data usage within the European Union. The data services of various organisations within the same industry sector should be abstracted into sector-specific Data Spaces. This could bring several benefits, such as greater productivity, improvements in health and well-being, adaptation to environment and climate change, transparent governance and convenient public services.

To support the above-mentioned objectives, the European Commission is creating an Open-Source, multi-vendor, large-scale, modular and interoperable middleware called “Simpl-Open”. Simpl-Open will be the basis for a European Cloud Federation enabling the operation and interconnection within and in between various European data spaces and the safe migration of the users to the cloud.

Simpl-Open will federate data, application and infrastructure across the European Union with secure, resilient, energy efficient and accessible cloud-to-edge capabilities. It will allow EU stakeholders to pool together resources to create more business value, increase resource usage efficiency and reduce costs and duplication of efforts. Simpl-Open considers both the public sector as well as EU business as core stakeholders. Using the features provided by Simpl-Open, an open marketplace for EU resources will be created that enables energy-efficient reuse of efforts achieved by other EU participants.

The following figure displays how the architecture vision of Simpl-Open maps the five actor groups (see definition in Actors section) and different Data Spaces:

At the core of Data Spaces lie the five types of actors that Simpl-Open considers. These actors are a symbolic representation of a distributed network of cooperating parties in an open ecosystem. Simpl-Open, represented by the Simpl-Open Agent, spans across these actors, enabling asset sharing between them. It provides common services on which Data Spaces can be built. Simpl-Open stays agnostic to the specifics of a particular Data Space, allowing additional Data Space specific services to be added on top of Simpl-Open. This added layer can, for example, contain standards on data representation, enforce common quality certifications, or define peer review rules to assess data quality. The Data Space specific services tailor the ecosystem beyond simple sharing of assets; they make sure those assets become valuable to participants.

Simpl-Open does not only aim to be used to build Data Spaces but it also creates interoperability between Data Spaces. As multiple Data Spaces incorporate Simpl-Open, Data Spaces become more connected. This enables services to cross the boundaries of specific Data Spaces. Such services will initially be more limited, as Simpl-Open cannot capture the details of all different Data Spaces. It will be up to the user to deal with the specifics of each Data Space in interpreting the assets that it obtains. To make this illustrative view more tangible, the following figure presents an example of how a set of distributed actors might interconnect to form a Data Space. It is important to note that this figure displays one possible scenario of many possible ways different participants might interact. The number of participants in a Data Space or the number of stakeholders behind a single actor is only limited by its technical feasibility. This implies that large numbers of participants and stakeholders can interact simultaneously. The Simpl-Open Agent in the figure serves as an abstract component that actors need to deploy to become part of the Data Space.

It is important to note that each of the displayed actors is an abstraction of the internal systems of one or more stakeholders. The deployment of Simpl-Open in a Data Space can have various degrees of granularity. The stakeholder behind an actor can be an individual user who has the capabilities to deploy a Simpl-Open Agent, or it can be an entire data sharing initiative on its own. It is up to the Data Space governance authority to decide how Simpl-Open best provides value and what level of deployment granularity fits best.

2.2. Data Space Concepts

This section defines general concepts that are necessary for a good understanding of the Simpl-Open documentation and ecosystem in general.

2.2.1. Actors and Data Space Deployment

As described above, a Data Space consists of actors (individuals or entities) who need to interact with each other. In Simpl-Open, it is assumed that individuals (called end-users) are always part of an entity (called participant). A participant can operate one or multiple nodes, which represent a distinct and/or isolated set of IT resources and can participate in different Data Spaces with various participant roles (including Data Provider, Application Provider, Consumer, Infrastructure Provider and - exactly one - Governance Authority). Nodes can also be spread across physical locations.

Example: a university (= participant) which embodies students, researchers or accountants (= end-users). The university has a dedicated network (with connected IT resources) for its sciences department (= the sciences node) hosted in Paris (= physical location) and a dedicated network for its economy department (= the economy node) hosted in Rome (= physical location). The sciences department might want to offer the results of its research (= data provider role) to a science-related Data Space while the economy department might want to consume data (= consumer role) from an economy-related Data Space.

Simpl-Open Agent is a middleware, to be deployed on each node, acting as a local gateway for secure communication within a Data Space.

The following diagram illustrates this deployment view:

The following diagram translates the above deployment view into a data domain model to better understand the relationship and cardinality between the different entities.
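As a rough illustration of the entities and cardinalities described above, the university example can be written as a small data model. This is only a sketch: the class and field names are hypothetical, and attaching roles to a node per Data Space is an assumption made for the example, not the normative Simpl-Open data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class ParticipantRole(Enum):
    DATA_PROVIDER = "Data Provider"
    APPLICATION_PROVIDER = "Application Provider"
    INFRASTRUCTURE_PROVIDER = "Infrastructure Provider"
    CONSUMER = "Consumer"
    GOVERNANCE_AUTHORITY = "Governance Authority"


@dataclass
class Node:
    """A distinct set of IT resources on which a Simpl-Open Agent is deployed."""
    name: str
    location: str
    # Assumption: roles are tracked per Data Space name.
    memberships: dict[str, set[ParticipantRole]] = field(default_factory=dict)

    def join(self, data_space: str, role: ParticipantRole) -> None:
        self.memberships.setdefault(data_space, set()).add(role)


@dataclass
class Participant:
    """An entity that groups end-users and operates one or more nodes."""
    name: str
    end_users: list[str] = field(default_factory=list)
    nodes: list[Node] = field(default_factory=list)


def governance_authorities(participants: list[Participant], data_space: str) -> list[str]:
    """Names of participants acting as Governance Authority in a Data Space."""
    return [
        p.name
        for p in participants
        if any(
            ParticipantRole.GOVERNANCE_AUTHORITY in n.memberships.get(data_space, set())
            for n in p.nodes
        )
    ]


# The university example from the text.
sciences = Node("sciences", "Paris")
sciences.join("science-data-space", ParticipantRole.DATA_PROVIDER)
economy = Node("economy", "Rome")
economy.join("economy-data-space", ParticipantRole.CONSUMER)
university = Participant(
    "University", ["student", "researcher", "accountant"], [sciences, economy]
)

authority_node = Node("governance", "Brussels")  # hypothetical node
authority_node.join("science-data-space", ParticipantRole.GOVERNANCE_AUTHORITY)
authority = Participant("Science Authority", [], [authority_node])

# A Data Space is expected to have exactly one Governance Authority.
found = governance_authorities([university, authority], "science-data-space")
```

A check such as `len(found) == 1` expresses the "exactly one Governance Authority" constraint for a given Data Space.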

2.2.2. Data Space Participant: Tier I and Tier II

Only the high-level concepts of Tier I and Tier II are presented here, as they are required for a good understanding of the next sections. More details can be found in the Simpl-Open Application and Technology Architecture, especially in relation to Domain 1.

Identification, authentication and authorisation are of paramount importance within a Data Space.

The identification must be supported by the governance authority. This authority is in charge of reviewing the identity details of organisations that want to participate in the Data Space. If the authority approves the participation of an organisation in the Data Space, it provides a proof of identification that the organisation installs in its Simpl-Open Agent. With this proof, the participant authenticates itself to other Data Space participants, and other participants define authorisation rules based on the verifiable identity.

The identification system plays an important role in two functionalities of Simpl-Open:

Establish secure communication channels;
Provide the information on which participants can base themselves to define access and usage policies.

To keep the identification system manageable in a large-scale environment, the identification is split into two tiers:

The first tier manages the identification, authentication and authorisation of the organisation’s members (humans or machines) to use the Simpl-Open Agent of their organisation;
The second tier identifies and authenticates the organisation as a whole in the Simpl network.

The figure below depicts this two-tier approach:

In the first tier, the Simpl-Open Agent connects to the preferred IAA system of the organisation: EU Login, eID, Microsoft AD, OpenID Connect, etc. This mechanism is already well established and not unique to Simpl.

The second tier involves the machine-to-machine authentication and identification of an organisation in the Simpl network. Each organisation holds an “Identity” file to support the identification, authentication and authorisation of the organisation in the Data Space. Recalling the two functionalities that the IAA supports, the necessary content of this Identity file becomes apparent:

For the establishment of a secure communication channel between participants (1), the Identity file should contain a proof of the organisation’s public key. Each Data Space participant will create a cryptographic public/private keypair that is used in the asymmetric authentication mechanisms needed to establish a communication channel. An example of how such a secure communication channel can be established is the well-known TLS/SSL protocol. The Identity file associates the public key of an organisation with its identity. Proving the identity of the organisation then becomes proving the possession of the private key that belongs to the respective public key. This way, the organisation can be authenticated in the network and a secure communication channel can be established.
Access control and authorisation by providers (2) can be performed based on custom identity attributes of an organisation. Examples of such attributes are the organisation name, geographical location, whether it is a private or public institution, etc. Based on these attributes, providers can define access and usage policies for their resources. For example, a provider can open a resource to all public institutions, or to all participants from a specific Member State. On the other hand, the access control policies can be more stringent and access is only allowed for a specific organisation. The Identity file proves the attributes of an organisation and, as such, ensures the trust on which a provider can rely to enforce their access control.
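The attribute-based policies described in (2) can be pictured as predicates over the identity attributes of the requesting organisation. The sketch below is illustrative only: the attribute names (`organisation_name`, `institution_type`, `member_state`) and the helper functions are hypothetical and do not reflect the actual Simpl-Open policy format.

```python
from typing import Callable

Attributes = dict[str, str]
Policy = Callable[[Attributes], bool]


def attribute_equals(name: str, value: str) -> Policy:
    """Policy that holds when the given identity attribute has the given value."""
    return lambda attrs: attrs.get(name) == value


def any_of(*policies: Policy) -> Policy:
    """Policy that holds when at least one of the given policies holds."""
    return lambda attrs: any(policy(attrs) for policy in policies)


# Examples mirroring the text: open to all public institutions, open to all
# participants from one Member State, or restricted to one organisation.
public_institutions = attribute_equals("institution_type", "public")
from_italy = attribute_equals("member_state", "IT")
only_acme = attribute_equals("organisation_name", "ACME Research")

# Identity attributes as they would be proven by the Identity file (hypothetical values).
consumer = {
    "organisation_name": "University of Rome",
    "institution_type": "public",
    "member_state": "IT",
}

open_resource_allowed = any_of(public_institutions, from_italy)(consumer)
restricted_resource_allowed = only_acme(consumer)
```

With these values the consumer satisfies the broad policy but not the policy restricted to a single organisation. A missing attribute never satisfies a policy, so access is denied by default.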

2.2.3. Anatomy of a Simpl-Open service

Simpl-Open is deployed within the participant organization’s premises (e.g., data center) and is intended to be connected to the internet via a firewall. Only the Tier 2 Gateway is designed to be exposed through the firewall, enabling agent-to-agent communication.

The Tier 1 Gateway is intended to remain privately accessible within the organization’s internal network.

Simpl-Open Agents consist of a tailored set of Simpl-Open services, depending on the participant’s role (e.g., Consumer, Data Provider, Infrastructure Provider, Application Provider).

Each Simpl-Open service includes the following components:

Tier 1 Frontend: Accessible by the organization’s end users, this interface provides access to the agent’s functionalities.
Tier 1 Gateway: Secures internal traffic and enforces Role-Based Access Control (RBAC) policies.
Local Tier 1 Backend: Located behind the Tier 1 Gateway, it delivers local services to the agent and may also interact with remote Tier 2 Backends.
Tier 2 Gateway: Secures inter-organizational communications and enforces Attribute-Based Access Control (ABAC) policies.
Remote Tier 2 Backend: Accessed through the Tier 2 Gateway, it offers services to external agents.
Local Resource (Data/Infrastructure/Application): Resources owned by the organization but external to the Simpl-Open Agent, accessible through the Local Tier 1 Backend.
Remote Resource (Data/Infrastructure/Application): Resources owned by another organization and external to its Simpl-Open Agent, accessible through the Remote Tier 2 Backend.
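The division of responsibilities between the two gateways listed above can be sketched as follows. This is a simplified model under stated assumptions: the request types, the operation names and the deny-by-default behaviour for unknown operations are hypothetical choices for the example, not the Simpl-Open gateway implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier1Request:
    """Request from an end user inside the organisation's network."""
    user: str
    roles: frozenset[str]
    operation: str


@dataclass(frozen=True)
class Tier2Request:
    """Request from another organisation's agent, arriving through the firewall."""
    agent: str
    identity_attributes: frozenset[str]
    operation: str


class Tier1Gateway:
    """Internal gateway: RBAC on the roles of the end user."""

    def __init__(self, required_role: dict[str, str]) -> None:
        self._required_role = required_role  # operation -> role

    def allows(self, request: Tier1Request) -> bool:
        required = self._required_role.get(request.operation)
        # Unknown operations are denied (assumption of this sketch).
        return required is not None and required in request.roles


class Tier2Gateway:
    """Exposed gateway: ABAC on the identity attributes of the calling agent."""

    def __init__(self, required_attribute: dict[str, str]) -> None:
        self._required_attribute = required_attribute  # operation -> attribute

    def allows(self, request: Tier2Request) -> bool:
        required = self._required_attribute.get(request.operation)
        return required is not None and required in request.identity_attributes


tier1 = Tier1Gateway({"manage_users": "ADMIN"})
tier2 = Tier2Gateway({"search_catalogue": "CONSUMER"})

admin_ok = tier1.allows(Tier1Request("alice", frozenset({"ADMIN"}), "manage_users"))
viewer_ok = tier1.allows(Tier1Request("bob", frozenset({"VIEWER"}), "manage_users"))
agent_ok = tier2.allows(
    Tier2Request("consumer-agent", frozenset({"CONSUMER"}), "search_catalogue")
)
```

Keeping the two checks in separate components mirrors the deployment described above, where only the Tier 2 Gateway is reachable from outside the organisation.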

According to the Architecture Vision documents, services are categorized into two types: Built-in Services and Access-Through Services.

2.2.4. Built-in Services

Built-in services are services that Simpl-Open offers to end users and are completely implemented by the middleware.

Local Built-in Services

An example of a local Built-in Service is the User & Roles component. This service enables agent administrators to manage users and roles locally within the scope of the agent.

Cross-Agent Built-in Services

The Catalog is an example of a Cross-Agent Built-in Service. It allows the local Tier 1 Catalog Backend of a consumer to interact with the Remote Tier 2 Catalog service provided by the Governance Authority.

2.2.5. Access-through Services

This kind of service enables access to the Application/Data/Infrastructure resources that providers can offer through the Simpl-Open middleware.

Local Access Through

A Local Access-Through Service enables an external application (or frontend) to access a participant’s internal resource via the local Tier 1 component. Currently, none of the services within Simpl-Open operate in this manner.

Remote Access Through

A Remote Access-Through Service enables access to a remote resource via the local Tier 1 component, which communicates with the remote Tier 2 component through Tier 2 communication.

2.2.6. Simpl-Open Service Template - The Echo Service

The echo service is an example of a cross-agent built-in service that makes it possible to check whether the connection and the attribute exchange between participants are working. A boilerplate example of the Echo local and remote backend has been open-sourced here.
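The idea behind such an echo check can be sketched in a few lines. The code below is not the open-sourced boilerplate; it is a hypothetical in-process model in which the remote Tier 2 backend returns the message and the identity attributes it received, and the local Tier 1 backend verifies that both arrived intact.

```python
from typing import Callable

EchoReply = dict[str, object]
RemoteEcho = Callable[[str, frozenset[str]], EchoReply]


def remote_echo_backend(message: str, caller_attributes: frozenset[str]) -> EchoReply:
    """Tier 2 side: report back the message and the attributes that were received."""
    return {"message": message, "received_attributes": sorted(caller_attributes)}


def local_echo_backend(
    message: str,
    own_attributes: frozenset[str],
    remote: RemoteEcho = remote_echo_backend,
) -> bool:
    """Tier 1 side: call the remote echo and check message and attribute exchange."""
    reply = remote(message, own_attributes)
    message_ok = reply["message"] == message
    attributes_ok = set(reply["received_attributes"]) == set(own_attributes)
    return message_ok and attributes_ok


def lossy_remote(message: str, caller_attributes: frozenset[str]) -> EchoReply:
    """Simulates a broken attribute exchange: the attributes never arrive."""
    return {"message": message, "received_attributes": []}


healthy = local_echo_backend("ping", frozenset({"CONSUMER"}))
broken = local_echo_backend("ping", frozenset({"CONSUMER"}), remote=lossy_remote)
```

In a real deployment the call between the two functions would go through the Tier 2 Gateways of both agents; here it is a direct function call to keep the example self-contained.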

2.3. Access Control & Trust

How IAA works at a high level:

Roles are used to enforce RBAC (role-based access control) on end users who access Simpl-Open functionalities in tier 1;
Identity Attributes are used to enforce ABAC (attribute-based access control) in the agent-to-agent (node-to-node) communication in tier 2;
Assignable Identity Attributes can be assigned to Roles, enabling end users belonging to those roles to act on behalf of the Participant in a certain context.
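One way to read the relationship between Roles and Assignable Identity Attributes is sketched below: the attributes an end user can act with in tier 2 are those assigned to the user's roles, limited to the attributes the Participant actually holds. The role names, attribute names and the intersection rule are assumptions made for this illustration.

```python
def effective_attributes(
    user_roles: set[str],
    role_attributes: dict[str, set[str]],
    participant_attributes: set[str],
) -> set[str]:
    """Identity attributes a user may act with on behalf of the Participant."""
    granted: set[str] = set()
    for role in user_roles:
        granted |= role_attributes.get(role, set())
    # A role cannot grant an attribute the Participant does not hold.
    return granted & participant_attributes


# Hypothetical configuration of one agent.
participant_attributes = {"CONSUMER", "RESEARCH_ORG", "DATA_PROVIDER"}
role_attributes = {
    "researcher": {"CONSUMER", "RESEARCH_ORG"},
    "publisher": {"DATA_PROVIDER"},
    "auditor": {"GOVERNANCE_AUTHORITY"},  # not held by this Participant
}

researcher_view = effective_attributes(
    {"researcher"}, role_attributes, participant_attributes
)
auditor_view = effective_attributes(
    {"auditor"}, role_attributes, participant_attributes
)
```

A user with only the `researcher` role can act as a consumer but cannot publish, while the `auditor` role grants nothing because the Participant itself does not hold that attribute.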

Second Tier IAA - X.509 certificates with dynamic attribute provisioning

For clarification, the following example shows how Tier 2 works in practice:

1.1 - John Doe logs into the Consumer IAA Tier 1 System.

1.2 - The IAA Tier 1 System retrieves the user roles from the User roles module of the Simpl-Open Agent and grants John Doe the rights to access the Data Space functionality through the Consumer’s Simpl-Open Agent. From now on, all actions performed by John Doe are actually performed by the Consumer’s Simpl-Open Agent, which in turn interacts with the other Simpl-Open Agents (Provider and/or Data Space built-in capabilities).

2.1 - John Doe makes the infrastructure request to the Provider Simpl-Open Agent, which validates it against the Access Control and Trust capability.

2.2 - Provider and Consumer authenticate each other using mutual X.509 TLS authentication.

2.3 - Provider and Consumer verify the validity of the X.509 certificates through the Identity provider federation.

2.4 - Provider enforces the access control policy based on embedded identity attributes and authorises the Consumer Simpl-Open Agent.

2.5 - Consumer requests an ephemeral proof of its own identity attributes from the Identity provider federation.

2.6 - Identity provider federation responds to the Consumer with an ephemeral proof containing its identity attributes.

2.7 - Consumer sends the ephemeral proof with identity attributes to the Provider.

2.8 - Provider checks and validates the ephemeral proof, then enforces the access control policy based on the embedded identity attributes and authorises the Consumer Simpl-Open Agent.

3.1 - Once verification against Access Control and Trust has passed, the Provider uses its own Infrastructure/User data services module to fulfil the received request:

3.2 - Provider checks the policies by querying the Contracts module.

3.3 - Provider enforces the retrieved contract policies.

3.4 - Provider activates the Provisioning module to provision the requested resource.

4 - Provider returns an affirmative response to the Consumer’s request.
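Steps 2.5 to 2.8 above can be sketched as follows. This is a simplified illustration under stated assumptions: an HMAC with a shared demo key stands in for the federation's signature (a real deployment would rely on X.509-based asymmetric signatures), and the proof layout, the function names and the attribute values are hypothetical.

```python
import hashlib
import hmac
import json
import time

# Stand-in for the federation's signing key (assumption, demo only).
FEDERATION_KEY = b"demo-only-key"


def issue_ephemeral_proof(participant_id, attributes, ttl_seconds=60, now=None):
    """Steps 2.5/2.6: the federation issues a short-lived attribute proof."""
    issued_at = time.time() if now is None else now
    payload = {
        "sub": participant_id,
        "attributes": sorted(attributes),
        "expires_at": issued_at + ttl_seconds,
    }
    body = json.dumps(payload, sort_keys=True).encode()
    signature = hmac.new(FEDERATION_KEY, body, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def validate_proof(proof, now=None):
    """Step 2.8, first half: check integrity and freshness of the proof."""
    body = json.dumps(proof["payload"], sort_keys=True).encode()
    expected = hmac.new(FEDERATION_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, proof["signature"]):
        return False
    current = time.time() if now is None else now
    return current < proof["payload"]["expires_at"]


def authorise(proof, required_attributes, now=None):
    """Step 2.8, second half: enforce the attribute-based policy."""
    if not validate_proof(proof, now=now):
        return False
    return set(required_attributes) <= set(proof["payload"]["attributes"])


# Step 2.7: the Consumer forwards the proof to the Provider.
proof = issue_ephemeral_proof("consumer-1", {"CONSUMER"}, now=1000.0)
granted = authorise(proof, {"CONSUMER"}, now=1030.0)
```

Because the proof is short-lived, a Provider evaluating it after `expires_at` rejects the request even if the attributes themselves would satisfy the policy.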

The process explained above is depicted below:

2.4. Digital Identities integration with EU Digital Identity Framework - eIDAS

In Simpl-Open, Digital Identities are the basis on which the core IAA functionalities are built: the Governance Authority relies on an X.509 Certification Authority building block to issue and manage the digital identities of the Participants that onboard. These identities are used exclusively for Tier 2 authentication. Furthermore, a full integration with the EUDI Framework (planned in the Simpl-Open roadmap) will enable the middleware to be used in all possible scenarios, from the least to the most demanding in terms of trust and regulatory compliance. This section describes how these digital identities are used and how the middleware is designed to integrate with the EUDI Framework.

2.4.1. eIDAS - EUDI Framework

The eIDAS - EUDI Framework consists of two main elements, which represent the main functionalities offered:

Trust Services

Create and validate electronic signatures, seals, time stamps, delivery services and certificates for website authentication.

eSignature

Create and verify electronic signatures in line with European standards.

This is the European Commission’s digital building block that was created to enable applications to integrate with eIDAS Trust Services.

Electronic Identification

Electronically identify users from all across Europe.

eID

Offers digital services capable of electronically identifying users from all across Europe.

This is the European Commission’s digital building block that was created to enable applications to integrate with eIDAS Electronic Identification.

The two elements will be joined in the EUDI Wallet, which will be used for both Electronic Identification and Qualified Electronic Signatures.

2.4.2. Digital Identities in Simpl

Digital identities in Simpl-Open are split into two kinds:

Issued by the Governance Authority and exclusively dedicated to IAA operations such as intra-agent secure communication (mTLS), ABAC policy enforcement, etc.
Used for both electronic identification and electronic signatures. Simpl-Open is designed to permit the selection of the electronic signature level that best fits the scenario to cover (e.g. a qualified electronic signature for contracts, and an advanced electronic signature for service offering self-descriptions).

Electronic Identification Use Cases (applicable only on Tier 1)

Onboarding using Electronic Identification

The Dataspace Governance Authority can decide to use the identification information provided by eID during the onboarding process to simplify and speed up the approval of the onboarding request.

Organisations, such as universities, can decide to rely on the identification information provided by eID to identify their end users and give them roles/permissions.

Trust Services

Participants’ end users eIDAS Electronic Signatures

Applies in contexts where the Dataspace Governance Authority requires that a certain participant end user (e.g. the legal representative of a Participant) signs Contracts, SLAs, Terms and Conditions, Agreements, etc. using Qualified Electronic Signatures.

Participants decide when the eIDAS QES is required

Applies in contexts where a Data Provider requires that, for a certain Service Offering, additional Contracts, SLAs, Terms and Conditions, or Agreements be signed by the Consumer using Qualified Electronic Signatures.

2.4.3. References

eIDAS - EUDI Framework - https://eidas.ec.europa.eu/efda/home#/screen/home

eSignature - https://ec.europa.eu/digital-building-blocks/sites/display/DIGITAL/eSignature

eID - https://ec.europa.eu/digital-building-blocks/sites/display/DIGITAL/eID

2.5. Connector

The IDSA Reference Architecture Model defines the connector as being the technical core component required for a participant to join a Data Space.

DSSC defines the connector as a technical software component that is run by (or on behalf of) a participant and that provides connectivity with similar components run by (or on behalf of) other participants, to enable the secure and trusted sharing of data.

A connector can provide more functionality than what is strictly related to connectivity. The connector can offer technical modules that implement data interoperability functions, authentication interfacing with trust services and authorisation, resource description, contract negotiation, etc.

DSSC uses “participant agent services” as the broader term to define these services.

DSSC also distinguishes the 2 major components that make up a connector:

The control plane is responsible for deciding how data is managed, routed and processed. For example: the control plane handles the identification of users and the handling of access and usage policies.
The data plane handles the actual exchange of data.

This implies that the control plane can, by its nature, be standardised to a high degree, while the data plane is likely to be different for each Data Space (as different types and sorts of data exchange take place in each Data Space).

The data plane needs to be integrated with the control plane to ensure that it can work with the necessary control mechanisms.

DSSC identifies the different categories of components within a Data Space, making a distinction between the (1) participant agent (= connector in DSSC vocabulary) and (2) shared services:

Within the control plane, several components can be identified:

A Participant Wallet: providing participants with the ability to store and exchange identities and other attestations. For instance, in the form of Verifiable Credentials.
A Data, Services & Offerings Catalogue: providing participants with the ability to share (on a technical level) the data, services and offerings which are provided through the data plane.
Components for Contract Negotiation: providing participants with the ability to share data access and usage policies with others in the Data Space and to enforce these on the data plane. For instance: to create an authorisation registry, which - based on policies - can determine who gets access to a certain data set or service.

On the data plane there is the actual transfer process. As indicated before, the data plane is likely to be highly application specific. It should however work in conjunction with the control plane, e.g. to ensure that no data sharing can start before certain conditions are met (identification, contract negotiation, etc.).

Note that components of the connector can have different granularities. They can be conceived as an integrated component, but they can also consist of multiple (packaged) components (e.g. with a separated, but linked, component for Participant Wallet).

Concretely for Simpl-Open, a connector is used to implement the three parts of the IDSA Data Space Protocol:

Publication and request of catalogue items - mapping to the Data, Services & Offerings Catalogue component of the control plane;
Contract negotiation - mapping to the Contract Negotiation component of the control plane;
Data transfer process - mapping to the data plane.

The control plane of the connector is also used as orchestrator between the 3 parts.

The current implementations of connectors do not cover all the needs envisioned in Simpl-Open and therefore extension points are planned, for instance, to cover the infrastructure provisioning.
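The way the control plane gates the data plane across the three protocol parts can be sketched as follows. The class names, method names, offering identifier and policy strings are hypothetical; the sketch only illustrates the ordering constraint (catalogue request, then contract negotiation, then transfer) and does not implement the Data Space Protocol messages.

```python
from dataclasses import dataclass, field


@dataclass
class ControlPlane:
    """Hypothetical control plane: holds the catalogue and agreed contracts."""

    catalogue: dict  # offering id -> usage policy
    agreements: set = field(default_factory=set)

    def request_catalogue(self) -> list:
        return sorted(self.catalogue)

    def negotiate(self, consumer: str, offering_id: str, accepted_policy: str) -> bool:
        """Record an agreement only if the consumer accepts the offered policy."""
        if self.catalogue.get(offering_id) != accepted_policy:
            return False
        self.agreements.add((consumer, offering_id))
        return True

    def may_transfer(self, consumer: str, offering_id: str) -> bool:
        return (consumer, offering_id) in self.agreements


@dataclass
class DataPlane:
    """Hypothetical data plane: moves data only when the control plane agrees."""

    control: ControlPlane
    store: dict

    def transfer(self, consumer: str, offering_id: str) -> bytes:
        if not self.control.may_transfer(consumer, offering_id):
            raise PermissionError("no contract agreement for this offering")
        return self.store[offering_id]


control = ControlPlane(catalogue={"dataset-42": "research-use-only"})
data = DataPlane(control=control, store={"dataset-42": b"payload"})
```

Keeping the agreement state in the control plane means the data plane implementation can differ per Data Space while relying on the same authorisation check.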

2.6. High-Level Architecture

This section elaborates on the High-Level Architecture of Simpl-Open. It presents the capabilities of Simpl-Open and the building blocks that support these capabilities. It is important to remark that the high level architecture lays out the capabilities of Simpl-Open as a whole. How these capabilities are realised is then described in the following sections of the document.

The concepts described in this section have been, for a large part, already developed in the Architecture Vision Document of the Simpl Preparatory Study. They are taken over in this document and updated/complemented where needed to stay up-to-date with the current developments of Simpl-Open.

Six architectural dimensions describe Simpl-Open: the integration dimension, the data dimension, the infrastructure dimension, the administration dimension, the governance dimension and the security dimension.

The integration dimension contains the capabilities that enable participants to integrate with each other in a secured and trusted manner. This is required for the well-functioning of a Data Space integrating Simpl-Open. These capabilities regard security, access control and trust and federation management.
The data dimension focuses on semantic interoperability, data models, data quality and governance of data. It ensures that data can be understood, processed, and exchanged consistently across participants through standardized vocabularies and quality management.
The infrastructure dimension allows end users to utilise and manage infrastructure resources offered by infrastructure providers. Simpl-Open can connect to third-party infrastructure resources, enabling end users to execute applications and manage workloads.
The administration dimension provides supporting capabilities for the well-functioning of the other dimensions as well as for the administration of Simpl-Open. It allows actors to operate their components in the Data Space.
The governance dimension establishes and enforces policies, manages risks, oversees compliance and provides audit and assurance for the entire ecosystem. It supports the implementation of legal and organisational interoperability through policy management, contract management, participant lifecycle, and audit capabilities.
The security dimension ensures that all interactions and data exchanges across the Simpl-Open ecosystem are confidential, authentic, and tamper-resistant. It focuses on safeguarding communications, assets, and operations through technical and procedural resilience measures.

Each of these six dimensions is further detailed in the following sub-sections.

2.7. Capabilities (Level 1)

The following figure presents the Level 1 capability map of Simpl-Open, where capabilities are applied to the dimensions:

In the Administration dimension, the Observability capability monitors system health, usage, and performance across the data space, providing insights and dashboards for operational oversight. The Support capability provides operational assistance to participants and end users through service desk services, ticketing systems, and status pages. It enables troubleshooting, issue tracking, and knowledge sharing to ensure smooth installation, configuration, and ongoing use of Simpl-Open components. The Notification and messaging capability provides asynchronous, event-driven notifications to users and administrators for key workflows such as onboarding requests and governance actions.

In the Data dimension, the Data governance capability ensures that data sharing adheres to defined quality, metadata, and governance standards. It provides services such as data lineage, data profiling and data quality rules. The Data processing capability provides the means to transform, aggregate and visualise datasets across multiple sources. The Supporting data services capability provides the foundational data services that enable efficient, scalable, and reliable management of data operations across the ecosystem, including orchestration and distributed execution. The Semantics & Vocabulary capability ensures semantic interoperability across the Data Space by providing standardized vocabularies, ontologies, and schema management. It enables participants to understand and interpret shared data consistently through formal knowledge representation and mapping services.

In the Integration dimension, the Data sharing capability allows participants to exchange data with others through interoperable interfaces, while the Application sharing capability allows participants to make applications and services available to others through interoperable interfaces, as well as to provide algorithms and models for AI-based processing. The Federated Management capability manages identity federation, catalogue federation, trust anchoring, and cross-domain access across multiple data spaces. The Resource discovery capability supports consumers in finding available resources securely and efficiently through catalogues. The Policy enforcement capability enforces access and usage policies at the runtime integration points where policy decisions are applied. The Contract enforcement capability validates and enforces contractual terms at integration points, and connects to policy management and billing. The Supporting Integration Services capability maintains persistent resource addresses across federated environments and integration endpoints. The Resource sharing capability embeds all services related to generic resource sharing, specifically focused on the implementation of the connector protocol.

In the Infrastructure dimension, the Provisioning capability handles allocation, lifecycle, and orchestration of infrastructure resources required by participants and data services. The Supporting infrastructure services capability provides underlying infrastructure-level services such as distribution and the management of distributed resources. The HPC capability enables the execution of high-performance computing workloads where demanding analytical or AI-driven computations are needed, leveraging shared or external infrastructure resources.

In the Governance dimension, the Consent management capability ensures that data subjects’ consent preferences are properly captured, managed, and respected throughout data processing activities. The Contract management capability governs the lifecycle of contractual agreements between participants, ensuring that terms and obligations are traceable and enforceable. The Policy management capability enables the definition, distribution and lifecycle management of access and usage policies across the Data Space. The Audit capability provides transparency and verifiable evidence of compliance, supporting accountability and continuous assurance. The Participant management capability handles onboarding, identity validation, and lifecycle management, including offboarding, of all participants in the ecosystem.

In the Security dimension, the Credential management capability covers the implementation of digital signatures to guarantee data confidentiality, integrity, and authenticity, along with the storage of these credentials and signatures in the digital wallet. The CSIRT capability provides coordinated incident detection, response, and resolution services. It ensures operational readiness against threats, manages vulnerability disclosures, and leads recovery activities in case of security incidents. The Access control and trust capability enables secure and trusted collaboration between participants within the Data Space. It ensures that only authenticated and authorised entities can access shared data, services, and applications, while maintaining interoperability across different trust domains.

2.8. Services (Level 2)

The following figure presents the Level 2 capability map of Simpl-Open, where business services are applied to capabilities:

2.7.1. Administration Dimension

The Observability capability has the following services: Resource usage, QoS metrics and alerts, Exporting, Dashboarding, Logging, Performance monitoring, Energy metrics and alerts, and Reporting:

The Resource usage service: Provides visibility into consumption of compute, storage, and network resources to support capacity planning and chargeback.
The QoS metrics and alerts service: Tracks SLOs and emits alerts on threshold breaches to enable timely operational responses.
The Exporting service: Enables scheduled or ad‑hoc export of metrics and logs to external observability or compliance systems.
The Dashboarding service: Offers configurable dashboards for real‑time and historical operational insights across tenants and domains.
The Logging service: Centralizes, indexes, and retains logs with correlation to traces and metrics for efficient troubleshooting.
The Performance monitoring service: Measures latency, throughput, and error rates to detect regressions and bottlenecks early.
The Energy metrics and alerts service: Captures energy usage KPIs and triggers notifications to help meet sustainability targets.
The Reporting service: Generates scheduled and on-demand reports aggregating operational data, compliance evidence, and business metrics. Supports customizable report templates, multi-format exports (PDF, CSV, JSON), and role-based access to reporting views. Enables stakeholders to track resource consumption, policy adherence, SLA compliance, and data space activity over time.

The Support capability has the following services: Service desk, Support page, Ticketing system:

The Service desk service: Provides first‑line assistance, triage, and knowledge‑base guidance for participants and operators.
The Support page service: Publishes status, FAQs, runbooks, and contact channels to streamline self‑service support.
The Ticketing system service: Orchestrates issue lifecycle with SLAs, prioritization, and handoffs across resolver groups.

The Notification and messaging capability has the following services:

The Notification service: Enables timely, reliable communication of critical business events to participants across federated data spaces.

2.7.2. Data Dimension

The Data governance capability has the following services: Data lineage, Data profiling, Data quality rules.

The Data lineage service: Records end‑to‑end provenance to enable impact analysis, compliance evidence, and reproducibility.
The Data profiling service: Analyzes datasets for structure, distributions, and anomalies to inform governance decisions.
The Data quality rules service: Defines and evaluates quality checks with reporting and remediation workflows.
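As an illustration of how quality rules might be defined and evaluated, the following Python sketch runs named checks over a set of records and reports the violating rows. The rule names, the `temperature` field and its plausible range are hypothetical examples, not rules defined by Simpl-Open.

```python
def evaluate_rules(rows, rules):
    """Return, per rule name, the indexes of the rows that violate the rule."""
    report = {}
    for name, check in rules.items():
        report[name] = [i for i, row in enumerate(rows) if not check(row)]
    return report


# Hypothetical rules: completeness and plausibility of a temperature reading.
rules = {
    "temperature_present": lambda r: r.get("temperature") is not None,
    "temperature_in_range": lambda r: r.get("temperature") is None
    or -50 <= r["temperature"] <= 60,
}

rows = [{"temperature": 21.5}, {"temperature": None}, {"temperature": 999}]
report = evaluate_rules(rows, rules)
```

The resulting report could then feed the reporting and remediation workflows mentioned above; those workflows are outside the scope of this sketch.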

The Data processing capability has the following services: Data analytics, Data visualisation, Anonymisation and pseudonymisation.

The Data analytics service: Provides batch and interactive analytics for descriptive, diagnostic, and predictive insights.
The Data visualisation service: Delivers charts and exploratory views to communicate insights and monitor KPIs.
The Anonymisation and pseudonymisation service: Applies masking, pseudonymisation, and differential privacy patterns to protect personal data.
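Two of the patterns named above, pseudonymisation and masking, can be sketched with the Python standard library. The field names, the demo key and the truncation to 16 hexadecimal characters are assumptions made for illustration; differential privacy is not covered by this sketch.

```python
import hashlib
import hmac


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash that is stable for a given key."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def mask_email(email: str) -> str:
    """Mask the local part of an e-mail address, keeping the domain."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


# Hypothetical record and key, for illustration only.
record = {"patient_id": "P-1001", "email": "jane.doe@example.org", "age": 42}
key = b"demo-only-key"
safe = {
    "patient_id": pseudonymise(record["patient_id"], key),
    "email": mask_email(record["email"]),
    "age": record["age"],
}
```

Using a keyed hash rather than a plain hash means that the same identifier maps to the same pseudonym within one key domain, while a party without the key cannot recompute the mapping from a list of known identifiers.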

The Supporting data services capability has the following services: Data orchestration, Distributed execution.

The Data orchestration service: Coordinates multi‑step pipelines with dependencies, retries, and policy‑aware scheduling.
The Distributed execution service: Runs data jobs elastically across clusters with placement, scaling, and fault tolerance.
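The orchestration of a multi-step pipeline with dependencies and retries can be sketched with the standard library's `graphlib` module (Python 3.9 or later). The step names and the retry policy are hypothetical, and policy-aware scheduling is not modelled here.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps, dependencies, max_retries=2):
    """Run steps in dependency order, retrying a failing step.

    steps: mapping of step name -> zero-argument callable
    dependencies: mapping of step name -> set of steps it depends on
    """
    executed = []
    for name in TopologicalSorter(dependencies).static_order():
        for attempt in range(max_retries + 1):
            try:
                steps[name]()
                executed.append(name)
                break
            except RuntimeError:
                # Give up once the retry budget is exhausted.
                if attempt == max_retries:
                    raise
    return executed


# Hypothetical three-step pipeline: extract -> transform -> load.
dependencies = {"transform": {"extract"}, "load": {"transform"}}
```

A step is appended to `executed` only after it succeeds, so a pipeline that exhausts its retries stops before any dependent step runs.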

The Semantics & Vocabulary capability has the following services: Semantic mapping, Vocabulary hub (including ontology management), Schema management.

The Semantic mapping service: Discovers and documents schema/ontology mappings across domains. Supports semantic interoperability and cross-domain discovery.
The Vocabulary hub service: Manages, versions, and publishes controlled vocabularies (SKOS, DCAT, schema.org, domain-specific ontologies). Enables cross-domain data understanding. Authors, aligns, and publishes ontologies (OWL, RDF) for domain modelling (e.g., manufacturing, healthcare). Enables semantic queries and reasoning.
The Schema management service: Stores, versions, and governs data schemas (JSON Schema, Avro, Parquet metadata) linked to vocabularies. Supports the Metadata description service (Governance) in enforcing DCAT-AP compliance.

2.7.3. Integration dimension

The Data sharing capability has the following services: Bulk data transfer, Data streaming, Simple data transfer.

The Bulk data transfer service: Moves large datasets reliably with checkpointing, integrity checks, and resume support.
The Data streaming service: Publishes and subscribes to real‑time event flows with ordering, retention, and replay.
The Simple data transfer service: Provides lightweight pull or push exchanges for small files and APIs.
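The checkpointing, resume and integrity-check behaviour described for bulk transfers can be sketched as follows. In-memory byte buffers stand in for the source and the destination, and the `fail_after` parameter exists only to simulate an interruption; all names are hypothetical.

```python
import hashlib
from typing import Optional


def transfer_chunks(source: bytes, sink: bytearray, chunk_size: int = 4,
                    fail_after: Optional[int] = None) -> int:
    """Copy `source` into `sink`, resuming from the bytes already in `sink`.

    The length of `sink` acts as the checkpoint. `fail_after` simulates an
    interruption after that many chunks (illustration only).
    """
    sent = 0
    offset = len(sink)  # resume from the last checkpoint
    while offset < len(source):
        if fail_after is not None and sent >= fail_after:
            raise ConnectionError("simulated interruption")
        chunk = source[offset:offset + chunk_size]
        sink.extend(chunk)  # the checkpoint advances with each chunk
        offset += len(chunk)
        sent += 1
    return sent


def verify(source: bytes, sink: bytearray) -> bool:
    """Integrity check: compare the SHA-256 digests of both sides."""
    return hashlib.sha256(source).digest() == hashlib.sha256(bytes(sink)).digest()
```

After an interruption, calling `transfer_chunks` again with the same `sink` sends only the missing chunks, and `verify` confirms that the completed copy matches the source.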

The Application sharing capability has the following services: Calculation algorithm, Machine Learning Model, Software apps (Rendering engine).

The Calculation algorithm service: Exposes deterministic computational functions for remote execution.
The Machine Learning Model service: Serves trained models with versioning, inference endpoints, and monitoring.
The Software apps (Rendering engine) service: Hosts interactive applications and engines for domain‑specific processing and visualization.

The Supporting Integration Services capability has the following service: Resource address management.

The Resource address management service: Manages the resolution and lifecycle of resource identifiers across the Data Space. Provides persistent addressing schemes that enable resources to be uniquely identified and located regardless of their physical location or deployment changes. Integrates with the Resource Catalogue service to maintain synchronized resource metadata and enable consumers to locate resources through both catalogue searches and direct URI-based addressing.
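The principle of a persistent identifier that survives relocation can be sketched with a small resolver. The class name, the identifier and the URLs below are hypothetical; the sketch does not describe how Simpl-Open stores or synchronises addresses with the catalogue.

```python
class AddressRegistry:
    """Hypothetical resolver: persistent identifier -> current location."""

    def __init__(self):
        self._locations = {}

    def register(self, resource_id: str, location: str) -> None:
        self._locations[resource_id] = location

    def relocate(self, resource_id: str, new_location: str) -> None:
        """Update the location while keeping the identifier unchanged."""
        if resource_id not in self._locations:
            raise KeyError(resource_id)
        self._locations[resource_id] = new_location

    def resolve(self, resource_id: str) -> str:
        return self._locations[resource_id]


registry = AddressRegistry()
registry.register("urn:example:dataset-42", "https://node-a.example/data/42")
registry.relocate("urn:example:dataset-42", "https://node-b.example/data/42")
current_location = registry.resolve("urn:example:dataset-42")
```

Consumers keep using the same identifier before and after the move; only the registry entry changes.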

The Federated Management capability has the following services: Federation orchestration.

The Federation orchestration service: Coordinates cross-domain identity federation, trust framework establishment, and catalogue synchronization across multiple autonomous data spaces. Manages trust anchors, maintains federation metadata, and orchestrates authentication and authorization flows that span organizational boundaries. Enables seamless interoperability while preserving sovereignty of individual data space instances.

The Resource discovery capability has the following services: Resource catalogue, Search engine.

The Resource catalogue service: Publishes registries of datasets, services, and apps with federation support.
The Search engine service: Indexes and queries resources with fine‑grained policy‑aware filtering.

The Policy Enforcement capability has the following services: Policy Enforcement Point service.

The Policy Enforcement Point service: Enforces access and usage policies at integration interfaces (API gateways, connectors, catalogues, data/application sharing endpoints).

The Contract enforcement capability has the following services: Contract Enforcement service.

The Contract enforcement service: Applies and monitors contractual terms programmatically across interactions.

The Resource sharing capability has the following services: Resource sharing runtime.

The Resource sharing runtime service: Embeds all services related to implementing the Data Space Protocol (DSP).

2.7.4. Infrastructure Dimension

The Provisioning capability has the following services: Infrastructure provisioning.

The Infrastructure provisioning service: Allocates and configures compute, storage, and network resources, and manages their lifecycle.

The Supporting infrastructure services capability has the following services: Infrastructure orchestration, Distributed management.

The Infrastructure orchestration service: Automates deployment and day‑2 operations via declarative control and runbooks.
The Distributed management service: Manages multi‑site topologies, synchronization, and drift remediation.

The HPC capability has the following services: HPC.

The HPC service: Provides access to high‑performance compute resources for large‑scale simulations and AI workloads.

2.7.5. Governance Dimension

The Consent management capability has the following services: Consent management service.

The Consent management service: Captures, stores, and enforces data subjects’ consent preferences in accordance with GDPR and privacy regulations. Maintains versioned consent records linked to specific data processing activities, enables consent revocation workflows, and provides audit trails. Integrates with policy management to ensure consent terms are enforced across data sharing operations.

The Contract management capability has the following services: Billing, SLA Management, License asset, Contract establishment.

The Billing service: Calculates and issues invoices based on usage, entitlements, or fixed agreements.
The SLA Management service: Tracks service commitments and penalties with evidence and notifications.
The License asset service: Manages software and content licenses, entitlements, and renewals.
The Contract establishment service: Establishes (and invalidates) contract agreements.

The Policy management capability has the following services: Policy Decision Point, Policy Administration Point, Policy Information Point.

The Policy Decision Point (PDP) service: Evaluates policies against attributes, contracts, and consent to render decisions (grant/deny/obligation). Takes attributes from Policy Information Point (PIP) adapters and usage context from the Policy Enforcement Point (PEP).
The Policy Administration Point (PAP) service: Authors, approves, versions, and distributes policies and contract-linked obligations. Manages the policy lifecycle (draft → approved → active → deprecated).
The Policy Information Point (PIP) service: Adapts attributes from external sources (participant registry, consent store, contract service, catalogue) to feed PDP decisions. Bridges organizational context to policy evaluation.
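The division of labour between the three policy services can be sketched as follows. The data structures, the function names and the example rule are hypothetical, and the sketch covers only grant and deny decisions, not obligations.

```python
from dataclasses import dataclass
from typing import Callable

# Policy lifecycle transitions, following the states listed above.
LIFECYCLE = {"draft": "approved", "approved": "active", "active": "deprecated"}


@dataclass
class Policy:
    policy_id: str
    rule: Callable[[dict], bool]  # returns True when the request complies
    state: str = "draft"

    def advance(self) -> None:
        """Administration point: move the policy to the next lifecycle state."""
        if self.state not in LIFECYCLE:
            raise ValueError(f"policy is already {self.state}")
        self.state = LIFECYCLE[self.state]


def pip_collect(participant_id: str, registry: dict, consents: dict) -> dict:
    """Information point: gather attributes from (hypothetical) external sources."""
    return {
        "attributes": registry.get(participant_id, set()),
        "consent": consents.get(participant_id, False),
    }


def pdp_decide(policies: list, context: dict) -> str:
    """Decision point: grant only if every active policy is satisfied."""
    active = [p for p in policies if p.state == "active"]
    if not active:
        return "deny"  # deny by default when no policy is in force
    return "grant" if all(p.rule(context) for p in active) else "deny"


# Hypothetical policy: research use requires the attribute and consent.
policy = Policy(
    "research-only",
    lambda ctx: "RESEARCHER" in ctx["attributes"] and ctx["consent"],
)
```

Only policies in the `active` state take part in a decision, so a draft or deprecated policy never grants access in this sketch.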

The Audit capability has the following services: Audit.

The Audit service: Collects immutable evidence, signatures, and trails to support compliance and assurance.

The Resource management capability has the following services: Metadata description.

The Metadata description service: Maintains DCAT‑AP compliant descriptors to enable interoperability.

The Participant management capability has the following services: Onboarding, User roles, Offboarding.

The Onboarding service: Validates identities, performs due diligence, and provisions initial access.
The User roles service: Defines and assigns roles and responsibilities with least‑privilege defaults.
The Offboarding service: Revokes access, archives evidence, and ensures controlled exit procedures.

2.7.6. Security Dimension

The CSIRT capability has the following services: Incident response, Threat monitoring.

The Incident response service: Coordinates detection, containment, eradication, and recovery with post‑incident review.
The Threat monitoring service: Continuously monitors for indicators of compromise and emerging vulnerabilities.

The Access control and trust capability has the following services: Identity provider, Authentication provider federation, Authorisation, Security attribute provider federation, Encryption, Guaranteed Authenticity / Integrity.

The Identity provider service: Issues and manages identities with lifecycle hooks for onboarding and offboarding.
The Authentication provider federation service: Federates external IdPs to enable single sign‑on across domains.
The Authorisation service: Enforces fine‑grained, policy‑based access decisions for data, services, and apps.
The Security attribute provider federation service: Aggregates and validates assurance attributes to support trust decisions.
The Encryption service: Protects data in transit and at rest using modern, configurable cryptography.
The Guaranteed Authenticity / Integrity service: Uses signatures and checksums to ensure tamper detection and provenance.

The Credential Management capability has the following services: Wallet, VC issuance/verification, Signing.

The Wallet service: Manages the storage, signing, and lifecycle of resource descriptions, verifiable credentials, and usage contracts for dataspace participants.
The VC issuance/verification service: Issues, verifies, and manages W3C-compliant verifiable credentials for organization identity, user attributes, and data provenance. Includes VC issuance, verification, and a trust anchor registry.
The Signing service: Generates digital signatures using cryptographic keys and verifies signatures to confirm authenticity and integrity. Ensures non-repudiation (the signer cannot later deny having signed something) and provides tamper detection (any change to signed data invalidates the signature).

2.8.1. Deployment model

Simpl-Open follows a loosely coupled, self-contained architecture which groups components into building blocks, capability by capability. This approach permits the deployment of the Simpl-Open agent in different flavours depending on the type of participant, e.g. an Infrastructure Provider requires a different subset of the full Simpl-Open stack than a Data Provider. This modular architecture within a Data Space is presented in the following figure:

2.8.2. Scope covered by the Release 3.0

The figure below depicts the capabilities that will be (partially) implemented as part of Release 3.0 (December 2025):

The current version of this document covers the architecture of Release 3.0 only. As such, the following sections focus only on components implementing the capabilities mentioned above as being in scope of Release 3.0.

Placeholders have also been added for content that will be made available after the Release 3.0, with clear disclaimers at the beginning of the respective sections.

2.9. Architecture Framework

2.9.1. Architecture Approach

The architecture of Simpl-Open is created using a layered approach, inspired by the TOGAF methodology, which is reflected in the structure of this document:

Business Architecture - describes how Simpl-Open should achieve its business goals and respond to the strategic drivers set out in the Architecture Vision. This layer was already defined in the preparatory study and this document only provides an update on the functional capabilities (which have evolved since then) and revisited concepts of business processes.
Application Architecture - develops the target application architecture of Simpl-Open that enables the business architecture and the architecture vision, in a way that addresses the requirements. It identifies architecture components through Solution Views (business process-based approach, both static and dynamic) and Deployment Views (agent type-based approach, static only).
Data Architecture - presents data entities and/or collections and how they are structured within the system.
Technology Architecture - develops the target technology architecture that enables the application architecture to be delivered through technology components and technology services. Each application building block is mapped to a technology implementing the capabilities. Just like for the application architecture, both Solution and Deployment Views are defined.
Security Architecture - covers the security aspects of the architecture.

The list of architecture principles and patterns to which Simpl-Open adheres is presented in the next section.

The sources of the ArchiMate diagrams presented in this document are available as an Archi model, versioned in the Simpl-Open repository.

More information on how the model can be accessed is available here.

2.9.2. Architecture Principles

Simpl-Open is designed upon ten architectural principles. Each of these principles is applied throughout Simpl-Open’s design. They are all equally important to the design. The following figure provides an overview of these principles:

Federation : Federated systems describe autonomous entities, tied together by a specified set of standards, frameworks, and legal rules. Simpl-Open should federate data, infrastructure and applications. This principle is key to enable interoperability and information sharing among the different entities that will be part of Simpl, while giving maximum autonomy to service owners.
Modularity : The architecture of Simpl-Open needs to be defined in a modular way which allows the replacement or addition of components without affecting the rest of the system. This also provides the possibility to implement every component with a different open-source technology. Through modularity, Simpl-Open users are able to deploy a specific subset of components that are tailored for their purposes.
Loose coupling: Components and services should have minimal dependencies on each other. Standardised, business-oriented APIs make sure consumers are not impacted by changes to services. This allows service owners to change implementation, switch out components, or modify data records behind the APIs without downstream impact to end users. This principle ties in with the modularity and resilience principles.
Resilience: Components of the architecture must be fault tolerant, such that failures in one of them will have minimal impact on other components. Single points of failure need to be avoided to the maximum extent possible as the main objective is achieving a distributed architecture.
Openness & agnosticism : The open specification allows insights into all parts of the architecture without any proprietary claims. It makes adding, updating or changing components easy for all users. Services should be provided irrespective of specific technologies and should be executable in all environments.
Composability & extensibility : Simpl-Open’s architecture should allow services to deliver value to the business in different contexts, providing the necessary tools to facilitate their composition together with other services to form new aggregated services. Simpl-Open remains open to iterative growth allowing the addition of new services and capabilities that fit future use cases to the platform. An open development community should be promoted in order to enable the contribution of new features that extend Simpl-Open’s functionalities by its members.
Interoperability : Simpl-Open enables interoperability between its participants to share resources in a well-specified manner. The architecture should describe the technical means to achieve this and be agnostic to the specific implementation details of each participant.
Scalability & elasticity : Simpl-Open provides the means to accommodate larger workloads and allow new entities and users on the platform without affecting the performance. Both vertical scaling – i.e. the practice of adding more resources to a single node – and horizontal scaling – i.e. the process of duplicating nodes – should be possible. Simpl-Open’s performance should be able to follow user demand without deteriorating.
Security, privacy & trust : Users of Simpl-Open must be confident that when they interact with other entities they are doing so in a secure and trustworthy environment and in full compliance with relevant regulations. Data confidentiality, availability and integrity must be guaranteed. Privacy of data subjects, Simpl-Open users, or individuals must be assured.
Discoverability : All services that are deployed in Simpl-Open will be ‘publicly’ exposed and discoverable in a service registry or catalogue. In this context, ‘public’ is seen as visible by all approved participants of a Data Space, not the public internet. Services will adhere to a service description, providing interested parties with a clear understanding of their business purpose and technical interface.

These architecture principles are complemented by coding principles, which can be found in the Development Handbook.

2.10. Architecture Patterns

Simpl-Open is designed using a combination of the following architecture patterns:

Pattern	Short Description
Microservices architecture	Breaks down applications into small, independently deployable services. Each service focuses on a specific business capability and communicates via APIs, enhancing flexibility and scalability.
Event-driven architecture	Components communicate by producing and reacting to events, promoting loose coupling and scalability. This pattern enhances responsiveness and supports real-time processing.
Asynchronous communication	Allows components to interact without waiting for immediate responses. Improves system performance, decoupling, and scalability by enabling non-blocking interactions, typically using queues or background workers.
Stateless design	Ensures each request is independent and self-contained, avoiding reliance on stored session data. Improves scalability, simplifies fault tolerance, and supports load balancing.
Least privilege	Grants each user or service only the minimum level of access required to perform its tasks. This limits potential damage in the event of a breach and reduces the attack surface.
Defence in depth	Applies multiple layers of security controls throughout the system. If one layer is compromised, others still provide protection, improving the overall resilience of the system.
Zero trust	Assumes no implicit trust in users or systems, whether inside or outside the network. Continuously verifies identities and enforces strict access controls before granting access to resources.
Retries	Automatically re-attempts failed operations after a delay, particularly useful in cases of transient failures. Helps improve reliability without requiring manual intervention.
Circuit breakers	Monitors service calls and halts repeated failures by stopping requests to underperforming services. This prevents cascading failures and gives systems time to recover.
Graceful degradation	Ensures that the system continues to provide limited or reduced functionality when some components fail. Improves user experience and system robustness under failure conditions.
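
The Retries and Circuit breakers patterns above can be illustrated with a minimal sketch. It is not taken from the Simpl-Open code base: the names `CircuitBreaker`, `CircuitOpenError` and `retry`, as well as the default thresholds and delays, are assumptions chosen for illustration only.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    """Stops calling a failing operation until a cool-down period has elapsed."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, operation: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open, call rejected")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single further failure re-opens the circuit.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def retry(operation: Callable[[], T], attempts: int = 3, base_delay: float = 0.1,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Re-attempt a failing operation with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except CircuitOpenError:
            raise  # retrying against an open circuit would defeat its purpose
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The two can be combined, for example `retry(lambda: breaker.call(fetch))`, where `fetch` is a hypothetical remote call: transient failures are retried, while a persistently failing service is cut off until the cool-down has elapsed.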

2.11. Assumptions and Architecture Decisions

This section is based on currently available information and applies to Release 3.0 (December 2025) only.

2.11.1. Assumptions

ID	Topic	Assumptions
ASM-01	Data Space data management (downloading data vs using it)	The infrastructure provider is a mandatory intermediary to enable security of data processing. It is the responsibility of the infrastructure provider to set up access control on the provisioned infrastructure tenant for the consumer. It is the responsibility of the data provider to set up policy enforcement measures (e.g. restricting download) on the infrastructure tenant for the consumer. If data from the data provider is downloaded directly by the consumer (without an infrastructure provider involved), the "technical enforcement" is replaced by a "legal enforcement".
ASM-02	Possible data sharing scenarios	The following scenarios to share data exist: (1) Simple Data Download: the Data Provider is willing to offer the consumer the possibility of downloading the dataset. The contract will create some legally binding usage policies. → supported by Simpl-Open. (2) The data/app provider offers one, or a bundle of, infrastructure instances that host both the data and the application that can process the data. The consumer still has the possibility of downloading the data, but the contract may prevent it or put limitations/usage policies on it. → currently not in scope (but could be implemented in the future). (3) The data/app provider offers a bundle of infrastructure instances that host the data and the application that processes the data separately. Access will be provided only to the application (or the infrastructure instance that hosts the application). The consumer cannot access the data, and as part of the contract they shouldn't even try. → supported by Simpl-Open. (4) Compute to Data, or loading the data in confidential memory enclaves (such as Intel SGX): an advanced version of Scenario 3, with more technical complexities. → currently not in scope (but could be implemented in the future).
ASM-03	Actors with multiple participant roles	One agent per participant role (i.e. multiple agents required if a participant plays multiple roles in the Data Space). One standard deployment script per type of participant will be provided.
ASM-04	Distinction between Certificate/Credentials	There is a clear distinction between credentials for securing the Data Space (Tier 1 and 2 IAA) and the credentials for signing SDs and contracts (legally binding signature).
ASM-05	Data sharing connector	A connector agnostic "Asset Manager" will be developed, which can access different storage types and handle the data transfer. Currently, existing plugins of the EDC connector are used (such as the S3 object storage extension, that can handle access management and data transfer, in case of contracting). The asset manager will be a module of the agents. The combination of the connector (any) and the asset manager will be a part of the agents, for example the consumer and the provider agent.
ASM-06	Contract signature	Currently, only a simple signature is used (not a legally valid one).
ASM-07	Usage of a Data Space connector	Any communication/transfer between agents will be done via Data Space connectors. They are responsible for implementing the 3 aspects of the Data Space Protocol (DSP): registering and requesting service offerings in/from the catalogue; negotiation of a contract; enabling consumption of service offerings.
ASM-08	Storage attached to VMs and containers	It is assumed that VMs and containers always have an attached storage.
ASM-09	Type of storage supported	It is assumed that Simpl-Open natively supports only S3-compliant storage but is extensible to support other storage types (offering an API).
ASM-10	Deployment and termination of built-in applications	It is assumed that the application is always deployed and terminated together with the infrastructure resource as part of the deployment script.
ASM-11	Type of built-in application deployment supported	It is assumed that Simpl-Open natively supports only applications deployed on Kubernetes but is extensible to support other platforms (offering an API).
ASM-12	Supported infrastructure resources	It is assumed that Simpl-Open natively supports only S3-compliant storage, the Kubernetes container platform and VMware virtual machines, but is extensible to support other platforms (offering an API).
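
Assumptions ASM-09, ASM-11 and ASM-12 describe a fixed set of natively supported resource types that remains extensible through an API. One possible way to picture this is a registry of provisioners keyed by resource type. The sketch below is purely illustrative: `ProvisionerRegistry`, the resource type keys and the returned identifiers are hypothetical and do not describe the actual Simpl-Open implementation.

```python
from typing import Callable, Dict, List

# A provisioner takes a resource specification and returns an identifier.
Provisioner = Callable[[dict], str]


class ProvisionerRegistry:
    """Maps resource types to provisioners; new types are added by registration."""

    def __init__(self) -> None:
        self._provisioners: Dict[str, Provisioner] = {}

    def register(self, resource_type: str, provisioner: Provisioner) -> None:
        self._provisioners[resource_type] = provisioner

    def provision(self, resource_type: str, spec: dict) -> str:
        try:
            provisioner = self._provisioners[resource_type]
        except KeyError:
            raise ValueError(f"unsupported resource type: {resource_type}") from None
        return provisioner(spec)

    def supported(self) -> List[str]:
        return sorted(self._provisioners)


# Hypothetical stand-ins for the three natively supported resource types (ASM-12).
registry = ProvisionerRegistry()
registry.register("s3-storage", lambda spec: f"bucket/{spec['name']}")
registry.register("kubernetes", lambda spec: f"namespace/{spec['name']}")
registry.register("vmware-vm", lambda spec: f"vm/{spec['name']}")
```

Supporting an additional platform then amounts to one more `register` call, without changing the code that requests resources.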

2.11.2. Architecture Decisions Record (ADR)

ID	Title	Context	Decision	Consequence	Date	Decision Maker
ADR-01	API Guidelines	One of the base principles of Simpl is interoperability, and in this respect, REST API guidelines should be established.	The decision is to use: (1) the BelgIF API Guidelines, applied pragmatically by addressing only the most important guidelines, which are documented in the APIs section of this document. These guidelines are compliant with the European Interoperability Framework (EIF), which promotes Interoperability (one of the core Simpl-Open principles); (2) the OWASP Security cheat sheet and the OWASP API Security project. These additional guidelines promote Security (another of the core Simpl-Open principles). OpenAPI definitions are stored in GitLab in YAML format.	The guidelines should be implemented for each custom-built component in Simpl-Open.	29 Nov 2024	DG Connect
ADR-02	PostgreSQL Deployment Model	The different patterns to persist data in the Simpl-Open microservices architecture are: (1) a (database) cluster per service; (2) a database per service (co-hosted on the same database cluster); (3) a shared database (with schema or table per service). Option 3 has not been further analysed as it creates tight coupling between services and goes against the architecture principles of Simpl-Open.	The decision is to select option 2 (a database per service) as the default option for the development of Simpl-Open. Note: this decision does not prevent the final data space user from changing to the deployment model that best fits its interests.	Pros: Fault tolerance at node level (loss of a pod is compensated by its replicas); Logical data segregation between services; Less operational complexity; More optimal resource allocation; Less expensive; Provides loose coupling; Provides modularity; Allows horizontal scalability. Cons: No fault tolerance at cluster level (loss of the entire StatefulSet impacts all services); No physical data segregation between services.	23 Jan 2025	DG Connect
ADR-03	Notification service	In certain cases, notifications need to be sent to end-users. E.g. when an onboarding request is created, a notary of the GA should be notified. It is important to distinguish notifications from asynchronous call-backs. For notifications, the Simpl-Open custom-built components need a way to generate events. Assumption: in data spaces the likelihood of asynchronous processing is high due to the federated nature. The following options have been considered: (1) Independent and synchronous notification microservice: a new independent microservice is built and is triggered synchronously over a REST API. (2) Independent and asynchronous notification microservice: a new independent microservice is built and reacts asynchronously to messages posted on Kafka. (3) Notification Java library: a reusable Java library sends notifications and can be integrated into any application component that needs it. The library should encapsulate the logic for sending notifications and should be injectable as a dependency into other Java applications (same model as the one built for the Log4J wrapper).	The decision is to select option 2 (independent and asynchronous notification microservice).	Pros: Fault tolerant by default (message-driven); Operational complexity: centralised configuration and credentials management; Provides loose coupling; Provides modularity; Allows horizontal scalability. Cons: Operational complexity: an additional microservice to maintain; Interoperability: Kafka producer required to integrate (less standard).	23 Jan 2025	DG Connect
ADR-04	Distributed tracing	Operating a complex, distributed system like Simpl-Open, with its federated, modular, and loosely coupled architecture, requires deep visibility into how requests travel across its many independent components, providing engineers with a clear view of existing flows. This enhanced visibility enables teams to spot bottlenecks, pinpoint errors, and diagnose performance degradation much faster than would otherwise be possible using standard logs. Ultimately, effective tracing aims to provide a unified, 360-degree view of all interacting components, consolidated within a single dashboard, improving observability and troubleshooting efficiency across the entire system. The need for tracing comes from a real engineering pain point experienced by a team in the Simpl-Open project. Below are the options that were considered for the implementation of tracing in Simpl-Open. All presented options take advantage of the modular and resilient architecture of Simpl-Open, ensuring tracing scalability and interoperability: In-house implementation of a tracing library: Implement a library that would provide all necessary elements to effectively collect telemetry data from agents and send it to Kibana for visualisation. Integrate OpenTelemetry with Elastic through the Elastic Distributions of OpenTelemetry (EDOT) extension: Elastic integrates with OpenTelemetry, allowing existing instrumentation to be reused to easily send observability data to the Elastic Stack. Integrate OpenTelemetry with ELK through Kafka. Integrate OpenTelemetry with ELK through log collector: Use the OpenTelemetry agent together with the OpenTelemetry collector on edge machines to send traces to APM (an existing, free feature in the Elastic Stack). Traces will be displayed in a preconfigured dashboard in Kibana.	The decision is to select option 3 (Integrate OpenTelemetry with ELK through log collector).	Pros: Decoupling of components. Scalability. Ability to enrich logs at the collector level. Production ready. Minimal engineering effort if using instrumentation. Cons: Increased system complexity. More resource consumption on the edge machine.	17 Apr 2025	DG Connect
ADR-05	Access control inside Simpl-Open Agent	Tier 1 communication with Simpl-Open microservices is protected by an API gateway that validates JWT tokens at the edge. The assumption has been that internal communication cannot be accessed by an attacker, so no internal service-to-service traffic security is in place. However, this model does not align with the Zero Trust approach, which assumes that threats may exist within the internal network. To follow this Zero Trust approach, Simpl-Open needs to ensure that all service-to-service (east-west) communication is authenticated and authorized, preventing unauthorized access even within the internal network. Below are the options that were considered: (1) Microservices to validate OAuth2 JWT roles and scopes: The Tier 1 gateway forwards the user JWT token to agent microservices (user-facing APIs). Microservices must implement Role-Based Access Control (RBAC), introducing an additional layer of verification beyond the existing Tier 1 RBAC enforcement. Service-to-service communication must be enforced as follows: each microservice needing to communicate internally will obtain an OAuth2 JWT access token from Keycloak using the Client Credentials Grant. This token contains a list of the scopes that indicate the resource and the action the microservice can perform within the agent. The token is transmitted to the target microservice to authenticate the request and enforce access control based on the token’s claims. Scopes can follow a resource-action format (e.g., participant:create, identityattributes:read) to provide fine-grained Scope Based Access Control (SBAC). (2) Service mesh (Istio): Every pod in the Kubernetes (K8s) namespace will be paired with a sidecar container and is part of the mesh. Istio can be used to enforce mTLS for secure service-to-service communication. Role and scope based access control can be enforced using Istio. (3) Internal communication only through the Tier 1 gateway: User-facing APIs: the Tier 1 gateway must enforce RBAC policies and forward the request to the agent components. No access control must be implemented in the microservice. Service-to-service communication: each microservice needing to communicate internally will obtain an OAuth2 JWT access token from Keycloak using the Client Credentials Grant. This token contains a list of the scopes that indicate the resource and the action that the microservice is allowed to perform within the agent. The token is transmitted to the gateway. The gateway validates the request and enforces access control based on the token’s claims. If the access control and token validity evaluation are successful, the gateway forwards the request to the target microservice. No access control must be implemented in the microservice. (4) Mutual TLS (mTLS) for service-to-service authentication: Every component will be provided with a private key and a certificate, enabling it to authenticate and encrypt communications using mTLS for secure service-to-service interactions within the system. (5) Microservices to validate OAuth2 JWT roles and scopes with optional Istio service mesh: The Tier 1 gateway forwards the user JWT token to agent microservices (user-facing APIs). Microservices must implement Role-Based Access Control (RBAC), introducing an additional layer of verification beyond the existing Tier 1 RBAC enforcement. Service-to-service communication must be enforced as follows: each microservice needing to communicate internally will obtain an OAuth2 JWT access token from Keycloak using the Client Credentials Grant. This token contains a list of the scopes that indicate the resource and the action the microservice can perform within the agent. The token is transmitted to the target microservice to authenticate the request and enforce access control based on the token’s claims. Scopes can follow a resource-action format (e.g., participant:create, identityattributes:read) to provide fine-grained Scope Based Access Control. Optionally, every pod in the Kubernetes (K8s) namespace will be paired with a sidecar container and is part of the mesh. Istio can be used to enforce mTLS for secure service-to-service communication.	The decision is to select option 5 (Microservices to validate OAuth2 JWT roles and scopes with optional Istio service mesh).	Pros: Provides authorization capabilities via roles and scopes; well-supported standard, supported by web frameworks; granular authorization via OAuth2 scopes and roles; uses the existing Simpl authentication provider (Keycloak). Without Istio security mesh: cloud platform agnostic. With Istio security mesh: includes channel encryption (mTLS); can enforce additional security policies and manage traffic for open-source components outside our direct control. Cons: Increased operational complexity due to OAuth2 token management; need for well-defined scope and role management to prevent over-provisioning of access rights; requires implementation on both client and server side. Without Istio security mesh: cannot enforce security policies and manage traffic for open-source components outside our direct control (Option 2 or 3 to be considered). With Istio security mesh: not technology agnostic (works only on Kubernetes); increased technology portfolio; resource overhead.	15 May 2025	DG Connect
ADR-06	Healthchecks monitoring	To monitor the health of Simpl-Open components, a healthcheck mechanism should be implemented. Below are the options that were considered: (1) Direct HTTP Push to Log Store: Each Simpl-Open service independently sends its health status directly to a dedicated HTTP endpoint on the Elastic Stack (e.g., a Logstash HTTP input or an Elasticsearch Ingest Node). This is a "push" model where the service initiates the communication. Collected health data is persisted directly in Elasticsearch and visualised in Kibana. (2) Elastic Heartbeat to probe for services health: A Heartbeat component from Elastic checks, on a predefined schedule, the status of each registered service by sending an HTTP call. Collected health data is persisted directly in Elastic and visualised in Kibana. (3) Enhance Probing with Spring Boot Actuator: This option builds on the "pull" model from Option 2 by requiring services to use the Spring Boot Actuator library. The Actuator exposes a detailed, standardized health endpoint (e.g., /health). A probing tool like Elastic Heartbeat is then configured to poll this specific endpoint. This hybrid approach combines the active probing of Heartbeat with the rich, internal context provided by the Actuator. (4) Elastic Heartbeat to probe for services health and push it to Kafka: A Heartbeat component pulls the health status for a predefined list of services via HTTP and pushes the data to Kafka. A dedicated Logstash pipeline is configured to read data from a predefined Kafka topic and put it in Elastic for visualisation in Kibana.	The decision is to select option 3 (Enhance Probing with Spring Boot Actuator).	Pros: Deep, Standardized Health Information: Provides rich, out-of-the-box details on the status of the dependencies a Simpl-Open service needs to execute its business process. Actionable Business-Level Monitoring: Enables the creation of custom health checks tied to critical business processes, providing much more meaningful alerts than a simple uptime check. Minimal Implementation Effort: Adding the Actuator is a simple dependency addition in a Spring Boot application. The framework handles the rest (example below). Combines Best of Both Worlds: Gets the reliable active probing and Kibana integration from Heartbeat (Option 2) while solving its "Limited Scope" problem by providing deep application insights and readiness to execute Simpl-Open business logic. Fully Controllable: Actuator endpoints can be configured and exposed to match a service's dependencies. Predefined Implementation: Spring Boot offers a predefined set of actuators (Spring Boot Actuator: Production-ready Features), making it easy for each service to enable only the subset of actuators best suited to its use cases. Cons: Minor Service-Side Responsibility: Service owners are responsible for including the spring-boot-starter-actuator dependency and creating custom health indicators where necessary. Effort is not centralized. Security Requirement: Exposing detailed health information requires careful security configuration to ensure the endpoints are not publicly accessible. Each service must enable only the desired set of actuators.	10 Jul 2025	DG Connect
ADR-07	Terraform Provisioning	As OVH does not have an official Crossplane plugin, and the existing one does not support the provisioning of Virtual Machines, research was carried out into using Terraform instead.	It was confirmed that, using an OVH token, the Infrastructure Provisioner can deploy a Virtual Machine in a similar fashion to the IONOS flow. The changes required in the Infrastructure Provisioner concern the definition of the resource template and the ArgoCD part. The technology chosen to support the creation of Virtual Machines using the Terraform language is OpenTofu. As a fully open-source, community-driven alternative to Terraform, it ensures long-term flexibility, transparency, and independence from proprietary licensing changes, while maintaining full compatibility with existing Terraform configurations and supporting new infrastructure-as-code configurations.	The Architecture related to “BP08” will not change; only the Technology View is affected.	28 May 2025	DG Connect
ADR-08	Signer service eIDAS compliance	The OCM signer service provides a cryptographic function that binds the organization's identity to the claims within a Verifiable Credential (VC), making it trustworthy and verifiable in a digital environment. While the OCM performs a digital signature that provides integrity and links the VC to the issuing organization, this process, by default, does not meet the eIDAS requirements for achieving the legal equivalence of a handwritten signature (Qualified Electronic Signature, QES), such as stringent identity verification, certified hardware/software, and accredited certificate issuance. Some data spaces may require the legal equivalence of a handwritten signature (QES) for certain scenarios (e.g. establishing a contract). The EU Digital Signature Service (DSS) offers exactly that by facilitating the creation and validation of electronic signatures in line with eIDAS regulations. Unlike the general OCM signer service, which focuses on the cryptographic binding for VC integrity, DSS is specifically designed to help implement signing processes that adhere to the legal and technical requirements defined by eIDAS, including working with qualified certificates and signature creation devices when needed to achieve the highest levels of assurance and legal recognition within the European Union. Below are the options that were considered: (1) Standalone service; (2) Embedded library.	The decision is to select option 1 (Standalone service).	Pros: Centralized Access: Offers a single, accessible service for various internal systems and business processes requiring QES. Simplified Management: Centralizes monitoring, maintenance, updates, and auditing of the QES functionality. Clear Separation of Concerns: Keeps the specialized QES logic distinct from other application code, improving modularity. Considerations: Potential for network latency (though often negligible for signing operations). Requires dedicated deployment and operational management.	26 Jun 2025	DG Connect
ADR-09	Infrastructure Consumption Monitoring	Key architectural considerations: (1) The Infrastructure Consumption Monitoring component needs to monitor various types of metrics for different cloud resources, including but not limited to compute (VMs), storage (S3), etc. (2) An abstraction layer needs to exist to ensure different types of cloud providers can be integrated. A default implementation must exist for OVH. (3) A pull method must be used to extract infrastructure consumption data through APIs from a cloud provider on a configurable schedule (as per #1). (4) Information about the cloud resources to pull consumption data from comes, directly or indirectly, from infrastructure logs in ELK. (5) Infrastructure consumption data are transformed into a standard JSON format (same format for all cloud providers) and persisted in a dedicated index in ELK along with the infrastructure provider reference. (6) The Infrastructure Consumption Monitoring component accesses cloud provider APIs by using secrets (certificates, API keys, etc.) stored in ELK (Secrets keystore for secure settings \| Logstash Reference [8.17] \| Elastic) or a different, dedicated store. Other useful aspects: The cloud providers (e.g. OVH) expose dedicated endpoints to get consumption data. Example here: OVHcloud API. Consumption data is per service (aka cloud resource). Total consumption is often a sum of various sub-offerings for a service. For example, for cloud hosting it would be: instances + storage + snapshots. Consumption comes in the form of currency spent per unit. Below are the options that were considered: (1) Use an open-source collector agent (e.g. OpenTelemetry or The Complete Guide to the ELK Stack \| Logz.io) directly on the cloud resources. (2) Invocation of a hosted script/program through a Logstash plugin to load infrastructure consumption data as part of the Logstash processing pipeline. (3) A dedicated microservice which gets invoked for a specific cloud resource based on signals stored in a message queue.	The decision is to select option 3 (dedicated microservice).	Pros: Full control over the code and functionalities. Fully testable with emphasis on shift-left/fail-fast. Resiliency and high availability achieved by design. Abstraction layer for different cloud providers achieved at the interfaces layer. Secrets management for accessing APIs removed from ELK (separation of concerns). Adding support for new cloud providers is defined by a set of interfaces to implement. Aligned with the principle of separation of concerns. Follows the Competing Consumers cloud pattern for scalability and resiliency. Scheduled on demand (i.e. through ELK). Non-invasive, pull-based approach. Can be implemented as a sync or async solution (design choice depending on whether Kafka is needed). Less networking configuration needed: each agent only checks the consumption of the resources it uses. Cons: Needs to be fully coded, so it will take time.	10 Jul 2025	DG Connect
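
The scope-based check described in ADR-05 can be sketched as follows. This is a minimal illustration rather than the Simpl-Open implementation: it assumes the token's signature, issuer and expiry have already been verified by a JWT library, and that the scopes arrive in the standard space-delimited `scope` claim. The function names `granted_scopes`, `is_allowed` and `require_scope` are hypothetical.

```python
from typing import Set


def granted_scopes(claims: dict) -> Set[str]:
    """Return the scopes listed in the space-delimited 'scope' claim.

    Assumption: 'claims' comes from a token that has ALREADY been validated
    (signature, issuer, expiry). This sketch performs no such validation.
    """
    return set(str(claims.get("scope", "")).split())


def is_allowed(claims: dict, resource: str, action: str) -> bool:
    """Check a resource-action scope such as 'participant:create'."""
    return f"{resource}:{action}" in granted_scopes(claims)


def require_scope(claims: dict, resource: str, action: str) -> None:
    """Raise PermissionError when the calling service lacks the scope."""
    if not is_allowed(claims, resource, action):
        raise PermissionError(f"missing scope {resource}:{action}")
```

A target microservice would call `require_scope(claims, "participant", "create")` before executing the operation, so that a service holding only `identityattributes:read` is rejected even though its token is valid.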

3. Simpl-Open Business Architecture

3.1. Actors

An actor refers to an entity or participant that interacts with the system. Actors can be users, applications, Simpl-Open agents, etc. They play specific roles and have distinct permissions within the Data Space ecosystem.

The following context diagram introduces the main actors that will interact with each other using Simpl-Open and their interactions.

These actors are defined as follows:

Application Provider	The application providers cover all the Data Space actors offering applications to the consumers or any other type of participant. The term “application” is used in a rather broad sense in this document and it covers any sort of executables including applications, as well as algorithms, such as a trained AI model that users can leverage to analyse their data. Application providers can also define the access control policies regarding their resources and bill the users for their usage.
Data Provider	This category covers all the Data Space actors offering data to the consumers. They can share one or more data sets and regulate the access and usage over the data with the help of policies. To be compensated for the data usage, the data providers can also bill the Data Space consumers. An example of a data provider is an energy network operator sharing data on the energy grid load with energy production facilities (which act as consumers) for a production optimisation application.
Infrastructure Provider	The infrastructure providers offer infrastructure resources and services to the consumers (or possibly to any other type of participant) to enable them to process the data provided by the data providers. They can, for example, launch virtual machines or containers and run applications, algorithms, or other executables on top of the underlying infrastructure. Similarly to the data providers, the infrastructure providers can define access control policies for the infrastructure resources and bill the middleware users for their usage.
Consumer	A consumer aims to use data, applications and infrastructure shared by providers. Consumers can search for these resources and use them as allowed by the policies. For data, this typically means either using them online through the infrastructure and applications provided by application and infrastructure providers or, if the policy allows, downloading them for local usage.
Governance Authority	The Data Space participant that is accountable for creating, developing, operating, maintaining and enforcing the governance framework for a particular Data Space. ¹

¹ https://dssc.eu/space/Glossary/176553985/DSSC+Glossary+%7C+Version+2.0+%7C+September+2023

3.2. Simpl-Open Functional Architecture

The following diagram presents Simpl-Open functional architecture.

An agent per type of participant is represented and the functional components are represented as ArchiMate services.

The functional components presented on the diagram are described below, together with how they implement the building blocks from the high-level architecture and how they interact with each other. These interactions are highlighted with numbers on the diagram, which are linked to the description below through the purple numbers in brackets.

The Onboarding component implements the Onboarding building block: it provides the functionalities to submit, review and approve onboarding requests and deliver to the applicant the necessary security credentials to join a Data Space.

Both consumers and providers (data/application/infrastructure) can request to join a Data Space through the Onboarding component ( 1 ). This component allows the governance authority to check the required onboarding documents and approve/reject the onboarding request ( 2 ). If approved, the Onboarding component sets up access rights in the IAA component of the governance authority ( 3 ) and delivers security credentials to the applicant ( 4 ).

The IAA component implements the Identity Provider Federation, Authorisation, Security Attribute Provider Federation and User Roles building blocks: it serves as a security intermediary for all communications between actors and components of Simpl-Open.

Once a participant has received security credentials from the Onboarding component, they install the credentials into their own IAA component ( 5 ). Once installed, the IAA components of all participants are federated, using the IAA component of the governance authority as trust anchor ( 6 ). In reality, the IAA component is connected to every component of Simpl-Open, as any interaction with the agent must be authorised and authenticated. To keep the diagram readable, the relations between the IAA component and all the other components are not represented on the diagram.

The Vocabulary Management component implements part of the Metadata Description building block: it serves to harmonise the vocabularies in the Data Space, by providing the definition of metadata representation and, if required, the data representation standards.

The governance authority defines the vocabularies through the Vocabulary Management component ( 7 ).

The Schema Management component implements another part of the Metadata Description building block: it provides the functionalities to define the ontologies and schemas of the resource description (i.e. which properties can/should be part of it, and what their types, constraints and vocabulary are).

The governance authority defines the schemas through the Schema Management component ( 8 ).

The Resource Offering Editor component implements the last part of the Metadata Description building block. It provides the functionalities to create and sign resource descriptions (in the form of Self-Descriptions). It remains up to date with current metadata description standards by fetching schemas and vocabularies from the Schema Management and Vocabulary Management components ( 9 ).

The Federated Catalogue component implements the Resource Catalogue building block and part of the Search Engine building block. It provides the functionalities for providers to publish their resources and for consumers to discover these resources.

The Search component implements the remaining part of the Search Engine building block. It provides the functionalities for consumers to query and filter catalogue items to find the most suitable resources.

The Data Space Connector implements part of the Resource Catalogue, Usage Contract, Data Orchestration and Simple Data Transfer building blocks. It provides an implementation of the Data Space Protocol and acts as an orchestrator between its 3 parts:

Local Assets Catalogue in which the providers register the information, related to their own published resources, that is required for supporting the contract negotiation and transfer process;
Contract Negotiation provides the electronic contract negotiation required for consuming any type of resources;
Transfer Process supports the triggering of the data transfer or deployment of other types of resources.

Providers (data/application/infrastructure) create and sign a resource description in the Resource Offering Editor component ( 10 ) and can then register it in the Local Assets Catalogue of their respective Data Space Connector component ( 11 ). The data registered in the Local Assets Catalogue of the Data Space Connector is the minimal subset of metadata required to enable the next 2 parts of the Data Space Protocol (DSP): contract negotiation and transfer process.

Once registered locally, the Resource Offering Editor can publish the entire resource description (in the form of a Self-Description) to the Federated Catalogue component ( 12 ).

The Federated Catalogue validates the submitted resource descriptions against the schemas and ontologies provided by the Schema Management component ( 13 ) and against the vocabularies provided by the Vocabulary Management component ( 14 ).
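
The two validation steps ( 13 ) and ( 14 ) can be sketched as follows. This is only an illustration under simplifying assumptions: the schema is reduced to a table of property names, types and a required flag, the vocabulary to a set of allowed terms per property, and all property names and terms are hypothetical. The actual Self-Description schemas and their validation in Simpl-Open are richer than this.

```python
from typing import Any

# Hypothetical schema: property name -> (expected type, required?)
SCHEMA: dict[str, tuple[type, bool]] = {
    "title": (str, True),
    "resourceType": (str, True),
    "description": (str, False),
}

# Hypothetical vocabulary: property name -> allowed terms
VOCABULARY: dict[str, set[str]] = {
    "resourceType": {"data", "application", "infrastructure"},
}


def validate(description: dict[str, Any]) -> list[str]:
    """Return the validation errors; an empty list means the description is valid."""
    errors: list[str] = []
    # Schema check: required properties must be present and correctly typed.
    for name, (expected_type, required) in SCHEMA.items():
        if name not in description:
            if required:
                errors.append(f"missing required property: {name}")
            continue
        if not isinstance(description[name], expected_type):
            errors.append(f"wrong type for property: {name}")
    # Vocabulary check: controlled properties must use an allowed term.
    for name, allowed in VOCABULARY.items():
        value = description.get(name)
        if isinstance(value, str) and value not in allowed:
            errors.append(f"term not in vocabulary: {name}={value}")
    # Properties that the schema does not define are rejected.
    for name in description:
        if name not in SCHEMA:
            errors.append(f"unknown property: {name}")
    return errors
```

For example, `validate({"title": "Grid load", "resourceType": "data"})` returns an empty list, while a description without a title and with an unknown resource type yields one error per problem.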

Consumers can browse the resource offerings published in the Federated Catalogue through the Search component ( 15 ). Instead of having a search functionality embedded in the Federated Catalogue, the Search component is represented as a distinct component of the consumer agent, connecting to the Federated Catalogue in the governance authority agent ( 16 ). This enables the 2-tier approach for IAA: the consumer end-user connects to the Search component via tier 1, and the Search component connects to the Federated Catalogue via tier 2.

Once consumers have found a resource offering that they would like to consume, they can request the consumption in the Search component, which initiates a contract negotiation with the provider through the Data Space Connector component ( 17 ). The Search component obtains from the Federated Catalogue the address of the provider’s Data Space Connector and the identifier of the resource offering, and provides these elements to the consumer’s Data Space Connector. Based on these 2 elements, the consumer’s Data Space Connector initiates a contract negotiation with the provider’s Data Space Connector ( 18 ).

Based on the received resource offering identifier, the provider’s Data Space Connector can query its Local Assets Catalogue to obtain the necessary metadata to create a contract ( 19 ).

The provider’s Data Space Connector provides the contract to the consumer’s Data Space Connector for signature by the consumer ( 20 ).

As signing a contract is not explicitly part of the Data Space protocol, the signature process is not implemented within the Data Space Connector. Instead, it is externalised to the Contract Management component ( 21 ).

The Contract Management component implements the last part of the Usage Contract building block. It provides the functionalities to create, sign and persist usage contracts.

The consumer signs contracts through the Contract Management component ( 22 ).

Once the contract is signed by the consumer, the consumer’s Data Space Connector provides it back to the provider’s Data Space Connector for the provider to sign it ( 23 ). As for the consumer, the signature is delegated to the Contract Management component ( 24 ), through which the provider can counter-sign the contract ( 25 ). The provider’s Contract Management component persists the signed contract and provides a copy to the consumer via their Data Space Connectors ( 26 ).

The Contract Management component of the consumer persists the signed contract ( 27 ).
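
The signing sequence of steps ( 19 ) to ( 27 ) can be summarised with a small sketch. It assumes a heavily simplified model in which a signature is just the name of the signing party and the Data Space Connector exchange is a direct function call; the class and function names are hypothetical and real signatures would be cryptographic.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    offering_id: str
    signatures: list[str] = field(default_factory=list)

    @property
    def agreed(self) -> bool:
        # Agreement: the consumer has signed and the provider has counter-signed.
        return self.signatures == ["consumer", "provider"]


@dataclass
class ContractManagement:
    """Stand-in for one participant's Contract Management component."""
    party: str  # "consumer" or "provider"
    store: dict[str, Contract] = field(default_factory=dict)

    def sign(self, contract: Contract) -> Contract:
        if self.party == "consumer" and contract.signatures:
            raise ValueError("contract is already signed")
        if self.party == "provider" and contract.signatures != ["consumer"]:
            raise ValueError("provider can only counter-sign a consumer-signed contract")
        contract.signatures.append(self.party)
        return contract

    def persist(self, contract: Contract) -> None:
        if not contract.agreed:
            raise ValueError("only fully signed contracts are persisted")
        self.store[contract.offering_id] = contract


def negotiate(offering_id: str, consumer: ContractManagement,
              provider: ContractManagement) -> Contract:
    contract = Contract(offering_id)  # (19)-(20) provider creates and sends the contract
    consumer.sign(contract)           # (21)-(22) consumer signs
    provider.sign(contract)           # (23)-(25) provider counter-signs
    provider.persist(contract)        # (26) provider side persists and returns a copy
    consumer.persist(contract)        # (27) consumer side persists
    return contract
```

The order checks in `sign` reflect the sequence described above: the provider only counter-signs a contract that the consumer has already signed.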

Once a usage contract agreement is established, the Data Space Connector of the provider can start data and/or infrastructure consumption.

For standalone infrastructure consumption (see BP 08), the Data Space Connector of the infrastructure provider triggers the deployment of the Infrastructure Resource through the Infrastructure Management component ( 28 ).

The Infrastructure Management component implements the Infrastructure Orchestration, VM Provisioning, Container Provisioning and Storage Provisioning building blocks. It provides the necessary features to deploy and configure (incl. policies) infrastructure resources. It also partly implements the Data Visualisation building block by providing the functionality to deploy a built-in data visualisation application on the infrastructure resources. The remaining part of the Data Visualisation building block is implemented by the built-in application itself.

The Infrastructure Management component deploys and configures the requested Infrastructure Resource ( 29 ) and provides access details back to the consumer via the Data Space Connector ( 30 ).

The consumer gets access details from their Data Space Connector ( 31 ) and can access the Infrastructure Resource using these details (outside of Simpl-Open) ( 32 ).

For direct data download (see BP 09A), the Data Space Connector of the data provider accesses the Data Resource through the Data Transfer component ( 33 ).

The Data Transfer component provides the functionalities to access various types of data resources and transfer them between participants. It implements the Data Orchestration and Simple Data Transfer building blocks.

The Data Transfer component accesses the Data Resource ( 34 ) and transfers a copy of it to the consumer via the Data Space Connector ( 35 ). The consumer’s Data Space Connector stores the copy of the Data Resource on the consumer side ( 36 ), which can be accessed by the consumer ( 37 ).

For access to data over an application deployed on an infrastructure, currently, both the data and application resources are already available in the infrastructure provider and are deployed together with the infrastructure resource. In a future release, a solution involving the Data Space Connectors of both infrastructure and data providers will be envisaged.

The Observability component implements the Logging building block and part of the Monitoring building block. It provides the functionalities to collect and monitor logs and metrics from the other components of the agent.

In reality, the Observability component is connected to every component of Simpl-Open, as all of them produce logs and are monitored. To keep the diagram readable, the relations between the Observability component and all the other components are not represented on the diagram.

From the above architecture, 3 functional domains can be distinguished:

Access Control & Trust - This domain provides the means to join a Data Space and establish trust between participants.
Publish and consume resources - This domain is about the essence of a Data Space: allowing participants to share resources (datasets, infrastructure, applications).
Management/Operation of Data Space - This domain provides the functionalities that are necessary to manage and operate a Data Space.

The table below summarises how the functional components implement the building blocks from the high-level architecture, and how they map to the functional domains.

Functional architecture component	High-level architecture building block implemented	Functional domain
Onboarding	Onboarding	Access Control & Trust
IAA	Identity Provider Federation; Authorisation; Security Attribute Provider Federation; Authentication Provider; User Roles	Access Control & Trust
Vocabulary Management	Metadata Description (partly)	Publish and consume resources
Schema Management	Metadata Description (partly)	Publish and consume resources
Resource Offering Editor	Metadata Description (partly)	Publish and consume resources
Federated Catalogue	Resource Catalogue; Search Engine (partly)	Publish and consume resources
Search	Search Engine (partly)	Publish and consume resources
Data Space Connector	Resource Catalogue (partly); Usage Contract (partly); Data Orchestration (partly); Simple Data Transfer (partly)	Publish and consume resources
Contract Management	Usage Contract (partly)	Publish and consume resources
Infrastructure Management	Infrastructure Orchestration; VM Provisioning; Container Provisioning; Storage Provisioning	Publish and consume resources
Data Transfer	Data Orchestration; Simple Data Transfer	Publish and consume resources
Observability	Logging; Monitoring (partly)	Management/Operation of Data Space

A mapping between the functional requirements level 2 and the functional components presented above is provided in annex.

The business processes and the underlying functional requirements are available from the Simpl Programme website.

4. Simpl-Open Application Architecture

Simpl-Open Application Architecture develops the target application architecture of Simpl-Open that enables the business architecture and the architecture vision, in a way that addresses the requirements.

It identifies architecture components through the following views:

View	Description
Application Services Static View	Provides a view per business domain of the application services implementing the domain.
Application Components Static View	Provides a view per application service of the application "Solution", with all the main components and interactions.
Application Components Dynamic View	Provides a dynamic view per business process (or sub-process) of how application components are used to satisfy different workflows.

In addition to these architecture views, the following is provided:

Interfaces - describes APIs and/or UIs for each relevant architecture component presented in above views.

Simpl-Open non-functional requirements are available on the Simpl website.

Within Simpl-Open Architecture, three categories of components can be distinguished:

Simpl-Open Domain 1 (Access Control & Trust) Mandatory Services: these services are technical prerequisites to be installed for any Simpl-Open agent. Domain 1 services should be installed in at least 2 agents to allow the distributed IAA mechanism of Simpl-Open (based on the 2-tier approach).
Simpl-Open Domain 2 & 3 Optional Services: these are any other components which can be deployed on top of the Access Control & Trust components to provide all capabilities of Simpl-Open.
External Dependencies: these are mandatory capabilities for the services that require them, but they can be replaced by equivalent technologies.

The following figure presents the exhaustive list of Domain 1 Services and External Dependencies, and provides a few examples of the Domain 2 & 3 Services:

4.1. Application Components Views

Application components views are presented per functional domain in the following sub-sections.

For each functional domain, the following are presented:

a static view of the entire domain which presents all the application services that are necessary to implement the functionalities of the domain and how they interact with each other;
a set of static views that each zoom in on a specific application service and present all the application components composing the service, as well as its integration with other services;
a set of dynamic views that present how a subset of the application components is used to satisfy different (parts of) business processes.

4.2. ACV - Domain 1 - Access Control & Trust

The static domain view illustrates the structural organisation of application services involved in the domain, segmented into two types of agents (Governance Authority and a generic applicant/participant which can represent Consumer, Data Provider or Infrastructure Provider), showcasing the roles each plays in the domain.

As per the legend, components highlighted in red are foreseen to be part of Simpl-Open but are not part of the current release, while components highlighted in orange are external to Simpl-Open.

The red numbers on the diagrams help to correlate APIs with their definition which can be found in the Interfaces section, while the green letters represent the link with the User Interfaces which can be found in the same Interfaces section.

4.2.1. ACV - Domain 1 - Access Control & Trust - Static Views

ACV Static - Authorisation Service

Authorisation

The authorisation component processes all Tier 1 and Tier 2 inbound traffic originating from external sources and enforces RBAC and ABAC rules.
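
As an illustration of how role-based (RBAC) and attribute-based (ABAC) rules can be combined in a single decision, the sketch below denies by default and allows a request only when both rule sets pass. The rule tables, the action name, the role and attribute names, and the choice to require both checks together are assumptions made for this example; they do not describe the actual rule model of the Simpl-Open authorisation component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    action: str
    roles: frozenset       # roles held by the caller
    attributes: frozenset  # identity attributes held by the caller


# Hypothetical rules: action -> roles of which at least one is needed (RBAC)
RBAC_RULES = {"offering:publish": frozenset({"publisher"})}
# Hypothetical rules: action -> attributes that must all be present (ABAC)
ABAC_RULES = {"offering:publish": frozenset({"data-provider"})}


def is_authorised(request: Request) -> bool:
    """Deny by default; allow only if the RBAC and the ABAC rule both pass."""
    required_roles = RBAC_RULES.get(request.action)
    required_attributes = ABAC_RULES.get(request.action)
    if required_roles is None or required_attributes is None:
        return False  # no rule defined for this action
    has_role = bool(request.roles & required_roles)
    has_attributes = required_attributes <= request.attributes
    return has_role and has_attributes
```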

ACV Static - Identity Attributes Service

Security Attributes Provider

The Security Attributes Provider component is deployed in the Governance Authority Agent and registers the participant’s security identity attributes. Upon approval of an onboarding request, the onboarding component calls the Security Attributes Provider to associate the security identity attributes to the participant.

ACV Static - Identity Provider Service

Identity Provider

This component is deployed inside the Governance Authority Agent. It generates and renews the credentials for a newly onboarded participant and stores them along with the participant’s information. This component also allows the applicant participant to download the generated security credentials, which can then be installed in the Tier 2 Authentication Provider of the participant agent.

ACV Static - Onboarding Service

Onboarding

The Onboarding component is deployed inside the Governance Authority Agent and is the core component for managing onboarding requests from applicants (applicants can be both providers and consumers). This is where the applicant requests new Tier 1 credentials and initialises its onboarding request. The Governance Authority Tier 2 authorisation operator can approve, reject, or require new documents to fulfil the request. After the request has been approved, the applicant must create its keypair to be associated with the credential and can submit the public key to the governance authority, which triggers the creation of a Tier 2 credential by the Identity Provider component.

Refer to ACV Dynamic - BP 03A - Onboard a Participant for a full description of a Participant Onboarding.

Document Validation Service

An external validation service where custom document validation logic can be implemented and exposed through a well-established API contract.

ACV Static - Tier 1 Authentication Service

Tier 1 Authentication Provider

The Tier 1 authentication provider contains the participant’s users and roles, and allows IdP Federation.

ACV Static - Tier 2 Authentication Service

Tier 2 Authentication Provider

The Tier 2 authentication provider is the component that:

Manages the storage and update of the security credentials inside the Credentials Database/Vault component.
Inside a participant, is involved in 2 steps after the onboarding request has been approved:
- when the applicant representative creates/uploads a keypair into a participant agent;
- when the applicant representative installs the security credentials previously generated by the governance authority.
In the communication between participants, helps the Tier 2 Authorisation components to validate Tier 2 credentials (Ephemeral Proof and Security Credentials).
Keeps a copy of the Data Space identity attributes local to the agent.
Keeps details about the participant organisation owning the agent.
Supports the credential renewal flow via the governance authority.
Exposes internal APIs that help Simpl-Open components fetch information about the participants and their identity attributes.

Credentials Database/Vault

This component handles the physical storage of the participant credentials.

ACV Static - User Management Service

User & Roles

This component works as an interface in front of the tier 1 authentication provider. Its responsibilities are:

reading and writing users and roles in the tier 1 authentication provider;
mapping the Tier 1 roles to assignable security identity attributes;
creating an applicant user along with temporary credentials in the tier 1 authentication provider at the beginning of the onboarding process.

4.2.2. ACV - Domain 1 - Onboarding & IAA - Dynamic Views

ACV Dynamic - BP 03A – Onboarding of a new data space Participant - Providers (data - application - infrastructure) & Consumers

A new participant in a Data Space – whether a data provider, application provider, infrastructure provider, or consumer – begins by registering itself and obtaining a temporary Tier 1 credential.

Using the Tier 1 temporary credential, the new participant submits an onboarding request by completing the required information forms.

The onboarding request is then processed by the Governance Authority and either approved or rejected. During the review process, the Governance Authority can provide comments on the onboarding request and submit requests for additional documents to the applicant participant.

Assuming the onboarding request gets approved, the new participant creates a Tier 2 key pair and begins the process of obtaining a valid identity credential for its Simpl-Open Agent. This credential proves the ‘identity’ of the installed Simpl-Open Agent and enables secure communication with other Data Space participants. Simpl-Open Agents will only permit communication with other network participants who hold a valid identity credential.

After the participant has successfully acquired a valid identity credential, they proceed to install this credential within their Simpl-Open Agent. This installation process involves integrating the credential into the agent’s system, ensuring that it is properly recognised and authenticated.

Applicant participant creates an onboarding request: the applicant participant requests credentials from the Governance Authority, providing information about the organisation and the participant’s role in the Data Space (consumer or data/infrastructure/application provider). Credentials are created in the Governance Authority Tier 1 Authentication Provider through the User & Roles component. After the credentials have been created and stored in the Tier 1 User Database, the onboarding component creates an onboarding request with the status IN PROGRESS.

Applicant participant submits the onboarding request ⁽¹⁾ : the applicant participant logs in to the onboarding Frontend using the temporary credentials and fills the onboarding request. In addition to organisation data, the applicant must also upload any required documents, as specified by the onboarding procedure. A comment section is available to facilitate communication between the applicant and the Governance Authority. Once all mandatory documents have been provided, the applicant can submit the request for review to the Governance Authority representatives.

Governance Authority representative reviews the onboarding request ⁽¹⁾ : when the onboarding request has been submitted, the governance authority representative reviews it and decides:

1. to APPROVE the onboarding request and proceed to the credential creation step;
2. to REQUEST A REVIEW from the applicant participant, possibly requiring additional documents (“temporary rejection”);
3. to REJECT the onboarding request, in which case the onboarding process stops.

As soon as the onboarding request has been approved, the onboarding component creates the participant and saves the participant identity attributes in the Security Attributes Provider component.
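
The review outcomes above can be read as transitions of a small state machine. The sketch below is one possible reading, not the actual implementation: only the IN PROGRESS status is named in this document, so the IN REVIEW status, the action names, and the assumption that a review request returns the onboarding request to IN PROGRESS are hypothetical.

```python
from enum import Enum


class Status(Enum):
    IN_PROGRESS = "IN PROGRESS"  # created, being filled in by the applicant
    IN_REVIEW = "IN REVIEW"      # hypothetical name for a submitted request
    APPROVED = "APPROVED"
    REJECTED = "REJECTED"


# (current status, action) -> next status
TRANSITIONS = {
    (Status.IN_PROGRESS, "submit"): Status.IN_REVIEW,
    (Status.IN_REVIEW, "approve"): Status.APPROVED,
    (Status.IN_REVIEW, "request_review"): Status.IN_PROGRESS,  # "temporary rejection"
    (Status.IN_REVIEW, "reject"): Status.REJECTED,             # process stops
}


def next_status(status: Status, action: str) -> Status:
    """Return the status after an action, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(f"{action!r} is not allowed in status {status.value}") from None
```

In this reading, APPROVED and REJECTED have no outgoing transitions, which matches the description of rejection as the end of the onboarding process.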

Applicant creates a keypair: once the onboarding request has been approved, the applicant representative can start the credentials creation process inside the participant agent. The applicant representative generates a keypair and stores it in the participant agent.

Applicant triggers credential creation: the public key (whose keypair is safely stored inside the participant agent) in the form of a Certificate Signing Request (CSR) is sent by the applicant representative to the governance authority. Using the public key, the onboarding component triggers a credential creation through the identity provider component. After the creation, the applicant representative can download the credential.

Applicant installs credentials ⁽¹⁾ : the applicant participant can install the generated credential inside the participant agent along with the previously generated keypair. The Tier 1 public key is sent to the governance authority via Tier 2 communication. The governance authority receives the Tier 1 public key and notifies that the onboarding request has been completed.

(1) The integration with the Simpl-Open notification service has not yet been included in the latest release.

ACV Dynamic - BP 03B - Participant User and Roles Configuration

The participant must configure the User and Roles module to allow end users to log in and start operating with Simpl-Open. Simpl-Open administrators log into the Participant’s Agent and begin configuring roles within the Simpl-Open Agent.

After configuring roles, they can federate the local identity provider with the Authentication Provider module of the Simpl-Open Agent, if needed. This step is crucial as it ensures that the Simpl-Open Agent can accurately verify and manage existing users’ identities.

Next, end users must be managed. Administrators can create end users within Simpl-Open or manage existing users’ identities through IdP Federation. To enable end users to use Simpl-Open functionalities, administrators must assign roles to every user in the Participant’s Agent according to their duties and responsibilities. This assignment ensures that each user has the appropriate access and permissions to perform their tasks effectively.

ACV Dynamic - BP 03C - End User Role Request

To enable Tier 1 users to operate their agent after the login, at least one role must be assigned to them. Roles can either be assigned by Simpl-Open administrators or requested directly by the end user.

This process describes the Individual User Onboarding functionality, which allows users to request a role (or set of roles) directly.

1. Role Request Submission

After logging into the Participant’s Agent, the end user can create a role request and specify the desired roles. Once submitted, the system notifies Simpl-Open administrators that a new request is available for review.

2. Role Request Review

Simpl-Open administrators can review every role request submitted by the end users of the Participant’s Agent. They can either approve the request (assigning one or more roles to the requester) or reject it, in which case no roles are assigned.

Following the review, the system sends a notification to the end user informing them that their request has been processed.
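
The two steps of the role request process can be sketched as follows, assuming a simplified in-memory store. The class names, status values, role names and the notification format are hypothetical, and notifications are only recorded in a list here rather than sent through a notification service.

```python
from dataclasses import dataclass, field


@dataclass
class RoleRequest:
    user: str
    requested_roles: set[str]
    status: str = "PENDING"  # hypothetical status names


@dataclass
class UsersAndRoles:
    """Stand-in for the participant agent's user and role store."""
    roles: dict[str, set[str]] = field(default_factory=dict)
    notifications: list[tuple[str, str]] = field(default_factory=list)

    def submit(self, user: str, requested_roles: set[str]) -> RoleRequest:
        """Step 1: the end user submits a request; administrators are notified."""
        request = RoleRequest(user, set(requested_roles))
        self.notifications.append(("administrators", f"new role request from {user}"))
        return request

    def review(self, request: RoleRequest, granted_roles: set[str]) -> None:
        """Step 2: approve by granting one or more roles; an empty set rejects."""
        if request.status != "PENDING":
            raise ValueError("request has already been reviewed")
        if granted_roles:
            self.roles.setdefault(request.user, set()).update(granted_roles)
            request.status = "APPROVED"
        else:
            request.status = "REJECTED"
        # The requester is informed that the request has been processed.
        self.notifications.append((request.user, f"role request {request.status}"))
```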

ACV Dynamic - SA 03 – Credentials actions by the Governance Authority

The Governance Authority is responsible for managing the status of credentials issued to participants, ensuring compliance and preserving trust within the Data Space. The initial issuance of a participant’s credential, along with the first assignment of identity attributes, occurs during the onboarding process (see BP 03A). After onboarding, the full credential lifecycle and any subsequent identity attribute assignments are managed by the Identity Provider component within the Governance Authority.

The following diagram outlines the components involved in the actions of revoking, suspending, reactivating, renewing credentials and editing a participant’s identity attributes assignment.

Governance Authority Revokes a credential: Permanently revoke a participant’s credentials. This credential will no longer be available for use in the future.

Governance Authority Suspends a credential: Temporarily suspend credentials.

Governance Authority Reactivates a credential: Restore suspended credentials once issues are resolved.

Governance Authority Renews a credential: Extend the validity of credentials approaching expiry. Renewal can be:

Manual: the participant submits a Credential Renewal Request and the credential is renewed by the Governance Authority
Automatic: a participant can be allowed to have automatic renewal in place, in which case the credential is renewed automatically when it is about to expire

Edit Identity Attributes: Update the participant’s assigned identity attributes as needed.
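
The credential actions listed above can be modelled as a small lifecycle. The sketch below is illustrative only: the status names, the one-year validity extension and the 30-day renewal window are assumptions, not values taken from Simpl-Open.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Credential:
    expires: date
    status: str = "ACTIVE"  # ACTIVE, SUSPENDED or REVOKED (hypothetical names)
    auto_renew: bool = False

    def _require(self, expected: str) -> None:
        if self.status != expected:
            raise ValueError(f"credential is {self.status}, expected {expected}")

    def suspend(self) -> None:
        """Temporarily suspend an active credential."""
        self._require("ACTIVE")
        self.status = "SUSPENDED"

    def reactivate(self) -> None:
        """Restore a suspended credential."""
        self._require("SUSPENDED")
        self.status = "ACTIVE"

    def revoke(self) -> None:
        """Permanently revoke; a revoked credential cannot be used again."""
        if self.status == "REVOKED":
            raise ValueError("credential is already revoked")
        self.status = "REVOKED"

    def renew(self, validity: timedelta = timedelta(days=365)) -> None:
        """Extend the validity of an active credential (assumed one year)."""
        self._require("ACTIVE")
        self.expires = self.expires + validity


def auto_renew_due(credential: Credential, today: date,
                   window: timedelta = timedelta(days=30)) -> bool:
    """True if automatic renewal is enabled and expiry falls within the window."""
    return (credential.auto_renew
            and credential.status == "ACTIVE"
            and credential.expires - today <= window)
```

Because `reactivate` requires the SUSPENDED status, a revoked credential can never return to ACTIVE, which reflects the permanent nature of revocation.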

4.3. ACV - Domain 2 - Publish and consume resources

To share a resource, the provider of a data, application or infrastructure offering needs to make its offering available and findable for interested consumers. To this end, the provider needs to describe its offering in the form of metadata (called a Self-Description) and make it available in a central catalogue. This catalogue needs to provide appropriate functionality for consumers to find their desired data, application or infrastructure offerings. For the consumption of offerings, functionalities are provided to negotiate a binding contract with validation of the access policy (control plane), as well as for the technical consumption, e.g. the file transfer for data or the triggering of the deployment in the case of infrastructure.

The static domain view illustrates the structural organisation of application services involved in the domain, segmented into four types of agents (Governance Authority, Consumer, Data Provider and Infrastructure Provider), showcasing the roles each plays in the domain.

As per the legend, components highlighted in red are foreseen to be part of Simpl-Open but are not part of the current release, while components highlighted in orange are external to Simpl-Open.

To keep the diagram lighter, the following components (and their relations) are only represented within the Data Provider agent, but are in reality also part of the typical deployment of an Infrastructure Provider agent:

Connector
Contract Manager Orchestrator
Contract Manager Backend
Signer Orchestrator
Signer Async Adapter
Signer Backend
VC Issuer
Wallet
SD Tooling
Catalogue Client Application
Schema Synch Adapter
Policy Template Datastore
Contract Template Datastore
Data Orchestration Service
Schema Synch Service

4.3.1. ACV - Domain 2 - Publish and consume resources - Static Views

ACV Static - Catalogue Client Service

Catalogue Client Application

The Catalogue Client Application Frontend is the primary interface through which users interact with the Catalogue. It presents search fields and options to users, which in case of advanced search are defined by the schema. It contains:
- Quick Search UI - This UI allows the consumer/provider to perform a Quick Search on the respective Catalogue.
- Advanced UI - This UI allows the consumer/provider to perform an Advanced Search on the respective Catalogue.
The Catalogue Client Application Backend:
- sends the policy-filtered queries to the Catalogue Component via the Adapter Component. After receiving results from the Catalogue, it presents them in a structured format, so that users can easily navigate and interpret the returned self-descriptions and metadata.
- automatically transforms the schema definition into frontend files, which are used to generate a custom frontend for defining the Self-Description.

Validation Backend

The Validation Backend performs syntax validation of Self-Descriptions on the provider side before they are published to the catalogue. Furthermore, it validates the resource source address, which is used for registering service offerings in the Connector.
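As an illustration of these two checks, the following sketch validates the syntax of a Self-Description and the form of its resource address. It assumes a JSON serialisation, and the field names (`id`, `title`, `resourceAddress`) are hypothetical; the actual required fields are defined by the schema in use.

```python
import json
from urllib.parse import urlparse

# Hypothetical required top-level fields; the real set comes from the schema.
REQUIRED_FIELDS = ("id", "title", "resourceAddress")


def is_valid_address(address: object) -> bool:
    """Accept only absolute http(s) URLs with a host part."""
    if not isinstance(address, str):
        return False
    try:
        parsed = urlparse(address)
    except ValueError:
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def validate_self_description(raw: str) -> list[str]:
    """Return a list of validation errors; an empty list means the SD passed."""
    try:
        sd = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not well-formed JSON: {exc.msg}"]
    if not isinstance(sd, dict):
        return ["self-description must be a JSON object"]
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in sd]
    if "resourceAddress" in sd and not is_valid_address(sd["resourceAddress"]):
        errors.append("resourceAddress is not an absolute http(s) URL")
    return errors
```

Returning a list of errors rather than raising on the first one allows the provider to correct all reported issues in a single pass.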

Contract Consumption Adapter

The Contract Consumption Adapter component requests an Offering from the Provider. This Offering is returned with the Offering ID and the respective usage & access policies.
Once the user accepts the conditions (usage & access policies), the Contract Consumption Adapter builds the request to start the Contract Negotiation via the EDC Connector Adapter and retrieves the status of the Contract Negotiation.

ACV Static - Catalogue Service

Catalogue

Operating on the Governance Authority node, the Catalogue component functions as the central publication point for signed self-descriptions. It includes secure API functionalities for publishing, querying and managing self-descriptions. After publication, the self-description becomes accessible to potential consumers via the Catalogue’s API. The Catalogue also manages the status of self-descriptions and facilitates seamless access to information and metadata stored in the system’s databases.
When a search request is made via the Catalogue Client Application, the Catalogue’s Search Engine processes the request, taking into account the filters and parameters provided by the Policy Filter Service and Adapter Component. This ensures that the search results returned to users are both relevant and compliant with defined policies.
The Catalogue component also works closely with the Schema Registry to ensure semantic consistency across searches.
The Catalogue component contains:
- Catalogue Database - The catalogue database is one or multiple databases that persist the published Self-Descriptions.
- Search Engine - The search engine indexes the entries in the catalogue database and allows for a performant search.
- Vocabulary Datastore - The vocabulary datastore contains the loaded ontologies and schemas of the catalogue used for the semantic validation.
- Management Service - The management service allows several operations to be performed on a Self-Description, for instance its revocation.
- Syntax Validation Service - The Syntax Validation Service checks the syntax of the Self-Description before publication.
- Semantic Validation Service - The Semantic Validation Service checks the semantics of the Self-Description before publication. In detail, it validates the SHACL constraints and checks whether the Self-Description complies with the ontologies in the catalogue.
- Quality Rule Validation Service - The Quality Rule Validation Service checks the quality of the Self-Description before publication. It checks if all mandatory quality rules are fulfilled and uses the recommended quality rules to calculate the quality score for the Self-Description.
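The distinction between mandatory and recommended quality rules can be sketched as follows. The rule names and the scoring formula (share of recommended rules that pass) are assumptions chosen for illustration; the actual rules and score calculation are defined by the Governance Authority.

```python
from typing import Callable

Rule = Callable[[dict], bool]

# Hypothetical rules, for illustration only.
MANDATORY_RULES: dict[str, Rule] = {
    "has_title": lambda sd: bool(sd.get("title")),
    "has_provider": lambda sd: bool(sd.get("provider")),
}
RECOMMENDED_RULES: dict[str, Rule] = {
    "has_description": lambda sd: bool(sd.get("description")),
    "has_keywords": lambda sd: len(sd.get("keywords", [])) > 0,
}


def check_quality(sd: dict) -> tuple[bool, float, list[str]]:
    """Return (accepted, quality_score, failed_mandatory_rules).

    A Self-Description is accepted only if every mandatory rule passes.
    The score is the share of recommended rules that pass (an assumed formula).
    """
    failed = [name for name, rule in MANDATORY_RULES.items() if not rule(sd)]
    passed = sum(1 for rule in RECOMMENDED_RULES.values() if rule(sd))
    score = passed / len(RECOMMENDED_RULES) if RECOMMENDED_RULES else 1.0
    return (not failed, score, failed)
```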

Remark on Catalogue Deployments

In the current architecture view the catalogue is depicted as a single component, yet a different schema is used for each type of resource (data, application and infrastructure). The catalogue might thus be deployed multiple times (e.g. for testing purposes). The way this is deployed is subject to change. In the future, the catalogues (data, infrastructure and application) may be kept in a single component deployment and be separated by their different schemas.

According to the Data Space Protocol (DSP) specification, each implementation of a connector has to provide a local assets catalogue instance listing all registered service offerings (assets) and usage contract offerings of its provider. Hence, as a prerequisite to adding or updating a resource, these service offerings (assets) first have to be registered at the connector. There, the contract negotiation ID needed to start the contract negotiation is created. This ID is crucial for the self-description, as it provides any consumer with the link to start the contract negotiation.

Query Mapper Adapter

The Query Mapper Adapter component functions as an intermediary, translating user-defined search parameters into a format compatible with the Catalogue’s database query language. These translation capabilities allow users to perform complex searches without needing to know the technical specifics of the database’s query language, making it easier for users to interact with the Catalogue in a secure and user-friendly manner.

Policy Filter Service

The Policy Filter service dynamically enforces access policies on search queries. It applies the access control rules defined within each self-description, filtering search results based on the user’s permissions.

This service is integrated in the Query Mapper Adapter component to embed policy-based filters into search queries before they are sent to the Catalogue. This integration ensures that all queries reflect the necessary governance controls, restricting access to authorised users and ensuring that sensitive information remains protected. In this way, the Policy Filter Service works as an invisible layer of security that ensures compliance while providing authorised access to the appropriate search results.
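The interplay between query mapping and policy filtering can be sketched as below. The `Query` structure, the `accessPolicy` field and the attribute-based matching are hypothetical simplifications; the real component translates into the Catalogue's own query language, which is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical internal query: field/value filters plus a policy clause."""
    filters: dict[str, str] = field(default_factory=dict)
    allowed_for: frozenset[str] = frozenset()


def map_query(search_terms: dict[str, str], user_attributes: set[str]) -> Query:
    """Translate user search terms and embed the policy-based filter."""
    cleaned = {k: v.strip().lower() for k, v in search_terms.items() if v.strip()}
    return Query(filters=cleaned, allowed_for=frozenset(user_attributes))


def execute(query: Query, catalogue: list[dict]) -> list[dict]:
    """Return entries that match every filter and whose policy admits the user."""
    results = []
    for sd in catalogue:
        matches = all(str(sd.get(k, "")).lower() == v for k, v in query.filters.items())
        # Assumption: an entry without "accessPolicy" is visible to everyone.
        required = set(sd.get("accessPolicy", []))
        if matches and required <= query.allowed_for:
            results.append(sd)
    return results
```

Because the policy clause is part of the query itself, an entry the user is not allowed to see never appears in the result set, rather than being removed afterwards on the client side.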

ACV Static - Connector Service

Connector

The Connector component registers each resource (dataset, application, or infrastructure) as an asset within the Data Space, associating policies and contracts with each asset. It also provides controlled endpoints for each resource, playing an intermediary role in the contract negotiation process by leveraging the policies and contract templates associated with the resource. This enables the management of contractual relationships between providers and consumers. The Connector also functions as a gateway for secure data exchange and ensures that policies are enforced during data consumption. It is responsible for enforcing security protocols and managing policies that govern access to the data (simple dataset or bundle).
The Connector component implements the Data Space protocol and contains the following sub-components:
- Control Plane - The control plane of the connector acts as a state machine, overseeing the various states and transitions specified in the Contract Negotiation Protocol. It ensures that all agreements between the data provider and consumer are finalised before any data transactions take place. The Control Plane at the Provider side includes the Local Assets Catalogue component. An Asset is the primary building block for resource sharing: it represents any data or API endpoint that can be shared. Assets are descriptors that are loaded into EDC via its Management API during the registration phase, which is performed before a resource is uploaded to the catalogue. In the case of a bundle, it is the URL that triggers the deployment script which will deploy the requested infrastructure and application.
  To perform its functions, the control plane interacts with the Management API, the Protocol API and the Policy Engine.
- Data Plane - The Data Plane performs the data exchange based on the transfer protocol. The exchange only takes place once the contract negotiation protocol has successfully established a contract; it is controlled by the control plane and performed by the data plane. The data plane component, which is an extension of the connector, manages the actual data exchange, ensuring that data flows securely from the provider’s source to the consumer’s specified destination in line with the agreed terms of the contract. In the scenario of a bundle of infrastructure, data and application, the role of the data plane component is performed by the infrastructure orchestrator, which is responsible for retrieving the deployment script ID from the Asset of the resource to be used and triggering the execution of the script at the infrastructure provider. Once the provider completes the deployment, it returns the access details for the newly created environment. These details are then forwarded to the user, who uses them to access the infrastructure directly.
- Management API - The Management API is a RESTful interface for client applications to interact with the control plane.
- Dataspace protocol API - The Dataspace protocol API is a RESTful API interface that is used for the contract negotiation protocol.
- Policy Engine - The Policy Engine is crucial in making decisions based on the policies tied to the requested resource. The policy engine is able to perform this operation because the policies are registered and linked to the registered assets (Assets component in the Control Plane). This allows the policies to be retrieved at this moment and the necessary checks to be carried out. This component evaluates whether all policy requirements are met and, if they are not, it can halt the process to prevent unauthorised access.
- Triggering Extension - The triggering extension sends the DeploymentScriptID and the email address of the consumer to the Infrastructure Triggering Module at the time of finalising a contract agreement. This results in the provisioning of the infrastructure resources and the deployment of the applications on those resources.
- S3 Object Storage Extension - The S3 Object Storage Extension can transfer datasets from the S3 Object Storage of the Data Provider to the S3 Object Storage of the Data Consumer at the time of finalising a Data Transfer Contract.
The EDC Connector Adapter handles interaction with the EDC Connector. This includes:
- the registration of the resource offering, together with the associated policies, in the connector during the creation of the resource description
- the provision of the Connector references of the asset for the resource descriptions during the request of a resource and its consumption
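The state-machine behaviour of the control plane can be sketched as follows. The state names follow the Contract Negotiation Protocol of the Dataspace Protocol as understood at the time of writing; the transition table is a simplification and should be checked against the specification version in use. The `Negotiation` class and its method names are hypothetical and do not correspond to an EDC API.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "REQUESTED"
    OFFERED = "OFFERED"
    ACCEPTED = "ACCEPTED"
    AGREED = "AGREED"
    VERIFIED = "VERIFIED"
    FINALIZED = "FINALIZED"
    TERMINATED = "TERMINATED"


# Allowed transitions (simplified); FINALIZED and TERMINATED are terminal.
TRANSITIONS = {
    State.REQUESTED: {State.OFFERED, State.AGREED, State.TERMINATED},
    State.OFFERED: {State.REQUESTED, State.ACCEPTED, State.TERMINATED},
    State.ACCEPTED: {State.AGREED, State.TERMINATED},
    State.AGREED: {State.VERIFIED, State.TERMINATED},
    State.VERIFIED: {State.FINALIZED, State.TERMINATED},
    State.FINALIZED: set(),
    State.TERMINATED: set(),
}


class Negotiation:
    """Hypothetical negotiation record started by a consumer request."""

    def __init__(self) -> None:
        self.state = State.REQUESTED

    def advance(self, target: State) -> None:
        """Move to the target state, rejecting transitions not in the table."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {target.name}")
        self.state = target

    def transfer_allowed(self) -> bool:
        """The data plane may only act once the negotiation is finalised."""
        return self.state is State.FINALIZED
```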

ACV Static - Contract Service

Contract Manager (Orchestrator and Backend)

The Contract Manager coordinates with the Verifiable Credentials Issuer (VC Issuer) Component, Signer Component and Wallet Component to integrate contract validation, issuance and storage functionalities. It also stores contracts for billing and record-keeping purposes, centralising key contract-related data.

Note: Currently, interactions with the VC Issuer Component, Signer Component and Wallet Component are streamlined through a single stub interface. Additionally, contract storage and Wallet emulation are consolidated into a single database, simplifying the initial implementation.

Message Broker

Interactions between the contract manager orchestrator and the contract manager backend are designed to be asynchronous. The role of the Message Broker is to facilitate these asynchronous processes.

ACV Static - Contract Template Datastore Service

Contract Template Datastore

The Contract Template Datastore stores Contract Templates to ensure consistent application of contract terms, which are later accessible to consumers during resource negotiation and access stages.

ACV Static - Data Orchestration Service

Orchestration Platform

The Orchestration Platform is able to execute Data Workflows to process or preprocess data using custom or built-in data services, like data anonymization.

ACV Static - Infrastructure Connector Service

Provisioned Node (Infrastructure Consumer) / Private Network

The Provisioned Node is created by the Infrastructure Provisioner (see ACV Static - Infrastructure Provisioning Service) on behalf of the Consumer. The Access Data (credentials and other details) is also communicated to the Consumer.
In principle, the Consumer therefore has access to:
- the Infrastructure (Provisioned Node)
- the Private Network

ACV Static - Infrastructure Provisioning Service

Triggering Module

The Triggering Module component is responsible for adding, managing and executing the deployment scripts and finally sharing the access data. The triggering module is made of three submodules:
- Script Storage Management submodule (accessible via the API and the Infrastructure Deployment Script Management UI): Responsible for adding and managing the deployment scripts. It contains the following functions:
  - Add Script: Enables users to add deployment scripts to the local repository and database, ensuring the scripts are accessible for future provisioning tasks. This function also performs security checks to prevent the uploading of malicious scripts and files.
  - Remove/Invalidate Script: Manages the removal or invalidation of outdated scripts from the repository and database.
- Script Execution submodule: When an API call requests the triggering of the deployment script, this module initiates the execution process. Its functions are:
  - Retrieve Deployment Script: Retrieves deployment scripts from the repository, allowing the Infrastructure Provisioner to execute the necessary steps for the resource provisioning and software deployment.
  - Validate Deployment Script: Generates a hash of the script retrieved from the repository and compares it to the hash that was stored in the database when the script was added, in order to check integrity and authenticity and confirm that the script is unaltered.
  - Trigger Execution: Communicates with the Infrastructure Provisioner via a message broker to initiate the provisioning process.
- Access Management submodule: When the provisioning is done, it shares the access information, such as endpoints and credentials, with the consumer:
  - Retrieve and Share Access Data: Obtains access credentials and details from the Infrastructure Provisioner, making them available for distribution to the necessary stakeholders.
The Triggering Module exposes its functionality via an API, enabling other Simpl-Open Agent modules to interact with it as needed. After triggering the execution of the deployment script, the triggering module listens for provisioning completion events from the Infrastructure Provisioner to confirm successful deployment and share the access data.
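The Validate Deployment Script function described above can be illustrated with a short sketch. The choice of SHA-256 and the function names are assumptions; the source only states that a hash of the retrieved script is compared with the hash stored when the script was added.

```python
import hashlib
import hmac


def script_digest(script: bytes) -> str:
    """Compute the digest stored in the database when a script is added.

    SHA-256 is an assumed choice of hash function.
    """
    return hashlib.sha256(script).hexdigest()


def validate_script(script: bytes, stored_digest: str) -> bool:
    """Check that the script retrieved from the repository is unaltered.

    compare_digest performs a constant-time comparison of the two digests.
    """
    return hmac.compare_digest(script_digest(script), stored_digest)
```

A usage example: the digest is computed once at Add Script time and checked again before Trigger Execution, so that a script modified in the repository after registration is rejected.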

Infrastructure Provisioner

The Infrastructure Provisioner component is an asynchronous service that orchestrates the actual provisioning of infrastructure resources and potential deployment of the applications and datasets (in case they are a part of the deployment script). Upon receiving a deployment trigger from the Triggering Module, this component follows several steps to ensure resources are provisioned, configured and made accessible. This module contains two submodules for provisioning and decommissioning:
- Provisioning sub-component: Provisions the infrastructure resources, creates/grants access to them and runs post-configuration processes to set policies and to deploy applications.
  - Execute Deployment Script: Runs the deployment script received from the Triggering Module, provisioning resources such as compute instances, storage, or other assets.
  - Set Policies: Defines infrastructure-specific usage and access policies to govern resource usage, aligning with predefined rules in the deployment script to control who can access the provisioned infrastructure resource.
  - Create Access Information: Generates and provides access credentials and endpoints, allowing authorised users to interact with the infrastructure.
  - Post Configuration: Can deploy applications and load datasets on the provisioned infrastructure resource.
  - Share Access Data: Returns the generated access information to the Triggering Module / Access Management, so the information can be shared with the consumer.
- Decommissioning sub-component: Decommissions the infrastructure asset based on the criteria set by the business (e.g. the end date of the contract). The two main functions are:
  - Pre-decommissioning: Initiates the pre-set decommissioning configurations, such as notifying the consumer and making snapshots/backups.
  - Access Revocation: Revokes user access, if applicable, and triggers the final termination process.
The Infrastructure Provisioner is not directly exposed via the public API; it is only reachable through the processes of the Triggering Module.

Infrastructure Provider Storage

The Infrastructure Provider Storage component houses both a Database and a Repository to store deployment scripts. The Storage component can support versioning, audit trails and controlled access, thus facilitating compliance and security in deployment operations.

Message Broker

Certain processes (e.g. provisioning the infrastructure resources) are designed to be asynchronous. The role of the Message Broker is to facilitate these asynchronous processes.
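The asynchronous exchange between the Triggering Module and the Infrastructure Provisioner can be sketched with two in-process queues standing in for broker topics. The message fields, the `COMPLETED` status and the example endpoint are hypothetical; a real deployment would use an external message broker.

```python
import queue

# In-process stand-ins for two broker topics (assumption: the real broker is external).
triggers = queue.Queue()
completions = queue.Queue()


def trigger_execution(script_id: str, consumer_email: str) -> None:
    """Triggering Module: publish a trigger and return without waiting."""
    triggers.put({"scriptId": script_id, "consumer": consumer_email})


def provisioner_step() -> None:
    """Infrastructure Provisioner: consume one trigger, publish a completion event."""
    msg = triggers.get()
    # Placeholder for executing the script, setting policies and post-configuration.
    access = {"endpoint": f"https://node.example/{msg['scriptId']}", "user": msg["consumer"]}
    completions.put({"scriptId": msg["scriptId"], "status": "COMPLETED", "access": access})
    triggers.task_done()


def share_access_data() -> dict:
    """Access Management: read the completion event and return the access data."""
    event = completions.get()
    completions.task_done()
    return event["access"]
```

The point of the design is that `trigger_execution` returns immediately: the caller is decoupled from the potentially long-running provisioning and only reacts once a completion event arrives.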

ACV Static - Issuer Service

Verifiable Credentials Issuer (VC Issuer)

The VC Issuer component securely issues and manages verifiable credentials, providing transparent and reliable validation of usage contracts. It relies on the Signer component to apply cryptographic signatures to contracts, ensuring data integrity.
Signed usage contracts are stored in the Wallet component for secure access, facilitating a robust and trustworthy credential management process.

ACV Static - Policy Template Datastore Service

Policy Template Datastore

The datastore contains templates of the policies that can be used as a blueprint to describe the access and usage policies for a resource.

ACV Static - Resource Offering Service

SD Tooling

Located on the Provider Node, the SD Tooling component enables providers to define self-descriptions for their resources by leveraging schemas from the Schema Registry. This ensures each self-description adheres to predefined properties and constraints. The SD Tooling Component supports both UI and API methods, providing flexibility to providers. It works in tandem with the Policy Creator and Contract Template components, allowing providers to incorporate policy and contract terms directly into self-descriptions.
The SD Tooling component contains:
- SD Manager - The SD Manager allows users to manage their published Self-Descriptions, for instance by triggering a revocation.
- SD Creation Tool - The SD Creation Tool supports the provider in the creation of the Self-Description of their resources, by providing a generated frontend from the schema with the correct property fields.
- Policy Creator - The Policy Creator component enables the creation and management of Access and Usage Policies for resources. Access Policies determine the accessibility of a resource, while Usage Policies outline the permissible uses and monitor the extent of usage to support billing based on consumption. These policies are serialised into a standardised format to ensure consistent application and interpretation across components. Integrated into the Self-Description, they contribute to a governed, comprehensive resource description.
- Contract Template Editor - The Contract Template Editor enables the creation and the customisation of contract templates linked to resources in self-descriptions. These templates, once created, are stored in the Contract Template Datastore.

ACV Static - Schema Management Service

Schema Management

The Schema Management Service component represents the Metadata Description building block, enabling the Governance Authority to define the structure of self-descriptions. Using a UI or API, the Governance Authority can establish properties, data types, constraints and controlled vocabularies that apply across resources (datasets, applications, infrastructure). The resulting schema configurations are automatically transformed into semantic files and managed within the Schema Synch Service, ensuring the Provider Node has access to the most current schema standards for generating self-descriptions in compliance with governance protocols.
The Schema Backend functions as a central repository and management interface for schemas created by the Governance Authority. These schemas, represented as ontologies and structured schema definitions, are actively managed to provide consistent standards across resource descriptions. Serving as an application component rather than a simple data storage element, the Schema Registry facilitates regular synchronisation with the Provider Node, ensuring that providers always have access to the latest schema standards needed for creating compliant self-descriptions.
The Schema Registry is used by the catalogue client application to enable semantic consistency by defining and validating the terms used in self-descriptions and search fields. The Search Client uses the schema to define the search fields for the advanced search. This automatic form generation helps prevent ambiguous searches and ensures users can only search for terms recognised within the Data Space.
The Schema Synch Adapter synchronises the schemas with the agents and makes sure that the schemas for dependent components are accessible and up to date.
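The automatic generation of advanced search fields from a schema can be illustrated as below. The schema shape (`datatype`, `allowed`) and the widget names are hypothetical simplifications; the actual schemas are expressed as ontologies and structured schema definitions.

```python
# Hypothetical, simplified schema: property name -> datatype and optional vocabulary.
SCHEMA = {
    "title": {"datatype": "string"},
    "licence": {"datatype": "string", "allowed": ["CC-BY-4.0", "CC0-1.0"]},
    "createdAfter": {"datatype": "date"},
}


def build_search_fields(schema: dict) -> list[dict]:
    """Derive one search form field per schema property.

    Properties with a controlled vocabulary become a selection list, so users
    can only search for terms that the schema recognises.
    """
    widgets = {"string": "text", "date": "date-picker"}
    fields = []
    for name, spec in schema.items():
        if "allowed" in spec:
            fields.append({"name": name, "widget": "select", "options": list(spec["allowed"])})
        else:
            fields.append({"name": name, "widget": widgets.get(spec["datatype"], "text")})
    return fields
```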

ACV Static - Schema Synch Service

The Schema Synch Service implements:

The Schema Synch Adapter API, which receives any updates from the Schema Management Service
The Schema Synch Adapter, which retrieves the schema updates and processes them

ACV Static - Signer Service

Signer Service

The Signer Service component manages the digital signing of self-descriptions, ensuring their authenticity and integrity. Upon completion, the self-description is signed using the provider’s private key to verify identity and prevent tampering. Once signed, the self-description is ready for distribution and is published to the relevant Catalogue component for broader access. This service is crucial for establishing trust between providers and consumers.
The Signer Service component provides cryptographic signing capabilities for contracts, ensuring non-repudiation and authenticity. This component validates the identity and integrity of each contract, instilling confidence in the security of agreements.

ACV Static - Vocabulary Management Service

Vocabulary Management

ACV Static - Wallet Service

Wallet

The Wallet component serves as a secure digital repository for storing, managing and presenting verifiable credentials (VCs). This “digital wallet” enables providers to securely manage and share their credentials, ensuring compliance with contractual requirements while facilitating efficient access to validated information.

4.3.2. ACV - Domain 2 - Publish and consume resources - Dynamic Views

The following sub-sections contain dynamic views that each present how a subset of the above-described application components is used to satisfy different (parts of) business processes.

The first part is that the provider needs to describe his resource using a predefined schema that when tailored to the resource at hand becomes a self-description. Next the provider needs to make this self-description (SD) available for potential search. This process is described in detail in ACV Dynamic - BP 05 - Add or Update Resource (Publish) on Catalogue. In simple terms, this publication process consists of:
- The provider describes the offering using the SD Tooling; what the description should look like is defined in the schema of the self-description.
- The provider registers the SD as an asset in the connector. The asset is composed of a subset of the metadata present in the SD, namely only the metadata that will be needed later during consumption.
- The provider signs the Self-Description with its credentials to prove ownership and to make the Self-Description tamper-proof.
- Finally, the provider publishes the Self-Description to the central catalogue on the Governance Authority Node so the consumer can search for it. The Governance Authority checks automatically if the Self-Description is correct according to the syntax, semantics and quality.
Updating a Self-Description consists of first revoking the old version of the Self-Description and then publishing a new version; for details see ACV Dynamic - BP 05 - Add or Update Resource (Publish) on Catalogue.
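The revoke-then-publish update can be sketched with an in-memory stand-in for the catalogue. The `Catalogue` class, its method names and the status labels `active` and `revoked` are assumptions made for illustration and do not represent the Catalogue API.

```python
class Catalogue:
    """In-memory stand-in for the central catalogue (illustrative only)."""

    def __init__(self) -> None:
        self.entries: dict[str, dict] = {}  # sd_id -> {"sd": ..., "status": ...}

    def publish(self, sd_id: str, sd: dict) -> None:
        if sd_id in self.entries:
            raise ValueError(f"{sd_id} is already published")
        self.entries[sd_id] = {"sd": sd, "status": "active"}

    def revoke(self, sd_id: str) -> None:
        # The revoked entry is kept for traceability, but no longer found by search.
        self.entries[sd_id]["status"] = "revoked"

    def active_ids(self) -> list[str]:
        return [i for i, e in self.entries.items() if e["status"] == "active"]


def update_self_description(catalogue: Catalogue, old_id: str, new_id: str, new_sd: dict) -> None:
    """An update is a revocation of the old version followed by a new publication."""
    catalogue.revoke(old_id)
    catalogue.publish(new_id, new_sd)
```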
The third process is that the consumer searches the catalogue for a dataset, application or infrastructure offering. The consumer defines the search terms in the search client app and the catalogue on the Governance Authority agent executes the search. This is described in detail in ACV Dynamic - BP 06 - Search on Catalogue (Infrastructure, Data, Application). The process consists of:
- The consumer (or provider) uses the search client app to enter the search terms. Two ways of searching are available: quick search and advanced search.
- The consumer calls a service in the Query Mapper Adapter on the Governance Authority. This service maps the search terms onto executable queries for the catalogue and also ensures that the consumer can only see the offerings that allow it by enforcing the policy.
- The query is executed by the catalogue itself and the results are returned to the consumer (provider).
The fourth part consists of the consumption, which comes in three variants: direct data download, infrastructure consumption and data consumption through an application (also sometimes referred to as a “bundle”):

4a. The first type of consumption is the direct consumption of a dataset (ACV Dynamic - BP 09A - Consumer consumes a data resource from the provider). The consumer has found the desired offering in the central catalogue and next wants to consume the data. In simple terms, this process consists of three sub-processes:

- The consumer uses the connector to establish a contract with the provider (described in detail in ACV Dynamic - BP 07A - Establish a usage contract agreement).
- The control planes perform the contract negotiation between the connector of the consumer and provider (also includes the enforcement of the policies).
- The data plane is used to transfer the data from provider to consumer.

4b. The second type of consumption is the infrastructure consumption (ACV Dynamic - BP 08 - Consumers select and use an Infrastructure Catalogue Resource from the Infrastructure Provider). The consumer finds the offering in the central catalogue and then performs the request for its consumption.

- The consumer uses the connector to establish a contract with the provider (described in detail in ACV Dynamic - BP 07A - Establish a usage contract agreement).
- The control planes perform the contract negotiation between the connectors of the consumer and the provider. The control plane also includes the enforcement of the policies.
- The infrastructure provider retrieves, validates and triggers the deployment script of the infrastructure offering.
- The infrastructure provider retrieves the access data for the infrastructural resource and shares them with the consumer.

4c. Besides the direct consumption of a dataset, data consumption is also supported through a processing service provided by an application (ACV Dynamic - BP 09B - Consumer receives data processing service over a dataset via an Application). The steps are a bit more complex due to the need for infrastructure to host the application and dataset:

- The consumer uses the connector to establish a contract with the provider (described in detail in ACV Dynamic - BP 07A - Establish a usage contract agreement).
- The control planes perform the contract negotiation between the connectors of the consumer and the provider (also includes the enforcement of the policies).
- An infrastructure is provisioned for the consumption (ACV Dynamic - BP 08 - Consumers select and use an Infrastructure Catalogue Resource from the Infrastructure Provider) and the dataset and application are installed on that infrastructure.
- The consumer gets (restricted) access to this infrastructure.

ACV Dynamic - BP 05B - Provider manages resource descriptions

Define and Publish Self-Description

This process outlines how a self-description can be defined and subsequently published in the Catalogue. Certain fields within the self-description link to other resources, which therefore need to be created beforehand. For instance, an infrastructure offering requires a deployment script to be added in advance so that it can be referenced within the self-description.

Schema Synchronisation: The SD Tooling component on the Provider side initiates a request for schema definitions from its local Schema Registry, which is kept in sync with the one on the Governance Authority node. This ensures consistent schema access and alignment across participants, supporting unified self-description formats in the Data Space. The retrieved schema definitions are stored in a local Schema Datastore on the Provider’s end, ensuring quick access and version control.
Create Self-Description: Providers can create a new self-description or modify an existing one through the User Interface. This interface allows them to fill in necessary fields or use a previously stored self-description template.
Syntax Validation: The Syntax Validation component within SD Tooling checks the initial structure of the self-description to confirm it meets the required format. While primarily focusing on the form of the self-description, this step also checks for basic schema compliance. If any issues are found, the provider is prompted to make corrections before proceeding.
Registering Self-Description: Following syntax validation, the self-description is directed to the Connector component, where it is registered as an asset. This registration is critical for linking the self-description to a specific connector instance, enabling controlled access for consumption by the consumer.
Signing and Publication: The Signing/Publication Service manages the integrity and authenticity of the self-description. It signs the document using the Provider’s private key to prevent tampering and then publishes the signed self-description to the Catalogue. A copy of the signed self-description is also stored locally in the Provider’s Wallet for record-keeping purposes. The Wallet maintains a history of signed copies, with any necessary purge or retention policies applied to manage storage effectively. These policies should specify when older records are archived or deleted to optimise space and meet governance standards.
Semantic Validation: After publication, the Catalogue on the governance node initiates Semantic Validation. This step checks that the self-description adheres to the Data Space’s vocabularies and ontology standards, ensuring semantic consistency.
Quality Check: The Catalogue also performs a Quality Rules check to verify that the self-description meets all mandatory quality standards. If any semantic or quality issues are identified, the End User is notified to address the specific violations. If the self-description passes all checks successfully, the End User receives a confirmation notification, indicating that the resource is now ready for publication within the Data Space.
Database Storage: Upon passing all validations, the self-description is stored in the Catalogue’s Database along with its associated metadata, making it discoverable and accessible to other participants in the Data Space.
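
The ordering of the provider-side and catalogue-side checks can be illustrated with a minimal sketch. It covers only syntax validation, semantic validation and storage; registration, signing and the quality check are omitted. The field names, the vocabulary and the function names are assumptions made for illustration and are not part of the Simpl-Open specification.

```python
REQUIRED_FIELDS = ("id", "type", "title")  # hypothetical schema, for illustration
VOCABULARY = {"dataset", "application", "infrastructure"}


def syntax_validation(sd: dict) -> list:
    """Provider side: structural check before registration and signing."""
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in sd]


def semantic_validation(sd: dict) -> list:
    """Catalogue side: check values against the Data Space vocabulary."""
    if sd.get("type") not in VOCABULARY:
        return [f"unknown resource type: {sd.get('type')!r}"]
    return []


def publish(sd: dict, database: dict) -> list:
    """Run both checks in order; store the SD only if no violation is found."""
    violations = syntax_validation(sd) or semantic_validation(sd)
    if not violations:
        database[sd["id"]] = sd
    return violations


database = {}
ok = publish({"id": "sd-1", "type": "dataset", "title": "Air quality"}, database)
bad = publish({"id": "sd-2", "type": "spaceship", "title": "Unknown"}, database)
```

In this sketch a self-description becomes discoverable (is written to `database`) only when the returned list of violations is empty, which mirrors the rule that storage happens after all validations have passed.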

Retrieve SD Metadata

This process illustrates how the metadata (such as status) of a self-description (SD) can be retrieved by a Provider. The different possible statuses for a self-description are outlined in the Gaia-X Federation Services documentation (40.3 Product Constraints).

Initiate Metadata Request: The Provider initiates a request to retrieve metadata associated with a specific self-description. This request includes the unique identifier of the desired self-description and is sent to the Catalogue on the Governance Authority Node.
Query Metadata: Upon receiving the request, the Governance Authority Node processes it by querying its Metadata Database to locate the requested metadata. This step ensures that the metadata aligns with the unique identifier provided.
Return Metadata: After locating the requested metadata, the Catalogue prepares a response containing the metadata details and returns it to the SD Manager. This ensures the requested information is available for consumption.
Display to User: The SD Manager receives the metadata and presents it to the Provider’s end user, allowing them to view details like the status of the self-description.

Retrieve Full Self-Description

This sequence describes the steps taken by a provider to retrieve a complete Self-Description (SD) for a resource.

Initiate SD Request: The process begins with the provider using the SD Manager to initiate a request for a specific self-description by sending its unique identifier (SD ID) to the Management Service hosted on the Governance Authority Node.
Query Self-Descriptions: After processing, the Management Service queries the Self-Description Database to retrieve the detailed self-description associated with the given SD ID.
Respond with Self-Description: Once the full self-description is retrieved, it is sent back to the provider through the Response SD component. This self-description includes all necessary metadata and resource information.
Display to User: The SD Manager on the provider node displays the retrieved self-description to the provider’s end user through its User Interface.
Optional Storage in Wallet: Optionally, the retrieved self-description can be stored in the provider’s Wallet for local record-keeping or offline access.

Revoke a Self-Description (SD)

This process outlines how a provider can revoke an SD in the system, with possible statuses detailed in the Gaia-X documentation (40.3 Product Constraints).

Initiate Status Change: The provider, through the SD Manager on the Provider Node, initiates a request to revoke a specific SD by sending the SD ID to the Catalogue on the Governance Authority Node.
Revoke SD: The system then revokes the SD in the Catalogue database to reflect the new status of the self-description.
Response and Display: The Management Service confirms the status update by returning the new status to the SD Manager. If the user interface (UI) is used, the updated status is displayed to the provider’s end user.
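
The three flows above (retrieve metadata, retrieve the full self-description, revoke) can be illustrated with a minimal in-memory sketch. The `Catalogue` class, its method names and the status labels `active` and `revoked` are assumptions made for illustration; they are not the Simpl-Open or Gaia-X API, and the authoritative list of statuses is the one in the Gaia-X documentation referenced above.

```python
from dataclasses import dataclass, field


@dataclass
class SelfDescription:
    sd_id: str
    content: dict
    status: str = "active"  # assumed status label, for illustration only


@dataclass
class Catalogue:
    """Hypothetical stand-in for the Catalogue on the Governance Authority Node."""
    _store: dict = field(default_factory=dict)

    def publish(self, sd: SelfDescription) -> None:
        self._store[sd.sd_id] = sd

    def get_metadata(self, sd_id: str) -> dict:
        # Retrieve SD Metadata: only the metadata (e.g. status) is returned.
        sd = self._store[sd_id]
        return {"sd_id": sd.sd_id, "status": sd.status}

    def get_self_description(self, sd_id: str) -> dict:
        # Retrieve Full Self-Description: metadata plus resource information.
        sd = self._store[sd_id]
        return {"sd_id": sd.sd_id, "status": sd.status, "content": sd.content}

    def revoke(self, sd_id: str) -> str:
        # Revoke SD: update the status and return the new status to the caller.
        sd = self._store[sd_id]
        sd.status = "revoked"
        return sd.status


catalogue = Catalogue()
catalogue.publish(SelfDescription("sd-001", {"title": "Example dataset"}))
new_status = catalogue.revoke("sd-001")
```

After the call to `revoke`, a metadata request for `sd-001` reports the new status, while the full self-description remains retrievable; whether a revoked entry should still be returned in full is a design decision that this sketch does not settle.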

ACV Dynamic - BP 06 – Consumer searches resources in data space catalogues

The process describes the end user searching for a resource in the catalogue. The end user can use either the quick search or the advanced search. For the advanced search, it is a prerequisite that the local schema registry of the provider/consumer has been synced manually with the central schema registry of the governance authority. The search request is sent to the catalogue. The query mapper translates the query input into the related database query language and adds the filters based on the access policies related to the user performing the request. The search engine executes the search queries and returns the results. The results are then displayed in the end user’s Search Client.

User Search Request Initiation: The user initiates a search request through the Search Client. Here, the user enters the search criteria, which could include keywords, filters, or other parameters relevant to the desired resources (e.g., datasets or applications).
For the Advanced Search, the form of the search is defined by the schema in the Schema Registry.
Policy Filter Service: The validated search request is forwarded to the Policy Filter Service. This service checks the user’s access rights based on the policies defined in the Policy Creator Component.
By applying the relevant filters, the Policy Filter Service modifies the search query to restrict results to only those resources the user is authorised to view.
Query Translation by Adapter Component: The Query Mapper Adapter Component receives the policy-filtered search request and translates it into a query language that aligns with the Catalogue’s database structure.
This step includes mapping the search parameters to the Catalogue’s internal query schema and embedding any access restrictions set by the Policy Filter Service directly into the query.
Catalogue Component Query Execution: The Catalogue Component receives the translated and filtered query from the Adapter. Within the Catalogue, the Search Engine processes the query by scanning its database, which houses all signed self-descriptions, metadata and associated policies.
The Catalogue ensures that each self-description or metadata entry returned aligns with the access policies, in compliance with data governance standards.
Result Return to Search Client: After processing the query, the Catalogue Component sends the authorised results back through the Adapter Component, which re-formats them for the Search Client’s display needs. The Search Client then presents these results to the user in a structured format, along with relevant metadata to provide a comprehensive view of each item.
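
The search flow above can be sketched as a small pipeline: the policy filter attaches the user's access restriction to the request, the query mapper translates the request into an executable query, and the search engine runs it. The role-based policy model, the `CatalogueEntry` fields and all function names are assumptions made for illustration; the actual policy language and query language are not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    sd_id: str
    title: str
    resource_type: str
    allowed_roles: frozenset


def apply_policy_filter(request: dict, user_roles: set) -> dict:
    """Policy Filter Service: attach the user's roles as an access restriction."""
    return {**request, "allowed_roles": set(user_roles)}


def map_query(filtered_request: dict):
    """Query Mapper Adapter: translate the request into a predicate over entries."""
    keyword = filtered_request.get("keyword", "").lower()
    resource_type = filtered_request.get("resource_type")
    roles = filtered_request["allowed_roles"]

    def predicate(entry: CatalogueEntry) -> bool:
        return (
            keyword in entry.title.lower()
            and (resource_type is None or entry.resource_type == resource_type)
            and bool(roles & entry.allowed_roles)  # embedded access restriction
        )

    return predicate


def search(entries, request: dict, user_roles: set) -> list:
    """Search engine: run the translated, policy-filtered query."""
    predicate = map_query(apply_policy_filter(request, user_roles))
    return [entry.sd_id for entry in entries if predicate(entry)]


entries = [
    CatalogueEntry("sd-1", "Air quality dataset", "dataset", frozenset({"researcher"})),
    CatalogueEntry("sd-2", "Air traffic dataset", "dataset", frozenset({"operator"})),
    CatalogueEntry("sd-3", "Air quality viewer", "application", frozenset({"researcher"})),
]
results = search(entries, {"keyword": "air", "resource_type": "dataset"}, {"researcher"})
```

Because the access restriction is embedded in the query itself, entries the user may not see are never part of the result set, rather than being removed after the search has run.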

ACV Dynamic - BP 07 - Consumer and Provider establish a usage contract for selected catalogue items

This dynamic view for the “Establish Usage Contract Agreement” process captures the flow of interactions between the various components involved in initiating, negotiating, validating and finalising a contract agreement between a Consumer and Provider. The view is structured into four primary sections representing different roles: Consumer, Connector, Provider and Governance Authority.

Preconditions:

Consumer Discovery and Decision: The Consumer must have discovered and selected the desired resource from the Dataspace’s Catalogue, reviewed the associated terms and conditions within the Usage Contract template (Business Process - 06) and made the decision to consume the resource (Business Processes - 08, 09A and 09B).
No Existing Contract: There must not be an existing Usage Contract in place that covers the specific resource and terms of the current consumption request.

Initiating Contract Negotiation (Consumer to Connector): The Consumer initiates a contract negotiation through the Connector’s control plane by creating a “Contract Offer Request.” This request is sent to the Provider’s Connector, initiating the contract establishment process.
Contract Offer Creation and Validation (Provider): Upon receiving the Contract Offer Request, the Provider’s Connector verifies the access policy and checks whether a contract already exists, then generates a “Contract Offer” and sends it back to the Consumer’s Connector for validation. The Consumer then reviews and validates this offer to ensure it meets their requirements.
Agreement Formation and Validation (Bidirectional Communication): If the Consumer accepts the offer, the Consumer’s Connector initiates the creation of a “Contract Agreement.” This agreement is validated by both the Consumer’s and the Provider’s Connectors to ensure mutual compliance. Once validated, both parties confirm the contract through Verifiable Credentials.
Verification and Issue of Usage Contract VC: The Provider invokes the VC Issuer to issue a Verifiable Credential (VC) for the Usage Contract Agreement. This credential is signed by a signer service, then returned and stored securely within the VC storage of the Wallet for regulated access to the usage terms. The same steps are then repeated on the Consumer’s side to issue, sign and securely store the VC for the Usage Contract Agreement.
Persisting Agreement (Wallet & Storage): After the Usage Contract VC is signed and securely stored in the digital wallets of both the Consumer and the Provider, a copy of the contract (in a format to be determined, potentially a third VC or a traditional record) will be stored by the Provider for future reference, such as billing and auditing purposes.
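
The five steps above form a strictly ordered sequence, which can be sketched as a small state machine. The state names, the `Negotiation` class and the representation of existing contracts as a set of resource identifiers are assumptions made for illustration; they do not reflect the connector's actual negotiation protocol.

```python
from enum import Enum


class ContractState(Enum):
    REQUESTED = "contract offer requested"
    OFFERED = "contract offer sent"
    AGREED = "contract agreement validated"
    VC_ISSUED = "usage contract VCs issued"
    PERSISTED = "agreement persisted"


# Allowed forward transitions, mirroring the five steps above.
TRANSITIONS = {
    ContractState.REQUESTED: ContractState.OFFERED,
    ContractState.OFFERED: ContractState.AGREED,
    ContractState.AGREED: ContractState.VC_ISSUED,
    ContractState.VC_ISSUED: ContractState.PERSISTED,
}


class Negotiation:
    def __init__(self, resource_id: str, existing_contracts: set):
        # Precondition: no usage contract may already cover this resource.
        if resource_id in existing_contracts:
            raise ValueError(f"usage contract already exists for {resource_id}")
        self.resource_id = resource_id
        self.state = ContractState.REQUESTED

    def advance(self) -> ContractState:
        if self.state not in TRANSITIONS:
            raise RuntimeError("negotiation is already complete")
        self.state = TRANSITIONS[self.state]
        return self.state


negotiation = Negotiation("sd-001", existing_contracts=set())
history = [negotiation.state]
while negotiation.state is not ContractState.PERSISTED:
    history.append(negotiation.advance())
```

Rejection paths (for example, the Consumer declining the offer) are left out of this sketch; it only shows that no step can be skipped and that the "No Existing Contract" precondition is checked before the negotiation starts.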

ACV Dynamic - BP 08 - Consumer consumes an infrastructure resource from a Provider

The dynamic view diagram illustrates the orchestrated interactions required to provision infrastructure resources and to deploy applications on the provisioned infrastructure asset. This view focuses on the coordinated roles of the Triggering Module, Broker, Storage and Infrastructure Provisioner.

Preconditions:

Infrastructure of the Data Space governance authority has been set up (agent deployed);
Infrastructure of the infrastructure provider has been set up (agent deployed);
Infrastructure service offering(s) have been listed on the catalogue (and therefore registered as connector assets), as per BP 05;
Infrastructure Consumer has been onboarded to the Data Space (as per BP 03A and BP 03B);
Infrastructure Consumer is authenticated and has been authorised;
Main infrastructure instance of the infrastructure consumer has been set up (agent deployed).

Triggering and Infrastructure Provisioner Modules

This process outlines how the deployment script can be added, removed, invalidated and triggered, using the Triggering and Infrastructure Provisioner Modules.

Triggering Module

The module manages the life cycle of the Deployment Scripts and Configuration Scripts. The building blocks are:

API: Requests to the Triggering Module API are received either from the “script management UI”, when a deployment script is added or modified, or from other Simpl components such as connector extensions. In the latter case, at the time of contracting between two connectors, the extension sends the Deployment Script ID and other relevant information, such as the Consumer email, and triggers the execution of the deployment script, which provisions the infrastructure resources and deploys applications asynchronously. The same applies to the decommissioning process, which can be triggered from the “script management UI” or from a “Triggering Decommission” client.

Script Storage Management: a backend functionality exposed through the API and through the UI that relies on that API. Using this functionality, service providers (infrastructure, app or data) can store Deployment Scripts and receive a unique identifier (DeploymentScriptID) assigned to each script. When a script is added, it is validated to ensure it contains no malicious code. Scripts are stored in a repository and recorded in a database at the same time, so that their integrity can be checked later, at the time of retrieval.

- Add Script: Loads the deployment script.
- Generate Unique ID: Assigns a unique identifier (DeploymentScriptID) to each script for tracking purposes.
- Validate Script: Ensures the script is free of malicious code. Scripts failing this check are rejected.
- Hash the Script: Generates a hash value for the script, which is stored in the database to enable future integrity checks.
- Store Script in DB: Saves the script’s metadata and hash securely in the database, with protections against SQL injection attacks.
- Store Script in Repo: Stores the actual script file in a local repository for retrieval during execution.
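
The storage steps above can be sketched as follows. This is a minimal illustration under several assumptions: SHA-256 is assumed as the hash function, an in-memory SQLite database and a dictionary stand in for the database and the repository, and the token-based check in `validate_script` is a placeholder, not a real malicious-code scanner. All names are hypothetical.

```python
import hashlib
import sqlite3
import uuid

FORBIDDEN_TOKENS = ("rm -rf", "curl | sh")  # placeholder rule set, not a real scanner


def validate_script(script: str) -> None:
    """Toy validation: reject scripts containing a forbidden token."""
    for token in FORBIDDEN_TOKENS:
        if token in script:
            raise ValueError(f"script rejected: contains {token!r}")


def add_script(db: sqlite3.Connection, repo: dict, script: str) -> str:
    script_id = str(uuid.uuid4())  # Generate Unique ID
    validate_script(script)  # Validate Script
    digest = hashlib.sha256(script.encode()).hexdigest()  # Hash the Script
    # Store Script in DB: the parameterised query keeps input out of the SQL text.
    db.execute(
        "INSERT INTO scripts (id, sha256, valid) VALUES (?, ?, 1)",
        (script_id, digest),
    )
    db.commit()
    repo[script_id] = script  # Store Script in Repo
    return script_id


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE scripts (id TEXT PRIMARY KEY, sha256 TEXT, valid INTEGER)")
repo = {}
script_id = add_script(db, repo, 'resource "vm" "example" {}')
stored_hash = db.execute(
    "SELECT sha256 FROM scripts WHERE id = ?", (script_id,)
).fetchone()[0]
```

Passing values as query parameters (the `?` placeholders) rather than formatting them into the SQL string is one common way to obtain the protection against SQL injection mentioned above.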

Script Configuration Management: a backend functionality exposed through the API and through the UI that relies on that API. Using this functionality, service providers (infrastructure, app or data) can store Configuration Scripts and assign them to a specific Deployment Script. When a Configuration is added, it is validated. Configurations are stored in a database and bound to a Deployment Script at the same time.

- Add Configuration: Loads the configuration script.
- Validate the Configuration: Ensures the Configuration script is in the correct format. Scripts failing this check are rejected.
- Store Script in DB: Saves the Configuration script in the database, with protections against SQL injection attacks.

Template Management: Templating is a feature that empowers providers to provision Virtual Machines (VMs) based on predefined characteristics, primarily focusing on hardware specifications and operating systems.

- A template serves as a blueprint for defining how VMs can be configured and deployed.
- Templates are used to build deployment scripts from template components.
  - Therefore, the Script Execution part does not change because of templates.

Invalidate Script: a backend functionality for disabling a Deployment Script, exposed through the API and through the UI that relies on that API. Using this functionality, service providers (infrastructure, app or data) can disable Deployment Scripts. Before a Deployment Script is disabled, the request is checked against the removal criteria.

- Invalidate Script: Changes the status of the Deployment Script.
- Validate Removal Criteria: Enforces predefined rules before invalidating a script.
- Flag as Invalid in DB: Updates the script’s validity status in the database to indicate it is no longer active.
- Remove from Repo: Deletes the script file from the repository, though its metadata remains in the database for audit or business purposes.
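
A minimal sketch of the invalidation steps is shown below. The removal criterion used here (no active deployments) is a hypothetical example, since the predefined rules are not listed in this section; the table layout and function name are likewise assumptions.

```python
import sqlite3


def invalidate_script(db: sqlite3.Connection, repo: dict, script_id: str,
                      active_deployments: int) -> None:
    # Validate Removal Criteria: hypothetical rule, no active deployments allowed.
    if active_deployments > 0:
        raise RuntimeError("script still has active deployments")
    # Flag as Invalid in DB: the metadata row is kept for audit purposes.
    db.execute("UPDATE scripts SET valid = 0 WHERE id = ?", (script_id,))
    db.commit()
    # Remove from Repo: only the script file itself is deleted.
    repo.pop(script_id, None)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE scripts (id TEXT PRIMARY KEY, sha256 TEXT, valid INTEGER)")
db.execute("INSERT INTO scripts VALUES ('script-1', 'abc123', 1)")
repo = {"script-1": "resource definition"}

invalidate_script(db, repo, "script-1", active_deployments=0)
row = db.execute("SELECT sha256, valid FROM scripts WHERE id = 'script-1'").fetchone()
```

After invalidation the database row still exists with its hash, but the script can no longer be retrieved from the repository, so it cannot be executed again.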

Script Execution: is responsible for handling deployment script retrieval, validation, and execution requests.

- Retrieve Deployment Script: When a request containing the DeploymentScriptID is received, the module retrieves the script from the repository.
- Validate Deployment Script: Checks the integrity of the retrieved script by generating a new hash and comparing it with the hash stored in the database. Integrity failures trigger errors, preventing execution.
- Recognise Post-Configuration Script: If the Crossplane/Terraform configuration file contains a Cloud-init section with post-provisioning configuration, the module recognises it so that it can be encoded to Base64 in the next steps, as required by Crossplane/Terraform. Before encoding, the section is modified as needed (e.g., by adding a public key or a password generated by the Access Management module, as described under Access Management).
- Hash/Encode: Encodes the Cloud-init configuration (if any) using Base64. Hashes the password randomly generated by Access Management (if any) using SHA-256.
- Modify Deployment Script: Replaces the plain-text Cloud-init configuration with its Base64-encoded version, which contains the added information such as the hashed password or the public key.
- Trigger Deployment Script Execution: Sends the deployment script to the Infrastructure Provisioner via the Message Broker, ensuring asynchronous communication for scalability.
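
The integrity check and the Hash/Encode and Modify steps can be sketched as follows. The `{{PASSWORD_HASH}}` placeholder, the `password_hash` field name and the shape of the script are made up for illustration and are not real Cloud-init or Terraform syntax; SHA-256 is also assumed for the script hash, which the text above does not specify.

```python
import base64
import hashlib
import hmac


def verify_integrity(script: str, stored_sha256: str) -> None:
    """Recompute the hash and compare it with the value stored at upload time."""
    actual = hashlib.sha256(script.encode()).hexdigest()
    if not hmac.compare_digest(actual, stored_sha256):
        raise ValueError("integrity check failed: script will not be executed")


def prepare_script(script: str, cloud_init: str, password: str) -> str:
    """Embed the hashed password in Cloud-init, then Base64-encode the section."""
    password_hash = hashlib.sha256(password.encode()).hexdigest()
    enriched = cloud_init.replace("{{PASSWORD_HASH}}", password_hash)
    encoded = base64.b64encode(enriched.encode()).decode()
    # Replace the plain-text section with its Base64-encoded version.
    return script.replace(cloud_init, encoded)


cloud_init = "#cloud-config\npassword_hash: {{PASSWORD_HASH}}\n"
script = "user_data = <<EOT\n" + cloud_init + "EOT\n"
stored = hashlib.sha256(script.encode()).hexdigest()

verify_integrity(script, stored)
prepared = prepare_script(script, cloud_init, "s3cret")
```

The integrity check runs on the script exactly as retrieved from the repository, before any modification; the prepared script necessarily differs from the stored one and is therefore not re-checked against the stored hash in this sketch.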

Access Management: is responsible for creating the access data for the resource and sending it to the End User.

- Generate Password: If a random password is required for the instance to be provisioned, the module generates it and stores it temporarily in the Wallet until provisioning is finalised.
- Retrieve Access Data: Receives the access information (such as endpoints) that the Infrastructure Provisioner shares via the Message Broker once provisioning is complete.
- Share Access Data: Shares the endpoints, credentials and any information relevant to the provisioned instance (or deployed applications). Currently, it relies on the SMTP emailer and will be replaced by wallet solutions in a future release.
- SMTP Emailer: Shares the access information with the Consumer, using the email address that was received during the triggering process.

Decommissioning: a functionality for decommissioning a resource that was created through a DeploymentScriptID, exposed through the API and through the UI that relies on that API. Using this functionality, Deployment Scripts can be disabled in specific scenarios, for instance at the end of a contract or upon violation of a policy.

- Invalidate Script: Changes the status of the Deployment Script.
- Validate Removal Criteria: Enforces predefined rules before invalidating a script.
- Flag as Invalid in DB: Updates the script’s validity status in the database to indicate it is no longer active.
- Remove from Repo: Deletes the script file from the repository, though its metadata remains in the database for audit or business purposes.

Message Broker: The Message Broker facilitates communication between the Script Execution Module and the Infrastructure Provisioner Module.
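
The asynchronous hand-over through the broker can be illustrated with a minimal sketch in which an in-process queue stands in for the Message Broker and a worker thread stands in for the Infrastructure Provisioner. The message shape and all names are assumptions; the actual broker technology is not specified in this section.

```python
import queue
import threading

broker = queue.Queue()  # stand-in for the Message Broker
provisioned = []  # records what the provisioner has handled


def provisioner_worker() -> None:
    """Infrastructure Provisioner side: consume messages until told to stop."""
    while True:
        message = broker.get()
        if message is None:  # sentinel: shut the worker down
            broker.task_done()
            break
        provisioned.append(message["deployment_script_id"])
        broker.task_done()


worker = threading.Thread(target=provisioner_worker)
worker.start()

# Script Execution side: publish and return immediately, without waiting
# for provisioning to finish.
for script_id in ("script-1", "script-2"):
    broker.put({"deployment_script_id": script_id})
broker.put(None)

broker.join()  # wait here only so the result can be inspected
worker.join()
```

The publishing side never calls the provisioner directly, which is the decoupling that allows the two modules to scale and fail independently.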

Infrastructure Provisioner Module

The module is in charge of provisioning and decommissioning the infrastructure resources and completing post-provisioning configuration tasks. The building blocks are:

Provisioning: is responsible for the provisioning process on the Cloud Provider side.

- Validate Deployment Script: Checks the script for syntax correctness and interpretability to avoid execution errors.
- Execute Deployment Script: Provisions infrastructure resources based on the script’s configuration.
- Post Configuration: Completes additional tasks, such as setting policies, deploying applications, mounting or attaching storage and loading datasets, as specified in the post-configuration (Cloud-init) section of the deployment script.
- Share Access Data: Shares access information (such as endpoints) with the Access Management Module, via the Message Broker.

Decommissioning: is responsible for the decommissioning process on the Cloud Provider side.

- Run pre-decommissioning tasks: Such as making a snapshot of the resources, depending on the business requirements (yet to be clarified by Business).
- Decommission: Terminates/destroys the resources.

Storage Solutions: Consists of the Database, Git-Based Repository and Wallet, ensuring secure storage and retrieval of deployment scripts and passwords.

- Database: Stores metadata and hashes for each script to facilitate integrity verification.
- Repository: Hosts the actual deployment scripts for retrieval during provisioning.
- Wallet: If a random password must be generated for the instance to be provisioned, the password is stored temporarily in the Wallet until provisioning is finalised. It is then communicated to the consumer and deleted from the Wallet.
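
The temporary handling of a generated password can be sketched as follows. The `Wallet` class is a hypothetical in-memory stand-in, not the Simpl-Open Wallet interface, and the password length is an arbitrary choice.

```python
import secrets


class Wallet:
    """Hypothetical in-memory stand-in for the Wallet's temporary secret storage."""

    def __init__(self):
        self._secrets = {}

    def put(self, key: str, value: str) -> None:
        self._secrets[key] = value

    def pop(self, key: str) -> str:
        # Return the secret and delete it in the same step.
        return self._secrets.pop(key)

    def __contains__(self, key: str) -> bool:
        return key in self._secrets


def generate_password(wallet: Wallet, instance_id: str) -> None:
    # 32 random bytes, URL-safe encoded; the length is an arbitrary choice.
    wallet.put(instance_id, secrets.token_urlsafe(32))


def share_access_data(wallet: Wallet, instance_id: str, endpoint: str) -> dict:
    # Once provisioning is finalised, hand the password over and purge it.
    return {"endpoint": endpoint, "password": wallet.pop(instance_id)}


wallet = Wallet()
generate_password(wallet, "vm-42")
stored_during_provisioning = "vm-42" in wallet
access = share_access_data(wallet, "vm-42", "https://vm-42.example.org")
```

Retrieving and deleting the password in a single operation keeps the window in which the secret exists in storage as short as the provisioning itself.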

Cloud Provider: Consists of the Cloud Provider Infrastructure where the resources are created.

ACV Dynamic - BP 09A - Consumer consumes a data resource from a Provider

This section describes the capabilities that fall within the scope of the current release; they will be enhanced at a later time. In particular, it covers only the direct data download capability for data sharing.

The Data Provider is open to offering straightforward access to the dataset for the consumer. This access can be facilitated through a direct download, making the process simple and efficient. To ensure proper governance, a formal contract will be established between both parties. Since the data is downloaded, Simpl no longer has control over its usage, and therefore this contract will define and enforce legally binding usage policies as well as access policies. These measures will provide clarity and security for both the Data Provider and the consumer, safeguarding proper usage of the data.

Preconditions:

Data Provider has registered the resource at the Connector;
Data Provider has created the SD for the resource (metadata description) and uploaded the SD to the data catalogue;
Consumer has logged in through their agent;
Consumer has found the needed dataset using the search capabilities of the Data Catalogue;
If the contract does not exist, Consumer and Provider must establish a Contract on the requested resource (BP 07; see the related architecture for further details);
Consumer has available and compatible storage.

Assumptions for the current release:

The existence of the contract is not checked.

The view shows the dynamic application view of consuming a data resource by directly being given access to the dataset. It outlines the key functional components involved in the process of consuming a data resource.

The consumer initiates a request for the resource previously found in the catalogue, for which a contract has already been established. This request is sent to the provider through the connector. Upon receiving the request, the provider verifies the policies to ensure that the consumer has the necessary permissions to perform the requested action. The policies that are checked are only those that can technically be enforced. For all others, since the dataset is downloaded, the contract enforces the legally binding usage policies. Once the policies are confirmed, the transaction takes place between the two data orchestrator components, which, at the moment, are implemented as extensions of the EDC connector. These orchestrator components handle the interface between the connector and the actual source on the provider side and sink on the consumer side of the data.

Dataset Selection by the Consumer:
- The process begins in the Catalogue Client Application on the consumer side, where the consumer selects a dataset of interest. This action initiates a “Request Consumption of Data Asset” message, which is sent to the Contract Negotiation Adapter.
- This message signals the consumer’s intent to access the resource and moves the negotiation process forward.
Creation of Request Bundle:
- The Contract Negotiation Adapter takes the consumer’s request and compiles a Request Bundle.
- This bundle includes information about the selected dataset and any initial parameters needed to facilitate negotiation. It is forwarded to the consumer’s Connector (Control Plane) for further processing.
Requesting an Offering from the Provider:
- The consumer’s Connector (Control Plane) sends a Request Offering message to the provider’s Connector (Control Plane).
- This step involves querying the provider’s system to locate the requested dataset and determine its availability.
Asset Validation by the Provider’s Connector:
- The provider’s Connector (Control Plane) checks the Asset Catalogue for the requested dataset:
  - If the dataset is not found, the provider responds with a Resource Not Found Message. This message is propagated back through the consumer’s Connector and Contract Negotiation Adapter to notify the consumer, effectively halting the process.
  - If the dataset is found, the workflow transitions into the contract negotiation phase.
Contract Negotiation Between Connectors:
- Once the dataset is validated, the consumer’s and provider’s Connectors (Control Planes) begin negotiating the terms of usage.
- This includes setting access conditions, pricing, compliance requirements, and obligations. The outcome of this step is a draft contract that must undergo further validation.
Policy Evaluation on the Provider’s Side:
- The draft contract is sent to the provider’s Policy Engine for evaluation against governance and compliance rules.
  - If the policy check fails, the Policy Engine sends a notification back to the consumer (via the Connector Control Plane and Contract Negotiation Adapter) explaining the violation. The process halts here unless the consumer modifies the request to comply.
  - If the policy check succeeds, the Policy Engine approves the contract, and the workflow proceeds to finalisation.
Notification of Policy Check Results:
- The results of the policy evaluation (either success or failure) are sent back to the consumer’s Connector (Control Plane).
- In case of failure, the Contract Negotiation Adapter notifies the consumer, providing details about the violation.
- If the policy check is successful, the contract is finalised and marked as complete.
Finalisation of Contract Agreement:
- Once approved, the contract agreement is formalised within the Control Plane of both the consumer and provider connectors.
- At this point, both parties have a binding agreement that governs the terms for the upcoming data transfer.
Initiation of File Transfer Request:
- With the contract in place, the consumer’s Data Plane Extension for S3 sends a Request File Transfer message to the provider’s Data Plane Extension for S3.
- This request includes the contract agreement ID as a reference, ensuring that the data transfer adheres to the agreed terms.
Processing the File Transfer:
- The provider’s Data Plane Extension for S3 verifies the file transfer request using the contract ID and cross-checks it against the agreed terms.
- Once validated, the dataset is securely transferred to the consumer, completing the process.
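
The decision points of the workflow above (asset lookup, policy check, agreement, transfer bound to the agreement ID) can be condensed into a minimal sketch. The `Provider` class, the use of an allow-list as the policy model and the identifier formats are assumptions made for illustration; they do not represent the EDC connector API or its data plane extensions.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    assets: dict  # asset_id -> dataset payload
    allowed_consumers: set  # stand-in for technically enforceable policies
    agreements: dict  # agreement_id -> (consumer_id, asset_id)

    def negotiate(self, consumer_id: str, asset_id: str) -> str:
        if asset_id not in self.assets:
            raise LookupError("Resource Not Found")
        if consumer_id not in self.allowed_consumers:
            raise PermissionError("policy check failed")
        agreement_id = f"agreement-{len(self.agreements) + 1}"
        self.agreements[agreement_id] = (consumer_id, asset_id)
        return agreement_id

    def transfer(self, consumer_id: str, agreement_id: str) -> bytes:
        # The data plane cross-checks the request against the agreed terms.
        if self.agreements.get(agreement_id, (None, None))[0] != consumer_id:
            raise PermissionError("no matching contract agreement")
        _, asset_id = self.agreements[agreement_id]
        return self.assets[asset_id]


provider = Provider(
    assets={"dataset-1": b"col_a,col_b\n1,2\n"},
    allowed_consumers={"consumer-a"},
    agreements={},
)
agreement_id = provider.negotiate("consumer-a", "dataset-1")
payload = provider.transfer("consumer-a", agreement_id)
```

Only the technically enforceable part of the policies is modelled here; as stated above, the remaining usage policies are enforced legally through the contract once the dataset has been downloaded.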

ACV Dynamic - BP 09B - Consumer receives a data processing service on a data resource via an application

The consumer seeks to perform actions such as visualisation or processing on a dataset owned by a data provider but does not have direct access to the data itself. Instead, the consumer selects and enters into a contract for an offering from the data provider, which includes the provisioning of an infrastructure resource. An application is then deployed on this infrastructure, enabling the necessary processing of the dataset. Access is provided exclusively through a direct link to the application, ensuring that the consumer cannot directly access the data. As part of the contractual agreement, the consumer is prohibited from attempting to access the data in any way.

Preconditions:

Data Provider has registered the resource at the Connector;
Data Provider has created the SD for the resource (metadata description) and uploaded the SD to the data catalogue;
Consumer has logged in through their agent;
Consumer has found the needed resource using the search capabilities of the Data Catalogue and selected the bundle of dataset, application and infrastructure associated with the dataset;
If the contract does not exist, Consumer and Provider must establish a Contract on the requested resource (BP 07; see the related architecture for further details).

The view shows the dynamic application view of consuming a bundle resource (dataset, application and infrastructure bundled together) by being given access to the provisioned node where the bundle is deployed. It outlines the key functional components involved in the process of consuming the resource.

In this scenario, the goal is to ensure that the consumer gains access only to the application, without direct access to the dataset itself. Once the consumer has selected a data processing service offering and the contracting is complete, the infrastructure resource is provisioned and the application is deployed on the infrastructure instance in the background. In the end, the Consumer receives the access data and credentials for the deployed application only. The access to the application is not depicted in this scenario.

The components coloured in grey relate to BP07 (Contract Manager) and BP06 (Search); they mainly refer to the preconditions.

The diagram represents the action that triggers the BP: using the endpoint contained in the selected description, together with the required parameters, the user can initiate the contract negotiation process.

The diagram also shows how the consumer requests a bundled resource from the infrastructure provider via a data provider and receives the respective access. The consumer requests the offering via the Catalogue Client UI, based on a previously identified search result. The Contract Negotiation Adapter handles the request from the consumer, filters for the requested asset in the provider’s catalogue and returns the offering along with the respective usage and access policies. The user then accepts those and, at the same time, starts the contract negotiation. The Contract Negotiation Adapter builds the request for the connector. After the contract negotiation is finalised, the infrastructure deployment is triggered. Once this step is completed, the access information is passed to the user.

Resource Selection:
- The Catalogue Client Application on the Consumer side initiates a request for consuming a bundled resource (dataset, application and infrastructure) by selecting it from a search result. This request is sent to the Contract Negotiation Adapter to begin the process.
Request Offering:
- The Contract Negotiation Adapter processes the Consumer’s request by forwarding it to the Connector (Control Plane) on the Data Provider’s side.
- The Connector checks the requested resource against its Asset Catalogue:
  - If the asset is not found, a “Resource Not Found” message is sent back to the Consumer.
  - If the asset is found, the Connector retrieves the associated offering details, including usage and access policies, and returns them to the Consumer for review.
Policy Agreement:
- The Consumer, using the Catalogue Client Application, reviews the retrieved offering details.
- Upon agreeing to the usage and access policies, the Consumer initiates a contract negotiation via the Contract Negotiation Adapter.
Contract Negotiation Request:
- The Contract Negotiation Adapter composes a contract negotiation request and sends it to the Connector (Control Plane) of the Data Provider.
- The Policy Engine evaluates the request based on predefined policy rules:
  - If the policy check fails, the Consumer is notified of the violation.
  - If the policy check succeeds, the contract is finalised and a confirmation is sent to the Consumer.
Infrastructure Deployment Trigger:
- Once the contract is finalised, the Connector (Control Plane) triggers the Infrastructure Orchestrator to begin the provisioning process for the infrastructure and application deployment.
Triggering Deployment:
- The Infrastructure Orchestrator sends a deployment command to the Triggering Module of the Infrastructure Provider, detailing the specifications for provisioning the required infrastructure and deploying the application.
Provisioning and Deployment:
- The Triggering Module provisions the infrastructure instance and deploys the application onto it as per the deployment command.
- Once the deployment is complete, the Triggering Module fetches the access credentials (e.g., application URL, API keys) and sends them back to the Infrastructure Orchestrator.
Returning Access Information:
- The Infrastructure Orchestrator relays the access information to the Connector (Control Plane) on the Data Provider’s side.
- The Connector forwards the access details to the Contract Negotiation Adapter, which delivers them to the Consumer, completing the workflow.

This scenario follows the one above. Once the login credentials are received, the user accesses the dedicated infrastructure through a direct link.

4.4. ACV - Domain 3 - Management/Operation of Data Space

4.4.1. ACV - Domain 3 - Management/Operation of Data Space - Static Views

ACV Static - Schema Management Service

Schema Management

The Schema Management Service component represents the Metadata Description building block, enabling the Governance Authority to define the structure of self-descriptions. Using a UI or API, the Governance Authority can establish properties, data types, constraints and controlled vocabularies that apply across resources (datasets, applications, infrastructure). The resulting schema configurations are automatically transformed into semantic files and managed within the Schema Synch Service, ensuring the Provider Node has access to the most current schema standards for generating self-descriptions in compliance with governance protocols.
The Schema Backend functions as a central repository and management interface for schemas created by the Governance Authority. These schemas, represented as ontologies and structured schema definitions, are actively managed to provide consistent standards across resource descriptions. Serving as an application component rather than a simple data storage element, the Schema Registry facilitates regular synchronisation with the Provider Node, ensuring that providers always have access to the latest schema standards needed for creating compliant self-descriptions.
The Schema Registry is used by the catalogue client application to enable semantic consistency by defining and validating the terms used in self-descriptions and search fields. The Search Client uses the schema to define the search fields for the advanced search. This automatic form generation helps prevent ambiguous searches and ensures users can only search for terms recognised within the Data Space.
The Schema Synch Adapter synchronises the schemas with the agents and ensures that the schemas needed by dependent components are accessible and up to date.
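
To illustrate how a schema can drive both the search form and the validation of search terms, the following minimal sketch derives form fields from a simplified schema and rejects terms the schema does not recognise. The schema layout (`type`, `allowed`) and the function names are assumptions for illustration; the actual schemas are ontologies and shape files, which are not modelled here.

```python
# Hypothetical, simplified schema: property name -> data type and optional vocabulary.
DATASET_SCHEMA = {
    "title": {"type": "string"},
    "licence": {"type": "string", "allowed": ["CC-BY-4.0", "CC0-1.0"]},
    "sizeInMb": {"type": "integer"},
}


def build_search_fields(schema: dict) -> list[dict]:
    """Derive one search-form field per schema property."""
    fields = []
    for name, spec in schema.items():
        # Properties with a controlled vocabulary become a drop-down list.
        widget = "select" if "allowed" in spec else "input"
        fields.append({
            "name": name,
            "type": spec["type"],
            "widget": widget,
            "options": spec.get("allowed", []),
        })
    return fields


def validate_query(schema: dict, query: dict) -> list[str]:
    """Return a list of problems; an empty list means the query is acceptable."""
    problems = []
    for name, value in query.items():
        spec = schema.get(name)
        if spec is None:
            problems.append(f"unknown search field: {name}")
        elif "allowed" in spec and value not in spec["allowed"]:
            problems.append(f"value not in controlled vocabulary for {name}: {value}")
    return problems
```

Because the form is generated from the same schema that validates the query, a term that is not defined in the schema can never be submitted as a search field.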

ACV Static - Monitoring Service

This section describes the architecture for Monitoring and Logging within a single node (Simpl-Open agent). It does not yet consider inter-node setups.

Simpl-Open Application Component

This component represents an abstraction of any Simpl-Open application component that is being monitored. Such components can produce:
- Technical logs generated by the application and the underlying platform, e.g. access logs and error logs;
- Business events generated by the application upon specific triggers in the business workflow, e.g. “Participant successfully onboarded”;
- Infrastructure metrics, e.g. CPU utilisation and RAM utilisation;
- Health checks, which are APIs implemented by the application components of the Simpl-Open agent to report on the status of the service, e.g. HTTP 200 “OK”;
- Tracing data to discover potential bottlenecks in request processing.
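
As an illustration of the health check item above, a component could expose a small HTTP endpoint that answers HTTP 200 “OK” when it is operational. The sketch below uses only the Python standard library; the `/health` path, the JSON body, and the 503 fallback are assumptions, not the Simpl-Open contract.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_response(dependencies_ok: bool) -> tuple[int, bytes]:
    """Build the status code and JSON body for a health check."""
    if dependencies_ok:
        return 200, json.dumps({"status": "OK"}).encode("utf-8")
    return 503, json.dumps({"status": "UNAVAILABLE"}).encode("utf-8")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/health":  # hypothetical path
            self.send_error(404)
            return
        # A real component would check its own dependencies here.
        status, body = health_response(dependencies_ok=True)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the health endpoint until interrupted (not called on import)."""
    HTTPServer(("127.0.0.1", port), HealthHandler).serve_forever()
```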

Platform API

The platform API is provided by the platform on which the Simpl-Open application components are deployed. It allows enriched logs and metrics to be collected.

Monitoring Service

The monitoring service is modelled as a service which is offered to all the Simpl-Open Application Components. It is implemented through the set of components described below.

Log Collection Agent

The log collection agent collects technical and business logs from each Simpl-Open application component and forwards them to the log ingestion pipeline.

Log Ingestion Pipeline

The log ingestion pipeline receives the logs from the log collection agent and standardises their format before storing them in the logs repository.
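
A minimal sketch of what "standardising the format" could look like is given below: raw lines that arrive either as JSON or as plain text are mapped onto one common record. The target field names (`timestamp`, `component`, `level`, `message`) and the list of log levels are assumptions chosen for the example, not the Simpl-Open log format.

```python
import json
from datetime import datetime, timezone

KNOWN_LEVELS = {"DEBUG", "INFO", "WARN", "ERROR"}


def standardise(raw: str, component: str) -> dict:
    """Convert a raw log line into a common record (field names are assumptions)."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None

    if isinstance(parsed, dict):
        # Structured input: pick the known keys and normalise the level.
        level = str(parsed.get("level", "INFO")).upper()
        message = str(parsed.get("message", ""))
        timestamp = parsed.get("timestamp")
    else:
        # Plain text such as "ERROR something failed": first token may be the level.
        head, _, rest = raw.partition(" ")
        if head.upper() in KNOWN_LEVELS:
            level, message = head.upper(), rest
        else:
            level, message = "INFO", raw
        timestamp = None

    return {
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "component": component,
        "level": level,
        "message": message,
    }
```

Records without a timestamp receive the ingestion time, so every entry stored in the logs repository can be ordered and queried in the same way.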

Infrastructure Metrics Collection Agent

The infrastructure metrics collection agent collects infrastructure metrics from each Simpl-Open application component and stores them directly in the logs repository.

Logs Repository

The logs repository serves as a central hub to store all types of logs and metrics. It then feeds the monitoring space, the logs visualisation component and the reporting component.

Monitoring Space

The monitoring space displays dashboards built on top of the different logs but can also directly query health endpoints to display their status.

Logs Visualisation

The logs visualisation component allows users to run queries on the logs and visualise the results in a user interface. Logs visualisation and the monitoring space share a common UI, with a distinct tab for each functionality.

Reporting

The reporting component includes both a user interface, from which exports can also be performed, and an API to query logs for other purposes such as monitoring federation or billing.

Alert Manager

The monitoring space is connected with an alert manager to trigger alerts based on predefined thresholds.
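
Threshold-based alerting can be sketched as a simple comparison of each incoming metric sample against a configured limit. The metric names, limits, and severities below are hypothetical; real thresholds would be defined in the monitoring configuration of each deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    limit: float
    severity: str


# Hypothetical thresholds; real values would be configured per deployment.
THRESHOLDS = [
    Threshold("cpu_utilisation", 90.0, "critical"),
    Threshold("ram_utilisation", 80.0, "warning"),
]


def evaluate(sample: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one alert message per metric that exceeds its threshold."""
    alerts = []
    for threshold in thresholds:
        value = sample.get(threshold.metric)
        # Metrics missing from the sample are skipped rather than treated as breaches.
        if value is not None and value > threshold.limit:
            alerts.append(
                f"[{threshold.severity}] {threshold.metric}={value} exceeds {threshold.limit}"
            )
    return alerts
```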

Health Checks

An internal scheduler queries the health status of the Simpl-Open components and stores the results in Elastic for further visualisation.
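
One polling round of such a scheduler might look like the sketch below. It queries each health endpoint and produces one document per component; indexing those documents into Elastic and the scheduling loop itself are left out. The endpoint map, the document fields, and the convention of returning `0` for an unreachable endpoint are assumptions for illustration.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of a health endpoint, or 0 if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # Non-2xx answers still carry a status code worth recording.
        return error.code
    except (urllib.error.URLError, OSError):
        return 0


def poll_once(endpoints: dict[str, str],
              fetch: Callable[[str], int] = http_status) -> list[dict]:
    """Query every endpoint once and return documents ready to be stored."""
    checked_at = datetime.now(timezone.utc).isoformat()
    documents = []
    for name, url in endpoints.items():
        code = fetch(url)
        documents.append({
            "component": name,
            "status_code": code,
            "healthy": code == 200,
            "checked_at": checked_at,
        })
    return documents
```

The `fetch` parameter exists so that the polling logic can be exercised with a stub instead of real network calls.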

Application Tracing

Application tracing provides a mechanism to follow the ‘journey’ of API requests through the components of the Simpl-Open agent.
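
One common way to follow a request across components is to propagate a trace identifier in a request header. The sketch below assumes the W3C Trace Context `traceparent` header (`version-traceid-spanid-flags`); the source does not state which tracing mechanism Simpl-Open uses, so this choice and the function names are illustrative only.

```python
import secrets


def new_traceparent() -> str:
    """Start a new trace: version 00, random trace-id and span-id, sampled flag 01."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_traceparent(incoming: str) -> str:
    """Keep the trace-id of the incoming header but assign a new span-id."""
    version, trace_id, _parent_span, flags = incoming.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def outgoing_headers(incoming_headers: dict[str, str]) -> dict[str, str]:
    """Headers a component would attach when calling the next component."""
    incoming = incoming_headers.get("traceparent")
    value = child_traceparent(incoming) if incoming else new_traceparent()
    return {"traceparent": value}
```

Because every hop keeps the same trace-id while recording its own span-id, the spans collected from all components can be joined into a single timeline, which is what makes bottlenecks visible.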

4.4.2. ACV - Domain 3 - Management/Operation of Data Space - Dynamic Views

ACV Dynamic - BP 02C – Manage Resource Description Schemas

The process describes how the Governance Authority end user manages schemas for resource descriptions. The end user can create a new resource description schema, revoke an existing one, or create a new version of it.
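
The three operations named above (create, revoke, create a new version) can be modelled with a small in-memory registry. This is a sketch under stated assumptions: the status values, the integer version numbering, and the rule that a revocation applies to all versions of a schema are illustrative choices and are not taken from the Schema Management Service.

```python
class SchemaRegistry:
    """In-memory stand-in for the schema lifecycle (not the real service)."""

    def __init__(self) -> None:
        # schema name -> list of versions, each {"version", "content", "status"}
        self._schemas: dict[str, list[dict]] = {}

    def create(self, name: str, content: str) -> dict:
        if name in self._schemas:
            raise ValueError(f"schema already exists: {name}")
        entry = {"version": 1, "content": content, "status": "published"}
        self._schemas[name] = [entry]
        return entry

    def add_version(self, name: str, content: str) -> dict:
        versions = self._schemas[name]  # raises KeyError if the schema is unknown
        entry = {"version": len(versions) + 1, "content": content, "status": "published"}
        versions.append(entry)
        return entry

    def revoke(self, name: str) -> None:
        # Assumption: revoking a schema marks every version as revoked.
        for entry in self._schemas[name]:
            entry["status"] = "revoked"

    def versions(self, name: str) -> list[dict]:
        return list(self._schemas[name])
```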

ACV Dynamic - WF 12B - Local Node Logging and Monitoring

The diagram below describes how the components presented above interact with each other to deliver the monitoring and logging functionality.

The Simpl-Open Application Component generates various types of data, including technical logs, business events, infrastructure metrics and health check outputs. These data streams are exposed via APIs for collection.

Log and Event Generation: The Simpl-Open Application Component produces technical logs, business events, infrastructure metrics and health check data, exposing these via APIs.
Logs Collection: The Logs Collection Agent retrieves technical logs and business events, forwarding them to the Logs Ingestion Pipeline for processing.
Metrics Collection: The Infrastructure Metrics Collection Agent gathers infrastructure metrics and directly forwards them to the Logs Repository.
Log Transformation: The Logs Ingestion Pipeline processes and transforms the raw logs into a standardised format before storing them in the Logs Repository.
Centralised Storage: The Logs Repository stores technical logs, infrastructure metrics and business events in dedicated sections, ensuring they are accessible for subsequent steps.
Log Visualisation: The Logs Visualisation component retrieves and displays logs for analysis, allowing users to review technical, infrastructure and business-related events.
Data Aggregation: The Monitoring Space aggregates logs and metrics, enabling real-time analysis of system health and performance.
Alert Generation: The Alert Manager processes aggregated data, generating alerts for any anomalies or threshold breaches and notifying relevant stakeholders.
Report Generation: The Reporting Module queries logs and metrics from the Logs Repository to create detailed reports.
Report Presentation: These reports are displayed through a user-friendly interface, providing actionable insights for decision-making.
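
The routing described in the steps above (logs and events pass through the ingestion pipeline, metrics go straight to the repository, and each kind is kept in its own section) can be summarised in a short sketch. The record kinds and the lower-casing of keys as the "standardisation" step are placeholders chosen for the example.

```python
from collections import defaultdict


class LogsRepository:
    """Stores records in dedicated sections, one per record kind."""

    def __init__(self) -> None:
        self.sections: dict[str, list[dict]] = defaultdict(list)

    def store(self, kind: str, record: dict) -> None:
        self.sections[kind].append(record)

    def query(self, kind: str) -> list[dict]:
        return list(self.sections[kind])


def ingest(repository: LogsRepository, kind: str, payload: dict) -> None:
    """Route a record: logs and events are standardised first, metrics are not."""
    if kind in ("technical_log", "business_event"):
        # Logs Ingestion Pipeline step: normalise keys to a common lower-case form.
        payload = {key.lower(): value for key, value in payload.items()}
    elif kind != "infrastructure_metric":
        raise ValueError(f"unsupported record kind: {kind}")
    repository.store(kind, payload)
```

The visualisation, monitoring, and reporting components would then read from the repository through `query`, each selecting the sections it needs.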

4.5. Interfaces

4.5.1. APIs

The table below presents the APIs of all the components depicted on the application deployment views. These APIs can be correlated to the Application Components Views static diagrams (per domain) through the numbering appearing on both the diagrams and the first column of this table.

Simpl-Open uses two types of APIs:

Synchronous JSON/HTTP APIs;
Asynchronous JSON/Kafka APIs.
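
The difference between the two styles can be illustrated without any broker or server: a synchronous call is an HTTP request carrying a JSON body, for which the caller waits on the response, whereas an asynchronous message is a JSON value published to a topic. In the sketch below, the `POST /schemas` path and the `schema-changed` topic name appear in the table that follows; the host name and all payload fields are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://schema-manager.example.invalid"  # hypothetical host


def build_sync_request(schema_name: str) -> urllib.request.Request:
    """Synchronous style: the caller sends JSON over HTTP and waits for the reply."""
    body = json.dumps({"name": schema_name}).encode("utf-8")  # hypothetical payload
    return urllib.request.Request(
        f"{BASE_URL}/schemas",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_async_event(schema_name: str) -> tuple[str, bytes, bytes]:
    """Asynchronous style: a (topic, key, value) triple for a Kafka producer."""
    value = json.dumps({"schemaName": schema_name, "change": "created"})  # hypothetical
    return "schema-changed", schema_name.encode("utf-8"), value.encode("utf-8")
```

Neither function performs any I/O: the request would still have to be sent with an HTTP client, and the triple handed to a Kafka producer of choice.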

Each API is described in a functional way and linked to the technical contract definition (e.g. OpenAPI definition for sync APIs) which is stored in GitLab.

The API guidelines of Simpl-Open can be found in the Simpl Contributions Code of Conduct & Guidelines.

The ‘Monitoring Integration’ column indicates whether an API endpoint emits business logs to the centralised monitoring component. In the end state, all API endpoints should emit business logs to ensure auditability and traceability of all flows.
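
One way an endpoint could emit a business log is through a decorator that records an event each time the handler completes successfully. The following is a minimal sketch using the Python standard `logging` module; the logger name, the event payload, and the example handler are assumptions and do not describe the actual Simpl-Open implementation.

```python
import functools
import json
import logging

business_log = logging.getLogger("simpl.business")  # hypothetical logger name


def emits_business_log(event: str):
    """Decorator: log a business event after the wrapped handler succeeds."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            result = handler(*args, **kwargs)
            # Only reached when the handler did not raise.
            business_log.info(json.dumps({"event": event, "endpoint": handler.__name__}))
            return result
        return wrapper
    return decorator


@emits_business_log("Participant successfully onboarded")
def approve_onboarding_request(request_id: str) -> dict:
    # Placeholder handler body; a real endpoint would update the request state.
    return {"onboardingRequestId": request_id, "status": "APPROVED"}
```

A log collection agent attached to the `simpl.business` logger would then pick up one JSON line per successful call, which is the kind of record needed for auditability.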

#	Component	Sync APIs	Async APIs	Monitoring Integration (yes/no)	API Guidelines Phase 1 Compliancy
		Name	Technical definition	Name	Technical definition
1	SD Tooling	SD Tooling API POST /selfDescriptions/enriched - Register SD Json and return it enriched and validated POST /selfDescriptions/publications - Publishes a finalized (enriched, validated and signed) SD json into the provider catalogue GET /policies/identity/attributes - Retrieve available identity attributes GET /policies/actions - Get access policy actions POST /policies/access - Create Access Policies POST /policies/usage - Create Usage Policies POST /contract/validate - Return JSON validated GET /schemas/{schemaId}/content - Returns the TTL shape file content for the specified ecosystem and name GET /schemas - Retrieve all available shapes/schemas GET /resourceDescriptions - Returns all resourceDescriptions associated to participant provider GET /resourceDescriptions/{resourceDescriptionId} - Returns detail of a specific resource description GET /resourceAddresses/templates/{templateId}/uiSchema - Returns a source address ui schema based on templateId GET /resourceAddresses/templates/{templateId}/schema - Returns a source address template schema based on templateId GET /resourceAddresses/sharingMethods - Return the list of sharing methods by offer type GET /resourceAddresses/sharingMethods/{sharingMethodId}/templates - Return list of source address templates by sharing method and offering type	https://code.europa.eu/simpl/simpl-open/development/data1/sdtooling-api-be/-/blob/main/openapi/openapi-v1.yaml?ref_type=heads	c: schema-changed		yes	yes
2	EDC Connector Adapter	EDC Connector Adapter Application API GET /transfers/{transferProcessId} - Returns the transfer process status POST /transfers - Initiate a new transfer process POST /connectorCatalog/assets - Return detailed information about asset available in connector's catalogue POST /contracts - Initiate a new contract negotiation GET /contracts/{contractNegotiationId} - Returns the contract negotiation status GET /configs/participant - Returns the configured participant on connector POST /selfDescriptions/enriched - Returns resourceDescription after enrichment with connector info	https://code.europa.eu/simpl/simpl-open/development/data1/edcconnectoradapter/-/blob/main/openapi/openapi-v1.yaml			yes	yes
3	Validation API	Validation API POST /resourceAddresses/validation - Validate ResourceAddress against Json Schema POST /selfDescriptions/validation - Validate Json-LD file against TTL schema	https://code.europa.eu/simpl/simpl-open/development/data1/sdtooling-validation-api-be/-/blob/main/openapi/openapi-v1.yaml?ref_type=heads			yes	yes
4	sync-schema-adapter	Sync Schema Adapter POST /schemas/events - Receives lifecycle events for schemas from the Schema Management Service	https://code.europa.eu/simpl/simpl-open/development/data1/schema-sync-adapter			yes	yes
5	Signer Service	Signer Service /v1/sign - Sign self-description.	https://gitlab.eclipse.org/eclipse/xfsc/tsa/signer/-/blob/ocm-wstack/gen/http/openapi3.json
6	Catalogue Client Application	Advanced Search API GET /selfDescriptions - Returns result of a quick search POST /selfDescriptions/advanced - Returns result of an advanced search GET /selfDescriptions/{selfDescriptionId} - Return selfDescription based on selfDescriptionId GET /schemas - List of all available schemas GET /schemas/{schemaId}/content - Returns schema content based on schemaId	https://code.europa.eu/simpl/simpl-open/development/data1/xfsc-advsearch-be/-/blob/main/openapi/openapi-v1.yaml			yes	yes
7	Schema Management	Schema Management API POST /schemas - Create a new resource description schema POST /schemas/{schemaName}/versions - Create a new version of an existing resource description schema GET /schemas - Get all schema metadata PATCH /schemas/{schemaName} - Revoke/Publish a schema GET /schemas/{schemaName}/versions - Get all schema versions metadata Subscription API GET /webhooks - get all active subscriptions POST /webhooks - create a new subscription DELETE /webhooks/{webhookId} - remove a subscription	https://code.europa.eu/simpl/simpl-open/development/gaia-x-edc/simpl-schema-manager/-/blob/develop/openapi/schema_openapi.yaml?ref_type=heads	p: schema-changed
8	Vocabulary Management	listVocabularies; removeVocabulary; UploadVocabulary.	TBD
9	Catalogue	GAIA-X Federated Catalogue GET /schemas - Get a full list of all vocabularies and schemas stored in the catalogue; POST /schemas - Add a vocabulary or schema to the catalogue; GET /schemas/{schemaId} - Get a specific schema/vocabulary; DELETE /schemas/{schemaId} - Delete a schema/vocabulary; POST /self-descriptions/{self_description_hash}/revoke - Revoke a self-description; GET /self-descriptions/{self_description_hash} - Get the complete self-description; POST /self-descriptions - Publish a self-description to the catalogue; POST /query - Sends a cypher query to the Neo4J database for search (used for quick and advanced search); POST /verification - Validate a self description on Semantics, VPSignature and/or VCSignature depending on the options selected in the request.	https://code.europa.eu/simpl/simpl-open/development/gaia-x-edc/simpl-fc-service/-/blob/develop/openapi/fc_openapi.yaml?ref_type=heads
12	Query Mapper Adapter	Query Mapper Adapter API GET /selfDescriptions - Quick search POST /selfDescriptions/advancedSearch - Advanced search	https://code.europa.eu/simpl/simpl-open/development/gaia-x-edc/poc-gaia-edc/-/blob/develop/openapi/adapter_openapi.yaml
13	Tier 1 Authentication Provider	Keycloak APIs OIDC - set of APIs for managing authentication; GET Realms - API for retrieving all the Realms configured; GET Realm by Realm Name - API for retrieving a Realm using its name; GET Users - API for retrieving the user list of a Realm; GET Roles - API for retrieving the role list of a Realm; GET Roles by User ID - API for retrieving the roles assigned to a user.	https://www.keycloak.org/docs-api/latest/rest-api/index.html
14	Tier 2 Authentication Provider	Authentication Provider API - Tier 1 + Tier2 - V1 (Deprecated, see the deprecation details inside the OpenAPI specification) Keypairs - API for managing Keypairs GET /keypairs - Get installed Key Pair HEAD /keypairs - Keypair Exists POST /keypairs - Import Key Pair POST /keypairs/generate - Generate Key Pair Certificate Sign Requests - API for managing Certificate Sign Requests POST /csr/generate - Generates the CSR for the applicant participant Agents GET /agent/ping - Ping a participant GET /agent/identityAttributes - Get identity attributes with ownership GET /agent/identityAttributes/{credentialId} - Get identity attributes with ownership GET /agent/echo - Get echo information GET /agent/ephemeralProof - Request ephemeral proof from authority POST /agent/credentials/validate - Validates a counterpart credential Credentials DELETE /credentials - Delete the credential GET /credentials - Replaces hasCredential. Returns the participant's credentials' information. POST /credentials - Upload a credential file GET /credentials/download - Download the participant's credential Mtls GET /mtls/ping - Ping the participant POST /mtls/ephemeralProof - Store Ephemeral Proof Sessions POST /sessions/credentials - Validate Tier 1 session GET /sessions/identityAttributes - Retrieve identity attributes of a participant Identity Attributes GET /identityAttributes - Search identity attributes with ownership	https://code.europa.eu/simpl/simpl-open/development/iaa/authentication_provider/-/blob/develop/openapi/authenticationprovider-v1.yaml?ref_type=heads
		Authentication Provider API - Tier 1 - V2 Keypairs - Manage and store participant keypair securely GET /keypairs - List all keypairs POST /keypairs - Create a new keypair GET /keypairs/algorithm - Returns the current agent encryption algorithm selected for keypairs GET /keypairs/active - Get active KeyPair HEAD /keypairs/active - Checks if the participant has an active KeyPair POST /keypairs/import - Import Key Pair GET /keypairs/{keyPairId} - Retrieve keypair data by ID DELETE /keypairs/{keyPairId} - Delete a keypair by ID POST /keypairs/{keyPairId}/csr - Generates the CSR given a KeyPair identifier and the Distinguished Name and store it in the system POST /keypairs/{keyPairId}/csr/submit - Submits the CSR bound to the given KeyPair identifier to the Dataspace Governance Authority Credentials - API for managing participant's credentials GET /keypairs/{keyPairId}/credentials - Retrieves the credentials given a KeyPair identifier POST /keypairs/{keyPairId}/renew - Request to Dataspace Governance Authority a credentials renewal GET /credentials/{credentialId}/participant - Retrieve participant details by credential ID. POST /credentials - Upload a credential GET /credentials/active - Retrieve the agent's active credential POST /credentials/validate - Verify a credential Participants - Retrieve information about the agent participant and other data space participants GET /participants/{participantId} - Retrieve participant details by ID GET /participants/{participantId}/identityAttributes - Get Identity Attributes of a participant by participant ID GET /participant - Retrieve participant details of the current agent. GET /participant/identityAttributes - Retrieve participant identity attributes of the current agent. 
GET /echo - Inquiry the Dataspace Governance Authority GET /ping - Inquiry another participant agent GET /ephemeralProof - Request ephemeral proof from the Dataspace Governance Authority Identity Attributes - Retrieve information about the dataspace Identity Attributes GET /identityAttributes - Retrieve the list of all the defined identity attributes by the dataspace governance authority. POST /identityAttributes/synchronize - Synchronize Identity Attributes from Dataspace Governance Authority to local copy Tier 1 Credentials - Validates the Tier 1 credentials of an end-user originating from an external participant. POST /tierOneCredentials/validate - Validates Tier 1 credential of an End-User during Tier2 communication Automatic Renewals - Allow setting and retrieving the automatic renewal configuration. GET /automaticRenewal - Retrieves the automatic renewal configuration. PATCH /automaticRenewal - Set the automatic renewal configuration.	https://code.europa.eu/simpl/simpl-open/development/iaa/authentication_provider/-/blob/develop/openapi/authenticationprovider-tier1-v2.yaml?ref_type=heads
		Authentication Provider API - Tier 2 - V2 Participants - Retrieve information about the agent participant and other data space participants GET /ping - Respond to a ping from another participant POST /ephemeralProof - Stores the Ephemeral Proof	https://code.europa.eu/simpl/simpl-open/development/iaa/authentication_provider/-/blob/develop/openapi/authenticationprovider-tier2-v2.yaml?ref_type=heads
				Authentication Provider Async API Consumed Events None Produced Events credentialUpdatedEvent: The agent's credential has been updated or removed assignedIdentityAttributesUpdatedEvent: The identity attributes assigned to the participant have changed.	https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/blob/develop/simpl-api-iaa/src/main/resources/asyncapi/authenticationprovider/v1/asyncapi.yaml?ref_type=heads
15	User & Roles	User and Roles API - Tier 1 - V1 Roles - API for managing Roles GET /roles - Search roles POST /roles - Create a new role DELETE /roles/{roleId} - Delete a role by id GET /roles/{roleId} - Find role by ID PUT /roles/{roleId} - Update an existing role DELETE /roles/{roleId}/identityAttributes - Delete an identity attribute from a role PUT /roles/{roleId}/identityAttributes - Assign identity attributes to a role POST /roles/{roleId}/identityAttributes/duplicate - Duplicate identity attributes to another role POST /roles/import - Import the provided roles into the system Users - API for managing Users GET /users - Search users POST /users - Creates a user DELETE /users/{userId} - Delete user GET /users/{userId} - Get user by UUID PUT /users/{userId} - Update user GET /users/{userId}/roles - Get user roles PUT /users/{userId}/roles - Update user roles POST /users/import - Import the provided users into the system	https://code.europa.eu/simpl/simpl-open/development/iaa/users-roles/-/blob/develop/openapi/usersroles-v1.yaml?ref_type=heads
		User and Roles API - Tier 1 - V2 Users - Manages the Users of the Simpl-Open Agent GET /users/{userId} - Find User by ID PUT /users/{userId} - Update User DELETE /users/{userId} - Delete User GET /users/{userId}/roles - Retrieve roles for a user PUT /users/{userId}/roles - Update roles for a user GET /users - Search Users POST /users - Create User POST /users/import - Imports the provided Users into the system Roles - Manages the Roles of the Simpl-Open Agent GET /roles/{roleId} - Find Role by ID PUT /roles/{roleId} - Update Role DELETE /roles/{roleId} - Delete Role GET /roles/{roleId}/identityAttributes - Retrieve identity attributes for a role PUT /roles/{roleId}/identityAttributes - Update identity attributes for a role GET /roles - Search Roles POST /roles - Create Role POST /roles/import - Imports the provided Roles into the system User Session - Helps users to retrieve their session data GET /user - Fetch user session data for a logged in user Role Requests GET /roleRequests - Search Role Requests POST /roleRequests - Create Role Request GET /roleRequests/{roleRequestId} - Find role request by ID PUT /roleRequests/{roleRequestId} - Update a role request DELETE /roleRequests/{roleRequestId} - Cancels role request GET /user/roleRequests - Search end user's role requests	https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/raw/develop/simpl-api-iaa/src/main/resources/static/openapi/tier-1/usersroles-tier1-v2.yaml?ref_type=heads
				Users and Roles Async API	https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/blob/develop/simpl-api-iaa/src/main/resources/asyncapi/usersroles/v1/asyncapi.yaml?ref_type=heads
16	Security Attributes Provider	Security Attributes Provider API - Tier1 + Tier2 - V1 (Deprecated, see the deprecation details inside the OpenAPI specification) Identity Attributes - API for managing Identity Attributes POST /participants/{participantId}/identityAttributes/assign - Assign Identity Attributes POST /participants/{participantId}/identityAttributes/unassign - Unassign Identity Attributes DELETE /identityAttributes/{id} - Delete Identity Attribute GET /identityAttributes/{id} - Find Identity Attribute by ID PUT /identityAttributes/{id} - Update Identity Attribute PUT /identityAttributes/assignable/{value} - Update Assignable Parameter GET /identityAttributes - Search Identity Attributes POST /identityAttributes - Create Identity Attribute POST /identityAttributes/import - Imports the provided identity attributes into the system Mtls - API for managing Mtls GET /mtls/identityAttributes - Get Identity Attributes with Ownership GET /mtls/identityAttributes/{credentialId} - Get Identity Attributes by Certificate ID	V1 https://code.europa.eu/simpl/simpl-open/development/iaa/security-attributes-provider/-/blob/develop/openapi/securityattributesprovider-v1.yaml?ref_type=heads
		Security Attributes Provider API - Tier1 - V2 Identity Attributes - Manages the identity attributes of the data space GET /identityAttributes/{identityAttributeId} - Find Identity Attribute by ID PUT /identityAttributes/{identityAttributeId} - Update Identity Attribute DELETE /identityAttributes/{identityAttributeId} - Delete Identity Attribute GET /identityAttributes - Search Identity Attributes POST /identityAttributes - Create Identity Attribute POST /identityAttributes/import - Imports the provided identity attributes into the system Participant - Manages the assignment of identity attributes to participant GET /participants/{participantId}/identityAttributes - Retrieve identity attributes for a participant PUT /participants/{participantId}/identityAttributes - Update identity attributes for a participant	Tier 1 - V2 https://code.europa.eu/simpl/simpl-open/development/iaa/security-attributes-provider/-/blob/develop/openapi/securityattributesprovider-tier1-v2.yaml?ref_type=heads
		Security Attributes Provider API - Tier2 - V2 Identity Attributes - Allow agents to manage identity attributes via Tier 2 communication GET /identityAttributes - Get the Identity Attributes of the dataspace. Credentials - Manage information related to Tier 2 credentials GET /credentials/{credentialId}/identityAttributes - Get Identity Attributes of a participant identified by a credential Ephemeral Proof - Manage the issuance of ephemeral proof via Tier 2 communication POST /token - Generate ephemeral proof Participants - Manage participant identity attributes GET /participants/{participantId}/identityAttributes - Get Identity Attributes of a participant identified by a credential	Tier 2 - V2 https://code.europa.eu/simpl/simpl-open/development/iaa/security-attributes-provider/-/blob/develop/openapi/securityattributesprovider-tier2-v2.yaml?ref_type=heads
17	Identity Provider	Identity Provider API - Tier1+ Tier2 - V1 (Deprecated, see the deprecation details inside the OpenAPI specification) Credentials - API for managing Participant's credentials DELETE /credentials/{id} - Revoke credential GET /credentials/{id} - Retrieve information about the credential of the given participant PATCH /credentials/{id} - Change the status of a participant credential PUT /credentials/{id} - Create credential GET /credentials/{id}/download - Download credential POST /credentials/{id}/renew - Renew the specified credential Participants - API for managing Participants GET /participants - Search for participants POST /participants - Create a new participant GET /participants/{participantId} - Get participant by ID Certificate Sign Requests - API for managing Certificate Sign Requests POST /participants/{participantId}/csr - Validate and store a CSR for a participant Mtls - API for managing Mtls GET /mtls/whoami - Get participant information PATCH /mtls/publicKey - Store tier-one public key GET /mtls/echo - Get participant with identity attributes GET /mtls/credentials - Get participant credentials Mtls Ephemeral Proofs - API for managing Mtls Ephemeral Proofs POST /mtls/token - Generate ephemeral proof	V1 https://code.europa.eu/simpl/simpl-open/development/iaa/identity-provider/-/blob/develop/openapi/identityprovider-v1.yaml?ref_type=heads
		Identity Provider API - Tier1 - V2 Participants - Allow the creation and retrieval of participant, also allowing to manage the credential requests (CSR). GET /participants - List all participants POST /participants - Create a new participant GET /participants/{participantId} - Retrieve participant data by ID GET /participants/{participantId}/csr - Retrieves the CSR for the specified participant PUT /participants/{participantId}/csr - Stores the CSR for the specified participant GET /participants/{participantId}/automaticRenewal - Retrieves the automatic renewal configuration for a participant. PUT /participants/{participantId}/automaticRenewal - Update the automatic renewal configuration for a participant. POST /participants/{participantId}/automaticRenewal - Create the automatic renewal configuration for a participant. DELETE /participants/{participantId}/automaticRenewal - Delete the automatic renewal configuration for a participant. Credentials - Manage the digital credentials of participants, enabling secure authentication and verification of their identity and roles within trusted data exchange processes. GET /participants/{participantId}/credentials - Retrieves the credentials given a Participant identifier POST /participants/{participantId}/credentials - Create a credential for the given participant GET /credentials/{credentialId} - Retrieve information about a participant credential PATCH /credentials/{credentialId} - Change the status of a participant credential GET /participants/{participantId}/credentials/active - Get the active credential of a participant Automatic Renewals - Allow the creation, retrieval and editing of the default automatic renewal configuration for the dataspace participants. 
GET /participants/automaticRenewal - Retrieves the default automatic renewal configuration PUT /participants/automaticRenewal - Update the default automatic renewal configuration POST /participants/automaticRenewal - Create the default automatic renewal configuration DELETE /participants/automaticRenewal - Delete the default automatic renewal configuration	Tier 1 - V2 https://code.europa.eu/simpl/simpl-open/development/iaa/identity-provider/-/blob/develop/openapi/identityprovider-tier1-v2.yaml?ref_type=heads
		Identity Provider API - Tier2 - V2 Credentials - Manage participant credentials for Tier 2 communication, allowing the participant to interact with the governance authority GET /credentials - Get participant credentials via agent to agent communication PATCH /credentials - Update the Credential Id in use for the participant POST /credentials/renew - Store the credential renewal request for the participant via agent to agent communication PUT /csr - Stores a CSR for a participant Participants - Manage participant additional information such as Tier 1 public keys sent over the Tier 2 communication channel PUT /tierOnePublicKey - Store tier-one information GET /credentials/{credentialId}/participant - Retrieve participant details by credential ID GET /participants/{participantId} - Retrieve participant data by ID GET /participant - Gives the participant information GET /echo - Returns the caller participant information along with their identity attributes	Tier 2 - V2 https://code.europa.eu/simpl/simpl-open/development/iaa/identity-provider/-/blob/develop/openapi/identityprovider-tier2-v2.yaml?ref_type=heads
		EJBCA REST Interface API GET /ejbca/publicweb/status/ocsp - checks the status and availability of the OCSP (Online Certificate Status Protocol) service provided by EJBCA; GET /ejbca/publicweb/webdist/certdist - distributes certificate-related files, such as Certificate Authority (CA) certificates and Certificate Revocation Lists (CRLs).	https://docs.keyfactor.com/ejbca/latest/ejbca-rest-interface
18	Onboarding	Onboarding API v1 Onboarding Validation Rules - API for managing Onboarding Validation Rules GET /onboardingProcedureTemplates/{onboardingProcedureTemplateId}/validationRules - Retrieve onboarding validation rules PUT /onboardingProcedureTemplates/{onboardingProcedureTemplateId}/validationRules - Replaces the validation rules for the given onboarding procedure template POST /onboardingProcedureTemplates/{onboardingProcedureTemplateId}/validationRules - Create an onboarding validation rule GET /onboardingProcedureTemplates/{onboardingProcedureTemplateId}/validationRules/{validationRuleId} - Retrieve an onboarding validation rule by id PUT /onboardingProcedureTemplates/{onboardingProcedureTemplateId}/validationRules/{validationRuleId} - Update an onboarding validation rule DELETE /onboardingProcedureTemplates/{onboardingProcedureTemplateId}/validationRules/{validationRuleId} - Delete an onboarding validation rule Onboarding Templates - API for managing Onboarding Templates GET /onboardingProcedureTemplates/{onboardingProcedureTemplateId} - Get Onboarding Template by id PUT /onboardingProcedureTemplates/{onboardingProcedureTemplateId} - Update Onboarding Template DELETE /onboardingProcedureTemplates/{onboardingProcedureTemplateId} - Delete Onboarding Template GET /onboardingProcedureTemplates/{onboardingProcedureTemplateId}/identityAttributes - Retrieve identity attributes for an onboarding procedure template PUT /onboardingProcedureTemplates/{onboardingProcedureTemplateId}/identityAttributes - Update identity attributes for an onboarding procedure template GET /onboardingProcedureTemplates - Get Onboarding Templates POST /onboardingProcedureTemplates - Create Onboarding Template Mime Types - API for managing Mime Types GET /mimeTypes/{mimeTypeId} - Get MIME type by ID PUT /mimeTypes/{mimeTypeId} - Update MIME type by ID DELETE /mimeTypes/{mimeTypeId} - Delete MIME type by ID GET /mimeTypes - Get all MIME types POST /mimeTypes - Create a new MIME type 
GET /mimeTypes/supported - Get all supported MIME types Onboarding Requests Management - API for managing Onboarding Requests Management GET /onboardingRequests - Search Onboarding Requests POST /onboardingRequests - Create Onboarding Request POST /onboardingRequests/{onboardingRequestId}/submit - Submits the onboarding request POST /onboardingRequests/{onboardingRequestId}/requestRevision - Notary requires a revision of the onboarding request from the applicant POST /onboardingRequests/{onboardingRequestId}/reject - Rejects the onboarding request POST /onboardingRequests/{onboardingRequestId}/documents - Add Document to Onboarding Request PATCH /onboardingRequests/{onboardingRequestId}/documents - Set Document for Onboarding Request POST /onboardingRequests/{onboardingRequestId}/approve - Approve the onboarding request GET /onboardingRequests/{onboardingRequestId}/identityAttributes - Retrieve identity attributes for an onboarding request PUT /onboardingRequests/{onboardingRequestId}/identityAttributes - Update identity attributes for an onboarding request PATCH /onboardingRequests/{onboardingRequestId}/expirationTimeframe - Set Expiration Timeframe for Onboarding Request POST /onboardingRequests/{onboardingRequestId}/comments - Add Comment to Onboarding Request POST /onboardingRequests/{onboardingRequestId}/evaluateFaultRules - Evaluate rules in FAULT status of the onboarding request GET /onboardingRequests/{onboardingRequestId} - Get Onboarding Request GET /onboardingRequests/{onboardingRequestId}/documents/{documentId} - Get Document from Onboarding Request DELETE /onboardingRequests/{onboardingRequestId}/documents/{documentId} - Deletes a document by id GET /onboardingRequests/{onboardingRequestId}/documents/{documentId}/eidasAttributes - Retrieves the eIDAS attributes of the user POST /onboardingRequests/{onboardingRequestId}/documents/{documentId}/eidasAttributes - Triggers the retrieval of the eIDAS attributes Onboarding Statuses - API for managing 
Onboarding Statuses PATCH /onboardingStatus/{onboardingStatusId} - Set Onboarding Status Label GET /onboardingStatus - Get Onboarding Status Participant Types - API for managing Participant Types GET /participantTypes - Get all participant types eIDAS Attributes - APIs to retrieve the eIDAS attributes provided in the official eIDAS attribute profile GET /eidasAttributes - Get eIDAS Attributes Onboarding Users - APIs to retrieve user data to help onboarding process GET /user/onboardingData - Get user onboarding data	https://code.europa.eu/simpl/simpl-open/development/iaa/onboarding/-/blob/develop/openapi/onboarding-v1.yaml?ref_type=heads
		Onboarding API v2 Onboarding Requests Management - Manage onboarding requests for applicants who want to join the data space POST /onboardingRequests - Create Onboarding Request	https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/raw/main/simpl-api-iaa/src/main/resources/static/openapi/tier-1/authenticationprovider-tier1-v2.yaml?ref_type=heads
				Onboarding Async API Consumed Events None Produced Events onboardingRequestDeletedEvent: An onboarding request has been deleted.	https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/blob/develop/simpl-api-iaa/src/main/resources/asyncapi/onboarding/v1/asyncapi.yaml?ref_type=heads
20	Connector Management API	management-api Configuring and managing Assets (Any kind of resources); Assign Policies; Assign Contract Templates.	https://app.swaggerhub.com/apis/eclipse-edc-bot/management-api/0.7.0
21	Connector Control Plane	control-api Start Contract negotiation	https://app.swaggerhub.com/apis/eclipse-edc-bot/control-api/0.7.0
22	Connector Data Plane	public-api Data Transfer, CRUD	https://app.swaggerhub.com/apis/eclipse-edc-bot/public-api/0.7.0
23	Triggering Module	Infrastructure Provisioning API Script Controller – API for managing deployment scripts and post-configuration GET /scripts – Returns a list of deployment scripts POST /scripts – Uploads a new deployment script POST /scripts/trigger – Triggers a deployment script POST /scripts/{deploymentScriptId}/configFiles – Uploads a new post-configuration file associated with a specific deployment script GET /scripts/{deploymentScriptId} – Returns a specific deployment script GET /scripts/triggerList – Returns a list of deployment scripts triggered DELETE /scripts/{deploymentScriptId} – Removes a specific deployment script Status Controller – API to get the status of the endpoint GET /status – Retrieves the current API status Provider Controller – API to get a list of Cloud Providers GET /cloudProviders – Returns a list of Cloud Providers Resources Controller – API to manage provisioned resources DELETE /resources/deactivation/{requestUniqueId} – Removes a specific provisioned resource Cloud Environment Controller - API for managing cloud environments GET /cloudEnvironments - Returns a list of cloud environments POST /cloudEnvironments - Creates a new cloud environment DELETE /cloudEnvironments/{cloudEnvironmentId} - Removes a specific cloud environment GET /cloudEnvironments/{cloudEnvironmentId} - Returns a specific cloud environment PUT /cloudEnvironments/{cloudEnvironmentId} - Edits a specific cloud environment Component Controller - API for managing components for VM Templates GET /components - Returns a list of components POST /components - Creates a new component VM Template Controller - API for managing virtual machines templates GET /templates - Returns a list of VM Templates POST /templates - Creates a new VM Template DELETE /templates/{templateId} - Removes a specific VM Template GET /templates/{templateId} - Returns a specific VM Template PUT /templates/{templateId} - Edits a specific VM Template POST /templates/{templateId}/copy - Create a new 
VM Template from a selected one	https://code.europa.eu/simpl/simpl-open/development/infrastructure/infrastructure-be/-/blob/develop/openapi/infrastructure-provisioning-api.yaml
24	Contract Manager Orchestrator	Contract Manager POST /credentials/agreements/{contractAgreementId}/definitions/{contractDefinitionId} - Issue Verifiable credential POST /agreements/{contractAgreementId}/definitions/{contractDefinitionId}/status - Confirm signing of the Contract Agreement (setup status) GET /agreements/{contractAgreementId} - Get Contract Agreement GET /agreements/file/{contractAgreementId} - Get Contract Agreement File	code.europa.eu/simpl/simpl-open/development/contract-billing/contract/-/raw/develop/openapi/openapi3-v1.yaml?ref_type=heads	Issue Verifiable credential Response ContractAgreementResponseEvent	src/main/java/eu/europa/ec/simpl/contracts/kafka/events · main · Simpl / Simpl-Open / Development / Contract-Billing / contract · GitLab
25	Contract Manager Backend			Issue Verifiable credential Request ContractAgreementRequestEvent Confirm signing of the Contract Agreement StatusUpdateRequestEvent	src/main/java/eu/europa/ec/simpl/contracts/kafka/events · main · Simpl / Simpl-Open / Development / Contract-Billing / contract · GitLab
27	VC Issuer	Issue Verifiable Credential.	Currently under investigation
28	Contract Consumption Service	Contract Consumption API POST /connectorCatalog/assets - Search for offer registered in the provider connector catalog POST /contracts - Request contract negotiation for a specific asset GET /contracts/{contractNegotiationId} - Receive the Status of contract negotiation GET /transfers/{transferProcessId} - Get the transfer process status POST /transfers - Initiate a new transfer process GET /resourceAddresses/templates/{templateId}/uiSchema - Returns a destination address ui schema based on templateId GET /resourceAddresses/templates/{templateId}/schema - Returns a destination address template schema based on templateId GET /resourceAddresses/sharingMethods/{sharingMethodId}/templates - Return the list of destination address templates by sharing method and offering type	https://code.europa.eu/simpl/simpl-open/development/data1/contract-consumption-be/-/blob/main/openapi/openapi-v1.yaml			yes	yes
29	Notification Service			Notification Service API NotificationMessage	https://code.europa.eu/simpl/simpl-open/development/contract-billing/notification-service/-/blob/develop/docs/asyncApi/asyncapi.yaml?ref_type=heads
30	Tier 2 Gateway	NA/ - this is an API gateway and does not implement any API.		Tier 2 Gateway Async API Consumed Events credentialUpdatedEvent: The agent's credential has been updated or removed Produced Events None	https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/blob/develop/simpl-api-iaa/src/main/resources/asyncapi/tier2gateway/v1/asyncapi.yaml?ref_type=heads
31	Orchestration Engine API	POST /graphql - with the query in the body defining the information you want to retrieve or trigger, see https://docs.dagster.io/api/graphql for documentation
32	Infrastructure Provider API	POST /scripts/trigger - Starts the execution of a script based on the provided deployment script id. GET /scripts - Retrieve a list of deployment scripts. POST /scripts - Submit a new deployment script. POST /scripts/{deploymentScriptId}/configFiles - Upload configuration files for a deployment script GET /scripts/{deploymentScriptId} - Retrieve deployment script details by deploymentScriptId DELETE /scripts/{deploymentScriptId} - Script delete operation GET /scripts/triggerList - Retrieve the list of script triggers GET /cloudProviders - Retrieve a list of Cloud Providers GET /cloudEnvironments - Retrieve a basic list of Cloud Environments POST /cloudEnvironments - Create a new cloud environment GET /cloudEnvironments/{cloudEnvironmentId} - Returns a specific cloud provider by its identifier. PUT /cloudEnvironments/{cloudEnvironmentId} - Updates metadata for an existing cloud environment. DELETE /cloudEnvironments/{cloudEnvironmentId} - Delete a cloud environment DELETE /resources/deactivation/{requesterUniqueId} - Initiate deactivation of provisioned resources by requester ID GET /templates - Returns a list of configured VM templates with the most relevant data. POST /templates - Create a new VM Template GET /templates/{templateId} - Returns a specific template by its identifier. PUT /templates/{templateId} - Updates metadata for an existing VM template. DELETE /templates/{templateId} - Removes an existing template. POST /templates/{templateId}/copy - Creates a new template based on another one based on the id that is provided. GET /cloudProvisionerTemplates - Returns a list of provisioner templates with their details. POST /cloudProvisionerTemplates - Create a new Cloud Provisioner Template GET /components - List all components (with optional filtering by type) POST /components - Creates a new component definition that can be assigned to a VM template. GET /components/types - Returns a list of component types.	
https://code.europa.eu/simpl/simpl-open/development/infrastructure/infrastructure-be/-/blob/main/openapi/infrastructure-provisioning-api.yaml?ref_type=heads

completed, planned
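
As a rough illustration of how the Onboarding API v1 endpoints listed above fit together, the following Python sketch builds (but does not send) the HTTP requests for creating, submitting, approving and reading back an onboarding request. The base URL, the request identifier and the empty JSON body are assumptions made for illustration only; the actual payloads and security requirements are defined in the linked OpenAPI specification.

```python
import json
import urllib.request
from typing import List, Optional

# Hypothetical base URL; the real one depends on the deployment.
BASE_URL = "https://onboarding.example.org/v1"


def build_request(method: str, path: str,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (without sending) an HTTP request for an Onboarding API path."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def onboarding_lifecycle(request_id: str) -> List[urllib.request.Request]:
    """Requests an applicant and a notary might issue, in order."""
    return [
        # Create the request (payload schema: see the OpenAPI file).
        build_request("POST", "/onboardingRequests", body={}),
        # Submit the request for review.
        build_request("POST", f"/onboardingRequests/{request_id}/submit"),
        # Approve the request.
        build_request("POST", f"/onboardingRequests/{request_id}/approve"),
        # Read back the current state of the request.
        build_request("GET", f"/onboardingRequests/{request_id}"),
    ]


if __name__ == "__main__":
    for req in onboarding_lifecycle("42"):
        print(req.get_method(), req.full_url)
```

Building the requests separately from sending them keeps the sketch independent of any HTTP client or authentication mechanism.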

4.5.2. User Interfaces

The Simpl-Open UX/UI Style Guide can be found in the Simpl Contributions Code of Conduct & Guidelines.

#	Component	Domain	Description	Functionalities
A	User & Roles	Domain 1	IAA frontend for managing a participant's local users, assigning roles to users and assigning identity attributes to roles.	Create and manage end users; Create and manage roles; Assign roles to end users; Assign identity attributes to roles; List the local copy of the identity attributes of the Data Space; Create and manage role requests; Approve or reject role requests
B	Onboarding	Domain 1	IAA frontend that allows Participant and Governance Authority representatives to manage the onboarding requests of new participants in the Governance Authority agent.	Onboarding Governance Authority Representative Manage onboarding procedure templates along with document templates; Approve or reject an onboarding request; Request new documents on an open onboarding request; Add comments to an Onboarding Request. Participant Representative Creating temporary credentials for a dataspace applicant; Open a new onboarding request; Add comments to an onboarding request; Upload documents along with an onboarding request; Submit an onboarding request for review; Upload a participant public key; Download participant credentials.
C	Security Attributes Provider	Domain 1	IAA frontend component that allows a Governance Authority representative to manage the security attributes of a data space.	Create identity attributes for the Data Space; Assign identity attributes to a participant type (Consumer or Producer); Unassign identity attributes from a participant type.
D	Identity Provider	Domain 1	Allows the Governance Authority representative to manage participants and their credentials.	Participant Management Governance Authority Representative: Manage onboarded participants; Manage onboarded participants' credentials; Renew participant credentials
E	Participant Utility	Domain 1	IAA frontend that allows the management of credentials in the Participant Agent. It allows a Participant Representative to generate keypairs and install credentials to complete the onboarding process.	Create a Keypair; Request credential renewal; Manage keypairs; Upload a locally generated Keypair; Upload the credential provided by the Governance Authority at the end of the onboarding process
F	Catalogue Client Application	Domain 2	The Catalogue Client Application is the primary interface through which users interact with the Catalogue. It presents search fields and options to users, which in the case of advanced search are defined by the schema.	Quick Search: Enter one or more search terms into the search bar; Click "Search" to receive the results. Advanced Search: Select the schema to search for; Fill out the properties to search on; Click "Search" to receive the results. Data Consumption: Search for a valid document (see above); Click the "More details" button to enable the "Request resource" button; Click the "Request resource" button; A contract offer appears after a short loading period; Clicking "Decline" closes the modal; Clicking "Accept" starts the contract negotiation and redirects to the contract negotiation status page; The page refreshes automatically every 3 seconds to retrieve the new status until the status is "FINALIZED", after which auto-refresh stops (the page can also be refreshed manually); When the status is "FINALIZED", the "Start Transfer" button appears; Clicking "Start Transfer" opens a modal that displays the required data destination fields, depending on the resource type; For data offerings, fill out the fields of the form one by one, then scroll down to the bottom of the form and click the "Start Transfer" button; The user is redirected to the transfer process status page; The page refreshes automatically every 3 seconds until the "DEPROVISIONED" or "TERMINATED" state is reached (the page can also be refreshed manually).
G	SD Tooling	Domain 2	Frontend with the forms for the provider to create Self-Descriptions (SDs). Written in Angular and Node.js. The result is an SD in the form of a JSON-LD document that can be uploaded to the catalogue.	Select the schema for the SD to create; Fill out the generated form with all mandatory properties; Publish the SD to the catalogue on the Governance Authority
H	Schema Management UI	Domain 2	N/A - not part of the current release.
I	Vocabulary Management UI	Domain 2	N/A - not part of the current release.
J	Infrastructure Deployment Script Management UI	Domain 2	User interface for adding and removing (invalidating) Deployment Scripts, which can provision infrastructure resources and/or deploy applications. The UI also allows a Post-Configuration script to be added and associated with a Deployment Script.	List Deployment Scripts, by accessing the "Deployment Scripts" menu; Add/upload a Deployment Script, by clicking the "+ Add Script +" button; View Deployment Script details, by clicking the "Properties" icon; Download a script, by clicking the "Download" icon; Inactivate a Deployment Script, by clicking the "Trash" icon; Add a Post-Configuration, by clicking the "Add Config File" button; List decommissioned resources, by accessing the "Contracts" menu; Decommission a cloud resource, by clicking the "Decommission" button
K	Orchestration Management UI	Domain 2	UI layer from Dagster that allows the user to manage workflows.	List workflows; See contained services; Configure a workflow for a run; Manually start a workflow run; See previous runs with status; Monitor execution and logs
L	Infrastructure Deployment Script Management UI	Domain 2	User interface for managing templates for deployment scripts. A template is defined for a specific cloud environment, and a specific deployment script is generated from such a template.	Configure the Cloud Environment; Register Extra Configurations (optional); Define configurations for the provisioned VM; Define security policies for the provisioned VM; Define additional packages that will be installed on the provisioned VM; Create the VM Template (OVH, IONOS)
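
The auto-refresh behaviour described for the Catalogue Client Application (a status check every 3 seconds until a terminal state such as "FINALIZED", "DEPROVISIONED" or "TERMINATED" is reached) can be sketched as a simple polling loop. This is a minimal illustration, not the actual frontend implementation: the function name, the `max_polls` safety limit and the intermediate state names used in the usage example are assumptions.

```python
import time
from typing import Callable, Iterable, List


def poll_until(fetch_status: Callable[[], str],
               terminal_states: Iterable[str],
               interval_seconds: float = 3.0,
               sleep: Callable[[float], None] = time.sleep,
               max_polls: int = 1000) -> List[str]:
    """Poll fetch_status until a terminal state is seen.

    Returns every status observed, in order. The sleep function is
    injectable so the loop can be exercised without real waiting.
    """
    terminal = set(terminal_states)
    seen: List[str] = []
    for _ in range(max_polls):
        status = fetch_status()
        seen.append(status)
        if status in terminal:
            return seen  # stop refreshing once a terminal state is reached
        sleep(interval_seconds)
    raise TimeoutError("no terminal state reached within max_polls")


if __name__ == "__main__":
    # Hypothetical sequence of negotiation statuses, for illustration only.
    statuses = iter(["REQUESTED", "AGREED", "FINALIZED"])
    print(poll_until(lambda: next(statuses), {"FINALIZED"}, sleep=lambda s: None))
```

For the transfer process page, the same loop would be called with `{"DEPROVISIONED", "TERMINATED"}` as the terminal states.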

4.6. Traceability from the functional architecture

The following table presents a mapping between the components from the functional architecture and the ones from the application architecture.

Functional Component	Application Component
Onboarding	Onboarding
IAA	Authorisation
	Tier 1 Authentication Provider
	Tier 2 Authentication Provider
	Identity Provider
	Security Attributes Provider
	User & Roles
	Credential Database/Vault
Vocabulary Management	Vocabulary Management
Schema Management	Schema Management
	Schema Registry
Service Offering Editor	SD Tooling
	Signer Service
	Wallet
	Policy Template Datastore
Federated Catalogue	Catalogue
Search	Catalogue Client Application
Data Space Connector	Connector
Contract Management	Contract Manager Orchestrator
	Contract Manager Backend
	Contract Template Datastore
Data Transfer	Connector
Infrastructure Management	Triggering Module
	Infrastructure Provisioner
Observability	Monitoring Module
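
The mapping above is one-to-many, and some application components realise more than one functional component. A minimal Python sketch of the same table as a dictionary, with a reverse lookup, makes this visible. It assumes that rows with an empty first column continue the functional component of the preceding row; the function name is illustrative.

```python
from typing import Dict, List

# Mapping copied from the traceability table above.
FUNCTIONAL_TO_APPLICATION: Dict[str, List[str]] = {
    "Onboarding": ["Onboarding"],
    "IAA": [
        "Authorisation",
        "Tier 1 Authentication Provider",
        "Tier 2 Authentication Provider",
        "Identity Provider",
        "Security Attributes Provider",
        "User & Roles",
        "Credential Database/Vault",
    ],
    "Vocabulary Management": ["Vocabulary Management"],
    "Schema Management": ["Schema Management", "Schema Registry"],
    "Service Offering Editor": [
        "SD Tooling", "Signer Service", "Wallet", "Policy Template Datastore",
    ],
    "Federated Catalogue": ["Catalogue"],
    "Search": ["Catalogue Client Application"],
    "Data Space Connector": ["Connector"],
    "Contract Management": [
        "Contract Manager Orchestrator",
        "Contract Manager Backend",
        "Contract Template Datastore",
    ],
    "Data Transfer": ["Connector"],
    "Infrastructure Management": ["Triggering Module", "Infrastructure Provisioner"],
    "Observability": ["Monitoring Module"],
}


def application_to_functional(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the mapping: application component -> functional components."""
    reverse: Dict[str, List[str]] = {}
    for functional, applications in mapping.items():
        for application in applications:
            reverse.setdefault(application, []).append(functional)
    return reverse


if __name__ == "__main__":
    reverse = application_to_functional(FUNCTIONAL_TO_APPLICATION)
    print(reverse["Connector"])
```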

5. Simpl-Open Data Architecture

Simpl-Open Data Architecture presents data entities and/or collections and how they are structured within the system.

Given that Simpl-Open combines existing/reusable open-source components and custom-built components, the following approach is followed:

For open-source components, the dedicated sub-section provides a link to the available data model documentation of the component.
For custom components, the dedicated sub-section describes the data model per component (as per the microservices approach) through the following layers:

Layer	Description
Conceptual	The conceptual data model (CDM) operates at a high level, providing an overarching perspective on the application's data needs. It defines a broad and simplified view of the data to create a shared understanding of the application by capturing the essential concepts. These essential concepts are captured in an Entity Relationship Diagram (ERD) and the accompanying entity definitions.
Logical	The logical data model (LDM) contains representations that fully define the relationships in the data, adding the details and structure of essential entities. The LDM remains data platform agnostic because it focuses on business needs, flexibility and portability. The LDM includes the specific attributes of each entity, the relationships between entities and the cardinality of those relationships.
Physical	The physical data model (PDM) is a data model that represents relational data objects. It describes the technology-specific and database-specific implementation of the data model and is the last step in transforming from the logical data model to a working database. The physical data model includes all the needed physical details to build the database.

5.1. Open-Source Components Data Model

OSS	Data Model
XFSC Signer	The self-description is wrapped in a verifiable credential (VC), and the proof section of the VC contains the signature. The data model is defined here: https://www.w3.org/TR/vc-data-model/#proofs-signatures
XFSC catalogue	The XFSC catalogue stores data in three different ways: (1) File storage for the JSON-LD serialisation. The definition of the file format can be found at https://json-ld.org/ . The data model itself depends on the schema definition used (defined in the Additional Technical Specifications about Capabilities section). (2) A graph database (Neo4j) used as an index to allow semantic queries. The data model of the database also depends on the schema. (3) Metadata stored in a relational database (PostgreSQL). The data model is described in https://gaia-x.gitlab.io/data-infrastructure-federation-services/cat/architecture-document/architecture/catalogue-architecture.html#_metadata_store
OpenBao	Secret data is stored in a secrets engine (see https://openbao.org/docs/internals/architecture/ ). The data model depends on the engine used; currently, a Key-Value (KV) store data model is used.
Keycloak	Keycloak uses a "code first" approach to data modelling. There are no data model diagrams available in its documentation, but the data model is described in its code repository: https://github.com/keycloak/keycloak/tree/main/model
EJBCA	The EJBCA data model diagram is located at https://doc.primekey.com/ejbca/ejbca-introduction/ejbca-architecture/internal-architecture
Crossplane	Resource Definition: https://docs.crossplane.io/latest/concepts/composite-resource-definitions/
OpenTofu	Resource Definition: https://opentofu.org/docs/language/resources/
Kubernetes	Kubernetes objects: https://kubernetes.io/docs/concepts/overview/working-with-objects/
Ansible	Ansible Data Manipulation: https://docs.ansible.com/ansible/latest/playbook_guide/complex_data_manipulation.html
ArgoCD	RBAC Model https://argo-cd.readthedocs.io/en/stable/operator-manual/rbac/#rbac-model-structure
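
To make the XFSC Signer entry more concrete, the following Python sketch shows the general shape of a self-description wrapped in a verifiable credential with a proof section, following the field names of the W3C Verifiable Credentials Data Model. It is an illustration only: the proof type, the verification method suffix, the identifiers and the placeholder signature are assumptions, and no real signing takes place here (the Signer produces the actual signature).

```python
from datetime import datetime, timezone


def wrap_self_description(self_description: dict, issuer: str, signature: str) -> dict:
    """Wrap a self-description in a VC-shaped dictionary with a proof section.

    The signature is passed in as an opaque string; this sketch does not
    compute or verify it.
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "@context": ["https://www.w3.org/2018/credentials/v1"],
        "type": ["VerifiableCredential"],
        "issuer": issuer,
        "issuanceDate": now,
        # The self-description is carried as the credential subject.
        "credentialSubject": self_description,
        # The proof section carries the signature over the credential.
        "proof": {
            "type": "JsonWebSignature2020",          # assumed proof suite
            "created": now,
            "proofPurpose": "assertionMethod",
            "verificationMethod": issuer + "#key-1",  # hypothetical key reference
            "jws": signature,
        },
    }


if __name__ == "__main__":
    vc = wrap_self_description(
        {"id": "did:example:offering-1"},  # hypothetical self-description
        issuer="did:example:provider",
        signature="<signature produced by the signer>",
    )
    print(sorted(vc.keys()))
```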

5.2. Custom Components Data Model

5.2.1. Conceptual Data Model

5.2.1.1. CDM - Domain 1 - Access Control & Trust

Conceptual data model of components from domain 1. Please refer to Domain 1 Logical Data Model for a complete description of entities and their fields.

CDM - Onboarding

Handles the onboarding of a new participant in the Data Space.

Entity	Description
Participant Type	A type of participant in the dataspace. It can be a consumer, an application provider, a data provider, or an infrastructure provider.
Onboarding Applicant	An applicant representing an organisation that seeks to join a dataspace.
Onboarding Procedure Template	A template that defines, for each participant type, the data that must be provided by an applicant (see User Roles) to complete the onboarding process.
Document Template	A component of an onboarding procedure template. It defines the document that must be uploaded as part of an onboarding request.
MIME Type	The MIME type associated with a document template.
Onboarding Request	An instance of an onboarding procedure template, created by an applicant. The request can change status based on actions taken by the applicant (e.g. submission) or by governance authority representatives (e.g. rejection, approval, or request for review).
Document	An instance of a document linked to an onboarding request, uploaded by an applicant (see User Roles).
Comment	A comment that can be added to an onboarding request by either an applicant or a governance authority representative (e.g. Notary).
Identity Attribute	A Tier 2 identity attribute used within the dataspace.
Validation Rule	A definition containing the parameters required to validate a document uploaded by an applicant. Validation rules can be combined into hierarchical structures.
Validation Rule Execution	A record of the outcome of a validation performed on a document uploaded by an applicant.
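
The statement that validation rules can be combined into hierarchical structures can be pictured as a small rule tree. The sketch below is one possible reading, not the actual Onboarding implementation: the "ALL"/"ANY" combination operators, the class layout and the example checks are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ValidationRule:
    name: str
    # Leaf rules carry a check function; composite rules carry children.
    check: Optional[Callable[[bytes], bool]] = None
    operator: str = "ALL"  # hypothetical combinator: "ALL" or "ANY"
    children: List["ValidationRule"] = field(default_factory=list)

    def evaluate(self, document: bytes) -> bool:
        """Evaluate this rule (and, recursively, its children) on a document."""
        if self.check is not None:
            return self.check(document)
        results = [child.evaluate(document) for child in self.children]
        return all(results) if self.operator == "ALL" else any(results)


if __name__ == "__main__":
    not_empty = ValidationRule("not_empty", check=lambda d: len(d) > 0)
    is_pdf = ValidationRule("is_pdf", check=lambda d: d.startswith(b"%PDF"))
    root = ValidationRule("pdf_document", operator="ALL", children=[not_empty, is_pdf])
    print(root.evaluate(b"%PDF-1.7 ..."))
```

Each call to `evaluate` on an uploaded document would correspond to what the table calls a Validation Rule Execution, whose outcome is then recorded.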

CDM - Users Roles

Microservice that maps Tier 1 roles to Tier 2 security attributes.

Entity	Description
Identity Attribute	A Tier 2 identity attribute used within the dataspace. It can be assigned to one or more Tier 1 roles.
Role	A Tier 1 role assigned to a user within the Simpl-Open agent. It can have one or more Tier 2 identity attributes assigned.
User	A Simpl-Open end user who uses the agent's functionalities.
Roles Request	Created by an end-user to request specific roles and access the agent's functionalities.
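
The many-to-many relationship between Tier 1 roles and Tier 2 identity attributes can be sketched as a lookup that resolves the identity attributes a user effectively holds through their roles. The role names and attribute codes below are hypothetical, as is the function name.

```python
from typing import Dict, Iterable, Set


def identity_attributes_for(user_roles: Iterable[str],
                            role_attributes: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of the identity attributes assigned to the given roles."""
    attributes: Set[str] = set()
    for role in user_roles:
        # Roles without any assigned attribute contribute nothing.
        attributes |= role_attributes.get(role, set())
    return attributes


if __name__ == "__main__":
    # Hypothetical role-to-attribute assignments.
    role_attributes = {
        "CATALOGUE_EDITOR": {"SD_PUBLISHER"},
        "CONTRACT_MANAGER": {"SD_CONSUMER", "CONTRACT_SIGNER"},
    }
    print(sorted(identity_attributes_for(["CATALOGUE_EDITOR", "CONTRACT_MANAGER"],
                                         role_attributes)))
```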

CDM - Security Attributes Provider

Microservice that provides Ephemeral Proofs to onboarded Data Space participants. It’s the core of Dynamic Attribute Provisioning. Deployed only by the Data Space Governance Authority.

Entity	Description
Participant	An onboarded Data Space participant.
Identity Attribute	A Tier 2 identity attribute used within the dataspace.

CDM - Identity Provider

Microservice that handles the credentials for each Data Space participant. Deployed only by the Data Space Governance Authority.

Entity	Description
Participant	An onboarded Data Space participant, along with the information needed to issue credentials.
Credential	A credential (currently an X.509 certificate) signed by the Governance Authority and later provided to the participant (see Credential in the Authentication Provider component).
Auto Renewal Defaults	The default auto-renewal configurations of the dataspace.
Participant Auto Renewal	The credential auto-renewal configurations of the specific participant.
Auto Renewal Errors	Errors that may arise during credential auto renewal of a participant.

CDM - Authentication Provider

Entity	Description
KeyPair	A KeyPair (public and private) linked to the participant's credential.
Credential	A credential issued by the Governance Authority to the participant. The participant uses it to communicate with other participants.
Private Key	The private key content related to the keypair.
Participant	The information of the participant owning the agent.
Identity Attribute	A local copy of the dataspace identity attributes.
Auto Renewal Config	The agent's credential auto-renewal configuration.
Credential Sync Error	Execution errors that may happen during credential synchronisation with the Governance Authority.

5.2.1.2. CDM - Domain 2 - Publish and consume resources

CDM - Contract Manager

Entity	Description
Contract Agreement	Represents the signed agreement between a provider and a consumer.

CDM - Infrastructure Provider Storage

Handles the storage and management of deployment scripts by infrastructure providers for provisioning infrastructure instances and applications.

Entity	Description
Infrastructure Provider	Represents the company that offers infrastructure deployment services.
Deployment Script	Represents the deployment script uploaded by the provider to enable provisioning of infrastructure instances and applications.
Script Trigger	Represents a provisioning request for a deployment script.
Script Identify	Represents metadata about deployment scripts, including their hash for integrity checks.
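
The integrity check mentioned for the script metadata can be illustrated with a content hash that is stored at upload time and compared before a script is triggered. This is a minimal sketch under the assumption that SHA-256 is used; the document does not state the hash algorithm, and the function names are illustrative.

```python
import hashlib
import hmac


def script_hash(content: bytes) -> str:
    """Return the hex digest of a deployment script (SHA-256 assumed)."""
    return hashlib.sha256(content).hexdigest()


def is_unmodified(content: bytes, expected_hash: str) -> bool:
    """Check that the script content still matches the stored hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(script_hash(content), expected_hash)


if __name__ == "__main__":
    uploaded = b"#!/bin/sh\necho provisioning\n"
    stored = script_hash(uploaded)             # kept with the script metadata
    print(is_unmodified(uploaded, stored))     # same content
    print(is_unmodified(b"tampered", stored))  # different content
```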

CDM - Schema Sync Adapter

It manages the storage of information related to schemas, including their versions, associated metadata, and the events related to the publication and revocation of a schema.

5.2.2. Logical Data Model

5.2.2.1. LDM - Domain 1 - Access Control & Trust

LDM - Onboarding

Handles the onboarding of a new participant in the Data Space.

Entity Descriptions and Attributes

MimeType

Description: Represents the allowed MIME types for onboarding request documents.
Attributes:
- id: The identifier of the MIME type.
- description : A human-readable text that describes the MIME type (e.g. “pdf”, “zip”).
- value : The actual MIME type value, following RFC 6838 (e.g. “application/pdf”, “application/zip”).
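
As an illustration of what "following RFC 6838" means for the value attribute, the sketch below checks that a string has the type/subtype form with names restricted to the characters allowed by RFC 6838, section 4.2 (at most 127 characters per name). It is a syntactic check only and does not verify that the type is registered; the function name is illustrative.

```python
import re

# restricted-name from RFC 6838, section 4.2: first character is a letter
# or digit, followed by up to 126 further restricted-name characters.
_RESTRICTED_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]{0,126}"
_MIME_TYPE = re.compile(_RESTRICTED_NAME + "/" + _RESTRICTED_NAME)


def is_valid_mime_type(value: str) -> bool:
    """Return True if value is syntactically a type/subtype media type name."""
    return _MIME_TYPE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["application/pdf", "application/zip", "pdf", "application/"]:
        print(candidate, is_valid_mime_type(candidate))
```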

ParticipantType

Description: The participant type is related to an onboarding procedure template.
Attributes:
- id: The identifier of the participant type.
- value : The code of the participant type.
- label: A human-readable name for the participant type.

OnboardingProcedureTemplate

Description: The template of an onboarding procedure. Along with the document template, it defines the information that has to be filled out by the applicant.
Attributes:
- id: The template identifier.
- description: A brief description of the onboarding procedure template (e.g. “The role of this participant in the dataspace is …”).
- participant_type_id: The participant type the onboarding procedure template refers to. References the ParticipantType entity.
- expiration_timeframe : An expiration timeframe after which the onboarding request is considered rejected, expressed in seconds.
- expiration_timeframe_timeunit : The time unit for the expiration timeframe (HOUR, DAY, YEAR).
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.

OnboardingProcedureTemplateIdentityAttribute

Description: The mapping between the onboarding procedure template and the dataspace identity attributes.
Attributes:
- onboarding_procedure_template_id : The identifier of the onboarding procedure template. References the OnboardingProcedureTemplate entity.
- identity_attribute_code : The code of the identity attribute mapped to the onboarding procedure template.

DocumentTemplate

Description: The information related to a document that has to be uploaded.
Attributes:
- id : The template identifier.
- name: The short name of the document template.
- description : A brief description of the requested document (e.g. “Business License”, “Proof of Identity”).
- mandatory : Specifies if the document template has to be provided or is optional. Defaults to true (mandatory).
- mime_type_id : The document mime type. References the MimeType entity.

OnboardingApplicant

Description: The information regarding the applicant who opens an onboarding request.
Attributes:
- id: The identifier of the applicant
- username: The username of the user (same as the one in Keycloak).
- firstname: User’s first name.
- lastname: User’s last name.

OnboardingRequest

Description: An instance of an onboarding procedure template, created by an applicant.
Attributes:
- id : The identifier of the onboarding request.
- onboarding_procedure_template_id : The onboarding procedure template that the onboarding request refers to. References the OnboardingProcedureTemplate entity.
- onboarding_status_id : The status of the onboarding request. References the OnboardingStatus entity.
- expiration_timeframe: An expiration timeframe after which the onboarding request is considered rejected.
- expiration_timeframe_timeunit : The time unit for the expiration timeframe (HOUR, DAY, YEAR)
- participant_type_id : The participant type, copied from the related onboarding procedure template. References the ParticipantType entity.
- participant_id : The participant’s identifier. Populated when the onboarding request is approved and the participant is created.
- rejection_cause : The text explaining why the request is rejected.
- onboarding_applicant_id: The identifier of the applicant representative that created the onboarding request. References the OnboardingApplicant entity.
- organization : The name of the organisation that opened this onboarding request through the applicant representative.
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.
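
How expiration_timeframe and expiration_timeframe_timeunit combine can be sketched as follows. The sketch assumes that the timeframe is a count of the given time unit, that it is measured from creation_timestamp, and that YEAR is approximated as 365 days; none of these points is stated explicitly in the attribute descriptions, and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: YEAR is approximated as 365 days.
UNIT_TO_DELTA = {
    "HOUR": timedelta(hours=1),
    "DAY": timedelta(days=1),
    "YEAR": timedelta(days=365),
}


def expiration_instant(creation_timestamp: datetime,
                       expiration_timeframe: int,
                       expiration_timeframe_timeunit: str) -> datetime:
    """Instant after which the onboarding request is considered rejected."""
    unit = UNIT_TO_DELTA[expiration_timeframe_timeunit]
    return creation_timestamp + expiration_timeframe * unit


def is_expired(creation_timestamp: datetime, expiration_timeframe: int,
               expiration_timeframe_timeunit: str, now: datetime) -> bool:
    """True once 'now' is past the computed expiration instant."""
    return now > expiration_instant(creation_timestamp, expiration_timeframe,
                                    expiration_timeframe_timeunit)


if __name__ == "__main__":
    created = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(expiration_instant(created, 30, "DAY").isoformat())
```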

OnboardingRequestIdentityAttribute

Description: The mapping between the onboarding request and the dataspace identity attributes.
Attributes:
- onboarding_request_id : The identifier of the onboarding request. References the OnboardingRequest entity.
- identity_attribute_code : The code of the identity attribute mapped to the onboarding request.

Document

Description: The document uploaded by an applicant to complete the onboarding request.
Attributes:
- id: The identifier of the document.
- description : A brief description of the requested document (e.g. “Business License”, “Proof of Identity”)
- document_template_id : The document template in the onboarding procedure to which this document refers. References the DocumentTemplate entity. It can be null if the document is requested during the onboarding of the applicant participant.
- onboarding_request_id : The identifier of the onboarding request. References the OnboardingRequest entity.
- mime_type_id : The document type. References the MimeType entity.
- content : The actual content of the document uploaded by the applicant dataspace participant during the request creation or editing. If null, it means that the document has not been uploaded yet by the applicant dataspace participant.
- fileSize : The size of the uploaded file.
- filename : The name of the uploaded file.
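
The relationship between DocumentTemplate.mandatory and Document.content suggests a simple completeness check before an onboarding request is submitted. The sketch below models only the attributes needed for that check and is an assumption about how the two entities might be used together, not a description of the actual service logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DocumentTemplate:
    id: str
    name: str
    mandatory: bool = True  # defaults to mandatory, as in the LDM


@dataclass
class Document:
    id: str
    document_template_id: Optional[str]  # may be null for ad hoc documents
    content: Optional[bytes]             # null until the file is uploaded


def missing_mandatory(templates: List[DocumentTemplate],
                      documents: List[Document]) -> List[str]:
    """Names of mandatory document templates with no uploaded document."""
    uploaded = {
        d.document_template_id
        for d in documents
        if d.content is not None and d.document_template_id is not None
    }
    return [t.name for t in templates if t.mandatory and t.id not in uploaded]


if __name__ == "__main__":
    templates = [
        DocumentTemplate("t1", "Business License"),
        DocumentTemplate("t2", "Proof of Identity"),
        DocumentTemplate("t3", "Cover Letter", mandatory=False),
    ]
    documents = [Document("d1", "t1", b"%PDF-1.7"), Document("d2", "t2", None)]
    print(missing_mandatory(templates, documents))
```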

Comment

Description: Comments inserted by the actors involved in the onboarding process.
Attributes:
- id: The identifier of the comment.
- onboarding_request_id : The identifier of the onboarding request to which the comment belongs. References the OnboardingRequest entity.
- author : The author of the comment. It’s the username stored in Keycloak.
- content : The comment written by the author.

OnboardingStatus

Description: Supporting table containing the status values (APPROVED, IN PROGRESS, IN REVIEW, REJECTED, EVALUATING).
Attributes:
- id: The id of the status.
- value : The actual status of an onboarding request.
- label : A human-readable label for the status.

EventLog

Description: Registers business events related to an onboarding request (e.g. Comment Inserted, Onboarding Request Status Change).
Attributes:

- id : The identifier of the event.
- onboarding_request_id : The identifier of the related onboarding request. References the OnboardingRequest entity.
- initiator_user_id : The identifier of the user that caused the event.
- initiator_service : The identifier of the component or service that caused the event (e.g. background service monitoring stale onboarding request).
- event_type : Type of event (e.g. COMMENT_INSERTED, STATUS_CHANGED).
- event_details : Additional JSON metadata that contains details about the event (e.g. new state).
- entity_id : The id of the entity related to the event (e.g. the id of the comment if the event type is COMMENT_INSERTED).
- creation_timestamp : The creation timestamp of the event.
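
As an illustration of how such a record might be assembled, the sketch below builds a STATUS_CHANGED entry in Python. The class name `EventLogEntry`, the helper `status_changed_event`, the `new_state` key inside event_details and the rule that exactly one initiator is set are assumptions made for this example; they are not taken from the Simpl-Open code base.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
from uuid import uuid4


@dataclass
class EventLogEntry:
    """In-memory stand-in for one EventLog row; field names follow the attribute list."""
    onboarding_request_id: str
    event_type: str                           # e.g. COMMENT_INSERTED, STATUS_CHANGED
    event_details: str                        # JSON text with event-specific metadata
    initiator_user_id: Optional[str] = None
    initiator_service: Optional[str] = None
    entity_id: Optional[str] = None
    id: str = field(default_factory=lambda: str(uuid4()))
    creation_timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def status_changed_event(request_id: str, new_state: str,
                         user_id: Optional[str] = None,
                         service: Optional[str] = None) -> EventLogEntry:
    """Build a STATUS_CHANGED entry for an onboarding request."""
    # Assumption: an event is caused by a user or by a service, never both.
    if (user_id is None) == (service is None):
        raise ValueError("set exactly one of user_id or service")
    return EventLogEntry(
        onboarding_request_id=request_id,
        event_type="STATUS_CHANGED",
        event_details=json.dumps({"new_state": new_state}),  # hypothetical key
        initiator_user_id=user_id,
        initiator_service=service,
    )
```

Keeping event_details as serialised JSON mirrors the attribute description, which leaves the payload structure open per event type.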

ValidationRule

Description: Validation rule context used to validate documents uploaded by the applicants.
Attributes:
- id : The identifier of the validation rule.
- name : The short name of the validation rule.
- description : A detailed description of the validation rule.
- document_template_id: The identifier of the document template to which the rule applies. References the DocumentTemplate entity.
- onboarding_procedure_template_id : The onboarding procedure template where the rule has been created. References the OnboardingProcedureTemplate entity.
- valid_since : The date from which the rule becomes valid and must be evaluated.
- valid_to : The date until which the rule remains valid and must be evaluated.
- active: Boolean parameter indicating if the rule is active. An inactive rule is not evaluated.
- type: the validation rule type (CONTENT_CHECK, PRESENCE, COMPOSITE).
- auto_approval: Flag indicating that if the rule passes, the related onboarding request is automatically approved.
- required: Flag indicating that if the rule does not pass, the related onboarding request is automatically rejected.
- content_validation_rule: When the type is CONTENT_CHECK, it contains the URL to an external validation service.
- strategy : The evaluation strategy for a composite rule. ALL indicates that all child rules must be valid. AT_LEAST_ONE indicates that only one rule needs to be valid.
- parent_id : The id of the parent COMPOSITE rule, if the rule is a child rule. References the ValidationRule entity.
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.
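
The interplay of active, valid_since, valid_to, type and strategy can be sketched as a small recursive evaluation. This is a minimal Python sketch under stated assumptions: the `Rule` class, the `children` list (standing in for rules whose parent_id points at the composite), the `check` callable for leaf rules, the inclusive treatment of valid_to, and the choice to ignore skipped children are all hypothetical and not confirmed by this document.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, List, Optional


@dataclass
class Rule:
    name: str
    type: str                                    # CONTENT_CHECK, PRESENCE or COMPOSITE
    active: bool = True
    valid_since: Optional[date] = None
    valid_to: Optional[date] = None
    strategy: Optional[str] = None               # ALL or AT_LEAST_ONE (COMPOSITE only)
    children: List["Rule"] = field(default_factory=list)
    check: Optional[Callable[[], bool]] = None   # hypothetical leaf check


def applies(rule: Rule, today: date) -> bool:
    """An inactive rule, or one outside its validity window, is not evaluated."""
    if not rule.active:
        return False
    if rule.valid_since is not None and today < rule.valid_since:
        return False
    if rule.valid_to is not None and today > rule.valid_to:
        return False
    return True


def evaluate(rule: Rule, today: date) -> Optional[bool]:
    """Return True or False, or None when the rule is skipped."""
    if not applies(rule, today):
        return None
    if rule.type != "COMPOSITE":
        return bool(rule.check()) if rule.check else False
    # Skipped children do not take part in the composite outcome (assumption).
    results = [r for r in (evaluate(c, today) for c in rule.children) if r is not None]
    if rule.strategy == "ALL":
        return all(results)
    if rule.strategy == "AT_LEAST_ONE":
        return any(results)
    raise ValueError(f"unknown strategy: {rule.strategy!r}")
```

Note that with this sketch an ALL composite whose children are all skipped evaluates to True; a real implementation may prefer to treat that case as skipped.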

ValidationRuleExecution

Description: The rule execution outcome of a rule evaluated against an actual document uploaded by the applicant.
Attributes:
- id: The identifier of the execution.
- validation_rule_id: The identifier of the validation rule used for this execution. References the ValidationRule entity.
- document_id: The ID of the document on which the validation was performed. References the Document entity.
- onboarding_request_id: The onboarding request this execution is related to. References the OnboardingRequest entity.
- execution_start_date : The start date and time of the rule execution.
- execution_end_date : The end date and time of the rule execution.
- status : The outcome of the validation (SUCCESS, IN PROGRESS, ERROR, FAULT).
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.

ValidationRuleExecutionRemark

Description: A validation rule remark. Registers the result of the external validation service when a content check rule fails.
Attributes:
- id: The identifier of the remark.
- execution_id: The identifier of the validation rule execution to which the remark refers. References the ValidationRuleExecution entity.
- jsonb: An unstructured field containing the remark.

LDM - Users Roles

Microservice that helps to map tier 1 roles with tier 2 security attributes.

Entity Descriptions and Attributes

IdentityAttributeRole

Description: The mapping between the tier 1 role and the assignable tier 2 identity attributes.
Attributes:
- id : The unique ID of the attribute → role association.
- ida_code: The unique identity attribute code.
- role_name : The role name mapped to the attribute. The role name references a role defined inside the tier1 authentication provider.
- enabled : Flag indicating if the association between the identity attribute and the role is valid.
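
To make the mapping concrete, the following Python sketch resolves the identity attribute codes available to a user from their tier 1 roles, skipping disabled associations. The function name `attributes_for_roles` and the sample role and attribute codes in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class IdentityAttributeRole:
    ida_code: str      # tier 2 identity attribute code
    role_name: str     # tier 1 role defined in the authentication provider
    enabled: bool      # only enabled associations are considered


def attributes_for_roles(mappings: Iterable[IdentityAttributeRole],
                         user_roles: Iterable[str]) -> Set[str]:
    """Collect the identity attribute codes mapped to any of the user's roles."""
    roles = set(user_roles)
    return {m.ida_code for m in mappings if m.enabled and m.role_name in roles}
```

For example, with associations (`SD_PUBLISHER`, `publisher`, enabled) and (`SD_CONSUMER`, `publisher`, disabled), a user holding the `publisher` role resolves to `{"SD_PUBLISHER"}` only.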

RoleRequest

Description: The role request created by an end-user to request a role in the agent.
Attributes :
- id: The unique ID of the roles request
- user_email: the email of the user who requested the roles
- status: the status of the request (open, cancelled, approved, rejected)
- reviewed_by: the id of the user that reviewed the request
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.

RoleRequested

Description: A specific role linked to a role request.
Attributes :
- id: The unique ID of the requested role.
- role_request_id: the id of the parent role request
- role: the code of the requested role
- requested_by: either the id of the end-user that requested the role or the id of the approver that added the role to the existing role request
- approved: indicates whether the role was included when the parent role request was accepted
- requested_timestamp : The request timestamp.

Role

Description: An End-User Simpl-Open role
Attributes:
- id: The unique ID of the role.
- code: The role code.
- name: The role’s human-readable name.
- description: The role’s human-readable description.
- builtin: Boolean indicating if the role is built-in (default) for Simpl-Open.
- enabled: Boolean indicating if the role can be assigned to a user and can be included in their session after authentication.

LDM - Security Attributes Provider

Microservice that provides Ephemeral Proofs to onboarded Dataspace Participants. It’s the core of the Dynamic Attribute Provisioning approach. Deployed only by the Data Space Governance Authority.

Entity Descriptions and Attributes

IdentityAttribute

Description: The complete list of all the identity attributes of the Data Space.
Attributes:
- id : The unique ID of the identity attribute.
- code : The identity attribute unique code. This is the actual identifier that is used to enforce authorisation across participants.
- name : The human-readable identity attribute name.
- description : The description of the identity attribute.
- assignable_to_roles : Flag indicating if the identity attribute can be assigned to a role.
- enabled : True if the identity attribute is enabled for this participant.
- is_right : Flag indicating if the identity attribute is considered a legal right.
- built_in : Boolean indicating that the identity attribute is built-in (installed with the agent and not modifiable).
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.

ParticipantIdentityAttribute

Description: Maps a participant with their identity attribute.
Attributes:
- participant_id: The identifier of the participant associated with the identity attribute.
- identity_attribute_id : The identifier of the identity attribute. References the IdentityAttribute entity.

LDM - Identity Provider

Microservice that handles the credentials for each dataspace participant. Deployed only by the Data Space Governance Authority.

Entity Descriptions and Attributes

Participant

Description: Contains the information of the dataspace participants.
Attributes:

- id : The unique ID of the participant.
- organization : The organisation name of the participant.
- participant_type : The type of the participant (CONSUMER, DATA PROVIDER, INFRASTRUCTURE PROVIDER, APPLICATION PROVIDER).
- certificate_signing_request_content : The content of the CSR needed to issue a credential to the participant
- tier1_public_key_content : Contains the tier 1 public key (Keycloak public key) used by the participant Keycloak to sign user tier1 JWTs.
- active_credential_id : The id of the participant’s active credential. References the Credential entity.
- applicant_email : The email of the applicant responsible for the onboarding procedure of the participant
- is_authority : Boolean indicating that the participant is the Governance Authority of the data space.
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.
- renewal_request_timestamp : The timestamp of the renewal request issuance by the participant.

Credential

Description: Metadata of the credential stored in the credential factory component (EJBCA)
Attributes:
- id : The unique ID of the credential
- participant_id : The id of the participant owning the credential. References the Participant entity.
- credential_type: Type of the credential, currently only x509 credentials are supported
- certificate_authority : The certificate authority name of the credential factory component (EJBCA)
- serial: The serial number of the credential in the credential factory component (EJBCA)
- credential_id: The id of the credential inside the credential factory component (EJBCA)
- expiry_date: The expiration date of the credential.

AutoRenewalDefault

Description: the default auto-renewal configuration for the data space.
Attributes:
- id: the id of the auto-renewal configuration.
- days_before_expiry: the number of days prior to the credential’s expiration at which the auto-renewal process is triggered by the scheduled job.
- modified_by_user: indicates if the default installation configuration has been overwritten by a user of the Governance Authority.
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.

AutoRenewalParticipant

Description: auto-renewal configurations that override the default ones for the participant
Attributes:
- participant_id: the id of the participant.
- days_before_expiry: the number of days prior to the credential’s expiration at which the auto-renewal process for the participant is triggered by the scheduled job.
- boolean: indicates if the auto renewal is enabled for the participant.
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.
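
The relationship between the data space default and a participant override can be sketched as follows. This is a minimal Python sketch, assuming that the scheduled job compares the current date with the credential expiry date minus days_before_expiry, and that the per-participant flag (listed as "boolean" above) is called `enabled`; both the function `renewal_due` and that field name are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class AutoRenewalParticipant:
    participant_id: str
    days_before_expiry: Optional[int]   # None means: fall back to the default
    enabled: bool                       # assumed name of the per-participant flag


def renewal_due(expiry_date: date, today: date, default_days: int,
                override: Optional[AutoRenewalParticipant] = None) -> bool:
    """Decide whether the scheduled job should trigger auto-renewal today."""
    if override is not None and not override.enabled:
        return False
    days = default_days
    if override is not None and override.days_before_expiry is not None:
        days = override.days_before_expiry   # participant value wins over the default
    return today >= expiry_date - timedelta(days=days)
```

With a default of 30 days and a credential expiring on 1 March 2026, renewal becomes due on 30 January 2026; a participant override of 10 days moves that to 19 February 2026.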

AutoRenewalError

Description: stores the auto-renewal error details linked to each participant.
Attributes:
- id: the id of the logged error.
- participant_id: the id of the participant for whom the auto-renewal has failed.
- description: the description of the error.
- creation_timestamp : The creation timestamp.

LDM - Authentication Provider

Microservice that manages the credentials and the tier2 authentication of a participant

Entity Descriptions and Attributes

KeyPair

Description: The keypair created or uploaded by an applicant representative to initiate the credential creation after the approval of an onboarding request.
Attributes:
- id : The unique ID of the keypair.
- name: The name of the KeyPair, inserted by the user upon creation.
- active: Boolean indicating that the keypair is linked to an active credential.
- public_key : The keypair public key content.
- public_key_hash: The keypair public key hash.
- certificate_signing_request: The content of the CSR, needed for requesting the issuance of a new credential linked to the keypair
- creation_timestamp : The creation timestamp.
- update_timestamp : The update timestamp.

PrivateKey

Description: The private key content related to the keypair.
Attributes:
- id: The ID of the private key.
- private_key: The private key encrypted content.
- keypair_id: The keypair linked to this key. References the KeyPair entity.
- creation_timestamp : The creation timestamp.

Credential

Description: The credential that allows tier2 communication of the participant in the dataspace. Can be empty if the configured credential storage is Hashicorp Vault.
Attributes:
- id : The unique ID of the credential.
- content : The credential (x509 certificate or foreseen SSI Verifiable Credential).
- credential_id: The Base58 of the credential content.
- issuance_date: The date and time when the credential was issued.
- expiry_date: The expiry date and time of the credential.
- keypair_id: the keypair linked to this credential. References the KeyPair entity.
- creation_timestamp : The creation timestamp.
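
Since credential_id is described as the Base58 of the credential content, a Base58 encoder is sketched below in Python. It assumes the widely used Bitcoin alphabet; this document does not state which alphabet is used, nor whether the raw credential bytes or a digest of them are encoded, so treat the function `b58encode` as illustrative only.

```python
# Bitcoin-style Base58 alphabet (assumption): no 0, O, I or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58, mapping each leading zero byte to '1'."""
    zeros = len(data) - len(data.lstrip(b"\x00"))
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    return "1" * zeros + encoded
```

Base58 avoids visually ambiguous characters, which makes the resulting identifier easier to copy and compare than Base64 or hexadecimal output of similar length.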

Identity Attribute

Description: When used inside a participant agent, it contains a local copy of the identity attributes of the data space, in sync with the identity attributes provided by the governance authority.
Attributes:
- id : The id of the identity attribute
- code : The unique code identifying the identity attribute
- name: The human-readable identity attribute name.
- description : The description of the identity attribute
- assignable_to_roles : Boolean indicating if the identity attribute is assignable to roles
- enable : Boolean indicating if the identity attribute is enabled or not
- assigned_to_participant : Boolean indicating if the identity attribute is currently assigned to the participant.
- creation_timestamp : The creation timestamp
- update_timestamp : The update timestamp

ParticipantInfo

Description: the details of the participant. Needed to retrieve the basic information related to the organization owning the agent. Populated only after the onboarding process has completed.
Attributes:
- id: the ID of the entry.
- participant_id: the ID of the participant owning the agent.
- organization: the name of the organisation of the participant owning the agent.
- authority_creation_timestamp: the creation timestamp of the participant inside the governance authority.
- authority_update_timestamp: the update timestamp of the participant inside the governance authority.
- creation_timestamp: the creation timestamp.
- update_timestamp: the update timestamp.

AutoRenewalConfig

Description : stores the auto-renewal config for the agent.
Attributes:
- id: the ID of the entry.
- enabled: indicates if the auto-renewal is enabled for the participant agent.

CredentialSyncExecutionError

Description: logs the execution errors that may happen during credential synchronisation with the Governance Authority.
Attributes:
- id: the ID of the entry.
- execution_timestamp: the execution timestamp of the attempted credential synchronisation.
- error_message: the error details.

2.20.2. LDM - Domain 2 - Publish and consume resources

LDM - Contract Manager

The Contract Manager handles the integration between the connector and the VC Issuer, Signer, and Wallet components.

In the current release, the Contract Manager stores contract agreement related data in a single table for two main purposes:

Establish the data persistence for billing purposes (future feature)
Demonstrate contract negotiation status

contract_agreements

contract_agreement_id: UUID of contract agreement issued by the Connector
contract_definition_id: ID of the contract definition
consumer_signature_date: Date and time of the consumer signature event
provider_signature_date: Date and time of the provider signature event
status: Status of contract negotiations
contract_negotiation_id: ID of the contract negotiation
asset_id: ID of the asset
provider_id: ID of the provider
consumer_id: ID of the consumer
contract_offer_id: ID of the contract offer

LDM - Infrastructure Provider Storage

Handles the logical representation of the infrastructure provider’s deployment scripts and their relationships, ensuring efficient storage, retrieval, and management.

Entity Descriptions and Attributes

Configuration

Description : Represents the companies offering infrastructure deployment services.
Attributes :
- id: Unique identifier for the configuration.
- file_name: Configuration name.
- configuration: Instruction containing the configuration.
- script_id: The deployment script that is bound to the configuration.

Deployment Script

Description : Stores details about deployment scripts used for infrastructure provisioning.
Attributes :
- id : Unique identifier for the script.
- title : Title of the script.
- description : A short description of the script.
- valid : Indicates if the script is valid.
- creation_date : Date when the script was uploaded.
- update_date : Last modification date.
- cloud_provider_id : Links to the Infrastructure Provider table.
- gitea_sha : Hash of the script in the repository.
- location : Location in the repository.
- content : Content of the deployment script.
- file_name : Name of the file that was uploaded.
- script_identify_id : Links to the Script Identify table.

Script Trigger

Description : Represents provisioning requests for deployment scripts.
Attributes :
- id : Unique identifier for the provisioning request.
- status : Status of the provisioning process (e.g., Received, Sent, Running).
- resource_status : Status of the provisioned resource (e.g., Provisioning, Activated).
- decommissioned_date_time : Decommissioning timestamp.
- key_user : Credential retrieval key part 1.
- key_vault : Credential retrieval key part 2.
- requester_email : Email address of the requester.
- provisioned_date_time : Provisioned timestamp.
- volume_id : Id of the Virtual Machine’s storage.
- datacenter_id : Id of the Datacenter where the Virtual Machine is running.
- error_message : Error message related to the provisioning/decommissioning process.
- requester_unique_id : Unique identifier for the resource request.
- script_id : Links to the Deployment Script table.

Script Identify

Description : Stores metadata for deployment scripts, such as hashes for integrity verification.
Attributes :
- id : Unique identifier for the metadata entry.
- deployment_script_id : Links to the Deployment Script table.
- hash : Hash of the deployment script.

Template :

Description : Stores VM template related data and references.
Attributes :
- id : Primary Key, unique identifier for each Template.
- cloud_environment_id : Reference to the Cloud Environment where the template will be running.
- cloud_provisioner_template_id : Reference to the Cloud Provider Template file and description of VM limits.
- name : Template name.
- description : Template description.
- cpu_core : The number of cores of the VM.
- ram : The amount of memory (MB) to be assigned to the VM.
- storage : The storage size (MB) of the VM.
- os : Name of the operating system to be installed.
- active : Indicates if the template is active (True) or inactive (False).
- creation_date : Date when template was stored.
- script_id : Reference to the Deployment Script created based on this template.

Component

Description : Represents a component that can be applied to a VM Template.
Attributes :
- id : Unique identifier for the component.
- name : Name of the component
- description: Description of the component
- content: Content of the component
- creation_date : Creation date of the component.
- update_date: Last update date of the component.
- active: Indicates whether the component is active.

Component Type

Description : Represents the types of components (VM Configuration, Post configuration and Policies).
Attributes :
- id : Unique identifier for the component type.
- name : component type name.

Cloud Provider :

Description : Simple and basic attributes for a cloud provider.
Attributes :
- id : Primary Key, unique identifier for each Provider
- cloud_provider_name : Cloud Provider name.

Cloud Environment :

Description : The main characteristics for one of many environments a cloud provider provides.
Attributes :
- id : Primary Key, unique identifier for each Cloud Environment
- cloud_provider_id : Reference to Cloud Provider stored in the database (e.g. Ionos, AWS…).
- environment_name : Environment name.
- environment_description : Environment description.
- iac : Infrastructure as Code technology to support the deployed resources.
- datacenter_name : Name of the DataCenter where the resources will be provisioned.
- datacenter_description : Description of the DataCenter.
- location : Cloud Environment location identifier (e.g., us-east-1, europe-west3).
- vault_path: Vault path where the cloud environment token is securely stored.
- total_cpu_cores : The number of total cores available for the Cloud Environment
- used_cpu_cores : The number of cores used by the Cloud Environment
- total_ram : The total amount of memory available for the Cloud Environment
- used_ram : RAM used by the Cloud Environment
- total_storage : The total amount of storage available for the Cloud Environment
- used_storage : Storage used by the Cloud Environment
- vault_user : Vault identity or role with permissions to access the cloud environment token.
- vault_key : Vault path or key where the cloud environment token is securely stored.
- active : Indicates whether the Cloud Environment is active.
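
The total and used counters suggest a simple capacity check before a resource is provisioned. The Python sketch below assumes that requests and counters share the same units and that an inactive environment accepts no new resources; the helpers `can_host` and `reserve` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CloudEnvironment:
    total_cpu_cores: int
    used_cpu_cores: int
    total_ram: int
    used_ram: int
    total_storage: int
    used_storage: int
    active: bool = True


def can_host(env: CloudEnvironment, cpu_cores: int, ram: int, storage: int) -> bool:
    """True when the environment is active and has room for the request."""
    if not env.active:
        return False
    return (env.used_cpu_cores + cpu_cores <= env.total_cpu_cores
            and env.used_ram + ram <= env.total_ram
            and env.used_storage + storage <= env.total_storage)


def reserve(env: CloudEnvironment, cpu_cores: int, ram: int, storage: int) -> None:
    """Increase the used counters, refusing requests that do not fit."""
    if not can_host(env, cpu_cores, ram, storage):
        raise ValueError("insufficient capacity or inactive environment")
    env.used_cpu_cores += cpu_cores
    env.used_ram += ram
    env.used_storage += storage
```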

Cloud Provisioner Template :

Description : The provisioner specific data to derive/create templates.
Attributes :
- id : Primary Key, unique identifier for each Cloud Provisioner Template.
- file_name : Template file title.
- file : File content of the Cloud Provider Template (Terraform/Crossplane)
- min_cpu_core : Minimum number of cores for the VM
- max_cpu_core : Maximum number of cores for the VM
- min_ram : Minimum RAM size (MB) for the VM
- max_ram : Maximum RAM size (MB) for the VM
- min_storage : Minimum storage size (GB) for the VM
- max_storage : Maximum storage size (GB) for the VM
- os : (List of) operating systems allowed for the creation of VMs
- is_ovh : Flag to identify OVH templates
- ovh_flavor : OVH flavor for the VM
- ovh_project_id : OVH project id related to the environment of the VM
- ovh_os_image_id : OVH OS image id for the VM
- ovh_region : OVH region where the VM will be running
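
The minimum and maximum values above bound the VM Templates derived from a Cloud Provisioner Template. The Python sketch below checks a template against those limits. It assumes that the Template storage is expressed in megabytes while the provisioner storage limits are in gigabytes, as the attribute lists suggest, and that 1 GB equals 1024 MB; the class names and the function `limit_violations` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

MB_PER_GB = 1024  # assumption: binary conversion between the two units


@dataclass
class ProvisionerLimits:
    min_cpu_core: int
    max_cpu_core: int
    min_ram: int        # MB
    max_ram: int        # MB
    min_storage: int    # GB
    max_storage: int    # GB
    os: List[str]       # allowed operating systems


@dataclass
class VmTemplate:
    cpu_core: int
    ram: int            # MB
    storage: int        # MB
    os: str


def limit_violations(template: VmTemplate, limits: ProvisionerLimits) -> List[str]:
    """Return the names of the template attributes that fall outside the limits."""
    problems = []
    if not limits.min_cpu_core <= template.cpu_core <= limits.max_cpu_core:
        problems.append("cpu_core")
    if not limits.min_ram <= template.ram <= limits.max_ram:
        problems.append("ram")
    storage_gb = template.storage / MB_PER_GB   # align units before comparing
    if not limits.min_storage <= storage_gb <= limits.max_storage:
        problems.append("storage")
    if template.os not in limits.os:
        problems.append("os")
    return problems
```

Returning the list of offending attributes, rather than a single boolean, lets a caller report every problem to the user in one pass.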

LDM - Sync Schema Adapter

Handles the logical representation of schema and update events

Entity Descriptions and Attributes

Schema

Description : Stores details about schema info and publication status
Attributes :
- id : Unique identifier for the schema.
- creation_date : Creation date of schema
- latest_version : Latest version available for the schema
- metadata : Metadata info for the schema
- name : Name of the schema as created on Schema Management Service GA side
- resource_type : Last modification date.
- status : Publication status of schema
- update_date : Last update date of specific schema

Schema Event

Description : Stores details about notification events related to publication and revoking of schema, produced by Schema Management Service
Attributes :
- id : Unique identifier for the schema event.
- changelog : Changelog info associated to event
- event_date : Date the event occurred
- event_id : Event id generated by schema management service
- event_type : Event type (PUBLISH | REVOKE)
- info : Last modification date.
- origin : System originating event
- processing_status : Processing status of event
- version : Schema version event refers to
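
How a PUBLISH or REVOKE event might update the locally stored schema can be sketched as below. This Python sketch is built on several assumptions that this document does not confirm: the `schema_name` field used to link an event to a schema, the status values PUBLISHED and REVOKED, the processing_status values PENDING, PROCESSED and FAILED, and the choice to mark the whole schema as revoked on a REVOKE event.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Schema:
    name: str
    latest_version: str
    status: str                          # hypothetical values: PUBLISHED, REVOKED


@dataclass
class SchemaEvent:
    event_id: str
    event_type: str                      # PUBLISH or REVOKE
    schema_name: str                     # hypothetical link to the Schema entity
    version: str
    processing_status: str = "PENDING"   # hypothetical values


def apply_event(schemas: Dict[str, Schema], event: SchemaEvent) -> None:
    """Apply one notification event to the local schema store."""
    if event.event_type == "PUBLISH":
        schemas[event.schema_name] = Schema(event.schema_name, event.version, "PUBLISHED")
    elif event.event_type == "REVOKE" and event.schema_name in schemas:
        schemas[event.schema_name].status = "REVOKED"
    else:
        # Unknown event type, or REVOKE for a schema that was never published.
        event.processing_status = "FAILED"
        return
    event.processing_status = "PROCESSED"
```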

5.2.3. Physical Data Model

Attributes labelled with “NN” are Not Null.

2.21.1. PDM - Domain 1 - Access Control & Trust

Contains the export of the physical data model of IAA Microservice. Please refer to the LDM - Domain 1 - Access Control & Trust for a description of entities and fields.

PDM - Onboarding

Postgres physical data model of the Onboarding service. It handles the onboarding of a new participant in the Data Space.

mime_type

id : The identifier of the MIME type.
value : A human-readable text that describes the MIME type (e.g. “pdf”, “zip”).
name : The actual MIME type value following the RFC6838 (e.g. “application/pdf”, “application/zip”).

participant_type

id : The identifier of the participant type.
value : The code of the participant type.
label : A human-readable name for the participant type.

onboarding_procedure_template

id : The template identifier.
description : A brief description of the onboarding procedure template (e.g. “The role of this participant in the dataspace is …”).
participant_type_id : The participant type the onboarding procedure template refers to. Foreign key to the participant_type table.
expiration_timeframe : An expiration timeframe after which the onboarding request is considered rejected, expressed in seconds.
expiration_timeframe_timeunit : The time unit for the expiration timeframe (HOUR, DAY, YEAR).
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

onboarding_procedure_template_identity_attribute

onboarding_procedure_template_id : The identifier of the onboarding procedure template. Foreign key to the onboarding_procedure_template table.
identity_attribute_code : The code of the identity attribute mapped to the onboarding procedure template.

document_template

id : The template identifier.
name : The short name of the document template.
description : A brief description of the requested document (e.g. “Business License”, “Proof of Identity”).
mandatory : Specifies if the document template has to be provided or is optional. Defaults to true.
mime_type_id : The document MIME type. Foreign key to the mime_type table.

onboarding_applicant

id : The identifier of the applicant.
username : The username of the user (same as the one in Keycloak).
firstname : User’s first name.
lastname : User’s last name.

onboarding_request

id : The identifier of the onboarding request.
onboarding_procedure_template_id : The onboarding procedure template that the onboarding request refers to. Foreign key to the onboarding_procedure_template table.
onboarding_status_id : The status of the onboarding request. Foreign key to the onboarding_status table.
expiration_timeframe : An expiration timeframe after which the onboarding request is considered rejected.
expiration_timeframe_timeunit : The time unit for the expiration timeframe (HOUR, DAY, YEAR).
participant_type_id : The participant type, copied from the related onboarding procedure template. Foreign key to the participant_type entity.
participant_id : The participant’s identifier. Populated when the onboarding request is approved and the participant is created.
rejection_cause : The text explaining why the request is rejected.
onboarding_applicant_id : The identifier of the applicant representative that created the onboarding request. Foreign key to the onboarding_applicant entity.
organization : The name of the organisation that opened this onboarding request through the applicant representative.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

onboarding_request_identity_attribute

onboarding_request_id : The identifier of the onboarding request. Foreign key to the onboarding_request table.
identity_attribute_code : The code of the identity attribute mapped to the onboarding procedure template.

document

id : The identifier of the document.
description : A brief description of the requested document (e.g. “Business License”, “Proof of Identity”).
document_template_id : The document template in the onboarding procedure to which this document refers. Foreign key to the document_template table. Can be null if the document is requested during onboarding of the applicant participant.
onboarding_request_id : The identifier of the onboarding request. Foreign key to the onboarding_request entity.
mime_type_id : The document type. Foreign key to the mime_type table.
content : The actual content of the document uploaded by the applicant dataspace participant during the request creation or editing. Null if not uploaded yet.
fileSize : The size of the uploaded file.
filename : The name of the uploaded file.

comment

id : The identifier of the comment.
onboarding_request_id : The identifier of the onboarding request to which the comment belongs. Foreign key to the onboarding_request table.
author : The author of the comment. It’s the username stored in Keycloak.
content : The comment written by the author.

onboarding_status

id : The id of the status.
value : The actual status of an onboarding request.
label : A human-readable label for the status.

event_log

id : The identifier of the event.
onboarding_request_id : The identifier of the related onboarding request. Foreign key to the onboarding_request table.
initiator_user_id : The identifier of the user that caused the event.
initiator_service : The identifier of the component or service that caused the event (e.g. background service monitoring stale requests).
event_type : Type of event (e.g. COMMENT_INSERTED, STATUS_CHANGED).
event_details : Additional JSON metadata that contains details about the event (e.g. new state).
entity_id : The id of the entity related to the event (e.g. the id of the comment if event_type is COMMENT_INSERTED).
creation_timestamp : The creation timestamp of the event.

validation_rule

id : The identifier of the validation rule.
name : The short name of the validation rule.
description : A detailed description of the validation rule.
document_template_id : The identifier of the document template to which the rule applies. Foreign key to the document_template table.
onboarding_procedure_template_id : The onboarding procedure template where the rule has been created. Foreign key to the onboarding_procedure_template table.
valid_since : The date from which the rule becomes valid and must be evaluated.
valid_to : The date until which the rule remains valid and must be evaluated.
active : Boolean parameter indicating if the rule is active. An inactive rule is not evaluated.
type : The validation rule type (CONTENT_CHECK, PRESENCE, COMPOSITE).
auto_approval : Flag indicating that if the rule passes, the related onboarding request is automatically approved.
required : Flag indicating that if the rule does not pass, the related onboarding request is automatically rejected.
content_validation_rule : When the type is CONTENT_CHECK, contains the URL to an external validation service.
strategy : The evaluation strategy for a composite rule. ALL means all child rules must be valid; AT_LEAST_ONE means only one rule must be valid.
parent_id : The id of the parent COMPOSITE rule if the rule is a child rule. Foreign key to the validation_rule table.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

validation_rule_execution

id : The identifier of the execution.
validation_rule_id : The identifier of the validation rule used for this execution. Foreign key to the validation_rule table.
document_id : The ID of the document on which the validation was performed. Foreign key to the document table.
onboarding_request_id : The onboarding request this execution is related to. Foreign key to the onboarding_request table.
execution_start_date : The start date and time of the rule execution.
execution_end_date : The end date and time of the rule execution.
status : The outcome of the validation (SUCCESS, IN PROGRESS, ERROR, FAULT).
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

validation_rule_execution_remark

id : The identifier of the remark.
execution_id : The identifier of the validation rule execution to which the remark refers. Foreign key to the validation_rule_execution table.
jsonb : An unstructured field containing the remark in JSONB format.

PDM - Users Roles

Postgres physical data model of the User Roles Microservice. It helps to map tier 1 roles with tier 2 security attributes.

identity_attribute_roles

id : The unique ID of the attribute to role association.
ida_code : The unique identity attribute code.
role_name : The role name mapped to the attribute. The role name references a role defined inside the tier1 authentication provider.
enabled : Flag indicating if the association between the identity attribute and the role is valid.

role_request

id: The unique ID of the roles request.
user_email: the email of the user who requested the roles.
status: the status of the request (open, cancelled, approved, rejected).
reviewed_by: the id of the user that reviewed the request.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

role_requested

id: The unique ID of the requested role.
role_request_id: The id of the parent role request. Foreign key to the role_request table.
role: The code of the requested role.
requested_by: Either the id of the end user who requested the role or the id of the approver who added the role to the existing role request.
approved: Flag indicating whether the role was included when the parent role request was approved.
requested_timestamp : The request timestamp.
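
The relationship between a role request and its requested roles can be sketched as follows. The sketch assumes that only a request in the open status can be cancelled, approved or rejected; this constraint, the class name and the role codes are assumptions and are not stated in the data model.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class RoleRequest:
    """Illustrative view of a role_request row together with its role_requested rows."""
    user_email: str
    roles: Dict[str, bool]            # role code -> role_requested.approved
    status: str = "open"
    reviewed_by: Optional[str] = None

    def _close(self, new_status: str) -> None:
        # Assumption: only an open request can change status.
        if self.status != "open":
            raise ValueError(f"request is already {self.status}")
        self.status = new_status

    def cancel(self) -> None:
        self._close("cancelled")

    def reject(self, reviewer: str) -> None:
        self._close("rejected")
        self.reviewed_by = reviewer

    def approve(self, reviewer: str, selected: Set[str]) -> None:
        """Approve the request; only the selected roles are flagged as approved."""
        self._close("approved")
        self.reviewed_by = reviewer
        for code in self.roles:
            self.roles[code] = code in selected


request = RoleRequest("user@example.org", {"ROLE_A": False, "ROLE_B": False})
request.approve("reviewer-1", {"ROLE_A"})
```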

role

id: The unique ID of the role.
code: The role code.
name: The role’s human-readable name.
description: The role’s human-readable description.
builtin: Boolean indicating if the role is built-in (default) for Simpl-Open.
enabled: Boolean indicating if the role can be assigned to a user and can be included in their session after authentication.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

PDM - Security Attributes Provider

Postgres physical data model for the Security Attributes Microservice. It contains the association between participants and identity attributes. As the core of the Dynamic Attribute Provisioning approach, it is queried by the identity provider to build the Ephemeral Proofs for onboarded data space participants. It is deployed only by the Data Space Governance Authority.

identity_attribute

id : The unique ID of the identity attribute.
code : The unique identity attribute code.
name : The human-readable identity attribute name.
description : The human-readable identity attribute description.
assignable_to_role : Flag indicating if the identity attribute can be assigned to a role.
enabled : Flag indicating if the identity attribute is enabled for the dataspace.
is_right : Flag indicating if the identity attribute is considered a legal right (currently not used).
built_in: Flag indicating that the identity attribute is built-in (installed with the agent and not modifiable).
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

participant_identity_attribute

participant_id : The ID of the participant (owned by the identity provider component).
identity_attribute_id : The ID of the identity attribute associated with the participant. Foreign key to the identity_attribute table.

PDM - Identity Provider

Postgres physical data model for the Identity Provider Microservice. It contains the participant information, the participant’s Certificate Signing Request (CSR) and the participant’s tier 1 public key. It is deployed only by the Data Space Governance Authority.

participant

id : The unique ID of the participant.
participant_type : The type of the participant.
organization : The organisation name of the participant.
certificate_signing_request_content : The content of the CSR needed to issue a credential to the participant
tier1_public_key_content : Contains the tier 1 public key (Keycloak public key) used by the participant Keycloak to sign user tier1 JWTs.
active_credential_id : The id of the participant’s active credential. Foreign key to the credential table.
is_authority : Boolean indicating that the participant is the Governance Authority of the data space.
renewal_request_timestamp : The timestamp of the renewal request issuance by the participant.
applicant_email : The email of the applicant responsible for the onboarding procedure of the participant.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

credential

id : The id of the credential
participant_id : The id of the participant owning the credential. Foreign key to the participant table.
credential_type: Type of the credential, currently only x509 credentials are supported
certificate_authority : The certificate authority name of the credential factory component (EJBCA)
serial: The serial number of the credential in the credential factory component (EJBCA)
credential_id: The id of the credential inside the credential factory component (EJBCA)
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

auto_renewal_default

id: the id of the auto-renewal configuration.
days_before_expiry: the number of days prior to the credential’s expiration at which the auto-renewal process is triggered by the scheduled job.
modified_by_user: indicates if the default installation configuration has been overwritten by a user of the Governance Authority.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.

auto_renewal_participant

participant_id: the id of the participant. Foreign key to the participant table.
days_before_expiry: the number of days prior to the credential’s expiration at which the auto-renewal process for the participant is triggered by the scheduled job.
boolean: indicates if the auto renewal is enabled for the participant.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
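
The following sketch shows one way the scheduled job could decide whether a credential is due for renewal. It assumes that a participant-specific days_before_expiry value takes precedence over the default configuration and that participants with auto-renewal disabled are skipped; both rules, and the `renewal_due` helper itself, are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional


def renewal_due(expiry: date, today: date, default_days: int,
                participant_days: Optional[int] = None,
                participant_enabled: bool = True) -> bool:
    """Return True when today falls inside the renewal window before expiry."""
    if not participant_enabled:
        return False
    # Assumption: the participant-level setting overrides the default one.
    days = participant_days if participant_days is not None else default_days
    return today >= expiry - timedelta(days=days)


expiry = date(2026, 1, 31)
due_with_default = renewal_due(expiry, date(2026, 1, 2), default_days=30)
not_yet_due = renewal_due(expiry, date(2025, 12, 31), default_days=30)
due_with_override = renewal_due(expiry, date(2025, 12, 31), default_days=30,
                                participant_days=45)
```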

auto_renewal_error

id: the id of the logged error.
participant_id: the id of the participant for whom the auto-renewal has failed.
description: the description of the error.
creation_timestamp : The creation timestamp.

PDM - Authentication Provider

Postgres physical data model for the authentication provider microservice. It contains the keypair and the credentials that the participant uses to communicate with other participants. It also contains a local copy of the dataspace identity attributes.

keypair

id : The unique ID of the keypair.
name: The name of the KeyPair, inserted by the user upon creation.
active: Boolean indicating that the keypair is linked to an active credential.
public_key : The keypair public key content.
public_key_hash: The keypair public key hash.
certificate_signing_request: The content of the CSR, needed for requesting the issuance of a new credential linked to the keypair

private_key

id: The ID of the private key.
private_key: The private key encrypted content.
keypair_id: The keypair linked to this key. Foreign key to the keypair table.

credential

id : The unique ID of the credential.
content : The credential (x509 certificate or foreseen SSI Verifiable Credential).
issuance_date: The date and time when the credential was issued.
expiry_date: The expiry date and time of the credential.
keypair_id: the keypair linked to this credential. Foreign key to the keypair table.

identity_attribute

id : The id of the identity attribute
code : The unique code identifying the identity attribute
description : The description of the identity attribute
assignable_to_roles : Boolean indicating if the identity attribute is assignable to roles
enable : Boolean indicating if the identity attribute is enabled or not
assigned_to_participant : Boolean indicating if the identity attribute is currently assigned to the participant.
creation_timestamp : The creation timestamp
update_timestamp : The update timestamp

participant_info

id: the ID of the entry
participant_id: the ID of the participant owning the agent.
organization: the name of the organization of the participant owning the agent
authority_creation_timestamp: the creation timestamp of the participant inside the governance authority
authority_update_timestamp: the update timestamp of the participant inside the governance authority
creation_timestamp: the creation timestamp
update_timestamp: the update timestamp

automatic_renewal_config

id: the ID of the entry
active: indicates if the auto-renewal is enabled for the participant agent.

credential_sync_execution_error

id: the ID of the entry
execution_timestamp: the execution timestamp of the attempted credential synchronisation.
error_message: the error details.

2.21.2. PDM - Domain 2 - Publish and consume resources

PDM - Contract Manager

DBML

Table contract_agreements {

contract_agreement_id UUID [primary key]

contract_definition_id uuid

consumer_signature_date timestamptz

provider_signature_date timestamptz

status text

contract_negotiation_id text

asset_id text

provider_id text

consumer_id text

}

PDM - Infrastructure Provider Storage

Infrastructure Provider - PDM

Table infrastructure_provider {

id BIGINT [pk]

name VARCHAR(255)

}

Table deployment_script {

id BIGINT [pk, increment]

valid BOOLEAN

creation_date DATE

description VARCHAR(255)

file OID

gitea_sha VARCHAR(255)

location VARCHAR(255)

original_file_name VARCHAR(255)

title VARCHAR(100)

update_date DATE

cloud_provider_id BIGINT [ref: > infrastructure_provider.id]

script_identify_id BIGINT [ref: > script_identify.id, unique]

}

Table script_trigger {

id BIGINT [pk, increment]

decommissioned_date_time TIMESTAMP

provisioned_date_time TIMESTAMP

key_user VARCHAR(255)

key_vault VARCHAR(255)

requester_email VARCHAR(255)

requester_unique_id VARCHAR(255)

resource_status VARCHAR(255)

status VARCHAR(255)

script_id BIGINT [ref: > deployment_script.id]

volume_id VARCHAR(255)

datacenter_id VARCHAR(255)

error_message TEXT

}

Table script_identify {

id BIGINT [pk]

deployment_script_id VARCHAR(50) [unique]

hash VARCHAR(255)

}

Table config_file {

id BIGINT [pk, increment]

file_name VARCHAR(255)

file TEXT

script_id BIGINT [ref: > deployment_script.id]

}

Table template {

id bigserial

cloud_environment_id bigint [ref: > cloud_environment.id]

cloud_provisioner_template_id bigint [ref: > cloud_provisioner_template.id]

name varchar(100)

description text

cpu_core integer

ram integer

storage integer

os varchar(50)

active boolean

creation_date date

script_id bigint [ref: > script.id]

}

Table cloud_environment {

id bigserial

cloud_provider_id bigint [ref: > cloud_provider.id]

environment_name varchar(100)

environment_description text

iac varchar(50)

datacenter_name varchar(100)

datacenter_description text

location varchar(100)

vault_path varchar(255)

total_cpu_cores integer

used_cpu_cores integer

total_ram integer

used_ram integer

total_storage integer

used_storage integer

vault_user varchar(255)

vault_key varchar(255)

}

Table cloud_provider {

id bigint

cloud_provider_name varchar(255)

}

Table cloud_provisioner_template {

id bigserial

file_name varchar(100)

file text

min_cpu_core integer

max_cpu_core integer

min_ram integer

max_ram integer

min_storage integer

max_storage integer

os text[]

cloud_provisioner_template_uuid varchar(50)

is_ovh boolean

ovh_flavor varchar(255)

ovh_project_id varchar(255)

ovh_os_image_id varchar(255)

ovh_region varchar(255)

}

PDM - Sync Schema Adapter

SQL

CREATE TABLE IF NOT EXISTS public.schema

(

i_id bigint NOT NULL GENERATED ALWAYS AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 9223372036854775807 CACHE 1 ),

d_creation_date timestamp(6) without time zone NOT NULL,

s_latest_version character varying(50) COLLATE pg_catalog."default" NOT NULL,

s_metadata character varying(255) COLLATE pg_catalog."default" NOT NULL,

s_name character varying(250) COLLATE pg_catalog."default" NOT NULL,

s_resource_type character varying(50) COLLATE pg_catalog."default" NOT NULL,

s_status character varying(25) COLLATE pg_catalog."default" NOT NULL,

d_update_date timestamp(6) without time zone,

CONSTRAINT schema_pkey PRIMARY KEY (i_id),

CONSTRAINT schema_un1 UNIQUE (s_name)

);

CREATE TABLE IF NOT EXISTS public.schema_event

(

i_id bigint NOT NULL GENERATED ALWAYS AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 9223372036854775807 CACHE 1 ),

s_changelog character varying(250) COLLATE pg_catalog."default",

d_event_date timestamp(6) without time zone NOT NULL,

s_event_id character varying(250) COLLATE pg_catalog."default" NOT NULL,

s_event_type character varying(25) COLLATE pg_catalog."default" NOT NULL,

s_info character varying(2000) COLLATE pg_catalog."default",

s_origin character varying(25) COLLATE pg_catalog."default" NOT NULL,

s_processing_status character varying(25) COLLATE pg_catalog."default" NOT NULL,

s_version character varying(50) COLLATE pg_catalog."default",

i_schema_id bigint NOT NULL,

CONSTRAINT schema_event_pkey PRIMARY KEY (i_id),

CONSTRAINT schema_name_un1 UNIQUE (s_event_id),

CONSTRAINT fklpbr0w32che4eibn92v5c464 FOREIGN KEY (i_schema_id)

REFERENCES public.schema (i_id) MATCH SIMPLE

ON UPDATE NO ACTION

ON DELETE CASCADE

);
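
The effect of the ON DELETE CASCADE clause on schema_event can be illustrated with a reduced replica of the two tables. The sketch uses an in-memory SQLite database instead of PostgreSQL, keeps only the columns needed for the demonstration, and uses made-up sample values; it is not the adapter's actual persistence code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Reduced replica of public.schema and public.schema_event (most columns omitted).
conn.executescript("""
CREATE TABLE "schema" (
    i_id   INTEGER PRIMARY KEY,
    s_name TEXT NOT NULL UNIQUE
);
CREATE TABLE schema_event (
    i_id        INTEGER PRIMARY KEY,
    s_event_id  TEXT NOT NULL UNIQUE,
    i_schema_id INTEGER NOT NULL REFERENCES "schema" (i_id) ON DELETE CASCADE
);
""")

conn.execute('INSERT INTO "schema" (i_id, s_name) VALUES (1, ?)', ("example-schema",))
conn.executemany(
    "INSERT INTO schema_event (s_event_id, i_schema_id) VALUES (?, 1)",
    [("evt-1",), ("evt-2",)],
)
events_before = conn.execute("SELECT COUNT(*) FROM schema_event").fetchone()[0]

# Deleting the parent schema row removes its events through the cascade.
conn.execute('DELETE FROM "schema" WHERE i_id = 1')
events_after = conn.execute("SELECT COUNT(*) FROM schema_event").fetchone()[0]
```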

6. Simpl-Open Technology Architecture

The Simpl-Open Technology Architecture defines the target technology architecture that enables the application architecture to be delivered through technology components and technology services. Each application component is mapped to the technology that implements its capabilities.

It identifies technology components through the following views:

View	Description
Technology Components Static View	Provides, per application service, an enriched view of the Application Components Static View by adding technology components that support the implementation of the application components.
Technology Components Dynamic View	Provides a dynamic view (sequence diagrams) per business process (or sub-process) of how technology components are used to satisfy different workflows.
Technology Deployment View	Provides an aggregated view of how the different technology components (cross BPs and domains) are deployed for all Simpl-Open agent types (Governance Authority, Data Provider, Infrastructure Provider, Application Provider, Consumer).

Next to these architecture views, the following are provided:

A table of OSS technologies, with the reasons for selecting them and links to existing documentation such as data models and installation guides;
Detailed technical specifications that are particularly relevant for contributing to Simpl-Open and/or implementing it in a data space.

6.1. Technology Components Views

Technology components views are presented per functional domain in the following sub-sections.

For each functional domain, the following are presented:

a static view of the entire domain which enriches the application components view with the technologies that are implementing the components;
a set of dynamic views (sequence diagrams) that present how a subset of the technology components are used to satisfy different (parts of) business processes.

6.1.1. TCV - Domain 1 - Access Control & Trust

2.22.1. TCV - Domain 1 - Access Control & Trust - Static Views

This perspective illustrates the correlation between the architectural elements and the technologies, components, and interfaces intended for use in implementing the application components.

TCV Static - Authorisation Service

Authorisation

The Authorisation Tier 1 RBAC component is implemented with Spring Cloud Gateway.
The Authorisation Tier 2 ABAC component is implemented with Spring Cloud Gateway.
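
The difference between the two authorisation tiers can be sketched as two predicates. The sketch is written in Python for brevity, whereas the actual components are implemented with Spring Cloud Gateway. The matching semantics (any required role for RBAC, every required attribute for ABAC) and all role and attribute names are assumptions made for illustration.

```python
from typing import Iterable


def rbac_allows(user_roles: Iterable[str], required_roles: Iterable[str]) -> bool:
    """Tier 1 (RBAC): allow when the user holds at least one of the required roles."""
    return bool(set(user_roles) & set(required_roles))


def abac_allows(identity_attributes: Iterable[str],
                required_attributes: Iterable[str]) -> bool:
    """Tier 2 (ABAC): allow when the calling participant holds every required attribute."""
    return set(required_attributes) <= set(identity_attributes)


tier1_ok = rbac_allows(["ROLE_X"], ["ROLE_X", "ROLE_Y"])
tier2_ok = abac_allows(["ATTR_A", "ATTR_B"], ["ATTR_A"])
tier2_denied = abac_allows(["ATTR_A"], ["ATTR_A", "ATTR_C"])
```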

TCV Static - Identity Attributes Service

Security Attributes Provider

The Attributes Management component is implemented as a Java backend application.
The Security Attributes Provider UI component is implemented as an Angular frontend application.
The Attributes Database is implemented in PostgreSQL.

TCV Static - Identity Provider Service

Identity Provider

The Credential Management service is implemented as a Java backend application.
The Credential Verification service is implemented as a Java backend application.
The Identity Provider UI component is implemented as an Angular frontend application.
The Credential Factory component is implemented with Enterprise JavaBeans Certificate Authority (EJBCA).
The Identity Database is implemented in PostgreSQL.

TCV Static - Onboarding Service

Onboarding

The Onboarding Manager component is implemented as a Java backend application.
The Onboarding UI component is implemented as an Angular frontend application.
The Onboarding Database component is implemented in PostgreSQL.

Document Validation

The Document Validation Service is provided by the governance authority and can be implemented in any language. The implemented API must be exposed through a well-established OpenAPI contract (Document Validation API).

TCV Static - Tier 1 Authentication Service

Tier 1 Authentication Provider

The Users Management component, providing the Agent Users Management and Local IDP Federation services, is implemented with Keycloak.
The Tier 1 Authentication Provider UI component is implemented as an Angular frontend application.
The User Database is implemented in PostgreSQL.
The Authenticator Plugin is a custom Keycloak SPI that adds custom claims to Tier 1 JWTs.

TCV Static - Tier 2 Authentication Service

Tier 2 Authentication Provider

The Credential Management component is implemented as a Java backend application.
The Tier 2 Authentication Provider UI component is implemented as an Angular frontend application.

Credentials Database/Vault

The Credentials Database/Vault is implemented with either PostgreSQL or HashiCorp Vault. An application configuration setting selects which one is used.
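
A configuration-driven choice between two storage backends can be sketched as follows. The configuration key `credential_store`, its values and the `InMemoryStore` stand-in are hypothetical; the actual configuration property names are not given in this document.

```python
from typing import Callable, Dict


class InMemoryStore:
    """Stand-in for a real backend; it only records which backend it represents."""

    def __init__(self, backend: str) -> None:
        self.backend = backend
        self._items: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value

    def get(self, key: str) -> bytes:
        return self._items[key]


# Hypothetical registry: configuration value -> factory for the matching backend.
BACKENDS: Dict[str, Callable[[], InMemoryStore]] = {
    "postgres": lambda: InMemoryStore("postgres"),
    "vault": lambda: InMemoryStore("vault"),
}


def create_store(config: dict) -> InMemoryStore:
    """Pick the credential store named by the (hypothetical) configuration key."""
    name = config.get("credential_store", "postgres")
    if name not in BACKENDS:
        raise ValueError(f"unknown credential store: {name}")
    return BACKENDS[name]()


store = create_store({"credential_store": "vault"})
store.put("private-key", b"encrypted-bytes")
```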

TCV Static - User Management Service

User & Roles

The User & Roles Management component is implemented as a Java backend application.
The User & Roles UI component is implemented as an Angular frontend application.
The User & Roles Database is implemented in PostgreSQL.

2.22.2. TCV - Domain 1 - Access Control & Trust - Dynamic Views

TCV Dynamic - BP 03A - Onboarding of a new data space Participant - Providers (data - application - infrastructure) & Consumers

This perspective illustrates the interactions and the flows between all the technological components.

Applicant Onboarding Request Submission

sequenceDiagram

actor applicant as Applicant

participant obui as Onboarding UI

participant t1 as Tier 1 Gateway

participant ob as Onboarding

participant sap as Security Attributes Provider

participant uar as Users & Roles

participant idp as Identity Provider

participant kc as Keycloak

applicant ->> obui: create onboarding request

activate obui

obui ->> t1: create onboarding request

t1 ->> ob: request

ob ->> sap: fetch identity attributes

sap -->> ob: identity attributes

ob ->> ob: save onboarding request

ob ->> uar: create user

uar ->> kc: create user with credentials

kc -->> uar: user details

uar -->> ob: user details

ob -->> t1: onboarding request

t1 -->> obui: request

obui -->> applicant: ok

deactivate obui

applicant ->> obui: login

activate obui

obui ->> kc: login using temporary credentials

kc -->> obui: login status

obui -->> applicant: ok

deactivate obui

applicant ->> obui: submit

activate obui

obui ->> t1: Submit Onboarding Request

t1 ->> ob: Request

ob -->> t1: Ok, submitted

t1 -->> obui: submitted

obui -->> applicant: ok

deactivate obui

box Governance Authority

participant obui

participant t1

participant ob

participant sap

participant uar

participant idp

participant kc

end

Governance Authority Onboarding Review

sequenceDiagram

actor gar as GA Representative

participant obui as Onboarding UI

participant t1 as Tier 1 Gateway

participant ob as Onboarding

participant sap as Security Attributes Provider

participant uar as Users & Roles

participant idp as Identity Provider

participant kc as Keycloak

gar ->> obui: login

activate obui

obui ->> kc: login using temporary credentials

kc -->> t1: login status

t1 -->> obui: status

obui -->> gar: ok

deactivate obui

gar ->> obui: Approve/Reject

activate obui

alt Reject

obui ->> t1: reject

t1 ->> ob: reject

ob -->> t1: Ok, rejection completed

t1 -->> obui: completed

end

alt Request revision

obui ->> t1: request revision

t1 ->> ob: revision

ob -->> t1: Ok, revision requested

t1 -->> obui: requested

end

alt Approve

obui ->> t1: Approve

t1 ->> ob: Approve

ob ->> idp: create participant

idp -->> ob: participantId

ob ->> sap: assign identity attributes

sap -->> ob: attribute assigned

ob -->> t1: Ok, approved

t1 -->> obui: approved

end

obui -->> gar: review completed

deactivate obui

box Governance Authority

participant obui

participant t1

participant ob

participant sap

participant uar

participant idp

participant kc

end
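
The three review outcomes in the diagram above can be summarised in a short orchestration sketch. The stub classes, method names and status values are hypothetical and exist only to make the example self-contained; the real Onboarding, Identity Provider and Security Attributes Provider APIs are not described here.

```python
from typing import Dict, List


class IdentityProviderStub:
    """Hypothetical stand-in for the Identity Provider's participant creation call."""

    def __init__(self) -> None:
        self.participants: Dict[str, str] = {}

    def create_participant(self, organization: str) -> str:
        participant_id = f"participant-{len(self.participants) + 1}"
        self.participants[participant_id] = organization
        return participant_id


class SecurityAttributesProviderStub:
    """Hypothetical stand-in for the Security Attributes Provider's assignment call."""

    def __init__(self) -> None:
        self.assignments: Dict[str, List[str]] = {}

    def assign(self, participant_id: str, attribute_codes: List[str]) -> None:
        self.assignments[participant_id] = list(attribute_codes)


def review(request: dict, decision: str, idp, sap) -> dict:
    """Apply one review decision to an onboarding request (status names are illustrative)."""
    if decision == "reject":
        request["status"] = "REJECTED"
    elif decision == "request_revision":
        request["status"] = "REVISION_REQUESTED"
    elif decision == "approve":
        # Approval creates the participant first, then assigns its identity attributes.
        participant_id = idp.create_participant(request["organization"])
        sap.assign(participant_id, request["identity_attributes"])
        request["participant_id"] = participant_id
        request["status"] = "APPROVED"
    else:
        raise ValueError(f"unknown decision: {decision}")
    return request


idp, sap = IdentityProviderStub(), SecurityAttributesProviderStub()
request = {"organization": "Example Org", "identity_attributes": ["ATTR_A"],
           "status": "SUBMITTED"}
approved = review(request, "approve", idp, sap)
```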

Applicant installs credentials

sequenceDiagram

participant kcp as Keycloak

participant uarp as Users & Roles

participant hc as Hashicorp Vault

participant authp as Authentication Provider

participant t1p as Tier 1 Gateway

participant pui as Participant Utility UI

actor applicant as Applicant

participant obui as Onboarding UI

participant t1 as Tier 1 Gateway

participant t2 as Tier 2 Gateway

participant sap as Security Attributes Provider

participant uar as Users & Roles

participant idp as Identity Provider

participant kc as Keycloak

participant EJBCA as EJBCA

applicant ->> pui: login

activate pui

pui ->> kcp: login using temporary credentials

kcp -->> pui: login status

pui -->> applicant: ok

deactivate pui

applicant ->> pui: generate keypair

activate pui

pui ->> t1p: generate keypair

t1p ->> authp: generate keypair

authp ->> authp: generate keypair

authp ->> hc: store private key

hc -->> authp: ok

authp -->> t1p: ack keypair generated

t1p -->> pui: ack keypair generated

pui -->> applicant: ack keypair generated

deactivate pui

applicant ->> pui: generate CSR

activate pui

pui ->> t1p: generate CSR

t1p ->> authp: generate CSR

authp ->> authp: generate CSR

authp -->> t1p: CSR

t1p -->> pui: CSR

pui -->> applicant: CSR

deactivate pui

applicant ->> obui: login

activate obui

obui ->> kc: login using temporary credentials

kc -->> obui: login status

obui -->> applicant: ok

deactivate obui

applicant ->> obui: upload CSR

activate obui

obui ->> t1: upload CSR

t1 ->> idp: CSR

idp ->> EJBCA: generateCredential

EJBCA -->> idp: credential

idp -->> t1: credential

t1 -->> obui: credential

obui -->> applicant: credential

deactivate obui

applicant ->> pui: upload credential

activate pui

pui ->> t1p: upload credential

t1p ->> authp: upload credential

authp ->> hc: store credential

hc -->> authp: ok, credential stored

authp -->> t1p: ok, credential stored

t1p -->> pui: ok, credential stored

pui -->> applicant: ok, credential stored

deactivate pui

authp -->> uarp: credential installed (EVENT)

activate uarp

uarp ->> kcp: get Tier 1 public key

kcp -->> uarp: Tier 1 public key

uarp ->> t2: upload Tier 1 public key

t2 ->> idp: upload Tier 1 public key

idp -->> t2: ok, public key stored

t2 -->> uarp: ok, public key stored

deactivate uarp

box Governance Authority

participant obui

participant t1

participant t2

participant sap

participant uar

participant idp

participant kc

participant EJBCA

end

box Participant

participant t1p

participant pui

participant hc

participant authp

participant uarp

participant kcp

end

TCV Dynamic - BP 03B - Participant User and Roles Configuration

This perspective illustrates the interactions and the flows between all the technological components.

Configure identity provider federation

sequenceDiagram

actor u as User

participant t1 as Tier 1 Authorization provider (Keycloak)

participant idp as Organization IdP

u ->> t1: login

activate t1

t1 -->> u: login status

deactivate t1

u ->> t1: configure IdP connection

activate t1

t1 ->> idp: IdP federation

idp -->> t1: federation set up completed

t1 -->> u: ok, federation completed

deactivate t1

Configure users and roles

sequenceDiagram

actor u as End user

participant urui as Users & Roles UI (Typescript - Angular)

participant t1 as Tier 1 Gateway (Java - Spring Cloud Gateway)

participant ur as Users & Roles (Java - Spring Boot)

participant kc as Tier 1 Authorization provider (Keycloak)

u ->> urui: login

activate urui

urui ->> kc: login using credentials

kc -->> t1: login status

t1 -->> urui: status

urui -->> u: ok

deactivate urui

alt Create end user

u ->> urui: create end user

activate urui

urui ->> t1: create end user

t1 ->> ur: create end user

ur ->> kc: create end user

kc -->> ur: ok, end user created

ur -->> t1: ok, end user created

t1 -->> urui: ok, end user created

urui -->> u: ok, end user created

deactivate urui

end

alt Update end user

u ->> urui: update end user

activate urui

urui ->> t1: update end user

t1 ->> ur: update end user

ur ->> kc: update end user

kc -->> ur: ok, end user updated

ur -->> t1: ok, end user updated

t1 -->> urui: ok, end user updated

urui -->> u: ok, end user updated

deactivate urui

end

alt Delete end user

u ->> urui: delete end user

activate urui

urui ->> t1: delete end user

t1 ->> ur: delete end user

ur ->> kc: delete end user

kc -->> ur: ok, end user deleted

ur -->> t1: ok, end user deleted

t1 -->> urui: ok, end user deleted

urui -->> u: ok, end user deleted

deactivate urui

end

alt Enable/Disable end user

u ->> urui: enable/disable end user

activate urui

urui ->> t1: enable/disable end user

t1 ->> ur: enable/disable end user

ur ->> kc: enable/disable end user

kc -->> ur: ok, end user enabled/disabled

ur -->> t1: ok, end user enabled/disabled

t1 -->> urui: ok, end user enabled/disabled

urui -->> u: ok, end user enabled/disabled

deactivate urui

end

alt Assign roles to end user

u ->> urui: assign roles to end user

activate urui

urui ->> urui: select roles

urui ->> t1: assign roles to end user

t1 ->> ur: assign roles to end user

ur ->> kc: assign roles to end user

kc -->> ur: ok, roles assigned

ur -->> t1: ok, roles assigned

t1 -->> urui: ok, roles assigned

urui -->> u: ok, roles assigned

deactivate urui

end

alt Create role

u ->> urui: create role

activate urui

urui ->> t1: create role

t1 ->> ur: create role

ur ->> kc: create role

kc -->> ur: ok, role created

ur -->> t1: ok, role created

t1 -->> urui: ok, role created

urui -->> u: ok, role created

deactivate urui

end

alt Update role

u ->> urui: update role

activate urui

urui ->> t1: update role

t1 ->> ur: update role

ur ->> kc: update role

kc -->> ur: ok, role updated

ur -->> t1: ok, role updated

t1 -->> urui: ok, role updated

urui -->> u: ok, role updated

deactivate urui

end

alt Delete role

u ->> urui: delete role

activate urui

urui ->> t1: delete role

t1 ->> ur: delete role

ur ->> kc: delete role

kc -->> ur: ok, role deleted

ur -->> t1: ok, role deleted

t1 -->> urui: ok, role deleted

urui -->> u: ok, role deleted

deactivate urui

end

alt Assign identity attributes to role

u ->> urui: assign identity attributes to role

activate urui

urui ->> urui: select identity attributes

urui ->> t1: assign identity attributes to role

t1 ->> ur: assign identity attributes to role

ur ->> kc: assign identity attributes to role

kc -->> ur: ok, identity attributes assigned

ur -->> t1: ok, identity attributes assigned

t1 -->> urui: ok, identity attributes assigned

urui -->> u: ok, identity attributes assigned

deactivate urui

end

TCV Dynamic - BP 03C - End User Role Request

This perspective illustrates the interactions and the flows between all the technological components.

Role request submission and cancellation

sequenceDiagram

actor u as End user

participant urui as Users & Roles UI (Typescript - Angular)

participant t1 as Tier 1 Gateway (Java - Spring Cloud Gateway)

participant ur as Users & Roles (Java - Spring Boot)

participant kc as Tier 1 Authorization provider (Keycloak)

u ->> urui: login

activate urui

urui ->> kc: login using credentials

kc -->> t1: login status

t1 -->> urui: status

urui -->> u: ok

deactivate urui

alt Submit role request

u ->> urui: Create role request

activate urui

urui ->> urui: select roles

urui ->> t1: create role request

t1 ->> ur: create role request

ur -->> t1: ok, role request created

t1 -->> urui: ok, role request created

urui -->> u: created

deactivate urui

end

alt Cancel role request

u ->> urui: Cancel role request

activate urui

urui ->> t1: cancel role request

t1 ->> ur: cancel role request

ur -->> t1: ok, role request cancelled

t1 -->> urui: ok, role request cancelled

urui -->> u: ok, role request cancelled

deactivate urui

end

Role request review

sequenceDiagram

actor u as Users Roles Manager

participant urui as Users & Roles UI (Typescript - Angular)

participant t1 as Tier 1 Gateway (Java - Spring Cloud Gateway)

participant ur as Users & Roles (Java - Spring Boot)

participant kc as Tier 1 Authorization provider (Keycloak)

u ->> urui: login

activate urui

urui ->> kc: login using credentials

kc -->> t1: login status

t1 -->> urui: status

urui -->> u: ok

deactivate urui

u ->> urui: Approve/Reject

activate urui

alt Reject

urui ->> t1: reject

t1 ->> ur: reject

ur -->> t1: ok, role request rejected

t1 -->> urui: completed

end

alt Approve

urui ->> urui: select roles

urui ->> t1: approve

t1 ->> ur: approve

ur ->> kc: assign roles to end user

kc -->> ur: roles assigned

ur -->> t1: ok, role request approved

t1 -->> urui: completed

end

urui -->> u: review completed

deactivate urui

TCV Dynamic - SA 03 – Credentials actions by the Governance Authority

Governance Authority Representative revokes, suspends or reactivates credentials

sequenceDiagram

actor garep as Governance Authority Representative

participant obui as Onboarding UI (Angular)

participant t1 as Tier 1 Gateway (Spring Cloud Gateway)

participant idp as Identity Provider (Java - Spring Boot)

participant ejbca as Credential Store (EJBCA)

alt revoke

garep ->> obui: revoke credential

activate obui

obui ->> t1: revoke credential

t1 ->> idp: revoke credential

idp ->> ejbca: revoke credential

note right of ejbca: credential must be active

ejbca -->> idp: ok, revoked

idp -->> t1: ok, revoked

t1 -->> obui: ok, revoked

obui -->> garep: ok, revoked

deactivate obui

end

alt suspend

garep ->> obui: suspend credential

activate obui

obui ->> t1: suspend credential

t1 ->> idp: suspend credential

note right of ejbca: credential must be active

idp ->> ejbca: suspend credential

ejbca -->> idp: ok, suspended

idp -->> t1: ok, suspended

t1 -->> obui: ok, suspended

obui -->> garep: ok, suspended

deactivate obui

end

alt re-activate

garep ->> obui: re-activate credential

activate obui

obui ->> t1: re-activate credential

t1 ->> idp: re-activate credential

idp ->> ejbca: re-activate credential

note right of ejbca: credential must be suspended, not revoked

ejbca -->> idp: ok, re-activated

idp -->> t1: ok, re-activated

t1 -->> obui: ok, re-activated

obui -->> garep: ok, re-activated

deactivate obui

end

box Governance Authority

participant obui

participant t1

participant idp

participant ejbca

end
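
The preconditions noted in the diagram above (a credential must be active to be revoked or suspended, and suspended but not revoked to be re-activated) amount to a small state machine. The following sketch encodes them; the status and action names are illustrative and are not taken from the EJBCA API.

```python
# Allowed transitions derived from the notes in the diagram:
# (current status, action) -> new status. Revoked is terminal.
ALLOWED = {
    ("ACTIVE", "revoke"): "REVOKED",
    ("ACTIVE", "suspend"): "SUSPENDED",
    ("SUSPENDED", "reactivate"): "ACTIVE",
}


def apply_action(status: str, action: str) -> str:
    """Return the new credential status, or raise if the action is not allowed."""
    try:
        return ALLOWED[(status, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a credential in status {status}") from None


suspended = apply_action("ACTIVE", "suspend")
reactivated = apply_action(suspended, "reactivate")
revoked = apply_action(reactivated, "revoke")
```

A revoked credential has no outgoing transition, so `apply_action("REVOKED", "reactivate")` raises a `ValueError`.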

Participant Representative requests credential renewal - Governance Authority renews credential

sequenceDiagram

actor garep as Governance Authority Representative

participant obui as Onboarding UI (Angular)

participant t1 as Tier 1 Gateway (Spring Cloud Gateway)

participant idp as Identity Provider (Java - Spring Boot)

participant ejbca as Credential Store (EJBCA)

participant t2 as Tier 2 Gateway (Spring Cloud Gateway)

participant authp as Authentication Provider (Java - Spring Boot)

participant t1p as Tier 1 Gateway (Spring Cloud Gateway)

participant putility as Participant Utility (Angular)

actor partrep as Participant Representative

alt renewal request

Note right of partrep: keypair and CSR already generated

partrep ->> putility: submit credential renewal request (CSR)

activate putility

putility ->> t1p: credential renewal request (with CSR)

t1p ->> authp: credential renewal request (with CSR)

authp ->> t2: credential renewal request (with CSR)

t2 ->> idp: credential renewal request (with CSR)

idp ->> idp: store credential request (CSR)

idp -->> t2: ok, request accepted

t2 -->> authp: ok, request accepted

authp -->> t1p: ok, request accepted

t1p -->> putility: ok

putility -->> partrep: ok

deactivate putility

end

alt renew

garep ->> obui: renew credential

activate obui

obui ->> t1: renew credential

t1 ->> idp: renew credential

idp ->> idp: fetch latest CSR

idp ->> ejbca: renew credential (CSR)

ejbca -->> idp: ok, renewed

idp -->> t1: ok, renewed

t1 --> obui: ok, renewed

obui -->> garep: ok, renewed

deactivate obui

end

box Governance Authority

participant obui

participant t1

participant idp

participant t2

participant ejbca

end

box Participant

participant putility

participant t1p

participant authp

end
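
The two phases of the renewal flow above (the Identity Provider first stores the submitted CSR, then later fetches the latest CSR when the Governance Authority Representative triggers the renewal) can be sketched as follows. This is an illustrative in-memory sketch only: RenewalRequestStore, renew and the sign callable are hypothetical names, and sign merely stands in for the call to the Credential Store (EJBCA).

```python
from collections import defaultdict
from typing import Callable, Optional


class RenewalRequestStore:
    """In-memory stand-in for the Identity Provider's CSR storage."""

    def __init__(self) -> None:
        # participant id -> CSRs in submission order
        self._requests: dict[str, list[str]] = defaultdict(list)

    def store(self, participant_id: str, csr_pem: str) -> None:
        """Phase 1: keep the CSR submitted with the renewal request."""
        self._requests[participant_id].append(csr_pem)

    def fetch_latest(self, participant_id: str) -> Optional[str]:
        """Phase 2: return the most recently submitted CSR, if any."""
        requests = self._requests.get(participant_id)
        return requests[-1] if requests else None


def renew(store: RenewalRequestStore, participant_id: str,
          sign: Callable[[str], str]) -> str:
    """Renew a credential from the latest stored CSR.

    `sign` is a hypothetical placeholder for the Credential Store call.
    """
    csr = store.fetch_latest(participant_id)
    if csr is None:
        raise LookupError(f"no renewal request stored for {participant_id}")
    return sign(csr)
```

Decoupling the two phases in this way means that the renewal can only succeed once a request has been stored, and that a participant who submits several requests is renewed against the most recent CSR.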

Governance Authority Representative assigns identity attributes to a participant

sequenceDiagram

actor garep as Governance Authority Representative

participant sapui as Security Attributes Provider UI (Angular)

participant t1 as Tier 1 Gateway (Spring Cloud Gateway)

participant sap as Security Attributes Provider (Java - Spring Boot)

garep ->> sapui: assign identity attribute

activate sapui

sapui ->> t1: assign identity attribute (participantId + identity attributes)

t1 ->> sap: assign identity attribute (participantId + identity attributes)

sap ->> sap: map identity attributes to participant

sap -->> t1: ok, assigned

t1 --> sapui: ok, assigned

sapui -->> garep: ok, assigned

deactivate sapui

box Governance Authority

participant sapui

participant t1

participant sap

end

6.1.2. TCV - Domain 2 - Publish and consume resources

6.1.2.1. TCV - Domain 2 - Publish and consume resources - Static Views

This perspective illustrates the correlation between the architectural elements and the technologies, components, and interfaces intended for use in implementing the application components.

TCV Static - Catalogue Client Service

Catalogue Client Application

The Catalogue Client Application Backend component is implemented as a Java backend application.
The Catalogue Client Application UI component is implemented as an Angular frontend application.

Validation Backend

The Validation Backend component is implemented as a Java backend application.

Contract Consumption Adapter

The Contract Consumption Adapter component is implemented as a Java backend application.

TCV Static - Catalogue Service

Catalogue

The Search Engine component is implemented with XFSC.
The Catalogue Database component is implemented with PostgreSQL and Neo4j.
The Vocabulary Datastore component is implemented as a file system.
The Management Service is implemented with XFSC.
The Semantic Validation service is implemented with RDFLib pySHACL.
The Quality Rule Validation service is implemented with RDFLib pySHACL.
The Syntax Validation service is implemented with RDFLib pySHACL.

Query Mapper Adapter

The Query mapper adapter component is implemented with Spring Cloud Gateway.

TCV Static - Connector Service

Connector

The Connector component is implemented as an Eclipse Dataspace Connector.
The Control plane component is implemented as a Java backend application.
The Data plane component is implemented as a Java backend application.
The Infrastructure orchestrator is implemented as a Java backend application.
The Policy engine component is implemented as a Java backend application.

EDC Connector Adapter

The EDC Connector Adapter component is implemented as a Java backend application.

TCV Static - Contract Service

Contract Manager Backend

The Contract Manager Backend component is implemented as a Java backend application.
The API interface is implemented as a Kafka consumer (JSON over Kafka).

Contract Manager Orchestrator

The Contract Manager Orchestrator component is implemented as a Java backend application.
The API interface is implemented as a Kafka consumer (JSON over Kafka).

Message Broker

The Message Broker component is implemented with Kafka.

TCV Static - Contract Template Datastore Service

Contract Template Datastore

The Contract Template Datastore component is implemented as a File System.

TCV Static - Data Orchestration Service

Orchestration Platform

Orchestration Engine (Dagster Daemons): Several Dagster features, such as schedules, sensors, and run queueing, require a long-running dagster-daemon process to be included in the deployment. The daemons invoke the run launcher, which starts runs as ephemeral processes.

Orchestration Run Worker (K8sRunLauncher): The run launcher is the interface to the computational resources that are used to execute Dagster runs. It receives the ID of a created run and a representation of the pipeline that is about to be executed. We use the K8sRunLauncher, a run launcher that allocates a Kubernetes job per workflow run.

Code Location: A code location is a collection of Dagster definitions that can be loaded and accessed by Dagster’s tools, such as the CLI and the UI. A code location comprises:

A reference to a Python module that has an instance of Definitions in a top-level variable
A Python environment that can successfully load that module

Orchestration Management UI (Dagit): Dagit is Dagster’s browser-based orchestration console that provides an intuitive, real-time view into your data pipelines and assets. It acts as the operational hub for engineers, analysts, and operators to develop, launch, and monitor jobs: you can visualize graphs of ops, configure runs, inspect logs and intermediate results, manage schedules and sensors, and observe asset materializations, all without leaving the UI.

Orchestration Engine API: Dagster exposes a GraphQL API as its primary programmatic interface to the orchestration engine. This API underpins both Dagit and the Python client libraries, and lets you manage every aspect of your Dagster instance without using the UI. Through it, you can:

Launch and cancel runs (jobs, backfills, re-executions).
Query run states, logs, and event streams for monitoring.
Manage workspace locations (code servers).
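
As an illustration of the first capability listed above, the sketch below builds the JSON body of a launch-run request for the GraphQL API. The mutation text is modelled on the launch-run example in the Dagster GraphQL documentation and should be verified against the schema of the deployed Dagster version; the function name build_launch_run_request and the example values are hypothetical. The sketch only prepares the request body and does not send it.

```python
import json

# Modelled on the Dagster documentation example; verify against the
# GraphQL schema of the Dagster version actually deployed.
LAUNCH_RUN_MUTATION = """
mutation LaunchRunMutation(
  $repositoryLocationName: String!
  $repositoryName: String!
  $jobName: String!
  $runConfigData: RunConfigData!
) {
  launchRun(
    executionParams: {
      selector: {
        repositoryLocationName: $repositoryLocationName
        repositoryName: $repositoryName
        jobName: $jobName
      }
      runConfigData: $runConfigData
    }
  ) {
    __typename
    ... on LaunchRunSuccess {
      run { runId }
    }
  }
}
"""


def build_launch_run_request(location_name: str, repository_name: str,
                             job_name: str, run_config: dict) -> bytes:
    """Return the UTF-8 encoded JSON body of a GraphQL launch-run request."""
    body = {
        "query": LAUNCH_RUN_MUTATION,
        "variables": {
            "repositoryLocationName": location_name,
            "repositoryName": repository_name,
            "jobName": job_name,
            "runConfigData": run_config,
        },
    }
    return json.dumps(body).encode("utf-8")
```

The resulting bytes would typically be sent as an HTTP POST with content type application/json to the /graphql endpoint of the Dagster instance; in this architecture such a call would be expected to pass through the Auth Proxy.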

Asset Orchestrator: A component developed for Simpl-Open. It connects the data and application offerings from the catalogue with the orchestration engine.

Auth Proxy: Authentication sidecar component that integrates the orchestration platform with the IAA stack in a loosely coupled way.

Repository (Gitea): The repository is not only a logical container for services, workflows, and schedules but also the natural unit of versioning and auditability for your orchestration code. Because a repository is defined in source control (Gitea Repository), every change to a job graph, op implementation, resource configuration, or schedule is captured as a commit with author, timestamp, and diffs. This enables you to:

Version your pipelines: each commit or tag corresponds to a known set of jobs/ops; you can deploy specific versions of the repository image to different environments (dev/test/prod).
Audit changes: by inspecting the repository history you can see who modified a job, added a sensor, or changed a resource definition, providing traceability for compliance.
Roll back safely: if a change breaks a pipeline, you can redeploy the previous repository image and Dagster will run the older job definitions.
Tie runs to code versions: by embedding a Git commit hash or image tag in the run metadata, you get a direct link between a Dagster run and the exact code and configuration that produced it.

This approach shifts “audit and versioning” from being an afterthought in the orchestration layer to being a first-class property of the development workflow, making it straightforward to satisfy governance, reproducibility, and regulatory requirements.
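
The "tie runs to code versions" practice described above can be illustrated with a small helper that prepares the string key/value tags to attach to a run. This is a sketch under assumptions: the tag keys simpl/git-commit and simpl/image-tag and the function name build_run_tags are hypothetical, and the accepted commit format (7 to 40 lowercase hexadecimal characters) is a simplification chosen for the example.

```python
import re

# Accepts abbreviated or full SHA-1 commit hashes (assumption of this sketch).
_COMMIT_PATTERN = re.compile(r"[0-9a-f]{7,40}")


def build_run_tags(commit_sha: str, image_tag: str) -> dict[str, str]:
    """Return run tags linking a run to the code version that produced it.

    The tag keys are hypothetical and not defined by Dagster or Simpl-Open.
    """
    if not _COMMIT_PATTERN.fullmatch(commit_sha):
        raise ValueError(f"not a Git commit hash: {commit_sha!r}")
    if not image_tag:
        raise ValueError("image tag must not be empty")
    return {
        "simpl/git-commit": commit_sha,
        "simpl/image-tag": image_tag,
    }
```

Rejecting malformed values at tagging time keeps the audit trail reliable: a run either carries a well-formed reference to its code version or is not launched with these tags at all.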

CI/CD (Gitea Actions): In our orchestration platform, CI/CD is the mechanism that moves changes from development into the production cluster in a controlled, auditable way. It allows automated tests to run before publication and improves auditability. Rollbacks are straightforward because previous images and manifests are retained. This setup ensures that the version of each Dagster job or op running in production is traceable to a specific commit, that deployments are reproducible, and that production workloads can scale or recover automatically under Kubernetes while still meeting compliance and reliability requirements.

TCV Static - Infrastructure Connector Service

Provisioned Node (Infrastructure Consumer) / Private Network

The Provisioned Node and the Private Network are created by the Infrastructure Provisioner (see TCV Static - Infrastructure Provisioning Service) on behalf of the Consumer.
- Access data (credentials and other details) is also communicated to the Consumer.
In principle, the Consumer therefore has access
- to the infrastructure (Provisioned Node)
- and to the Private Network

Infrastructure Connector Service (open issue)

As of 2025-09-26, it is not clear what the term “Connector” means in this context.
- No definition was found anywhere in the documentation.
- In the “data space universe”, a connector typically means:
  - A (software) component that acts as a trustworthy gateway for data exchange, enabling secure, sovereign, and standardized sharing of data between organizations and systems while enforcing usage policies.
  - Such connectors/gateways often use frameworks like “Eclipse Dataspace Components (EDC)” and operate at the application level.
    - This is expressed in the picture with all the assets.
    - Warning: the picture contains typos, and its connection lines are probably wrong.
- If the above definition holds, then access to the Infrastructure (Provisioned Node)/Private Network is not part of the connector.
It is therefore not known what the term means for infrastructure.
- This needs to be resolved as soon as possible.

TCV Static - Infrastructure Provisioning Service

Triggering Module

The Script Storage Management Module component is implemented as a Java backend application.
The Script Execution Module component is implemented as a Java backend application.
The Access Management Module component is implemented as a Java backend application.
The Triggering Module UI component is implemented as an Angular frontend application.
The API interface is implemented as a Kafka consumer (JSON over Kafka).

Infrastructure Provisioner

The Infrastructure Provisioner component is implemented with Argo CD and Crossplane.
The API interface is implemented as a Kafka consumer (JSON over Kafka).

Infrastructure Provider Storage

The Database is implemented in PostgreSQL.
The Repository is implemented as a Git-based repository.

Message Broker

The Message Broker component is implemented with Kafka.

TCV Static - Issuer Service

TCV Static - Policy Template Datastore Service

Policy Template Datastore

The Policy template datastore component is implemented with XFSC Organisation Credential Manager.

TCV Static - Resource Offering Service

SD Tooling

The SD Manager component is implemented as a Java backend application.
The Validation BE component is implemented as a Java backend application.
The SD Creation Tool component is implemented with XFSC Organisation Credential Manager.
The SD Tooling UI component is implemented as an Angular frontend application.

TCV Static - Schema Management Service

The Schema Management Backend Service implements:

The Schema Management, which stores the schemas
The Schema Subscription API, through which any service can subscribe to schema updates
The Schema Management Backend API for creating, updating and revoking schemas
The Schema Management UI

The Schema Synch Service implements:

The Schema Synch Adapter API, which receives any updates from the Schema Management Service
The Schema Synch Adapter, which retrieves the schema updates and processes them

TCV Static - Signer Service

Signer Service

The Signer service component is implemented with XFSC Organisation Credential Manager.

TCV Static - Vocabulary Management Service

Vocabulary Management

The Vocabulary Management Backend component is implemented as a File System.
The Vocabulary Management Frontend component is implemented as a ReactJS application.

TCV Static - Wallet Service

Wallet

The Wallet component is implemented with XFSC Organisation Credential Manager.

6.1.2.2. TCV - Domain 2 - Publish and consume resources - Dynamic Views

TCV Dynamic - BP 05B - Provider manages resource descriptions

This perspective illustrates the interactions and the flows between all the technological components.

Schema Synchronization

The figure illustrates the role of the schema-sync-adapter component. This component receives notifications from the Schema Management Service when a schema is published, versioned, or revoked. It then retrieves the schema from the Schema Management Service and stores it in the schema storage (NFS), making it available for use by the SD-Tooling.
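
The notify, retrieve and store sequence described above can be sketched as a single handler. All names below are hypothetical: the event fields schemaId and version, the fetch_schema callable (standing in for the call to the Schema Management Service), the directory layout and the .ttl file extension are assumptions made for illustration, and a local directory stands in for the NFS schema storage.

```python
import tempfile
from pathlib import Path
from typing import Callable


def handle_schema_event(event: dict,
                        fetch_schema: Callable[[str, str], str],
                        storage_dir: Path) -> Path:
    """Retrieve the schema named in a notification and store it on disk.

    `event` is assumed to carry "schemaId" and "version" (hypothetical
    field names); `fetch_schema` stands in for the retrieval call to the
    Schema Management Service.
    """
    schema_id, version = event["schemaId"], event["version"]
    content = fetch_schema(schema_id, version)
    target = storage_dir / schema_id / f"{version}.ttl"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Usage example with a temporary directory instead of the NFS share.
    with tempfile.TemporaryDirectory() as tmp:
        stored = handle_schema_event(
            {"schemaId": "dataset-offering", "version": "1.0.0"},
            lambda schema_id, version: "# SHACL content placeholder",
            Path(tmp),
        )
        print(stored.name)
```

Writing each version to its own file keeps earlier versions available to the SD-Tooling after a new version is notified; how revoked schemas are handled in storage is not specified here and is left out of the sketch.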

TCV Dynamic - BP 06 - Consumer searches resources in data space catalogues

This perspective illustrates the interactions and the flows between all the technologies.

The search stack is split into a consumer/provider part and a centralised one.

The first part includes a client that offers a UI to the end user. For the advanced search, the frontend application checks the search parameters against a local instance of the schema cache, previously synchronised with the instance on the Governance Authority side. This check ensures that the queries sent to the Query Mapper Adapter are consistent with the schemas of the resources present in the data space. Furthermore, both sides run an instance of the Spring Cloud API Gateway, which secures the connection towards the other agent.

In the Governance Authority instance, apart from the components already mentioned, there is the Query Mapper & Filter Adapter, which translates the incoming query into the required query language and applies filtering based on access policies. It then forwards the resulting query to the catalogue API. The XFSC catalogue includes an internal search engine that performs the query on the self-descriptions stored in the underlying Neo4j database. The catalogue also relies on a PostgreSQL database for managing metadata and ensuring efficient file identification and data consistency.
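
The client-side parameter check described above can be sketched as a comparison between the submitted search parameters and the property names known from the locally cached schema. This is a minimal illustration only: the function names, the representation of the cached schema as a set of property names and the example property names are assumptions, and the actual frontend is an Angular application rather than Python code.

```python
def find_unknown_parameters(parameters: dict[str, str],
                            cached_schema_properties: set[str]) -> list[str]:
    """Return the parameter names that the cached schema does not define."""
    return sorted(name for name in parameters
                  if name not in cached_schema_properties)


def prepare_advanced_search(parameters: dict[str, str],
                            cached_schema_properties: set[str]) -> dict[str, str]:
    """Return the parameters to send on, or raise if any is inconsistent
    with the cached schema."""
    unknown = find_unknown_parameters(parameters, cached_schema_properties)
    if unknown:
        raise ValueError("parameters not defined in schema: " + ", ".join(unknown))
    return dict(parameters)
```

Rejecting inconsistent parameters before the request leaves the agent means that only queries matching the published schemas reach the Query Mapper Adapter on the Governance Authority side.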

TCV Dynamic - BP 07 - Consumer and Provider establish a usage contract for selected catalogue items

This perspective illustrates the interactions and the flows between all the technological components.

TCV Dynamic - BP 08 - Consumer consumes an infrastructure resource from a Provider

This perspective illustrates the interactions and the flows between all the technological components.

TCV Dynamic - BP 09A - Consumer consumes a data resource from a Provider

This perspective illustrates the interactions and the flows between all the technologies.

The data consumption in BP 09A is mainly handled by the EDC connector Java backend.

The EDC connector is in charge of various steps of the Dataspace Protocol. The core of the backend is the control plane, which handles the contract negotiation and selects the correct data plane depending on the type of resource requested. The actual data transfer is performed by the selected EDC connector Java extension, which connects to the real data source.
Policies must be checked during the consumption process; this check is performed by the Policy Module of the EDC connector Java backend.
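
The selection of a data plane by resource type, as described above, can be pictured as a simple registry lookup. The sketch below is purely conceptual and is not the EDC API (EDC is implemented in Java): the class DataPlaneRegistry, its methods and the example resource types and endpoints are hypothetical.

```python
class DataPlaneRegistry:
    """Conceptual stand-in for the control plane's data plane selection."""

    def __init__(self) -> None:
        # resource type -> data plane endpoint able to transfer it
        self._planes: dict[str, str] = {}

    def register(self, resource_type: str, endpoint: str) -> None:
        """Declare which data plane handles a given resource type."""
        self._planes[resource_type] = endpoint

    def select(self, resource_type: str) -> str:
        """Return the data plane for the requested resource type."""
        try:
            return self._planes[resource_type]
        except KeyError:
            raise LookupError(
                f"no data plane registered for resource type {resource_type!r}"
            ) from None
```

In this picture, contract negotiation and policy checks would complete before select is called, so that a transfer is only routed to a data plane once the contract is in place.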

TCV Dynamic - BP 09B - Consumer receives a data processing service on a data resource via an application

This perspective illustrates the interactions and the flows between all the technologies.

The part of this data consumption BP that involves the infrastructure provider is handled by the infrastructure-related components (see BP 08 for details).

The user's request is directed to the Data Provider's EDC connector Java backend, which forwards it to the custom EDC connector extension that is capable of interacting with the Infrastructure Provider APIs.

6.1.3. TCV - Domain 3 - Management/Operation of Data Space

6.1.3.1. TCV - Domain 3 - Management/Operation of Data Space - Static Views

This perspective illustrates the correlation between the architectural elements and the technologies, components, and interfaces intended for use in implementing the application components.

TCV Static - Monitoring Service

This section describes the architecture for Monitoring and Logging within a single node (Simpl-Open agent) and does not (yet) consider inter-node setups.

The monitoring service is based primarily on the Elastic stack.

Filebeat and Metricbeat are used to collect technical logs and infrastructure metrics, respectively.

As Simpl-Open application services are deployed as containers in Kubernetes, both technical logs and infrastructure metrics are collected via the kube-api.

Technical logs are then forwarded to Logstash for processing and potential transformation. Business logs are directly sent by the application services to Logstash.
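
As an illustration of how an application service might shape a business log entry before sending it to Logstash, the sketch below formats log records as single-line JSON using the Python standard library. It is a sketch under assumptions: the JSON field names, the log_type attribute used to distinguish business from technical logs and the class name JsonLogFormatter are hypothetical, and the Simpl-Open application services themselves are Java applications.

```python
import json
import logging


class JsonLogFormatter(logging.Formatter):
    """Format log records as single-line JSON documents.

    Field names are illustrative; "log_type" defaults to "technical"
    unless the caller sets it on the record (e.g. via `extra`).
    """

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "log_type": getattr(record, "log_type", "technical"),
        }
        return json.dumps(payload)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonLogFormatter())
    logger = logging.getLogger("contract-service")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    # `extra` adds the hypothetical log_type attribute to the record.
    logger.info("contract %s signed", "c-1", extra={"log_type": "business"})
```

A structured format of this kind lets Logstash route business and technical entries differently without parsing free text.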

Metricbeat and Heartbeat forward infrastructure metrics, and Logstash forwards logs (technical and business), to Elasticsearch, which acts as a central log repository.

Kibana is used as a user interface to provide reporting, log visualisation, monitoring space and alerting capabilities. Kibana also queries the health endpoints of the services, exposed as REST/JSON APIs, to display their health in a dashboard.

A custom reporting application exposes a REST/JSON API to query logs for other purposes such as monitoring federation (i.e. forwarding some logs to the Governance Authority) or billing.

The health check component queries and collects health data from components across the Simpl-Open agent and stores it for later health visualisation.

Tracing captures detailed information about request flows across Simpl-Open components to help discover potential bottlenecks.

TCV Static - Schema Management Service

The Schema Management Backend Service implements:

The Schema Management, which stores the schemas
The Schema Subscription API, through which any service can subscribe to schema updates
The Schema Management Backend API for creating, updating and revoking schemas
The Schema Management UI

6.1.3.2. TCV - Domain 3 - Management/Operation of Data Space - Dynamic Views

Schema Lifecycle Management (Governance Perspective)

Schema Version Creation: A Governance Administrator creates a new version of a schema by submitting a SHACL file and its associated metadata (e.g., version number, changelog) to the SMS Management API. The SMS validates and stores the new version.
Schema Publication: To make an entire schema family available for use, the Administrator uses the SMS Management API to change the status of the Schema Concept to PUBLISHED.
Event Notification: Upon successfully changing the status, the SMS:
- Updates its internal database to reflect the new status.
- Publishes a SchemaPublished event. This event contains the schema’s metadata, its new status, and the content of its versions.
Schema Revocation: If a schema family is no longer approved for use, the Administrator changes its status to REVOKED via the API. This triggers a SchemaRevoked event, preventing new data from being validated against any version of this schema.
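
The lifecycle steps above can be sketched as a small model in which status changes produce the corresponding events. This is an illustrative sketch, not the SMS implementation: the class SchemaConcept, its fields and the event dictionary layout are hypothetical, and the initial DRAFT status is an assumption (only PUBLISHED and REVOKED are named in the text).

```python
from dataclasses import dataclass, field
from enum import Enum


class SchemaStatus(Enum):
    DRAFT = "DRAFT"          # assumed initial status, not named in the text
    PUBLISHED = "PUBLISHED"
    REVOKED = "REVOKED"


@dataclass
class SchemaConcept:
    name: str
    status: SchemaStatus = SchemaStatus.DRAFT
    versions: dict[str, str] = field(default_factory=dict)  # version -> SHACL content
    events: list[dict] = field(default_factory=list)        # emitted events, in order

    def add_version(self, number: str, shacl: str) -> None:
        """Schema Version Creation: store a new version of the schema."""
        self.versions[number] = shacl

    def publish(self) -> None:
        """Schema Publication: update the status, then emit SchemaPublished."""
        self.status = SchemaStatus.PUBLISHED
        self.events.append({
            "type": "SchemaPublished",
            "schema": self.name,
            "status": self.status.value,
            "versions": dict(self.versions),
        })

    def revoke(self) -> None:
        """Schema Revocation: update the status, then emit SchemaRevoked."""
        self.status = SchemaStatus.REVOKED
        self.events.append({
            "type": "SchemaRevoked",
            "schema": self.name,
            "status": self.status.value,
        })

    def usable_for_validation(self) -> bool:
        """Only a published schema family may validate new data."""
        return self.status is SchemaStatus.PUBLISHED
```

Because the status applies to the whole schema concept, revoking it makes every version unusable for validation at once, in line with the "schema family" wording above.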

6.2. Technology Deployment View

This section presents a view of the currently available release for the Governance Authority (GA), Data Provider, Infrastructure Provider and Consumer. The Application Provider view falls outside the scope of the current release.

The following Technology Deployment View describes how the different technology components are deployed for all Simpl-Open agent types (Governance Authority, Data Provider, Infrastructure Provider, Application Provider, Consumer):

Simpl-Open is designed to be a container-native application and is provided with all the required deployment artefacts to be deployed on a pre-existing Kubernetes Cluster.

Each agent is deployed inside its own Kubernetes Namespace.

Three types of workloads are used:

Deployment - used for managing a stateless application workload, where any Pod in the Deployment is interchangeable and can be replaced if needed.
StatefulSet - used to run one or more related Pods that do track state somehow (for example, if the workload records data persistently). StatefulSet can match Pods with PersistentVolumes.
DaemonSet - used for Pods that provide facilities that are local to nodes. Every time a node is added to the cluster, and it matches the specification in a DaemonSet, the control plane schedules a Pod for that DaemonSet onto the new node. Each pod in a DaemonSet performs a job similar to a system daemon on a classic Unix / POSIX server.

Kubernetes Services are used to expose certain components, running as one or more pods, behind a single outward-facing endpoint, even when the workload is split across multiple nodes.

6.3. Technology Open-Source Products

This section is divided into two parts:

A three-year roadmap with draft considerations on Open-Source Software product selection;
Open-Source Product Decisions, taken as the architecture is further analysed and components and interfaces are confirmed.

The roadmap OSS selection is assumed to be valid until the respective capabilities are confirmed or amended by the Decisions, which will be taken in an agile fashion, quarter by quarter.

6.3.1. Simpl-Open Technology Roadmap

The following illustration presents the draft three-year roadmap of the Open-Source Software product selection to implement the functional capabilities required by Simpl-Open.

The table below presents the rationale for each selection, as available today.

As a general process, quarter by quarter, release by release, the Architecture team will further analyse capability by capability and confirm or amend selection based on detailed requirement and detailed architecture, including interaction with other technologies/components.

The draft table below provides a first rationale for the selections identified at this preliminary stage.

Tools	Description	Rationale
Eviden Open-Source	Partitum is a proven solution component of the Eviden Clearing House as a Service. The solution is currently running at Athumi (Belgium – Flanders). It acts as a data space intermediary that is responsible for securely exchanging data between the different actors in a data space community and for monetisation. The product provides the necessary tools to remove the financial burden for the actors by: · onboarding the different actors in the ecosystem and taking care of the contractual and financial agreements necessary to exchange data; · clearing transactions based on contractual agreements between the actors and their risk profile; · settling executed transactions between different actors; · automatically invoicing through billing or self-billing.	No integrated toolset available in the market matching client requirements.
DAPS	Issue dynamic identity attributes based on scoped request.	It fits the second authentication mechanism described in Annex III of the “Architecture Vision Document”, where identity attributes are dynamically issued by the Identity Attributes along with an ephemeral proof.
EJBCA	Public key infrastructure certificate authority software. https://www.ejbca.org/	It is needed in all the envisioned authentication mechanisms between Participants as they require the issuance of an X.509 certificate.
Keycloak	Identity and Access Management software. https://www.keycloak.org/	This component will manage the authentication and authorisation of the End Users. It can be easily federated with existing Participants’ identity providers and extended to implement several types of authentication mechanisms (2FA, Digital Wallet, etc.).
ELK Stack	https://www.elastic.co/elastic-stack/	As suggested by Tenders Specifications and based on Market Standard.
Prometheus	https://prometheus.io/	As suggested by Tenders Specifications and based on Market Standard.
Grafana	https://grafana.com/	As suggested by Tenders Specifications and based on Market Standard.
mTLS	Mutual TLS (mTLS) is a security practice that provides encrypted communications between every workload and application in your infrastructure, regardless of location.	Protocol recognised by several open-source products.
Crossplane	Crossplane enables cloud-agnostic infrastructure provisioning and management. https://www.crossplane.io/	To abstract away cloud-specific APIs, enabling consistent control of resources across various cloud providers. It empowers DevOps teams to define infrastructure as code (IaC) and easily manage multi-cloud environments, enhancing agility and reducing vendor lock-in.
Terraform	Terraform automates infrastructure as code, simplifying provisioning and scaling. https://www.terraform.io/	For its declarative IaC approach, enabling infrastructure automation through code. Terraform's extensive provider ecosystem ensures broad cloud support and efficient orchestration, facilitating rapid scaling and reducing operational overhead.
Ansible	Ansible orchestrates application deployment and configuration with minimal complexity. https://www.ansible.com/	Agentless automation for simplified application provisioning and configuration management. Ansible's idempotent playbooks, robust modules, and YAML-based syntax simplify complex tasks, ensuring consistency and efficient operations across infrastructure.
Kubernetes	Kubernetes is a container orchestration platform, simplifying application deployment and scaling. https://kubernetes.io/	For containerised workload management and orchestration. Its advanced features, including auto-scaling, rolling updates, and service discovery, simplify application lifecycle management and enhance resource utilisation, making it a top choice for container-based applications.
UFW	Uncomplicated Firewall (UFW) simplifies firewall management for Linux systems.	Straightforward firewall rule management on Linux. Its user-friendly interface and uncomplicated syntax make it a powerful tool to secure systems against unwanted network traffic while simplifying the configuration of firewall policies.
WireGuard	WireGuard offers secure, efficient VPN solutions for network privacy and protection. https://www.wireguard.com/	To secure network communications with state-of-the-art cryptography. Its lightweight design, minimal attack surface, and dynamic routing capabilities provide robust VPN security, ensuring high-speed, low-latency connections for infrastructure.
nftables	nftables is a versatile packet filtering framework for fine-grained network control.	For advanced network filtering and routing. Its expressive syntax and performance optimisations help network administrators to efficiently manage packet filtering, firewall rules, and network address translation (NAT).
ModSecurity	ModSecurity provides web application firewall (WAF) protection against online threats.	To secure web applications with robust WAF capabilities. Its comprehensive rule sets and real-time threat detection safeguard applications from web-based attacks, ensuring data integrity and user trust.
Ceph	Ceph is a distributed storage system for scalable, reliable data storage. https://ceph.io/en/	For cost-effective, highly available storage solutions. Its distributed architecture, erasure coding, and RADOS (Reliable Autonomic Distributed Object Store) technology deliver scalable, fault-tolerant storage, making it ideal for cloud and data-intensive workloads.
OKD (OpenShift)	OKD, the open-source version of OpenShift, offers container orchestration and management. https://www.okd.io/	To deploy, manage, and scale containerised applications with Kubernetes simplicity. OKD's developer-friendly features, integrated CI/CD, and extensive ecosystem enhance DevOps workflows and application delivery, without worrying about the infrastructure.
OpenStack	OpenStack is an open-source cloud computing platform for building private and public clouds. https://www.openstack.org/	To create customisable, private cloud environments. The modular architecture provides flexibility and control over cloud resources, enabling tailored cloud solutions, reducing costs, and avoiding vendor lock-in.
Kubeless	Kubeless is a serverless framework for Kubernetes, enabling function-as-a-service (FaaS).	Serverless application development over Kubernetes. Simplifies event-driven, microservices-based architectures, providing rapid scaling and efficient resource utilisation, perfect for modern application workloads. Suitable for providers who are already running Kubernetes.
OpenHPC	OpenHPC provides a comprehensive high-performance computing (HPC) stack for clusters.	To build and manage high-performance computing clusters. OpenHPC simplifies the integration of HPC software components, ensuring optimised performance for scientific and computational workloads.
OpenWhisk	OpenWhisk is an open-source serverless platform with support for multiple programming languages. https://openwhisk.apache.org/	Serverless capabilities for flexible, event-driven application development. OpenWhisk's language-agnostic approach simplifies serverless computing, facilitating faster development and deployment of cloud-native functions.
eDelivery	eDelivery helps public administrations to exchange electronic data and documents with other public administrations, businesses and citizens at the national level and across borders, in an interoperable, secure and reliable way.	Part of Digital Building Blocks from European Commission.
eSignature	The DIGITAL eSignature Building Block allows public administrations, businesses, and citizens to electronically sign any document, anywhere in Europe, at any time, in line with the eIDAS Regulation for e-signatures, e-seals and related services offered by Trust Service Providers.	Part of Digital Building Blocks from European Commission.
eInvoicing	The eInvoicing Building Block aims to promote the successful uptake of electronic invoicing in Europe, respecting the European standard on electronic invoicing and Directive 2014/55/EU on electronic invoicing in public procurement.	Part of Digital Building Blocks from European Commission.
eID	The eID Building Block allows public administrations and private service providers to easily extend the use of their online services to citizens from other Member States, in line with the eIDAS Regulation. In the digital age, public administrations and businesses need to carry out fast, secure electronic transactions and validate the identities of those involved with the same legal validity as traditional paper processes. Electronic identification (eID) makes this possible.	Part of Digital Building Blocks from European Commission.
Eclipse EDC	The EDC connector is a software installed by the participating company or a platform thereby providing technical access to the ecosystem. A connector can consist of monolithic or self-contained software. https://github.com/eclipse-edc/Connector	As an open source project hosted by the Eclipse Foundation, the EDC provides a growing list of modules for many widely-deployed cloud environments (AWS, Azure, GCP, OTC, etc.) "out-of-the-box" and can easily be extended for more customised environments, while avoiding any intellectual property rights (IPR) headaches.
XFSC Federated Catalogue	The “Federated Catalogue” service includes a catalogue where Gaia-X resources, asset items, and participants can be found by potential consumers and end users. Resources, asset items and participants are provided at Gaia-X using self-descriptions. https://gitlab.eclipse.org/eclipse/xfsc/cat	The reference implementation of organisational Federated Catalogue supporting SD according to the Gaia-X Trustmodel.
piveau	piveau is a data management ecosystem for the public sector. https://www.piveau.de/en/	It provides components and tools to support the entire data processing chain, from harvesting and aggregation to provision and use. It is highly extensible, focuses on open standards, is designed for use in the cloud, and reacts reliably and quickly to unforeseen access peaks.
XFSC OCM	The “Organisation Credential Manager” service establishes trust between the different participants within the decentralised Gaia-X ecosystem. It includes all trust-related functions required to manage and offer Gaia-X self-descriptions in the W3C Verifiable Credential Format. https://gitlab.eclipse.org/eclipse/xfsc/ocm	The reference implementation of the organisational Credential Manager according to the Gaia-X Trustmodel.
XFSC PCM	The “Personal Credential Manager” service enables Gaia-X users to manage their credentials themselves. To do this, the user needs secure storage (user wallet) and presentation capabilities in the authentication and authorisation processes. https://gitlab.eclipse.org/eclipse/xfsc/pcm	The reference implementation of the personal Credential Manager according to the Gaia-X Trustmodel.
Apache Spark	Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters. https://spark.apache.org/	Apache Spark is widely adopted, with thousands of companies using it. It also integrates with the major frameworks for data science and machine learning, SQL analytics and BI, and storage and infrastructure.
Great Expectations	A powerful platform to uphold data quality. https://greatexpectations.io/	Great Expectations offers broad flexibility and control when creating data quality tests. It also provides auto-updating documentation to ease reporting of test suites and results in collaborative environments.
Marquez, OpenLineage	OpenLineage is an open platform for collection and analysis of data lineage. It tracks metadata about datasets, jobs, and runs, giving users the information required to identify the root cause of complex issues and understand the impact of changes. https://openlineage.io/	OpenLineage contains an open standard for lineage data collection, a metadata repository reference implementation (Marquez), libraries for common languages, and integrations with data pipeline tools.
MLflow	MLflow is an open-source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. https://mlflow.org/	MLflow offers several key components to access, evaluate, process and deploy Large Language Models (LLM).
JupyterLab	JupyterLab is a web-based interactive development environment for notebooks, code, and data. https://jupyter.org/	Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning. A modular design invites extensions to expand and enrich functionality. This tool is widely adopted in the data science community.
Superset	Apache Superset is an open-source modern data exploration and visualisation platform. https://superset.apache.org/	Superset is fast, lightweight, intuitive, and loaded with options that make it easy for users of all skill sets to explore and visualise their data, from simple line charts to highly detailed geospatial charts. It supports a wide range of databases.
UVdesk	https://www.uvdesk.com/en/	Open-source ITSM tool, selected as the tool best matching the tender requirements.
TheHive	https://thehive-project.org/	Same toolset as the one used by CERT-EU and other government institutions.
MISP	https://www.misp-project.org/	Same toolset as the one used by CERT-EU and other government institutions.
Spring Cloud Gateway	Software component that acts as an API Gateway. https://spring.io/projects/spring-cloud-gateway	This component will manage the routing of API requests to the services that compose the SMP middleware. It is easily extensible and configurable in order to implement specific cross-cutting concerns such as security and the control of Access & Usage policies.
Spring Cloud Circuit Breaker	Library implementing the Circuit Breaker pattern and other HA patterns. https://spring.io/projects/spring-cloud-circuitbreaker	Mitigates high response times and network errors, enhancing system reliability. It implements the Circuit Breaker, Retry and Bulkhead patterns. It is useful for communication inside and outside the SMP Agent perimeter.
Webpack Module Federation	Technology enabling the creation of micro-frontends.	A common Application Shell will be implemented that dynamically loads the autonomous front-end modules. Each module can be mapped to a specific microservice and developed independently by the team in charge of that microservice, increasing the speed of development of distributed and scalable applications.
Aruba Consent Management	Consent management service.	It manages consent given by Data Providers to the Consumers. It binds consents to specific versions of a legal text. Data Providers can revoke their consent at any time. Specific events are raised for every notable change in the system, that can be easily reviewed and audited.
Spring Cloud Config	https://spring.io/projects/spring-cloud-config
Swagger	https://swagger.io/	The de facto standard of documentation for REST APIs.
Data Mashup Editor (Eng opensource)	The mission of the Data Mashup Editor is to develop a powerful and intuitive graphical tool that simplifies the process of harmonising data from diverse sources, leveraging cutting-edge technologies and intelligent data integration techniques. The Data Mashup Editor is dedicated to ensuring data accessibility, usability, and accuracy, enabling informed decision-making across industries and domains and unlocking the true value of data assets.	The Data Mashup Editor was chosen as one of the tools for data processing building block and data sharing building block due to its ability to seamlessly handle both real-time and batch data streams, while redirecting the output to various entities adopting different technologies and protocols simultaneously. Its internal architecture makes it highly suitable for cloud deployment, ensuring optimal performance and distributed executions. Additionally, it offers an intuitive user experience through its graphical interface, making it easy for users to utilise the tool effectively.
Rule Manager (Eng opensource)	The Digital Enabler Rule Manager is a powerful tool designed for managing trigger rules and automated responses based on specific data values within your platform. This tool offers a user-friendly guided wizard for defining and implementing rules for data processing within the platform.	The Rule Manager was chosen as one of the tools for the data processing building block and the data sharing building block due to its capability to create rules of varying complexity based on the data within the system, which makes it possible to add a monitoring layer to the processing steps. It integrates seamlessly with the Data Mashup Editor, providing a comprehensive solution for data manipulation. Its internal architecture is well-suited for cloud deployment, ensuring excellent performance and distributed executions. Furthermore, its graphical interface provides users with an intuitive experience, simplifying the process of effectively utilising the tool.
Airflow	Apache Airflow is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows. https://airflow.apache.org/	Airflow was chosen as the data orchestration component, in the supporting data services building block, due to its exceptional flexibility, allowing the installation of plugins as needed. Moreover, it seamlessly integrates with cloud architectures, providing excellent support for distributed execution in a microservices environment.
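
The Circuit Breaker pattern named in the Spring Cloud Circuit Breaker row above can be illustrated with a minimal, library-independent sketch. It is written in Python for brevity and is not the Spring Cloud Circuit Breaker API; the class name, the failure threshold and the reset timeout are illustrative assumptions.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Illustrative circuit breaker (hypothetical helper, not a Spring Cloud class)."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed default, for illustration
        self.reset_timeout = reset_timeout          # seconds before a trial call is allowed
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        # While open, fail fast instead of waiting on an unhealthy dependency.
        if self.state == "open":
            raise CircuitOpenError("circuit open; call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call (half-open) or too many failures (closed) opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In this sketch a caller wraps each remote invocation in `breaker.call(...)`; once the threshold is reached, further calls are rejected immediately until the timeout elapses, which is how the pattern limits the impact of high response times and network errors.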

6.3.2. Simpl-Open Technology Choices

The table below presents the list of Open-Source Software currently used by Simpl-Open.

Capability	Sub-Capability	Tool	Description	URL	Rationale	Additional Considerations
Discovery	Metadata	SD (GX-Trustframework)	Metadata of Participants and service offerings (App, Data, Infra) described as Gaia-X Self-Description using an ontology. The SD uses a linked data format and allows the definition of constraints and quality rules.	https://gaia-x.gitlab.io/policy-rules-committee/trust-framework/	Licence: CreativeCommons Community Support: Gaia-X Documentation Available: here Extensibility: yes Adoption by Business: Gaia-X Lighthouse, all Data Space initiatives claiming to be Gaia-X compliant.	Ontology widely adopted in Data Space initiatives. Best choice to convince participants to provide self-descriptions in this way. It can be easily enhanced with sector-specific parameters.
Discovery	Metadata	XFSC SD Tooling	Tooling to create self-descriptions that describe the service offerings (Data, App, Infrastructure).	https://gitlab.eclipse.org/eclipse/xfsc/self-description-tooling	Licence: Apache 2.0 Community Support: XFSC Documentation Available: yes Extensibility: yes Adoption by Business: TrustedCloud (Spec)	No other FOSS tool is available to create customised SD. Schemas can be created via the LinkML Generator Tool. Fully customisable SD definitions are possible.
Discovery	Catalogue at Governance Authority	XFSC Federated Catalogue	Federated Catalogue providing a discovery capability to look up Self Descriptions of service offerings (Data, App, Infrastructure).	https://gitlab.eclipse.org/eclipse/xfsc/cat	Licence: Apache 2.0 Community Support: XFSC Documentation Available: Web, PDF Extensibility: yes Adoption by Business: Gaia-X Lighthouse	The only FOSS implementation of a federated catalogue supporting SD, i.e. validating SD when published and searching SD through an internal search engine. It also already supports semantic validation. In addition, the search engine is based on NoSQL, which provides the basis for the knowledge search needed for M2M use cases. By comparison, CKAN uses PostgreSQL as its database and is therefore not well suited to ontology searches. Plugins such as SPARQL extensions enable limited ontology search capabilities; however, they do not scale and will fail on the complex knowledge graph searches needed for ML algorithms. Either a property graph database such as Neo4j or an RDF triple store such as Apache Jena Fuseki or Virtuoso is needed.
Discovery	Credential Manager at Provider	XFSC OCM	The credential manager that stores the Self Descriptions on the organisational side. It also covers signing of Self Descriptions created by a provider, revocation of credentials, and verification and retrieval of credentials as microservices.	https://gitlab.eclipse.org/eclipse/xfsc/organisational-credential-manager-w-stack	Licence: Apache 2.0 Community Support: XFSC Documentation Available: Web Extensibility: yes Adoption by Business:	Created as part of XFSC, it best matches the needs for SD. It can easily be replaced with any other wallet solution providing the same protocols for exchanging credentials (OIDC4VP and OIDC4VC).
Access control & trust	Authentication Provider	Keycloak	Open-source identity and access management. Adds authentication to applications and secures services with minimum effort, with no need to deal with storing or authenticating users. Keycloak provides user federation, strong authentication, user management, fine-grained authorisation, and more.	https://www.keycloak.org/	License: Apache 2.0 Community Support: Large community built over years of activity (21K stars on GitHub) https://www.keycloak.org/community Documentation Available: Extensive documentation covering every aspect of the tool https://www.keycloak.org/documentation Extensibility: yes, natively and through the REST API (https://www.keycloak.org/docs/latest/server_development/index.html) Adoption by Business: Widespread adoption around the world	An on-premise solution that is a de facto standard and offers a wide range of features and a native (Java) extensible interface.
Provisioning	VM/Container/Storage provisioning	Crossplane	Crossplane is an open-source Kubernetes add-on that allows infrastructure to be defined and automated using Kubernetes-style configuration files. It extends the Kubernetes API to provision and manage cloud resources and services from various providers, such as AWS, GCP, Azure and more, in a unified manner.	https://www.crossplane.io/	License: Apache 2.0 Community Support: Large community built over years of activity (9.6K stars on GitHub) https://www.crossplane.io/community Documentation Available: https://docs.crossplane.io/ Extensibility: Yes, highly extensible with support for custom resource definitions (CRDs) and integration with various cloud providers and on-premises environments. Adoption by Business: Increasingly adopted by organisations seeking Kubernetes-native solutions for infrastructure automation and application management.	Multi-cloud environment operation. Crossplane simplifies infrastructure management by bringing the benefits of the Kubernetes declarative model to cloud provisioning. By using Crossplane, teams can leverage the familiar Kubernetes tools and workflows to manage infrastructure alongside their applications, leading to a more consistent, scalable, and efficient infrastructure management process. Crossplane is favoured over Terraform (https://blog.crossplane.io/crossplane-vs-terraform/), partly because of its more permissive licence.
Provisioning	VM/Container/Storage provisioning	OpenTofu	OpenTofu is an open-source Infrastructure as Code (IaC) tool and community-driven fork of Terraform. It allows the definition, provisioning, and management of infrastructure using declarative configuration files. OpenTofu supports a wide range of cloud providers like AWS, Azure, and GCP, as well as on-premise systems. It enables infrastructure automation, version control, and consistency across deployments, while ensuring long-term openness and flexibility free from proprietary constraints.	https://opentofu.org/	License: Fully open-source under the MPL 2.0 (Mozilla Public License), maintained by the Linux Foundation under the OpenTofu project Community Support: Rapidly growing, community-led project with strong momentum and contributions from the broader Terraform ecosystem Documentation Available: https://opentofu.org/docs/ Extensibility: Yes, fully extensible with compatibility for existing Terraform providers, modules, and plugins, supporting a wide range of cloud and on-premise environments Adoption by Business: Increasingly adopted by organizations seeking a vendor-neutral, open-source alternative for infrastructure automation without commercial licensing restrictions	Multi-cloud environment operation. OpenTofu simplifies infrastructure management by using a declarative configuration model to provision and manage cloud and on-premise resources. As a community-driven fork of Terraform, OpenTofu enables teams to define infrastructure as code using a consistent language and workflow, ensuring predictable, repeatable, and scalable infrastructure provisioning. This open and transparent approach supports automation, collaboration, and better alignment between development and operations teams, without reliance on proprietary tooling.
Provisioning	VM/Container/Storage provisioning	ArgoCD	ArgoCD is a declarative, GitOps continuous delivery tool for Kubernetes. It is designed to simplify the process of deploying and managing applications on Kubernetes clusters. ArgoCD uses a GitOps approach, which means it uses Git repositories as the source of truth for application configurations.	https://argo-cd.readthedocs.io/en/stable/	License: Apache 2.0 Community Support: Large and active community with many contributors and users Documentation Available: Extensive documentation available, including user guides, API references, and tutorials Extensibility: Highly extensible with support for custom plugins and integrations Adoption by Business: Widely adopted by businesses and organisations, including many Fortune 500 companies	ArgoCD is selected for its ability to simplify and automate the deployment and management of Kubernetes applications. Its declarative, GitOps approach ensures consistency and reproducibility across environments, while features like automated rollouts and rollbacks enhance application availability and resilience. By leveraging ArgoCD, it is possible to set up a continuous delivery pipeline and reduce the complexity associated with manual configuration and deployment processes.
Provisioning	Workflow Orchestration	Argo Workflows	Argo Workflows is an open-source container-native workflow engine for orchestrating parallel jobs on Kubernetes. It allows defining complex workflows as YAML files and executing multi-step pipelines directly on Kubernetes, ideal for CI/CD, ML pipelines, and data processing.	https://argo-workflows.readthedocs.io/	License: Apache 2.0 Community Support: Active and large community with contributors from major enterprises and open-source advocates Documentation Available: Yes – comprehensive user and developer guides Extensibility: Highly extensible with support for custom templates, DAG execution, artifact passing, and event triggers Adoption by Business: Widely adopted by organizations for CI/CD pipelines, ML model training, and infrastructure automation	Argo Workflows is chosen for its Kubernetes-native support and flexibility in defining scalable, complex workflows. It allows seamless integration with GitOps, supports step-based and DAG-based workflows, and handles parallel execution effectively. Ideal for task orchestration in cloud-native environments.
Provisioning	Event-Driven Automation	Argo Events	Argo Events is an open-source event-driven workflow automation framework for Kubernetes. It allows users to trigger workflows (e.g., Argo Workflows) based on events from various sources such as webhooks, Kafka, S3, schedules, and more.	https://argoproj.github.io/argo-events/	License: Apache 2.0 Community Support: Large and growing community under the Argo Project umbrella Documentation Available: Yes – detailed documentation with examples and use cases Extensibility: Yes – supports sensors, triggers, and integration with external systems Adoption by Business: Used by teams implementing event-driven GitOps or infrastructure automation workflows	Argo Events complements Argo Workflows by enabling fine-grained, declarative automation based on real-time system and external events. It helps build responsive, scalable pipelines triggered by real-world conditions or scheduled timers in Kubernetes-native environments.
Provisioning	Post Configuration / Application Deployment	Cloud-init	Cloud-init is a popular tool for automating the initialisation and configuration of cloud instances. It is designed to simplify the process of deploying and configuring cloud instances, and is widely used in cloud computing environments.	https://cloud-init.io/	License: Apache 2.0 Community Support: Moderate community support with many users and contributors Documentation Available: Extensive documentation available, including user guides, configuration examples, and troubleshooting guides Extensibility: Highly extensible with support for custom scripts and plugins Adoption by Business: Widely adopted by businesses and organisations, particularly in the cloud computing and DevOps spaces	Cloud-init is selected for its ability to automate the initialisation and configuration of cloud instances, as well as run post-provisioning tasks such as deploying or installing applications, automated network configuration, storage setup, and security hardening. Its modular and customisable approach ensures that instances are properly configured and secured, reducing the risk of errors and improving overall reliability.
Provisioning	Storage (Repository)	Gitea	Gitea is a lightweight, open-source, and highly extensible repository management tool that provides a simple and intuitive way to manage code repositories. It offers a web-based interface for creating, managing, and organising code repositories, and provides features such as collaboration, version control, and issue tracking.	https://about.gitea.com/	License: MIT License Community Support: Large and active community with many contributors and users Documentation Available: Extensive documentation available, including user guides, API references, and tutorials Extensibility: Highly extensible with support for custom plugins and integrations Adoption by Business: Widely adopted by businesses and organisations, particularly in the open-source and developer communities	Gitea is chosen for its ability to provide a lightweight, flexible, and highly extensible repository management solution. Its ease of use, scalability, and customisability make it an ideal tool for managing code repositories.
Data Exchange	Data Exchange Service	EDC	The data exchange service implementing the negotiation protocol (Dataspace Protocol).	https://projects.eclipse.org/projects/technology.edc	Licence: Apache 2.0 Community Support: Tractus-X and EDC. Documentation Available: Tractus-X and EDC Extensibility: well-structured interfaces to customise the component Adoption by Business: Catena-X, Eona-X, several other data initiatives using forks of it. Known Friends of EDC.	Can be replaced with any other IDS connector implementing the IDSA Dataspace Protocol and using ODRL expressions for policy. The EDC connector is chosen because it has good documentation, provides good interfaces and can be easily customised. Second, two active communities jointly drive its development: Tractus-X and EDC. In addition, the first IDS connector to pass the IDSA certification was the TSI connector, which is based on EDC. EDC is also the only available IDS connector that has already implemented the Dataspace Protocol. Other initiatives will follow.
Monitoring, Logging, Reporting, Audit	Monitoring and Logging	ELK (Elasticsearch, Logstash, Kibana)	Reliably and securely take data from any source, in any format, then search, analyse, and visualise.	https://www.elastic.co/	License: ELv2 / SSPL Community Support: Largest community in the industry https://www.elastic.co/fr/community Documentation Available: Yes https://www.elastic.co/docs Extensibility: yes Adoption by Business: Most widely adopted open-source logging/monitoring/reporting/auditing stack in the world. Rich in capabilities that can be configured rather than custom-built, including but not limited to: ingestion pipelines, visualisations, health check support and tracing.	The ELK stack is an industry standard for log management and data analysis due to its scalability and powerful features. Elasticsearch handles large volumes of data with real-time search and analytics, Logstash processes and ingests data from various sources, and Kibana provides intuitive visualisations and reporting. Being open-source, it benefits from a large community, continuous improvements, and extensive plugins. Security features like TLS encryption, role-based access control, and audit logging ensure data protection, making ELK a reliable and versatile solution for diverse use cases.
Access control & trust	Commons	OpenBao	Used by Keycloak, EJBCA, Spring Cloud Gateway, and when access to stored credentials is needed by a Java Backend.	https://openbao.org/	License: Mozilla Public License 2.0 Community Support: High, with active forums and GitHub discussions. https://github.com/orgs/openbao/discussions Documentation Available: Yes, extensive documentation covering all aspects of setup, usage, and integration. https://openbao.org/docs/ Extensibility: Yes, supports plugins and integrations with major cloud platforms and authentication systems. Adoption by Business: Widely adopted globally as an open-source alternative to HashiCorp Vault.	OpenBao is an open source, community-driven fork of HashiCorp Vault managed by the Linux Foundation. It is a secrets management and encryption platform that securely stores, manages, and encrypts sensitive data such as passwords, API keys, and certificates. It provides secure access, auditing, and revocation of secrets across distributed infrastructure, applications, and services, enabling secure development, deployment, and operation of modern systems.
Access control & trust	Commons	MinIO	MinIO is a high-performance, S3 compatible object store. It is built for large scale AI/ML, data lake and database workloads. It is software-defined and runs on any cloud or on-premises infrastructure.	https://min.io/	License: GNU AGPL v3 Community Support: High, with active GitHub issues, a strong community forum, and Slack channels. https://slack.min.io/ Documentation Available: Yes, comprehensive and detailed documentation covering deployment, configuration, and API usage. https://min.io/docs/ Extensibility: Yes, supports integrations with multiple cloud platforms, Kubernetes, and third-party tools for storage and analytics. Adoption by Business: Growing adoption worldwide, particularly in data-driven industries leveraging object storage for modern workloads like AI/ML, big data, and cloud-native applications.	MinIO is an open-source, Amazon S3-compatible, distributed object storage server for cloud-native and edge computing applications. It provides a highly available, scalable, and performant storage solution with features like erasure coding, bitrot protection, and encryption, making it suitable for a wide range of use cases, from development to production.
Access control & trust	Commons	PostgreSQL	Used by Keycloak, EJBCA, Spring Cloud Gateway, and when a DB is needed by a Java Backend.	https://www.postgresql.org/	License: PostgreSQL License Community Support: Very high https://www.postgresql.org/community/ Documentation Available: Yes, extensive documentation covering every aspect of the database https://www.postgresql.org/docs/ Extensibility: yes Adoption by Business: Widespread adoption around the world	The World's Most Advanced Open-Source Relational Database
Access control & trust	Common Identity provider	EJBCA	One of the world's most popular PKIs, EJBCA gives you time-proven flexibility and robustness. Unlike other open-source certificate authority and PKI solutions, EJBCA is platform-independent and can be scaled up and down to match your needs.	https://www.ejbca.org/	License: LGPL-2.1 Community Support: Large community based on the mailing list, GitHub, forums, and Slack channel Documentation Available: Extensive documentation that explains how to use and interact with the tool https://docs.keyfactor.com/ejbca/latest/ Extensibility: yes, using the REST API Adoption by Business:	The most mature (23 years), widely used and feature-rich Java-based PKI solution in the open-source landscape.
Access control & trust	Commons Authorisation	Spring Cloud Gateway	A Spring project that provides libraries for building an API Gateway on top of Spring WebFlux or Spring WebMVC. Spring Cloud Gateway aims to provide a simple, yet effective way to route to APIs and provide cross-cutting concerns to them such as security, monitoring/metrics, and resiliency.	https://spring.io/projects/spring-cloud-gateway	License: Apache 2.0 Community Support: Large community based on the Spring Framework community https://spring.io/community Documentation Available: Extensive documentation covering every aspect of the framework https://docs.spring.io/spring-cloud-gateway/reference/ Extensibility: yes, very high Adoption by Business: Widespread adoption around the world	Based on Spring, a widely used Java back-end framework, it also offers a reactive implementation that supports a high level of resiliency and extensibility.
Message Broker	Commons	Apache Kafka	Apache Kafka is an open-source distributed event streaming platform designed for building real-time data pipelines and streaming applications. It serves as a high-throughput, fault-tolerant, and horizontally scalable platform that can handle large volumes of data and stream events in real-time. Kafka uses a publish-subscribe model and durable storage for storing and processing streams of records. Message Brokerage: In addition to its streaming capabilities, Kafka can effectively serve as a message broker, facilitating communication between different components of a system through the asynchronous exchange of messages. It provides features like message queueing, topic partitioning, and consumer group management, making it suitable for implementing a decoupled, event-driven architecture.	https://kafka.apache.org/	License: Apache 2.0 Community Support: Large community (29K stars on GitHub) https://kafka.apache.org/project Documentation Available: yes - https://kafka.apache.org/documentation/ Extensibility: yes Adoption by Business: Widespread adoption around the world	Apache Kafka's role as a message broker offers several advantages for handling asynchronous events and message-based communication within distributed systems: Scalability: Kafka's distributed architecture allows for horizontal scaling, enabling high throughput and low latency message processing even under heavy loads. Durability: Messages are stored durably in Kafka, providing fault tolerance and preventing data loss in case of system failures. Reliability: Kafka ensures reliable delivery of messages to consumers through features like message retention and configurable acknowledgment settings. Decoupling: By decoupling producers and consumers through topics, Kafka enables loosely coupled communication between system components, improving flexibility and resilience. Real-time Processing: Kafka's ability to process and react to events in real-time makes it suitable for use cases requiring low-latency messaging, stream processing, and complex event-driven architectures.
Cache	Commons	Redis	Redis Cache is an in-memory data structure store widely used as a caching solution to enhance the performance of applications. By storing frequently accessed data in memory, Redis enables faster data retrieval compared to disk-based databases. It supports a variety of data types such as strings, lists, sets, and hashes, making it versatile for different caching needs. Redis is known for its high throughput, low latency, and scalability, often used for caching web pages, session management, real-time analytics, and message brokering. It also supports persistence, replication, and automatic failover for reliability.	https://redis.io/	License: Redis Source Available License 2.0 (RSALv2) Community Support: high Documentation Available: yes, extensive Extensibility: yes Adoption by Business: Widespread adoption around the world	Redis Cache offers several advantages for improving application performance and scalability: Performance: As an in-memory data store, Redis delivers extremely low-latency and high-throughput data retrieval, significantly boosting application speed. Scalability: Redis supports horizontal scaling through clustering and partitioning, allowing it to handle large datasets and heavy traffic efficiently. Flexibility: With support for various data structures such as strings, lists, sets, and hashes, Redis can handle diverse caching and real-time data processing use cases. Persistence and Reliability: Redis offers optional persistence mechanisms like snapshots and append-only files, ensuring durability, while replication and automatic failover provide high availability and fault tolerance. Integration: Redis integrates easily with various programming languages and frameworks, making it a popular choice for developers seeking an efficient, easy-to-deploy caching solution.
Data Orchestration	Data Orchestration	Dagster OSS	Dagster is a modern data orchestration platform designed to help teams build, run, and observe data workflows in a structured and reliable way. It provides a framework for defining data transformations as modular, testable units called ops, which can be composed into pipelines or jobs. With strong typing, configuration schemas, and built-in observability, Dagster ensures that data workflows are predictable and maintainable, making it easier to catch errors early and manage complex dependencies. It also integrates with scheduling, monitoring, and external resources, enabling seamless automation and coordination of data tasks across diverse environments.	https://dagster.io/	License: Dagster is released under the Apache-2.0 license Community Support: Dagster has a large, active open-source community with contributions, discussions, and help across channels like GitHub (issues & discussions), Slack, and other community forums, plus many integrations maintained by contributors. Documentation Available: Extensive official documentation exists that covers getting started, core concepts, integrations, and advanced usage, including tutorials, API references, and community-oriented guidance. Extensibility: Dagster is highly extensible with custom integrations, plugins, resources, and tools, and it also integrates widely with external services and APIs. Adoption by Business: Dagster is used in production by hundreds of companies worldwide across industries such as software, financial services, and internet businesses	Support for modern concepts like built-in data lineage, test-first development, and developer-friendly UX.
Discovery	Schema Management	Fuseki	Fuseki is the SPARQL server of Apache Jena, a Java framework for building Semantic Web applications. Jena provides extensive Java libraries that help developers write code handling RDF, RDFS, RDFa, OWL and SPARQL in line with published W3C recommendations. Jena includes a rule-based inference engine to perform reasoning based on OWL and RDFS ontologies, and a variety of storage strategies to store RDF triples in memory or on disk.	https://jena.apache.org/documentation/fuseki2/	License: Fuseki is released under the Apache-2.0 license Community Support: High Documentation: Highly available Extensibility: High Adoption by Business: It is the standard for handling RDF/RDFS in Java and beyond

The following table links the OSS components to their architecture documentation and installation guide:

OSS	Architecture Documentation	Installation Guide
XFSC Signer	https://gitlab.eclipse.org/eclipse/xfsc/tsa/signer	https://gitlab.eclipse.org/eclipse/xfsc/tsa/signer/-/blob/main/deployment/helm/README.md
XFSC Federated Catalogue	https://gaia-x.gitlab.io/data-infrastructure-federation-services/cat/architecture-document/architecture/catalogue-architecture.html	https://gitlab.eclipse.org/eclipse/xfsc/cat/fc-service/-/wikis/Installation%20&%20Configuration%20Guide
HashiCorp Vault	https://developer.hashicorp.com/vault/docs/internals/architecture	https://developer.hashicorp.com/vault/docs/install
Keycloak	https://www.keycloak.org/docs/latest/authorization_services/index.html#_overview_architecture	https://www.keycloak.org/guides
EJBCA	https://doc.primekey.com/ejbca/ejbca-introduction/ejbca-architecture/internal-architecture	https://docs.keyfactor.com/ejbca/latest/ejbca-installation
Crossplane	https://docs.google.com/document/d/1whncqdUeU2cATGEJhHvzXWC9xdK29Er45NJeoemxebo/edit#heading=h.annq8ww6da48	https://docs.crossplane.io/latest/software/install/
OpenTofu	https://github.com/opentofu/opentofu/blob/main/docs/architecture.md	https://opentofu.org/docs/intro/install/
ArgoCD	https://argo-cd.readthedocs.io/en/stable/operator-manual/architecture/	https://argo-cd.readthedocs.io/en/stable/operator-manual/installation/
Argo Workflows	https://argo-workflows.readthedocs.io/en/stable/architecture/	https://argo-workflows.readthedocs.io/en/stable/quick-start-guide/
Argo Events	https://argoproj.github.io/argo-events/concepts/	https://argoproj.github.io/argo-events/installation/
Cloud-init	https://cloudinit.readthedocs.io/en/latest/	https://cloudinit.readthedocs.io/en/latest/index.html
Gitea	https://docs.gitea.com/category/installation	https://docs.gitea.com/
Apache Kafka	https://kafka.apache.org/documentation/	https://kafka.apache.org/quickstart
Redis	https://redis.io/learn/howtos/quick-start	https://redis.io/docs/latest/operate/oss_and_stack/install/install-redis/
SD (GX-Trust Framework)	N/A	https://code.europa.eu/simpl/simpl-open/development/data1/sdtooling-validation-api-be#installation
XFSC SD Tooling	https://gitlab.eclipse.org/eclipse/xfsc/self-description-tooling	https://gitlab.eclipse.org/eclipse/xfsc/self-description-tooling/sd-creation-wizard-api
XFSC OCM	https://gitlab.eclipse.org/eclipse/xfsc/organisational-credential-manager-w-stack/architecture-documentation	https://gitlab.eclipse.org/eclipse/xfsc/organisational-credential-manager-w-stack/deployment
EDC	https://eclipse-edc.github.io/documentation/	https://eclipse-edc.github.io/documentation/for-adopters/distributions-deployment-operations/
ELK (Elastic, Logstash, Kibana)	https://www.elastic.co/docs	https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html
MinIO	https://min.io/docs/minio/container/operations/concepts/architecture.html	https://min.io/docs/minio/container/operations/installation.html
PostgreSQL	https://www.postgresql.org/docs/current/tutorial-arch.html	https://www.postgresql.org/docs/current/install-binaries.html
Spring Cloud Gateway	https://cloud.spring.io/spring-cloud-gateway/reference/html/#gateway-how-it-works	https://cloud.spring.io/spring-cloud-gateway/reference/html/
Fuseki	https://jena.apache.org/documentation/fuseki2/	https://jena.apache.org/documentation/fuseki2/fuseki-server.html

6.4. Detailed Technical Specifications

This section presents technical implementation details that are particularly relevant for contributing to Simpl-Open and/or implementing it in a Data Space.

6.4.1. Identification, Authentication & Authorisation

The IAA 2-Tier approach in Simpl-Open is already described in the Data Spaces Concepts section of the Simpl-Open High-Level Overview.

Because of the 2-Tier approach, the components are grouped into Tier 1 and Tier 2.

2.27.1. Tier 1 IAA Components

Tier 1 is meant to be under the control of the governance of the organisation that became a Participant of a Dataspace. Its components are local to the participant agent and are dedicated to enabling and controlling the access of the organisation’s end users to the resources/functionalities offered by the Simpl-Open agent. They are:

Identification and Authentication

The component responsible for identification and authentication is the Tier 1 Authentication Provider realised using an extended version of Keycloak (OpenID Connect Identity Provider) integrated with the User & Roles component.

User & Roles

The User & Roles component is used to define the roles used by Authorisation Tier 1, to manage the role assignments of Tier 1 Authentication Provider end users, and to assign identity attributes to roles (described in the Identity Attributes and Tier 1 User Roles sections below).

Authorisation Tier 1

This component manages permissions, determining what actions each end user is authorised to perform on a specific Agent resource. It plays a critical role in maintaining system security by ensuring that users have access only to the functions they need. It is realised through an API Gateway, more specifically Spring Cloud Gateway, and relies on the Tier 1 Authentication Provider to retrieve the roles of authenticated end users. These roles are used to enforce RBAC (Role-Based Access Control) policies that authorise or deny access to the requested agent resource.

RBAC policies are applied to check whether the end user is authorised to access the requested agent resource/functionality based on their assigned roles.
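
The role check described above can be illustrated with a minimal Python sketch. It is not the Spring Cloud Gateway configuration used by Simpl-Open: the path prefixes, the `RBAC_RULES` table and the `is_authorised` function are hypothetical and only show the principle of matching the roles carried by the end user against the roles allowed for a resource.

```python
# Hypothetical RBAC table: path prefix -> role values allowed to call it.
# The paths are illustrative; the role values are taken from the
# Tier 1 User Roles list of this document.
RBAC_RULES = {
    "/onboarding/requests": {"NOTARY", "T2IAA_M"},
    "/users-roles": {"T1UAR_M"},
}


def is_authorised(path: str, client_roles: list[str]) -> bool:
    """Return True if at least one of the user's roles is allowed on the path."""
    for prefix, allowed_roles in RBAC_RULES.items():
        if path.startswith(prefix):
            return bool(allowed_roles.intersection(client_roles))
    # Deny by default when no rule matches the requested resource.
    return False
```

In this sketch a request is denied by default when no rule matches, which is an assumption of the example rather than a documented behaviour of the gateway.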

2.27.2. Tier 1 Credential

The Tier 1 credential consists of an OpenID Connect (OAuth 2.0) access token issued by the Tier 1 Authentication Provider, in the form of a JWT (RFC 7519) that contains the standard claims extended with the following four custom claims:

Client Roles

The client roles claim is an array containing the list of roles assigned to the end user through the functionalities of the User & Roles component:

client-roles : [ “NOTARY”, “ONBOARDER_M”]

This value is included in every Tier 1 access token as the JWT (RFC 7519) claim “client-roles”.

Participant ID

The participant ID is the unique and immutable ID used to identify the participant in the tier 2 IAA process. It is represented by a GUID formatted as shown in the following example:

participant_id : “02309243-2f77-456a-a1db-d8e8bb006f74”

This value is included in every Tier 1 access token as the JWT (RFC 7519) claim “participant_id”.

Note that the participant ID never changes over time.

Credential ID

The credential ID is the unique ID used to identify the participant's current credential in the Tier 2 IAA process. It is the Base58BTC encoding ( https://digitalbazaar.github.io/base58-spec ) of the SHA-384 hash of the participant's X.509 certificate used to communicate in the data space, as shown in the following example:

credential_id : “z8A3E8X4NkhgnFrczqy54SZjrnoiz6At3rqLosWN75WCkKQEgxmkA3yqpCPtPqHSnS9”

This value is included in every Tier 1 access token as the JWT (RFC 7519) claim “credential_id”.

Note that the credential ID changes over time: for example, when a credential is compromised, new credentials must be issued.
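
As an illustration of the derivation described above, the following Python sketch computes a credential ID from certificate bytes. It rests on two assumptions that are not stated in this document: the hash is taken over the DER encoding of the certificate, and the leading "z" seen in the example is the multibase prefix for Base58BTC. The function names are hypothetical.

```python
import hashlib

# Bitcoin Base58 alphabet (no 0, O, I or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc_encode(data: bytes) -> str:
    """Encode bytes using the Base58 Bitcoin alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by the first alphabet symbol.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def credential_id(cert_der: bytes) -> str:
    """Derive a credential ID from certificate bytes (assumed DER-encoded)."""
    digest = hashlib.sha384(cert_der).digest()
    # Assumption: "z" is the multibase prefix announcing Base58BTC.
    return "z" + base58btc_encode(digest)
```

Because the ID is derived from the certificate itself, issuing a new certificate yields a new credential ID, which matches the note above.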

Identity Attributes

Participant identity attributes are used to enable the specification of access to a subset of functionalities for a participant. In the context of Tier 2 communication, the presence of Identity Attributes ensures ABAC compliance. Specifically, services provided by dataspace participants to other participants can be protected by one or more Attributes.

A subset of those attributes can be assigned to Tier 1 roles (see Tier 1 User Roles), meaning that every end user holding such a role also holds the attributes assigned to it. They are represented as in the following example:

identity_attributes : [ “DATA_CONSUMER”, “DATA_ACCESS_LEVL1”]

This value is included in every Tier 1 access token as the JWT (RFC 7519) claim “identity_attributes”.
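
To show how the four custom claims sit next to the standard claims, the sketch below builds an unsigned sample token and reads its payload. The `sub` value, the helper names and the truncated-looking sample values are illustrative only. The decoding step deliberately skips signature verification, so it is suitable for inspection but not for authorisation decisions; a real consumer must verify the token against the Tier 1 Authentication Provider first.

```python
import base64
import json


def _b64url(obj: dict) -> str:
    """Serialise a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (second segment) of a JWT WITHOUT verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments omit base64 padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


# Illustrative, unsigned token carrying the four custom claims.
sample_claims = {
    "sub": "hypothetical-end-user",
    "client-roles": ["NOTARY", "ONBOARDER_M"],
    "participant_id": "02309243-2f77-456a-a1db-d8e8bb006f74",
    "credential_id": "z8A3E8X4NkhgnFrczqy54SZjrnoiz6At3rqLosWN75WCkKQEgxmkA3yqpCPtPqHSnS9",
    "identity_attributes": ["DATA_CONSUMER", "DATA_ACCESS_LEVL1"],
}
sample_token = ".".join([_b64url({"alg": "none"}), _b64url(sample_claims), ""])

claims = decode_jwt_payload(sample_token)
print(claims["client-roles"], claims["identity_attributes"])
```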

2.27.3. Tier 1 User Roles

Tier 1 roles are the core elements on which the RBAC policies are enforced and are also used by the participant governance to assign a subset of Participant Identity Attributes (see Identity Attributes) to its end users.

Here is the updated list of Roles that are used inside Simpl-Open:

Human Readable Role Name	Role Value	Description	Predefined	Participant	Assigned Identity Attributes	Id Component
Tier 2 authorisation manager	T2IAA_M	In the Dataspace Governance Authority, the user in charge of defining and changing the onboarding procedure itself, such as setting up the mandatory documents and the rules followed by the onboarding process.	true	Governance Authority		IAA-ONB-FE IAA-ONB-BE
Tier 2 authorisation operator	NOTARY	Tier 2 authorisation operator, in charge of handling onboarding requests and following their process. This user asks for further documents, comments on onboarding requests and rejects/approves them.	true	Governance Authority		IAA-ONB-FE IAA-ONB-BE
Tier 2 setup administration role	ONBOARDER_M	Tier 2 setup administrator role, in charge of finalising the Tier 2 setup of an agent installation.	true	All Participant		IAA-U&R-FE IAA-U&R-BE
Tier 2 identity attributes manager	IATTR_M	This role is present only in the Dataspace Governance Authority and its duties are to cover the whole lifecycle of Identity Attributes, from the creation and management to the assignment to participants	true	Governance Authority		IAA-SAP-FE IAA-SAP-BE
Tier 1 user and role manager	T1UAR_M	Tier 1 user and roles manager. In the Dataspace Governance Authority, this role will manage local roles and dataspace identity attributes (defining them and assigning them to participant types + defining their assignability). In any dataspace participant, this role will manage local roles and identity attributes assignment to local roles	true	All Participant		IAA-U&R-FE IAA-U&R-BE
Applicant Representative	APPLICANT	End user responsible for onboarding an applicant dataspace participant, who signs up on the public dataspace onboarding site to manage the onboarding request. The applicant's primary task is to create an onboarding request and respond to the interactions of the Tier 2 authorisation operator (NOTARY) to get the onboarding request approved	true	Governance Authority		IAA-ONB-FE IAA-ONB-BE
	Ro-MU-CA	Role defined in XFSC Federated Catalogue: Catalogue Administrator	true	Governance Authority
	Ro-MU-A	Role defined in XFSC Federated Catalogue: Participant Administrator	true	Providers	DATA_PROVIDER_PUBLISHER APP_PROVIDER_PUBLISHER INFRA_PROVIDER_PUBLISHER
	Ro-SD-A	Role defined in XFSC Federated Catalogue: Self-Description Administrator	true	Governance Authority
	Ro-Pa-A	Role defined in XFSC Federated Catalogue: Participant User Administrator	true	Providers	DATA_PROVIDER_PUBLISHER APP_PROVIDER_PUBLISHER INFRA_PROVIDER_PUBLISHER
Researcher	RESEARCHER	Researcher who is able to access research only datasets	false	Consumer	DATA_SEARCHER
SD Publisher	SD_PUBLISHER	Role defined for the user who is responsible for creating and publishing the self-description on the catalogue	true	Providers	DATA_PROVIDER_PUBLISHER DATA_SEARCHER
SD Consumer	SD_CONSUMER	Tier-1 Role for Consumer	true	Consumer	CONSUMER
Schema Manager Admin	GA_SCHEMA_ADMIN	Tier-1 Role for Schema Admin	true	Governance Authority
Schema Manager Viewer	GA_SCHEMA_VIEWER	Tier-1 Role for Schema Viewer	true	Governance Authority
Kibana Business User	KIBANA_BUSINESS_USER	Role for accessing Kibana as a business user (bound to local Kibana user)	true	All Participant
Kibana Admin	KIBANA_ADMIN	Role for accessing Kibana as an admin (bound to local Kibana user)	true	All Participant
Data Orchestration Developer	ORCH_DEVELOPER	Role for developing workflows and services for data orchestration	true	Consumer Provider
Data Orchestration Admin	ORCH_ADMIN	Role for administration and management of the orchestration, like setting schedules, retry of Workflows or Monitoring	true	Consumer Provider
Infrastructure Provider Admin	INFRA_ADMIN	Role defined for management of all the Infrastructure Provider's cloud resources.	true	Provider
Infrastructure Provider Deployer	INFRA_DEPLOYER	Role defined for deactivation and triggering of the Infrastructure Provider's cloud resources.	true	Consumer

2.27.4. Tier 2 IAA Components

Tier 2 is meant to be under the control of the Dataspace Governance Authority and is used by all participant agents to ensure secured and encrypted communications (see the Encryption and Guaranteed Authenticity / Integrity sections below). Its components are both centralised (in the Authority Agent) and decentralised (local to all agents).

Centralised

Identity Provider Federation

This component includes functionalities for identity information and for Tier 2 credential creation, validation and management.

Starting from the onboarding process, the Identity Provider is used to:

Create the credential: when an applicant participant is onboarded by approving its onboarding request, a Tier 2 credential is created by the identity provider. The participant installs the credential within its own agent.
Validate the credential: the identity provider verifies the Tier 2 identity credentials it receives.
Manage the credential: during its lifecycle, a credential can be either renewed or revoked by the Dataspace Governance Authority.

Security Attribute Provider Federation

To implement ABAC policies, which are used in agent-to-agent communications, a set of valid and known Identity Attributes is needed. These attributes are assigned to each dataspace participant by the Governance Authority.

The Security Attribute Provider component implements several functionalities:

Identity Attributes management (create, delete and modify identity attributes)
Identity Attributes Participant assignment (both during the Onboarding and after)
Temporary attestation of the participant’s identity attributes in the form of a signed ephemeral proof

Decentralised

Tier 2 Authentication Provider

This component is responsible for keeping the Tier 2 Credential received during the onboarding process and implements all Tier 2 Identification and Authentication functionalities such as:

Safely store the participant agent's Tier 2 Credential and its key pair.
Check and validate any Tier 2 credentials coming from other participant agents during mTLS authentication against the Identity Provider Federation.
Check and validate the ephemeral proof received from other participant agents after a successful mTLS authentication.
Check and validate the Tier 1 credential forwarded by other participant agents against the ephemeral proof (which also contains the public key of the caller's Tier 1 Authentication Provider).
Request an ephemeral proof from the Security Attribute Provider Federation, to be used in secured communications with other participant agents.

Authorisation Tier 2

This component is realised through an API Gateway, more specifically Spring Cloud Gateway. It relies on the Tier 2 Authentication Provider to check the Tier 2 credentials and the ephemeral proof received during the mTLS authentication process, and enforces ABAC (Attribute-Based Access Control) policies to authorise or deny access to the requested agent resource.

ABAC policies are enforced in every agent-to-agent communication by verifying whether the requestor’s attributes permit access to the requested resource. If needed, ABAC policies can be enforced against both the Tier 1 and Tier 2 credentials, to check that the identity attribute is also present in the Tier 1 credential used by the end user of the calling participant agent.
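
The attribute check described above can be sketched as follows. This is not the gateway implementation: the function name and parameters are hypothetical, and the sketch assumes that holding any one of the attributes protecting a resource is sufficient, which this document does not specify.

```python
from typing import Iterable, Optional


def abac_allows(
    protecting_attributes: Iterable[str],
    proof_attributes: Iterable[str],
    tier1_attributes: Optional[Iterable[str]] = None,
) -> bool:
    """Decide access to a resource protected by identity attributes.

    protecting_attributes: attributes configured on the requested resource.
    proof_attributes: attributes attested in the caller's ephemeral proof.
    tier1_attributes: optional 'identity_attributes' claim of the forwarded
        Tier 1 credential; when given, the attribute must appear there too.
    """
    matched = set(protecting_attributes) & set(proof_attributes)
    if tier1_attributes is not None:
        # Stricter mode: the end user must also hold the attribute.
        matched &= set(tier1_attributes)
    return bool(matched)
```

Passing `tier1_attributes` corresponds to the optional case in which the attribute must be present in both the Tier 2 proof and the end user's Tier 1 credential.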

2.27.5. Tier 2 Credential

The Tier 2 credential has the form of an X.509 certificate and is issued by a Certificate Authority embedded in the Identity Provider Federation.

Identity Attributes

Identity attributes are the most powerful and versatile tool at the disposal of the Dataspace Governance Authority to “design” the governance and the rules for interactions between Dataspace participants. Some attributes are built into Simpl-Open (Built-in = true) and cannot be modified or removed.

Two important properties can be used in the definition of Identity attributes:

Assignable: if true, the governance of any Participant that receives this identity attribute can assign it to any of its Tier 1 roles and thereby give it to its end users; if false, the identity attribute is Participant-wide and is considered assigned to all end users of the participant.

IsRight: if true, the identity attribute should be considered a special centralised right.

Here is the updated list of Identity Attributes that are used inside Simpl-Open:

Human Readable Attribute Name	Identity Attribute Value	Description	Built-in	Assignable	IsRight	Component & Endpoint	Location of configuration
Consumer	CONSUMER	Identity attribute used to tag the consumer participant	true	false	false	Should be put in tier2-gateway configuration within GA agent as ABAC configuration	tier2-gateway → spring-configmap.yaml
Data Provider	DATA_PROVIDER	Identity attribute used to tag the data provider participant	true	false	false	Should be put in tier2-gateway configuration within GA agent as ABAC configuration	tier2-gateway → spring-configmap.yaml
Application Provider	APP_PROVIDER	Identity attribute used to tag the application provider participant	true	false	false	Should be put in tier2-gateway configuration within GA agent as ABAC configuration	tier2-gateway → spring-configmap.yaml
Infrastructure Provider	INFRA_PROVIDER	Identity attribute used to tag the infrastructure provider participant	true	false	false	Should be put in tier2-gateway configuration within GA agent as ABAC configuration	tier2-gateway → spring-configmap.yaml
Data Provider Publisher	DATA_PROVIDER_PUBLISHER	Identity attribute needed for publishing Data Catalogue	true	true	true	Should be put in tier2-gateway configuration within GA agent as ABAC configuration	tier2-gateway → spring-configmap.yaml
Application Provider Publisher	APP_PROVIDER_PUBLISHER	Identity attribute needed for publishing Application Catalogue	true	true	true	Should be put in tier2-gateway configuration within GA agent as ABAC configuration	tier2-gateway → spring-configmap.yaml
Infrastructure Provider Publisher	INFRA_PROVIDER_PUBLISHER	Identity attribute needed for publishing Infrastructure	true	true	true	Should be put in tier2-gateway configuration within GA agent as ABAC configuration	tier2-gateway → spring-configmap.yaml
Data searcher	DATA_SEARCHER	Identity attribute used to tag an end user who can act only as a searcher in the catalogue and cannot start a contract negotiation or transfer process	true	true	true	Should be put in tier2-gateway configuration within GA agent as ABAC configuration	tier2-gateway → spring-configmap.yaml

Built-in identity attributes are available by default in every Simpl-Open dataspace and cannot be modified by the Governance Authority. The Governance Authority can add custom (not built-in) identity attributes based on specific needs. For example, if a Governance Authority needs to define access levels to resources, it could introduce three new identity attributes such as:

Human Readable Attribute Name	Identity Attribute Value	Description	Built-in	Assignable	IsRight
Basic Access Level	ACCESS_LEVEL_BASIC	Basic Access Level	false	true	true
Medium Access Level	ACCESS_LEVEL_MEDIUM	Medium Access Level	false	true	true
Full Access Level	ACCESS_LEVEL_FULL	Full Access Level	false	true	true
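
The effect of the Assignable property on individual end users can be sketched as below. The data class mirrors the columns of the tables above, while the function, its parameters and the role-to-attribute mapping format are hypothetical and serve only to illustrate the rule: non-assignable attributes apply to every end user of the participant, assignable ones only through Tier 1 roles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityAttribute:
    value: str
    built_in: bool
    assignable: bool
    is_right: bool


def effective_attributes(
    participant_attributes: list[IdentityAttribute],
    role_assignments: dict[str, set[str]],
    user_roles: list[str],
) -> set[str]:
    """Return the attribute values one end user effectively holds."""
    result = set()
    for attribute in participant_attributes:
        if not attribute.assignable:
            # Participant-wide: considered assigned to all end users.
            result.add(attribute.value)
        elif any(attribute.value in role_assignments.get(role, set())
                 for role in user_roles):
            # Assignable: held only through one of the user's Tier 1 roles.
            result.add(attribute.value)
    return result
```

For instance, with a participant holding CONSUMER (not assignable) and DATA_SEARCHER (assignable, assigned to the RESEARCHER role), a user with the RESEARCHER role holds both values, while a user without that role holds only CONSUMER.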

Encryption

In mTLS (mutual Transport Layer Security) communication, encryption of in-transit data ensures that the information exchanged between a client and a server is protected from interception or tampering. This encryption is achieved through the following process:

TLS Handshake: The client and server perform a TLS handshake, during which they exchange key material and agree on encryption algorithms.
Mutual Authentication : Unlike regular TLS, in mTLS both the client and server authenticate each other by exchanging digital certificates, confirming the identity of both parties.
Symmetric Encryption : After authentication, a symmetric encryption key is established and used to encrypt all subsequent data transmitted between the client and server.

Through this process, data in transit is securely encrypted, preventing unauthorised access or modification, while ensuring that both the client and server are trusted entities.
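
The difference between regular TLS and mTLS can be seen in how the two endpoints are configured. The following minimal sketch uses the Python standard library `ssl` module; the function names and file path parameters are hypothetical, and Simpl-Open agents configure mTLS in their own components rather than through this code.

```python
import ssl
from typing import Optional


def build_server_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Server-side context that requires a client certificate (mutual TLS)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail when the client presents no
    # certificate, or one that does not chain to a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


def build_client_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Client-side context that verifies the server and presents its own certificate."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname checks by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

In a dataspace setting, the certificate loaded on each side would be the participant's Tier 2 credential and the trusted CA would be the one embedded in the Identity Provider Federation.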

Guaranteed Authenticity / Integrity

This capability supports the measures in place to ensure end-to-end data integrity, so that Simpl-Open agents can validate the authenticity of the delivered information.

It is achieved by implementing mTLS communication between agents, ensuring that communication can be established only between participants that are known to and trusted by the Authority.
During the onboarding process, the Governance Authority creates unique Identity Credentials for each participant of the Dataspace. The participant then uses its credential during mTLS communication.

Components

This section lists all components, divided into Frontend (FE) and Backend (BE).

Id Component	Component	Participant	Endpoints published on tier1-gateway	Endpoints published on tier2-gateway	Configuration URL
IAA-IDPRO-FE	Identity provider FE	Governance Authority	YES	NO
IAA-IDPRO-BE	Identity provider BE	Governance Authority	YES	YES
IAA-SAP-FE	Security Attribute Provider FE	Governance Authority	YES	NO
IAA-SAP-BE	Security Attribute Provider BE	Governance Authority	YES	YES
IAA-ONB-FE	Onboarding FE	Governance Authority	YES	NO
IAA-ONB-BE	Onboarding BE	Governance Authority	YES	NO
IAA-U&R-FE	User & Roles FE	All Participant	YES	NO
IAA-U&R-BE	User & Roles BE	All Participant	YES	NO
IAA-AUTH-FE	Authentication Provider FE	All Participant	YES	NO
IAA-AUTH-BE	Authentication Provider BE	All Participant	YES	YES
	xsfc-advsearch-be	Providers, Consumers	YES	NO
	simpl-edc	Providers, Consumers	NO	YES
	sd-creator-backend	Providers	YES	NO
	xsfc-catalogue	Governance Authority	NO	YES
	catalogue-query-mapper	Governance Authority	NO	YES
	Infra. Deployment Script Management FE	Providers	YES	NO
	Infra. Deployment Script Management BE	Providers	YES	NO	https://code.europa.eu/simpl/simpl-open/development/infrastructure/infrastructure-be/-/tree/develop?ref_type=heads#configure-tier1-and-tier2-business-logs
	schema-sync-adapter	Providers, Consumers	NO	YES
	asset-orchestrator	Providers	YES	NO	https://code.europa.eu/simpl/simpl-open/development/orchestration-platform/asset-orchestrator/-/blob/feature/add-tags-to-workflow-list/README.md?ref_type=heads#tier1-configuration
	dagster	Providers	YES	NO	https://code.europa.eu/simpl/simpl-open/development/orchestration-platform/dagster/-/blob/feature/oauth2-proxy-integration/README.md?ref_type=heads#tier1-gateway-configuration-participant

6.4.2. Self-Descriptions

Metadata is expressed as self-descriptions, which are described in this section.

The sub-section Self-Description Tooling introduces the tools used to create self-descriptions and visualises the flow of steps to be considered. The SD Schema Creator enables customised schemas for each Data Space. Schema Definition Properties lists the proposed attributes that any Simpl Data Space should use. Syntax and schema validation is covered in SD Tooling Syntax Validation & Schema Validation.

The structure of Self-Descriptions should be based on the Gaia-X Trustframework. Gaia-X powered Data Spaces that provide such an SD already exist. This way, a created SD can easily be reused and extended with the specific requirements of each sectoral Data Space.

Base Entities and their relationships according to the Gaia-X Trustframework

Note

Attributes marked in red color are planned, but not yet implemented.

Data Offering:

Simpl Attribute	Entity	Attribute	Cardinality	Mandatory / Recommended	Data Type	Constraint	Comment
Unique identifier	service-offering	id	1	Mandatory	xsd:string		The id of the ServiceOffering, usually referring to a DID. Set automatically.
Name	service-offering	name	1	Mandatory	xsd:string	sh:maxLength 255	A human readable name of the service offering
Description	service-offering	description	1	Mandatory	xsd:string	sh:maxLength 1000	a short description of the service offering
Location of the dataset (e.g. URL, handle)	service-offering	serviceAccessPoint	1	Mandatory	xsd:anyURI	sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)"	a list of Service Access Point which can be an endpoint as a mean to access and interact with the resource
Keywords	service-offering	keywords	0..16	Recommended	xsd:string	sh:maxLength 50	list of keywords
Language (of the metadata, like the title, description)	service-offering	inLanguage	1	Mandatory	xsd:string	sh:languageIn ("bg" "hr" "cs" "da" "nl" "en" "et" "fi" "fr" "de" "el" "hu" "ga" "it" "lv" "lt" "mt" "pl" "pt" "ro" "sk" "sl" "es" "sv")	The language of the content or performance or used in an action. Please use one of the language codes from the IETF BCP 47 standard . See also availableLanguage .
Version					xsd:string		The version of the self-description. Technical property, set automatically.
Creation date					xsd:dateTimeStamp		The first onboarding date. Technical property, set automatically.
Last update date					xsd:dateTimeStamp		The last update date. Technical property, set automatically.
SD Schema					xsd:string		Reference to the used Schema ID (and version). Technical property, set automatically.
Data Provider	provider-information	providedBy	1	Mandatory	xsd:string	sh:maxLength 255	Reference to Participant SD. To be Set automatically.
Contact point (who to contact in case of questions/issues)	provider-information	contact	1	Mandatory	xsd:string	sh:pattern "^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$"	Email address of the contact point
License	offering-price	license	1..n	Mandatory	xsd:anyURI	sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)" sh:maxLength 255	A list of SPDX identifiers or URL to document
Price Type		priceType		Recommended	xsd:string	sh:in("free" "commercial")	Link to price in the future.
Price (free, under cost)	offering-price	price	1	Mandatory	xsd:decimal	sh:minInclusive 0
Currency		currency		Mandatory	xsd:string	sh:in("BGN" "EUR" "CZK" "DKK" "HUF" "PLN" "RON" "SEK" )
Access policy (to define who can access the dataset)	service-policy	access-policy	0..n	Recommended	xsd:string	sh:pattern "[:,\{\}\[\]]\|(\".?\")\|('.?')\|[-\w.]+"	a list of policy expressed using a DSL (e.g., Rego or ODRL) (access control, throttling, usage, retention, …)
Usage policy (to define how a dataset can be used)	service-policy	usage-policy	0..n	Recommended	xsd:string	sh:pattern "[:,\{\}\[\]]\|(\".?\")\|('.?')\|[-\w.]+"	a list of policy expressed using a DSL (e.g., Rego or ODRL) (access control, throttling, usage, retention, …)
Compliance: Indicates compliance with relevant data protection regulations and standards.	service-policy	dataProtectionRegime	0..n	Recommended	xsd:string	sh:pattern "[:,\{\}\[\]]\|(\".?\")\|('.?')\|[-\w.]+"
Provenance	dataset-properties	producedBy	1	Recommended	xsd:anyURI	sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)"	a resolvable link to the participant self-description legally enabling the data usage
Format under which the data is distributed (e.g. csv, xml, …)	dataset-properties	format	1	Mandatory	xsd:string
Schema of the dataset, depends on the type of data for JSON it would be JSON Schema Description that states what fields the data has and the types.	dataset-properties	openAPI	0..n	Recommended	xsd:anyURI	sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)"	URL of the OpenAPI documentation
Additional Information about the dataset		additionalInfo			xsd:anyURI	sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)"
Related datasets	dataset-properties	relatedDatasets	0..n	Recommended	xsd:string
Target users	dataset-properties	targetUsers	0..n	Recommended	xsd:string
Data Quality (to include metrics such as completeness, accuracy, timeliness and other)	dataset-properties	dataQuality	0..n	Recommended	xsd:string
Encryption: Describes the encryption algorithms and keys used to secure the data.	dataset-properties	encryption	0..1	Recommended	xsd:string
Anonymisation/pseudonymisation: Indicates whether sensitive information has been anonymised or pseudonymised to protect privacy.	dataset-properties	anonymization	0..1	Recommended	xsd:string
Contract template	contract-template	contractTemplate	1..n		xsd:string	sh:in ( "Contract Template 1" "Contract Template 2" "Contract Template 3" )	Referring to an SD of a Contract Template

Infrastructure Offering:

Simpl Attribute	Entity	Attribute	Cardinality	Mandatory / Recommended	Data Type	Constraint
Resource Type	infrastructure-properties		1	Mandatory	xsd:string	sh:in("vm" "container" "block_storage" "object_storage" "relational_db" "document_db")
Region and availability zone	infrastructure-properties		1..n	Mandatory	xsd:string	sh:in("eu-west-1" "eu-west-2" "eu-west-3" "eu-central-1" "eu-north-1" "eu-south-1" "eu-south-2")
Size and capacity	infrastructure-properties		0..1	Recommended	xsd:string	sh:pattern "\d+(\.\d+)?\s?(B\|KB\|MB\|GB\|TB\|PB\|EB\|ZB\|YB)"
Operating system and image	infrastructure-properties		0..1	Mandatory	xsd:string
Network configuration	infrastructure-properties		0..1	Recommended	xsd:string
Security settings (access control, security groups/firewalls, encryption)	infrastructure-properties		0..1	Mandatory	xsd:string
Instance type	infrastructure-properties		0..1	Mandatory	xsd:string
Storage type	infrastructure-properties		0..1	Mandatory	xsd:string
Backup and redundancy	infrastructure-properties		0..1	Recommended	xsd:string	sh:in("full-backup" "incremental-backup" "differential-backup")
Scalability options	infrastructure-properties		0..1	Recommended	xsd:string	sh:in("dynamic-scaling" "scheduled-scaling" "sharding")
Monitoring and logging	infrastructure-properties		0..1	Recommended	xsd:string
Tags and metadata	infrastructure-properties	keywords	0..16	Recommended	xsd:string	sh:maxLength 50
External Url	infrastructure-properties		1	Mandatory	xsd:string	sh:maxLength 255
Deployment script ID	infrastructure-properties		0..1	Mandatory	xsd:string

termsAndCondition structure (defined by Gaia-X Trustframework)

Attribute	Cardinality	DataType	Comment
URL	1	xsd:string	a resolvable link to document
hash	1	xsd:string	SHA256 of the above document
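
A termsAndCondition entry can be produced by hashing the referenced document, as in the short sketch below. The function name is hypothetical, and the hexadecimal encoding of the digest is an assumption, since the table only states that the value is the SHA256 of the document.

```python
import hashlib


def terms_and_condition_entry(url: str, document: bytes) -> dict:
    """Build a termsAndCondition structure for a document and its URL."""
    return {
        "URL": url,
        # Assumption: the digest is serialised as a lowercase hex string.
        "hash": hashlib.sha256(document).hexdigest(),
    }
```

A consumer can recompute the hash over the document retrieved from the URL and compare it with the published value to detect changes.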

dataAccountExport structure (defined by Gaia-X Trustframework)

The purpose is to enable the participant ordering the service to assess the feasibility to export its personal and non-personal data out of the service.
This export shall cover account data e.g., account holder’s billing information, information on the PII held - but also data provided previously to the service by the user.

Attribute	Cardinality	DataType	Comment
requestType	1	xsd:string	the means to request data retrieval: API, email, webform, unregisteredLetter, registeredLetter, supportCenter
accessType	1	xsd:string	type of data support: digital, physical
formatType	1	xsd:string	type of Media Types (formerly known as MIME types) as defined by the IANA.

2.28.1. Quality Rules

Currently, only mandatory quality rules are supported. A Quality Score can only be calculated for recommended quality rules, so score calculation is not supported either.

Mandatory quality rules are always enforced during the creation of a Self-Description (SD) for an offering, to ensure the data quality of the SD. A resource provider cannot publish an SD that does not comply with the mandatory quality rules.

Quality Rule Formalisation

Quality rules are defined in the schema of the Self-Description. Self-Descriptions are semantic RDF graphs, and the rules express data types, constraints and conditions on those graphs. SHACL (Shapes Constraint Language) constraints are therefore intended to be used as the formal notation for quality rules.

The quality rules that can be defined for an SD property can be based on the data type and/or on a SHACL constraint. Examples of constraints:

Minimum or maximum length of a string value
Value Ranges for Numbers
Non-Negative Numbers
Regular Expressions (Patterns)
List of allowed values
Constraints based on other properties

Data Model (initial)

To define the quality rules, there are three basic entities:

Quality Rule: The (mandatory) quality rule. It is uniquely identified by an ID and contains a plain-text description of the rule;
Rule Template: The template for the formal definition of the rule. Besides the ID, it contains a field holding a parameterised SHACL template. The number of parameters and their types are defined in a parameter_schema, i.e. a JSON blob listing the parameters and their data types;
Quality Dimension: The quality dimension used to group quality rules, for instance FAIR.

Each Quality Rule is associated with exactly one Quality Dimension and one Rule Template. The association template_assoc also contains the concrete parameterisation of the rule template.
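
The relationship between a Rule Template, its parameter_schema and the concrete parameterisation can be sketched as follows. This is a minimal illustration, not the actual data model: the template text, the parameter names path and max_length, the type names in the schema and the function render_rule are all assumptions, and Python's string.Template stands in for whatever templating mechanism is actually used.

```python
import json
from string import Template

# Hypothetical rule template: a SHACL property constraint with two parameters.
SHACL_TEMPLATE = Template("[ sh:path $path ; sh:maxLength $max_length ]")

# Hypothetical parameter_schema: a JSON blob mapping parameter names to types.
PARAMETER_SCHEMA = json.loads('{"path": "string", "max_length": "integer"}')

# Mapping from the (assumed) schema type names to Python types.
TYPES = {"string": str, "integer": int}


def render_rule(template: Template, schema: dict, params: dict) -> str:
    """Check the parameterisation against the schema, then fill the template."""
    if set(params) != set(schema):
        raise ValueError("parameters do not match the parameter_schema")
    for name, type_name in schema.items():
        if not isinstance(params[name], TYPES[type_name]):
            raise TypeError(f"parameter '{name}' must be of type {type_name}")
    return template.substitute(params)


# Concrete parameterisation, as it might be stored alongside template_assoc.
print(render_rule(SHACL_TEMPLATE, PARAMETER_SCHEMA,
                  {"path": "ex:keywords", "max_length": 50}))
# [ sh:path ex:keywords ; sh:maxLength 50 ]
```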

Score Calculation

The score is calculated by dimension.

\delta_{r,s} = \begin{cases}
1 & \text{if the quality rule } r \text{ is fulfilled for the self-description } s \\
0 & \text{otherwise}
\end{cases}

\text{score}(s, d) = \frac{\sum_{r \in R_d} \delta_{r,s} \cdot w_r}{\sum_{r \in R_d} w_r} \cdot 100

s \text{ is valid} \Leftrightarrow \forall d \in D: \text{score}(s, d) \geq \text{min\_score}_d

The indicator \delta_{r,s} (in the style of a Kronecker delta) is 1 if the quality rule r is fulfilled for the self-description s, and 0 otherwise.

The score for a self-description s and quality dimension d is the sum of the weights w_r of all fulfilled quality rules r of dimension d (the set R_d), normalised by the sum of all weights for the dimension. Because a value between 0 and 100 is desired instead of one between 0 and 1, the result is multiplied by 100.

A self-description is valid if and only if, for every quality dimension, the score is greater than or equal to the specified threshold (min_score).
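
The score and validity definitions can be followed with a small worked example. The Python sketch below is an illustration under assumed inputs: the rule identifiers, weights and threshold are invented, and the function names score and is_valid are hypothetical.

```python
def score(fulfilled: dict, weights: dict) -> float:
    """Score of one dimension: weighted share of fulfilled rules, scaled to 0-100.

    fulfilled maps rule id -> bool (the indicator delta), weights maps rule id -> w_r.
    """
    total = sum(weights.values())
    achieved = sum(weights[rule] for rule, ok in fulfilled.items() if ok)
    return achieved / total * 100


def is_valid(scores: dict, min_scores: dict) -> bool:
    """Valid if every dimension reaches its min_score threshold."""
    return all(scores[d] >= min_scores[d] for d in min_scores)


# Hypothetical dimension "FAIR" with three rules; rule r2 is not fulfilled.
weights = {"r1": 2, "r2": 1, "r3": 1}
fulfilled = {"r1": True, "r2": False, "r3": True}

fair_score = score(fulfilled, weights)
print(fair_score)                                  # 75.0
print(is_valid({"FAIR": fair_score}, {"FAIR": 70}))  # True
print(is_valid({"FAIR": fair_score}, {"FAIR": 80}))  # False
```

Here the fulfilled weights sum to 3 out of a total of 4, giving 3 / 4 * 100 = 75.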

Calculation Process

The calculation of the score (and the validation of the rules) is performed by the query mapper during the publication of the self-description s. First, all active quality rules are retrieved from the database, together with the associated SHACL templates. Each rule is then validated against the self-description s, and the results are added to the quality report for s. After all rules have been processed, the score is calculated for each quality dimension d.

Next, it is checked whether all mandatory rules are fulfilled and whether the score for each dimension reaches the defined threshold. If this is not the case, the publication is aborted and the quality report is returned to the provider. Otherwise, the publication continues and the quality report is likewise returned to the provider.
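
The process described above can be sketched as a single function. This is an illustrative outline only: the Rule class, the function evaluate_publication and the example rules are hypothetical, a plain Python predicate stands in for the SHACL validation of a rule, and treating a dimension without rules as fully satisfied is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    rule_id: str
    dimension: str
    weight: float
    mandatory: bool
    check: Callable[[dict], bool]  # stand-in for SHACL validation of one rule


def evaluate_publication(sd: dict, rules: list, min_scores: dict):
    """Return (publish, quality_report) for a self-description."""
    # Step 1: validate every active rule and record the result.
    results = {rule.rule_id: rule.check(sd) for rule in rules}

    # Step 2: compute the score per quality dimension.
    scores = {}
    for dimension in min_scores:
        in_dim = [r for r in rules if r.dimension == dimension]
        total = sum(r.weight for r in in_dim)
        achieved = sum(r.weight for r in in_dim if results[r.rule_id])
        # Assumption: a dimension without rules counts as fully satisfied.
        scores[dimension] = achieved / total * 100 if total else 100.0

    # Step 3: decide whether publication may continue.
    mandatory_ok = all(results[r.rule_id] for r in rules if r.mandatory)
    thresholds_ok = all(scores[d] >= min_scores[d] for d in min_scores)
    report = {"rules": results, "scores": scores}
    return mandatory_ok and thresholds_ok, report


rules = [
    Rule("has-title", "FAIR", 1, True, lambda sd: "title" in sd),
    Rule("has-keywords", "FAIR", 1, False, lambda sd: "keywords" in sd),
]
publish, report = evaluate_publication({"title": "Demo"}, rules, {"FAIR": 50})
print(publish, report["scores"])  # True {'FAIR': 50.0}
```

In both outcomes the quality report is returned, so the provider can see which rules failed even when the publication is aborted.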

2.28.2. Self-Description Tooling

The self-description tooling consists of four components, each in its own repository:

SD Schema Creator: SD Schemas
This component creates the schemas that describe the form and content of the Self-Description. It is used by the Governance Authority to set the standard for the Self-Description. Technically, the schemas are defined by a set of configuration files in the form of YAML documents. These files are verified and transformed into an ontology and SHACL constraints, which the other components use to create the wizards. The component is written in Python, and at least the YAML configuration needs to be adjusted for Simpl-Open.
SD Creation Wizard API: SD Creation Wizard API
The main API project. It transforms the SHACL shapes from the SD Schema Creator into JSON forms that the frontend uses to let the provider write new Self-Descriptions.
SD Creation Wizard Frontend: SD Creation Wizard Frontend
Frontend with the forms for the provider to create Self-Descriptions. It is written in Angular and Node.js. The result is an SD in the form of a JSON-LD document that can be uploaded to the catalogue.
SD Validation API: SD Validation API
Validates the Self-Description against SHACL files. It might be used for the Quality Rule Validation. It is written in Java.

SD Schema Creator

Background

In the context of DATA/APP, Self-Descriptions are documents that describe a service offering (Data, Application, or Infrastructure). The schema of the Self-Description defines its format, i.e. which fields the Self-Description has, their data types, and whether they are mandatory.

Component Self-Description Schema Creator:

The Schema-Framework is a component that generates the Self-Description schemas from configuration files. The idea is that the schemas are generated from a simple configuration and later used by the provider to write the Self-Description.

It should include syntactic and semantic validation of the schema files.

The implementation is based on the Gaia-X Context sd-schemas repository.

Context View

The main actor in the SD Schema Creator is the Data Governance Authority. It can configure the schemas by changing the YAML files that define what the schemas for the different services should look like.

Component View

The input of the system is the SD Schema Configuration, a file that uses the LinkML data model and is serialised as a YAML document. When the configuration is changed, a process is triggered that first checks the syntax and the semantics. After validation, the configuration files are transformed into two files that describe the semantics. One is an ontology, i.e. a formal representation of knowledge that is used as the vocabulary for the SD Tool. The other contains constraints in the form of SHACL shapes, which are used as templates to build the forms in the SD Tool. Both files are serialised as Turtle files.

Syntax Validation

There is currently no standard for schema validation of YAML files. Therefore, the SD Schema Description is transformed into a JSON serialisation, and a JSON Schema description is used for the syntax validation. This JSON Schema is written by the Data Governance Authority.
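
The idea of validating the JSON serialisation against a schema can be sketched with the standard library alone. The checker below is a deliberately reduced stand-in for a JSON Schema validator: it understands only the type, required and properties keywords, and the schema, the key names name and attributes, and the function check are assumptions for illustration. A real deployment would presumably rely on a complete JSON Schema implementation.

```python
import json

# Only the JSON types needed for this illustration.
JSON_TYPES = {"string": str, "object": dict, "array": list}


def check(instance, schema: dict, path: str = "$") -> list:
    """Return a list of error messages; an empty list means the check passed."""
    expected = schema.get("type")
    if expected and not isinstance(instance, JSON_TYPES[expected]):
        return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required key '{key}'")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(check(instance[key], sub_schema, f"{path}.{key}"))
    return errors


# Hypothetical schema for one entry of the SD Schema Configuration.
SCHEMA = {
    "type": "object",
    "required": ["name", "attributes"],
    "properties": {"name": {"type": "string"}, "attributes": {"type": "array"}},
}

# JSON serialisations of a well-formed and a malformed configuration.
good = json.loads('{"name": "DataOffering", "attributes": []}')
bad = json.loads('{"name": 42}')

print(check(good, SCHEMA))  # []
print(check(bad, SCHEMA))
# ["$: missing required key 'attributes'", '$.name: expected string']
```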

Semantic Validation

The Semantic Validation uses a Python script which reads some configuration and guidelines (for instance which fields are mandatory in the schema).

Runtime View

For the current release, the system is simply deployed as a GitLab repository. A GitLab CI pipeline starts whenever the Data Governance Authority changes the configuration, and generates new files. When the Data Provider starts the SD Tool, the SD Tool pulls the current SHACL constraints and ontology from the repository.

SD Tooling Syntax Validation & Schema Validation

Vocabulary, Schema and Self-Descriptions

The vocabulary is a formal description of an ontology, representing knowledge and relationships between the terminologies, containing inference and integrity rules for reasoning.

A schema describes a data object with constraints on the content, structure and meaning of a graph. These conditions may constrain the number of values that a property may have, the type of values, numeric ranges, string matching patterns or logical combinations of constraints.

The Self-Description is an instance of a schema object, meaning that values are assigned to the properties.

Syntax Validation

Syntax Validation during the process of creating the SD comprises the following:

Formatting: Check whether the file is malformed (e.g. missing brackets);
Data Types: Check that values are applied correctly according to their data types. The allowed data types are listed in dataTypeAbbreviation.yaml.

The syntax validation for data types in the SD Frontend is based on the schema definition, which is the single source of truth.

The syntax validation on the provider node is based on the schemas imposed by Simpl and is intended to guide the user towards an error-free Self-Description.

The syntax validation on the Governance Authority node ensures that only valid Self-Descriptions are published to the catalogue.

Allowed Data Types

xsd:string: ‘http://www.w3.org/2001/XMLSchema#string’
xsd:boolean: ‘http://www.w3.org/2001/XMLSchema#boolean’
xsd:decimal: ‘http://www.w3.org/2001/XMLSchema#decimal’
xsd:float: ‘http://www.w3.org/2001/XMLSchema#float’
xsd:double: ‘http://www.w3.org/2001/XMLSchema#double’
xsd:duration: ‘http://www.w3.org/2001/XMLSchema#duration’
xsd:dateTime: ‘http://www.w3.org/2001/XMLSchema#dateTime’
xsd:time: ‘http://www.w3.org/2001/XMLSchema#time’
xsd:date: ‘http://www.w3.org/2001/XMLSchema#date’
xsd:gYearMonth: ‘http://www.w3.org/2001/XMLSchema#gYearMonth’
xsd:Day: ‘http://www.w3.org/2001/XMLSchema#Day’
xsd:hexBinary: ‘http://www.w3.org/2001/XMLSchema#hexBinary’
xsd:base64Binary: ‘http://www.w3.org/2001/XMLSchema#base64Binary’
xsd:anyURI: ‘http://www.w3.org/2001/XMLSchema#anyURI’
xsd:QName: ‘http://www.w3.org/2001/XMLSchema#QName’
xsd:NOTATION: ‘http://www.w3.org/2001/XMLSchema#NOTATION’
xsd:dateTimeStamp: ‘http://www.w3.org/2001/XMLSchema#dateTimeStamp’
xsd:enum: ‘http://www.w3.org/2001/XMLSchema#enum’
xsd:integer: ‘http://www.w3.org/2001/XMLSchema#integer’
xsd:address: ‘http://www.w3.org/2001/XMLSchema#address’
xsd:nonNegativeNumber: ‘http://www.w3.org/2001/XMLSchema#nonNegativeNumber’
did:example: ‘https://www.w3.org/TR/did-core/#example’
dct:location: ‘http://dublincore.org/usage/terms/history/#Location-001’
trusted-cloud:meaningfulString: ‘class-placeholder-from-dataTypeAbbreviation.yaml’

Semantic Validation

Semantic Validation during the process of creating the SD comprises:

the verification of property patterns;
data ranges;
other constraints;
the cardinality of the properties;
the ontology/vocabulary compliance.

Examples:

Value Ranges;
Length;
Pattern;
Value Comparison;
Memberships;
Logical.

Constraints can be defined according to the Shapes Constraint Language (SHACL).

6.4.3. Policies

For this context, both access and usage policies for resources (Data, Application, or Infrastructure) are defined.

The following definitions from the Data Spaces Support Centre (DSSC) are used:

Access Rules/Policy: define whether access to a resource is allowed or not.
Usage Rules/Policy: define how a resource may or may not be used.

Access control policies control the authorisation to access specific data while the data rights owner retains direct control over the data. Usage policies, including consent, regulate the permissible actions and behaviours related to the utilisation of the accessed data, which means keeping control of data even after the items have left the trust boundaries of the data owner. Policies can only be enforced when technically feasible; otherwise, only legal enforcement is possible.

https://dssc.eu/space/BVE/357075567/Access+%26+Usage+Policies+Enforcement#Data-Space-Registry

Following this definition, the access policies are checked before the provider hands (at least partial) control over to the consumer. The usage policies describe the permitted behaviour after the consumer has gained access to the resource (Data, Application or Infrastructure).

Policy Language

A formal and machine-readable way to express and enforce the policies is needed. The Open Digital Rights Language (ODRL) is intended to be used to write both access and usage policies: https://www.w3.org/TR/odrl-model/

The key components of an ODRL policy are:

Asset : The digital content or service to which the policy applies;
Permissions : Actions that are allowed with respect to the asset (e.g., read, download);
Prohibitions : Actions that are explicitly forbidden;
Constraints : Conditions or limitations that must be met for the permissions to apply (e.g., time restrictions);
Duties : Obligations that must be fulfilled by the user in order to exercise a permission (e.g., attribution, payment).

Different ways exist to serialise the ODRL expressions, and JSON-LD is intended to be used for this part.

{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/123",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "permission": [
    {
      "target": "http://example.com/asset/image123",
      "action": "http://www.w3.org/ns/odrl/2/distribute",
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/purpose",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "http://www.example.com/vocab#nonCommercial"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/payAmount",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "0"
        }
      ],
      "duty": [
        {
          "action": "http://www.w3.org/ns/odrl/2/attribute"
        }
      ]
    }
  ],
  "prohibition": [
    {
      "target": "http://example.com/asset/image123",
      "action": "http://www.w3.org/ns/odrl/2/modify"
    }
  ]
}

Access Policy

Here is an example of an access policy for a dataset provided by a data provider. The policy will specify who can access the data, under which conditions, and for how long.

Scenario:

The dataset contains research data that can be accessed by different roles:
- Researchers: Full access to the data for analysis;
- Students: Limited access to anonymised data for study purposes;
- External partners: Access to aggregated data for collaboration purposes.
The access is granted for a specific period;
The access is granted only for the geographic location of the EU.

Different datasets for the full data, anonymised data and aggregated data are used.

{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/123",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/research123",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "SECURITY_ATTRIBUTE",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      },
      "action": [
        { "name": "http://www.w3.org/ns/odrl/2/read" }
      ],
      "target": "http://example.com/dataset/research123/aggregated",
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/lteq",
          "rightOperand": "2024-12-31T23:59:59"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/gteq",
          "rightOperand": "2024-01-01T00:00:00"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/spatial",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "http://www.geonames.org/external-partner-location"
        }
      ]
    }
  ]
}

Minimal Access Policy

For the current release, access policies with limited expressive power are planned to be supported. Two different actions can be defined:

http://www.w3.org/ns/odrl/2/read: the attribute holder is able to search for the dataset/application/infrastructure;
http://www.w3.org/ns/odrl/2/use: the attribute holder can consume the dataset/application/infrastructure.

Note that use implies read.

Date-time constraints are planned to be supported for specifying when the policy is valid.

The placeholders {RESSOURCE_URI}, {POLICY_URI} and {PROVIDER_URI} are later replaced automatically with the correct URIs. {SECURITY_ATTRIBUTE_URI} needs to be specified by the provider, as does the action (read for searching, or use for consumption, which implies read); documentation listing the available URIs is provided.

{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "{POLICY_URI}",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "{RESSOURCE_URI}",
  "assigner": {
    "uid": "{PROVIDER_URI}",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "{SECURITY_ATTRIBUTE_URI}",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      },
      "action": [
        { "name": "http://www.w3.org/ns/odrl/2/{read/use}" }
      ],
      "target": "{RESSOURCE_URI}",
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/lteq",
          "rightOperand": "2024-12-31T23:59:59"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/gteq",
          "rightOperand": "2024-01-01T00:00:00"
        }
      ]
    }
  ]
}
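
How a minimal access policy of this shape could be evaluated is sketched below in Python. It is an illustration under assumptions, not the planned implementation: the function is_permitted, the example URIs and the filled-in policy are hypothetical, only dateTime constraints are handled, and the operator spellings leq/geq are tolerated as aliases of lteq/gteq.

```python
from datetime import datetime

# "use implies read": the set of actions each granted action covers.
IMPLIES = {"read": {"read"}, "use": {"use", "read"}}


def _local(iri: str) -> str:
    """Last path segment of an IRI, e.g. '.../odrl/2/use' -> 'use'."""
    return iri.strip().rsplit("/", 1)[-1]


def _holds(constraint: dict, at: datetime) -> bool:
    if _local(constraint["leftOperand"]) != "dateTime":
        raise ValueError("only dateTime constraints are handled in this sketch")
    bound = datetime.fromisoformat(constraint["rightOperand"])
    operator = _local(constraint["operator"])
    if operator in ("lteq", "leq"):
        return at <= bound
    if operator in ("gteq", "geq"):
        return at >= bound
    raise ValueError(f"unsupported operator: {operator}")


def is_permitted(policy: dict, attribute_uri: str, action: str, at: datetime) -> bool:
    """True if a holder of the attribute may perform the action at the given time."""
    for permission in policy["permission"]:
        if permission["assignee"]["uid"] != attribute_uri:
            continue
        granted = set()
        for entry in permission["action"]:
            granted |= IMPLIES.get(_local(entry["name"]), set())
        constraints = permission.get("constraint", [])
        if action in granted and all(_holds(c, at) for c in constraints):
            return True
    return False


ODRL = "http://www.w3.org/ns/odrl/2/"
RESEARCHER = "http://example.com/attribute/researcher"  # hypothetical attribute URI
policy = {
    "permission": [{
        "assignee": {"uid": RESEARCHER, "role": ODRL + "assignee"},
        "action": [{"name": ODRL + "use"}],
        "constraint": [
            {"leftOperand": ODRL + "dateTime", "operator": ODRL + "lteq",
             "rightOperand": "2024-12-31T23:59:59"},
            {"leftOperand": ODRL + "dateTime", "operator": ODRL + "gteq",
             "rightOperand": "2024-01-01T00:00:00"},
        ],
    }]
}

print(is_permitted(policy, RESEARCHER, "read", datetime(2024, 6, 1)))  # True
print(is_permitted(policy, RESEARCHER, "use", datetime(2025, 1, 1)))   # False
```

The first call succeeds because the granted action use covers read; the second fails because the date lies outside the validity window.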

API to get all available attributes with a description of their semantics. When will this be available, and can a static list be provided so that development can start?
1. Prioritised before mTLS;
2. Availability not clear;
3. Provide a static list.
API to get the attributes of the searching consumer, used for filtering the results of the catalogue search:
1. via the public key;
2. from the JWT; the attributes are in the payload.
How to get the Provider ID when creating a Self-Description? Is it possible to get the ID of the provider from an API in order to add this information to the Self-Description?
1. The unique ID of the agent is the public key, from the vault (HashiCorp/OCM) or the public endpoint;
2. In the long run, the Self-Description of the participants.
Map Policy to ABAC (who is doing it?):
1. ABAC only for the first layer;
2. Second layer with policy evaluation in EDC.

Usage Policy

The IDS Usage Control Language is based on ODRL: https://international-data-spaces-association.github.io/DataspaceConnector/Documentation/v5/UsageControl

The Usage Policy is part of the usage contract, as well as the Self-Description. It contains permissions, prohibitions and obligations.

Usage Policy Examples:

Allow the Usage of the Data

{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/UsagePolicy001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/TestData001",
  "action": "http://www.w3.org/ns/odrl/2/use",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/dataConsumer001",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      }
    }
  ]
}

Use Data and Delete it After

{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/UsagePolicy001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/TestData001",
  "action": "http://www.w3.org/ns/odrl/2/use",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/consumer001",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      },
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/deletion",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "after_use"
        }
      ]
    }
  ]
}

Restricted Number of Usages

{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/UsagePolicy001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/TestData001",
  "action": "http://www.w3.org/ns/odrl/2/use",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/consumer001",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      },
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/count",
          "operator": "http://www.w3.org/ns/odrl/2/lteq",
          "rightOperand": "10"
        }
      ]
    }
  ]
}
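
To make the meaning of the count constraint concrete, the following Python sketch tracks usages against the limit. It is purely illustrative: the class UsageCounter and its method names are hypothetical, the interpretation that the count includes the current usage is an assumption, and how such a constraint would actually be enforced is not defined here.

```python
class UsageCounter:
    """Tracks how often a resource has been used under a count constraint."""

    def __init__(self, constraint: dict):
        # rightOperand of the count constraint, e.g. "10" in the example above.
        self.max_uses = int(constraint["rightOperand"])
        self.used = 0

    def try_use(self) -> bool:
        """Record one usage if the count stays within the limit (count <= max)."""
        if self.used + 1 > self.max_uses:
            return False
        self.used += 1
        return True


counter = UsageCounter({
    "leftOperand": "http://www.w3.org/ns/odrl/2/count",
    "operator": "http://www.w3.org/ns/odrl/2/lteq",
    "rightOperand": "10",
})
allowed = [counter.try_use() for _ in range(11)]
print(allowed.count(True), allowed[-1])  # 10 False
```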

Duration-restricted Data Usage

{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/UsagePolicy001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/TestData001",
  "action": "http://www.w3.org/ns/odrl/2/use",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/consumer001",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      },
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/lteq",
          "rightOperand": "2024-12-31T23:59:59"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/gteq",
          "rightOperand": "2024-01-01T00:00:00"
        }
      ]
    }
  ]
}

Extended Scenario

The following is an example of an extended usage policy for a dataset provided by a data provider. The policy specifies how a resource can be used once access has been granted.

A dataset contains sensitive health research data. The data provider wants to ensure that this data is used responsibly and in compliance with specific guidelines. The usage policy specifies the following:

The data can only be used for academic research purposes;
The data cannot be shared with third parties;
The data must be deleted after the research project is completed;
The data usage is monitored, and any breach of the policy will result in revocation of access.

{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/health_research123",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/researcher",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      },
      "action": [
        { "name": "http://www.w3.org/ns/odrl/2/use" }
      ],
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/purpose",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "http://example.com/purpose/academic_research"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/deletion",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "after_use"
        }
      ]
    }
  ]
}

Policy Enforcement

This section presents draft content for a capability that falls outside the scope of the current release; it will be completed at a later time.

6.5. Federated Catalogue

Simpl uses the XFSC Federated Catalogue as the catalogue for Data, Apps and Infrastructure (see the architecture document of the XFSC Federated Catalogue).

The Federated Catalogue is not a monolithic application. It consists of multiple components, to reuse existing technology and to allow scaling. Those components can be deployed individually (see section Deployment View).

The components are:

Name	Responsibility
Catalogue	Main component, implementing the core catalogue functionality.
Authentication	External component implementing the authentication flow and user management.
Graph-DB	Graph database, holding all claims contained in active Self-Descriptions. The Graph database is responsible for executing semantic search queries.
File Store	The File store is a blob storage. It holds the Self-Description files and the files for the Schemas. This includes historical versions of the Self-Descriptions and Schemas.
Metadata Store	Store for metadata on the Self-Descriptions and Schemas stored in the File Store.

The architecture of the core component is described in the next sections.

2.30.1. Authentication

The authentication component is responsible for authenticating users. This is not a central component of the catalogue, as it will be implemented by Lot 1 “Authentication & Authorisation” of the GXFS-DE project. For the catalogue implementation, a mock integration is shown, using common, off-the-shelf software that implements the OpenID Connect standard [14].

The responsibilities of the authentication components are:

Storage of Users;
Storage of user roles for a Participant.

A user belongs to only one Participant, on whose behalf he or she acts (see specification section 2.4 for more details).

For the implementation, Keycloak will be used. It is widely used and is also part of the implementation of other lots, which simplifies the integration of the different lots. The user will get a JSON Web Token (JWT [15]) with user claims and authorities, which is used to authenticate requests to the catalogue REST API.

An alternative implementation would be Lissi [16]. It is not considered further, as it is not as mature as Keycloak.

2.30.2. Graph database

The graph database holds the claims of verified, active Self-Descriptions. Claims of Self-Descriptions that fail the verification are not added to the graph database. Claims of Deprecated, Expired or Revoked Self-Descriptions will be deleted from the Graph database.

The Graph Database can be considered as a kind of search index. The single source of truth is the active Self-Descriptions, stored in the File Store. This means at any point in time the Graph database can be rebuilt from scratch by reimporting the claims of the Self-Descriptions. This allows the following:

Backup : An explicit backup of the Graph database is not needed. Backing up the Self-Description files (located in the File Storage) and the metadata (located in the Metadata Store) is sufficient to allow the rebuild of the Graph Database;
Scalability : Querying the Graph database might be the most critical part regarding performance. Therefore, the Graph Database can be replicated in the future by multiple, independent instances. Since there are no strict consistency requirements, changes in the Graph can be applied independently. In the control flow, all write operations on Self-Descriptions pass the Metadata Store. Therefore, the consistency can be enforced by that database.

Generically returning the Self-Description files containing claims that influence query response is not possible. To get the relevant Self-Description files, the query to the Graph Database can be formulated to return the Gaia-X entity that is the credentialSubject of a Verifiable Credential. Then this can be used as a filter for the Self-Description endpoint, to download the Self-Description file.

Neo4j is used as implementation of the Graph database.

Limitation: queries to a non-Enterprise Neo4j graph database return an empty record when no results are found, rather than an empty list.

When there is no data in the Graph database, i.e. no claims extracted from Self-Descriptions, there is still a configuration node for the neosemantics module [17], which enables Neo4j to support the RDF data model required here. openCypher queries over all nodes without a WHERE clause or without specifying relationships always return this node, unless regular users are denied access to the configuration node as follows:

DENY MATCH {*} ON GRAPH neo4j NODES _GraphConfig TO PUBLIC

However, this DENY operation is only supported in Neo4j Enterprise [18]. It was decided not to implement a workaround that involves query rewriting, as this may have harmful side effects.

2.30.3. File Store

The File Store is responsible for persisting all file-based content submitted to the catalogue, i.e. Self-Descriptions and Schemas.

For the sake of simplicity, a folder in the file system is used as the file store. For future scalability, the file store can be realised using an object storage or a database.

2.30.4. Metadata Store

The Metadata Store persists the metadata for the elements (Self-Descriptions, Schemas and Trust Anchors). It makes it possible to efficiently identify the relevant files in the File Store when processing incoming requests.

It is realised as a relational database (e.g., PostgreSQL or MariaDB). Since all write requests are handled by the database, its transactional functionality guarantees the consistency of the data.


6.5.1. Contract

The state machine for a Contract Negotiation is visualised in the figure below:

Transitions marked with C indicate a message sent by the Consumer; transitions marked with P indicate a Provider message. Terminal states are final: the state machine may not transition to another state. A new Contract Negotiation (CN) may be initiated if, for instance, the CN entered the TERMINATED state due to a network issue. The message types that trigger the switch into each state are denoted in the bottom part of the corresponding state box. For further information, refer to the contract negotiation protocol section of the specification.
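
Since the figure is not reproduced here, the following Python sketch gives one possible encoding of such a state machine. The state names and transitions are an assumption based on the published Dataspace Protocol contract negotiation states and should be checked against the figure and the specification; the function next_state and the artificial START state are hypothetical.

```python
# (current state, target state) -> parties allowed to send the triggering message.
# C = Consumer, P = Provider. Assumed from the Dataspace Protocol specification.
TRANSITIONS = {
    ("START", "REQUESTED"): {"C"},
    ("START", "OFFERED"): {"P"},
    ("REQUESTED", "OFFERED"): {"P"},
    ("REQUESTED", "AGREED"): {"P"},
    ("OFFERED", "REQUESTED"): {"C"},
    ("OFFERED", "ACCEPTED"): {"C"},
    ("ACCEPTED", "AGREED"): {"P"},
    ("AGREED", "VERIFIED"): {"C"},
    ("VERIFIED", "FINALIZED"): {"P"},
}
TERMINAL = {"FINALIZED", "TERMINATED"}


def next_state(current: str, target: str, sender: str) -> str:
    """Return the new state, or raise ValueError for an illegal transition."""
    if current in TERMINAL:
        raise ValueError(f"{current} is terminal; start a new negotiation")
    if target == "TERMINATED":
        return target  # assumption: either party may terminate a running negotiation
    if sender not in TRANSITIONS.get((current, target), set()):
        raise ValueError(f"{sender} may not move {current} -> {target}")
    return target


state = "START"
for target, sender in [("REQUESTED", "C"), ("AGREED", "P"),
                       ("VERIFIED", "C"), ("FINALIZED", "P")]:
    state = next_state(state, target, sender)
print(state)  # FINALIZED
```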

After successful contract negotiation the Transfer Process can be invoked via the data plane. The state machine for the transfer process is shown in the diagram below:

Any implementation of the Eclipse Dataspace Protocol must implement the state machines shown above, in which the respective contract messages and transfer messages trigger the state changes.

The EDC connector implements the IDS Dataspace Protocol. State transition functions can trigger specific actions, such as invoking consent or contract managers. This is described in Contract Negotiation Architecture.

Steps required for Contract Negotiations:

Prerequisites that a provider has to fulfil to publish a service offering at a connector (using a provider connector):

Create an Asset on the provider side;
Create a Policy on the provider side;
Create a Contract definition on the provider side.

Steps required on the consumer side to request a service offering from a connector (using a consumer connector):

Fetch the catalogue on the consumer side;
Negotiate a contract on the consumer side;
Get the contract agreement ID.

These steps are described in Transfer-01-negotiation.

After a successful negotiation, the transfer process can be started.

6.5.2. Infrastructure Provisioning

In the first step, the Infrastructure Provider (or App/Data Providers, if needed) can use the Deployment Script Management UI (and/or API) to add their deployment scripts.
These scripts are either Crossplane or Terraform configuration files that, at the time of execution, will:

A) Provision the infrastructure resources (VM, Container or Storage);
B) Deploy Applications over the infrastructure resources (using Cloud-init, if needed);
C) Load data sets or images on the infrastructure resource (using Cloud-init, if needed).

After the deployment script has been added (via the available UI or the API), the DeploymentScriptID, a unique ID for that deployment script, is returned to the provider.

In the second step, when infrastructure offerings are created (or bundles of app/data + infrastructure, as required for the use cases explained in BP 09A and BP 09B), the DeploymentScriptID is added to the Self-Description (SD).

In the third step, when the offer has been selected and successfully contracted, the Infrastructure Provisioner API (the same API that handles the addition, removal and modification of the Deployment Scripts in step 1) is called (currently via the Data Space Connector Extension), and the DeploymentScriptID is passed to that API for execution.

The DeploymentScriptID is then validated, and if the validation is successful, the deployment script is fetched from the storage and executed. The Infrastructure Provisioner Module will (as explained above):

A) Provision the infrastructure resources (VM, Container or Storage);
B) Deploy Applications over the infrastructure resources (using Cloud-init, at the time of first boot of the instance);
C) Load data sets or images on the infrastructure resource (using Cloud-init, at the time of first boot of the instance).

It then shares the access data back with the consumer. At the time of writing this document, the access information and credentials are shared by email, but wallet solutions are planned to be used in the future.

The communication between the triggering module and the infrastructure provisioner is done via a message broker, to keep the process asynchronous.
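The three steps and the asynchronous hand-over described above can be sketched as follows. This is a minimal illustration only: the in-memory dictionary and queue stand in for the script storage and the message broker, and all function names, message fields and status values are hypothetical, not the actual Infrastructure Provisioner API.

```python
import queue
import uuid

# Hypothetical in-memory stand-ins for the script storage and the message broker.
SCRIPT_STORE = {}
BROKER = queue.Queue()


def add_deployment_script(content):
    """Step 1: store a deployment script and return its DeploymentScriptID."""
    script_id = str(uuid.uuid4())
    SCRIPT_STORE[script_id] = content
    return script_id


def trigger_provisioning(script_id, consumer):
    """Step 3: the triggering module only enqueues a message (asynchronous)."""
    BROKER.put({"deploymentScriptId": script_id, "consumer": consumer})


def process_next_message(execute):
    """Provisioner side: validate the ID, fetch the script and execute it."""
    message = BROKER.get_nowait()
    script_id = message["deploymentScriptId"]
    if script_id not in SCRIPT_STORE:
        return {"status": "REJECTED", "reason": "unknown DeploymentScriptID"}
    # 'execute' is a placeholder for running the Crossplane/Terraform script.
    access_data = execute(SCRIPT_STORE[script_id])
    return {
        "status": "PROVISIONED",
        "consumer": message["consumer"],
        "access": access_data,
    }
```

Because the trigger only enqueues a message, the caller does not block while the script runs; the provisioner picks up the message whenever it is ready.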

The data sharing between two participant agents is done via two connectors based on the Eclipse Dataspace Protocol (relying on the IDSA Dataspace Protocol). The Dataspace Protocol is divided into two parts: first, Contract Negotiation has to be invoked; after a successful negotiation, the Transfer Process can be invoked. Contract Negotiation is done via the contract negotiation protocol for the data exchange service, based on the trust protocol defined by Gaia-X.

The proposed data transaction model scope is compliant with the EDC Dataspace Protocol.

The Information model of the dataspace model is described here .

The figure sketches two implementations of a participant agent:

A Catalog Service is a Participant Agent that makes a DCAT Catalog available to other Participants offering Data services and assets published by providers.
A Connector is a Participant Agent (consumer) that performs Contract Negotiation and Transfer Process operations with another Connector, i.e. the Participant Agent of a Provider. An outcome of a Contract Negotiation may be the production of an Agreement, which is an ODRL Agreement defining the Usage Policy agreed to for a Dataset.

For further information refer to the specification section model .

The EDC connector implements the above-mentioned Dataspace Protocol as well as the depicted data and control planes and an additional Management API. This Management API is described in detail here.

The Transfer process is described in the Transfer Process Architecture and is implemented by providing a special extension to back-end systems.

In the current release, data orchestration refers to the data plane component responsible for the actual data transfer that takes place after a contract is established between the parties through the connector’s control plane. This orchestration of data flow is a crucial step, as it translates contractual agreements into real actions for exchanging data between a source and a destination.

Although this component will be implemented as an extension of the EDC connector, it is depicted as an external entity in various diagrams and system architectures. This approach underscores the goal of creating an external and independent solution that is agnostic of the specific connector used. Such independence is achievable as long as the connector supports the IDSA Dataspace Protocol, which is a key requirement to ensure interoperability within distributed ecosystems like sovereign data spaces or shared data infrastructures.

The primary role of this orchestrator is to serve as a bridge between the actual data source, located outside of the Simpl system, and the designated destination where the data is intended to flow. It connects these two points and handles the complexities of transferring data across systems that may differ in protocols and technology. Additionally, the orchestrator is designed as a specific component tailored to each type of data source. This specialisation makes it possible to externalise the technical management of the heterogeneous data sources that will be handled in the Simpl scenario, reducing complexity and promoting flexibility in the integration of various data ecosystems.

Steps for the transfer process:

Either the pull or the push pattern can be used for the transfer process:

Consumer pull
Provider push

Description according to the EDC Samples: https://github.com/eclipse-edc/Samples/tree/main/transfer

2.33.1. Consumer pull

The following diagram presents the state machine for this case:

The following diagram presents the sequence diagram for this case:

Provider and consumer agree to a contract (not displayed in the diagram);
Consumer initiates the transfer process by sending a DataRequest with destination type HttpProxy;
Provider Data Plane Selector is queried to find a suitable instance;
Provider Control Plane builds a DataAddress of type EDR, whose:
- endpoint corresponds to the public API of the selected Data Plane;
- auth key is Authorization;
- auth code is a signed token generated by the Control Plane with the following claims:
  - dad claim containing the encrypted DataAddress of the actual data source (provider ecosystem);
  - cid claim containing the contract id.
This DataAddress is sent to the consumer Control Plane through the Dataspace Protocol (DSP);
Consumer Control Plane converts the DataAddress into an EndpointDataReference object and dispatches it through the EndpointDataReferenceReceiverRegistry.

Once this process is completed, the consumer backend applications can use the received EndpointDataReference to query data from the provider Data Plane by providing the received token in the request header.
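As an illustration of this last step, the sketch below builds (without sending) an HTTP request from a received EndpointDataReference. The dictionary keys endpoint, authKey and authCode mirror the fields listed above, but their exact spelling, the example URL and the helper name are assumptions made for this sketch.

```python
import urllib.request


def build_pull_request(edr, path=""):
    """Build (but do not send) a GET request to the provider Data Plane.

    'edr' is a plain dict standing in for an EndpointDataReference; the key
    names used here are assumptions for this sketch.
    """
    request = urllib.request.Request(edr["endpoint"] + path, method="GET")
    # The auth key names the HTTP header, the auth code is the signed token.
    request.add_header(edr["authKey"], edr["authCode"])
    return request


# Illustrative values only.
example_edr = {
    "endpoint": "https://provider.example/public",
    "authKey": "Authorization",
    "authCode": "<signed-token>",
}
example_request = build_pull_request(example_edr, "/data")
```

A consumer backend would then pass such a request to urllib.request.urlopen (or an equivalent HTTP client) to fetch the data.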

NOTE: For a Data Plane instance to be eligible for the Consumer Pull transfer, it must:

contain HttpProxy in the allowedDestTypes;
contain a property with the key publicApiUrl, which contains the actual URL of the Data Plane public API.
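The two eligibility conditions in the note above can be expressed as a simple check. The dictionary layout of a Data Plane instance (allowedDestTypes list and a properties map) is an assumption for this sketch and does not reproduce the actual selector implementation.

```python
def eligible_for_consumer_pull(instance):
    """Return True if a Data Plane instance satisfies both conditions."""
    has_proxy = "HttpProxy" in instance.get("allowedDestTypes", [])
    has_public_url = bool(instance.get("properties", {}).get("publicApiUrl"))
    return has_proxy and has_public_url


def select_instance(instances):
    """Return the first eligible instance, or None if there is none."""
    for instance in instances:
        if eligible_for_consumer_pull(instance):
            return instance
    return None
```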

2.33.2. Provider push

The following diagram presents the state machine for this case:

The following diagram presents the sequence diagram for this case:

Provider and consumer agree to a contract (not displayed in the diagram);
Consumer initiates the transfer process, i.e. sends DataRequest with any destination type other than HttpProxy;
Provider Control Plane retrieves the DataAddress of the actual data source and creates a DataFlowRequest based on the received DataRequest and this data address;
Provider Control Plane asks the selector which Data Plane instance can be used for this data transfer;
Selector returns an eligible Data Plane instance (if any);
Provider Control Plane sends the DataFlowRequest to the selected Data Plane instance through its control API (see DataPlaneControlApi);
Provider Data Plane validates the incoming request;
If request is valid, Provider Data Plane returns acknowledgement;
DataPlaneManager of the Provider Data Plane processes the request: it creates a DataSource/DataSink pair based on the source/destination data addresses;
Provider Data Plane fetches data from the actual data source (see DataSource);
Provider Data Plane pushes data to the consumer services (see DataSink).
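The last three steps (creating a source/sink pair, fetching from the source and pushing to the sink) can be pictured with the simplified sketch below. The class and function names are hypothetical and only mimic the roles of DataSource and DataSink; they are not the EDC interfaces.

```python
class ListSource:
    """Stand-in for a DataSource: yields the parts of the actual data."""

    def __init__(self, parts):
        self._parts = list(parts)

    def read(self):
        yield from self._parts


class ListSink:
    """Stand-in for a DataSink: collects what is pushed to the consumer."""

    def __init__(self):
        self.received = []

    def write(self, part):
        self.received.append(part)


def run_transfer(source, sink):
    """Fetch every part from the source and push it to the sink."""
    count = 0
    for part in source.read():
        sink.write(part)
        count += 1
    return count
```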

6.5.4. Data Visualisation

Apache Superset is chosen as the visualisation component. Superset is a modern data exploration and data visualisation platform. It integrates well with a variety of data sources and is open source under the Apache License. Out of the box, it comes with features to create dashboards and to explore data. It also provides security configurations. A REST API for user and role management can be enabled, and permissions can be customised. Superset’s public REST API follows the OpenAPI specification and is documented here.

The community also provides automatic builds for multiple platforms, as well as prebuilt Docker images from the Superset Docker Hub repository.

6.5.5. Logging, Monitoring & Reporting

2.35.1. Types of logs - Reference model

The following table identifies the different types of logs that can be generated by an IT system together with their definition/description:

Grouping	Type of logs	Description
Business logs	Business logs	Record significant events or actions (related to steps within a business process or other functional use cases) that occur within a system, typically used for security, audit and troubleshooting purposes.
Technical logs	Application logs	Record events and activities generated by an application during its runtime, typically used for troubleshooting, monitoring performance and auditing activities within the application.
	Database logs	Record events and activities generated by a database (queries, transactions, schema changes), typically used for troubleshooting, ensuring data integrity (e.g., monitoring transaction rollbacks, deadlocks, or schema violations) and auditing access.
	System logs	Record events and activities generated by the operating system (OS) and system-level processes. These logs provide valuable information for monitoring system health, diagnosing issues and ensuring security. System logs can include: low-level system events (kernel event, hardware error), system-level events (service startups/shutdowns/failure), authentication and authorisation events (login attempts, privilege escalation).
	Network logs	Record events and activities related to network traffic, devices and communications within a network. These logs are essential for monitoring network health, diagnosing issues and ensuring security. Network logs can include: firewall logs (allowed/denied connections, intrusion detection alerts, security policy violations), router and switch logs (device startups, interface status changes, routing protocol updates), DNS logs (queries/responses, cache activity, DNS server configuration changes and errors), proxy logs (user access, URL requests, content filtering, bandwidth usage) and network traffic logs (packet-level data, including source and destination IP addresses, port numbers, protocols, packet payloads).
	Security logs	Security logs are not a distinct type of log; they are a subset of all the other logs listed above that makes it possible to detect and respond to security incidents effectively, e.g. intrusion detection alerts, security policy violations, anti-virus scans.
Infrastructure metrics	Infrastructure metrics	A metric is a piece of data that has a name, optional labels and a value. Metrics are not logs per se, as they need to be retrieved by periodically scraping an endpoint of the host system (pull instead of push paradigm). Once retrieved, the information is then persisted as a log.
Health check	Health check	A health check is a procedure that helps to determine whether a component is functioning correctly. Just like infrastructure metrics, health checks are not logs per se: each component exposes an API that returns a simple status on the health of the component, which is queried periodically.

For the sake of simplicity, application, database, system, network and security logs are grouped under the more generic term of Technical Logs .

2.35.2. Use Cases and Types of Logs

Use case	Type of logs required	Type of metrics required	Description
Log and monitor business actions, mostly for audit purposes.	Business logs		A business log in this case represents a specific step in a business process that is relevant/meaningful to be tracked. E.g. Submission of an onboarding request.
Log and monitor consumption of a resource (infra/data/app) for various reasons (billing, audit, policy enforcement, regulations compliance ...).		Infrastructure metrics	Depending on the type of data or infrastructure resource that is being consumed, different metrics can be relevant: CPU, RAM, I/O, transfer speed, ...
	Technical logs		For application usage and for some data usage cases, application and database logs will give information on what is being done with the data/application.
Log and monitor the usage of a Simpl-Open agent (of its components) for the purposes of audit and troubleshooting.	Technical logs		All types of technical logs are relevant for troubleshooting purposes and some may also be relevant for audit.
	Business logs		A business log is generated for each incoming and outgoing operation at the boundaries of the agent (communication towards Tier 1 or Tier 2 users).
		Infrastructure metrics	Infrastructure metrics generated by the deployed components of the agent (CPU, RAM, Disk, ...).
Monitor the health of the Simpl-Open agent.		Health check	Health is not logged but only monitored (the monitoring queries each technical component in real time to get its health status).

6.5.6. Business Logging & Monitoring

Business logs are generated for each type of operation on the Simpl-Open agent:

For synchronous operations:
1. Request;
2. Response.
For asynchronous operations:
1. Request;
2. ACK of the request;
3. Callback;
4. ACK of the callback.

Business logs are generated in 2 places (in 2 different Elastic indexes):

The Tier 1 API Gateway for Human to Machine interactions;
The Tier 2 API Gateway for M2M interactions.

Business logs contain the following fields:

1. Timestamp - Date and time at which the log was created;
2. Origin - Reference to the end-user (Tier 1) or Simpl-Open agent (Tier 2) that initiates the HTTP call;
3. Destination - Reference to the end-user (Tier 1) or Simpl-Open agent (Tier 2) that is targeted by the HTTP call;
4. Business Operation - Reference to the operation that is triggered (list to be defined);
5. Message type - For both sync and async transactions, 4 types: request, request ACK, response, response ACK;
6. Correlation ID - ID that is automatically generated for the first request in a transaction and reused by the response and ACKs, so that messages belonging to the same transaction can be correlated.
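The fields above, and the reuse of the correlation ID within one transaction, can be illustrated with the following sketch. The key names and the helper function are assumptions chosen for readability; they are not the gateway's actual log format.

```python
import uuid
from datetime import datetime, timezone


def new_business_log(origin, destination, operation, message_type,
                     correlation_id=None):
    """Build one business log record as a plain dict.

    A correlation ID is generated for the first message of a transaction
    and must be passed in again for every follow-up message.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "origin": origin,
        "destination": destination,
        "businessOperation": operation,
        "messageType": message_type,
        "correlationId": correlation_id or str(uuid.uuid4()),
    }


# Synchronous operation: the response reuses the request's correlation ID.
request_log = new_business_log(
    "user-a", "governance-authority", "ONBOARDING_REQUEST", "REQUEST")
response_log = new_business_log(
    "governance-authority", "user-a", "ONBOARDING_REQUEST", "RESPONSE",
    correlation_id=request_log["correlationId"])
```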

Business operations reference list:

Business Process	Step in BP	Business Operation Description	Business Operation	Agent - tier1/tier2 gateway	Documented	Integrated	Technical API call/response	Backing Service	Service owner
BP 03A	BP03A.01	Submission of an onboarding request by a provider or consumer.	ONBOARDING_REQUEST	Governance Authority - tier1-gateway	DONE	DONE	POST /onboardingApi/v1/onboardingRequests	Onboarding	Onboarding Team
	BP03A.02 - Approved	Approval of an onboarding request by the Governance Authority.	APPROVE_ONBOARDING_REQUEST	Governance Authority - tier1-gateway	DONE	DONE	POST /onboardingApi/v1/onboardingRequests/{onboardingRequestId}/approve	Onboarding	Onboarding Team
	BP03A.02 - Rejected	Rejection of an onboarding request by the Governance Authority.	REJECT_ONBOARDING_REQUEST	Governance Authority - tier1-gateway	DONE	DONE	POST /onboardingApi/v1/onboardingRequests/{onboardingRequestId}/reject	Onboarding	Onboarding Team
	BP03A.09	Confirmation of successful onboarding of a provider or consumer.	APPROVE_ONBOARDING_REQUEST	Governance Authority - tier1-gateway	DONE	DONE	POST /onboardingApi/v1/onboardingRequests/{onboardingRequestId}/approve	Onboarding	Onboarding Team
	BP03A.11	Confirmation of failed onboarding of a provider or consumer.	REJECT_ONBOARDING_REQUEST	Governance Authority - tier1-gateway	DONE	DONE	POST /onboardingApi/v1/onboardingRequests/{onboardingRequestId}/reject	Onboarding	Onboarding Team
BP 05B	BP05B.07	Submission of a resource description to the catalogue by a provider.	PUBLISH_CATALOG	Governance Authority - tier2-gateway		DONE	POST /self-descriptions	sd-tooling	Catalogue & Connector Team
BP 06	BP06.01	Search in the catalogue.	ADVANCED_SEARCH QUICK_SEARCH	Governance Authority - tier1-gateway	DONE	DONE	POST /xfsc-advsearch-be/v1/selfDescriptions/advanced GET /xfsc-advsearch-be/v1/selfDescriptions	xfsc-advsearch-be	Catalogue & Connector Team
BP 07	BP07.01	Submission of a contract request by a consumer to a provider.	ISSUE_CONTRACT			DONE	POST /contract/v1/credentials/agreements/{contractAgreementId}/definitions/{contractDefinitionId}	Contract Consumption Service	Contract & Billing Team
	BP07.06 BP07.09	Confirm signing of the Contract Agreement	CONTRACT_TERMINATE CONTRACT_FINALIZE			DONE	POST /agreements/{contractAgreementId}/definitions/{contractDefinitionId}/status	Contract Manager Orchestrator	Contract & Billing Team
BP 08	BP08.01	Submission of an infrastructure resource request by a consumer.					POST /transfer/start	Contract Consumption Service	Contract & Billing Team
	BP08.02	Completion of an infrastructure resource deployment by a provider.	TRIGGER_REQUEST				POST /scripts/trigger	Contract Consumption Service	Contract & Billing Team
BP 09A	BP09A.01	Submission of a request to transfer a data resource.	REQUEST_DATA_RESOURCE				POST /transfer/start	Contract Consumption Service	Contract & Billing Team
	BP09A.02	Completion of a data resource transfer by a provider.	TRANSFER_DATA_RESOURCE				POST /transfer/status/{id}	Contract Consumption Service	Contract & Billing Team
BP 09B	BP09B.01	Submission of a request to load data/application on a provider infrastructure by a consumer.	REQUEST_DATA_APPLICATION_RESOURCE				POST /transfer/start	Contract Consumption Service	Contract & Billing Team
	BP09B.04	Confirmation of a data/application resource deployment.	CONFIRM_DATA_APPLICATION_RESOURCE_DEPLOYMENT				POST /transfer/status/{id}	Contract Consumption Service	Contract & Billing Team

In addition to this predefined list of business operations, Simpl logs all incoming and outgoing requests between agents.

Technical implementation

Routes and ABAC/RBAC rules are loaded in the API Gateways through YAML files.

A separate YAML configuration file that maps routes and specific parameters (e.g. HTTP 200 response code) will be created. Currently, only a static configuration is supported; support for hot configuration changes is planned for a future release.
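To make the idea of such a mapping concrete, the sketch below resolves a business operation from a route, an HTTP method and a response code. The structure of the mapping is hypothetical (the actual YAML layout is not specified here); only the two routes and operation names are taken from the reference list above.

```python
import re

# Hypothetical in-memory equivalent of the route mapping configuration.
ROUTE_MAP = [
    {"method": "POST",
     "path": "/onboardingApi/v1/onboardingRequests",
     "status": 200,
     "operation": "ONBOARDING_REQUEST"},
    {"method": "POST",
     "path": "/onboardingApi/v1/onboardingRequests/{onboardingRequestId}/approve",
     "status": 200,
     "operation": "APPROVE_ONBOARDING_REQUEST"},
]


def _to_regex(template):
    """Turn a path template into a regex; each {placeholder} matches one segment."""
    parts = re.split(r"\{[^/}]+\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in parts) + "$")


def resolve_operation(method, path, status):
    """Return the business operation for a call, or None if no route matches."""
    for route in ROUTE_MAP:
        if (route["method"] == method
                and route["status"] == status
                and _to_regex(route["path"]).match(path)):
            return route["operation"]
    return None
```

With a static configuration, a mapping like this is loaded once at start-up; hot configuration changes would require reloading it at runtime.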

2.35.4. Resource Consumption Logging & Monitoring

Consumption of a resource (infrastructure/data/application) is logged and monitored for 2 main use cases :

Policy enforcement;
Billing.

The following (sub-)processes are considered:

Data Consumption
1. Direct access to the dataset (BP 09A);
2. Data is accessible from an infrastructure tenant (BP TBD - possibly extension of 09A);
3. Data is accessible through a built-in application deployed on the infrastructure tenant (BP 09B);
Infrastructure Consumption (BP 08).

For each of these scenarios, the Data Usage and Infrastructure Usage sections below describe the applicable types of usage policy (which also drive billing) and how consumption can be monitored for each of them.

Data Usage

Direct access to the dataset

In this scenario, the data is shared directly between the provider and the consumer (outside of Simpl-Open) and as such no usage policy can be enforced (only “legal enforcement” possible). It corresponds to the “allow usage of data” and “use data and delete afterwards” policies.

This also implies that billing always happens as a one-time payment, upfront of the consumption (possible extension to BP 07).

There is thus nothing that the Simpl-Open agent can log or monitor during consumption.

Data is accessible from an infrastructure tenant

In this scenario, the data provider shares the data on an infrastructure tenant provisioned by an infrastructure provider.

2 types of usage policies are considered, which can be technically enforced and billed:

Based on the number of usages (e.g. access the data 3 times)
Based on the duration (e.g. access the data for 7 days)

In both cases, policy enforcement and billing can be performed based on the logs from the storage.
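As a rough illustration, both policy types can be evaluated from a list of access timestamps extracted from storage logs. The function names and the way the limits are passed are assumptions; the actual log format depends on what the storage API offers.

```python
from datetime import datetime, timedelta


def usage_count_allows(access_times, max_usages):
    """Number-of-usages policy: True while one more access is still allowed."""
    return len(access_times) < max_usages


def duration_allows(now, granted_at, days):
    """Duration policy: True while 'now' lies within the granted period."""
    return granted_at <= now < granted_at + timedelta(days=days)


def billable_usages(access_times, max_usages):
    """Number of logged accesses that fall within the agreed quota."""
    return min(len(access_times), max_usages)
```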

Architecture assumptions:

It is assumed that VMs and containers always have an attached storage;
It is assumed that Simpl-Open natively supports only S3-compliant storage but is extensible to support other storage types (offering an API).

The logs (e.g. storage, bandwidth) are collected over HTTP through the S3 logging API ( Object Storage: Standardising on the S3 API - Architecting IT ).

The exact list of logs that will be collected by Simpl-Open and the mechanism to collect these logs are still to be defined based on what is offered by the S3 logging API.

Data is accessible through a built-in application deployed on the infrastructure tenant

In this scenario, the data provider gives the consumer access to an application that offers restricted viewing (such as read only) or processing capabilities over the data resource. Only Scenario 1 is considered (a stand-alone application will be deployed on a dedicated infrastructure resource per consumer).

1 type of usage policy is considered, which can be technically enforced and billed:

Based on duration (e.g. access the data for 7 days)

Architecture assumptions:

It is assumed that the application is always deployed and terminated together with the infrastructure resource as part of the deployment script;
It is assumed that Simpl-Open only supports native applications deployed on Kubernetes but is extensible to support other platforms (offering an API).

In this case, monitoring the status of the underlying infrastructure resource is sufficient.

To do so, the following 2 options exist:

Collecting log files from the infra resource;
Collecting logs from the infrastructure provider API.

The first option could be more restrictive as it requires access to the infrastructure resource itself.

Simpl-Open therefore implements option 2 and collects logs through the kube-api exposed by the infrastructure provider.

Infrastructure Usage

2 types of usage policies are considered, which can be technically enforced and billed:

Based on duration (e.g. access to a VM for 7 days);
Based on resource utilisation (e.g. CPU, RAM, storage, bandwidth).

In the first case, monitoring the status of the infrastructure resource is sufficient and in the second case, it requires access to infrastructure metrics of the resource.

Architecture assumptions:

It is assumed that Simpl-Open only supports natively:
- S3-compliant storage
- Kubernetes containers platform
- VMWare virtual machines

but is extensible to support other platforms (offering an API).

Both the status of the resource and infrastructure metrics can be collected through the infrastructure provider APIs:

S3 API for storage;
kube-api for containers;
VMWare API for VMs.

The exact list of logs/metrics that will be collected by Simpl-Open and the mechanism to collect these logs are still to be defined based on what is offered by the APIs.

2.35.5. Reporting

This section is only a placeholder for capabilities falling outside the scope of the current release and will be completed at a later time.

2.35.6. Log Wrapper 2.0

The existing LogWrapper (Log4J wrapper - SIMPL - Confluence) was designed at an early stage of project development. As the project progresses, there is increasing demand to add fields to the wrapper, with the goal of building better dashboards for the end user.

Nested document for HTTP

Log Type	Old Schema	New Schema	Comments
Infrastructure	{ "timestamp": "2024-08-20T06:20:12.201Z", "level": "INFO", "message": "Application started", "thread": "main", "logger": "eu.simpl.simpl_billing.SimplBillingApplication" }	{ "timestamp": "2024-08-20T06:20:12.201Z", "level": "INFO", "message": "Application started", "http": { "method": "GET", "action": "full URL" }, "thread": "main", "customFields": { "<key1>": "value1", "...": "..." } }	(optional) httpMethod - GET/POST/PUT/DELETE/OPTIONS; (optional) httpAction - full URL of the request; customFields - placeholder map in which any application-specific field can be placed.
Business	{ "timestamp": "2024-08-12T12:43:18.437+0200", "level": "BUSINESS", "message": { "msg": "Network", "messageType": "RESPONSE", "businessOperation": "[operation1, operation2]", "origin": "origin_name", "httpStatus": "200", "destination": "destination_name", "correlationId": "correlation_id", "user": "user_name" }, "thread": "main", "httpRequestSize": "null", "httpExecutionTime": "null" }	{ "timestamp": "2024-08-12T12:43:18.437+0200", "level": "BUSINESS", "message": { "msg": "Network", "messageType": "RESPONSE", "businessOperation": "operation name", "origin": "origin_name", "destination": "destination_name", "correlationId": "correlation_id", "user": "user_name", "userIp": "userIp", "customFields": { "<key1>": "value1", "...": "..." } }, "thread": "main", "http": { "method": "GET", "action": "full URL", "status": "200", "requestSize": "null", "executionTime": "null", "responseSize": "null" } }	httpMethod - GET/POST/PUT/DELETE/OPTIONS; httpAction - full URL of the request; httpResponseSize - captures the response size; businessOperation - a single-value attribute instead of an array; userIp - IP address of the user; customFields - placeholder map in which any application-specific field can be placed.

Backward compatibility

A new version of the Java package will be published, which should still be able to produce V1 logs. All teams can adopt the new version of the library at their own pace. Backward compatibility is to be handled at the Logstash level.

A Python equivalent will be produced for the Python-based applications of the Data2 team.
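To illustrate the difference between the two business log schemas shown in the table above, the sketch below moves the flat HTTP fields of an old-schema record into the nested http document of the new schema. It is an assumption-laden illustration, not the Logstash configuration or the library code: fields that do not exist in the old schema (method, action, responseSize, userIp, customFields) are simply left out, and businessOperation is not converted because the mapping from an array to a single value is not specified here.

```python
import copy


def to_new_business_schema(old_log):
    """Return a copy of an old-schema business log with a nested 'http' document."""
    new_log = copy.deepcopy(old_log)
    message = new_log.get("message", {})
    http = {}
    # httpStatus moves from the message body into the nested http document.
    if "httpStatus" in message:
        http["status"] = message.pop("httpStatus")
    # The two top-level HTTP fields are renamed and nested as well.
    if "httpRequestSize" in new_log:
        http["requestSize"] = new_log.pop("httpRequestSize")
    if "httpExecutionTime" in new_log:
        http["executionTime"] = new_log.pop("httpExecutionTime")
    new_log["http"] = http
    return new_log
```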

Links

Requirements: SIMPL-2949
Epic: SIMPL-4114
Onboarding request for custom fields: SIMPL-18664
Code: https://code.europa.eu/simpl/simpl-open/development/contract-billing/common_logging

6.5.7. Schema Management

2.36.1. Architecture Overview

Introduction & Guiding Principles

This document specifies the high-level, service-oriented architecture for the core governance services within the SIMPL framework, with a primary focus on Schema Lifecycle Management. The design is guided by the principles of Separation of Concerns, Decoupling, and Interoperability.

The architecture is defined by the Schema Management Service (SMS), which is the definitive source of truth and lifecycle manager for all schemas and vocabularies. The SMS provides the tools for a Governance Authority to manage, version, and control the status of schemas.

Downstream services, such as a Catalogue Service, act as consumers of these schemas. The interaction between the SMS and its consumers is event-driven. This ensures that consuming services are decoupled, resilient, and performant, as they do not need to query the SMS in real time to perform their functions (e.g., validating resource descriptions).

At its core, the architecture defines two key services:

Schema Management Service (SMS): The exclusive, authoritative system for managing the entire lifecycle of schemas, from creation and versioning to publication and revocation.
Catalogue Service (Example Consumer): A consuming service that validates and stores Resource Descriptions. It subscribes to events from the SMS to maintain a local, synchronised registry of published schemas.

Service Interaction & Data Flow

The system operates on a clear separation between the management of schemas and their consumption. Management is a direct interaction with the SMS, while consumption is driven by events that the SMS produces.

A. Core Flow: Schema Lifecycle Management (Governance Perspective)

Schema Version Creation: A Governance Administrator creates a new version of a schema by submitting a SHACL file and its associated metadata (e.g., version number, changelog) to the SMS Management API. The SMS validates and stores the new version.
Schema Publication: To make an entire schema family available for use, the Administrator uses the SMS Management API to change the status of the Schema Concept to PUBLISHED.
Event Notification: Upon successfully changing the status, the SMS:
- Updates its internal database to reflect the new status.
- Publishes a SchemaPublished event. This event contains the schema’s metadata, its new status, and the content of its versions.
Schema Revocation: If a schema family is no longer approved for use, the Administrator changes its status to REVOKED via the API. This triggers a SchemaRevoked event, preventing new data from being validated against any version of this schema.
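The governance flow above can be sketched as a small in-memory model. Everything in this sketch is hypothetical (class name, method names, event fields, and the initial DRAFT status, which is not defined in this document); it only illustrates that a status change updates the internal state first and then notifies subscribers.

```python
class SchemaManagementSketch:
    """Hypothetical in-memory model of the SMS lifecycle flow."""

    def __init__(self):
        # concept name -> {"status": str, "versions": {version: shacl_text}}
        self.concepts = {}
        self.subscribers = []

    def subscribe(self, callback):
        """Register a callable standing in for a subscriber's webhook."""
        self.subscribers.append(callback)

    def create_version(self, concept, version, shacl_text):
        """Store a new schema version; 'DRAFT' is an assumed initial status."""
        entry = self.concepts.setdefault(
            concept, {"status": "DRAFT", "versions": {}})
        entry["versions"][version] = shacl_text

    def set_status(self, concept, status):
        """Change the lifecycle status of a schema concept and notify subscribers."""
        if status not in ("PUBLISHED", "REVOKED"):
            raise ValueError("unsupported status: " + status)
        entry = self.concepts[concept]
        entry["status"] = status  # update the internal state first
        event = {
            "type": "SchemaPublished" if status == "PUBLISHED" else "SchemaRevoked",
            "concept": concept,
            "status": status,
            "versions": dict(entry["versions"]),
        }
        for callback in self.subscribers:
            callback(event)
        return event
```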

B. Example Use Case: Event-Driven Validation (Consumer’s Perspective)

This flow describes how a consuming service, like a Catalogue, leverages the event-driven model.

Subscription & Caching: The Catalogue Service subscribes to events from the SMS. This is typically implemented via a secure webhook where the SMS calls a private endpoint on the Catalogue (e.g., POST /internal/events/schema-published).
Local Registry Update: When the Catalogue receives a SchemaPublished or SchemaRevoked event, it processes the payload and updates its own local, optimised registry of published schemas. The Catalogue is now self-sufficient for validation.
Submission for Publication: A Provider submits a Resource Description (RD) to the Catalogue Service. The RD references a specific schema version.
Local and Fast Validation: The Catalogue Service performs all validation against its local registry:
- It checks that the schema family is present in its registry and has an active (PUBLISHED) status.
- It uses its local copy of the schema version to perform SHACL validation on the RD.
- Crucially, there are no real-time API calls from the Catalogue to the SMS during the validation process.
Processing: If validation is successful, the Catalogue persists the RD. If it fails (either because the RD is invalid or the schema is not in its local published registry), it returns an error.
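The consumer-side flow can be sketched in the same spirit. The event and RD field names are assumptions, and the SHACL validation itself is passed in as a callable because no specific validation library is prescribed here; the point of the sketch is that validation only touches the local registry.

```python
class CatalogueRegistrySketch:
    """Hypothetical consumer keeping a local registry of published schemas."""

    def __init__(self, shacl_validate):
        # shacl_validate(rd_content, shacl_text) -> bool is supplied by the caller.
        self.shacl_validate = shacl_validate
        self.published = {}  # concept -> {version: shacl_text}
        self.stored = []

    def on_event(self, event):
        """Webhook handler: keep the local registry in sync with the SMS."""
        if event["type"] == "SchemaPublished":
            self.published[event["concept"]] = dict(event["versions"])
        elif event["type"] == "SchemaRevoked":
            self.published.pop(event["concept"], None)

    def submit(self, rd):
        """Validate an RD against the local registry only, then persist it."""
        versions = self.published.get(rd["concept"])
        if versions is None or rd["version"] not in versions:
            return "ERROR: schema not published"
        if not self.shacl_validate(rd["content"], versions[rd["version"]]):
            return "ERROR: invalid resource description"
        self.stored.append(rd)
        return "STORED"
```

Since revocation removes the schema family from the local registry, later submissions against it fail without any call to the SMS.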

Service Breakdown

Schema Management Service (SMS)

Objective : To be the single, authoritative source of truth and lifecycle manager for all governance-related schemas and vocabularies.
Key Responsibilities :
- Providing a secure API for creating and versioning schemas and vocabularies.
- Providing an exclusive interface for Governance Administrators to manage the lifecycle status (PUBLISHED/REVOKED) of schema concepts (families).
- Publishing events (SchemaPublished, SchemaRevoked) to notify subscribed services of lifecycle changes.
- Ensuring the integrity and validity of the schemas it manages.
- Making schema content publicly discoverable and retrievable via stable, referenceable URIs for ad-hoc discovery or bootstrapping new subscribers.
Interfaces :
1. Management API : A private, authenticated RESTful interface for all management tasks. It is the sole entry point for creating versions and changing the lifecycle status of schema concepts.
2. Resolver Interface : A public, read-only interface that serves the raw RDF content of schemas and vocabularies.
3. Event Publisher : An internal component that pushes notifications to the registered webhooks of subscribing services.

Catalogue Service (Example Consumer)

Objective : To act as a repository for validated Resource Descriptions, ensuring data quality by enforcing conformance to published schemas.
Key Responsibilities in this Context :
- Exposing an endpoint for receiving RD submissions.
- Maintaining a local, cached registry of published schemas, synchronized via events from the SMS.
- Exposing a private webhook endpoint for the SMS to push SchemaPublished and SchemaRevoked events.
- Performing all internal validation of submitted RDs against its local schema registry.
- Persisting RDs that successfully pass validation.

2.36.2. Simpl-Open Data and Metadata Models

Data Storage Strategy

Technology : The reference implementation uses Apache Jena Fuseki with a TDB2 backend. This provides a performant, standards-compliant RDF triple store with support for SPARQL queries and updates.
Guiding Principles :
- Data Segregation : The data is partitioned into logical datasets to enforce access control and simplify data management tasks like backup and indexing.
- Immutability : Published assets (schemas, vocabularies, RDs) are treated as immutable. Changes are handled by creating new versions, not by updating existing ones.
- Rich Metadata : Each asset is described with a comprehensive set of metadata properties to support discovery, administration, and provenance tracking.

Logical Datasets

The data is partitioned across five distinct datasets.

ds_schemas

Purpose : Contains only the raw SHACL content of schemas.
Structure : Each schema version is stored in its own named graph. The URI of the named graph is identical to the schema version’s public, dereferenceable URI.
Management : Managed exclusively by the Schema Management Service (SMS).

ds_schema_metadata

Purpose : Contains only the administrative metadata about schemas (e.g., titles, versions, status, changelogs).
Structure : All schema metadata triples are stored in a single, default graph or a dedicated named graph.
Management : Managed exclusively by the SMS.

ds_vocabularies & ds_vocabulary_metadata

Note : These dataset descriptions would follow the same status model as schemas, where the status is on the concept, not the version.
ds_vocabularies Purpose : Contains only the raw RDF content of vocabularies (e.g., SKOS thesauri).
ds_vocabulary_metadata Purpose : Contains only the administrative metadata about vocabularies.

ds_resource_descriptions

OUT OF SCOPE : This is out of scope for schema management but offers an insight into how schemas could be used by a downstream service like a catalogue. The Catalogue Service would manage this dataset to store validated Resource Descriptions.

Rationale for Granular Segregation

This five-dataset approach prioritises security and clarity of ownership over query performance.

Benefit : It provides the highest level of data isolation. Access control policies can be applied at the dataset level, ensuring, for example, that the Catalogue Service has absolutely no ability to modify schema content or metadata.
Trade-off : It introduces complexity for queries that need to join data across these datasets (e.g., the SMS’s internal validation checking a schema’s use of a vocabulary). Such operations require the service logic to perform multiple queries to different datasets and join the data at the application level, or rely on SPARQL’s SERVICE clause for federated queries, which can have performance implications.
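The application-level join mentioned in the trade-off can be illustrated with a small sketch. It assumes that two separate queries have already been run, one against ds_schemas and one against ds_vocabulary_metadata, and that each returned simple rows; the function name, the row shapes and the vocabulary URIs are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def find_inactive_vocabulary_use(
    schema_vocab_rows: Iterable[Tuple[str, str]],
    vocab_status_rows: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Join two query results in application code.

    schema_vocab_rows: (schema version URI, vocabulary URI) pairs from ds_schemas.
    vocab_status_rows: (vocabulary URI, status) pairs from ds_vocabulary_metadata.
    Returns the pairs whose vocabulary is not ACTIVE (revoked or unknown).
    """
    status_by_vocab: Dict[str, str] = dict(vocab_status_rows)
    return [
        (schema_uri, vocab_uri)
        for schema_uri, vocab_uri in schema_vocab_rows
        if status_by_vocab.get(vocab_uri) != "ACTIVE"
    ]
```

The alternative named above, a federated query using the SPARQL SERVICE clause, would move this join into the triple store at the cost of cross-dataset query overhead.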

Metadata Specification

The following specifies the properties used to describe schemas and vocabularies. Prefixes dct, owl, and simpl refer to Dublin Core, OWL, and a custom SIMPL namespace, respectively.

Vocabulary Metadata

Note : This metadata would be updated to match the schema model: status would be moved from the version to the concept.

Schema Metadata

Schema Concept (the “family”) :
- a simpl:Schema: Declares the resource as a schema concept.
- dct:title: A concise, human-readable title.
- dct:description: A detailed description of what this schema is used to describe.
- simpl:resourceType: A literal classifying the schema’s target. Values: “data”, “infrastructure”, “application”.
- simpl:status: A literal with a value of “PUBLISHED” or “REVOKED”. This controls whether any version of the schema can be used for new resource validation.
- simpl:latestVersion: An object property pointing to the URI of the most recent version of this schema concept.
Schema Version :
- a simpl:SchemaVersion: Declares the resource as a specific version.
- dct:isPartOf: Points back to the parent simpl:Schema concept URI.
- dct:creator: The URI or identifier for the user who submitted this version.
- dct:created: The xsd:dateTime of the submission.
- owl:versionInfo: The semantic version string (e.g., “1.0.0”, “1.1.2”).
- simpl:changelog: A literal containing a description of changes in this version.

Metadata Field Constraints

To ensure consistency and validity, the following constraints apply to the metadata fields submitted via the API. The schemaName corresponds to the name field submitted during creation, which forms the unique identifier in the API path.

Field	API Name	Type	Constraints	Example	Notes
Schema Name	name	String	Required. Must be PascalCase. Alphanumeric only. Min 3, Max 64 chars. No spaces or special characters.	ApplicationAsset	
Title	title	String	Required. Plain text. Min 10, Max 255 chars.	"Application Asset Schema"	
Description	description	String	Required. Plain text. Min 20, Max 2048 chars.	"A schema for describing a software application..."	
Resource Type	resourceType	String	Required. Alphanumeric only. Min 3, Max 64 chars. No spaces or special characters.	application	
Status	status	String	Required on PATCH. Must be one of: PUBLISHED, REVOKED.	PUBLISHED	Not required on first schema creation: the initial status is always PUBLISHED. Status changes are handled by the publish/revoke operation.
Version	version	String	Required. Must follow Semantic Versioning (SemVer) format (X.Y.Z).	"1.2.0"	
Changelog	changelog	String	Required. Plain text. Max 1024 chars.	"Added operationalStatus property."	Not required on first schema creation; it is supplied when a new version is created.
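The constraints in the table above could be checked along the following lines. This is a sketch only: the function names are hypothetical, and the PascalCase check is approximated by "starts with an uppercase letter, alphanumeric only", which is an assumption rather than a rule stated by the specification.

```python
import re
from typing import List

# Approximate PascalCase: leading uppercase letter, alphanumeric, 3 to 64 chars in total.
NAME_RE = re.compile(r"[A-Z][A-Za-z0-9]{2,63}")
RESOURCE_TYPE_RE = re.compile(r"[A-Za-z0-9]{3,64}")
# Plain X.Y.Z form without pre-release or build suffixes.
SEMVER_RE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def _text(meta: dict, key: str) -> str:
    """Return the field as a string, or an empty string if missing or not a string."""
    value = meta.get(key)
    return value if isinstance(value, str) else ""


def validate_concept_metadata(meta: dict) -> List[str]:
    """Check the fields submitted when a schema concept is created."""
    errors = []
    if not NAME_RE.fullmatch(_text(meta, "name")):
        errors.append("name: must be PascalCase, alphanumeric, 3-64 chars")
    if not 10 <= len(_text(meta, "title")) <= 255:
        errors.append("title: must be 10-255 chars")
    if not 20 <= len(_text(meta, "description")) <= 2048:
        errors.append("description: must be 20-2048 chars")
    if not RESOURCE_TYPE_RE.fullmatch(_text(meta, "resourceType")):
        errors.append("resourceType: must be alphanumeric, 3-64 chars")
    return errors


def validate_version_metadata(meta: dict) -> List[str]:
    """Check the fields submitted when a new schema version is created."""
    errors = []
    if not SEMVER_RE.fullmatch(_text(meta, "version")):
        errors.append("version: must follow X.Y.Z")
    if not 1 <= len(_text(meta, "changelog")) <= 1024:
        errors.append("changelog: required, max 1024 chars")
    return errors
```

An empty list means the metadata passed; a server could map a non-empty list to a 400 Bad Request response.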

System-Populated Metadata

The following metadata fields are automatically generated by the system and cannot be provided by the user.

Field	Property	Type	Populated	Description
Initial Version	owl:versionInfo	String	On Schema Creation	The system automatically sets the initial version to "1.0.0".
Initial Status	simpl:status	String	On Schema Creation	The system sets the initial status of a new schema concept to "PUBLISHED" by default. A "PUBLISH" event needs to be triggered.
Created Timestamp	dct:created	xsd:dateTime	On Version Creation	A timestamp that satisfies the xsd:dateTime format.
Creator	dct:creator	String	On Version Creation	An identifier of the authenticated user who created the version.
Parent Link	dct:isPartOf	xsd:anyURI	On Version Creation	An automatic link back to the parent simpl:Schema concept URI.
Latest Version Ptr	simpl:latestVersion	xsd:anyURI	On Version Creation	The parent schema concept is updated to point to the URI of the new version.
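A rough sketch of how these system-populated fields might be filled in is given below. The dictionary layout and function names are hypothetical; the URI pattern (concept URI followed by the version) is taken from the example payloads elsewhere in this section.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def new_version_metadata(concept: dict, version: str, creator_id: str,
                         now: Optional[datetime] = None) -> dict:
    """Build the system-populated metadata for a new version and update the parent."""
    now = now or datetime.now(timezone.utc)
    version_uri = f"{concept['uri']}/{version}"
    version_meta = {
        "uri": version_uri,
        "owl:versionInfo": version,
        "dct:created": now.isoformat(),     # ISO 8601 text, usable as xsd:dateTime
        "dct:creator": creator_id,
        "dct:isPartOf": concept["uri"],     # link back to the parent concept
    }
    # The parent concept always points at the most recently created version.
    concept["simpl:latestVersion"] = version_uri
    return version_meta


def new_schema_concept(base_uri: str, name: str, creator_id: str) -> Tuple[dict, dict]:
    """Create a concept with status PUBLISHED and its automatic first version 1.0.0."""
    concept = {"uri": f"{base_uri}/schemas/{name}", "simpl:status": "PUBLISHED"}
    first_version = new_version_metadata(concept, "1.0.0", creator_id)
    return concept, first_version
```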

2.36.3. API Specification

Overview

This section specifies the RESTful API for the Schema Management Service (SMS).

Endpoint Base URL : https://api.simpl.space/api/v1
Authorization : API endpoints enforce Role-Based Access Control (RBAC).
- governance-admin : Required for write operations (POST, PATCH).
- governance-viewer or provider : Required for read operations (GET).

Error Handling

Errors are returned using a structured JSON body compliant with RFC 7807 (Problem Details for HTTP APIs).

Example Error Response:

{
  "type": "https://api.simpl.space/errors/conflict",
  "title": "Conflict",
  "status": 409,
  "detail": "Version '1.0.0' for schema 'DataAsset' already exists.",
  "instance": "/api/v1/schemas/DataAsset/versions"
}
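As an illustration, a server could assemble such a body with a small helper. The helper name is hypothetical; the member names (type, title, status, detail, instance) and the application/problem+json media type come from RFC 7807.

```python
import json

PROBLEM_CONTENT_TYPE = "application/problem+json"  # media type defined by RFC 7807


def problem_details(status: int, title: str, detail: str, instance: str,
                    type_uri: str = "about:blank") -> dict:
    """Build an RFC 7807 problem details object as a plain dictionary."""
    return {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
    }


conflict = problem_details(
    status=409,
    title="Conflict",
    detail="Version '1.0.0' for schema 'DataAsset' already exists.",
    instance="/api/v1/schemas/DataAsset/versions",
    type_uri="https://api.simpl.space/errors/conflict",
)
body = json.dumps(conflict, indent=2)  # serialized response body
```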

Vocabulary Endpoints

List & Query Vocabularies

Endpoint : GET /vocabularies
Description : Retrieves a paginated list of vocabulary concepts.
Query Parameters :
- status (string, default: ACTIVE): Filter concepts by status (ACTIVE, REVOKED).
Success Response (200 OK) : A paginated list of vocabulary concepts.

Get Vocabulary Concept

Endpoint : GET /vocabularies/{vocabName}
Description : Retrieves the metadata for a single vocabulary concept, including its status and a list of all its available versions.
Success Response (200 OK) : A JSON object with the concept’s metadata.

Revoke or Activate Vocabulary Concept

Endpoint : PATCH /vocabularies/{vocabName}
Description : Changes the status of a vocabulary concept to ACTIVE or REVOKED.
Authorization : governance-admin
Request Body : {"status": "REVOKED"} or {"status": "ACTIVE"}
Responses :
- 200 OK: Returns the full, updated resource representation of the vocabulary concept.
- 404 Not Found.

Create Vocabulary Version

Endpoint : POST /vocabularies/{vocabName}/versions
Description : Submits a new version of an existing vocabulary concept.
Authorization : governance-admin
Request Content-Type : multipart/form-data
Parts :
1. metadata: JSON object. {"version": "1.1", "changelog": "Added new status."}
2. file: The vocabulary content as a .ttl file.
Responses :
- 201 Created: The Location header is set, and the response body contains the new version’s resource representation.
- 400 Bad Request, 409 Conflict.

Get Vocabulary Version Metadata

Endpoint : GET /vocabularies/{vocabName}/versions/{version}
Description : Retrieves the metadata for a single vocabulary version.
Responses :
- 200 OK: Returns the full metadata of the requested vocabulary version.
- 404 Not Found.

Schema Endpoints

Create Schema

Endpoint : POST /schemas
Description : Creates a new schema concept together with its initial version, which the system automatically sets to 1.0.0.
Authorization : governance-admin
Request Content-Type : multipart/form-data
Parts :
1. metadata: JSON object containing the concept’s properties. {"name": "DataAsset", "title": "Data Asset", "description": "Schema for describing a dataset.", "resourceType": "data"}
2. file: The schema content as a .ttl file.
Responses :
- 201 Created: The Location header is set to the new schema concept’s URI. The response body contains the new concept’s resource representation, including the auto-created first version.
- 400 Bad Request: The file part contains invalid SHACL or the metadata is malformed.
- 409 Conflict: A schema with the provided name already exists.
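The two-part request body can be assembled with the standard library alone, as in the sketch below. The function name and default file name are hypothetical, and the per-part Content-Type headers (application/json and text/turtle) are assumptions; the specification only names the parts metadata and file.

```python
import json
import uuid
from typing import Tuple


def encode_create_schema_request(metadata: dict, ttl: bytes,
                                 filename: str = "schema.ttl") -> Tuple[bytes, str]:
    """Encode the metadata and file parts as multipart/form-data.

    Returns the request body and the value for the Content-Type header.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = [
        delimiter,
        b'Content-Disposition: form-data; name="metadata"',
        b"Content-Type: application/json",
        b"",
        json.dumps(metadata).encode("utf-8"),
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode("utf-8"),
        b"Content-Type: text/turtle",
        b"",
        ttl,
        delimiter + b"--",   # closing delimiter
        b"",
    ]
    body = b"\r\n".join(lines)
    return body, f"multipart/form-data; boundary={boundary}"
```

The returned body and header value can be passed to any HTTP client, together with the credentials of a governance-admin user, in a POST to /schemas.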

List & Query Schemas

Endpoint : GET /schemas
Description : Retrieves a paginated list of schema concepts. This is the primary discovery endpoint for any schema consumer (e.g., Providers, tools, administrators).
Query Parameters :
- resourceType (string): Filter by type (data, infrastructure, application).
- status (string, default: PUBLISHED): Filter concepts by status (PUBLISHED, REVOKED).
Success Response (200 OK) : A paginated list of schema concepts.

Get Schema Concept

Endpoint : GET /schemas/{schemaName}
Description : Retrieves the metadata for a single schema concept, including its status and a list of all its available versions.
Success Response (200 OK) : A JSON object with the concept’s metadata.

Revoke or Publish Schema Concept

Endpoint : PATCH /schemas/{schemaName}
Description : Changes the status of a schema concept to PUBLISHED or REVOKED. This controls whether the schema family is active for use.
Authorization : governance-admin
Request Body : {"status": "REVOKED"} or {"status": "PUBLISHED"}
Responses :
- 200 OK: Returns the full, updated resource representation of the schema concept.
- 404 Not Found.

Create Schema Version

Endpoint : POST /schemas/{schemaName}/versions
Description : Submits a new version of an existing schema concept. The server performs all internal validation. A new version does not have its own status.
Authorization : governance-admin
Request Content-Type : multipart/form-data
Parts :
1. metadata: JSON object. {"version": "1.1.0", "changelog": "Added operationalStatus property."}. Note that resourceType is part of the concept and is not submitted here.
2. file: The schema content as a .ttl file.
Responses :
- 201 Created: The Location header is set, and the response body contains the new version’s resource representation.
- 400 Bad Request (e.g., for validation failure).
- 409 Conflict: The submitted version already exists for this schema.

Get All Versions for a Schema

Endpoint : GET /schemas/{schemaName}/versions
Description : Retrieves a list of all available versions for a single schema concept.
Authorization : governance-viewer or higher.
Success Response (200 OK) :
- A JSON array where each object is a schema version resource, containing versionInfo, changelog, created, etc.
Error Response :
- 404 Not Found: If the {schemaName} does not exist.

Get Schema Version Metadata

Endpoint : GET /schemas/{schemaName}/versions/{version}
Description : Retrieves the metadata for a single schema version.
Responses :
- 200 OK: Returns the full metadata of the requested schema version.
- 404 Not Found.

Event Notification API (Webhooks)

To support a decoupled, event-driven architecture, the SMS provides a mechanism for consuming services to subscribe to lifecycle events via webhooks. When a registered event occurs (e.g., a schema is published), the SMS will send a POST request to the subscriber’s registered URL with a detailed event payload.

Webhook Management

Endpoint : POST /webhooks
- Description : Creates a new subscription to receive event notifications.
- Request Body :

{
  "targetUrl": "https://catalogue.example.com/internal/events/simpl-sms",
  "events": ["SchemaPublished", "SchemaRevoked"]
}

- Responses :
  - 201 Created: The webhook subscription was created successfully.

{
  "webhookId": "",
  "targetUrl": "https://catalogue.example.com/internal/events/simpl-sms",
  "events": []
}

  - 400 Bad Request: The targetUrl is invalid or malformed.

Endpoint : GET /webhooks
- Description : Lists all active webhook subscriptions.
Endpoint : DELETE /webhooks/{webhookId}
- Description : Deletes a webhook subscription.
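The three webhook operations could be backed by a store along these lines. This is an in-memory sketch with hypothetical names; requiring an https target URL and generating the webhookId with uuid4 are assumptions, not requirements stated by the specification.

```python
import uuid
from typing import Dict, Iterable, List

ALLOWED_EVENTS = {"SchemaPublished", "SchemaRevoked"}


class WebhookStore:
    """In-memory stand-in for the SMS webhook subscription storage."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, dict] = {}

    def create(self, target_url: str, events: Iterable[str]) -> dict:
        """POST /webhooks: register a subscription; a ValueError maps to 400 Bad Request."""
        if not target_url.startswith("https://"):
            raise ValueError("targetUrl must be an https URL")  # assumption
        requested = set(events)
        unknown = requested - ALLOWED_EVENTS
        if unknown:
            raise ValueError(f"unknown event types: {sorted(unknown)}")
        subscription = {
            "webhookId": uuid.uuid4().hex,
            "targetUrl": target_url,
            "events": sorted(requested),
        }
        self._subscriptions[subscription["webhookId"]] = subscription
        return subscription

    def list_all(self) -> List[dict]:
        """GET /webhooks: list active subscriptions."""
        return list(self._subscriptions.values())

    def delete(self, webhook_id: str) -> bool:
        """DELETE /webhooks/{webhookId}: return False if the id is unknown."""
        return self._subscriptions.pop(webhook_id, None) is not None

    def targets_for(self, event_type: str) -> List[str]:
        """Target URLs the Event Publisher should notify for a given event type."""
        return [s["targetUrl"] for s in self._subscriptions.values()
                if event_type in s["events"]]
```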

Event Payload Structure

When an event is triggered, the SMS will send a POST request to the targetUrl.

Example: SchemaPublished Event Payload

This payload is self-contained, providing the consumer with all the information needed to add the schema to its local registry without further API calls.

Schema Published

{
  "eventId": "evt_123456789",
  "eventType": "SchemaPublished",
  "timestamp": "2023-10-27T10:00:00Z",
  "data": {
    "schema": {
      "uri": "https://api.simpl.space/schemas/ApplicationAsset",
      "title": "Application Asset",
      "description": "Schema for describing a software application.",
      "resourceType": "application",
      "status": "PUBLISHED"
    },
    "version": {
      "uri": "https://api.simpl.space/schemas/ApplicationAsset/1.0.0",
      "version": "1.0.0",
      "changelog": "Initial version.",
      "created": "2023-01-15T09:30:00Z"
    }
  }
}

Schema Revoked

{
  "eventId": "evt_123456789",
  "eventType": "SchemaRevoked",
  "timestamp": "2023-10-27T10:00:00Z",
  "data": {
    "schema": {
      "uri": "https://api.simpl.space/schemas/ApplicationAsset",
      "title": "Application Asset",
      "description": "Schema for describing a software application.",
      "resourceType": "application",
      "status": "REVOKED"
    },
    "version": {
      "uri": "https://api.simpl.space/schemas/ApplicationAsset/1.0.5",
      "version": "1.0.5",
      "changelog": "updated field x",
      "created": "2023-01-15T09:30:00Z"
    }
  }
}
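On the receiving side, a subscriber might process these payloads roughly as follows. The class name is hypothetical, and skipping events whose eventId was already seen is an assumption added to make redelivery harmless; the specification does not define delivery guarantees.

```python
import json
from typing import Dict, Set


class EventInbox:
    """Hypothetical webhook handler state for a subscribing service."""

    def __init__(self) -> None:
        self.seen_event_ids: Set[str] = set()
        self.status_by_schema: Dict[str, str] = {}

    def handle(self, raw_body: str) -> bool:
        """Apply one event payload; return False if it was a duplicate delivery."""
        event = json.loads(raw_body)
        event_id = event["eventId"]
        if event_id in self.seen_event_ids:
            return False
        event_type = event["eventType"]
        if event_type not in ("SchemaPublished", "SchemaRevoked"):
            raise ValueError(f"unsupported event type: {event_type}")
        schema_uri = event["data"]["schema"]["uri"]
        # The event type decides the resulting status of the schema family.
        self.status_by_schema[schema_uri] = (
            "PUBLISHED" if event_type == "SchemaPublished" else "REVOKED"
        )
        self.seen_event_ids.add(event_id)
        return True
```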

2.36.4. Content Examples and Use Cases

This section provides concrete examples of a schema and a corresponding resource description. It illustrates how a schema uses terms from the core simpl vocabulary and walks through the end-to-end validation use case based on the defined event-driven architecture.

Core simpl Vocabulary Excerpt

This excerpt defines the operationalStatus property and the Active, Inactive, and Decommissioned concepts. These terms would be part of the foundational simpl vocabulary managed by the Governance Authority.

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix simpl: <https://api.simpl.space/meta#> .

# Definition of the property itself
simpl:operationalStatus a rdf:Property, owl:ObjectProperty ;
    rdfs:label "Operational Status"@en ;
    rdfs:comment "Describes the current operational state of an asset."@en .

# Definition of allowed values (as individual concepts)
simpl:Active a owl:NamedIndividual ;
    rdfs:label "Active"@en .

simpl:Inactive a owl:NamedIndividual ;
    rdfs:label "Inactive"@en .

simpl:Decommissioned a owl:NamedIndividual ;
    rdfs:label "Decommissioned"@en .

Example Schema: ApplicationAsset

This schema describes a software application. It uses standard properties (from dct:) and properties defined within the simpl: vocabulary. The simpl:operationalStatus property is constrained by a sh:in list, which references concepts from the core vocabulary.

File Submitted by User : application-asset-v1.2.0.ttl
Submission Metadata (JSON part) : {"version": "1.2.0", "changelog": "Added operational status."}
Content of application-asset-v1.2.0.ttl :

@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix simpl: <https://api.simpl.space/meta#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://api.simpl.space/schemas/ApplicationAsset/1.2.0>
    a sh:NodeShape ;
    sh:targetClass simpl:ApplicationResource ;
    sh:property [
        sh:path dct:title ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path simpl:owner ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
    ] ;
    sh:property [
        sh:path simpl:operationalStatus ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        # Constrains the value to be one of the concepts from the core simpl vocabulary
        sh:in ( simpl:Active simpl:Inactive simpl:Decommissioned ) ;
    ] .

End-to-End Use Case: Validating a New Resource Description

This walkthrough illustrates the data flow for publishing a new ApplicationResource, reflecting the event-driven architecture in which the Catalogue Service subscribes to the Schema Management Service (SMS).

Phase 1: Schema Publication and Notification (Pre-requisite)

Administrator Publishes Schema : A Governance Administrator uses the SMS Management API to change the status of the ApplicationAsset schema family to PUBLISHED.
SMS Publishes Event : The SMS successfully updates its database and sends a SchemaPublished event notification to all its subscribers, including the Catalogue Service. The event payload contains all the metadata and SHACL content for all versions of the ApplicationAsset schema.
Catalogue Service Updates Local Registry : The Catalogue Service receives the event, validates its signature, and populates its own local, optimized registry of published schemas. It now has a local copy of the ApplicationAsset schema and knows it is active for validation.

Phase 2: Resource Description Submission and Validation

Provider Creates Resource Description : A Provider authors a Resource Description, ensuring it conforms to a specific version of a published schema.
@prefix simpl: <https://api.simpl.space/meta#> .
@prefix dct: <http://purl.org/dc/terms/> .
<https://my-company.com/resource/app-001>
    a simpl:ApplicationResource ;
    dct:conformsTo <https://api.simpl.space/schemas/ApplicationAsset/1.2.0> ;
    dct:title "Customer Relationship Manager" ;
    simpl:owner "Sales Department" ;
    simpl:operationalStatus simpl:Active .
Provider Submits to Catalogue Service : The Provider sends the Resource Description content in a POST request to the Catalogue Service.
Catalogue Service Performs Local Validation :
- The Catalogue Service parses the submitted RDF and extracts the schema URI: https://api.simpl.space/schemas/ApplicationAsset/1.2.0
- It consults its local registry. It confirms that the ApplicationAsset schema family is present and its status is PUBLISHED.
- It retrieves the content for version 1.2.0 from its local cache.
- No API call is made to the SMS.
- The Catalogue Service loads the provider’s RD and the local schema content into its internal SHACL engine for validation.
Publication and Storage :
- Since validation succeeds, the Catalogue Service persists the Resource Description in its own dataset, making it published and discoverable.
- If validation had failed (e.g., the RD was invalid, or the schema was not found in the local registry), the service would have returned an error to the Provider without storing the RD.
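The lookup step in the walkthrough needs the schema family and the version taken from the dct:conformsTo URI. A small helper for that split might look as follows; the function name is hypothetical, and the URI layout (.../schemas/{schemaName}/{version}) is inferred from the examples in this section.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_schema_version_uri(uri: str) -> Tuple[str, str, str]:
    """Split a schema version URI into (family URI, schema name, version).

    Raises ValueError if the path does not end in /schemas/{schemaName}/{version}.
    """
    cleaned = uri.strip().rstrip("/")
    segments = urlparse(cleaned).path.strip("/").split("/")
    if len(segments) < 3 or segments[-3] != "schemas":
        raise ValueError(f"not a schema version URI: {uri}")
    schema_name, version = segments[-2], segments[-1]
    family_uri = cleaned.rsplit("/", 1)[0]   # drop the trailing version segment
    return family_uri, schema_name, version
```

The family URI is used to check the PUBLISHED status in the local registry, and the full version URI identifies the cached SHACL content.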

2.36.5. Resolver Interface

Objective

The Resolver Interface is a public, read-only set of HTTP endpoints designed to provide stable, referenceable URIs for accessing the content of schemas and vocabularies managed by the Schema Management Service (SMS). It is the primary means for ad-hoc discovery and retrieval of schema and vocabulary resources by any client, including developers, tools, or other services bootstrapping their local caches.

The design is guided by the following principles:

Public Accessibility : Unlike the private, authenticated Management API, the Resolver Interface is open to the public for read-only operations.
Dereferenceable URIs : The URIs for schema and vocabulary concepts and versions are stable and can be resolved over HTTP to retrieve their content or metadata.
Content Negotiation : The interface supports content negotiation, allowing clients to request the resource representation that best suits their needs, such as RDF in various serializations or a JSON representation of the metadata.
Statelessness : Each request to the resolver contains all the information needed to process it, adhering to REST principles.

Content Negotiation

Content negotiation allows a client to request a specific representation of a resource. The Resolver Interface uses the standard HTTP Accept header for this purpose. Clients should specify their desired media type in this header.

Supported media types for schema and vocabulary versions include:

text/turtle: The raw SHACL or RDF content in Turtle format.
application/ld+json: The content in JSON-LD format. (PENDING)
application/rdf+xml: The content in RDF/XML format. (PENDING)

If a client does not provide an Accept header, the interface will respond with a default: text/turtle.
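A simplified selection routine for this behaviour is sketched below. The function name is hypothetical, only text/turtle is treated as supported because the other media types are marked PENDING, and the Accept parsing handles q-values and wildcards but not every detail of the HTTP specification.

```python
from typing import List, Optional, Tuple

SUPPORTED_MEDIA_TYPES = ("text/turtle",)   # JSON-LD and RDF/XML are marked PENDING
DEFAULT_MEDIA_TYPE = "text/turtle"


def negotiate(accept_header: Optional[str]) -> Optional[str]:
    """Pick a response media type; None means the server should answer 406."""
    if accept_header is None or not accept_header.strip():
        return DEFAULT_MEDIA_TYPE
    ranked: List[Tuple[float, str]] = []
    for item in accept_header.split(","):
        fields = [f.strip() for f in item.split(";")]
        media_range = fields[0].lower()
        quality = 1.0
        for param in fields[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if media_range and quality > 0:
            ranked.append((quality, media_range))
    # Highest q-value first; the sort is stable, so ties keep the client's order.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    for _, media_range in ranked:
        if media_range == "*/*":
            return DEFAULT_MEDIA_TYPE
        if media_range.endswith("/*"):
            prefix = media_range[:-1]          # "text/*" becomes "text/"
            for supported in SUPPORTED_MEDIA_TYPES:
                if supported.startswith(prefix):
                    return supported
        elif media_range in SUPPORTED_MEDIA_TYPES:
            return media_range
    return None
```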

Schema Resolver Endpoints

These endpoints provide access to content of schema concepts and their versions.

Resolve Schema Concept

Resolves a schema concept. By default, this endpoint returns the raw SHACL content of the latest published version of the schema, providing a stable URI for clients that always need the most up-to-date version.

Endpoint : GET /schemas/{schemaName}
Description : This endpoint provides a single, stable URI for a schema family. Its behavior depends on the Accept header.
- Requesting an RDF media type (e.g., text/turtle) retrieves the content of the latest schema version.
Authorization : None required. This is a public endpoint.
Responses :
- 200 OK : Returns the appropriate resource based on the Accept header.
- 404 Not Found : If the {schemaName} does not exist.
- 406 Not Acceptable : If the server cannot provide a representation in the requested format.

Example 1: Resolving the latest schema content (default)

Request:

GET /schemas/ApplicationAsset HTTP/1.1
Host: api.simpl.space
Accept: text/turtle

Response (Content-Type: text/turtle):

@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix simpl: <https://api.simpl.space/meta#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://api.simpl.space/schemas/ApplicationAsset/1.2.0>
    a sh:NodeShape ;
    sh:targetClass simpl:ApplicationResource ;
    sh:property [
        sh:path dct:title ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path simpl:owner ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
    ] ;
    sh:property [
        sh:path simpl:operationalStatus ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:in ( simpl:Active simpl:Inactive simpl:Decommissioned ) ;
    ] .

Resolve Schema Version

Retrieves the raw SHACL content for a specific version of a schema.

Endpoint : GET /schemas/{schemaName}/{version}
Description : This endpoint unambiguously resolves to the SHACL file for a specific, immutable version of a schema.
Authorization : None required. This is a public endpoint.
Content Negotiation : Supported. Clients can request different RDF serializations via the Accept header.
Responses :
- 200 OK : Returns the schema content in the requested or default format.
- 404 Not Found : If the {schemaName} or {version} does not exist.
- 406 Not Acceptable : If the server cannot provide a representation in the requested format.

Request:

GET /schemas/ApplicationAsset/1.2.0 HTTP/1.1
Host: api.simpl.space
Accept: text/turtle

The response is identical to the one shown in the Resolve Schema Concept example, since 1.2.0 is the latest version in that example.

2.37. Notification Service

This section outlines the architecture for the notification service, which uses an asynchronous API with Kafka for message queuing.

2.37.1. AsyncAPI Specification

The following AsyncAPI specification defines the contract for the notification service. It details the channels, messages, and operations for sending notifications.

https://code.europa.eu/simpl/simpl-open/development/contract-billing/notification-service/-/blob/main/docs/asyncApi/asyncapi.yaml

asyncapi: '3.0.0'
info:
  title: Notification Service API
  version: '1.0.0'
  description: API documentation for a notification service using Kafka.
defaultContentType: application/json
servers:
  production:
    host: 'kafka://localhost:9094'
    protocol: kafka-secure
    description: Kafka server
channels:
  notifications:
    address: "notifications"
    messages:
      EmailNotification:
        $ref: "#/components/messages/EmailNotification"
operations:
  SendNotification:
    action: send
    summary: Sending notification message to Kafka topic 'notifications'
    channel:
      $ref: '#/channels/notifications'
components:
  messages:
    EmailNotification:
      name: EmailNotification
      title: Sending email notification
      payload:
        type: object
        properties:
          channel:
            type: string
            enum:
              - email
            description: Type of notification channel.
          message:
            type: string
            description: Body of the message.
          to:
            type: string
            description: Email address of the recipient.
          cc:
            type: array
            items:
              type: string
            description: List of email addresses in CC.
          subject:
            type: string
            description: Subject of the message.


2.37.2. Publishing to Kafka

For any service to send notifications, the service will act as a Kafka producer and publish messages to the notifications topic. The notification service will then consume these messages and send out the actual notifications (e.g., emails).

Message Flow

Construct the Message : The service creates a message that conforms to the EmailNotification schema defined in the AsyncAPI specification.
Serialize the Payload : The message payload is serialized into a JSON string.
Publish to Kafka : The serialized payload is sent to the notifications Kafka topic.
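The three steps above can be sketched as follows. The function names are hypothetical, and the Kafka client is represented by an injected publish callable taking a topic and a byte payload, since the choice of client library is not specified here.

```python
import json
from typing import Callable, Iterable, Optional

NOTIFICATIONS_TOPIC = "notifications"


def build_email_notification(to: str, subject: str, message: str,
                             cc: Optional[Iterable[str]] = None) -> dict:
    """Step 1: construct a message matching the EmailNotification payload."""
    payload = {
        "channel": "email",   # the only value allowed by the AsyncAPI enum
        "to": to,
        "subject": subject,
        "message": message,
    }
    if cc:
        payload["cc"] = list(cc)
    return payload


def send_notification(publish: Callable[[str, bytes], None], payload: dict) -> None:
    """Steps 2 and 3: serialize the payload to JSON and publish it to the topic."""
    serialized = json.dumps(payload).encode("utf-8")
    publish(NOTIFICATIONS_TOPIC, serialized)
```

In a real producer, publish would wrap the send call of the chosen Kafka client, configured with the broker details described under Producer Configuration.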

Example Flow Diagram

Producer Configuration

The calling service must be configured with the Kafka broker details in order to connect and publish messages. In the dev environment, the broker is deployed in the common namespace along with other shared services.

A configuration entry is also required for the ‘to:’ email address used in the notifications.

Example: New Schema Notification

Here is an example of a notification message that would be sent to providers when a new schema is published.

{
  "channel": "email",
  "to": "providers@example.com",
  "subject": "New <ResourceType> Schema Published: <New Schema Name>",
  "message": "A new schema has been published with the following details:\n\n- **Name**: New Schema Name\n- **Title**: Title of the New Schema\n- **Description**: A brief description of what this new schema is about.\n- **Resource Type**: Schema"
}

7. Simpl-Open Security Architecture

7.1. Introduction and Perimeter of Intervention

The scope of Simpl-Open’s security is limited to its role as an agent that facilitates communication between participants (nodes).

An example of a typical end-to-end (E2E) flow is outlined below:

A Consumer decides to access a dataset managed by a Data Provider.
- The Data Provider ensures the security of its dataset and compliance with the applicable regulations, at the time of its creation.
- Only a portion of the dataset may be made available for sharing.
- Simpl-Open does not oversee the control of datasets, which remain entirely under the ownership and management of the Data Provider.
A Consumer reserves an infrastructure tenant from an Infrastructure Provider.
- This tenant is a Platform-as-a-Service (PaaS) environment derived from Infrastructure-as-a-Service (IaaS) and PaaS services provided by the Infrastructure Provider.
- Both the Data Provider and the Consumer can access this dedicated tenant.
Data is transferred from the Data Provider to the Infrastructure Provider’s infrastructure (dedicated tenant) using the Simpl-Open Agent, which manages:
1. Contract Establishment between the consumer’s and the provider’s organisations.
2. Secure Communication: Ensuring safe data transfer from the provider (source) organisation to the consumer (target) organisation.
3. Access Control: Granting tenant access to authorised personnel only.

This typical end-to-end flow is presented in the following figure.

Simpl-Open functions as middleware, managing Agent-to-Agent communication flows without storing any datasets. Because the participants (Data Controllers) are the primary decision-makers in data processing, legal and security responsibilities regarding the data rest solely with them. This includes their obligations to comply with legal and regulatory requirements for both data usage and provision.

As Simpl-Open is a distributed system, the classical end-to-end responsibilities (such as security and operations) are segmented as follows:

Network responsibility: facilitates the pure exchange of information through a network of Simpl-Open agents (i.e. deployment of Simpl-Open).
Local Node responsibility: each participant (node) is accountable for managing its datasets, applications, infrastructure, and workstation in compliance with its local regulations.

This segmentation of responsibilities is depicted in the following figure.

The overall security of a Data Space is the result of contributions from multiple actors, with their respective responsibilities, structured as follows:

Governance Authority : orchestrates the security framework across all participants.
Every Participant : Each participant is required to have local IT security plans and implement measures for their personnel, IT systems, and local deployment of the Simpl-Open agent.
Deployment of Simpl-Open network : provides security capabilities to ensure a robust protection for the node-to-node communication.
Simpl-Open agent : Each agent includes features to comply with the Simpl-Open IT Security Plan, ensuring alignment with the product’s security requirements.
Simpl-Open development : The development process adheres to stringent security measures, ensuring the product is resilient against potential threats.

This section focuses exclusively on the architecture of Simpl-Open as a product.

Separate architecture documents will be created for each deployment of Simpl-Open, including IT security plans tailored to specific Data Spaces and detailing the responsibilities assigned to each participant.

Several security aspects are addressed in the “DevSecOps Approach” section of the Architecture document, in the following areas:

Domain	Confluence Reference
User Management	Audit process (WIP)
OVH audit trails	OVH Log Data Platform service is used for K8s audit logs management.
Security testing (SAST, SCA, DAST)	SAST, DAST and SCA are implemented as part of the DevSecOps pipelines, as described in the DevSecOps Approach section.
Backup and restore	Cluster backups are made using Velero.

This part covers the aspects related to the production of Simpl-Open as a software product.

7.3. Simpl-Open (Product) Security Architecture

The following tables present the features that have already been introduced as part of the security architecture of Simpl-Open.

These features were identified as a consequence of other business features described in SC1 Annex 1, or were implemented based on standard best practices in application architecture.

In future versions of the architecture document, each relevant section could be updated to highlight how the security controls are implemented in Simpl-Open. This could take the form of a dedicated security-related paragraph in the respective sections, describing the specific security control implementation.

7.3.1. Functional Security

The following table presents the features that have been analysed and designed to address the security aspects listed below.

ID	Domain	Node	Feature	Section of document
1	Tier 1 Access Control	All	RBAC (Role Based Access Control)	Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation
2	Tier 2 Access Control	Governance Authority (Management)	ABAC (Attribute Based Access Control)	Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation
3	Local Directory System	All	Tier 1 Authentication Provider (OpenID Connect) User & Roles	Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation
4	Tier 1 Authorisation	All	Authorisation Tier 1	Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation
5	Tier 2 Authorisation	Governance Authority	Authorisation Tier 2	Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation
6	Tier 1 Authentication	All	Tier 1 Authentication Provider (OpenID Connect)	Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation
7	Tier 2 Authentication	Governance Authority	Identity Provider Federation Security Attribute Provider	Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation
8	Tier 2 Authentication	All	Tier 2 Authentication Provider	Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation
9	Communication	All	Encryption Integrity Authentication	Simpl-Open High Level Overview > Data Space Concepts (see " Data Space Participant: Tier I and Tier II ")
10	Logging	All	Logging Monitoring Reporting	Simpl-Open Technology Architecture > Detailed Technical Specifications > Logging, Monitoring & Reporting

7.3.2. Technical Security

Technical security includes deployment aspects of the open-source technology components, which are outlined below. These have been implemented based on general secure development guidelines and standard security architecture patterns.

ID	Domain	Node	Components	Section of document
1	Local Directory System	All	Keycloak federated with any Local IDP User & Roles microservice	Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA
2	Tier 1 Authorisation	All	Spring Cloud Gateway	Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA
3	Tier 2 Authorisation	Governance Authority	Spring Cloud Gateway	Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA
4	Tier 1 Authentication	All	Keycloak	Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA
5	Tier 2 Authentication	Governance Authority	EJBCA Security Attribute Provider microservice	Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA
6	Tier 2 Authentication	All	Tier 2 Authentication Provider microservice	Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA
7	Logging	All	Monitoring Service (ELK)	Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 3 - Management/Operation of Data Space

2.39.1. Deployment Consideration

In addition to the features above, which are implemented as part of the Simpl-Open product, every participant should consider the following typical infrastructure deployment practices (outside the scope of Simpl-Open), including technical security and hardening features such as:

- DMZ protected access and Network Security of the DMZ (DDoS or any other attack), IDS, FW, VPN
- 2-3 tiers deployment view (Via Container Security or VMs/Network Security)
  - API Gateway / Front End Layer
  - Backend Services
  - Secure Interface towards each Node Applications/Data Sources

These recommended measures are depicted in the following figure:

2.39.2. Agent-to-Agent Communication - Details

This section describes the handshake process to establish a secured mTLS connection with another agent.

Initial version (current release)

The initial version of the handshake process assumes that the called endpoint belongs to another Simpl-Open agent (that is, it is secured by that agent's Tier 2 Authorisation Gateway). For this reason, every HTTP call performed by the mTLS HTTP client (the only option in the current release) triggers an attempt to establish an mTLS connection. However, if the target endpoint does not belong to a Simpl-Open agent, the communication falls back to standard HTTPS (TLS).

Handshake process steps:

1. The caller mTLS client uses the Tier 2 Authentication Provider to retrieve a valid Ephemeral Proof:
  a. from the agent cache, if it already exists there;
  b. if it is not present in the cache, a new proof is requested from the Authority and stored in the cache for subsequent calls until it expires.
2. The caller agent performs an mTLS authentication:
  a. the caller agent always performs a credential check (OCSP request) against the Authority Identity Provider;
  b. the called agent's Tier 2 Authentication Gateway performs a credential check (OCSP request) against the Authority Identity Provider only if no validated proof associated with the credential’s public key is found in its cache.
3. The caller agent always sends the ephemeral proof to be validated:
  a. the called agent checks the cache to see whether the proof was already validated (if so, step 3.b is not performed);
  b. the proof is validated and stored in the cache with a TTL (Time To Live) calculated according to its expiration time.
4. Call the endpoint:
  a. the ABAC policies are then enforced and, once they pass, the request is processed.
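
The caller-side cache lookup in the first handshake step can be illustrated with a minimal Python sketch. It is not taken from the Simpl-Open code base: the names `EphemeralProof`, `ProofCache` and `request_from_authority` are hypothetical, and the proof is reduced to an opaque token with an expiry time.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class EphemeralProof:
    """Hypothetical stand-in for an ephemeral proof issued by the Authority."""
    token: str
    expires_at: float  # absolute expiry time, in seconds since the epoch


class ProofCache:
    """Caller-side cache: reuse a proof until it expires, then request a new one."""

    def __init__(
        self,
        request_from_authority: Callable[[], EphemeralProof],
        clock: Callable[[], float] = time.time,
    ) -> None:
        # Both callables are assumptions made for illustration only.
        self._request = request_from_authority
        self._clock = clock
        self._proof: Optional[EphemeralProof] = None

    def get_valid_proof(self) -> EphemeralProof:
        now = self._clock()
        # Request a new proof only when none is cached or the cached one expired.
        if self._proof is None or self._proof.expires_at <= now:
            self._proof = self._request()
        return self._proof
```

Injecting the clock keeps the sketch testable without waiting for a real proof to expire; a real agent would more likely rely on its existing cache infrastructure.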

Enhanced version

The enhanced version of the handshake process first discovers whether the called endpoint belongs to another Simpl-Open agent (that is, it is secured by that agent's Tier 2 Authorisation Gateway), and attempts to establish an mTLS connection only in that case. If the target endpoint does not belong to a Simpl-Open agent, the communication falls back to standard HTTPS (TLS). The main optimisations are:

No mTLS authentication is performed for non-Simpl-Open endpoints
Only one credential check per called agent per proof is performed
A proof that has already been exchanged is not sent again
Extended Ephemeral Proof validation (issuance and expiration times are checked for validity by all agents)

Handshake process steps:

1. The caller agent mTLS client (client, transparent proxy or any other implementation) uses the Tier 2 Authentication Provider to retrieve a valid Ephemeral Proof and its hash:
  a. from the agent cache, if it already exists there;
  b. if it is not present in the cache, a new proof is requested from the Authority and stored in the cache for subsequent calls until it expires.
2. The caller agent performs a preflight call (HTTP OPTIONS) to the called agent's Tier 2 preflight endpoint, passing the proof hash as a query string parameter to announce which proof will be used:
  a. if the called agent finds the proof, it returns a 200 - OK HTTP status code; otherwise it returns a 204 - No Content, which means that the proof needs to be validated (any other status means that the called endpoint does not belong to another agent, and the handshake is stopped).
3. The caller agent performs an mTLS authentication:
  a. the caller agent tries to retrieve the credential check response associated with the received credential’s public key from the cache;
  b. if no credential check response is found in the cache, the caller agent performs a credential check request (OCSP) to the Authority Identity Provider and stores the response in the cache with a specified TTL;
  c. the called agent's Tier 2 Authentication Gateway tries to retrieve the credential check response associated with the received credential’s public key from the cache;
  d. if no credential check response is found in the cache, the called agent's Tier 2 Authentication Gateway performs a credential check request (OCSP) to the Authority Identity Provider and stores the response in the cache with a specified TTL.
4. Only if a 204 - No Content status code was received in step 2, the caller agent sends the ephemeral proof to be validated:
  a. the proof is validated and stored in the cache with a TTL calculated according to its expiration time (and not greater than a specified MaxTTL parameter).
5. Call the endpoint:
  a. the ABAC policies are then enforced and, once they pass, the request is processed.
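
Two decisions in the enhanced handshake lend themselves to a short illustration: how the caller interprets the preflight status code, and how the called agent bounds the cache TTL of a validated proof. The Python sketch below is only an illustration under stated assumptions. The names `PreflightOutcome`, `interpret_preflight` and `proof_cache_ttl` are hypothetical, and times are plain seconds.

```python
from enum import Enum


class PreflightOutcome(Enum):
    PROOF_ALREADY_VALIDATED = "skip sending the proof"
    PROOF_MUST_BE_SENT = "send the proof for validation"
    NOT_A_SIMPL_AGENT = "stop the handshake and fall back to standard TLS"


def interpret_preflight(status_code: int) -> PreflightOutcome:
    """Map the preflight HTTP status code to the caller's next action."""
    if status_code == 200:
        return PreflightOutcome.PROOF_ALREADY_VALIDATED
    if status_code == 204:
        return PreflightOutcome.PROOF_MUST_BE_SENT
    # Any other status: the endpoint is not treated as another agent.
    return PreflightOutcome.NOT_A_SIMPL_AGENT


def proof_cache_ttl(expires_at: float, now: float, max_ttl: float) -> float:
    """TTL derived from the proof's remaining lifetime, capped by MaxTTL.

    Clamping to zero for an already expired proof is an assumption of this
    sketch; the text above does not state how that case is handled.
    """
    return max(0.0, min(expires_at - now, max_ttl))
```

For example, a proof that expires in 300 seconds with a MaxTTL of 120 seconds is cached for 120 seconds, while one that expires in 50 seconds is cached for 50 seconds.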

2.39.3. Principles of Operations Security: Technical Accounts for Administration

The administration of the Simpl-Open agent requires technical accounts that are allowed to:

Perform the first configuration of the agent
Create the accounts for managing the operations of the agent components. A standard structure for these accounts will be proposed but this structure should be tailored to the business and technical organisation of each participant, where rights are assigned to roles, and roles are assigned to people.

8. DevSecOps Approach

This section gives an overview of the architecture for the DevSecOps tools and environments for Simpl-Open. The following diagram is taken from the Specific Contract 1 Terms of Reference; it provides an overall view of the required DevSecOps approach, complemented with the tools and technologies chosen in our implementation.

8.1. Overview

The architecture diagram below shows the main components of the DevSecOps toolchain used to comply with the above-mentioned approach for the development of Simpl-Open.

The central CI/CD pipeline to build and test the applications and components, as well as the code repositories, are hosted on a GitLab instance on code.europa.eu.

OVH is used for the different Kubernetes clusters:

Dedicated cluster with various namespaces for development;
Dedicated cluster with various namespaces for integration;
Dedicated cluster with various namespaces for end-to-end testing;
Dedicated clusters for Keycloak identity management, GitLab Runners, DevSecOps tools and DevSecOps tool testing (staging env for DevSecOps tools).

Tickets, test cases and test reports are found in Jira, which is set up along with Xray for test management.

The diagram reflects the current status of the toolchain, with planned elements shown shaded.

8.2. Planning and Design of Clusters

The DevSecOps team provides the infrastructure resources to the development teams. It is responsible for setting up and managing the Kubernetes clusters on OVH.

The cluster “Dev-components” is set up as the development environment. Each team gets isolated namespace(s) to run its services.

To avoid vendor lock-in, it is proposed not to use managed services on OVH, such as managed databases. This ensures that the designed solution will also work on other cloud platforms without modification.

The DevSecOps team manages the clusters via Rancher and provides access to the projects and namespaces only to the team members of the product streams.

The expected workload for each of the environments is estimated based on the input from the different development teams and used for initial cluster sizing.

The table below shows the different stages and what they are used for.

Stage	Purpose	Data	Operations Level Agreement	Target User Group (responsible for deployment)	Release Management	Deployment Strategy (when releases are applied)
Dev	The development stage is where new features and enhancements are developed and tested; Isolated Simpl product teams dev environments (opt: connected to feature branches); No involvement in commercial applications.	Synthetic, manually created data per dev test to conduct functional testing of the individual product. (For further details see Test Plan).	n/a	Developers of consortia members QA engineers for testing	dev versions, e.g. 0.0.4-snapshot	Daily and continuous builds from develop branch and/or feature branch
Int	The int stage is dedicated to comprehensive testing of features and fixes before they are promoted to release branch; Used for functional testing of API endpoints and correct interaction between different components.	Synthetic, manually created data per dev test to conduct functional testing of individual product; May use sanitised production data. The sanitisation process will ensure no sensitive information will be used.	n/a	QA engineers for testing	release candidate versions	Manual sync for release candidates
Pre-Prod	The pre-prod stage serves as a pre-production environment where the system's end-to-end workflows are validated (end-to-end testing) from a user scenario point of view; This stage is also used for all non-functional testing (load/stress testing, security testing by DAST).	Subset of production data or representative data to simulate the production environment accurately; Data will be sanitised or anonymised to protect sensitive information while preserving the integrity of the dataset.	n/a	E2E testing team	stable release versions	Manual sync for released versions; the aim is to automate as much as feasible as maturity grows

8.3. Cluster Provisioning and Setup

8.4. Environment Onboarding Process

If a Dev-Team needs a new environment for any stage, they need to create an issue in the following GitLab repo: Simpl/Operations/Environment-onboarding.

The DevSecOps team will create the environment with the default tool stack and grant access to it afterwards.

Process Description:

Create project in Rancher in the desired cluster (dev, int, …);
Deploy default toolstack (ingress etc.);
Create project in Argo CD;
Grant access to Rancher project and Argo CD project.

8.5. Security and Access Control

The following best practices are used to secure the environments.

8.5.1. Access Management

Access is granted / revoked based on the process described below.

Definition of Basic roles (as tracked in PMO master list)

User role name	Description
ADMIN	Role for the operation of the DevSecOps toolchain
PSO/ EC	Members of PSO accessing the DevSecOps toolchain for quality assurance purposes
DEVELOPER	Developers who will use the DevOps pipeline for development activities
LEAD DEVELOPER	Developer with code ownership and elevated security privileges
DEVELOPER OPS	Developers with elevated infrastructure privileges
LEAD DEVELOPER OPS	Developers with code ownership and elevated security + infrastructure privileges
TESTER	Testers who will take part in the testing of developed code

8.5.2. Mapping of basic roles

This table shows the mapping between the basic roles and the internal roles within each tool.

Tool (with internal Roles) Basic Role	code.europa.eu	Argo CD (ADMIN, DEV, READ-ONLY)	Rancher (ADMIN, Project Member, Read-Only)	Vault (ADMIN, DEV, DENY-ALL)	Fortify (Security Lead (Admin), Developer, Lead Developer, Tester)	SonarQube (ADMIN, DEV)	Prometheus	Grafana	Loki	Aerokube Moon
everyone on PMO master list (non-need for any specific DevSecOps role)	DEVELOPER on Simpl group level (subject to self-registration)	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a
ADMIN	MAINTAINER is set manually for DevSecOps team	ADMIN for Argo CD	ADMIN	ADMIN	Security Lead (Admin)	ADMIN	ADMIN	ADMIN	ADMIN	ADMIN
PSO	DEVELOPER on Simpl group level (subject to self-registration)	READ-ONLY	READ-ONLY	DENY-ALL	n/a	Dev	n/a	VIEWER	n/a	n/a
DEVELOPER	DEVELOPER on Simpl group level (subject to self-registration)	Project Member (per project)	n/a	n/a	Developer (per application)	Dev	n/a	VIEWER	n/a	n/a
LEAD DEVELOPER	DEVELOPER on Simpl group level CODEOWNER in repository (subject to self-registration)	Project Member (per project)	n/a	n/a	Lead Developer (including Developer rights) (per application)	Dev	n/a	VIEWER	n/a	n/a
DEVELOPER OPS	DEVELOPER on Simpl group level (subject to self-registration)	Project Member (per project)	Project Member (per project)	DEV (per project)	Developer (per application)	Dev	n/a	VIEWER	n/a	n/a
TESTER	DEVELOPER on Simpl group level (subject to self-registration)	READ-ONLY	n/a	n/a	Tester	Dev	n/a	VIEWER	n/a	TESTER
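
A mapping of this kind can be represented as a simple lookup structure, for instance when checking access requests against it. The Python sketch below is purely illustrative: it copies only the PSO and TESTER rows from the table above, and the names `TOOL_ROLES` and `tool_role` are hypothetical. It treats any combination not listed as "n/a", which is an assumption of this sketch.

```python
from typing import Dict

# Two rows copied from the mapping table; entries that are "n/a" in the
# table are simply omitted here.
TOOL_ROLES: Dict[str, Dict[str, str]] = {
    "PSO": {
        "Argo CD": "READ-ONLY",
        "Rancher": "READ-ONLY",
        "Vault": "DENY-ALL",
        "SonarQube": "Dev",
        "Grafana": "VIEWER",
    },
    "TESTER": {
        "Argo CD": "READ-ONLY",
        "Fortify": "Tester",
        "SonarQube": "Dev",
        "Grafana": "VIEWER",
        "Aerokube Moon": "TESTER",
    },
}


def tool_role(basic_role: str, tool: str) -> str:
    """Return the internal tool role for a basic role, or "n/a" if none is mapped."""
    return TOOL_ROLES.get(basic_role, {}).get(tool, "n/a")
```

Defaulting to "n/a" means that an unknown role or tool never receives access by accident.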

Keycloak is used as a central instance for user management and providing the login mechanisms for the different tools:

8.6. Security Checks

Application security scans will be done by Fortify on Demand (FoD). The scope is the following:

Static Application Security Test (SAST);
Software Composition Analysis (SCA);
Container Scanning;
Dynamic Application Security Test (DAST).

SAST (Static Application Security Testing) is a method of testing the source code of an application for security vulnerabilities without executing the application. It’s a type of white-box testing that analyses the application’s internal structures and logic on the code level for flaws that might lead to security risks and vulnerabilities. Main advantages of using SAST:

Enables detection and remediation of vulnerabilities early in the Software Development Lifecycle (SDLC), reducing costs and risks;
Can analyse the entire codebase, providing a comprehensive security assessment;
Helps the project to comply with security standards like OWASP, PCI DSS and others.

WHEN: SAST is performed on every commit/merge

WHAT: the source code is scanned

WHERE: during the development lifecycle

WHO: the pipeline starts the analysis automatically

Fortify is integrated with the central component development pipeline which triggers a Static Application Security Test (SAST) when new code is merged in the repository by the development or the integration teams. For the feature branch a scan can be requested manually. The result of the scan is shown on the dashboard of Fortify. Developers can review the results of their components and handle identified vulnerabilities in the next version of the code. A quality gate set in Fortify must be met for the pipeline to merge the code to the main branch.

SCA (Software Composition Analysis) is a process used to identify and manage risks associated with the use of third-party and Open-Source software components in an application. It is a critical aspect of modern software development, as applications increasingly rely on external libraries and frameworks. Main advantages of using SCA:

Protects the Simpl agent by identifying and addressing vulnerabilities in external components;
Mitigates legal risks from improper use of Open-Source licenses;
Automates tracking and reporting of third-party components.

WHEN: SCA is performed on every commit/merge

WHAT: third-party libraries are scanned

WHERE: during the development lifecycle

WHO: the pipeline starts the analysis automatically

SCA is integrated with the same approach as SAST, using the Debricked service of the Fortify online platform.

Container scanning refers to the process of analysing and inspecting container images for security vulnerabilities, compliance issues, malware and other potential risks before they are deployed in a production environment. Here’s a brief overview:

Purpose: The primary goal is to ensure that containers are free from known security vulnerabilities and adhere to organisational security policies. This helps in maintaining the integrity, confidentiality and availability of applications running inside these containers.
Components Scanned:
- Base Images: Checking if the base images from which containers are built have any known vulnerabilities.
- Dependencies: Examining all the software libraries and dependencies included within the image for vulnerabilities or outdated versions.
- Configuration: Assessing the container’s configuration files for potential security misconfigurations.

WHEN: container scanning is performed in the development lifecycle on every branch and on every commit/merge

WHAT: the containers and images used are scanned

WHERE: during the development lifecycle

WHO: the pipeline starts the analysis automatically

DAST (Dynamic Application Security Testing) is a method used to identify security vulnerabilities in an application by analysing it during runtime. It simulates attacks on a running application, typically from an external perspective, to uncover vulnerabilities that can be exploited in real-world scenarios. Main advantages of using DAST:

Identifies vulnerabilities of the Simpl software as an attacker would exploit them;
Finds issues related to application logic, runtime behaviour and server configuration.

WHEN: at the end of the development lifecycle, after e2e testing.

WHAT: DAST is performed on the Agents.

WHERE: DAST is performed in the pre-prod environment

WHO: the E2E testing team is responsible for configuring and running the scans

DAST is implemented using the WebInspect service of the Fortify online platform.

While SAST, SCA and Container Scanning are integrated into the component pipeline and used on the component level, DAST will be triggered manually after the integrated Simpl agent has been deployed in the pre-prod environment and the preparatory steps have been completed. This testing type may require a runtime of up to 2-3 days by Fortify. Once the testing is complete, the Fortify dashboard will provide an overview of the results. As with SAST and SCA, developers can review the identified issues and act on them as necessary.

8.6.1. Kubernetes best practices

In order to set up a flexible and scalable environment for managing our containerised applications, Kubernetes has been identified as the best fit technology. The primary reasons for the choice are:

vendor neutral platform;
support of microservices infrastructure;
autoscaling capabilities to handle growing and fluctuating workloads;
support for DevSecOps;
support of multi-tenant environments;
high availability.

Main features of Kubernetes:

Network Policies : Network policies are defined to control traffic within the cluster; OVH provides a default set of policies. Inside the Kubernetes cluster, pod-level ingress and egress isolation is applied according to the specific needs of the cluster.
Secret Management : Usage of Kubernetes Secrets and/or Vault to store sensitive data securely. Secrets used in the pipeline are stored in an external tool like Vault;
Role-Based Access Control (RBAC) : Implemented RBAC to manage access to cluster resources based on user roles; Keycloak groups are mapped to Rancher projects to ensure proper isolation of namespaces. User roles are mapped to Keycloak groups/roles.
Service Accounts : Usage of Service Accounts to authenticate and authorise pods;
Image Scanning : The integrity and security of container images are verified before deployment in the pipeline; This is done by using Trivy/Fortify triggered by the pipeline.
Regular Updates and Patching : Regular updates keep the Kubernetes distribution and components up to date with the latest security patches. Since Kubernetes is a managed service, updates are made available by OVH. Admins regularly check the update options and decide whether to stay with the current version or update.

8.7. Continuous Deployment and GitOps

This section outlines the implementation of continuous deployment (CD) and GitOps in Simpl-Open using GitLab CI/CD, Helm Charts, Argo CD and multiple environments.
The goal is to automate the release management process, ensuring consistent and reliable deployments across various environments.

8.7.1. Architecture

The architecture consists of:

GitLab : The source code is managed on the GitLab instance at code.europa.eu, where the CI/CD pipeline also runs;
Helm Charts : Packages that describe Kubernetes applications, installed with the Helm package manager;
Argo CD : A continuous deployment tool for automating the application release process for the development, integration and pre-prod environments;
Fleet Management : A Kubernetes concept and tool to centrally manage DevSecOps tools, agents and components on every cluster in the landscape;
Multiple Environments : The deployment is done in multiple environments.

8.7.2. GitFlow

The project uses GitFlow, a branching strategy for Git repositories designed to streamline collaboration and manage releases in software projects. GitFlow is widely adopted in software development workflows, especially for projects with regular release cycles.

The advantages of the GitFlow approach:

Clearly defines branches for development, features, releases and hotfixes;
Makes it easier for teams to work on different features or issues simultaneously;
Facilitates managing multiple releases and hotfixes.

Artefacts should be versioned according to the Semantic Versioning Concept.
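
As an illustration of what versioning according to Semantic Versioning implies for tooling, the following Python sketch parses a `MAJOR.MINOR.PATCH[-prerelease]` string and orders a pre-release before the corresponding release. It is a simplified sketch, not the project's tooling: the names `Version`, `parse_version` and `sort_key` are hypothetical, build metadata (`+...`) is not supported, and pre-release identifiers are compared as plain strings rather than with the full precedence rules of the specification.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Simplified pattern: numeric core plus an optional pre-release suffix.
_SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    prerelease: Optional[str]


def parse_version(text: str) -> Version:
    """Parse a version such as "0.0.4-snapshot" or "1.2.0"."""
    match = _SEMVER.match(text)
    if match is None:
        raise ValueError(f"not a semantic version: {text!r}")
    major, minor, patch, prerelease = match.groups()
    return Version(int(major), int(minor), int(patch), prerelease)


def sort_key(version: Version) -> Tuple[int, int, int, bool, str]:
    """Order by core version; a pre-release sorts before its release."""
    return (
        version.major,
        version.minor,
        version.patch,
        version.prerelease is None,  # False (pre-release) sorts first
        version.prerelease or "",
    )
```

With this ordering, a development version such as 0.0.4-snapshot sorts before the release 0.0.4.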

The GitFlow approach in Simpl is depicted in the following diagram:

Explanation for the branches:

main : The main branch, which represents the production-ready code;
develop : The development branch, where new features are developed and tested;
feature/* : Feature branches for specific tasks or fixes;
release : Release branch for release candidates;
hotfix : Forked from tags of the main branch; used for urgent fixes.

In GitLab (code.europa.eu) the main and develop branches are set up as protected. Merges to main can be initiated from develop, release and hotfix. Developers remove the release and hotfix branches once the updated code has been merged to main. The develop branch only accepts merges from feature/* branches (for instance feature/SIMPL-1234). Developers remove the feature branch after merging to develop.
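
The merge rules described above can be expressed as a small check. The Python sketch below is illustrative only and does not reflect how GitLab enforces branch protection. The function name `is_merge_allowed` is hypothetical, and accepting suffixed names such as `release/1.2.0` or `hotfix/1.2.1` is an assumption, as the text only names the branches `release` and `hotfix`.

```python
def is_merge_allowed(source: str, target: str) -> bool:
    """Check a merge against the rules stated for the protected branches."""
    if target == "main":
        # main accepts merges from develop, release and hotfix.
        return source in {"develop", "release", "hotfix"} or source.startswith(
            ("release/", "hotfix/")  # assumption: suffixed branch names
        )
    if target == "develop":
        # develop only accepts feature/* branches with a non-empty name.
        return source.startswith("feature/") and len(source) > len("feature/")
    # The text states no rule for other target branches.
    raise ValueError(f"no merge rule is stated for target branch {target!r}")
```

For example, `is_merge_allowed("feature/SIMPL-1234", "develop")` is accepted, whereas merging the same feature branch directly into `main` is rejected.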

8.7.3. CI/CD Pipeline

The pipeline is implemented using GitLab CI/CD. It includes multiple steps to ensure proper testing and security before deployment.

Pipeline features:

Building the Code : Compile the source code into executable binaries or artifacts;
Perform Unit Testing : Execute automated tests to verify individual components of the codebase for correctness;
Create package for distribution : Bundle the application into a distributable format such as JAR and publish it to the GitLab artifacts;
Perform Quality Testing with Sonar : Perform static code analysis to identify code quality issues and technical debt;
Build and Push Docker Image : Create a Docker image from the application and push it to the GitLab Container registry;
Perform Image Scan and produce SBOM : Scan the Docker image for vulnerabilities and compliance issues using Trivy, and produce the list of all dependencies (SBOM) with Trivy and Fortify;
Perform SAST (Static Application Security Testing) : Analyse the source code for security vulnerabilities without executing the code, using Fortify;
Perform SCA (Software Composition Analysis) : Identify and analyse Open-Source components (dependency scanning) for known vulnerabilities, using Trivy for image scanning and Fortify for dependencies;
Release reports, Java package and Helm Chart : Update the version, then package and release a Helm chart for Kubernetes deployments in the GitLab Package Registry.

Pipeline runs can be tracked in the GitLab UI. Issues are indicated by the progress diagrams in the UI; details are provided by GitLab based on the logs of the failing jobs.

8.7.4. Release Management

The release management process will be carried out on two distinct levels (App of Apps concept):

Unitary Component development
Agent Components (integration and pre-prod)

The following diagram shows the overall process:

As shown in the diagram there are multiple stages with different environments:

Development Environment : The development environment is where new features and enhancements are developed and tested on component level;
Integration Environment : The integration environment is dedicated to integration activities and integration testing before components are promoted to pre-prod;
Pre-Prod Environment : The pre-prod environment serves as an environment where features are integrated and tested together as a cohesive unit, end to end. This is where load testing takes place.

A Production environment is not planned for Simpl-Open, only for Simpl-Labs and Simpl-Live.

As an overall concept, the release management process is automated using GitLab CI/CD and Argo CD:

Prepare a Release : A new release version is created following the GitFlow approach on GitLab;
Build and Test : The extended pipeline stages run to validate the release quality, including E2E testing and extensive security scanning;
Deploy : Argo CD deploys the release to the target environment;
Verify : Verify the deployment by running tests and monitoring application logs.

8.7.5. Helm Charts

Helm Charts are used to manage components and Kubernetes applications. Similarly, Helm Charts define the application on Agent level.

Chart Management : Helm Charts are managed using GitLab CI/CD, allowing for automated updates and versioning;
Deployment : Helm Charts are deployed to the target environment using Argo CD.

Component development teams release their components as Helm Charts. Following the App of Apps concept, the individual components (Apps) are defined and managed at the Agent level (App). This is achieved by configuring the Helm Charts hierarchically: Agent-level configuration overrides component-level configuration.
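
The hierarchical override can be pictured as a recursive merge of two sets of values, where the Agent-level values win on conflict. The Python sketch below only illustrates that idea and is not Helm's implementation. The function name `merge_values` and the sample keys (`replicaCount`, `image`) are hypothetical examples.

```python
from typing import Any, Dict


def merge_values(component: Dict[str, Any], agent: Dict[str, Any]) -> Dict[str, Any]:
    """Return component-level values overridden by agent-level values.

    Nested mappings are merged key by key; any other value (scalars, lists)
    from the agent level replaces the component-level value outright.
    """
    merged = dict(component)  # shallow copy, so the inputs are not modified
    for key, value in agent.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical values, for illustration only.
component_defaults = {
    "replicaCount": 1,
    "image": {"repository": "example/component", "tag": "1.0.0"},
}
agent_overrides = {"replicaCount": 2, "image": {"tag": "1.1.0"}}
effective = merge_values(component_defaults, agent_overrides)
```

In this example the effective configuration keeps the component's image repository while taking the replica count and image tag from the Agent level.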

Benefits of the App of Apps Concept

Scalability : Simplifies managing a large number of components in complex agents;
Centralised Management : Enables a single point of control for all components;
Flexibility : Supports managing components across multiple environments or clusters;
Modularity : Each component remains independently manageable, facilitating updates and troubleshooting;
GitOps Alignment : Integrates seamlessly with GitOps workflows for declarative management.

8.7.6. Argo CD

Argo CD is used to automate the deployment process. It is deployed in each environment to support the release process.

Application Definition : Define applications in Argo CD’s configuration file (application.yaml);
Source Code Management : Argo CD manages automatic deployments. A deployment can be triggered by a source code change in the repository, by a change in the package registry, or manually. Development teams are free to configure these triggers for their components;
Deployment Strategies : Choose deployment strategies for each application, such as rolling updates or blue-green deployments.
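
As an illustration of an application definition, the sketch below builds an Argo CD `Application` manifest as a Python dictionary and prints it as JSON, which Kubernetes tooling accepts alongside YAML. The field names follow the Argo CD `Application` resource. The application name, repository URL, path, project and namespaces are hypothetical and would need to be replaced with the values used by the component team. Omitting the automated sync policy corresponds to the manual trigger mentioned above.

```python
import json


def build_application(name, repo_url, path, revision, namespace, automated=True):
    """Build an Argo CD Application manifest as a plain dictionary."""
    app = {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},  # assumed namespace
        "spec": {
            "project": "default",  # assumed project
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": revision,
            },
            "destination": {
                "server": "https://kubernetes.default.svc",  # in-cluster API
                "namespace": namespace,
            },
        },
    }
    if automated:
        # Automated sync: deploy on change, remove orphans, revert drift.
        app["spec"]["syncPolicy"] = {"automated": {"prune": True, "selfHeal": True}}
    return app


# Hypothetical component, for illustration only.
manifest = build_application(
    name="example-component",
    repo_url="https://gitlab.example.org/simpl/example-component.git",
    path="charts/example-component",
    revision="main",
    namespace="example",
)
print(json.dumps(manifest, indent=2))
```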

8.8. Testing

The testing process is separated into different phases which are shown in the diagram below. The full test process is described in the testing document.

8.9. Monitoring and Logging

Monitoring and logging tools are used to track application performance and detect issues:

Prometheus : A monitoring tool that collects metrics from the environments and tools and stores them in a time-series database. Deployed as an agent on all clusters;
Grafana : A visualisation tool that displays dashboards based on Prometheus data. Deployed as a central instance on the DevSecOps-tools cluster;
Promtail : A log collection agent that will be deployed on the clusters to discover and gather logs;
Loki : A log aggregation system that will be deployed centrally to aggregate and manage the collected logs.

Prometheus agents will be deployed to all Kubernetes clusters and connected to the central Prometheus instance (on the DevSecOps tool server) to consolidate metrics. Grafana, deployed on the DevSecOps cluster, visualises the data made available to its central instance.

Metrics data is retained for 1 month.

Currently, an email-based alerting mechanism is set up in Grafana to notify operators of the events configured in the tool.

The infrastructure will be extended further with the deployment of Promtail and Loki.

8.10. Backup and restore

On the Kubernetes clusters Velero is used for backing up and restoring persistent volumes. The tool is deployed on the following clusters:

Cluster	Backup policy
dev-components	WEEKLY, DAILY
devint-agents	WEEKLY, DAILY
devsecops-keycloak	NONE
devsecops-runners	WEEKLY
devsecops-tools	WEEKLY, DAILY
devsecops-toolstest	NONE
preprod-agents	WEEKLY, DAILY

Backups cover only the Namespaces configured for the process.

Data restoration is carried out with the same tool.
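
The backup policy table can also be expressed as data, which makes it easy to check which clusters have no backup at all before an operation such as deprovisioning. The sketch below is illustrative only: the cluster names and policies are taken from the table above, a policy of NONE is represented as an empty set, and the helper names are hypothetical.

```python
from typing import Dict, List, Set

# Cluster names and policies as listed in the backup policy table.
BACKUP_POLICIES: Dict[str, Set[str]] = {
    "dev-components": {"WEEKLY", "DAILY"},
    "devint-agents": {"WEEKLY", "DAILY"},
    "devsecops-keycloak": set(),  # NONE
    "devsecops-runners": {"WEEKLY"},
    "devsecops-tools": {"WEEKLY", "DAILY"},
    "devsecops-toolstest": set(),  # NONE
    "preprod-agents": {"WEEKLY", "DAILY"},
}


def clusters_with(policy: str) -> List[str]:
    """Return the clusters that have the given backup policy, sorted by name."""
    return sorted(c for c, policies in BACKUP_POLICIES.items() if policy in policies)


def clusters_without_backup() -> List[str]:
    """Return the clusters for which no backup policy is configured."""
    return sorted(c for c, policies in BACKUP_POLICIES.items() if not policies)


print("Daily backups:", clusters_with("DAILY"))
print("No backups:", clusters_without_backup())
```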

8.11. Deprovision of Environments

Deprovisioning refers to the process of removing or deleting resources that were initially provisioned. In this context, it involves removing the application from the Kubernetes clusters across different environments and deleting the infrastructure managed by Terraform.

For Simpl, deprovisioning consists of two steps: the first removes the application, and the second removes the overall infrastructure to shut down Simpl completely.

8.11.1. Step 1: Application Deprovisioning

The application deployed via Argo CD can be removed by deleting the relevant application resource. This can be done using the Argo CD CLI or the Argo CD API.

Please note that this process has to be repeated for every instance of the application running on different Kubernetes clusters within each environment.
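
A minimal sketch of this repetition is shown below. It only prepares one `argocd app delete` command per Argo CD instance and prints it for review, without executing anything. The application name and server addresses are hypothetical. The `--server` and `--yes` flags are used here on the assumption that the installed Argo CD CLI version supports them, which should be confirmed with `argocd app delete --help`.

```python
import shlex
from typing import List


def deletion_commands(app_name: str, argocd_servers: List[str]) -> List[List[str]]:
    """Build one 'argocd app delete' command per Argo CD instance."""
    return [
        ["argocd", "app", "delete", app_name, "--server", server, "--yes"]
        for server in argocd_servers
    ]


# Hypothetical Argo CD instances, one per environment.
servers = [
    "argocd.dev.example.org",
    "argocd.int.example.org",
    "argocd.preprod.example.org",
]
commands = deletion_commands("example-agent", servers)

# Print the commands for review instead of running them.
for command in commands:
    print(shlex.join(command))
```

Printing the commands first gives the operator a chance to check the list of instances before anything is deleted.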

8.11.2. Step 2: Infrastructure Deprovisioning

Terraform maintains an up-to-date state file that reflects the current state of the infrastructure. To deprovision the infrastructure, it needs to be destroyed using Terraform.

Terraform offers the `destroy` command to delete the infrastructure that it deployed. The command refreshes the state file against the current infrastructure and removes every resource that Terraform manages.

The command has to be executed separately for each environment. Only the admins of the DevSecOps team can do this.

If the infrastructure including the DevSecOps-Tools-Cluster is destroyed, the managed applications such as Keycloak and Vault are deleted as well. Any data that needs to be retained must be backed up before this process is started.
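
The per-environment procedure can be sketched as follows. The helper prepares a `terraform plan -destroy` preview followed by `terraform destroy` for each environment, and refuses to do so until the operator confirms that backups have been taken. The environment names and directory layout are hypothetical. The commands are only returned for review, and `destroy` is left with its interactive confirmation.

```python
from typing import Dict, List


def destroy_commands(env_dirs: Dict[str, str], backups_confirmed: bool) -> List[List[str]]:
    """Build the Terraform commands needed to destroy each environment."""
    if not backups_confirmed:
        raise RuntimeError(
            "Back up any data that must be retained before destroying infrastructure."
        )
    commands: List[List[str]] = []
    for directory in env_dirs.values():
        # Preview what would be removed, then destroy (asks for confirmation).
        commands.append(["terraform", f"-chdir={directory}", "plan", "-destroy"])
        commands.append(["terraform", f"-chdir={directory}", "destroy"])
    return commands


# Hypothetical layout: one Terraform working directory per environment.
environments = {
    "dev": "infra/dev",
    "integration": "infra/integration",
    "preprod": "infra/preprod",
}

for command in destroy_commands(environments, backups_confirmed=True):
    print(" ".join(command))
```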

9. Annexes

9.1. Annex 1 - Mapping between functional requirements and components

While L2 requirements are mapped to functional requirements through the use of components in Jira, the table below provides an extract from this mapping.

Requirement ID	Summary	Component/s
SIMPL-402	Create usage policy	Resource Offering Editor
SIMPL-409	Assign usage policy	Resource Offering Editor
SIMPL-415	Enforce usage policies	Contract Management, Data Space Connector
SIMPL-469	Quick Search	Federated Catalogue, Search
SIMPL-500	Semantic Validation	Federated Catalogue, Vocabulary Management
SIMPL-503	Access policy publication	Resource Offering Editor
SIMPL-514	Assign Contract Template	Contract Management, Resource Offering Editor
SIMPL-1610	Defining preconfigured attributes	IAA
SIMPL-1612	Tier 2 identity attributes configuration	IAA
SIMPL-1613	Tier 2 attributes management - services	Onboarding, IAA
SIMPL-1616	Authentication between participant agents	IAA
SIMPL-1619	Handling different versions of application	Federated Catalogue, Resource Offering Editor
SIMPL-1629	Unified Orchestration Mechanism	Infrastructure Management
SIMPL-1630	Cross-Platform Service Management	Infrastructure Management
SIMPL-1655	Participant offboard operations	Onboarding, IAA
SIMPL-1658	Implement monitoring actions	Observability
SIMPL-1672	View the onboarding process documentation and initiate the onboarding	Onboarding
SIMPL-1673	Register onboarding application	Onboarding
SIMPL-1674	Onboarding request - tracking by applicant	Onboarding, IAA
SIMPL-1675	Onboarding requests - automated tracking and monitoring	Onboarding
SIMPL-1676	Onboarding requests - verification support	Onboarding, IAA
SIMPL-1677	Onboarding requests - manual approval support	Onboarding, IAA
SIMPL-1679	Onboarding requests - rejection support	Onboarding, IAA
SIMPL-1681	Attribute selection	Onboarding, IAA
SIMPL-1682	Create credential request	Onboarding, IAA
SIMPL-1683	Credential creation	Onboarding, IAA
SIMPL-1684	Credential request - tracking by participant	Onboarding, IAA
SIMPL-1685	Credential request - notification of completion	Onboarding
SIMPL-1686	Credentials installation and review - services	Onboarding, IAA
SIMPL-1687	Credentials installation and review - status and information	Onboarding, IAA
SIMPL-1688	Credentials installation and review - identity attributes check	Onboarding, IAA
SIMPL-1689	Users and roles configuration	Onboarding, IAA
SIMPL-1696	Mandatory quality rules	Federated Catalogue
SIMPL-1698	Validation of a resource description - feedback to the provider	Federated Catalogue, Schema Management
SIMPL-1699	Syntax Validation	Federated Catalogue, Schema Management
SIMPL-1704	Creating a resource description	Resource Offering Editor
SIMPL-1705	Uploading a resource description	Federated Catalogue, Resource Offering Editor
SIMPL-1715	Access policy definition	Resource Offering Editor
SIMPL-1719	Advanced Search	Federated Catalogue, Search
SIMPL-1728	Attributes of a self-description for a dataset	Federated Catalogue
SIMPL-1729	Attributes of a self-description for an application	Federated Catalogue
SIMPL-1730	Support for sharing across the Federated Dataspace	Federated Catalogue
SIMPL-1731	Adding a vocabulary	Vocabulary Management
SIMPL-1734	Advance search - Search parameters compliant with constraints and vocabularies	Schema Management, Vocabulary Management
SIMPL-1739	Triggering Mechanism	Data Space Connector, Infrastructure Management
SIMPL-1740	data space IAA Tier 2 customization	IAA
SIMPL-1741	End user authentication process - services	IAA
SIMPL-1743	Identity provider federation initialisation	IAA
SIMPL-1744	Ensure RBAC compliance	IAA
SIMPL-1745	Roles management operations	IAA
SIMPL-1746	Identity provider federation configuration	IAA
SIMPL-1747	Identity provider federation APIs	IAA
SIMPL-1748	End user authentication process - api	IAA
SIMPL-1749	Adding attributes of a self-description for a dataset/application/infrastructure	Federated Catalogue, Vocabulary Management
SIMPL-1751	Update vocabulary	Vocabulary Management
SIMPL-1752	Remove vocabulary	Vocabulary Management
SIMPL-1753	Updating attributes of a self-description for a dataset/application/infrastructure	Schema Management
SIMPL-1754	Selecting shared entries	Federated Catalogue
SIMPL-1755	Selecting dataspaces for catalogue sharing	Federated Catalogue
SIMPL-1756	Publishing shared entries to selected dataspace	Federated Catalogue
SIMPL-1757	Quality dimension and Quality Rules	Federated Catalogue, Resource Offering Editor
SIMPL-1758	Calculation of Quality Score	Federated Catalogue
SIMPL-1772	Storing results	Search
SIMPL-1784	Data sharing	Data Space Connector, Data Transfer
SIMPL-1787	Duplication of source before applying data processing	Data Transfer
SIMPL-1788	Template and Policy Engine for VM	Infrastructure Management
SIMPL-1789	Integration with Cloud APIs through Crossplane	Infrastructure Management
SIMPL-2882	Log infrastructure consumption metrics in the provider agent	Observability
SIMPL-2884	Metrics to log during Infrastructure resource consumption	Observability
SIMPL-2889	Monitoring infrastructure consumption	Observability
SIMPL-2894	Simpl shall log metrics when data is transferred through the Simpl-Open agent	Observability
SIMPL-2902	Monitoring data consumption	Observability
SIMPL-2904	Log all the metrics in a central repository per agent	Observability
SIMPL-2906	Logging amount and type of data transferred through Simpl-Open agent	Observability
SIMPL-2907	Logging the reason for transferring data	Observability
SIMPL-2914	Logs and traces compliant with EU regulations and with the rules set for the audit process	Observability
SIMPL-2916	Pre-configured monitoring dashboard	Observability
SIMPL-2917	Participant to configure custom dashboards	Observability
SIMPL-2919	Monitoring Simpl-Open agent software components technical logs	Observability
SIMPL-2921	Monitoring Simpl-Open agent infrastructure metrics	Observability
SIMPL-2924	Healthcheck endpoint for all of application components	Observability
SIMPL-2926	Application healthchecks in the monitoring dashboard	Observability
SIMPL-2929	Send alert when a component is unhealthy	Observability
SIMPL-2930	Store the alerts	Observability
SIMPL-2932	Make all logged information retrievable in real time from a reporting module	Observability
SIMPL-2941	Simpl shall store technical logs of agent (software) components in a log repository	Observability
SIMPL-2945	Store technical logs of the infrastructure on which Simpl-Open is deployed in a log repository	Observability
SIMPL-2946	Log Simpl agent infrastructure metrics	Observability
SIMPL-2949	Simpl shall log all business actions in the central logs repository	Observability
SIMPL-2966	Simpl shall log the usage of data/application resource on a provider's infrastructure by a consumer	Observability
SIMPL-2969	Simpl shall log all Tier I accesses to the agent	Observability
SIMPL-2970	Simpl shall log all security events generated by its components	Observability
SIMPL-3180	Alert thresholds definition	Observability
SIMPL-3182	Alert triggering	Observability
SIMPL-3382	The Usage Contract Agreement stored in human readable format	Contract Management
SIMPL-3835	Monitoring Simpl-Open agent Tier II transactions	Observability
SIMPL-3886	Monitoring Simpl business logs	Observability
SIMPL-3995	Define the onboarding process documentation	Onboarding
SIMPL-4417	Automated deployment of Simpl-Open pre-configured monitoring dashboard	Observability
SIMPL-4421	Simpl shall log all Tier II transactions	Observability
SIMPL-4422	Monitoring Simpl-Open agent infrastructure technical logs	Observability
SIMPL-4423	Monitoring Simpl-Open agent Tier I accesses	Observability
SIMPL-4424	Monitoring Simpl-Open agent security events	Observability
SIMPL-4428	Monitor Simpl agent infrastructure components health	Observability
SIMPL-4494	Sorting search results	Search
SIMPL-4495	Filter search result based on access policy	Federated Catalogue, Search
SIMPL-4889	Publishing a resource description	Federated Catalogue, Resource Offering Editor
SIMPL-5396	Request a data resource	Data Space Connector, Data Transfer
SIMPL-6100	Requesting an infrastructure resource	Infrastructure Management
SIMPL-6109	Access policy enforcement	Data Space Connector
SIMPL-6122	Data Visualization	Data Transfer, Infrastructure Management
SIMPL-10173	Configure a ruleset for the automatic validation of onboarding request documents	Onboarding
SIMPL-10174	Define identity attributes for an Onboarding Procedure Template	Onboarding, IAA
SIMPL-10489	Onboarding request automated document validation	Onboarding
SIMPL-10572	Governance Authority - Credentials actions	IAA
SIMPL-10594	Participant - Credential Renewal and Deployment	IAA
SIMPL-11315	Governance Authority - retrieving schemas and schema versions	Schema Management
SIMPL-11316	Governance Authority - creating a new schema for a new resource type	Schema Management
SIMPL-11318	Governance Authority - creating a new version of an existing schema for a resource type	Schema Management
SIMPL-11320	Governance Authority - revoking a schema	Schema Management
SIMPL-11321	Governance Authority - retaining a revoked schema for existing resource descriptions	Resource Offering Editor, Schema Management
SIMPL-11322	Governance Authority - ensuring that a revoked schema is not available for publishing a new resource description	Resource Offering Editor, Schema Management
SIMPL-11323	Governance Authority - validating a schema - syntax, semantics, and default properties	Schema Management
SIMPL-11328	Governance Authority - publishing a validated schema	Schema Management
SIMPL-11333	Governance Authority‚ notifying Providers about schema changes	Schema Management
SIMPL-12197	Governance Authority - Identity attributes assignment to participants	IAA
SIMPL-12898	A Provider consults an overview of its Resource descriptions	Federated Catalogue, Search
SIMPL-12903	A Provider consults the details of one of its own resource descriptions	Federated Catalogue, Search
SIMPL-12904	A Provider consults the version history of one of its own resource descriptions	Resource Offering Editor

Architecture

D1.3.4 Functional and Technical Architecture Specifications

1. Introduction

1.1. Scope of this document

1.2. Target Audience

1.3. Changes with respect to the previous version

1.3.1. 06 Mar 2026

1.3.2. 13 Feb 2026

1.3.3. 23 Jan 2026

1.3.4. 19 Dec 2025

1.3.5. 07 Nov 2025

1.3.6. 26 Sep 2025

1.3.7. 05 Sep 2025

1.3.8. D1.3.2 → D1.3.3

1.3.9. D1.3.1 → D1.3.2

2. Simpl-Open High-Level Overview

2.1. Simpl-Open Description

2.2. Data Space Concepts

2.2.1. Actors and Data Space Deployment

2.2.2. Data Space Participant: Tier I and Tier II

2.2.3. Anatomy of a Simpl-Open service

2.2.4. Built-in Services

Local Built-in Services

Cross-Agent Built-in Services

2.2.5. Access-through Services

Local Access Through

Remote Access Through

2.2.6. Simpl-Open Service Template - The Echo Service

2.3. Access Control & Trust

2.4. Digital Identities integration with EU Digital Identity Framework - eIDAS

2.4.1. eIDAS - EUDI Framework

Trust Services

eSignature

Electronic Identification

eID

2.4.2. Digital Identities in Simpl

Electronic Identification Use Cases (applicable only on Tier 1)

Onboarding using Electronic Identification

Participants’ end users (both Consumer/Providers organisations) login and access the Agent functionalities using Electronic Identification

Trust Services

Participants’ end users eIDAS electronic Signatures

Participants decide when the eIDAS QES is required

2.4.3. References

2.5. Connector

2.6. High-Level Architecture

2.7. Capabilities (Level 1)

2.8. Services (Level 2)

2.7.1. Administration Dimension

2.7.2. Data Dimension

2.7.3. Integration dimension

2.7.4. Infrastructure Dimension

2.7.5. Governance Dimension

2.7.6. Security Dimension

2.8.1. Deployment model

2.8.2. Scope covered by the Release 3.0

2.9. Architecture Framework

2.9.1. Architecture Approach

2.9.2. Architecture Principles

2.10. Architecture Patterns

2.11. Assumptions and Architecture Decisions

2.11.1. Assumptions

2.11.2. Architecture Decisions Record (ADR)

3. Simpl-Open Business Architecture

3.1. Actors

3.2. Simpl-Open Functional Architecture

4. Simpl-Open Application Architecture

4.1. Application Components Views

4.2. ACV - Domain 1 - Access Control & Trust

4.2.1. ACV - Domain 1 - Access Control & Trust - Static Views

ACV Static - Authorisation Service

ACV Static - Identity Attributes Service

ACV Static - Identity Provider Service

ACV Static - Onboarding Service

ACV Static - Tier 1 Authentication Service

ACV Static - Tier 2 Authentication Service

ACV Static - User Management Service

4.2.2. ACV - Domain 1 - Onboarding & IAA - Dynamic Views

ACV Dynamic - BP 03A – Onboarding of a new data space Participant - Providers (data - application - infrastructure) & Consumers

ACV Dynamic - BP 03B - Participant User and Roles Configuration

ACV Dynamic - BP 03C - End User Role Request