EUROPEAN COMMISSION
DIRECTORATE-GENERAL FOR COMMUNICATIONS NETWORKS, CONTENT AND TECHNOLOGY
Future Network Cloud and Software
Document Control Information
| Settings | Value |
|---|---|
| Document Title: | D1.3.4 Functional and Technical Architecture Specifications |
| Project Title: | SC1 |
| Document Author: | Sovereign-X |
| Doc. Version: | 0.1 |
| Sensitivity: | Limited |
| Date: | 24 Dec 2025 |
Document history:
The Document Author is authorised to make the following types of changes to the document without requiring that the document be re-approved:
To request a change to this document, contact the Document Author or Owner.
Changes to this document are summarised in the following table in reverse chronological order (latest version first).
| Revision | Date | Created by | Short Description of Changes |
|---|---|---|---|
| 0.1 | 31 Dec 2025 | Sovereign-X | SfR M24 first Iteration - Submitted for Review |
The purpose of this document is to describe the functional and technical architecture of Simpl-Open, following the approach further described in the Architecture Approach section. It includes the following content (non-exhaustive list):
A high-level overview of Simpl-Open architecture vision;
A description of Architecture Principles, Assumptions and Decisions that drive the Simpl-Open architecture;
A description of the Simpl-Open architecture from a business, application, data and technology perspective, each of them described using appropriate diagrams (BPMN, ArchiMate, UML);
A description of the Simpl-Open security architecture.
The intended audience of this document comprises people involved in the architecture, design, integration, testing and maintenance of Simpl-Open.
It mainly targets architects, but can also be helpful for developers, testers and other stakeholders involved in Simpl-Open, as well as stakeholders involved in Simpl-Live or other data spaces interested in integration of Simpl-Open.
Updated “ACV Static - Data Orchestration Service” to include the auth proxy and made the asset orchestrator part of release.
Updated “TCV Static - Data Orchestration Service” to include the auth proxy and made the asset orchestrator part of release.
Updated “LDM - Domain 1 - Access Control & Trust” to include the Role entity in Users & Roles component.
Updated “PDM - Domain 1 - Access Control & Trust” to include the Role table in Users & Roles component.
Updated “ACV Dynamic - BP 03B – Participant User and Roles Configuration” to include enable/disable functionality
Updated “ACV Dynamic - BP 03C - End User Role Request” to include Role Request review functionality in the frontend
Update “ACV Static - Tier 1 Authentication Service” to include the external identity provider
Update “TCV Static - Tier 1 Authentication Service” to include the external identity provider
Update “ACV Dynamic - BP 03B – Participant User and Roles Configuration” to include users and roles management
Update “TCV Dynamic - BP 03B - Participant User and Roles Configuration” to include users and roles management
Added “ACV Dynamic - BP 03C - End User Role Request” for end user role request
Added “TCV Dynamic - BP 03C - End User Role Request” for end user role request
Update “APIs” to include new Onboarding and Users&Roles API description
Update “CDM - Domain 1 - Onboarding & IAA” to include roles request in Users&Roles data model
Update “LDM - Domain 1 - Onboarding & IAA” to include roles request in Users&Roles data model
Update “PDM - Domain 1 - Onboarding & IAA” to include roles request in Users&Roles data model
Update “ACV Static - Catalogue Client Service” to remove EDC Connector adapter
Update “TCV Static - Catalogue Client Service” to remove EDC Connector adapter
Update “ACV Static - Resource Offering Service” to remove EDC Connector adapter
Update “TCV Static - Resource Offering Service” to remove EDC Connector adapter
Update “ACV Static - Connector Service” to add EDC Connector adapter
Update “TCV Static - Connector Service” to add EDC Connector adapter
Update “User Interfaces” to mark Identity Provider frontend as a released component
Update “ACV Static - Tier 1 Authentication Service” to include the authenticator plugin
Update “TCV Static - Tier 1 Authentication Service” to include the authenticator plugin
Update “APIs” to include Users And Roles v2 APIs
Fixed descriptions of entities and attributes “LDM - Domain 1 - Onboarding & IAA”
Updated “Simpl-Open Application Architecture” with Data Orchestration Service
Update “ACV - Domain 2 - Publish and consume resources” to include the Sync Schema Adapter, Schema Management Service and Data Orchestration Service
Update “ACV Static - Schema Management Service” to include more description and the schema synch adapter
Added the anonymisation services to “ACV Static - Data Orchestration Service”
Update “ACV Dynamic - BP 06 – Consumer searches resources in data space catalogues” to include the Sync Schema Adapter
Update “ACV Dynamic - BP 05B - Provider manages resource descriptions” to include the Sync Schema Adapter
Updated “User Interfaces” to add Schema Management UI and Data Orchestration UI
Updated “APIs” to add Data Orchestration Interfaces and update Schema Management and Synch Schema Service
Updated “Technology Deployment View” to include the Schema Management Service
Updated “Identification, Authentication & Authorisation” to reflect Roles & Identity Attributes for Schema Management and Data Orchestration
Updated “Simpl-Open Technology Choices” to reflect Apache Fuseki and Dagster
Updated “Custom Components Data Model” to reflect Conceptual, Logical and Physical Data Models for the Sync Schema Adapter
Updated “Simpl-Open Application Architecture” to explain the difference between mandatory and optional services
Updated “Simpl-Open Functional Architecture” to rename domain 1 into Access Control & Trust
Updated “ACV - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “CDM - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “LDM - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “PDM - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “TCV - Domain 1 - Access Control & Trust” to rename domain 1 into Access Control & Trust
Updated “LDM - Domain 2 - Publish and consume resources” to update the Infrastructure Provider Storage
Update “APIs” to include Infrastructure Provider API description
Updated “High-Level Architecture” to reflect the new capabilities map structure and removed “Annex 1 - Architecture Building Blocks” as this is now described in the high-level architecture itself
Updated “Simpl-Open Application Architecture” to refer to the website for NFRs instead of the legacy annex and removed “Annex 3 - Non-Functional Requirements”
Updated “Assumptions and Architecture Decisions” to include missing decisions
Updated “ACV - Domain 3 - Management/Operation of Data Space” with new overview diagram using application services and individual service static views
Updated “TCV - Domain 3 - Management/Operation of Data Space” with new structure and individual service static views
Added “TCV Dynamic - BP 02C - Manage resource description schemas”
Added “TCV Static - Schema Management Service”
Updated “ACV Static - Schema Management Service” with description
Added “ACV Dynamic - BP 02C - Manage resource description schemas”
Updated “CDM - Domain 2 - Publish and consume resources” to include the Schema Synch Service
Updated “LDM - Domain 2 - Publish and consume resources” to include the Schema Synch Service
Updated “PDM - Domain 2 - Publish and consume resources” to include the Schema Synch Service
Added “ACV Static - Schema Synch Service”
Added “TCV Static - Schema Synch Service”
Added Schema Management to “Detailed Technical Specifications”
Updated “User Interfaces” to add Infrastructure UI for deployment script VM templates
Update “ACV Dynamic - BP 12C – Credentials actions by the Governance Authority” to include the Identity Provider Frontend as officially available
Update “ACV Static - Identity Provider Service” to include Identity Provider Frontend as officially available
Update “CDM - Domain 1 - Onboarding & IAA” to update the data model of Identity Provider and Authentication Provider
Update “LDM - Domain 1 - Onboarding & IAA” to update the data model of Identity Provider and Authentication Provider
Update “PDM - Domain 1 - Onboarding & IAA” to update the data model of Identity Provider and Authentication Provider
Update “APIs” to include the new auto renewal APIs for the authentication provider component
Update section “Simpl-Open Application Architecture” to reflect the new structure of the section
Update section “ACV - Domain 2 - Publish and consume resources” with new overview diagram using application services
Add section “ACV - Domain 2 - Publish and consume resources - Static Views” containing individual service static views
Add section “ACV - Domain 2 - Publish and consume resources - Dynamic Views” to reorganise the already existing dynamic views
Update section “Simpl-Open Technology Architecture” to reflect the new structure of the section
Update section “TCV - Domain 1 - Onboarding & IAA” to reflect the new structure of the section
Add section “TCV - Domain 1 - Onboarding & IAA - Static Views” containing individual service static views
Add section “TCV - Domain 1 - Onboarding & IAA - Dynamic Views” to reorganise the already existing dynamic views
Update section “TCV - Domain 2 - Publish and consume resources” to reflect the new structure of the section
Add section “TCV - Domain 2 - Publish and consume resources - Static Views” containing individual service static views
Add section “TCV - Domain 2 - Publish and consume resources - Dynamic Views” to reorganise the already existing dynamic views
Removed hyperlinks from section “User Interfaces”
Update section “LDM - Domain 1 - Onboarding & IAA” to update the data model for the security attributes provider and authentication provider components
Update section “PDM - Domain 1 - Onboarding & IAA” to update the data model for the security attributes provider and authentication provider components
Update section “APIs” to update Security Attributes Provider Tier1 and Tier2 v2 APIs, Identity Provider Tier1 and Tier2 v2 APIs, Identity Provider Tier1 v2 APIs
Update section “ACV Static - Tier2 Authentication Service” to include the communication with Security Attributes Provider and Identity Provider
Update section “APIs” to include Security Attributes Provider Tier1 and Tier2 v2 APIs, Identity Provider Tier1 and Tier2 v2 APIs, Identity Provider Tier1 v2 APIs
Update section “User Interfaces” to include participant management functionalities in the Onboarding frontend
Update section “User Interfaces” to include credential renewal functionalities in the participant utility frontend
Update section “CDM - Domain 1 - Onboarding & IAA” to update the data model for identity provider and authentication provider components
Update section “LDM - Domain 1 - Onboarding & IAA” to update the data model for identity provider, security attributes provider and authentication provider components
Update section “PDM - Domain 1 - Onboarding & IAA” to update the data model for identity provider, security attributes provider and authentication provider components
Update section “ACV - Domain 1 - Onboarding & IAA” to include credential renewal flows
Create section “ACV Dynamic - BP 12C – Credentials actions by the Governance Authority”
Update section “TCV - Domain 1 - Onboarding & IAA” to include credential renewal flow
Create section “TCV Dynamic - BP 12C – Credentials actions by the Governance Authority”
Update section “ACV Dynamic - BP 07 - Consumer and Provider establish a usage contract for selected catalogue items” to include traceability to the business process on the diagram
Update section “ACV Dynamic - WF 12B - Local Node Logging and Monitoring” to include traceability to the business process on the diagram
Update section “ACV Dynamic - BP 09A - Consumer consumes a data resource from a Provider” to include traceability to the business process on the diagram
Update section “ACV Dynamic - BP 09B - Consumer receives a data processing service on a data resource via an application” to include traceability to the business process on the diagram
Update section “Application Components Views” to reflect the new structure of the section
Update section “ACV - Domain 1 - Onboarding & IAA” with new overview diagram using application services
Add section “ACV - Domain 1 - Onboarding & IAA - Static Views” containing individual service static views
Add section “ACV - Domain 1 - Onboarding & IAA - Dynamic Views” to reorganise the already existing dynamic views
Added first version of orchestration platform to “ACV - Domain 2 - Publish and consume resources”
Update section “ACV Dynamic - BP 08 - Consumer consumes an infrastructure resource from a Provider” to add traceability to BPs
Update section “PDM - Domain 1 - Onboarding & IAA”
Update section “LDM - Domain 1 - Onboarding & IAA”
Update section “CDM - Domain 1 - Onboarding & IAA”
Update section “ACV - Domain 1 - Onboarding & IAA” to include Document Validation and HashiCorp Vault technology
Update section “TCV - Domain 1 - Onboarding & IAA” to include Document Validation and HashiCorp Vault technology
Update section “TCV Dynamic - BP 03A - Onboarding of a participant - Tier II”
Update section “TCV Dynamic - BP 03B - Onboarding Tier 1 - Organisation Local IDP(Directory) Connection/Mapping” to include Document Validation and HashiCorp Vault technology
Update section “ACV Dynamic - BP 03A - Onboard a Participant” to include ArchiMate Refactoring, traceability with BP03A and Document Validation Service
Update section “ACV Dynamic - BP 03B - Connect/map Organisation Local IDP (Directory)” to include ArchiMate cleanup and traceability with BP03B
Update section “APIs” to include IAA OpenAPI definition, AsyncAPI definition, API descriptions
Update section “Data Space Concepts” to include the new “Anatomy of a Simpl-Open service” section
Update section: “ACV - Domain 2 - Publish and consume resources” with new components & APIs to include EDC Connector Adapter, Validation Service and Contract Consumption Adapter
Update section: “ACV Dynamic - BP 05 - Add or Update Resource (Publish) on Catalogue”
Update section: “ACV Dynamic - BP 09A - Consumer consumes a data resource from the provider”
Update section: “ACV Dynamic - BP 09B - Consumer receives data processing service over a dataset via an Application”
Update section: “TCV - Domain 2 - Publish and consume resources” with new components & APIs to include EDC Connector Adapter, Validation Service and Contract Consumption Adapter
Update section: “TCV Dynamic - BP 05 - Add or Update Resource (Publish) on Catalogue”
Update section: “TCV Dynamic - BP 09A - Consumer consumes a data resource from the provider”
Update section: “TCV Dynamic - BP 09B - Consumer receives data processing service over a dataset via an Application”
Update section: “APIs” to include EDC Connector Adapter, Validation Service and Contract Consumption Backend
Update section: “ACV Dynamic - BP 08 - Consumers select and use an Infrastructure Catalogue Resource from the Infrastructure Provider”
Update section: “TCV Dynamic - BP 08 - Consumers select and use an Infrastructure Catalogue Resource from the Infrastructure Provider”
Update section: “Technology Deployment View”
Add “Architecture Patterns” section into the “Architecture Framework” and remove “Annex 4 - Architecture Patterns”
Remove section “List of Business Processes” and refer to the website instead
Update section “Self-Description Tooling” to remove the Flow Diagram (duplication with ACV Dynamic)
Update section “Simpl-Open Application Architecture”
Update section: “APIs” to include post-configuration and decommissioning
Update section: “Simpl-Open Technology Choices” to include Terraform related technologies
Update section: “User Interfaces” with the Infrastructure Deployment Script Management UI
Update section: “Infrastructure Provisioning”
Updated section “Annex 2 - Mapping between functional requirements and components” with latest list of requirements
Update section: “Open-Source Components Data Model” to include OpenTofu
Add section “Digital Identities integration with EU Digital Identity Framework - eIDAS” in the “Data Spaces Concepts” section
Update section: “Simpl-Open Security Architecture” with updated diagrams
Update section: “LDM - Domain 2 - Publish and consume resources” with updated Infrastructure Provider fields
Update section: “PDM - Domain 2 - Publish and consume resources” with updated Infrastructure Provider fields
Update “Architecture Decisions Record” with latest decisions
Updated “ACV Dynamic - BP 05B - Publish and consume” according to BP and renamed to “ACV Dynamic - BP 05B - Manage resources”
Removed (and replaced by “ACV Dynamic - BP 05B - Manage resources”) sections:
“ACV Dynamic - BP 05B - Request sd”
“ACV Dynamic - BP 05B - Retrieve status”
“ACV Dynamic - BP 05B - Update SD status”
Add section “User Interfaces” into the “Simpl-Open Application Architecture”
Add section “Custom Components Data Model” into the “Simpl-Open Data Architecture”
“Simpl-Open Security Architecture” enhanced
“UI/UX Style Guide” referenced in the UIs section
Added Health checks and Tracing for Monitoring component
With the ongoing exponential growth of data, there is a pressing need within the European Union to provide access to resilient and competitive data storage and processing capacities for both the private and the public sector. In particular, the European Commission aims to address the need for more data sharing and decentralised data processing closer to the user (at the edge). It is also critical to deploy EU data services in the public and private sectors to grant Europe a leading status as a data-driven society and improve data usage within the European Union. The data services of various organisations within the same industry sector should be abstracted into sector-specific Data Spaces. This could bring several benefits, such as greater productivity, improvements in health and well-being, adaptation to environment and climate change, transparent governance and convenient public services.
To support the above-mentioned objectives, the European Commission is creating an open-source, multi-vendor, large-scale, modular and interoperable middleware called “Simpl-Open”. Simpl-Open will be the basis for a European Cloud Federation, enabling operation and interconnection within and between European data spaces and the safe migration of users to the cloud.
Simpl-Open will federate data, applications and infrastructure across the European Union with secure, resilient, energy-efficient and accessible cloud-to-edge capabilities. It will allow EU stakeholders to pool resources to create more business value, increase resource usage efficiency and reduce costs and duplication of effort. Simpl-Open considers both the public sector and EU businesses as core stakeholders. Using the features provided by Simpl-Open, an open marketplace for EU resources will be created that enables energy-efficient reuse of the efforts of other EU participants.
The following figure displays how the architecture vision of Simpl-Open maps the five actor groups (see definition in Actors section) and different Data Spaces:
At the core of Data Spaces lie the five types of actors that Simpl-Open considers. These actors are a symbolic representation of a distributed network of cooperating parties in an open ecosystem. Simpl-Open, represented by the Simpl-Open Agent, spans across these actors, enabling asset sharing between them. It provides common services on which Data Spaces can be built. Simpl-Open stays agnostic to the specifics of a particular Data Space, allowing additional Data Space specific services to be added on top of Simpl-Open. This added layer can, for example, contain standards on data representation, enforce common quality certifications, or define peer review rules to assess data quality. The Data Space specific services tailor the ecosystem beyond simple sharing of assets; they make sure those assets become valuable to participants.
Simpl-Open is intended not only as a basis for building Data Spaces but also as a means of interoperability between them. As multiple Data Spaces incorporate Simpl-Open, Data Spaces become more connected. This enables services to cross the boundaries of specific Data Spaces. Such services will initially be more limited, as Simpl-Open cannot capture the details of all different Data Spaces. It will be up to the user to deal with the specifics of each Data Space when interpreting the assets obtained from it.
To make this illustrative view more tangible, the following figure presents an example of how a set of distributed actors might interconnect to form a Data Space. It is important to note that this figure displays only one of many possible ways in which participants might interact. The number of participants in a Data Space, or the number of stakeholders behind a single actor, is limited only by technical feasibility. This implies that large numbers of participants and stakeholders can interact simultaneously. The Simpl-Open Agent in the figure serves as an abstract component that actors need to deploy to become part of the Data Space.
It is important to note that each of the displayed actors is an abstraction of the internal systems of one or more stakeholders. The deployment of Simpl-Open in a Data Space can have various degrees of granularity. The stakeholder behind an actor can be an individual user who has the capabilities to deploy a Simpl-Open Agent, or an entire data sharing initiative in its own right. It is up to the Data Space governance authority to decide how Simpl-Open best provides value and which level of deployment granularity fits best.
This section defines general concepts that are necessary for a good understanding of the Simpl-Open documentation and ecosystem in general.
As described above, a Data Space consists of actors (individuals or entities) who need to interact with each other. In Simpl-Open, it is assumed that individuals (called end-users) are always part of an entity (called participant). A participant can operate one or multiple nodes, each representing a distinct and/or isolated set of IT resources, and can participate in different Data Spaces with various participant roles (including Data Provider, Application Provider, Consumer, Infrastructure Provider and (exactly one) Governance Authority). Nodes can also be spread across physical locations.
Example: a university (= participant) which embodies students, researchers or accountants (= end-users). The university has a dedicated network (with connected IT resources) for its sciences department (= the sciences node) hosted in Paris (= physical location) and a dedicated network for its economy department (= the economy node) hosted in Rome (= physical location). The sciences department might want to offer the results of its research (= data provider role) to a science-related Data Space, while the economy department might want to consume data (= consumer role) from an economy-related Data Space.
The Simpl-Open Agent is a middleware, to be deployed on each node, acting as a local gateway for secure communication within a Data Space.
The following diagram illustrates this deployment view:
The following diagram translates the above deployment view into a data domain model to better understand the relationship and cardinality between the different entities.
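The entities described above can also be sketched as a small data structure. This is a minimal illustration, not the actual Simpl-Open data model: all class and field names are hypothetical, and attaching the participant roles to a (participant, node) pair per Data Space is an assumption taken from the university example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class ParticipantRole(Enum):
    DATA_PROVIDER = "Data Provider"
    APPLICATION_PROVIDER = "Application Provider"
    INFRASTRUCTURE_PROVIDER = "Infrastructure Provider"
    CONSUMER = "Consumer"
    GOVERNANCE_AUTHORITY = "Governance Authority"


@dataclass
class Node:
    name: str
    locations: List[str]  # a node may be spread across physical locations


@dataclass
class Participant:
    name: str
    end_users: List[str] = field(default_factory=list)  # end-users belong to a participant
    nodes: List[Node] = field(default_factory=list)     # one or multiple nodes


@dataclass
class Membership:
    # Assumption: roles are recorded per participant node and per Data Space.
    participant: Participant
    node: Node
    roles: Set[ParticipantRole]


@dataclass
class DataSpace:
    name: str
    memberships: List[Membership] = field(default_factory=list)

    def governance_authority(self) -> Optional[Participant]:
        for membership in self.memberships:
            if ParticipantRole.GOVERNANCE_AUTHORITY in membership.roles:
                return membership.participant
        return None

    def join(self, participant: Participant, node: Node, roles: Set[ParticipantRole]) -> None:
        if node not in participant.nodes:
            raise ValueError("node is not operated by this participant")
        # A Data Space has exactly one Governance Authority.
        if (ParticipantRole.GOVERNANCE_AUTHORITY in roles
                and self.governance_authority() is not None):
            raise ValueError("this Data Space already has a Governance Authority")
        self.memberships.append(Membership(participant, node, set(roles)))


# The university example from the text.
sciences = Node("sciences", ["Paris"])
economy = Node("economy", ["Rome"])
university = Participant("University", ["student", "researcher", "accountant"],
                         [sciences, economy])
science_space = DataSpace("science-related Data Space")
science_space.join(university, sciences, {ParticipantRole.DATA_PROVIDER})
```

The cardinalities enforced here (at least the single Governance Authority) follow the prose; the diagram remains the reference for the full set of relationships.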
Only the high-level concepts of Tier I and Tier II are presented here, as they are required for a good understanding of the next sections. More details can be found in the Simpl-Open Application and Technology Architecture, especially in the parts related to Domain 1.
Identification, authentication and authorisation are of paramount importance within a Data Space.
The identification must be supported by the governance authority. This authority is in charge of reviewing the identity details of organisations that want to participate in the Data Space. If the authority approves the participation of an organisation in the Data Space, it provides a proof of identification that the organisation installs in its Simpl-Open Agent. With this proof, the participant authenticates itself to other Data Space participants, and other participants define authorisation rules based on the verifiable identity.
The identification system plays an important role in two functionalities of Simpl-Open:
Establish secure communication channels;
Provide the information on which participants can base their access and usage policies.
To keep the identification system manageable in a large-scale environment, the identification is split into two tiers:
The first tier manages the identification, authentication and authorisation of the organisation’s members (humans or machines) to use the Simpl-Open Agent of their organisation;
The second tier identifies and authenticates the organisation as a whole in the Simpl network.
The figure below depicts this two-tier approach:
In the first tier, the Simpl-Open Agent connects to the preferred IAA system of the organisation: EU Login, eID, Microsoft AD, OpenID Connect, etc. This mechanism is already well established and not unique to Simpl.
The second tier involves the machine-to-machine authentication and identification of an organisation in the Simpl network. Each organisation holds an “Identity” file to support the identification, authentication and authorisation of the organisation in the Data Space. Recalling the two functionalities that the IAA supports, the necessary content of this Identity file becomes apparent:
For the establishment of a secure communication channel between participants (1), the Identity file should contain a proof of the organisation’s public key. Each Data Space participant will create a cryptographic public/private keypair that is used in the asymmetric authentication mechanisms needed to establish a communication channel. An example of how such a secure communication channel can be established is the well-known TLS/SSL protocol. The Identity file associates the public key of an organisation with its identity. Proving the identity of the organisation then becomes proving possession of the private key that belongs to the respective public key. This way, the organisation can be authenticated in the network and a secure communication channel can be established.
Access control and authorisation by providers (2) can be performed based on custom identity attributes of an organisation. Examples of such attributes are the organisation name, geographical location, whether it is a private or public institution, etc. Based on these attributes, providers can define access and usage policies for their resources. For example, a provider can open a resource to all public institutions, or to all participants from a specific Member State. Conversely, access control policies can be more stringent, allowing access only to one specific organisation. The Identity file proves the attributes of an organisation and, as such, establishes the trust on which a provider can rely to enforce its access control.
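The attribute-based policies mentioned above can be illustrated with a short sketch. It is not the Simpl-Open policy engine: the attribute keys (`institution_type`, `member_state`, `organisation_name`), the sample values and the helper functions are all hypothetical and only mirror the three examples given in the text.

```python
from typing import Callable, Dict

IdentityAttributes = Dict[str, str]
Policy = Callable[[IdentityAttributes], bool]


def attribute_equals(key: str, value: str) -> Policy:
    """Build a policy that grants access when one attribute has the given value."""
    return lambda attributes: attributes.get(key) == value


def any_of(*policies: Policy) -> Policy:
    """Grant access when at least one of the given policies grants it."""
    return lambda attributes: any(policy(attributes) for policy in policies)


# Hypothetical policies matching the examples in the text.
public_institutions_only = attribute_equals("institution_type", "public")
member_state_only = attribute_equals("member_state", "IT")
single_organisation_only = attribute_equals("organisation_name", "Example Institute")
public_or_member_state = any_of(public_institutions_only, member_state_only)

# Attributes as they might be proven by an organisation's Identity file (sample values).
requester = {
    "organisation_name": "Example University",
    "member_state": "FR",
    "institution_type": "public",
}
```

With these sample values, `public_institutions_only(requester)` grants access, while `single_organisation_only(requester)` denies it, which reflects the more stringent kind of policy described above.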
Simpl-Open is deployed within the participant organization’s premises (e.g., data center) and is intended to be connected to the internet via a firewall. Only the Tier 2 Gateway is designed to be exposed through the firewall, enabling agent-to-agent communication.
The Tier 1 Gateway is intended to remain privately accessible within the organization’s internal network.
Simpl-Open Agents consist of a tailored set of Simpl-Open services, depending on the participant’s role (e.g., Consumer, Data Provider, Infrastructure Provider, Application Provider).
Each Simpl-Open service includes the following components:
Tier 1 Frontend: Accessible by the organization’s end users, this interface provides access to the agent’s functionalities.
Tier 1 Gateway: Secures internal traffic and enforces Role-Based Access Control (RBAC) policies.
Local Tier 1 Backend: Located behind the Tier 1 Gateway, it delivers local services to the agent and may also interact with remote Tier 2 Backends.
Tier 2 Gateway: Secures inter-organizational communications and enforces Attribute-Based Access Control (ABAC) policies.
Remote Tier 2 Backend: Accessed through the Tier 2 Gateway, it offers services to external agents.
Local Resource (Data/Infrastructure/Application): Resources owned by the organization but external to the Simpl-Open Agent, accessible through the Local Tier 1 Backend.
Remote Resource (Data/Infrastructure/Application): Resources owned by another organization and external to its Simpl-Open Agent, accessible through the Remote Tier 2 Backend.
According to the Architecture Vision documents, services are categorized into two types: Built-in Services and Access-Through Services.
Built-in services are services that Simpl-Open offers to end users and are completely implemented by the middleware.
An example of a local Built-in Service is the Users & Roles component. This service enables agent administrators to manage users and roles locally within the scope of the agent.
The Catalog is an example of a Cross-Agent Built-in Service. It allows the local Tier 1 Catalog Backend of a consumer to interact with the Remote Tier 2 Catalog service provided by the Governance Authority.
Access-Through Services enable access to the Application/Data/Infrastructure resources that providers can offer through the Simpl-Open middleware.
A Local Access-Through Service enables an external application (or frontend) to access a participant’s internal resource via the local Tier 1 component. Currently, none of the services within Simpl-Open operate in this manner.
A Remote Access-Through Service enables access to a remote resource via the local Tier 1 component, which communicates with the remote Tier 2 component through Tier 2 communication.
The echo service is an example of a cross-agent built-in service that makes it possible to check whether the connection and the attribute exchange between participants are working. A boilerplate example of the Echo local and remote backend has been open sourced here.
How IAA works at a high level:
Roles are used to enforce RBAC (role-based access control) for end users who access Simpl-Open functionalities in tier 1;
Identity Attributes are used to enforce ABAC (attribute-based access control) in the agent-to-agent (node-to-node) communication in tier 2;
Assignable Identity Attributes are assigned to Roles, enabling end users who hold those roles to act on behalf of the Participant in a certain context.
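The relationship between roles, identity attributes and assignable identity attributes can be sketched as follows. This is an illustration under stated assumptions, not the Users & Roles implementation: the role names, permission strings, attribute names and both functions are hypothetical.

```python
from typing import Dict, Iterable, Set

# Tier 1: roles grant end users permissions on local agent functionality (RBAC).
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "agent-admin": {"users:manage", "catalogue:search"},
    "researcher": {"catalogue:search"},
}

# Assignable identity attributes: attributes of the participant that a role may use in tier 2.
ROLE_IDENTITY_ATTRIBUTES: Dict[str, Set[str]] = {
    "researcher": {"CONSUMER"},
}


def tier1_allows(user_roles: Iterable[str], permission: str) -> bool:
    """RBAC check: does any role of the end user grant the permission?"""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)


def attributes_for_tier2(user_roles: Iterable[str], participant_attributes: Set[str]) -> Set[str]:
    """Identity attributes the end user may act with on behalf of the participant.

    Only attributes that the participant actually holds can be used, so the
    result is the intersection of the role assignments and the participant's
    identity attributes.
    """
    assigned: Set[str] = set()
    for role in user_roles:
        assigned |= ROLE_IDENTITY_ATTRIBUTES.get(role, set())
    return assigned & participant_attributes


participant_attributes = {"CONSUMER", "DATA_PROVIDER"}
```

In this sketch a user holding only `agent-admin` can manage users locally but carries no identity attribute into tier 2, whereas a `researcher` can act as a consumer on behalf of the participant.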
Second Tier IAA - X.509 certificates with dynamic attribute provisioning
For clarification purposes, next an example is shown on how Tier II will work in practice:
1.1 - John Doe logs into the Consumer IAA Tier 1 System.
1.2 - IAA Tier 1 System retrieves user roles from Simpl-Open Agent User roles module and assign to John Doe the rights to access the Data Space functionality through Consumer’s Simpl-Open Agent, from now on all actions performed by John Doe are actually performed by the Simpl-Open Agent of Consumer which in turn interacts with the other Simpl-Open Agents (Provider and/or Data Space built-in capabilities).
2.1 - John Doe makes the infrastructure request to the Provider Simpl-Open Agent, which validates it against the Access Control and Trust capability.
2.2 - Provider and Consumer authenticate each other using mutual X.509 TLS authentication.
2.3 - Provider and Consumer verify the validity of the X.509 certificates through the Identity provider federation.
2.4 - Provider enforces access control policy based on embedded identity attributes and authorises the Consumer Simpl-Open Agent.
2.5 - Consumer requests an ephemeral proof of its own identity attributes from the Identity provider federation.
2.6 - Identity provider federation responds to the Consumer with an ephemeral proof containing the identity attributes.
2.7 - Consumer sends the ephemeral proof with identity attributes to the Provider.
2.8 - Provider checks and validates the ephemeral proof, then enforces access control policy based on embedded identity attributes and authorises the Consumer Simpl-Open Agent.
3.1 - Once verification against Access Control and Trust has passed, the Provider uses its own Infrastructure/User data services module to fulfil the received request:
3.2 - Provider checks the policies querying Contracts module.
3.3 - Provider enforces retrieved contract policies.
3.4 - Provider activates the Provisioning module to provision the requested resource.
4 - Provider returns an affirmative response to Consumer request.
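Steps 2.2 and 2.8 can be illustrated with a short sketch using Python's standard `ssl` module. It is an assumption-laden illustration only: the function names, the `EphemeralProof` structure and the file-path parameters are hypothetical, and the cryptographic verification of the proof's signature is deliberately left out.

```python
import ssl
from dataclasses import dataclass


def build_provider_tls_context(cert_file=None, key_file=None, ca_file=None) -> ssl.SSLContext:
    """Server-side TLS context that requires a client certificate (mutual TLS)."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # reject peers without a valid certificate
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file and key_file:
        # Hypothetical paths to the Provider's own X.509 certificate and key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # Hypothetical path to the Governance Authority CA certificate.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


@dataclass(frozen=True)
class EphemeralProof:
    subject: str           # agent the proof was issued to
    attributes: frozenset  # identity attributes asserted by the federation
    expires_at: float      # Unix timestamp after which the proof is invalid


def provider_authorises(proof: EphemeralProof, peer_subject: str,
                        required: set, now: float) -> bool:
    """Step 2.8 sketch: check binding and freshness, then apply the attribute policy.

    Signature verification of the proof is omitted in this sketch.
    """
    if proof.subject != peer_subject:
        return False  # proof was not issued to the authenticated peer
    if now >= proof.expires_at:
        return False  # proof has expired
    return required <= proof.attributes
```

Keeping the proof short-lived means that attribute changes made by the Governance Authority take effect without reissuing the participant's certificate.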
The process explained above is depicted below:
In the Simpl-Open project, Digital Identities are the basis on which the IAA core functionalities are built: the Governance Authority relies on an X.509 Certification Authority building block to issue and manage the digital identities of the Participants that onboard. These identities are used exclusively for Tier 2 authentication. Furthermore, a full integration with the EUDI Framework (planned in the Simpl-Open roadmap) will enable the middleware to be used in all possible scenarios, from the least to the most demanding in terms of trust and regulatory compliance. This page describes how these digital identities are used and how the middleware is designed to integrate with the EUDI Framework.
The framework consists of 2 main elements, which represent the main functionalities offered:
Create and validate electronic signatures, seals, time stamps, delivery services and certificates for website authentication.
Create and verify electronic signatures in line with European standards.
This is the European Commission’s digital building block that was created to enable applications to integrate with eIDAS Trust services.
Electronically identify users from all across Europe.
Offers digital services capable of electronically identifying users from all across Europe.
This is the European Commission’s digital building block that was created to enable applications to integrate with eIDAS Electronic Identification.
The 2 elements will be joined in the EUDI Wallet, which will be used for both Electronic Identification and Qualified Electronic Signatures.
Digital identities in Simpl-Open are split into two kinds:
Issued by the Governance Authority and exclusively dedicated to IAA operations like intra-agent secure communication (mTLS), ABAC policies enforcement, etc.
Used for both electronic identification and electronic signatures, and Simpl-Open is designed to permit the selection of the electronic signature level that best fits the scenario to cover (e.g. a qualified electronic signature for contracts, and advanced electronic signature for service offering self-descriptions)
Dataspace Governance Authority can decide to use the identification information provided by eID during the onboarding process to simplify and speed up the approval of the onboarding request.
Organisations, like for example universities, can decide to rely on the identification information provided by eID to identify and give roles/permissions to their end users.
In contexts where the Dataspace Governance Authority requires that a certain participant end user (e.g. the legal representative of a Participant) signs Contracts, SLAs, Terms and Conditions, Agreements, etc., using Qualified Electronic Signatures.
In contexts where a Data Provider requires that, for a certain Service Offering, additional Contracts, SLAs, Terms and Conditions, Agreements, etc. be signed by the Consumer using Qualified Electronic Signatures.
eIDAS - EUDI Framework - https://eidas.ec.europa.eu/efda/home#/screen/home
eSignature - https://ec.europa.eu/digital-building-blocks/sites/display/DIGITAL/eSignature
eID - https://ec.europa.eu/digital-building-blocks/sites/display/DIGITAL/eID
The IDSA Reference Architecture Model defines the connector as being the technical core component required for a participant to join a Data Space.
DSSC defines the connector as a technical software component that is run by (or on behalf of) a participant and that provides connectivity with similar components run by (or on behalf of) other participants, to enable the secure and trusted sharing of data.
A connector can provide more functionality than what is strictly related to connectivity. The connector can offer technical modules that implement data interoperability functions, authentication interfacing with trust services and authorisation, resource description, contract negotiation, etc.
DSSC uses “participant agent services” as the broader term to define these services.
DSSC also distinguishes the 2 major components that make up a connector:
The control plane is responsible for deciding how data is managed, routed and processed. For example: the control plane handles the identification of users and the handling of access and usage policies.
The data plane handles the actual exchange of data.
This implies that the control plane by its nature can be standardised to a high degree, while the data plane is likely to be different for each Data Space (as different types and sorts of data exchange take place in each Data Space).
The data plane needs to be integrated with the control plane to ensure that it can work with the necessary control mechanisms.
DSSC identifies the different categories of components within a Data Space, making a distinction between the (1) participant agent (= connector in DSSC vocabulary) and (2) shared services:
Within the control plane, several components can be identified:
A Participant Wallet: providing participants with the ability to store and exchange identities and other attestations. For instance, in the form of Verifiable Credentials.
A Data, Services & Offerings Catalogue: providing participants with the ability to share (on a technical level) the data, services and offerings which are provided through the data plane.
Components for Contract Negotiation: providing participants with the ability to share data access and usage policies with others in the Data Space and to enforce these on the data plane. For instance: to create an authorisation registry, which - based on policies - can determine who gets access to a certain data set or service.
On the data plane there is the actual transfer process. As indicated before, the data plane is likely to be highly application specific. It should however work in conjunction with the control plane, e.g. to ensure that no data sharing can start before certain conditions are met (identification, contract negotiation, etc.).
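The dependency of the data plane on the control plane described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class and method names (`ControlPlane`, `DataPlane`, `may_transfer`) are hypothetical and do not correspond to any specific connector implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferRequest:
    consumer_id: str
    asset_id: str


class ControlPlane:
    """Tracks which participants are identified and which agreements exist."""

    def __init__(self):
        self._identified = set()
        self._agreements = set()

    def identify(self, participant_id: str) -> None:
        self._identified.add(participant_id)

    def record_agreement(self, consumer_id: str, asset_id: str) -> None:
        self._agreements.add((consumer_id, asset_id))

    def may_transfer(self, request: TransferRequest) -> bool:
        # Both conditions must hold before any data is shared.
        return (request.consumer_id in self._identified
                and (request.consumer_id, request.asset_id) in self._agreements)


class DataPlane:
    """Performs the actual transfer, but only after consulting the control plane."""

    def __init__(self, control_plane: ControlPlane):
        self._control_plane = control_plane

    def transfer(self, request: TransferRequest, payload: bytes) -> bytes:
        if not self._control_plane.may_transfer(request):
            raise PermissionError("identification or contract negotiation not completed")
        return payload  # stand-in for the application-specific transfer
```

The data plane stays application specific (here it simply returns the payload), while the gating logic lives entirely in the control plane.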
Note that components of the connector can have different granularities. They can be conceived as an integrated component, but they can also consist of multiple (packaged) components (e.g. with a separated, but linked, component for Participant Wallet).
Concretely for Simpl-Open, a connector is used to implement the 3 parts of the IDSA Data Space Protocol:
Publication and request of catalogue items - mapping to Data, Services & Offerings Catalogue component of the control plane;
Contract negotiation - mapping to Contract Negotiation component of the control plane;
Data transfer process - mapping to the Data Plane.
The control plane of the connector is also used as orchestrator between the 3 parts.
The current implementations of connectors do not cover all the needs envisioned in Simpl-Open and therefore extension points are planned, for instance, to cover the infrastructure provisioning.
This section elaborates on the High-Level Architecture of Simpl-Open. It presents the capabilities of Simpl-Open and the building blocks that support these capabilities. Note that the high-level architecture lays out the capabilities of Simpl-Open as a whole. How these capabilities are realised is then described in the following sections of the document.
The concepts described in this section have been, for a large part, already developed in the Architecture Vision Document of the Simpl Preparatory Study. They are taken over in this document and updated/complemented where needed to stay up-to-date with the current developments of Simpl-Open.
Six architectural dimensions describe Simpl-Open: the integration dimension, the data dimension, the infrastructure dimension, the administration dimension, the governance dimension and the security dimension.
The integration dimension contains the capabilities that enable participants to integrate with each other in a secured and trusted manner. This is required for the well-functioning of a Data Space integrating Simpl-Open. These capabilities regard security, access control and trust and federation management.
The data dimension focuses on semantic interoperability, data models, data quality and governance of data. It ensures that data can be understood, processed, and exchanged consistently across participants through standardized vocabularies and quality management.
The infrastructure dimension allows end users to utilise and manage infrastructure resources offered by infrastructure providers. Simpl-Open can connect to third-party infrastructure resources, enabling end users to execute applications and manage workloads.
The administration dimension provides supporting capabilities for the well-functioning of the other dimensions as well as administration of Simpl-Open. It allows actors to operate their components in the Data Space.
The governance dimension establishes and enforces policies, manages risks, oversees compliance and provides audit and assurance for the entire ecosystem. It supports the implementation of legal and organisational interoperability through policy management, contract management, participant lifecycle, and audit capabilities.
The security dimension ensures that all interactions and data exchanges across the Simpl-Open ecosystem are confidential, authentic, and tamper-resistant. It focuses on safeguarding communications, assets, and operations through technical and procedural resilience measures.
Each of these six dimensions is further detailed in the following sub-sections.
The following figure presents the Level 1 capability map of Simpl-Open, where capabilities are applied to the dimensions:
In the Administration dimension, the Observability capability monitors system health, usage, and performance across the data space, providing insights and dashboards for operational oversight. The Support capability provides operational assistance to participants and end-users through service desk services, ticketing systems, and status pages. It enables troubleshooting, issue tracking, and knowledge sharing to ensure smooth installation, configuration, and ongoing use of Simpl-Open components. The Notification and messaging capability provides asynchronous, event-driven notifications to users and admins for key workflows like onboarding requests and governance actions.
In the Data dimension, the Data governance capability ensures that data sharing adheres to defined quality, metadata, and governance standards. It provides services like data lineage, data profiling and data quality rules. The Data processing capability provides the means to transform, aggregate and visualise datasets across multiple sources. The Supporting data services capability provides the foundational data services that enable efficient, scalable, and reliable management of data operations across the ecosystem, including orchestration and distributed execution. The Semantics & Vocabulary capability ensures semantic interoperability across the Data Space by providing standardized vocabularies, ontologies, and schema management. It enables participants to understand and interpret shared data consistently through formal knowledge representation and mapping services.
In the Integration dimension, the Data sharing capability allows participants to exchange data with others through interoperable interfaces, while the Application sharing capability allows participants to make applications and services available to others through interoperable interfaces, as well as to provide algorithms and models for AI-based processing. The Federated Management capability manages identity federation, catalogue federation, trust anchoring, and cross-domain access across multiple data spaces. The Resource discovery capability supports consumers in finding available resources securely and efficiently through catalogues. The Policy enforcement capability enforces access and usage policies at runtime integration points where policy decisions are applied. The Contract enforcement capability validates and enforces contractual terms at integration points, connecting to policy management and billing. The Supporting Integration Services capability maintains persistent resource addresses across federated environments and integration endpoints. The Resource sharing capability embeds all services related to generic resource sharing, specifically focused on the implementation of the connector protocol.
In the Infrastructure dimension, the Provisioning capability handles allocation, lifecycle, and orchestration of infrastructure resources required by participants and data services. The Supporting infrastructure services capability provides underlying infrastructure-level services such as distribution and the management of distributed resources. The HPC capability enables the execution of high-performance computing workloads where demanding analytical or AI-driven computations are needed, leveraging shared or external infrastructure resources.
In the Governance dimension, the Consent management capability ensures that data subjects’ consent preferences are properly captured, managed, and respected throughout data processing activities. The Contract management capability governs the lifecycle of contractual agreements between participants, ensuring that terms and obligations are traceable and enforceable. The Policy management capability enables the definition, lifecycle management and distribution of access and usage policies across the Data Space. The Audit capability provides transparency and verifiable evidence of compliance, supporting accountability and continuous assurance. The Participant management capability handles onboarding, identity validation, and lifecycle management including offboarding of all participants in the ecosystem.
In the Security dimension, the Credential management capability covers the implementation of digital signatures to guarantee data confidentiality, integrity, and authenticity, along with the storage of these credentials and signatures in the digital wallet. The CSIRT capability provides coordinated incident detection, response, and resolution services. It ensures operational readiness against threats, manages vulnerability disclosures, and leads recovery activities in case of security incidents. The Access control and trust capability enables secure and trusted collaboration between participants within the Data Space. It ensures that only authenticated and authorised entities can access shared data, services, and applications, while maintaining interoperability across different trust domains.
The following figure presents the Level 2 capability map of Simpl-Open, where business services are applied to capabilities:
The Observability capability has the following services: Resource usage, QoS metrics and alerts, Exporting, Dashboarding, Logging, Performance monitoring, Energy metrics and alerts, and Reporting:
The Resource usage service: Provides visibility into consumption of compute, storage, and network resources to support capacity planning and chargeback.
The QoS metrics and alerts service: Tracks SLOs and emits alerts on threshold breaches to enable timely operational responses.
The Exporting service: Enables scheduled or ad‑hoc export of metrics and logs to external observability or compliance systems.
The Dashboarding service: Offers configurable dashboards for real‑time and historical operational insights across tenants and domains.
The Logging service: Centralizes, indexes, and retains logs with correlation to traces and metrics for efficient troubleshooting.
The Performance monitoring service: Measures latency, throughput, and error rates to detect regressions and bottlenecks early.
The Energy metrics and alerts service: Captures energy usage KPIs and triggers notifications to optimize sustainability targets.
The Reporting service: Generates scheduled and on-demand reports aggregating operational data, compliance evidence, and business metrics. Supports customizable report templates, multi-format exports (PDF, CSV, JSON), and role-based access to reporting views. Enables stakeholders to track resource consumption, policy adherence, SLA compliance, and data space activity over time.
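As an illustration of the threshold-based alerting mentioned for the QoS metrics and alerts service, the following sketch evaluates metric samples against a set of objectives. The `Slo` structure, the metric names and the alert message format are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    metric: str
    threshold: float
    higher_is_worse: bool = True  # e.g. latency; set False for availability


def evaluate_slos(samples: dict, slos: list) -> list:
    """Return one alert message per objective whose threshold is breached."""
    alerts = []
    for slo in slos:
        value = samples.get(slo.metric)
        if value is None:
            continue  # no sample for this metric in the current window
        if slo.higher_is_worse:
            breached = value > slo.threshold
        else:
            breached = value < slo.threshold
        if breached:
            alerts.append(f"{slo.metric}={value} breaches threshold {slo.threshold}")
    return alerts
```

For example, with a latency objective of 200 ms and a sample of 250 ms, the function returns a single alert for `latency_ms`.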
The Support capability has the following services: Service desk, Support page, Ticketing system:
The Service desk service: Provides first‑line assistance, triage, and knowledge‑base guidance for participants and operators.
The Support page service: Publishes status, FAQs, runbooks, and contact channels to streamline self‑service support.
The Ticketing system service: Orchestrates issue lifecycle with SLAs, prioritization, and handoffs across resolver groups.
The Notification and messaging capability has the following services:
The Data governance capability has the following services: Data lineage, Data profiling, Data quality rules.
The Data lineage service: Records end‑to‑end provenance to enable impact analysis, compliance evidence, and reproducibility.
The Data profiling service: Analyzes datasets for structure, distributions, and anomalies to inform governance decisions.
The Data quality rules service: Defines and evaluates quality checks with reporting and remediation workflows.
The Data processing capability has the following services: Data analytics, Data visualisation, Anonymisation.
The Data analytics service: Provides batch and interactive analytics for descriptive, diagnostic, and predictive insights.
The Data visualisation service: Delivers charts and exploratory views to communicate insights and monitor KPIs.
The Anonymisation and pseudonymisation service: Applies masking, pseudonymisation, and differential privacy patterns to protect personal data.
The Supporting data services capability has the following services: Data orchestration, Distributed execution, Semantic mapping.
The Data orchestration service: Coordinates multi‑step pipelines with dependencies, retries, and policy‑aware scheduling.
The Distributed execution service: Runs data jobs elastically across clusters with placement, scaling, and fault tolerance.
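The dependency handling and retries mentioned for the Data orchestration service can be sketched with the standard-library `graphlib` module. The function name `run_pipeline` and the step names are hypothetical; policy-aware scheduling and distributed placement are out of scope of this sketch.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps: dict, dependencies: dict, max_retries: int = 2) -> list:
    """Run steps in dependency order, retrying each failed step up to max_retries times.

    steps maps a step name to a zero-argument callable.
    dependencies maps a step name to the set of steps it depends on.
    Returns the names of the steps in the order they completed.
    """
    completed = []
    for name in TopologicalSorter(dependencies).static_order():
        for attempt in range(max_retries + 1):
            try:
                steps[name]()
                completed.append(name)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # retries exhausted: abort the pipeline
    return completed
```

A step that fails once and then succeeds is retried transparently, while a step that keeps failing stops the pipeline before any dependent step runs.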
The Semantics & Vocabulary capability has the following services: Semantic mapping, Vocabulary hub and ontology management, Schema management.
The Semantic mapping service: Discovers and documents schema/ontology mappings across domains. Supports semantic interoperability and cross-domain discovery.
The Vocabulary hub service: Manages, versions, and publishes controlled vocabularies (SKOS, DCAT, schema.org, domain-specific ontologies). Enables cross-domain data understanding. Authors, aligns, and publishes ontologies (OWL, RDF) for domain modelling (e.g., manufacturing, healthcare). Enables semantic queries and reasoning.
The Schema management service: Stores, versions, and governs data schemas (JSON Schema, Avro, Parquet metadata) linked to vocabularies. Supports the Metadata description service (Governance) in enforcing DCAT-AP compliance.
The Data sharing capability has the following services: Bulk data transfer, Data streaming, Simple data transfer.
The Bulk data transfer service: Moves large datasets reliably with checkpointing, integrity checks, and resume support.
The Data streaming service: Publishes and subscribes to real‑time event flows with ordering, retention, and replay.
The Simple data transfer service: Provides lightweight pull or push exchanges for small files and APIs.
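The checkpointing, resume and integrity-check behaviour described for the Bulk data transfer service can be illustrated with an in-memory sketch. The function names, the `fail_at` parameter (used only to simulate an interruption) and the use of SHA-256 are assumptions for this example, not a description of the actual transfer protocol.

```python
import hashlib


def transfer_chunks(source: bytes, sink: bytearray, chunk_size: int = 4, fail_at=None) -> int:
    """Copy source into sink in chunks, resuming from the bytes already received.

    The checkpoint is simply len(sink). Returns the final offset.
    """
    offset = len(sink)  # resume from the last checkpoint
    while offset < len(source):
        if fail_at is not None and offset >= fail_at:
            raise ConnectionError("simulated interruption")
        chunk = source[offset:offset + chunk_size]
        sink.extend(chunk)
        offset += len(chunk)
    return offset


def verify_integrity(expected_sha256: str, sink: bytearray) -> bool:
    """Compare the digest announced by the sender with the digest of the received data."""
    return hashlib.sha256(bytes(sink)).hexdigest() == expected_sha256
```

Calling `transfer_chunks` again with the same `sink` after an interruption continues from where the previous attempt stopped instead of restarting the transfer.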
The Application sharing capability has the following services: Calculation algorithm, Machine Learning Model, Software apps (Rendering engine).
The Calculation algorithm service: Exposes deterministic computational functions for remote execution.
The Machine Learning Model service: Serves trained models with versioning, inference endpoints, and monitoring.
The Software apps (Rendering engine) service: Hosts interactive applications and engines for domain‑specific processing and visualization.
The Supporting Integration Services capability has the following service: Resource address management.
The Federated Management capability has the following services: Federation orchestration.
The Resource discovery capability has the following services: Resource catalogue, Search engine.
The Resource catalogue service: Publishes registries of datasets, services, and apps with federation support.
The Search engine service: Indexes and queries resources with fine‑grained policy‑aware filtering.
The Policy Enforcement capability has the following services: Policy Enforcement Point service.
The Contract enforcement capability has the following services: Contract Enforcement service.
The Resource sharing capability has the following services: Resource sharing runtime.
The Provisioning capability has the following services: Infrastructure provisioning.
The Supporting infrastructure services capability has the following services: Infrastructure orchestration, Distributed management.
The Infrastructure orchestration service: Automates deployment and day‑2 operations via declarative control and runbooks.
The Distributed management service: Manages multi‑site topologies, synchronization, and drift remediation.
The HPC capability has the following services: HPC.
The Consent management capability has the following services: Consent management service.
The Contract management capability has the following services: Billing, SLA Management, License asset, Contract establishment.
The Billing service: Calculates and issues invoices based on usage, entitlements, or fixed agreements.
The SLA Management service: Tracks service commitments and penalties with evidence and notifications.
The License asset service: Manages software and content licenses, entitlements, and renewals.
The Contract establishment service: Establishes (and invalidates) contract agreements.
The Policy management capability has the following services: Policy Decision Point, Policy Administration Point, Policy Information Point.
The Policy Decision Point service: Evaluates policies against attributes, contracts, and consent to render decisions (grant/deny/obligation). Takes attributes from PIP adapters and usage context from PEP.
The Policy Administration Point service: Authors, approves, versions, and distributes policies and contract-linked obligations. Manages policy lifecycle (draft → approved → active → deprecated).
The Policy Information Point service: Adapts attributes from external sources (participant registry, consent store, contract service, catalogue) to feed PDP decisions. Bridges organizational context to policy evaluation.
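The division of responsibilities between the three services above can be sketched as follows. This is a simplified, hypothetical model: the `Policy` structure, the `decide` function and the attribute names are assumptions, and real policy languages (e.g. ODRL or XACML) are considerably richer.

```python
from dataclasses import dataclass
from enum import Enum


class PolicyState(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    ACTIVE = "active"
    DEPRECATED = "deprecated"


# Lifecycle managed by the Policy Administration Point.
_NEXT_STATE = {
    PolicyState.DRAFT: PolicyState.APPROVED,
    PolicyState.APPROVED: PolicyState.ACTIVE,
    PolicyState.ACTIVE: PolicyState.DEPRECATED,
}


@dataclass
class Policy:
    resource: str
    required_attributes: frozenset
    obligations: tuple = ()
    state: PolicyState = PolicyState.DRAFT

    def advance(self) -> None:
        """Move the policy to the next lifecycle state."""
        if self.state not in _NEXT_STATE:
            raise ValueError("a deprecated policy cannot advance further")
        self.state = _NEXT_STATE[self.state]


def decide(policies: list, pip_attributes: set, pep_context: dict) -> dict:
    """Policy Decision Point sketch: deny by default, grant if an active policy matches.

    pip_attributes stands for attributes supplied by PIP adapters;
    pep_context stands for the usage context sent by the PEP.
    """
    for policy in policies:
        if policy.state is not PolicyState.ACTIVE:
            continue  # only active policies take part in decisions
        if policy.resource != pep_context["resource"]:
            continue
        if policy.required_attributes <= pip_attributes:
            return {"decision": "grant", "obligations": list(policy.obligations)}
    return {"decision": "deny", "obligations": []}
```

In this sketch a policy only influences decisions while it is in the `ACTIVE` state, so a draft or deprecated policy always results in a deny.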
The Audit capability has the following services: Audit.
The Resource management capability has the following services: Metadata description.
The Participant management capability has the following services: Onboarding, User roles, Offboarding.
The Onboarding service: Validates identities, performs due diligence, and provisions initial access.
The User roles service: Defines and assigns roles and responsibilities with least‑privilege defaults.
The Offboarding service: Revokes access, archives evidence, and ensures controlled exit procedures.
The CSIRT capability has the following services: Incident response, Threat monitoring.
The Incident response service: Coordinates detection, containment, eradication, and recovery with post‑incident review.
The Threat monitoring service: Continuously monitors for indicators of compromise and emerging vulnerabilities.
The Access control and trust capability has the following services: Identity provider, Authentication provider federation, Authorisation, Security attribute provider federation, Encryption, Guaranteed Authenticity / Integrity.
The Identity provider service: Issues and manages identities with lifecycle hooks for onboarding and offboarding.
The Authentication provider federation service: Federates external IdPs to enable single sign‑on across domains.
The Authorisation service: Enforces fine‑grained, policy‑based access decisions for data, services, and apps.
The Security attribute provider federation service: Aggregates and validates assurance attributes to support trust decisions.
The Encryption service: Protects data in transit and at rest using modern, configurable cryptography.
The Guaranteed Authenticity / Integrity service: Uses signatures and checksums to ensure tamper detection and provenance.
The Credential Management capability has the following services: Wallet service, VC issuance/verification service, Signing service.
The Wallet service: Manages the storage, signing, and lifecycle of resource descriptions, verifiable credentials, and usage contracts for dataspace participants.
The VC issuance/verification service: Issues, verifies, and manages W3C-compliant verifiable credentials for organization identity, user attributes, and data provenance. Includes VC issuance, verification, and trust anchor registry.
The Signing service: Generates digital signatures using cryptographic keys and verifies signatures to confirm authenticity and integrity. Ensures non-repudiation (the signer cannot later deny having signed something) and provides tamper detection (any change to signed data invalidates the signature).
Simpl-Open follows a loosely coupled, self-contained architecture which groups components into building blocks, capability by capability. This approach permits deployment of the Simpl-Open agent in different flavours depending on the type of participant, e.g. an Infrastructure Provider requires a different subset of the full Simpl-Open stack than a Data Provider. This modular architecture within a Data Space is presented in the following figure:
The figure below depicts the capabilities that will be (partially) implemented as part of Release 3.0 (December 2025):
The current version of this document covers the architecture of Release 3.0 only; as such, the following sections focus only on components implementing the capabilities mentioned above as being in scope of Release 3.0.
Placeholders have also been added for content that will be made available after Release 3.0, with clear disclaimers at the beginning of the respective sections.
The architecture of Simpl-Open is created using a layered approach, inspired by the TOGAF methodology, which is reflected in the structure of this document:
Business Architecture - describes how Simpl-Open should achieve its business goals and respond to the strategic drivers set out in the Architecture Vision. This layer was already defined in the preparatory study and this document only provides an update on the functional capabilities (which have evolved since then) and revisited concepts of business processes.
Application Architecture - develops the target application architecture of Simpl-Open that enables the business architecture and the architecture vision, in a way that addresses the requirements. It identifies architecture components through Solution Views (business process-based approach, both static and dynamic) and Deployment Views (agent type-based approach, static only).
Data Architecture - presents data entities and/or collections and how they are structured within the system.
Technology Architecture - develops the target technology architecture that enables the application architecture to be delivered through technology components and technology services. Each application building block is mapped to a technology implementing the capabilities. Just like for the application architecture, both Solution and Deployment Views are defined.
Security Architecture - covers the security aspects of the architecture.
The list of architecture principles and patterns to which Simpl-Open adheres is presented in the next section.
The source of the ArchiMate diagrams presented in this document is available as an Archi model, versioned in the Simpl-Open repository.
More information on how the model can be accessed is available here.
Simpl-Open is designed upon ten architectural principles. Each of these principles is applied throughout Simpl-Open’s design. They are all equally important to the design. The following figure provides an overview of these principles:
Federation: Federated systems describe autonomous entities, tied together by a specified set of standards, frameworks, and legal rules. Simpl-Open should federate data, infrastructure and applications. This principle is key to enable interoperability and information sharing among the different entities that will be part of Simpl, while giving maximum autonomy to service owners.
Modularity: The architecture of Simpl-Open needs to be defined in a modular way which allows the replacement or addition of components without affecting the rest of the system. This also provides the possibility to implement every component with a different open-source technology. Through modularity, Simpl-Open users are able to deploy a specific subset of components that are tailored for their purposes.
Loose coupling: Components and services should have minimal dependencies on each other. Standardised, business-oriented APIs make sure consumers are not impacted by changes to services. This allows service owners to change implementation, switch out components, or modify data records behind the APIs without downstream impact to end users. This principle ties in with the modularity and resilience principles.
Resilience: Components of the architecture must be fault tolerant, such that failures in one of them will have minimal impact on other components. Single points of failure need to be avoided to the maximum extent possible as the main objective is achieving a distributed architecture.
Openness & agnosticism: The open specification allows insights into all parts of the architecture without any proprietary claims. It makes adding, updating or changing components easy for all users. Services should be provided irrespective of specific technologies and should be executable in all environments.
Composability & extensibility: Simpl-Open’s architecture should allow services to deliver value to the business in different contexts, providing the necessary tools to facilitate their composition together with other services to form new aggregated services. Simpl-Open remains open to iterative growth allowing the addition of new services and capabilities that fit future use cases to the platform. An open development community should be promoted in order to enable the contribution of new features that extend Simpl-Open’s functionalities by its members.
Interoperability: Simpl-Open enables interoperability between its participants to share resources in a well-specified manner. The architecture should describe the technical means to achieve this and be agnostic to the specific implementation details of each participant.
Scalability & elasticity: Simpl-Open provides the means to accommodate larger workloads and allow new entities and users on the platform without affecting the performance. Both vertical scaling – i.e. the practice of adding more resources to a single node – and horizontal scaling – i.e. the process of duplicating nodes – should be possible. Simpl-Open’s performance should be able to follow user demand without deteriorating.
Security, privacy & trust: Users of Simpl-Open must be confident that when they interact with other entities they are doing so in a secure and trustworthy environment and in full compliance with relevant regulations. Data confidentiality, availability and integrity must be guaranteed. Privacy of data subjects, Simpl-Open users, or individuals must be assured.
Discoverability: All services that are deployed in Simpl-Open will be ‘publicly’ exposed and discoverable in a service registry or catalogue. In this context, ‘public’ is seen as visible by all approved participants of a Data Space, not the public internet. Services will adhere to a service description, providing interested parties with a clear understanding of their business purpose and technical interface.
These architecture principles are complemented by coding principles, which can be found in the Development Handbook.
Simpl-Open is designed using a combination of the following architecture patterns:
| Pattern | Short Description |
|---|---|
| Microservices architecture | Breaks down applications into small, independently deployable services. Each service focuses on a specific business capability and communicates via APIs, enhancing flexibility and scalability. |
| Event-driven architecture | Components communicate by producing and reacting to events, promoting loose coupling and scalability. This pattern enhances responsiveness and supports real-time processing. |
| Asynchronous communication | Allows components to interact without waiting for immediate responses. Improves system performance, decoupling, and scalability by enabling non-blocking interactions, typically using queues or background workers. |
| Stateless design | Ensures each request is independent and self-contained, avoiding reliance on stored session data. Improves scalability, simplifies fault tolerance, and supports load balancing. |
| Least privilege | Grants each user or service only the minimum level of access required to perform its tasks. This limits potential damage in the event of a breach and reduces the attack surface. |
| Defence in depth | Applies multiple layers of security controls throughout the system. If one layer is compromised, others still provide protection, improving the overall resilience of the system. |
| Zero trust | Assumes no implicit trust in users or systems, whether inside or outside the network. Continuously verifies identities and enforces strict access controls before granting access to resources. |
| Retries | Automatically re-attempts failed operations after a delay, particularly useful in cases of transient failures. Helps improve reliability without requiring manual intervention. |
| Circuit breakers | Monitors service calls and halts repeated failures by stopping requests to underperforming services. This prevents cascading failures and gives systems time to recover. |
| Graceful degradation | Ensures that the system continues to provide limited or reduced functionality when some components fail. Improves user experience and system robustness under failure conditions. |
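The retry and circuit breaker patterns from the table above can be sketched as follows. This is a minimal illustration, not the Simpl-Open implementation: the names `CircuitBreaker`, `CircuitOpenError` and `retry`, as well as the thresholds and delays, are assumptions chosen for the example.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open, call rejected")
            # Timeout elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func, retrying failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # do not hammer a service whose circuit is open
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

A caller would typically combine the two, for example `retry(lambda: breaker.call(fetch))` where `fetch` is a hypothetical remote call: transient failures are retried, while an open circuit fails fast and gives the remote service time to recover.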
The following assumptions are based on currently available information and apply to Release 3.0 (December 2025 release) only.
| ID | Topic | Assumptions |
|---|---|---|
| ASM-01 | Data Space data management (downloading data vs using it) |
|
| ASM-02 | Possible data sharing scenarios |
The following scenarios to share data exist:
|
| ASM-03 | Actors with multiple participant roles |
|
| ASM-04 | Distinction between Certificate/Credentials | There is a clear distinction between credentials for securing the Data Space (Tier 1 and 2 IAA) and the credentials for signing SDs and contracts (legally binding signature). |
| ASM-05 | Data sharing connector |
|
| ASM-06 | Contract signature | Currently, only a simple signature is used (not a legally valid one). |
| ASM-07 | Usage of a Data Space connector |
Any communication/transfer between agents will be done via Data Space connectors. They are responsible for implementing the three aspects of the Data Space Protocol (DSP):
|
| ASM-08 | Storage attached to VMs and containers | It is assumed that VMs and containers always have an attached storage. |
| ASM-09 | Type of storage supported | It is assumed that Simpl-Open only supports natively S3-compliant storage but is extensible to support other storages (offering an API). |
| ASM-10 | Deployment and termination of built-in applications | It is assumed that the application is always deployed and terminated together with the infrastructure resource as part of deployment script. |
| ASM-11 | Type of built-in application deployment supported | It is assumed that Simpl-Open only supports natively applications deployed on Kubernetes but is extensible to support other platforms (offering an API). |
| ASM-12 | Supported infrastructure resources |
but is extensible to support other platforms (offering an API). |
| ID | Title | Context | Decision | Consequence | Date | Decision Maker |
|---|---|---|---|---|---|---|
| ADR-01 | API Guidelines | One of the base principles of Simpl is interoperability, and in this respect, REST API guidelines should be established. |
The decision is to use :
|
The guidelines should be implemented for each custom-built component in Simpl-Open. | 29 Nov 2024 | DG Connect |
| ADR-02 | PostgreSQL Deployment Model |
The different patterns to persist data in the Simpl-Open microservices architecture are:
Option 3 has not been further analysed as it creates tight coupling between services and goes against the architecture principles of Simpl-Open. |
The decision is to select option 2 (a database per service) as the default option for the development of Simpl-Open. Note: this decision does not prevent the final data space user from choosing the deployment model that best fits its interests. |
Pros
Cons
|
23 Jan 2025 | DG Connect |
| ADR-03 | Notification service |
In certain cases, notifications need to be sent to end-users. For example, when an onboarding request is created, a notary of the GA should be notified. It is important to distinguish notifications from asynchronous call-backs. For notifications, the Simpl-Open custom-built components need a way to generate events. Assumption: in data spaces the likelihood of asynchronous processing is high due to the federated nature. The following options have been considered:
|
The decision is to select option 2 (independent and asynchronous notification microservice). |
Pros
Cons
|
23 Jan 2025 | DG Connect |
| ADR-04 | Distributed tracing |
Operating a complex, distributed system like Simpl-Open, with its federated, modular, and loosely coupled architecture, requires deep visibility into how requests travel across its many independent components. This visibility enables teams to spot bottlenecks, pinpoint errors, and diagnose performance degradation much faster than is possible with standard logs alone. Ultimately, effective tracing aims to provide a unified view of all interacting components, consolidated within a single dashboard, improving observability and troubleshooting efficiency across the entire system. The need for tracing stems from a real engineering pain point experienced by a team in the Simpl-Open project. Below are the options that were considered for implementing tracing in Simpl-Open. All presented options take advantage of the modular and resilient architecture of Simpl-Open, ensuring that tracing is scalable and interoperable:
|
The decision is to select option 3 (Integrate OpenTelemetry with ELK through a log collector). |
Pros:
Cons:
|
17 Apr 2025 | DG Connect |
| ADR-05 | Access control inside Simpl-Open Agent |
Tier 1 communication with Simpl-Open microservices is protected by an API gateway that validates JWT tokens at the edge. The assumption has been that internal communication cannot be accessed by an attacker, so no internal service-to-service traffic security is in place. However, this model does not align with the Zero Trust approach, which assumes that threats may exist within the internal network. To follow this Zero Trust approach, Simpl-Open needs to ensure that all service-to-service (east-west) communication is authenticated and authorised, preventing unauthorised access even within the internal network. Below are the options that were considered:
|
The decision is to select option 5 (Microservices to validate OAuth2 JWT roles and scopes with optional Istio service mesh). |
Pros:
Cons:
|
15 May 2025 | DG Connect |
| ADR-06 | Healthchecks monitoring |
To monitor the health of Simpl-Open components, a healthcheck mechanism should be implemented. Below are options that were considered:
|
The decision is to select option 3 (Enhance Probing with Spring Boot Actuator). |
Pros:
Cons:
|
10 Jul 2025 | DG Connect |
| ADR-07 | Terraform Provisioning | As OVH does not have an official Crossplane plugin, and the existing one does not support the provisioning of Virtual Machines, research was carried out into using Terraform instead. |
It was confirmed that, using an OVH token, the Infrastructure Provisioner can deploy a Virtual Machine in a similar fashion to the IONOS flow. The required changes to the Infrastructure Provisioner concern the definition of the resource template and the ArgoCD part. The technology chosen to support the creation of Virtual Machines using the Terraform language is OpenTofu. As a fully open-source, community-driven alternative to Terraform, it ensures long-term flexibility, transparency, and independence from proprietary licensing changes, while maintaining full compatibility with existing Terraform configurations and supporting new infrastructure-as-code configurations. |
The architecture related to “BP08” remains unchanged; only the Technology View is affected. | 28 May 2025 | DG Connect |
| ADR-08 | Signer service eIDAS compliance |
The OCM signer service provides a cryptographic function that binds the organization's identity to the claims within a Verifiable Credential, making it trustworthy and verifiable in a digital environment.
While the OCM performs a digital signature that provides integrity and links the VC to the issuing organization, this process, by default, does not include the eIDAS requirements to achieve the legal equivalence of a handwritten signature (Qualified Electronic Signature).
Some data spaces may require achieving the legal equivalence of a handwritten signature (QES) for certain scenarios (e.g. establishing a contract). The EU Digital Signature Service (DSS) offers exactly that by facilitating the creation and validation of electronic signatures in line with eIDAS regulations. Unlike the general OCM signer service which focuses on the cryptographic binding for VC integrity, DSS is specifically designed to help implement signing processes that adhere to the legal and technical requirements defined by eIDAS, including working with qualified certificates and signature creation devices when needed to achieve the highest levels of assurance and legal recognition within the European Union. Below are options that were considered:
|
The decision is to select option 1 (Standalone service). |
|
26 Jun 2025 | DG Connect |
| ADR-09 | Infrastructure Consumption Monitoring |
Key architectural considerations:
Other useful aspects:
Below are options that were considered:
|
The decision is to select option 3 (dedicated microservice). |
Pros :
Cons :
|
10 Jul 2025 | DG Connect |
An actor refers to an entity or participant that interacts with the system. Actors can be users, applications, Simpl-agent, etc. They play specific roles and have distinct permissions within the Data Space ecosystem.
The following context diagram introduces the main actors that will interact with each other using Simpl-Open and their interactions.
These actors are defined as follows:
| Actor | Description |
|---|---|
| Application Provider | The application providers cover all the Data Space actors offering applications to the consumers or any other type of participant. The term “application” is used in a rather broad sense in this document and it covers any sort of executables including applications, as well as algorithms, such as a trained AI model that users can leverage to analyse their data. Application providers can also define the access control policies regarding their resources and bill the users for their usage. |
| Data Provider | This category covers all the Data Space actors offering data to the consumers. They can share one or more data sets and regulate the access and usage over the data with the help of policies. To be compensated for data usage, the data providers can also bill the Data Space consumers. An example of a data provider is an energy network operator sharing data on the energy grid load with energy production facilities (who act as consumers) for a production optimisation application. |
| Infrastructure Provider | The infrastructure providers offer infrastructure resources and services to the consumers (or possibly to any other type of participant) to enable them to process the data provided by the data providers. They can, for example, launch virtual machines or containers and run applications, algorithms, or other executables on top of the underlying infrastructure. Similarly to the data providers, the infrastructure providers can define access control policies for the infrastructure resources and bill the middleware users for their usage. |
| Consumer | A consumer aims to use data, applications and infrastructure shared by providers. Consumers can search for these resources and use them as allowed by the policies. For data, this typically means either using it online through the infrastructure and applications offered by application and infrastructure providers or, if the policy allows, downloading it for local usage. |
| Governance Authority | The Data Space participant that is accountable for creating, developing, operating, maintaining and enforcing the governance framework for a particular Data Space. 1 |
1 https://dssc.eu/space/Glossary/176553985/DSSC+Glossary+%7C+Version+2.0+%7C+September+2023
The following diagram presents Simpl-Open functional architecture.
An agent per type of participant is represented and the functional components are represented as ArchiMate services.
All the functional components presented on the diagram are described below, together with how they implement the building blocks from the high-level architecture and how they interact with each other. These interactions are highlighted with numbers on the diagram, which are linked to the description below through the purple numbers between brackets.
The Onboarding component implements the Onboarding building block: it provides the functionalities to submit, review and approve onboarding requests and deliver to the applicant the necessary security credentials to join a Data Space.
Both consumers and providers (data/application/infrastructure) can request to join a Data Space through the Onboarding component ( 1 ). This component allows the governance authority to control the required onboarding documents and approve/reject the onboarding request ( 2 ). If approved, the onboarding component sets up accesses and rights into the IAA component of the Governance authority ( 3 ) and delivers security credentials to the applicant ( 4 ).
The IAA component implements the Identity Provider Federation, Authorisation, Security Attribute Provider Federation and User Roles building blocks: it serves as a security intermediary for all communications between actors and components of Simpl-Open.
Once a participant has received security credentials from the Onboarding component, they install the credentials into their own IAA component ( 5 ). Once installed, the IAA component of all participants is federated, using the IAA component of the governance authority as trust anchor ( 6 ). In reality, the IAA component is connected to any single component of Simpl-Open as any interaction with the agent must be authorised and authenticated. For the sake of keeping the diagram readable, the relations between the IAA component and all the other components are not represented on the diagram.
The Vocabulary Management component implements part of the Metadata Description building block: it serves to harmonise the vocabularies in the Data Space, by providing the definition of metadata representation and, if required, the data representation standards.
The governance authority defines the vocabularies through the Vocabulary Management component ( 7 ).
The Schema Management component implements another part of the Metadata Description building block: it provides the functionalities to define the ontologies and schema of the resource description (i.e. what properties can/should be part of it, what are their types, constraints and vocabulary).
The governance authority defines the schemas through the Schema Management component ( 8 ).
The Resource Offering Editor component implements the last part of the Metadata Description building block. It provides the functionalities to create and sign resource descriptions (in the form of Self-Descriptions). It remains up to date with current metadata description standards by fetching schemas and vocabularies from the Schema Management and Vocabulary Management components ( 9 ).
The Federated Catalogue component implements the Resource Catalogue building block and part of the Search Engine building block. It provides the functionalities for providers to publish their resources and for consumers to discover these resources.
The Search component implements the remaining part of the Search Engine building block. It provides the functionalities for consumers to query and filter catalogue items to find the most suitable resources.
The Data Space Connector implements part of the Resource Catalogue, Usage Contract, Data Orchestration and Simple Data Transfer building blocks. It provides an implementation of the Data Space Protocol and acts as an orchestrator between its 3 parts:
Local Assets Catalogue in which the providers register the information, related to their own published resources, that is required for supporting the contract negotiation and transfer process;
Contract Negotiation provides the electronic contract negotiation required for consuming any type of resources;
Transfer Process supports the triggering of the data transfer or deployment of other types of resources.
Providers (data/application/infrastructure) create and sign resource descriptions in the Resource Offering Editor component ( 10 ) and can then register them in the Local Assets Catalogue of their respective Data Space Connector component ( 11 ). The data registered in the Local Assets Catalogue of the Data Space Connector is the minimal subset of metadata required to enable the next two parts of the DSP: contract negotiation and transfer process.
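The idea of registering only a minimal subset of metadata locally can be sketched as a simple projection of the full resource description. The property names in `NEGOTIATION_FIELDS` are hypothetical; the actual subset required by the Data Space Connector is not specified in this section.

```python
# Hypothetical property names, chosen for illustration only.
NEGOTIATION_FIELDS = ("id", "title", "policy", "endpoint")


def to_local_asset(self_description: dict) -> dict:
    """Keep only the properties needed for negotiation and transfer."""
    missing = [key for key in NEGOTIATION_FIELDS if key not in self_description]
    if missing:
        raise ValueError("self-description lacks: " + ", ".join(missing))
    return {key: self_description[key] for key in NEGOTIATION_FIELDS}
```

The full Self-Description would still be published to the Federated Catalogue; only the reduced record is kept in the Local Assets Catalogue.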
Once registered locally, the Resource Offering Editor can publish the entire resource description (in the form of a Self-Description) to the Federated Catalogue component ( 12 ).
The Federated Catalogue validates the submitted resource descriptions against the schemas and ontologies provided by the Schema Management component ( 13 ) and against the vocabularies provided by the Vocabulary Management component ( 14 ).
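The two validation steps ( 13 ) and ( 14 ) can be illustrated with a simplified sketch in which plain dictionaries stand in for the real schemas and vocabularies. The rule layout (`type`, `required`, `vocabulary`) and the function name `validate_description` are assumptions made for this example, and vocabulary checks are only applied to string-valued properties.

```python
def validate_description(description: dict, schema: dict, vocabularies: dict) -> list:
    """Return a list of validation errors (empty when the description is valid)."""
    errors = []
    for name, rule in schema.items():
        if name not in description:
            if rule.get("required", False):
                errors.append(f"missing required property: {name}")
            continue
        value = description[name]
        # Schema check: the property must have the declared type.
        if not isinstance(value, rule["type"]):
            errors.append(f"wrong type for {name}: expected {rule['type'].__name__}")
            continue
        # Vocabulary check: the value must be a known term, if a vocabulary applies.
        vocab = rule.get("vocabulary")
        if vocab is not None and value not in vocabularies.get(vocab, set()):
            errors.append(f"{name}={value!r} is not a term of vocabulary {vocab!r}")
    for name in description:
        if name not in schema:
            errors.append(f"unknown property: {name}")
    return errors
```

Returning all errors at once, rather than failing on the first, lets a provider correct a rejected description in a single round trip.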
Consumers can browse the resource offerings published in the Federated Catalogue through the Search component ( 15 ). Instead of having a search functionality embedded in the Federated Catalogue, the Search component is represented as a distinct component of the consumer agent, connecting to the Federated Catalogue in the governance authority agent ( 16 ), to enable the two-tier approach for IAA (the consumer end-user connects to the Search component via tier 1 and the Search component connects to the Federated Catalogue via tier 2).
Once consumers have found a resource offering that they would like to consume, they can request the consumption in the Search component, which initiates a contract negotiation with the provider through the Data Space Connector component ( 17 ). The Search component has obtained from the Federated Catalogue the address of the provider’s Data Space Connector and the identifier of the resource offering, and provides these elements to the Data Space Connector. Based on these two elements, the consumer’s Data Space Connector initiates a contract negotiation with the provider’s Data Space Connector ( 18 ).
Based on the received resource offering identifier, the provider’s Data Space Connector can query its Local Assets Catalogue to obtain the necessary metadata to create a contract ( 19 ).
The provider’s Data Space Connector provides the contract to the consumer’s Data Space Connector for signature by the consumer ( 20 ).
As signing a contract is not explicitly part of the Data Space protocol, the signature process is not implemented within the Data Space Connector. Instead, it is externalised to the Contract Management component ( 21 ).
The Contract Management component implements the last part of the Usage Contract building block. It provides the functionalities to create, sign and persist usage contracts.
The consumer signs contracts through the Contract Management component ( 22 ).
Once signed by the consumer, its Data Space Connector provides the contract back to the provider’s Data Space Connector for the provider to sign it ( 23 ). As for the consumer, the signature is delegated to the Contract Management component ( 24 ) through which the provider can counter-sign the contract ( 25 ). The Contract Management component persists the signed contract and provides a copy to the consumer via their Data Space Connectors ( 26 ).
The Contract Management component of the consumer persists the signed contract ( 27 ).
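The signing sequence described in steps ( 20 ) to ( 27 ) amounts to a small state machine: the consumer signs first, then the provider counter-signs. The sketch below illustrates that ordering only; the `Contract` class and its state names are assumptions, and no real signature is computed.

```python
from enum import Enum


class ContractState(Enum):
    OFFERED = "offered"                  # provider sent the contract to the consumer
    CONSUMER_SIGNED = "consumer_signed"  # consumer signed via Contract Management
    AGREED = "agreed"                    # provider counter-signed


class Contract:
    """Hypothetical contract record tracking who has signed so far."""

    def __init__(self, offering_id: str, consumer: str, provider: str):
        self.offering_id = offering_id
        self.consumer = consumer
        self.provider = provider
        self.state = ContractState.OFFERED
        self.signatures = []

    def sign(self, party: str) -> None:
        # The consumer must sign before the provider can counter-sign.
        if self.state is ContractState.OFFERED and party == self.consumer:
            self.state = ContractState.CONSUMER_SIGNED
        elif self.state is ContractState.CONSUMER_SIGNED and party == self.provider:
            self.state = ContractState.AGREED
        else:
            raise ValueError(f"{party} cannot sign in state {self.state.value}")
        self.signatures.append(party)
```

Only a contract in the `AGREED` state would be persisted on both sides and allow the consumption to start.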
Once a usage contract agreement is established, the Data Space Connector of the provider can start data and/or infrastructure consumption.
For standalone infrastructure consumption (see BP 08), the Data Space Connector of the infrastructure provider triggers the deployment of the Infrastructure Resource through the Infrastructure Management component ( 28 ).
The Infrastructure Management component implements the Infrastructure Orchestration, VM Provisioning, Container Provisioning and Storage Provisioning building blocks. It provides the necessary features to deploy and configure (incl. policies) infrastructure resources. It also partly implements the Data Visualisation building block by providing the functionality to deploy a built-in data visualisation application on the infrastructure resources. The remaining part of the Data Visualisation building block is implemented by the built-in application itself.
The Infrastructure Management component deploys and configures the requested Infrastructure Resource ( 29 ) and provides access details back to the consumer via the Data Space Connector ( 30 ).
The consumer gets access details from their Data Space Connector ( 31 ) and can access the Infrastructure Resource using these details (outside of Simpl-Open) ( 32 ).
For direct data download (see BP 09A), the Data Space Connector of the data provider accesses the Data Resource through the Data Transfer component ( 33 ).
The Data Transfer component provides the functionalities to access various types of data resources and transfer them between participants. It implements the Data Orchestration and Simple Data Transfer building blocks.
The Data Transfer component accesses the Data Resource ( 34 ) and transfers a copy of it to the consumer via the Data Space Connector ( 35 ). The consumer’s Data Space Connector stores the copy of the Data Resource on the consumer side ( 36 ), which can be accessed by the consumer ( 37 ).
For access to data over an application deployed on an infrastructure, currently, both the data and application resources are already available in the infrastructure provider and are deployed together with the infrastructure resource. In a future release, a solution involving the Data Space Connectors of both infrastructure and data providers will be envisaged.
The Observability component implements the Logging building block and part of the Monitoring building block. It provides the functionalities to collect and monitor logs and metrics from the other components of the agent.
In reality, the Observability component is connected to any single component of Simpl-Open as all of them produce logs and are monitored. For the sake of keeping the diagram readable, the relations between the Observability component and all the other components are not represented on the diagram.
From the above architecture, 3 functional domains can be distinguished:
Access Control & Trust - This domain provides the means to join a Data Space and establish trust between participants.
Publish and consume resources - This domain covers the essence of a Data Space: allowing participants to share resources (datasets, infrastructure, applications).
Management/Operation of Data Space - This domain provides the functionalities that are necessary to manage and operate a Data Space.
The table below summarises how the functional components implement the building blocks from the high-level architecture, and how they map to the functional domains.
| Functional architecture component | High-level architecture building block implemented | Functional domain |
|---|---|---|
| Onboarding | Onboarding | Access Control & Trust |
| IAA | Identity Provider Federation; Authorisation; Security Attribute Provider Federation; Authentication Provider; User Roles | Access Control & Trust |
| Vocabulary Management | Metadata Description (partly) | Publish and consume resources |
| Schema Management | Metadata Description (partly) | Publish and consume resources |
| Resource Offering Editor | Metadata Description (partly) | Publish and consume resources |
| Federated Catalogue | Resource Catalogue; Search Engine (partly) | Publish and consume resources |
| Search | Search Engine | Publish and consume resources |
| Data Space Connector | Resource Catalogue (partly); Usage Contract (partly); Data Orchestration (partly); Simple Data Transfer (partly) | Publish and consume resources |
| Contract Management | Usage Contract (partly) | Publish and consume resources |
| Infrastructure Management | Infrastructure Orchestration; VM Provisioning; Container Provisioning; Storage Provisioning | Publish and consume resources |
| Data Transfer | Data Orchestration; Simple Data Transfer | Publish and consume resources |
| Observability | Logging; Monitoring (partly) | Management/Operation of Data Space |
A mapping between the functional requirements level 2 and the functional components presented above is provided in annex.
The business processes and the underlying functional requirements are available from the Simpl Programme website .
Simpl-Open Application Architecture develops the target application architecture of Simpl-Open that enables the business architecture and the architecture vision, in a way that addresses the requirements.
It identifies architecture components through the following views:
| View | Description |
|---|---|
| Application Services Static View | Provides a view per business domain of the application services implementing the domain. |
| Application Components Static View | Provides a view per application service of the application "Solution", with all the main components and interactions. |
| Application Components Dynamic View | Provides a dynamic view per business process (or sub-process) on how application components are used to satisfy different workflows. |
In addition to these architecture views, the following are provided:
Simpl-Open non-functional requirements are available on the Simpl website .
Within Simpl-Open Architecture, three categories of components can be distinguished:
Simpl-Open Domain 1 (Access Control & Trust) Mandatory Services: these services are technical prerequisites to be installed for any Simpl-Open agent. Domain 1 services should be installed in at least 2 agents to allow the distributed IAA mechanism of Simpl-Open (based on 2-Tier approach).
Simpl-Open Domain 2 & 3 Optional Services: represents any other components which can be deployed on top of the Access Control & Trust components to provide all capabilities of Simpl-Open.
External Dependencies: These are mandatory capabilities for the services that require it but that can be replaced by equivalent technologies.
The following figure presents exhaustively the list of Domain 1 Services and External Dependencies, and provides a few examples of the Domain 2 & 3 Services:
Application component views are presented per functional domain in the following sub-sections.
For each functional domain, the following are presented:
a static view of the entire domain which presents all the application services that are necessary to implement the functionalities of the domain and how they interact with each other;
a set of static views that each zoom in on a specific application service and present all the application components composing the service as well as its integration with other services;
a set of dynamic views that present how a subset of the application components is used to satisfy different (parts of) business processes.
The static domain view illustrates the structural organisation of application services involved in the domain, segmented into two types of agents (Governance Authority and a generic applicant/participant which can represent Consumer, Data Provider or Infrastructure Provider), showcasing the roles each plays in the domain.
As per the legend, components highlighted in red are foreseen to be part of Simpl-Open but are not part of the current release, while components highlighted in orange are external to Simpl-Open.
The red numbers on the diagrams help to correlate APIs with their definition which can be found in the Interfaces section, while the green letters represent the link with the User Interfaces which can be found in the same Interfaces section.
Authorisation
The authorisation component processes all Tier 1 and Tier 2 inbound traffic originating from external sources and enforces RBAC and ABAC rules.
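How RBAC and ABAC rules combine can be sketched as two independent checks that must both pass. The role names, permissions and attribute keys below are hypothetical and only illustrate the principle; they are not the rules enforced by the Simpl-Open authorisation component.

```python
# Hypothetical role-to-permission mapping (RBAC).
ROLE_PERMISSIONS = {
    "catalogue-reader": {"catalogue:read"},
    "catalogue-publisher": {"catalogue:read", "catalogue:publish"},
}


def rbac_allows(roles, action: str) -> bool:
    """RBAC: at least one of the caller's roles grants the action."""
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


def abac_allows(attributes: dict, required: dict) -> bool:
    """ABAC: every required attribute is present with the expected value."""
    return all(attributes.get(key) == value for key, value in required.items())


def authorise(roles, attributes: dict, action: str, required_attributes: dict) -> bool:
    # Deny by default: both the role check and the attribute check must pass.
    return rbac_allows(roles, action) and abac_allows(attributes, required_attributes)
```

For example, a request carrying the role `catalogue-publisher` would still be refused if its identity attributes do not match those required for the target resource.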
Security Attributes Provider
The Security Attributes Provider component is deployed in the Governance Authority Agent and registers the participant’s security identity attributes. Upon approval of an onboarding request, the onboarding component calls the Security Attributes Provider to associate the security identity attributes to the participant.
Identity Provider
This component is deployed inside the Governance Authority Agent. It generates and renews the credentials for a newly onboarded participant and stores them along with the participant’s information. This component also allows the applicant participant to download the generated security credentials that can then be installed in the Tier 2 Authentication provider of the participant agent.
Onboarding
The Onboarding component is deployed inside the Governance Authority Agent and is the core component for managing onboarding requests from applicants (applicants can be both providers and consumers). This is where the applicant requests new Tier 1 credentials and initialises its onboarding request. The Governance Authority Tier 2 authorisation operator can approve or reject the request, or require new documents to fulfil it. After the request has been approved, the applicant must create the keypair to be associated with the credential and submit the public key to the governance authority, which triggers the creation of a Tier 2 credential by the Identity Provider component.
Refer to ACV Dynamic - BP 03A - Onboard a Participant for a full description of a Participant Onboarding.
Document Validation Service
An external validation service where custom document validation logic can be implemented and exposed through a well-established API contract.
Tier 1 Authentication Provider
The Tier 1 authentication provider holds the participant’s users and roles and allows IdP Federation.
Tier 2 Authentication Provider
The tier 2 authentication provider is the component that:
Manages the storage and update of the security credentials inside the Credentials Database/Vault component.
Inside a participant, is involved in 2 steps after the onboarding request has been approved:
when the applicant representative creates/uploads a keypair into a participant agent
when the applicant representative installs the security credentials previously generated by the governance authority
In the communication between participants, helps the Authorisation Tier 2 components to validate Tier 2 credentials (Ephemeral Proof and Security Credentials).
Keeps a copy of the dataspace identity attributes local to the agent
Keeps details about the participant organisation owning the agent
Supports the credential renewal flow via the governance authority
Exposes internal APIs to help Simpl components to fetch information about the participants and their identity attributes
Credentials Database/Vault
Component that handles the physical storage of the participant credentials
User & Roles
This component works as an interface in front of the tier 1 authentication provider. Its responsibilities are:
reading and writing users and roles in the tier 1 authentication provider;
mapping the Tier 1 Roles to assignable security identity attributes;
creating an applicant user along with temporary credentials in the tier 1 authentication provider at the beginning of the onboarding process.
A new participant in a Data Space – whether a data provider, application provider, infrastructure provider, or consumer – begins by registering itself and obtaining a temporary Tier 1 credential.
Using the Tier 1 temporary credential, the new participant submits an onboarding request by completing the required information forms.
The onboarding request is then processed by the Governance Authority and either approved or rejected. During the review process, the Governance Authority can provide comments on the onboarding request and submit requests for additional documents to the applicant participant.
Assuming the onboarding request gets approved, the new participant creates a Tier 2 key pair and begins the process of obtaining a valid identity credential for its Simpl-Open Agent. This credential proves the ‘identity’ of the installed Simpl-Open Agent and enables secure communication with other Data Space participants. Simpl-Open Agents will only permit communication with other network participants who hold a valid identity credential.
After the participant has successfully acquired a valid identity credential, they proceed to install this credential within their Simpl-Open Agent. This installation process involves integrating the credential into the agent’s system, ensuring that it is properly recognised and authenticated.
Applicant Participant creates an onboarding request: the applicant participant requests credentials in the Governance Authority providing information about the organisation and the participant’s role in the Data Space (consumer or data/infrastructure/application provider). Credentials are created in the Governance Authority Tier1 Authentication provider through the Users&Roles component. After the credentials have been created and stored in Tier1 User Database, the onboarding component creates an onboarding request with the status IN PROGRESS.
Applicant participant submits the onboarding request (1) : the applicant participant logs in to the onboarding Frontend using the temporary credentials and fills the onboarding request. In addition to organisation data, the applicant must also upload any required documents, as specified by the onboarding procedure. A comment section is available to facilitate communication between the applicant and the Governance Authority. Once all mandatory documents have been provided, the applicant can submit the request for review to the Governance Authority representatives.
Governance Authority representative reviews the onboarding request (1) : when the onboarding request has been submitted, the governance authority representative reviews it and decides:
to APPROVE the onboarding request and proceed to the credential creation step.
to REQUEST A REVIEW from the applicant participant, possibly requiring additional documents (“temporary rejection”)
to REJECT the onboarding request. In this case the onboarding process stops.
As soon as the onboarding request has been approved, the onboarding component creates the participant and saves the participant identity attributes in the Security Attributes Provider component.
Applicant creates a keypair: once the onboarding request has been approved, the applicant representative can start the credential creation process inside the participant agent. The applicant representative generates a keypair and stores it in the participant agent.
Applicant triggers credential creation: the public key (whose keypair is safely stored inside the participant agent) in the form of a Certificate Signing Request (CSR) is sent by the applicant representative to the governance authority. Using the public key, the onboarding component triggers a credential creation through the identity provider component. After the creation, the applicant representative can download the credential.
Applicant installs credentials (1) : the participant applicant can install the generated credential inside the participant agent along with the previously generated keypair. The Tier 1 public key is sent to the governance authority via Tier 2 communication. The governance authority receives the Tier 1 public key and signals that the onboarding request has been completed.
(1) The integration with the Simpl Open notification service has not yet been included in the latest release.
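The review loop described in the steps above can be read as a small state machine. The following Python sketch is a simplified illustration: only the IN PROGRESS status is named in the text, while the other status labels, the event names and the `next_status` function are assumptions made for the example.

```python
from enum import Enum


class Status(Enum):
    IN_PROGRESS = "IN PROGRESS"  # status named in the process description
    IN_REVIEW = "IN REVIEW"      # assumed label for a submitted request
    APPROVED = "APPROVED"        # assumed label
    REJECTED = "REJECTED"        # assumed label


# (current status, event) -> next status
TRANSITIONS = {
    (Status.IN_PROGRESS, "submit"): Status.IN_REVIEW,
    (Status.IN_REVIEW, "approve"): Status.APPROVED,
    # A "temporary rejection" hands the request back to the applicant.
    (Status.IN_REVIEW, "request_review"): Status.IN_PROGRESS,
    (Status.IN_REVIEW, "reject"): Status.REJECTED,
}


def next_status(status: Status, event: str) -> Status:
    """Return the status reached from `status` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in status {status.value}"
        ) from None
```

APPROVED and REJECTED have no outgoing transitions in this sketch, which reflects that a rejection stops the onboarding process and an approval hands over to credential creation.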
The participant must configure the User and Roles module to allow end users to log in and start operating with Simpl-Open. Simpl-Open administrators log into the Participant’s Agent and begin configuring roles within the Simpl-Open Agent.
After configuring roles, they can federate the local identity provider with the Authentication Provider module of the Simpl-Open Agent, if needed. This step is crucial as it ensures that the Simpl-Open Agent can accurately verify and manage existing users’ identities.
Next, end users must be managed. Administrators can create end users within Simpl-Open or manage existing users’ identities through IdP Federation. To enable end users to use Simpl-Open functionalities, administrators must assign roles to every user in the Participant’s Agent according to their duties and responsibilities. This assignment ensures that each user has the appropriate access and permissions to perform their tasks effectively.
To enable Tier 1 users to operate their agent after the login, at least one role must be assigned to them. Roles can either be assigned by Simpl-Open administrators or requested directly by the end user.
This process describes the Individual User Onboarding functionality, which allows users to request a role (or set of roles) directly.
1. Role Request Submission
After logging into the Participant’s Agent, the end user can create a role request and specify the desired roles. Once submitted, the system notifies Simpl-Open administrators that a new request is available for review.
2. Role Request Review
Simpl-Open administrators can review every role request submitted by the end users of the Participant’s Agent. They can either approve the request (assigning one or more roles to the requester) or reject it, in which case no roles are assigned.
Following the review, the system sends a notification to the end user informing them that their request has been processed.
The Governance Authority is responsible for managing the status of credentials issued to participants, ensuring compliance and preserving trust within the dataspace. The initial issuance of a participant’s credential, along with the first assignment of identity attributes, occurs during the onboarding process (see BP03A). After onboarding, the full credential lifecycle and any subsequent identity attribute assignments are managed by the Identity Provider component within the Governance Authority.
The following diagram outlines the components involved in the actions of revoking, suspending, reactivating, renewing credentials and editing a participant’s identity attributes assignment.
Governance Authority Revokes a credential: Permanently revoke a participant’s credentials. This credential will no longer be available for use in the future.
Governance Authority Suspends a credential: Temporarily suspend credentials.
Governance Authority Reactivates a credential: Restore suspended credentials once issues are resolved.
Governance Authority Renews a credential: Extend the validity of credentials approaching expiry. Renewal can be:
Manual: the participant submits a Credential Renewal Request and the credential is renewed by the Governance Authority
Automatic: a participant can be allowed to use automatic renewal; when its credential is about to expire, the credential is renewed automatically
Edit Identity Attributes: Update the participant’s assigned identity attributes as needed.
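The lifecycle actions listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the status labels, the 30-day renewal window, the 365-day validity and all identifiers (`Credential`, `auto_renew_if_due`) are hypothetical and not part of the specification.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Credential:
    participant_id: str
    expires_on: date
    status: str = "ACTIVE"   # assumed labels: ACTIVE, SUSPENDED, REVOKED
    auto_renew: bool = False

    def _require(self, expected: str) -> None:
        if self.status != expected:
            raise ValueError(f"expected status {expected}, got {self.status}")

    def suspend(self) -> None:
        self._require("ACTIVE")
        self.status = "SUSPENDED"

    def reactivate(self) -> None:
        self._require("SUSPENDED")
        self.status = "ACTIVE"

    def revoke(self) -> None:
        if self.status == "REVOKED":
            raise ValueError("credential is already revoked")
        self.status = "REVOKED"  # permanent: no method leaves this status

    def renew(self, validity: timedelta, today: date) -> None:
        self._require("ACTIVE")
        self.expires_on = today + validity


def auto_renew_if_due(
    credential: Credential,
    today: date,
    window: timedelta = timedelta(days=30),     # assumed renewal window
    validity: timedelta = timedelta(days=365),  # assumed validity period
) -> bool:
    """Renew an active auto-renew credential that expires within `window`."""
    due = credential.expires_on - today <= window
    if credential.auto_renew and credential.status == "ACTIVE" and due:
        credential.renew(validity, today)
        return True
    return False
```

A manual renewal would call `renew` directly after the Governance Authority has handled the Credential Renewal Request; the automatic path only differs in who triggers the call.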
To share a resource, the provider of a data, application or infrastructure offering needs to make the offering available and findable for interested consumers. To this end, the provider describes the offering in the form of metadata (called a Self-Description) and makes it available in a central catalogue. This catalogue needs to provide appropriate functionality for consumers to find the data, application or infrastructure offerings they are looking for. For the consumption of offerings, functionality is provided to negotiate a binding contract with validation of the access policy (control plane), as well as for the technical consumption itself, e.g. the file transfer for data or the triggering of an infrastructure deployment in the case of infrastructure.
The static domain view illustrates the structural organisation of application services involved in the domain, segmented into four types of agents (Governance Authority, Consumer, Data Provider and Infrastructure Provider), showcasing the roles each plays in the domain.
As per the legend, components highlighted in red are foreseen to be part of Simpl-Open but are not part of the current release, while components highlighted in orange are external to Simpl-Open.
The red numbers on the diagrams help to correlate APIs with their definition which can be found in the Interfaces section, while the green letters represent the link with the User Interfaces which can be found in the same Interfaces section.
To keep the diagram lighter, the following components (and their relations) have only been represented within the Data Provider agent, but are in reality also part of the typical deployment of an Infrastructure Provider agent:
Connector
Contract Manager Orchestrator
Contract Manager Backend
Signer Orchestrator
Signer Async Adapter
Signer Backend
VC Issuer
Wallet
SD Tooling
Catalogue Client Application
Synch Schema Adapter
Policy Template Datastore
Contract Template Datastore
Data Orchestration Service
Schema Synch Service
Catalogue Client Application
The Catalogue Client Application Frontend is the primary interface through which users interact with the Catalogue. It presents search fields and options to users, which in case of advanced search are defined by the schema. It contains:
Quick Search UI - This UI allows the consumer/provider to perform a Quick Search on the respective Catalogue.
Advanced UI - This UI allows the consumer/provider to perform an Advanced Search on the respective Catalogue.
The Catalogue Client Application Backend
sends the policy-filtered queries to the Catalogue Component via the Adapter Component. After receiving results from the Catalogue, it presents them in a structured format, ensuring that users can easily navigate and interpret the returned self-descriptions and metadata.
transforms the schema definition automatically to front end files that are used to generate a custom made frontend to define the Self-Description.
Validation Backend
The Validation Backend performs syntax validation of Self-Descriptions on the provider side before they are published to the catalogue. Furthermore, it validates the resource source address, which is used for registering service offerings in the Connector.
Contract Consumption Adapter
The Contract Consumption Adapter component requests an Offering from the Provider. This Offering is returned with the Offering ID and the respective usage & access policies.
Once the user accepts the conditions (usage & access policies) the Contract Consumption Adapter builds the request to start the Contract Negotiation via the EDC Connector Adapter and retrieves the Status of the Contract Negotiation.
Catalogue
Operating on the Governance Authority node, the Catalogue component functions as the central publication point for signed self-descriptions. It includes secure API functionalities for publishing, querying and managing self-descriptions. After publication, the self-description becomes accessible to potential consumers via the Catalogue’s API. The Catalogue also manages the status of self-descriptions and facilitates seamless access to information and metadata stored in the system’s databases.
When a search request is made via the Catalogue Client Application, the Catalogue’s Search Engine processes the request, taking into account the filters and parameters provided by the Policy Filter Service and Adapter Component. This ensures that the search results returned to users are both relevant and compliant with defined policies.
The Catalogue component also works closely with the Schema Registry to ensure semantic consistency across searches.
The Catalogue component contains:
Catalogue Database - The catalogue database is one or multiple databases that persist the published Self-Descriptions.
Search Engine - The search engine indexes the entries in the catalogue database and allows for a performant search.
Vocabulary Datastore - The vocabulary datastore contains the loaded ontologies and schemas of the catalogue used for the semantic validation.
Management Service - The management service allows several operations to be performed on a Self-Description, for instance its revocation.
Syntax Validation Service - The Syntax Validation Service checks the syntax of the Self-Description before publication.
Semantic Validation Service - The Semantic Validation Service checks the semantics of the Self-Description before publication. In detail, it validates the SHACL constraints and checks whether the Self-Description complies with the ontologies in the catalogue.
Quality Rule Validation Service - The Quality Rule Validation Service checks the quality of the Self-Description before publication. It checks if all mandatory quality rules are fulfilled and uses the recommended quality rules to calculate the quality score for the Self-Description.
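The split between mandatory and recommended quality rules can be illustrated with a short Python sketch. It assumes, for the purpose of the example only, that a Self-Description is a JSON object and that the quality score is the fraction of recommended rules that pass; the rule names and the functions `parse_self_description` and `quality_score` are hypothetical. Semantic (SHACL) validation is deliberately left out.

```python
import json
from typing import Callable, Dict

Rule = Callable[[dict], bool]

# Hypothetical rule sets; real rules would be defined by the Governance Authority.
MANDATORY: Dict[str, Rule] = {
    "has_title": lambda sd: bool(sd.get("title")),
}
RECOMMENDED: Dict[str, Rule] = {
    "has_description": lambda sd: bool(sd.get("description")),
    "has_keywords": lambda sd: bool(sd.get("keywords")),
}


def parse_self_description(raw: str) -> dict:
    """Syntax check: the document must be well-formed JSON with an object root."""
    try:
        sd = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"syntax error: {exc.msg}") from exc
    if not isinstance(sd, dict):
        raise ValueError("syntax error: top-level value must be an object")
    return sd


def quality_score(sd: dict, mandatory: Dict[str, Rule],
                  recommended: Dict[str, Rule]) -> float:
    """Reject on any failed mandatory rule, then score the recommended rules."""
    failed = [name for name, rule in mandatory.items() if not rule(sd)]
    if failed:
        raise ValueError(f"mandatory quality rules failed: {failed}")
    if not recommended:
        return 1.0
    passed = sum(1 for rule in recommended.values() if rule(sd))
    return passed / len(recommended)
```

With these example rules, a Self-Description that has a title and a description but no keywords passes validation and receives a score of 0.5.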
Remark on Catalogue Deployments
In the current architecture view the catalogue is depicted as a single component, yet a different schema is used for each type of resource (data, application and infrastructure). The catalogue might thus be deployed multiple times (e.g. for testing purposes). The way this is deployed is subject to change. In the future the catalogues (data, infrastructure and application) may be kept in a single component deployment and separated by their different schemas.
According to the Data Space Protocol (DSP) specification, each implementation of a connector has to provide a local assets catalogue instance providing all registered service offerings (assets) and usage contract offerings of this provider. Hence, as a prerequisite to adding or updating a resource, these service offerings (assets) first have to be registered at the connector. There, the contract negotiation id used to start the contract negotiation is created. This id is crucial for the self-description, as it provides any customer with the link to start the contract negotiation.
Query Mapper Adapter
Policy Filter Service
Connector
The Connector component registers each resource (dataset, application, or infrastructure) as an asset within the Data Space, associating policies and contracts with each asset. It also provides controlled endpoints for each resource, playing an intermediary role in the contract negotiation process by leveraging the policies and contract templates associated with the resource. This enables the management of contractual relationships between providers and consumers. The connector functions also as a gateway for secure data exchange and ensures that policies are enforced during data consumption. It is responsible for enforcing security protocols and managing policies that govern access to the data (simple dataset or bundle).
The Connector component is implementing the Data Space protocol and contains the following sub-components:
Control Plane - The control plane of the connector acts as a state machine, overseeing the various states and transitions specified in the Contract Negotiation Protocol. It ensures that all agreements between the data provider and consumer are finalised before any data transactions take place. The Control Plane at the Provider side includes the Local Assets Catalogue component. An Asset is the primary building block for resource sharing; it represents any data or API endpoint that can be shared. Assets are descriptors that are loaded into EDC via its Management API during the registration phase performed before uploading a resource to the catalogue. In the case of a bundle, it is the URL that triggers the deployment script which will deploy the requested infrastructure and application. To perform its functions, the control plane interacts with the Management API, the Protocol API and the Policy Engine.
Data Plane - The Data Plane enables the data exchange based on the transfer protocol which will only take place in case the contract negotiation protocol has successfully established a contract. This second part is controlled by the control plane and performed by the data plane. The data plane component, which consists of an extension of the connector, manages the actual data exchange, ensuring that data flows securely from the provider’s source to the consumer’s specified destination, aligning with the agreed-upon terms of the contract. In the scenario of a bundled infrastructure, data and application, the role of the data plane component is performed by the infrastructure orchestrator which is responsible for retrieving the deployment script ID from the Asset of the resource to be used and triggering the execution of the script on the infrastructure provider. Once the provider completes the deployment, it will return the access details for the newly created environment. These details will then be forwarded to the user, who will use the provided information to access the infrastructure directly.
Management API - The Management API is a RESTful interface for client applications to interact with the control plane.
Dataspace protocol API - The Dataspace protocol API is a RESTful API interface that is used for the contract negotiation protocol.
Policy Engine - The Policy Engine is crucial in making decisions based on the policies tied to the requested resource. The policy engine is able to perform this operation because the policies are registered and linked to the registered assets (Assets component in the Control Plane). This allows the policies to be retrieved at this moment and the necessary checks to be carried out. This component evaluates whether all policy requirements are met; if they are not, it can halt the process to prevent unauthorised access.
Triggering Extension - The triggering extension will send the DeploymentScriptID and the email address of the consumer to the Infrastructure Triggering Module at the time of finalising a contract agreement. This will result in the provisioning of the infrastructure resources and the deployment of the applications on that resource.
S3 Object Storage Extension - Can transfer datasets from the S3 Object Storage of the Data Provider to the S3 Object Storage of the Data Consumer, at the time of finalising a Data Transfer Contract.
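The state-machine role of the control plane can be sketched in Python as follows. The state names follow the contract negotiation states of the Dataspace Protocol; the transition table is a simplified reading that should be checked against the protocol version in use, and the `ContractNegotiation` class and its methods are hypothetical names, not part of the EDC API.

```python
# Simplified transition table: state -> states that may follow it.
ALLOWED = {
    "REQUESTED": {"OFFERED", "AGREED", "TERMINATED"},
    "OFFERED": {"REQUESTED", "ACCEPTED", "TERMINATED"},
    "ACCEPTED": {"AGREED", "TERMINATED"},
    "AGREED": {"VERIFIED", "TERMINATED"},
    "VERIFIED": {"FINALIZED", "TERMINATED"},
    "FINALIZED": set(),
    "TERMINATED": set(),
}


class ContractNegotiation:
    def __init__(self, initial: str = "REQUESTED") -> None:
        # A negotiation starts with a consumer request or a provider offer.
        if initial not in ("REQUESTED", "OFFERED"):
            raise ValueError(f"invalid initial state: {initial}")
        self.state = initial
        self.history = [initial]

    def transition(self, new_state: str) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state} -> {new_state} is not allowed")
        self.state = new_state
        self.history.append(new_state)

    def transfer_allowed(self) -> bool:
        # The data plane may only act once the negotiation is finalised.
        return self.state == "FINALIZED"
```

The `transfer_allowed` check mirrors the rule that the transfer protocol only takes place after the contract negotiation protocol has successfully established a contract.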
The EDC Connector Adapter handles interaction with the EDC Connector. This includes:
the registration of the resource offering together with the associated policies in the connector during the creation of the resource description
as well as providing the Connector references of the asset for the resource descriptions during the request of a resource and the consumption
Contract Manager (Orchestrator and Backend)
Message Broker
Contract Template Datastore
Orchestration Platform
Provisioned Node (Infrastructure Consumer) / Private Network
It is created by the Infrastructure Provisioner (see ACV Static - Infrastructure Provisioning Service) on behalf of the Consumer, so in principle the Consumer has access to the Infrastructure (Provisioned Node) and the Private Network.
Triggering Module
The Triggering Module component is responsible for adding, managing and executing the deployment scripts and finally sharing the access data. The triggering module is made of three submodules:
Script Storage Management submodule (accessible via the API and the Infrastructure Deployment Script Management UI) : Responsible for adding and managing the deployment scripts. It contains the following functions:
Add Script : Enables users to add deployment scripts to the local repository and database, ensuring the scripts are accessible for future provisioning tasks. This function also performs security checks to prevent the uploading of malicious scripts and files.
Remove/Invalidate Script : Manages the removal or invalidation of outdated scripts from the repository and database.
Script Execution submodule : When an API call requests the triggering of the deployment script, this module initiates the execution process. Its functions are:
Retrieve Deployment Script : Retrieves deployment scripts from the repository, allowing the Infrastructure Provisioner to execute the necessary steps for the resource provisioning and software deployment.
Validate Deployment Script : Generates and compares a hash of the retrieved script from the repository to the hash that was stored in the database at the time of storing the script, to check for integrity and authenticity, confirming the script is secure and unaltered.
Trigger Execution : Communicates with the Infrastructure Provisioner via a message broker to initiate the provisioning process.
Access Management submodule : When the provisioning is done, it shares the access information such as endpoints and credentials with the consumer.
The Triggering Module exposes its functionality via an API, enabling other Simpl-Open Agent modules to interact with it as needed. After triggering the execution of the deployment script, the triggering module listens for provisioning completion events from the Infrastructure Provisioner to confirm successful deployment and share the access data.
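The Validate Deployment Script function can be illustrated with the following Python sketch. It assumes SHA-256 as the hash algorithm and uses two in-memory dictionaries as stand-ins for the script repository and the database; `ScriptStore`, `add_script` and `retrieve_validated` are hypothetical names chosen for the example.

```python
import hashlib
import hmac


def sha256_hex(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


class ScriptStore:
    """In-memory stand-in for the script repository and the hash database."""

    def __init__(self) -> None:
        self._repository = {}  # script_id -> script content
        self._database = {}    # script_id -> hash recorded at upload time

    def add_script(self, script_id: str, content: bytes) -> None:
        # Store the script and record its hash at the time of storing.
        self._repository[script_id] = content
        self._database[script_id] = sha256_hex(content)

    def retrieve_validated(self, script_id: str) -> bytes:
        # Recompute the hash of the retrieved script and compare it with
        # the recorded one before handing the script over for execution.
        content = self._repository[script_id]
        recorded = self._database[script_id]
        if not hmac.compare_digest(sha256_hex(content), recorded):
            raise ValueError(f"integrity check failed for script {script_id}")
        return content
```

A script that was modified in the repository after upload no longer matches its recorded hash, so `retrieve_validated` raises instead of returning it.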
Infrastructure Provisioner
The Infrastructure Provisioner component is an asynchronous service that orchestrates the actual provisioning of infrastructure resources and potential deployment of the applications and datasets (in case they are a part of the deployment script). Upon receiving a deployment trigger from the Triggering Module, this component follows several steps to ensure resources are provisioned, configured and made accessible. This module contains two submodules for provisioning and decommissioning:
Provisioning sub-component: Provisions the infrastructure resources, creates/grants access to them and runs post-configuration processes to set policies and to deploy applications.
Execute Deployment Script : Runs the deployment script received from the Triggering Module, provisioning resources such as compute instances, storage, or other assets.
Set Policies : Defines infrastructure specific usage and access policies to govern resource usage, aligning with predefined rules on the deployment script to control who can access the provisioned infrastructure resource.
Create Access Information : Generates and provides access credentials and endpoints, allowing authorised users to interact with the infrastructure.
Post Configuration : Can deploy applications and load datasets on the provisioned infrastructure resource.
Share Access Data : Returns the generated access information back to the Triggering Module / Access Management, so the information can be shared with the consumer.
Decommissioning sub-component: Decommissions the infrastructure asset based on the criteria set by the business (e.g. the end date of the contract). The two main functions are:
Pre-decommissioning : Initiates the pre-set decommissioning configurations such as notifying the consumer and making snapshots/backups.
Access Revocation : Revokes user access, if applicable and triggers the final termination process.
This infrastructure provisioner is not directly exposed via the public API but through the processes of the triggering module.
Infrastructure Provider Storage
Message Broker
Verifiable Credentials Issuer (VC Issuer)
The VC Issuer component securely issues and manages verifiable credentials, providing transparent and reliable validation of usage contracts. It relies on the Signer component to apply cryptographic signatures to contracts, ensuring data integrity.
Signed usage contracts are stored in the Wallet component for secure access, facilitating a robust and trustworthy credential management process.
Policy Template Datastore
SD Tooling
Located on the Provider Node, the SD Tooling component enables providers to define self-descriptions for their resources by leveraging schemas from the Schema Registry. This ensures each self-description adheres to predefined properties and constraints. The SD Tooling Component supports both UI and API methods, providing flexibility to providers. It works in tandem with the Policy Creator and Contract Template components, allowing providers to incorporate policy and contract terms directly into self-descriptions.
The SD Tooling component contains:
SD Manager - The SD Manager allows users to manage their published Self-Descriptions, for instance by triggering a revocation.
SD Creation Tool - The SD Creation Tool supports the provider in the creation of the Self-Description of their resources, by providing a generated frontend from the schema with the correct property fields.
Policy Creator - The Policy Creator component enables the creation and management of Access and Usage Policies for resources. Access Policies determine the accessibility of a resource, while Usage Policies outline the permissible uses and monitor the extent of usage to support billing based on consumption. These policies are serialised into a standardised format to ensure consistent application and interpretation across components. Integrated into the Self-Description, they contribute to a governed, comprehensive resource description.
Contract Template Editor - The Contract Template Editor enables the creation and the customisation of contract templates linked to resources in self-descriptions. These templates, once created, are stored in the Contract Template Datastore.
Schema Management
The Schema Management Service component represents the Metadata Description building block, enabling the Governance Authority to define the structure of self-descriptions. Using a UI or API, the Governance Authority can establish properties, data types, constraints and controlled vocabularies that apply across resources (datasets, applications, infrastructure). The resulting schema configurations are automatically transformed into semantic files and managed within the Schema Synch Service, ensuring the Provider Node has access to the most current schema standards for generating self-descriptions in compliance with governance protocols.
The Schema Backend functions as a central repository and management interface for schemas created by the Governance Authority. These schemas, represented as ontologies and structured schema definitions, are actively managed to provide consistent standards across resource descriptions. Serving as an application component rather than a simple data storage element, the Schema Registry facilitates regular synchronisation with the Provider Node, ensuring that providers always have access to the latest schema standards needed for creating compliant self-descriptions.
The Schema Registry is used by the catalogue client application to enable semantic consistency by defining and validating the terms used in self-descriptions and search fields. The Search Client uses the schema to define the search fields for the advanced search. This automatic form generation helps prevent ambiguous searches and ensures users can only search for terms recognised within the Data Space.
The Schema Synch Adapter synchronises the schemas with the agents and makes sure the schemas for dependent components are accessible and up to date.
The Schema Synch Service implements:
The Schema Synch Adapter API, which receives any updates from the Schema Management Service
The Schema Synch Adapter, which retrieves the schema updates and processes them
Signer Service
The Signer Service component manages the digital signing of self-descriptions, ensuring their authenticity and integrity. Upon completion, the self-description is signed using the provider’s private key to verify identity and prevent tampering. Once signed, the self-description is ready for distribution and is published to the relevant Catalogue component for broader access. This service is crucial for establishing trust between providers and consumers.
The Signer Service component provides cryptographic signing capabilities for contracts, ensuring non-repudiation and authenticity. This component validates the identity and integrity of each contract, instilling confidence in the security of agreements.
Vocabulary Management
Wallet
The following sub-sections contain dynamic views that each present how a subset of above-described application components is used to satisfy different (parts of) business processes:
The first part is that the provider needs to describe his resource using a predefined schema that when tailored to the resource at hand becomes a self-description. Next the provider needs to make this self-description (SD) available for potential search. This process is described in detail in ACV Dynamic - BP 05 - Add or Update Resource (Publish) on Catalogue. In simple terms, this publication process consists of:
The provider describes his offering using the SD Tooling; what the description should look like is defined in the schema of the self-description.
The provider registers the SD as an asset in the connector. The asset is composed of a subset of the metadata present in the SD, limited to what will be needed later during consumption.
The provider signs the Self-Description with his credentials to prove that he is the owner and to make the Self-Description tamper-proof.
Finally, the provider publishes the Self-Description to the central catalogue on the Governance Authority Node so the consumer can search for it. The Governance Authority checks automatically if the Self-Description is correct according to the syntax, semantics and quality.
Updating a Self-Description consists of first revoking the old version of the Self-Description and then publishing a new version; for details see ACV Dynamic - BP 05 - Add or Update Resource (Publish) on Catalogue.
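The revoke-then-publish semantics of an update can be sketched with a small in-memory catalogue. The status labels, the identifier scheme and the `Catalogue` class with its methods are assumptions made for this illustration and do not describe the actual catalogue API.

```python
class Catalogue:
    """In-memory stand-in for the central catalogue."""

    def __init__(self) -> None:
        self._entries = {}  # sd_id -> {"document": dict, "status": str}

    def publish(self, sd_id: str, document: dict) -> None:
        if sd_id in self._entries:
            raise ValueError(f"{sd_id} has already been published")
        self._entries[sd_id] = {"document": document, "status": "PUBLISHED"}

    def revoke(self, sd_id: str) -> None:
        self._entries[sd_id]["status"] = "REVOKED"

    def update(self, old_id: str, new_id: str, document: dict) -> None:
        # Check both identifiers first so a failed update changes nothing.
        if old_id not in self._entries:
            raise ValueError(f"{old_id} is not in the catalogue")
        if new_id in self._entries:
            raise ValueError(f"{new_id} has already been published")
        self.revoke(old_id)
        self.publish(new_id, document)

    def searchable(self) -> list:
        # Only published (non-revoked) Self-Descriptions are findable.
        return [
            sd_id for sd_id, entry in self._entries.items()
            if entry["status"] == "PUBLISHED"
        ]
```

After an update, the old version stays in the catalogue with a revoked status and only the new version is returned to searches.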
The third process is that the consumer searches the catalogue for data, application or infrastructure offerings. The consumer defines the search terms in the search client app and the catalogue on the governance authority agent executes the search. This is described in detail in ACV Dynamic - BP 06 - Search on Catalogue (Infrastructure, Data, Application). The process consists of:
The Consumer (or provider) uses the search client app to write his search terms. It is possible to use two different ways of searching, quick search or advanced search.
The consumer calls a service in the Query Mapper Adapter on the Governance Authority. This service maps the search terms onto executable queries for the catalogue and also ensures that the consumer can only see the offerings that allow it by enforcing the policy.
The query is executed by the catalogue itself and the results are returned to the consumer (provider).
The fourth part is the consumption, which comes in three variants: data direct download, infrastructure consumption, and data consumption through an application (also sometimes referred to as “bundle”).
Data direct download:
The consumer uses the connector to establish a contract with the provider (described in detail in ACV Dynamic - BP 07A - Establish a usage contract agreement).
The control planes perform the contract negotiation between the connectors of the consumer and the provider (this also includes the enforcement of the policies).
The data plane is used to transfer the data from provider to consumer.
Infrastructure consumption:
The consumer uses the connector to establish a contract with the provider (described in detail in ACV Dynamic - BP 07A - Establish a usage contract agreement).
The control planes perform the contract negotiation between the connectors of the consumer and the provider. The control plane also includes the enforcement of the policies.
The infrastructure provider retrieves, validates and triggers the deployment script of the infrastructure offering.
The infrastructure provider retrieves the access data for the infrastructural resource and shares them with the consumer.
Data consumption through an application (bundle):
The consumer uses the connector to establish a contract with the provider (described in detail in ACV Dynamic - BP 07A - Establish a usage contract agreement).
The control planes perform the contract negotiation between the connectors of the consumer and the provider (this also includes the enforcement of the policies).
An infrastructure is provisioned for the consumption (ACV Dynamic - BP 08 - Consumers select and use an Infrastructure Catalogue Resource from the Infrastructure Provider) and the dataset and application are installed on that infrastructure.
The consumer gets (restricted) access to this infrastructure.
This process outlines how a self-description can be defined and subsequently published in the Catalogue. Certain fields within the self-description link to other resources, which therefore need to be created beforehand. For instance, an infrastructure offering requires a deployment script to be added in advance so that it can be referenced within the self-description.
Schema Synchronisation: The SD Tooling component on the Provider side initiates a request for schema definitions from its local Schema Registry, which is kept in sync with the one on the Governance Authority node. This ensures consistent schema access and alignment across participants, supporting unified self-description formats in the Data Space. The retrieved schema definitions are stored in a local Schema Datastore on the Provider’s end, ensuring quick access and version control.
Create Self-Description: Providers can create a new self-description or modify an existing one through the User Interface. This interface allows them to fill in necessary fields or use a previously stored self-description template.
Syntax Validation: The Syntax Validation component within SD Tooling checks the initial structure of the self-description to confirm it meets the required format. While primarily focusing on the form of the self-description, this step also checks for basic schema compliance. If any issues are found, the provider is prompted to make corrections before proceeding.
Registering Self-Description: Following syntax validation, the self-description is directed to the Connector component, where it is registered as an asset. This registration is critical for linking the self-description to a specific connector instance, enabling controlled access for consumption by the consumer.
Signing and Publication: The Signing/Publication Service manages the integrity and authenticity of the self-description. It signs the document using the Provider’s private key to prevent tampering and then publishes the signed self-description to the Catalogue. A copy of the signed self-description is also stored locally in the Provider’s Wallet for record-keeping purposes. The Wallet maintains a history of signed copies, with any necessary purge or retention policies applied to manage storage effectively. These policies should specify when older records are archived or deleted to optimise space and meet governance standards.
Semantic Validation: After publication, the Catalogue on the governance node initiates Semantic Validation. This step checks that the self-description adheres to the Data Space’s vocabularies and ontology standards, ensuring semantic consistency.
Quality Check: The Catalogue also performs a Quality Rules check to verify that the self-description meets all mandatory quality standards. If any semantic or quality issues are identified, the End User is notified so that the specific violations can be addressed. If the self-description passes all checks successfully, the End User receives a confirmation notification, indicating that the resource is now ready for publication within the Data Space.
Database Storage: Upon passing all validations, the self-description is stored in the Catalogue’s Database along with its associated metadata, making it discoverable and accessible to other participants in the Data Space.
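The publication steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the Simpl-Open implementation: the field names, `REQUIRED_FIELDS` and the in-memory `connector_assets` and `catalogue` stores are hypothetical, and an HMAC over the canonical JSON form stands in for the signature made with the Provider's private key.

```python
import hashlib
import hmac
import json

# Hypothetical schema: the fields a self-description must contain.
REQUIRED_FIELDS = {"id", "title", "resource_type"}

connector_assets = {}   # stand-in for the Connector's asset registry
catalogue = {}          # stand-in for the Catalogue on the Governance Authority node


def validate_syntax(sd: dict) -> list:
    """Return the sorted list of missing required fields (empty means valid)."""
    return sorted(REQUIRED_FIELDS - set(sd))


def sign(sd: dict, key: bytes) -> str:
    """Stand-in signature (assumption): HMAC-SHA256 over the canonical JSON form."""
    canonical = json.dumps(sd, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def publish(sd: dict, key: bytes) -> dict:
    """Validate, register as a connector asset, sign, then publish."""
    missing = validate_syntax(sd)
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Only the metadata needed later for consumption is kept in the asset.
    connector_assets[sd["id"]] = {"id": sd["id"], "resource_type": sd["resource_type"]}
    signed = {"sd": sd, "signature": sign(sd, key)}
    catalogue[sd["id"]] = signed
    return signed


def is_untampered(signed: dict, key: bytes) -> bool:
    """Recompute the signature to detect modification after signing."""
    return hmac.compare_digest(signed["signature"], sign(signed["sd"], key))


key = b"demo-key"
signed = publish({"id": "sd-1", "title": "Rainfall 2024", "resource_type": "dataset"}, key)
```

Registering the asset before signing mirrors the order of the steps above; a real deployment would use asymmetric signatures so that the Catalogue can verify a self-description without holding the Provider's secret.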
This process illustrates how the metadata (such as status) of a self-description (SD) can be retrieved by a Provider. The different possible statuses for a self-description are outlined in the Gaia-X Federation Services documentation (40.3 Product Constraints).
Initiate Metadata Request: The Provider initiates a request to retrieve metadata associated with a specific self-description. This request includes the unique identifier of the desired self-description and is sent to the Catalogue on the Governance Authority Node.
Query Metadata: Upon receiving the request, the Governance Authority Node processes it by querying its Metadata Database to locate the requested metadata. This step ensures that the metadata aligns with the unique identifier provided.
Return Metadata: After locating the requested metadata, the Catalogue prepares a response containing the metadata details. This ensures the requested information is available for consumption.
Display to User: The SD Manager receives the metadata and presents it to the Provider’s end user, allowing them to view details like the status of the self-description.
This sequence describes the steps taken by a provider to retrieve a complete Self-Description (SD) for a resource.
Initiate SD Request : The process begins with the provider using the SD Manager to initiate a request for a specific self-description by sending its unique identifier (SD ID) to the Management Service hosted on the Governance Authority Node .
Query Self-Descriptions : After processing, the Management Service queries the Self-Description Database to retrieve the detailed self-description associated with the given SD ID.
Respond with Self-Description : Once the full self-description is retrieved, it is sent back to the provider through the Response SD component. This self-description includes all necessary metadata and resource information.
Display to User : The SD Manager on the provider node displays the retrieved self-description to the provider’s end-user through its User Interface.
Optional Storage in Wallet : Optionally, the retrieved self-description can be stored in the provider’s Wallet for local record-keeping or offline access.
This process outlines how a provider can revoke an SD in the system, with possible statuses detailed in the Gaia-X documentation (40.3 Product Constraints).
Initiate Status Change : The provider, through the SD Manager on the Provider Node, initiates a request to revoke a specific SD by sending the SD ID to the Catalogue on the Governance Authority Node .
Revoke SD : The system then revokes the SD in the Catalogue database to reflect the new status for the Self-Description.
Response and Display : The Management Service confirms the status update by returning the new status to the SD Manager . If the user interface (UI) is used, the updated status is displayed to the provider end user.
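The status retrieval and revocation flows above can be sketched with an in-memory catalogue. This is only an illustration: the status labels `"active"` and `"revoked"` and the rule that revoked entries are no longer returned by a search are assumptions made for the example; the authoritative list of statuses is the one in the Gaia-X documentation referenced above.

```python
from dataclasses import dataclass


@dataclass
class CatalogueEntry:
    sd_id: str
    status: str = "active"   # hypothetical status label


class Catalogue:
    """In-memory stand-in for the Catalogue on the Governance Authority node."""

    def __init__(self) -> None:
        self._entries = {}

    def add(self, sd_id: str) -> None:
        self._entries[sd_id] = CatalogueEntry(sd_id)

    def get_status(self, sd_id: str) -> str:
        """Metadata request: look up the status by SD ID (KeyError if unknown)."""
        return self._entries[sd_id].status

    def revoke(self, sd_id: str) -> str:
        """Mark the SD as revoked and return the new status to the caller."""
        entry = self._entries[sd_id]
        entry.status = "revoked"
        return entry.status

    def search(self) -> list:
        """Assumption: only active self-descriptions remain discoverable."""
        return [e.sd_id for e in self._entries.values() if e.status == "active"]


cat = Catalogue()
cat.add("sd-1")
cat.add("sd-2")
new_status = cat.revoke("sd-1")   # the value the SD Manager would display
```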
The process describes the end user searching for a resource in the catalogue. The end user can either use the quick search or the advanced search. For the advanced search, it is a prerequisite that the local schema registry of the provider/consumer is synced manually with the central schema registry of the governance authority. The search request is sent to the catalogue. The query mapper translates the query input to the related database query language and adds the filters based on the access policies related to the user performing the request. The search engine executes the search queries and returns the results. The result is then displayed in the end User’s search Client.
User Search Request Initiation
The user initiates a search request through the Search Client. Here, the user enters the search criteria, which could include keywords, filters, or other parameters relevant to the desired resources (e.g., datasets or applications).
For the Advanced Search, the form of the search is defined by the schema in the Schema Registry.
Policy Filter Service
The validated search request is forwarded to the Policy Filter Service. This service checks the user’s access rights based on the policies defined in the Policy Creator Component.
By applying the relevant filters, the Policy Filter Service modifies the search query to restrict results to only those resources the user is authorised to view.
Query Translation by Adapter Component
The Query Mapper Adapter Component receives the policy-filtered search request and translates it into a query language that aligns with the Catalogue’s database structure.
This step includes mapping the search parameters to the Catalogue’s internal query schema and embedding any access restrictions set by the Policy Filter Service directly into the query.
Catalogue Component Query Execution
The Catalogue Component receives the translated and filtered query from the Adapter. Within the Catalogue, the Search Engine processes the query by scanning its database, which houses all signed self-descriptions, metadata and associated policies.
The Catalogue ensures that each self-description or metadata entry returned aligns with the access policies, ensuring compliance with data governance standards.
Result Return to Search Client
After processing the query, the Catalogue Component sends the authorised results back through the Adapter Component, which re-formats them for the Search Client’s display needs. The Search Client then presents these results to the user in a structured format, along with relevant metadata to provide a comprehensive view of each item.
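The way the search terms and the access restrictions end up in a single executable query can be sketched as follows. The sketch is illustrative only: the `SelfDescription` fields, the role-based `allowed_roles` policy and the function names are hypothetical, and a Python predicate stands in for the Catalogue's actual query language.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class SelfDescription:
    sd_id: str
    title: str
    resource_type: str
    allowed_roles: Set[str] = field(default_factory=set)  # hypothetical access policy


def policy_filter(user_roles: Set[str]) -> Callable[[SelfDescription], bool]:
    """Policy Filter Service: keep only entries the user is authorised to view."""
    return lambda sd: bool(sd.allowed_roles & user_roles)


def map_query(keyword: str, resource_type: Optional[str],
              access: Callable[[SelfDescription], bool]) -> Callable[[SelfDescription], bool]:
    """Query Mapper Adapter: embed search terms and access filter in one predicate."""
    def predicate(sd: SelfDescription) -> bool:
        if keyword.lower() not in sd.title.lower():
            return False
        if resource_type is not None and sd.resource_type != resource_type:
            return False
        return access(sd)
    return predicate


def execute(catalogue: List[SelfDescription], predicate) -> List[str]:
    """Search Engine: scan the catalogue and return matching SD identifiers."""
    return [sd.sd_id for sd in catalogue if predicate(sd)]


catalogue = [
    SelfDescription("sd-1", "Rainfall 2024", "dataset", {"researcher"}),
    SelfDescription("sd-2", "Rainfall viewer", "application", {"researcher", "public"}),
    SelfDescription("sd-3", "Rainfall raw feed", "dataset", {"operator"}),
]
query = map_query("rainfall", "dataset", policy_filter({"researcher"}))
results = execute(catalogue, query)   # sd-2 is excluded by type, sd-3 by policy
```

Embedding the access filter in the query itself, rather than filtering the results afterwards, means that unauthorised entries never leave the Catalogue.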
This dynamic view for the “ Establish Usage Contract Agreement ” process captures the flow of interactions between various components involved in initiating, negotiating, validating and finalising a contract agreement between a Consumer and Provider. The view is structured into four primary sections representing different roles: Consumer, Connector, Provider and Governance Authority.
Preconditions:
Consumer Discovery and Decision: The Consumer must have discovered and selected the desired resource from the Dataspace’s Catalogue, reviewed the associated terms and conditions within the Usage Contract template (Business Process - 06) and made the decision to consume the resource (Business Processes - 08, 09A and 09B) .
No Existing Contract: There must not be an existing Usage Contract in place that covers the specific resource and terms of the current consumption request.
Initiating Contract Negotiation (Consumer to Connector) : The Consumer initiates a contract negotiation through the Connector’s control plane by creating a “Contract Offer Request.” This request is sent to the Provider’s Connector, initiating the contract establishment process.
Contract Offer Creation and Validation (Provider) : Upon receiving the Contract Offer Request, the Provider’s Connector verifies the access policy and the existence of a contract, then generates a “Contract Offer” and sends it back to the Consumer Connector for validation. The Consumer then reviews and validates this offer to ensure it meets their requirements.
Agreement Formation and Validation (Bidirectional Communication) : If the Consumer accepts the offer, the Consumer Connector initiates the creation of a “Contract Agreement.” This agreement is validated by both the Consumer and Provider’s Connectors to ensure mutual compliance. Once validated, both parties confirm the contract through Verifiable Credentials.
Verification and Issue of Usage Contract VC : The Provider invokes the VC Issuer to issue a Verifiable Credential (VC) for the Usage Contract Agreement. This credential is signed by a signer service, then returned and stored securely within the VC storage of the Wallet for regulated access to the usage terms. The same steps are then repeated on the Consumer’s side to issue, sign and securely store a VC for the Usage Contract Agreement.
Persisting Agreement (Wallet & Storage) : After the VC Usage Contract is signed and securely stored in the digital wallets of both the consumer and the provider, a copy of the contract (in a format to be determined, potentially a third VC or a traditional record) will be stored by the provider for future reference, such as billing and auditing purposes.
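The sequence of steps above can be read as a small state machine. The following sketch makes the allowed order explicit; the state names and the `REJECTED` outcome are hypothetical labels chosen for the example and are not taken from the connector implementation.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "requested"   # Consumer sent a Contract Offer Request
    OFFERED = "offered"       # Provider returned a Contract Offer
    AGREED = "agreed"         # both Connectors validated the Contract Agreement
    VC_ISSUED = "vc_issued"   # Usage Contract VCs issued and stored in both Wallets
    PERSISTED = "persisted"   # Provider stored a copy for billing and auditing
    REJECTED = "rejected"     # hypothetical terminal state for a declined offer


# Allowed transitions; anything else is treated as a protocol error.
TRANSITIONS = {
    State.REQUESTED: {State.OFFERED, State.REJECTED},
    State.OFFERED: {State.AGREED, State.REJECTED},
    State.AGREED: {State.VC_ISSUED},
    State.VC_ISSUED: {State.PERSISTED},
}


class Negotiation:
    """Tracks one contract negotiation between a Consumer and a Provider."""

    def __init__(self) -> None:
        self.state = State.REQUESTED
        self.history = [State.REQUESTED]

    def advance(self, target: State) -> None:
        """Move to the target state, or raise if the step is out of order."""
        if target not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state.name} -> {target.name}")
        self.state = target
        self.history.append(target)
```

For example, calling `advance(State.AGREED)` on a fresh `Negotiation` raises `ValueError`, because an agreement cannot be formed before the Provider has returned an offer.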
The dynamic view diagram illustrates the orchestrated interactions required to provision infrastructure resources and to deploy applications on the provisioned infrastructure asset. This view focuses on the coordinated roles of the Triggering Module , Broker , Storage and Infrastructure Provisioner.
Preconditions :
Infrastructure of the Data Space governance authority has been set up (agent deployed);
Infrastructure of the infrastructure provider has been set up (agent deployed);
Infrastructure service offering(s) have been listed on the catalogue (and therefore registered as connector assets), as per BP 05;
Infrastructure Consumer has been onboarded to the Data Space (as per BP 03A and BP 03B);
Infrastructure Consumer is authenticated and has been authorised;
Main Infrastructure instance of the infrastructure consumer has been set up (agent deployed).
Triggering and Infrastructure Provisioner Modules
This process outlines how the deployment script can be added, removed, invalidated and triggered, using the Triggering and Infrastructure Provisioner Modules.
Triggering Module
The module manages the life cycle of the Deployment Scripts and Configuration Scripts. The building blocks are:
API : Requests to the Triggering Module API are received either from the “script management UI”, when a deployment script is added or modified, or from other Simpl components such as connector extensions. At the time of contracting between two connectors, a connector extension sends the Deployment Script ID and other relevant information, such as the Consumer Email, and triggers the execution of the deployment script, which provisions the infrastructure resources and deploys apps asynchronously. The same applies to the decommissioning process, which can be triggered from the “script management UI” or from a “Triggering Decommission” client.
Script Storage Management : A backend functionality exposed through the API and, via the API, through the UI. Using this functionality, service providers (infrastructure, app or data) can store Deployment Scripts and receive a unique identifier (DeploymentScriptID) assigned to each script. When scripts are added, they are validated to ensure they do not contain malicious code. Scripts are added to a repository and a database at the same time, which provides a mechanism to check their integrity in the future and at the time of retrieval.
Add Script: Loads the deployment script
Generate Unique ID : Assigns a unique identifier (DeploymentScriptID) to each script for tracking purposes.
Validate Script : Ensures the script is free of malicious code. Scripts failing this check are rejected.
Hash the Script : Generates a hash value for the script, which is stored in the database to enable future integrity checks.
Store Script in DB : Saves the script’s metadata and hash securely in the database, with protections against SQL injection attacks.
Store Script in Repo : Stores the actual script file in a local repository for retrieval during execution.
Script Configuration Management : A backend functionality exposed through the API and, via the API, through the UI. Using this functionality, service providers (infrastructure, app or data) can store Configuration Scripts and assign them to a specific Deployment Script. When a Configuration is added, it is validated. Configurations are added to a database at the same time and are bound to a Deployment Script.
Add Configuration: Loads the configuration script
Validate the Configuration : Ensures the Configuration script is in the correct format. Scripts failing this check are rejected.
Store Script in DB : Saves the Configuration script in the database, with protections against SQL injection attacks.
Template Management : Templating is a feature that enables providers to provision Virtual Machines (VMs) based on predefined characteristics, primarily hardware specifications and operating systems. A template serves as a blueprint for defining how VMs can be configured and deployed. Deployment scripts are built from template components.
Invalidate Script: A backend functionality for disabling a Deployment Script, exposed through the API and, via the API, through the UI. Using this functionality, service providers (infrastructure, app or data) can disable Deployment Scripts. When a Deployment Script is disabled, the request is checked against the removal criteria.
Invalidate Script : Changes the status of the Deployment Script
Validate Removal Criteria : Enforces predefined rules before invalidating a script.
Flag as Invalid in DB : Updates the script’s validity status in the database to indicate it is no longer active.
Remove from Repo : Deletes the script file from the repository, though its metadata remains in the database for audit or business purposes.
Script Execution : is responsible for handling deployment script retrieval, validation, and execution requests.
Retrieve Deployment Script : When a request containing the DeploymentScriptID is received, the module retrieves the script from the repository.
Validate Deployment Script : Checks the integrity of the retrieved script by generating a new hash and comparing it with the hash stored in the database. Integrity failures trigger errors, preventing execution.
Recognise Post-Configuration Script : If the Crossplane/Terraform configuration file contains a Cloud-init configuration section with post-provisioning configurations, the section is recognised so that it can be modified (e.g., by adding a public key or password generated by the Access Management module, as described in Access Management) and then encoded to Base64 in the next steps, as required by Crossplane/Terraform.
Hash/Encode : Encodes the Cloud-init configuration (if it exists) using Base64. Hashes the password randomly generated by Access Management (if it exists) using SHA-256.
Modify Deployment Script : Replaces the plain-text Cloud-init configuration with its Base64-encoded version, which contains the added information such as the hashed password or the public key.
Trigger Deployment Script Execution : Sends the deployment script to the Infrastructure Provisioner via the Message Broker , ensuring asynchronous communication for scalability.
Access Management: is responsible for creating the access data for the resource and sending it to the End User.
Generate Password : Generates a random password for the instance to be provisioned, when one is required. The password is stored temporarily in the Wallet until provisioning is finalised.
Retrieve Access Data : Receives the access information (such as endpoints) shared by the Infrastructure Provisioner Module via the Message Broker.
Share Access Data : Shares the endpoints, credentials and any information relevant to the provisioned instance (or deployed applications). Currently, it relies on the SMTP emailer and will be replaced by wallet solutions in a future release.
SMTP Emailer : Shares the access information with the Consumer, using the email address which was received during the triggering process.
Decommissioning: A functionality for decommissioning a resource that was created through a DeploymentScriptID, exposed through the API and, via the API, through the UI. Using this functionality, Deployment Scripts can be disabled in specific scenarios, for instance at the end of a contract or upon violation of a policy.
Invalidate Script : Changes the status of the Deployment Script
Validate Removal Criteria : Enforces predefined rules before invalidating a script.
Flag as Invalid in DB : Updates the script’s validity status in the database to indicate it is no longer active.
Remove from Repo : Deletes the script file from the repository, though its metadata remains in the database for audit or business purposes.
Message Broker: The Message Broker facilitates communication between the Script Execution Module and the Infrastructure Provisioner Module .
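The storage, integrity check and Cloud-init encoding steps of the Triggering Module can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionaries stand in for the database and the repository, SHA-256 is assumed for the script hash (the text only names it for the password), the `{{PASSWORD_HASH}}` placeholder is hypothetical, and the malicious-code validation step is left out.

```python
import base64
import hashlib
import uuid

script_db = {}     # stand-in for the database: hash per script
script_repo = {}   # stand-in for the repository: the script file itself


def add_script(content: str) -> str:
    """Script Storage Management: assign an ID, hash and store the script."""
    script_id = str(uuid.uuid4())   # plays the role of the DeploymentScriptID
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    script_db[script_id] = {"sha256": digest}
    script_repo[script_id] = content
    return script_id


def retrieve_script(script_id: str) -> str:
    """Script Execution: fetch the script and verify its integrity."""
    content = script_repo[script_id]
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if digest != script_db[script_id]["sha256"]:
        raise RuntimeError("integrity check failed, execution prevented")
    return content


def embed_cloud_init(script: str, cloud_init: str, password: str) -> str:
    """Hash the password, add it to the Cloud-init section, Base64-encode it."""
    password_hash = hashlib.sha256(password.encode("utf-8")).hexdigest()
    filled = cloud_init.replace("{{PASSWORD_HASH}}", password_hash)  # hypothetical placeholder
    encoded = base64.b64encode(filled.encode("utf-8")).decode("ascii")
    # Replace the plain-text section in the script with the encoded version.
    return script.replace(cloud_init, encoded)


cloud_init = "#cloud-config\npassword_hash: {{PASSWORD_HASH}}\n"
script_id = add_script('user_data = "' + cloud_init + '"')
prepared = embed_cloud_init(retrieve_script(script_id), cloud_init, "s3cret")
```

Keeping the hash in a store separate from the script file is what makes the check meaningful: a change to the repository copy alone is detected at retrieval time.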
Infrastructure Provisioner Module
The module is in charge of provisioning and decommissioning the infrastructure resources and completing post-provisioning configuration tasks. The building blocks are:
Provisioning: is responsible for the provisioning process on the Cloud Provider side.
Validate Deployment Script : Checks the script for syntax correctness and interpretability to avoid execution errors.
Execute Deployment Script : Provisions infrastructure resources based on the script’s configuration.
Post Configuration : Completes additional tasks, such as setting policies, deploying applications, mounting or attaching storages and loading datasets as specified on the post configuration (Cloud-init) section of the deployment script.
Share Access Data : Shares access information (such as endpoints) with the Access Management Module, via the Message Broker.
Decommissioning: is responsible for the decommissioning process on the Cloud Provider side.
Run pre-decommissioning tasks : Such as making a snapshot of the resources, depending on the business requirements (yet to be clarified by Business).
Decommission : Terminate/destroy the resources.
Storage Solutions: Consists of the Database , Git-Based Repository and Wallet ensuring secure storage and retrieval of deployment scripts and passwords.
Database : Stores metadata and hashes for each script to facilitate integrity verification.
Repository : Hosts the actual deployment scripts for retrieval during provisioning.
Wallet : If a random password must be generated for the instance to be provisioned, the password is stored temporarily in the Wallet until provisioning is finalised; it is then communicated to the consumer and deleted from the Wallet.
Cloud Provider: Consists of the Cloud Provider Infrastructure where the resources are created.
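The asynchronous hand-off between the Triggering Module and the Infrastructure Provisioner Module via the Message Broker can be sketched with two in-process queues. This is only an illustration: `queue.Queue` stands in for the real broker, the message fields are hypothetical, and the endpoint is a placeholder rather than the result of an actual provisioning run.

```python
import queue

provision_requests = queue.Queue()   # Triggering Module -> Infrastructure Provisioner
access_data = queue.Queue()          # Infrastructure Provisioner -> Access Management


def trigger(script_id: str, consumer_email: str) -> None:
    """Triggering Module: enqueue the request and return immediately."""
    provision_requests.put({"script_id": script_id, "email": consumer_email})


def provision_next() -> None:
    """Infrastructure Provisioner: take one request and publish the access data."""
    request = provision_requests.get()   # blocks until a request is available
    # Placeholder for executing the deployment script on the Cloud Provider.
    endpoint = f"https://{request['script_id']}.example.invalid"
    access_data.put({"email": request["email"], "endpoint": endpoint})
    provision_requests.task_done()
```

Because `trigger` only enqueues a message, the caller is not blocked while the infrastructure is being provisioned, which is the scalability benefit the asynchronous design aims for.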
This section describes the capabilities that fall within the scope of the current release and that will be enhanced at a later time. In particular, it includes only the direct data download capability for data sharing.
The Data Provider is open to offering straightforward access to the dataset for the consumer. This access can be facilitated through a direct download, making the process simple and efficient. To ensure proper governance, a formal contract will be established between both parties. Since the data is downloaded, Simpl no longer has control over its usage, and therefore this contract will define and enforce legally binding usage policies as well as access policies. These measures will provide clarity and security for both the Data Provider and the consumer, safeguarding proper usage of the data.
Preconditions:
Data Provider has registered the resource at the Connector;
Data Provider has created the SD for the resource (metadata description) and uploaded the SD to the data catalogue;
Consumer has logged in through their agent;
Consumer found the needed dataset using the searching capabilities on the Data Catalogue;
If the contract doesn’t exist, Consumer and Provider must establish a Contract on the requested resource (BP07; see the related architecture for further details);
Consumer has available and compatible storage.
Assumptions for the current release:
The view shows the dynamic application view of consuming a data resource by directly being given access to the dataset. It outlines the key functional components involved in the process of consuming a data resource.
The consumer initiates a request for the resource previously found in the catalogue, for which a contract has already been established. This request is sent to the provider through the connector. Upon receiving the request, the provider verifies the policies to ensure that the consumer has the necessary permissions to perform the requested action. The policies that are checked are only those that can technically be enforced. For all others, since the dataset is downloaded, the contract enforces the legally binding usage policies. Once the policies are confirmed, the transaction takes place between the two data orchestrator components, which, at the moment, are implemented as extensions of the EDC connector. These orchestrator components handle the interface between the connector and the actual source on the provider side and sink on the consumer side of the data.
Dataset Selection by the Consumer :
The process begins in the Catalogue Client Application on the consumer side, where the consumer selects a dataset of interest. This action initiates a “Request Consumption of Data Asset” message, which is sent to the Contract Negotiation Adapter .
This message signals the consumer’s intent to access the resource and moves the negotiation process forward.
Creation of Request Bundle :
The Contract Negotiation Adapter takes the consumer’s request and compiles a Request Bundle .
This bundle includes information about the selected dataset and any initial parameters needed to facilitate negotiation. It is forwarded to the consumer’s Connector (Control Plane) for further processing.
Requesting an Offering from the Provider :
The consumer’s Connector (Control Plane) sends a Request Offering message to the provider’s Connector (Control Plane) .
This step involves querying the provider’s system to locate the requested dataset and determine its availability.
Asset Validation by the Provider’s Connector :
The provider’s Connector (Control Plane) checks the Asset Catalogue for the requested dataset:
If the dataset is not found , the provider responds with a Resource Not Found Message . This message is propagated back through the consumer’s Connector and Contract Negotiation Adapter to notify the consumer, effectively halting the process.
If the dataset is found , the workflow transitions into the contract negotiation phase.
Contract Negotiation Between Connectors :
Once the dataset is validated, the consumer and provider’s Connectors (Control Planes) begin negotiating the terms of usage.
This includes setting access conditions, pricing, compliance requirements, and obligations. The outcome of this step is a draft contract that must undergo further validation.
Policy Evaluation on the Provider’s Side :
The draft contract is sent to the provider’s Policy Engine for evaluation against governance and compliance rules.
If the policy check fails , the Policy Engine sends a notification back to the consumer (via the Connector Control Plane and Contract Negotiation Adapter ) explaining the violation. The process halts here unless the consumer modifies the request to comply.
If the policy check succeeds , the Policy Engine approves the contract, and the workflow proceeds to finalisation.
Notification of Policy Check Results :
The results of the policy evaluation (either success or failure) are sent back to the consumer’s Connector (Control Plane) .
In case of failure , the Contract Negotiation Adapter notifies the consumer, providing details about the violation.
If the policy check is successful , the contract is finalised and marked as complete.
Finalisation of Contract Agreement :
Once approved, the contract agreement is formalised within the Control Plane of both the consumer and provider connectors.
At this point, both parties have a binding agreement that governs the terms for the upcoming data transfer.
Initiation of File Transfer Request :
With the contract in place, the consumer’s Data Plane Extension for S3 sends a Request File Transfer message to the provider’s Data Plane Extension for S3 .
This request includes the contract agreement ID as a reference, ensuring that the data transfer adheres to the agreed terms.
Processing the File Transfer :
The provider’s Data Plane Extension for S3 verifies the file transfer request using the contract ID and cross-checks it against the agreed terms.
Once validated, the dataset is securely transferred to the consumer, completing the process.
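The final validation step, in which the provider's data plane checks a transfer request against the contract agreement ID, can be sketched as follows. The class and field names are hypothetical and an in-memory dictionary stands in for the S3 storage; the sketch only shows the cross-check, not the actual transfer protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractAgreement:
    agreement_id: str
    consumer_id: str
    asset_id: str


class ProviderDataPlane:
    """Stand-in for the provider's data plane extension."""

    def __init__(self, agreements, storage):
        self._agreements = {a.agreement_id: a for a in agreements}
        self._storage = storage   # asset_id -> bytes, stand-in for the data source

    def transfer(self, agreement_id: str, consumer_id: str, asset_id: str) -> bytes:
        """Return the dataset only if the request matches a known agreement."""
        agreement = self._agreements.get(agreement_id)
        if agreement is None:
            raise PermissionError("unknown contract agreement")
        if (agreement.consumer_id, agreement.asset_id) != (consumer_id, asset_id):
            raise PermissionError("request does not match the agreed terms")
        return self._storage[asset_id]


plane = ProviderDataPlane(
    [ContractAgreement("ca-1", "consumer-a", "asset-9")],
    {"asset-9": b"rows"},
)
payload = plane.transfer("ca-1", "consumer-a", "asset-9")
```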
The consumer seeks to perform actions such as visualisation or processing on a dataset owned by a data provider but does not have direct access to the data itself. Instead, the consumer selects and enters into a contract for an offering from the data provider, which includes the provisioning of an infrastructure resource. An application is then deployed on this infrastructure, enabling the necessary processing of the dataset. Access is provided exclusively through a direct link to the application, ensuring that the consumer cannot directly access the data. As part of the contractual agreement, the consumer is prohibited from attempting to access the data in any way.
Preconditions:
Data Provider has registered the resource at the Connector;
Data Provider has created the SD for the resource (metadata description) and uploaded the SD to the data catalogue;
Consumer has logged in through their agent;
Consumer found the needed resource using the searching capabilities on the Data Catalogue and selected the bundle of dataset, application and infrastructure associated with the dataset;
If the contract doesn’t exist, Consumer and Provider must establish a Contract on the requested resource (BP07; see the related architecture for further details).
The view shows the dynamic application view of consuming a bundle resource (dataset, application and infrastructure bundled together) by being given access to the provisioned node where the bundle is deployed. It outlines the key functional components involved in the process of consuming the resource.
In this scenario, the goal is to ensure that the consumer gains access only to the application, without direct access to the dataset itself. The consumer selects and contracts a data processing service offering. Once contracting is done, the infrastructure resource is provisioned and the application is deployed on the infrastructure instance in the background, and the Consumer ultimately receives the access data and credentials for the deployed application only. The access to the application is not depicted in this scenario.
The components coloured in grey are related to the BP07 (Contract Manager) as well as BP06 (Search) and they are mainly referring to the preconditions.
The diagram represents the action performed to trigger the Bp which consists in using the endpoint, with the needed parameters, contained in the selected description, the user can initiate the contract negotiation process.
The diagram also shows the flow of how the consumer is requesting a bundled resource via a data provider to the infrastructure provider and receives the respective access. The consumer will request the offering via the Catalogue Client UI, based on a previously identified search result. The Contract Negotiation Adapter is handling the request from the consumer, filtering for the requested asset on the provider’s catalogue and return the offering along with the respective usage and access policies. The user is then accepting those and at the same time start the contract negotiation. The contract negotiation adapter is building the request for the connector. After finalising the contract negotiation, the infra structure deployment is triggered. Once this step is completed, the access information is passed to the user,
Resource Selection:
Request Offering:
The Contract Negotiation Adapter processes the Consumer’s request by forwarding it to the Connector (Control Plane) on the Data Provider’s side.
The Connector checks the requested resource against its Asset Catalogue:
If the asset is not found, a “Resource Not Found” message is sent back to the Consumer.
If the asset is found, the Connector retrieves the associated offering details, including usage and access policies and returns them to the Consumer for review.
Policy Agreement:
The Consumer, using the Catalogue Client Application, reviews the retrieved offering details.
Upon agreeing to the usage and access policies, the Consumer initiates a contract negotiation via the Contract Negotiation Adapter.
Contract Negotiation Request:
The Contract Negotiation Adapter composes a contract negotiation request and sends it to the Connector (Control Plane) of the Data Provider.
The Policy Engine evaluates the request based on predefined policy rules:
If the policy check fails, the Consumer is notified of the violation.
If the policy check succeeds, the contract is finalised and a confirmation is sent to the Consumer.
Infrastructure Deployment Trigger:
Triggering Deployment:
Provisioning and Deployment:
The Triggering Module provisions the infrastructure instance and deploys the application onto it as per the deployment command.
Once the deployment is complete, the Triggering Module fetches the access credentials (e.g., application URL, API keys) and sends them back to the Infrastructure Orchestrator.
Returning Access Information:
The Infrastructure Orchestrator relays the access information to the Connector (Control Plane) on the Data Provider’s side.
The Connector forwards the access details to the Contract Negotiation Adapter, which delivers them to the Consumer, completing the workflow.
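The workflow above can be sketched as a small in-memory simulation. This is an illustration under assumptions only: the class and function names (`ProviderConnector`, `request_offering`, `negotiate_contract`, `provision_and_deploy`, `consume_bundle`), the single allow-list policy rule and the shape of the returned access information are hypothetical and do not reflect the actual Simpl-Open or connector APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Offering:
    asset_id: str
    usage_policy: str
    access_policy: str


@dataclass
class ProviderConnector:
    """Hypothetical stand-in for the provider-side Connector (Control Plane)."""
    asset_catalogue: dict = field(default_factory=dict)
    allowed_consumers: set = field(default_factory=set)

    def request_offering(self, asset_id: str) -> Offering:
        # Check the requested resource against the asset catalogue.
        if asset_id not in self.asset_catalogue:
            raise LookupError("Resource Not Found")
        return self.asset_catalogue[asset_id]

    def negotiate_contract(self, consumer_id: str, offering: Offering) -> str:
        # Policy engine stand-in: one illustrative allow-list rule.
        if consumer_id not in self.allowed_consumers:
            raise PermissionError("Policy violation")
        return f"contract-{consumer_id}-{offering.asset_id}"


def provision_and_deploy(contract_id: str) -> dict:
    """Hypothetical stand-in for the Triggering Module: returns access data
    for the deployed application only, never for the dataset itself."""
    return {
        "contract_id": contract_id,
        "application_url": f"https://example.invalid/apps/{contract_id}",
    }


def consume_bundle(connector: ProviderConnector, consumer_id: str, asset_id: str) -> dict:
    offering = connector.request_offering(asset_id)                    # request offering
    contract_id = connector.negotiate_contract(consumer_id, offering)  # contract negotiation
    return provision_and_deploy(contract_id)                           # deployment and access info
```

In this sketch, the two failure branches of the workflow (asset not found, policy check failed) surface as exceptions, while the success path returns only the application access details.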
This scenario follows the one above. Once the login credentials are received, the user accesses the dedicated infrastructure through a direct link.
The static domain view illustrates the structural organisation of application services involved in the domain, segmented into four types of agents (Governance Authority, Consumer, Data Provider and Infrastructure Provider), showcasing the roles each plays in the domain.
Schema Management
The Schema Management Service component represents the Metadata Description building block, enabling the Governance Authority to define the structure of self-descriptions. Using a UI or API, the Governance Authority can establish properties, data types, constraints and controlled vocabularies that apply across resources (datasets, applications, infrastructure). The resulting schema configurations are automatically transformed into semantic files and managed within the Schema Synch Service, ensuring the Provider Node has access to the most current schema standards for generating self-descriptions in compliance with governance protocols.
The Schema Backend functions as a central repository and management interface for schemas created by the Governance Authority. These schemas, represented as ontologies and structured schema definitions, are actively managed to provide consistent standards across resource descriptions. Serving as an application component rather than a simple data storage element, the Schema Registry facilitates regular synchronisation with the Provider Node, ensuring that providers always have access to the latest schema standards needed for creating compliant self-descriptions.
The Schema Registry is used by the catalogue client application to enable semantic consistency by defining and validating the terms used in self-descriptions and search fields. The Search Client uses the schema to define the search fields for the advanced search. This automatic form generation helps prevent ambiguous searches and ensures users can only search for terms recognised within the Data Space.
The Schema Synch Adapter synchronises the schemas with the agents and ensures that the schemas for dependent components are accessible and up to date.
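The way a schema can drive the advanced search form and restrict searches to recognised terms can be illustrated with a minimal sketch. The schema below is a deliberately simplified, hypothetical dictionary (property names, types and vocabulary values are assumptions); the actual schemas are ontologies and structured schema definitions.

```python
# Hypothetical, simplified schema: property name -> data type and optional
# controlled vocabulary. This structure is an assumption for illustration.
DATASET_SCHEMA = {
    "title": {"type": str},
    "licence": {"type": str, "vocabulary": {"CC-BY-4.0", "CC0-1.0"}},
    "size_mb": {"type": int},
}


def search_fields(schema: dict) -> list:
    """Derive the advanced-search form fields from the schema properties."""
    return sorted(schema)


def validate_query(schema: dict, query: dict) -> list:
    """Return a list of problems; an empty list means the query is acceptable."""
    problems = []
    for name, value in query.items():
        spec = schema.get(name)
        if spec is None:
            problems.append(f"unknown field: {name}")
        elif not isinstance(value, spec["type"]):
            problems.append(f"wrong type for {name}")
        elif "vocabulary" in spec and value not in spec["vocabulary"]:
            problems.append(f"term not in vocabulary for {name}: {value}")
    return problems
```

Because the form fields and the accepted terms are both derived from the same schema, a query can only reference properties and vocabulary terms that the schema defines.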
This section describes the architecture for Monitoring and Logging within a single node (Simpl-Open agent); it does not yet consider inter-node setups.
Simpl-Open Application Component
This component is an abstraction of any Simpl-Open application component that is being monitored. These components can produce:
Technical logs generated by the application and the underlying platform - e.g. access logs, error logs, etc.;
Business events generated by the application upon specific triggers in the business workflow - e.g. “Participant successfully onboarded”;
Infrastructure metrics - e.g. CPU utilisation, RAM utilisation, etc.;
Health checks, which are APIs implemented by the application components of the Simpl-Open agent to report on the status of the service - e.g. HTTP 200 “OK”;
Tracing data to discover potential bottlenecks in the request processing.
Platform API
Monitoring Service
Log Collection Agent
Log Ingestion Pipeline
Infrastructure Metrics Collection Agent
Logs Repository
Monitoring Space
Logs Visualisation
Reporting
Alert Manager
Health Checks
Application Tracing
The process describes the governance authority end user managing schemas for resource descriptions. The end user can create a new resource description schema, revoke an existing one, or create a new version of it.
The diagram below describes how the components presented above interact with each other to deliver the functionalities.
The Simpl-Open Application Component generates various types of data, including technical logs, business events, infrastructure metrics and health check outputs. These data streams are exposed via APIs for collection.
Log and Event Generation: The Simpl-Open Application Component produces technical logs, business events, infrastructure metrics and health check data, exposing these via APIs.
Logs Collection: The Logs Collection Agent retrieves technical logs and business events, forwarding them to the Logs Ingestion Pipeline for processing.
Metrics Collection: The Infrastructure Metrics Collection Agent gathers infrastructure metrics and directly forwards them to the Logs Repository.
Log Transformation: The Logs Ingestion Pipeline processes and transforms the raw logs into a standardised format before storing them in the Logs Repository.
Centralised Storage: The Logs Repository stores technical logs, infrastructure metrics and business events in dedicated sections, ensuring they are accessible for subsequent steps.
Log Visualisation: The Logs Visualisation component retrieves and displays logs for analysis, allowing users to review technical, infrastructure and business-related events.
Data Aggregation: The Monitoring Space aggregates logs and metrics, enabling real-time analysis of system health and performance.
Alert Generation: The Alert Manager processes aggregated data, generating alerts for any anomalies or threshold breaches and notifying relevant stakeholders.
Report Generation: The Reporting Module queries logs and metrics from the Logs Repository to create detailed reports.
Report Presentation: These reports are displayed through a user-friendly interface, providing actionable insights for decision-making.
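The transformation, storage and alerting steps above can be sketched in a few lines. The raw log format (`LEVEL message`), the record fields, the section names, the metric name `cpu_utilisation` and the 90% threshold are all assumptions made for illustration; they are not taken from the Simpl-Open implementation.

```python
from datetime import datetime, timezone


def transform(raw_line: str, source: str) -> dict:
    """Ingestion-pipeline stand-in: turn 'LEVEL message' into a common record."""
    level, _, message = raw_line.partition(" ")
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "level": level.upper(),
        "message": message,
    }


class LogsRepository:
    """Stores records in dedicated sections, as in the Centralised Storage step."""

    def __init__(self):
        self.sections = {"technical": [], "business": [], "metrics": []}

    def store(self, section: str, record: dict) -> None:
        self.sections[section].append(record)


def cpu_alerts(repository: LogsRepository, threshold: float = 90.0) -> list:
    """Alert-manager stand-in: return CPU metrics that breach the threshold."""
    return [
        metric
        for metric in repository.sections["metrics"]
        if metric["name"] == "cpu_utilisation" and metric["value"] > threshold
    ]
```

For example, storing `transform("error disk full", "onboarding")` in the `technical` section and a `cpu_utilisation` metric of 97.5 in the `metrics` section would lead `cpu_alerts` to report that one metric.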
The table below presents the APIs of all the components depicted on the application deployment views. These APIs can be correlated to the Application Components Views static diagrams (per domain) through the numbering appearing on both the diagrams and the first column of this table.
Simpl-Open uses two types of APIs:
Synchronous JSON/HTTP APIs;
Asynchronous JSON/Kafka APIs.
Each API is described in a functional way and linked to the technical contract definition (e.g. OpenAPI definition for sync APIs) which is stored in GitLab.
The API guidelines of Simpl-Open can be found in the Simpl Contributions Code of Conduct & Guidelines.
The ‘Monitoring Integration’ column indicates whether an API endpoint emits business logs to the centralised monitoring component. In the end state, all API endpoints should emit business logs to ensure auditability and traceability of all flows.
| # | Component | Sync APIs: Name | Sync APIs: Technical definition | Async APIs: Name | Async APIs: Technical definition | Monitoring Integration (yes/no) | API Guidelines Phase 1 Compliancy |
|---|---|---|---|---|---|---|---|
| 1 | SD Tooling |
SD Tooling API
|
https://code.europa.eu/simpl/simpl-open/development/data1/sdtooling-api-be/-/blob/main/openapi/openapi-v1.yaml?ref_type=heads | c: schema-changed | yes | yes | |
| 2 | EDC Connector Adapter |
EDC Connector Adapter Application API
|
https://code.europa.eu/simpl/simpl-open/development/data1/edcconnectoradapter/-/blob/main/openapi/openapi-v1.yaml | yes | yes | ||
| 3 | Validation API |
Validation API
|
https://code.europa.eu/simpl/simpl-open/development/data1/sdtooling-validation-api-be/-/blob/main/openapi/openapi-v1.yaml?ref_type=heads | yes | yes | ||
| 4 | sync-schema-adapter |
Sync Schema Adapter
|
https://code.europa.eu/simpl/simpl-open/development/data1/schema-sync-adapter | yes | yes | ||
| 5 | Signer Service |
Signer Service
|
https://gitlab.eclipse.org/eclipse/xfsc/tsa/signer/-/blob/ocm-wstack/gen/http/openapi3.json | ||||
| 6 | Catalogue Client Application |
Advanced Search API
|
https://code.europa.eu/simpl/simpl-open/development/data1/xfsc-advsearch-be/-/blob/main/openapi/openapi-v1.yaml | yes | yes | ||
| 7 | Schema Management |
Schema Management API
Subscription API
|
https://code.europa.eu/simpl/simpl-open/development/gaia-x-edc/simpl-schema-manager/-/blob/develop/openapi/schema_openapi.yaml?ref_type=heads | p: schema-changed | |||
| 8 | Vocabulary Management |
|
TBD | ||||
| 9 | Catalogue |
GAIA-X Federated Catalogue
|
https://code.europa.eu/simpl/simpl-open/development/gaia-x-edc/simpl-fc-service/-/blob/develop/openapi/fc_openapi.yaml?ref_type=heads | ||||
| 12 | Query Mapper Adapter |
Query Mapper Adapter API
|
https://code.europa.eu/simpl/simpl-open/development/gaia-x-edc/poc-gaia-edc/-/blob/develop/openapi/adapter_openapi.yaml | ||||
| 13 | Tier 1 Authentication Provider |
Keycloak APIs
|
https://www.keycloak.org/docs-api/latest/rest-api/index.html | ||||
| 14 | Tier 2 Authentication Provider |
Authentication Provider API - Tier 1 + Tier2 - V1 (Deprecated, see the deprecation details inside the OpenAPI specification)
|
https://code.europa.eu/simpl/simpl-open/development/iaa/authentication_provider/-/blob/develop/openapi/authenticationprovider-v1.yaml?ref_type=heads | ||||
|
Authentication Provider API - Tier 1 - V2 Keypairs - Manage and store participant keypair securely
Credentials - API for managing participant's credentials
Participants - Retrieve information about the agent participant and other data space participants
Identity Attributes - Retrieve information about the dataspace Identity Attributes
Tier 1 Credentials - Validates the Tier 1 credentials of an end-user originating from an external participant.
Automatic Renewals - Allow setting and retrieving the automatic renewal configuration.
|
https://code.europa.eu/simpl/simpl-open/development/iaa/authentication_provider/-/blob/develop/openapi/authenticationprovider-tier1-v2.yaml?ref_type=heads | ||||||
|
Authentication Provider API - Tier 2 - V2 Participants - Retrieve information about the agent participant and other data space participants
|
https://code.europa.eu/simpl/simpl-open/development/iaa/authentication_provider/-/blob/develop/openapi/authenticationprovider-tier2-v2.yaml?ref_type=heads | ||||||
|
Authentication Provider Async API
|
https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/blob/develop/simpl-api-iaa/src/main/resources/asyncapi/authenticationprovider/v1/asyncapi.yaml?ref_type=heads | ||||||
| 15 | User & Roles |
User and Roles API - Tier 1 - V1
|
https://code.europa.eu/simpl/simpl-open/development/iaa/users-roles/-/blob/develop/openapi/usersroles-v1.yaml?ref_type=heads | ||||
|
User and Roles API - Tier 1 - V2 Users - Manages the Users of the Simpl-Open Agent
Roles - Manages the Roles of the Simpl-Open Agent
User Session - Helps users to retrieve their session data
Role Requests
|
https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/raw/develop/simpl-api-iaa/src/main/resources/static/openapi/tier-1/usersroles-tier1-v2.yaml?ref_type=heads | ||||||
| Users and Roles Async API | https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/blob/develop/simpl-api-iaa/src/main/resources/asyncapi/usersroles/v1/asyncapi.yaml?ref_type=heads | ||||||
| 16 | Security Attributes Provider |
Security Attributes Provider API - Tier1 + Tier2 - V1 (Deprecated, see the deprecation details inside the OpenAPI specification)
|
V1 |
||||
|
Security Attributes Provider API - Tier1 - V2 Identity Attributes - Manages the identity attributes of the data space
Participant - Manages the assignment of identity attributes to participant
|
Tier 1 - V2 |
||||||
|
Security Attributes Provider API - Tier2 - V2 Identity Attributes - Allow agents to manage identity attributes via Tier 2 communication
Credentials - Manage information related to Tier 2 credentials
Ephemeral Proof - Manage the issuance of ephemeral proof via Tier 2 communication
Participants - Manage participant identity attributes
|
Tier 2 - V2 |
||||||
| 17 | Identity Provider |
Identity Provider API - Tier1+ Tier2 - V1 (Deprecated, see the deprecation details inside the OpenAPI specification)
|
V1 |
||||
|
Identity Provider API - Tier1 - V2 Participants - Allow the creation and retrieval of participant, also allowing to manage the credential requests (CSR).
Credentials - Manage the digital credentials of participants, enabling secure authentication and verification of their identity and roles within trusted data exchange processes.
Automatic Renewals - Allow the creation, retrieval and editing of the default automatic renewal configuration for the dataspace participants.
|
Tier 1 - V2 |
||||||
|
Identity Provider API - Tier2 - V2 Credentials - Manage participant credentials for Tier 2 communication, allowing the participant to interact with the governance authority
Participants - Manage participant additional information such as Tier 1 public keys sent over the Tier 2 communication channel
|
Tier 2 - V2 |
||||||
|
EJBCA REST Interface API
|
https://docs.keyfactor.com/ejbca/latest/ejbca-rest-interface | ||||||
| 18 | Onboarding |
Onboarding API v1 Onboarding Validation Rules - API for managing Onboarding Validation Rules
Onboarding Templates - API for managing Onboarding Templates
Mime Types - API for managing Mime Types
Onboarding Requests Management - API for managing Onboarding Requests Management
Onboarding Statuses - API for managing Onboarding Statuses
Participant Types - API for managing Participant Types
eIDAS Attributes - APIs to retrieve the eIDAS attributes provided in the official eIDAS attribute profile
Onboarding Users - APIs to retrieve user data to help onboarding process
|
https://code.europa.eu/simpl/simpl-open/development/iaa/onboarding/-/blob/develop/openapi/onboarding-v1.yaml?ref_type=heads | ||||
|
Onboarding API v2 Onboarding Requests Management - Manage onboarding requests for applicant who want to join the data space
|
https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/raw/main/simpl-api-iaa/src/main/resources/static/openapi/tier-1/authenticationprovider-tier1-v2.yaml?ref_type=heads | ||||||
|
Onboarding Async API
|
https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/blob/develop/simpl-api-iaa/src/main/resources/asyncapi/onboarding/v1/asyncapi.yaml?ref_type=heads | ||||||
| 20 | Connector Management API |
management-api
|
https://app.swaggerhub.com/apis/eclipse-edc-bot/management-api/0.7.0 | ||||
| 21 | Connector Control Plane |
control-api
|
https://app.swaggerhub.com/apis/eclipse-edc-bot/control-api/0.7.0 | ||||
| 22 | Connector Data Plane |
public-api
|
https://app.swaggerhub.com/apis/eclipse-edc-bot/public-api/0.7.0 | ||||
| 23 | Triggering Module |
Infrastructure Provisioning API
|
https://code.europa.eu/simpl/simpl-open/development/infrastructure/infrastructure-be/-/blob/develop/openapi/infrastructure-provisioning-api.yaml | ||||
| 24 | Contract Manager Orchestrator |
Contract Manager
|
code.europa.eu/simpl/simpl-open/development/contract-billing/contract/-/raw/develop/openapi/openapi3-v1.yaml?ref_type=heads |
|
src/main/java/eu/europa/ec/simpl/contracts/kafka/events · main · Simpl / Simpl-Open / Development / Contract-Billing / contract · GitLab | ||
| 25 | Contract Manager Backend |
|
src/main/java/eu/europa/ec/simpl/contracts/kafka/events · main · Simpl / Simpl-Open / Development / Contract-Billing / contract · GitLab | ||||
| 27 | VC Issuer |
|
Currently under investigation | ||||
| 28 | Contract Consumption Service |
Contract Consumption API
|
https://code.europa.eu/simpl/simpl-open/development/data1/contract-consumption-be/-/blob/main/openapi/openapi-v1.yaml | yes | yes | ||
| 29 | Notification Service |
Notification Service API
|
https://code.europa.eu/simpl/simpl-open/development/contract-billing/notification-service/-/blob/develop/docs/asyncApi/asyncapi.yaml?ref_type=heads | ||||
| 30 | Tier 2 Gateway | NA/ - this is an API gateway and does not implement any API. |
Tier 2 Gateway Async API
|
https://code.europa.eu/simpl/simpl-open/development/iaa/common/-/blob/develop/simpl-api-iaa/src/main/resources/asyncapi/tier2gateway/v1/asyncapi.yaml?ref_type=heads | |||
| 31 | Orchestration Engine API |
|
|||||
| 32 | Infrastructure Provider API |
|
https://code.europa.eu/simpl/simpl-open/development/infrastructure/infrastructure-be/-/blob/main/openapi/infrastructure-provisioning-api.yaml?ref_type=heads |
completed, planned
The Simpl-Open UX/UI Style Guide can be found in the Simpl Contributions Code of Conduct & Guidelines.
| # | Component | Domain | Description | Functionalities |
|---|---|---|---|---|
| A | User & Roles | Domain 1 | IAA frontend that allows managing a participant's local users, assigning roles to users and assigning identity attributes to roles. |
|
| B | Onboarding | Domain 1 | IAA frontend that allows Participant and Governance Authority representatives to manage the onboarding requests of new participants in the Governance Authority agent. |
|
| C | Security Attributes Provider | Domain 1 | IAA frontend component that allows a Governance Authority representative to manage the security attributes of a data space. |
|
| D | Identity Provider | Domain 1 | Allows the Governance Authority representative to manage participants and their credentials. |
|
| E | Participant Utility | Domain 1 | IAA frontend that allows the management of the credentials of the Participant Agent. It allows a Participant Representative to generate keypairs and install credentials to complete the onboarding process. |
|
| F | Catalogue Client Application | Domain 2 | The Catalogue Client Application is the primary interface through which users interact with the Catalogue. It presents search fields and options to users, which in the case of advanced search are defined by the schema. |
Quick Search: Enter one or more search terms into the search bar.
Click on "Search" to receive the results.
Advanced Search: Select the schema to search for.
Fill out the properties that you want to search on.
Click on "Search" to receive the results.
Data Consumption: Search for a valid document (see above).
Click on the "More details" button to enable the "Request resource" button. Click on the "Request resource" button. A contract offer will appear after a short loading period.
Clicking "Decline" will close the modal. Clicking "Accept" will start the contract negotiation and redirect to the contract negotiation status page.
The page refreshes automatically every 3 seconds to retrieve a new status until the status is "FINALIZED", after which auto-refresh stops. You can also refresh the page manually to update the status. When the status is "FINALIZED", the "Start Transfer" button appears. Clicking "Start Transfer" opens a modal that displays the required data destination fields, depending on the resource type. For data offerings, this form will pop up:
Fill out the fields one by one, then scroll down to the bottom of the form and click the "Start Transfer" button:
You'll be redirected to the transfer process status page:
The page refreshes automatically every 3 seconds until the "DEPROVISIONED" or "TERMINATED" state is reached. The page can also be refreshed manually. |
| G | SD Tooling | Domain 2 | Frontend with the forms for the provider to create Self-Descriptions. Written in Angular and NodeJS. The result is an SD in the form of a JSON-LD document that can be uploaded to the catalogue. |
Select Schema for the SD to create
Fill out the generated form with all mandatory properties
Publish the SD to the catalogue on the Governance Authority
|
| H | Schema Management UI | Domain 2 | N/A - not part of the current release. | |
| I | Vocabulary Management UI | Domain 2 | N/A - not part of the current release. | |
| J | Infrastructure Deployment Script Management UI | Domain 2 | User Interface for adding and removing (invalidating) the Deployment Scripts, that can provision infrastructure resources and/or deploy applications. The UI also allows the addition of Post-Configuration script associated with a Deployment Script. |
|
| K | Orchestration Management UI | Domain 2 | UI layer provided by Dagster that allows workflows to be managed. |
|
| L | Infrastructure Deployment Script Management UI | Domain 2 | User Interface for managing templates for deployment scripts. A template is defined for a specific cloud environment and a specific deployment script is generated from such a template. |
|
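The status polling described for the Catalogue Client Application (row F) can be sketched as a generic loop. The 3-second interval and the terminal states "FINALIZED", "DEPROVISIONED" and "TERMINATED" come from the table; the function name `poll_until`, the `max_polls` safety limit and the intermediate state names used in the usage note are assumptions.

```python
import time


def poll_until(fetch_status, terminal_states, interval_seconds=3.0,
               max_polls=100, sleep=time.sleep):
    """Call fetch_status repeatedly until a terminal state is observed.

    Returns the list of observed statuses. The sleep function is injectable
    so the loop can be exercised without real waiting.
    """
    seen = []
    for _ in range(max_polls):
        status = fetch_status()
        seen.append(status)
        if status in terminal_states:
            return seen  # stop auto-refresh once a terminal state is reached
        sleep(interval_seconds)
    raise TimeoutError("no terminal state reached within max_polls")
```

A contract negotiation page would call it with `{"FINALIZED"}` as terminal states, while a transfer process page would use `{"DEPROVISIONED", "TERMINATED"}`; intermediate values such as "REQUESTED" are placeholders here.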
The following table presents a mapping between the components from the functional architecture and the ones from the application architecture.
| Functional Component | Application Component |
|---|---|
| Onboarding | Onboarding |
| IAA | Authorisation |
| | Tier 1 Authentication Provider |
| | Tier 2 Authentication Provider |
| | Identity Provider |
| | Security Attributes Provider |
| | User & Roles |
| | Credential Database/Vault |
| Vocabulary Management | Vocabulary Management |
| Schema Management | Schema Management |
| | Schema Registry |
| Service Offering Editor | SD Tooling |
| | Signer Service |
| | Wallet |
| | Policy Template Datastore |
| Federated Catalogue | Catalogue |
| Search | Catalogue Client Application |
| Data Space Connector | Connector |
| Contract Management | Contract Manager Orchestrator |
| | Contract Manager Backend |
| | Contract Template Datastore |
| Data Transfer | Connector |
| Infrastructure Management | Triggering Module |
| | Infrastructure Provisioner |
| Observability | Monitoring Module |
Simpl-Open Data Architecture presents data entities and/or collections and how they are structured within the system.
Given that Simpl-Open combines existing/reusable open-source components and custom-built components, the following approach is followed:
For open-source components, the dedicated sub-section provides a link to the available data model documentation of the component.
For custom components, the dedicated sub-section describes the data model per component (as per the microservices approach) through the following layers:
| Layer | Description |
|---|---|
| Conceptual | The conceptual data model (CDM) operates at a high level, providing an overarching perspective on the application's data needs. It defines a broad and simplified view of the data to create a shared understanding of the application by capturing the essential concepts. These essential concepts are captured in an Entity Relationship Diagram (ERD) and the accompanying entity definitions. |
| Logical | The logical data model (LDM) contains representations that fully define the relationships in the data, adding the details and structure of essential entities. The LDM remains data platform agnostic because it focuses on business needs, flexibility and portability. The LDM includes the specific attributes of each entity, the relationships between entities and the cardinality of those relationships. |
| Physical | The physical data model (PDM) is a data model that represents relational data objects. It describes the technology-specific and database-specific implementation of the data model and is the last step in transforming from the logical data model to a working database. The physical data model includes all the needed physical details to build the database. |
| OSS | Data Model |
|---|---|
| XFSC Signer | The self-description is wrapped into a verifiable credential (VC) and the proof section of the VC contains the signature. The data model is defined here: https://www.w3.org/TR/vc-data-model/#proofs-signatures |
| XFSC catalogue | XFSC catalogue stores the data in three different ways: |
| OpenBao | Secrets data is stored in a secret engine: https://openbao.org/docs/internals/architecture/ . The data model depends on the secret engine used; currently a Key-Value (KV) store data model is used. |
| Keycloak | Keycloak uses a "code first" approach to data modelling. There are no data model diagrams available in its documentation, but the data model is described in its code repository: https://github.com/keycloak/keycloak/tree/main/model |
| EJBCA | The EJBCA data model diagram is located at https://doc.primekey.com/ejbca/ejbca-introduction/ejbca-architecture/internal-architecture |
| Crossplane | Resource Definition: https://docs.crossplane.io/latest/concepts/composite-resource-definitions/ |
| OpenTofu | Resource Definition: |
| Kubernetes | Kubernetes objects: https://kubernetes.io/docs/concepts/overview/working-with-objects/ |
| Ansible | Ansible Data Manipulation: https://docs.ansible.com/ansible/latest/playbook_guide/complex_data_manipulation.html |
| ArgoCD | RBAC Model: https://argo-cd.readthedocs.io/en/stable/operator-manual/rbac/#rbac-model-structure |
Conceptual data model of components from domain 1. Please refer to Domain 1 Logical Data Model for a complete description of entities and their fields.
Handles the onboarding of a new participant in the Data Space.
| Entity | Description |
|---|---|
| Participant Type | A type of participant in the dataspace. It can be a consumer, an application provider, a data provider, or an infrastructure provider. |
| Onboarding Applicant | An applicant representing an organisation that seeks to join a dataspace. |
| Onboarding Procedure Template | A template that defines, for each participant type, the data that must be provided by an applicant (see User Roles) to complete the onboarding process. |
| Document Template | A component of an onboarding procedure template. It defines the document that must be uploaded as part of an onboarding request. |
| MIME Type | The MIME type associated with a document template. |
| Onboarding Request | An instance of an onboarding procedure template, created by an applicant. The request can change status based on actions taken by the applicant (e.g. submission) or by governance authority representatives (e.g. rejection, approval, or request for review). |
| Document | An instance of a document linked to an onboarding request, uploaded by an applicant (see User Roles). |
| Comment | A comment that can be added to an onboarding request by either an applicant or a governance authority representative (e.g. Notary). |
| Identity Attribute | A Tier 2 identity attribute used within the dataspace. |
| Validation Rule | A definition containing the parameters required to validate a document uploaded by an applicant. Validation rules can be combined into hierarchical structures. |
| Validation Rule Execution | A record of the outcome of a validation performed on a document uploaded by an applicant. |
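The statement that validation rules can be combined into hierarchical structures can be illustrated with a composite pattern. This is a sketch under assumptions: the class names (`Rule`, `AllOf`, `AnyOf`), the document fields (`name`, `mime_type`, `size_bytes`) and the shape of the execution record are hypothetical and are not taken from the Onboarding data model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Rule:
    """Leaf rule: a named predicate over document metadata."""
    name: str
    check: Callable[[dict], bool]

    def evaluate(self, document: dict) -> bool:
        return self.check(document)


@dataclass
class AllOf:
    """Composite rule that passes only if every child rule passes."""
    name: str
    children: list = field(default_factory=list)

    def evaluate(self, document: dict) -> bool:
        return all(child.evaluate(document) for child in self.children)


@dataclass
class AnyOf:
    """Composite rule that passes if at least one child rule passes."""
    name: str
    children: list = field(default_factory=list)

    def evaluate(self, document: dict) -> bool:
        return any(child.evaluate(document) for child in self.children)


def execute(rule, document: dict) -> dict:
    """Produce a record comparable to a Validation Rule Execution."""
    return {
        "rule": rule.name,
        "document": document["name"],
        "passed": rule.evaluate(document),
    }


# Hypothetical rule hierarchy: (PDF or ZIP) and at most 5 MiB.
is_pdf = Rule("is-pdf", lambda d: d["mime_type"] == "application/pdf")
is_zip = Rule("is-zip", lambda d: d["mime_type"] == "application/zip")
max_size = Rule("max-5-mib", lambda d: d["size_bytes"] <= 5 * 1024 * 1024)
upload_rule = AllOf("acceptable-upload", [AnyOf("allowed-type", [is_pdf, is_zip]), max_size])
```

Because composites and leaves share the same `evaluate` method, rule trees of arbitrary depth can be evaluated and recorded in the same way.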
Microservice that maps Tier 1 roles to Tier 2 security attributes.
| Entity | Description |
|---|---|
| Identity Attribute | A Tier 2 identity attribute used within the dataspace. It can be assigned to one or more Tier 1 roles. |
| Role | A Tier 1 role assigned to a user within the Simpl-Open agent. It can have one or more Tier 2 identity attributes assigned. |
| User | A Simpl-Open end-user that uses the agent's functionalities. |
| Roles Request | Created by an end-user to request specific roles and access the agent's functionalities. |
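The many-to-many relationship between Tier 1 roles and Tier 2 identity attributes can be sketched as two lookups. The role names, attribute codes and user name below are invented for illustration, as is the helper `identity_attributes_for`.

```python
# Hypothetical data: Tier 1 role -> Tier 2 identity attributes assigned to it.
ROLE_ATTRIBUTES = {
    "CATALOGUE_READER": {"SEARCH"},
    "CONTRACT_SIGNER": {"SEARCH", "CONTRACT_NEGOTIATION"},
}

# Hypothetical data: user -> Tier 1 roles assigned within the agent.
USER_ROLES = {
    "alice": {"CATALOGUE_READER", "CONTRACT_SIGNER"},
}


def identity_attributes_for(user: str, user_roles: dict, role_attributes: dict) -> set:
    """Return the union of identity attributes reachable through the user's roles."""
    attributes = set()
    for role in user_roles.get(user, set()):
        attributes |= role_attributes.get(role, set())
    return attributes
```

An identity attribute assigned to several roles (such as `SEARCH` here) appears only once in the result, and a user without roles resolves to an empty set.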
Microservice that provides Ephemeral Proofs to onboarded Data Space participants. It’s the core of Dynamic Attribute Provisioning. Deployed only by the Data Space Governance Authority.
| Entity | Description |
|---|---|
| Participant | An onboarded Data Space participant. |
| Identity Attribute | A Tier 2 identity attribute used within the dataspace. |
Microservice that handles the credentials for each Data Space participant. Deployed only by the Data Space Governance Authority.
| Entity | Description |
|---|---|
| Participant | An onboarded Data Space participant, along with the information needed to issue a credential. |
| Credential | A credential (currently an X.509 certificate) signed by the Governance Authority and later provided to the participant (see Credential in Authentication Provider component). |
| Auto Renewal Defaults | The default auto-renewal configurations of the dataspace. |
| Participant Auto Renewal | The credential auto-renewal configurations of the specific participant. |
| Auto Renewal Errors | Errors that may arise during credential auto renewal of a participant. |
| Entity | Description |
|---|---|
| KeyPair | A KeyPair (public and private) linked to the participant's credential. |
| Credential | A credential issued by the Governance Authority to the participant. The participant uses it to communicate with other participants. |
| Private Key | The private key content related to the keypair. |
| Participant | The information of the participant owning the agent. |
| Identity Attribute | A local copy of the dataspace identity attributes. |
| Auto Renewal Config | The agent's credential auto-renewal configuration. |
| Credential Sync Error | Execution errors that may happen during credential synchronisation with the Governance Authority. |
| Entity | Description |
|---|---|
| Infrastructure Provider | Represents the signed agreement between a provider and a consumer. |
Handles the storage and management of deployment scripts by infrastructure providers for provisioning infrastructure instances and applications.
| Entity | Description |
|---|---|
| Infrastructure Provider | Represents the company that offers infrastructure deployment services. |
| Deployment Script | Represents the deployment script uploaded by the provider to enable provisioning of infrastructure instances and applications. |
| Script Trigger | Represents a provisioning request for a deployment script. |
| Script Identify | Represents metadata about deployment scripts, including their hash for integrity checks. |
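An integrity check based on a stored script hash can be sketched as follows. The source does not state which hash algorithm is used; SHA-256 is an assumption here, and the function names `script_hash` and `verify_script` are hypothetical.

```python
import hashlib
import hmac


def script_hash(content: bytes) -> str:
    """Compute the hex digest stored alongside the deployment script (SHA-256 assumed)."""
    return hashlib.sha256(content).hexdigest()


def verify_script(content: bytes, expected_hash: str) -> bool:
    """Check that the script content still matches the recorded hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(script_hash(content), expected_hash)
```

A provisioning request would then be rejected whenever `verify_script` returns `False`, which indicates that the script content changed after its hash was recorded.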
It manages the storage of information related to schemas, including their versions, associated metadata, and the events related to the publication and revocation of a schema.
Handles the onboarding of a new participant in the Data Space.
Entity Descriptions and Attributes
MimeType
Description: Represents the allowed MIME types for onboarding request documents.
Attributes
id: the identifier of the MIME type.
description : A human-readable text that describes the MIME type (e.g. “pdf”, “zip”).
value : The actual MIME type value following the RFC6838 (e.g. “application/pdf”, “application/zip”).
ParticipantType
Description: The participant type is related to an onboarding procedure template.
Attributes
id: The identifier of the participant type.
value : The code of the participant type.
label: A human-readable name for the participant type.
OnboardingProcedureTemplate
Description : The template of an onboarding procedure. Along with the document template, it defines the information that has to be filled out by the applicant.
Attributes
id: The template identifier
description: A brief description of the onboarding procedure template (e.g. “The role of this participant in the dataspace is …”).
participant_type_id: The participant type the onboarding procedure template refers to. References the ParticipantType entity.
expiration_timeframe : An expiration timeframe after which the onboarding request is considered rejected, expressed in seconds.
expiration_timeframe_timeunit : The time unit for the expiration timeframe (HOUR, DAY, YEAR).
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
OnboardingProcedureTemplateIdentityAttribute
Description: The mapping between the onboarding procedure template and the dataspace identity attributes.
Attributes:
onboarding_procedure_template_id : The identifier of the onboarding procedure template. References the OnboardingProcedureTemplate entity.
identity_attribute_code : the code of the identity attribute mapped to the onboarding procedure template.
DocumentTemplate
Description: The information related to a document that has to be uploaded.
Attributes:
id : The template identifier.
name: The short name of the document template.
description : A brief description of the requested document (e.g. “Business License”, “Proof of Identity”).
mandatory : Specifies if the document template has to be provided or is optional. Defaults to true (mandatory).
mime_type_id : The document mime type. References the MimeType entity.
OnboardingApplicant
Description: The information regarding the applicant who opens an onboarding request.
Attributes:
id: The identifier of the applicant
username: The username of the user (same as the one in Keycloak).
firstname: User’s first name.
lastname: User’s last name.
OnboardingRequest
Description: An instance of an onboarding request created by an applicant.
Attributes:
id : the identifier of the onboarding request
onboarding_procedure_template_id : The onboarding procedure template that the onboarding request refers to. References the OnboardingProcedureTemplate entity.
onboarding_status_id : The status of the onboarding request. References the OnboardingStatus entity.
expiration_timeframe: An expiration timeframe after which the onboarding request is considered rejected.
expiration_timeframe_timeunit : The time unit for the expiration timeframe (HOUR, DAY, YEAR)
participant_type_id : The participant type, copied from the related onboarding procedure template. References the ParticipantType entity.
participant_id : The participant’s identifier. Populated when the onboarding request is approved and the participant is created.
rejection_cause : The text explaining why the request is rejected.
onboarding_applicant_id: The identifier of the applicant representative that created the onboarding request. References the OnboardingApplicant entity.
organization : The name of the organisation that opened this onboarding request through the applicant representative.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
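The expiry rule implied by expiration_timeframe and expiration_timeframe_timeunit can be sketched as follows. This is a minimal illustration, not the Onboarding service's implementation. It assumes that the timeframe counts units of expiration_timeframe_timeunit, that it is measured from creation_timestamp, and that YEAR can be approximated as 365 days. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping from the documented time units to durations.
# YEAR as 365 days is an approximation chosen for this sketch.
UNIT_TO_DELTA = {
    "HOUR": timedelta(hours=1),
    "DAY": timedelta(days=1),
    "YEAR": timedelta(days=365),
}


def expiration_instant(creation_timestamp, expiration_timeframe, timeunit):
    """Return the instant after which the request would count as rejected."""
    return creation_timestamp + expiration_timeframe * UNIT_TO_DELTA[timeunit]


def is_expired(creation_timestamp, expiration_timeframe, timeunit, now):
    """True if 'now' is past the computed expiration instant."""
    return now > expiration_instant(creation_timestamp, expiration_timeframe, timeunit)


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
deadline = expiration_instant(created, 30, "DAY")
```

A background job could call is_expired for every open request and move the expired ones to REJECTED.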
OnboardingRequestIdentityAttribute
Description: The mapping between the onboarding request and the dataspace identity attributes.
Attributes:
onboarding_request_id : The identifier of the onboarding request. References the OnboardingRequest entity.
identity_attribute_code : The code of the identity attribute mapped to the onboarding request.
Document
Description: The document uploaded by an applicant to complete the onboarding request.
Attributes:
id: the identifier of the document.
description : A brief description of the requested document (e.g. “Business License”, “Proof of Identity”)
document_template_id : The document template in the onboarding procedure to which this document refers. References the DocumentTemplate entity. It can be null if the document is requested during the onboarding of the applicant participant.
onboarding_request_id : The identifier of the onboarding request. References the OnboardingRequest entity.
mime_type_id : The document type. References the MimeType entity.
content : The actual content of the document uploaded by the applicant dataspace participant during the request creation or editing. If null, it means that the document has not been uploaded yet by the applicant dataspace participant.
fileSize : The size of the uploaded file.
filename : The name of the uploaded file.
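The relationship between DocumentTemplate.mandatory and Document.content suggests a simple completeness check before a request is submitted. The sketch below is an assumption about how such a check might look. The class shapes are reduced to the attributes needed here and the function name is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DocumentTemplate:
    id: int
    name: str
    mandatory: bool = True  # documented default


@dataclass
class Document:
    document_template_id: Optional[int]  # may be null per the description above
    content: Optional[bytes]             # None means "not uploaded yet"


def missing_mandatory(templates: List[DocumentTemplate],
                      documents: List[Document]) -> List[str]:
    """Return the names of mandatory templates that have no uploaded content."""
    uploaded = {d.document_template_id for d in documents if d.content is not None}
    return [t.name for t in templates if t.mandatory and t.id not in uploaded]


templates = [
    DocumentTemplate(1, "Business License"),
    DocumentTemplate(2, "Proof of Identity"),
    DocumentTemplate(3, "Cover Letter", mandatory=False),
]
documents = [Document(1, b"%PDF-"), Document(2, None)]
missing = missing_mandatory(templates, documents)
```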
Comment
Description: Comments inserted by the actors involved in the onboarding process.
Attributes:
id: The identifier of the comment.
onboarding_request_id : The identifier of the onboarding request to which the comment belongs. References the OnboardingRequest entity.
author : The author of the comment. It’s the username stored in Keycloak.
content : The comment written by the author.
OnboardingStatus
Description: Supporting table containing the status values (APPROVED, IN PROGRESS, IN REVIEW, REJECTED, EVALUATING).
Attributes:
id: The id of the status.
value : The actual status of an onboarding request.
label : A human-readable label for the status.
EventLog
Description: Registers business events related to an onboarding request (Comment Inserted, Onboarding Request Status Change).
Attributes:
id : The identifier of the event.
onboarding_request_id : The identifier of the related onboarding request. References the OnboardingRequest entity.
initiator_user_id : The identifier of the user that caused the event.
initiator_service : The identifier of the component or service that caused the event (e.g. background service monitoring stale onboarding request).
event_type : Type of event (e.g. COMMENT_INSERTED, STATUS_CHANGED).
event_details : Additional JSON metadata that contains details about the event (e.g. new state).
entity_id : The id of the entity related to the event (e.g. the id of the comment if the event type is COMMENT_INSERTED).
creation_timestamp : The creation timestamp of the event.
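As an illustration of how an EventLog entry could be assembled for a status change, consider the sketch below. The keys inside event_details and the choice of entity_id are assumptions: the description above only states that event_details holds JSON metadata such as the new state. The function name is hypothetical.

```python
import json
from datetime import datetime, timezone


def status_changed_event(onboarding_request_id, old_status, new_status,
                         initiator_user_id=None, initiator_service=None):
    """Build an EventLog-like record for a status change (illustrative only)."""
    return {
        "onboarding_request_id": onboarding_request_id,
        "initiator_user_id": initiator_user_id,
        "initiator_service": initiator_service,
        "event_type": "STATUS_CHANGED",
        # Assumed payload shape; not taken from the specification.
        "event_details": json.dumps({"old_state": old_status, "new_state": new_status}),
        # Assumption: for a status change the related entity is the request itself.
        "entity_id": onboarding_request_id,
        "creation_timestamp": datetime.now(timezone.utc).isoformat(),
    }


event = status_changed_event("req-1", "IN REVIEW", "APPROVED", initiator_user_id="user-42")
```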
ValidationRule
Description: Validation rule context used to validate documents uploaded by the applicants.
Attributes:
id : The identifier of the validation rule.
name : The short name of the validation rule.
description : A detailed description of the validation rule.
document_template_id: The identifier of the document template to which the rule applies. References the DocumentTemplate entity.
onboarding_procedure_template_id : The onboarding procedure template where the rule has been created. References the OnboardingProcedureTemplate entity.
valid_since : The date from which the rule becomes valid and must be evaluated.
valid_to : The date until which the rule remains valid and must be evaluated.
active: Boolean parameter indicating if the rule is active. An inactive rule is not evaluated.
type: the validation rule type (CONTENT_CHECK, PRESENCE, COMPOSITE).
auto_approval: Flag indicating that if the rule passes, the related onboarding request is automatically approved.
required: Flag indicating that if the rule does not pass, the related onboarding request is automatically rejected.
content_validation_rule: When the type is CONTENT_CHECK, it contains the URL to an external validation service.
strategy : The evaluation strategy for a composite rule. ALL indicates that all child rules must be valid. AT_LEAST_ONE indicates that only one rule needs to be valid.
parent_id : The id of the parent COMPOSITE rule, if the rule is a child rule. References the ValidationRule entity.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
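The ALL and AT_LEAST_ONE strategies of a COMPOSITE rule can be illustrated with a small recursive evaluator. This is a sketch under stated assumptions rather than the service's actual logic: child rules are held in memory instead of being resolved through parent_id, leaf outcomes are supplied by the caller, and inactive children are skipped because an inactive rule is not evaluated.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Rule:
    name: str
    type: str                       # CONTENT_CHECK, PRESENCE or COMPOSITE
    active: bool = True
    strategy: Optional[str] = None  # ALL or AT_LEAST_ONE, COMPOSITE only
    children: List["Rule"] = field(default_factory=list)


def evaluate(rule: Rule, leaf_results: Dict[str, bool]) -> bool:
    """Evaluate a rule; leaf_results maps a leaf rule name to its outcome."""
    if rule.type != "COMPOSITE":
        return leaf_results[rule.name]
    outcomes = [evaluate(c, leaf_results) for c in rule.children if c.active]
    if rule.strategy == "ALL":
        return all(outcomes)
    if rule.strategy == "AT_LEAST_ONE":
        return any(outcomes)
    raise ValueError(f"unknown strategy: {rule.strategy!r}")


licence = Rule("licence_present", "PRESENCE")
content = Rule("licence_content", "CONTENT_CHECK")
results = {"licence_present": True, "licence_content": False}
all_rule = Rule("both", "COMPOSITE", strategy="ALL", children=[licence, content])
any_rule = Rule("either", "COMPOSITE", strategy="AT_LEAST_ONE", children=[licence, content])
```

Composite rules may nest, since a child can itself be of type COMPOSITE.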
ValidationRuleExecution
Description: The rule execution outcome of a rule evaluated against an actual document uploaded by the applicant.
Attributes:
id: The identifier of the execution.
validation_rule_id: The identifier of the validation rule used for this execution. References the ValidationRule entity.
document_id: The ID of the document on which the validation was performed. References the Document entity.
onboarding_request_id: The onboarding request this execution is related to. References the OnboardingRequest entity.
execution_start_date : The start date and time of the rule execution.
execution_end_date : The end date and time of the rule execution.
status : The outcome of the validation (SUCCESS, IN PROGRESS, ERROR, FAULT).
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
ValidationRuleRemark
Description: A validation rule remark. Registers the result of the external validation service when a content check rule fails.
Attributes:
id: The identifier of the remark.
execution_id: The identifier of the validation rule execution to which the remark refers. References the ValidationRuleExecution entity.
jsonb: An unstructured field containing the remark.
Microservice that maps tier 1 roles to tier 2 security attributes.
Entity Descriptions and Attributes
IdentityAttributeRole
Description: The mapping between the tier 1 role and the assignable tier 2 identity attributes.
Attributes:
id : The unique ID of the attribute → role association.
ida_code: The unique identity attribute code.
role_name : The role name mapped to the attribute. The role name references a role defined inside the tier1 authentication provider.
enabled : Flag indicating if the association between the identity attribute and the role is valid.
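The role-to-attribute mapping held in IdentityAttributeRole can be sketched as a lookup from a user's tier 1 roles to the tier 2 identity attribute codes they carry. This is a minimal illustration. The function name and the sample codes are hypothetical, and only enabled associations are considered.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class IdentityAttributeRole:
    ida_code: str
    role_name: str
    enabled: bool = True


def attributes_for_roles(user_roles: Iterable[str],
                         associations: Iterable[IdentityAttributeRole]) -> List[str]:
    """Return the sorted identity attribute codes reachable from the given roles."""
    roles = set(user_roles)
    return sorted({a.ida_code for a in associations
                   if a.enabled and a.role_name in roles})


associations = [
    IdentityAttributeRole("CATALOGUE_READ", "researcher"),   # hypothetical codes
    IdentityAttributeRole("CONTRACT_SIGN", "legal"),
    IdentityAttributeRole("DATA_PUBLISH", "researcher", enabled=False),
]
codes = attributes_for_roles(["researcher"], associations)
```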
RoleRequest
Description: The role request created by an end-user to request a role in the agent.
Attributes :
id: The unique ID of the role request.
user_email: The email of the user who requested the roles.
status: The status of the request (open, cancelled, approved, rejected).
reviewed_by: The id of the user that reviewed the request.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
RoleRequested
Description: A specific role linked to a role request.
Attributes :
id: The unique ID of the requested role.
role_request_id: The id of the parent role request.
role: The code of the requested role.
requested_by: Either the id of the end-user that requested the role or the id of the approver that added the role to the existing role request.
approved: Indicates whether the role was included when the parent role request was accepted.
requested_timestamp : The request timestamp.
Role
Description: An End-User Simpl-Open role
Attributes:
id: The unique ID of the role.
code: The role code.
name: The role’s human-readable name.
description: The role’s human-readable description.
builtin: Boolean indicating if the role is built-in (default) for Simpl-Open.
enabled: Boolean indicating if the role can be assigned to a user and can be included in their session after authentication.
Microservice that provides Ephemeral Proofs to onboarded Dataspace Participants. It’s the core of the Dynamic Attribute Provisioning approach. Deployed only by the Data Space Governance Authority.
Entity Descriptions and Attributes
IdentityAttribute
Description: The complete list of all the identity attributes of the Data Space.
Attributes:
id : The unique ID of the identity attribute.
code : The identity attribute unique code. This is the actual identifier that is used to enforce authorisation across participants.
name : The human-readable identity attribute name.
description : The description of the identity attribute.
assignable_to_roles : Flag indicating if the identity attribute can be assigned to a role.
enabled : True if the identity attribute is enabled for this participant.
is_right : Flag indicating if the identity attribute is considered a legal right.
built_in : Boolean indicating that the identity attribute is built-in (installed with the agent and not modifiable).
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
ParticipantIdentityAttribute
Description: Maps a participant to its identity attributes.
Attributes:
participant_id: The identifier of the participant associated with the identity attribute.
identity_attribute_id : The identifier of the identity attribute. References the IdentityAttribute entity.
Microservice that handles the credentials for each dataspace participant. Deployed only by the Data Space Governance Authority.
Entity Descriptions and Attributes
Participant
Description: Contains the information of the dataspace participants.
Attributes:
id : The unique ID of the participant.
organization : The organisation name of the participant.
participant_type : The type of the participant (CONSUMER, DATA PROVIDER, INFRASTRUCTURE PROVIDER, APPLICATION PROVIDER).
certificate_signing_request_content : The content of the CSR needed to issue a credential to the participant
tier1_public_key_content : Contains the tier 1 public key (Keycloak public key) used by the participant Keycloak to sign user tier1 JWTs.
active_credential_id : The id of the participant’s active credential. References the Credential entity.
applicant_email : The email of the applicant responsible for the onboarding procedure of the participant
is_authority : Boolean indicating that the participant is the Governance Authority of the data space.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
renewal_request_timestamp : The timestamp of the renewal request issuance by the participant.
Credential
Description: Metadata of the credential stored in the credential factory component (EJBCA)
Attributes:
id : The unique ID of the credential
participant_id : The id of the participant owning the credential. References the Participant entity.
credential_type: Type of the credential. Currently only x509 credentials are supported.
certificate_authority : The certificate authority name of the credential factory component (EJBCA)
serial: The serial number of the credential in the credential factory component (EJBCA)
credential_id: The id of the credential inside the credential factory component (EJBCA)
expiry_date: The expiration date of the credential.
AutoRenewalDefault
Description: the default auto-renewal configuration for the data space.
Attributes:
id: the id of the auto-renewal configuration.
days_before_expiry: the number of days prior to the credential’s expiration at which the auto-renewal process is triggered by the scheduled job.
modified_by_user: indicates if the default installation configuration has been overwritten by a user of the Governance Authority.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
AutoRenewalParticipant
Description: auto-renewal configurations that override the default ones for the participant
Attributes:
participant_id: the id of the participant.
days_before_expiry: the number of days prior to the credential’s expiration at which the auto-renewal process for the participant is triggered by the scheduled job.
boolean: indicates if the auto renewal is enabled for the participant.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
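How AutoRenewalDefault and AutoRenewalParticipant might combine in the scheduled job can be sketched as follows. This is an assumption-based illustration: it presumes that a participant-level days_before_expiry takes precedence when present, that the data space default applies otherwise, and that a disabled participant is never renewed. The function name is hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def renewal_due(expiry_date: date, today: date, default_days: int,
                participant_days: Optional[int] = None,
                participant_enabled: bool = True) -> bool:
    """True when the credential has entered its auto-renewal window."""
    if not participant_enabled:
        return False
    # Participant override wins; otherwise fall back to the data space default.
    days = participant_days if participant_days is not None else default_days
    return today >= expiry_date - timedelta(days=days)


expiry = date(2026, 1, 31)
due_with_default = renewal_due(expiry, date(2026, 1, 1), default_days=30)
due_with_override = renewal_due(expiry, date(2026, 1, 1), default_days=30,
                                participant_days=10)
```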
AutoRenewalError
Description: Stores the auto-renewal error details linked to each participant.
Attributes:
id: the id of the logged error.
participant_id: the id of the participant for whom the autorenewal has failed.
description: the description of the error.
creation_timestamp : The creation timestamp.
Microservice that manages the credentials and the tier2 authentication of a participant
Entity Descriptions and Attributes
KeyPair
Description: The keypair created or uploaded by an applicant representative to initiate the credential creation after the approval of an onboarding request.
Attributes:
id : The unique ID of the keypair.
name: The name of the KeyPair, inserted by the user upon creation.
active: Boolean indicating that the keypair is linked to an active credential.
public_key : The keypair public key content.
public_key_hash: The keypair public key hash.
certificate_signing_request: The content of the CSR, needed for requesting the issuance of a new credential linked to the keypair
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
PrivateKey
Description: The private key content related to the keypair.
Attributes:
id: The ID of the private key.
private_key: The private key encrypted content.
keypair_id: The keypair linked to this key. References the KeyPair entity.
creation_timestamp : The creation timestamp.
Credential
Description: The credential that allows tier2 communication of the participant in the dataspace. Can be empty if the configured credential storage is Hashicorp Vault.
Attributes:
id : The unique ID of the credential.
content : The credential (x509 certificate or foreseen SSI Verifiable Credential).
credential_id: The Base58 of the credential content.
issuance_date: The date and time when the credential was issued.
expiry_date: The expiry date and time of the credential.
keypair_id: the keypair linked to this credential. References the KeyPair entity.
creation_timestamp : The creation timestamp.
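Since credential_id is described as a Base58 value, a short encoder is sketched below for reference. It assumes the Bitcoin-style Base58 alphabet, which the source does not confirm, and it does not settle whether the input is the raw credential content or a digest of it.

```python
# Bitcoin-style Base58 alphabet (assumed; it omits 0, O, I and l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58, keeping leading zero bytes as '1' characters."""
    n = int.from_bytes(data, "big")
    encoded = ""
    while n > 0:
        n, rem = divmod(n, 58)
        encoded = ALPHABET[rem] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


sample = b58encode(b"a")
```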
Identity Attribute
Description: When used inside a participant agent, it contains a local copy of the identity attributes of the data space, in sync with the identity attributes provided by the governance authority.
Attributes:
id : The id of the identity attribute
code : The unique code identifying the identity attribute
name: The human readable identity attribute name.
description : The description of the identity attribute
assignable_to_roles : Boolean indicating if the identity attribute is assignable to roles
enable : Boolean indicating if the identity attribute is enabled or not
assigned_to_participant : Boolean indicating if the identity attribute is currently assigned to the participant.
creation_timestamp : The creation timestamp
update_timestamp : The update timestamp
ParticipantInfo
Description: the details of the participant. Needed to retrieve the basic information related to the organization owning the agent. Populated only after the onboarding process has completed.
Attributes:
id: the ID of the entry.
participant_id: the ID of the participant owning the agent.
organization: the name of the organisation of the participant owning the agent.
authority_creation_timestamp: the creation timestamp of the participant inside the governance authority.
authority_update_timestamp: the update timestamp of the participant inside the governance authority.
creation_timestamp: the creation timestamp.
update_timestamp: the update timestamp.
AutoRenewalConfig
Description : stores the auto-renewal config for the agent.
Attributes:
id: the ID of the entry.
enabled: indicates if the auto-renewal is enabled for the participant agent.
CredentialSyncExecutionError
Description: Logs the execution errors that may happen during credential synchronisation with the Governance Authority.
Attributes:
id: the ID of the entry.
execution_timestamp: the execution timestamp of the attempted credential synchronisation.
error_message: the error details.
Contract manager handles the integration between the connector and VC Issuer component, Signer component, and Wallet component.
In the current release, the Contract Manager stores contract agreement related data in a single table for two main purposes:
Establish the data persistence for billing purposes (future feature)
Demonstrate contract negotiation status
contract_agreements
contract_agreement_id: UUID of contract agreement issued by the Connector
contract_definition_id: ID of the contract definition
consumer_signature_date: Date and time of the consumer signature event
provider_signature_date: Date and time of the provider signature event
status: Status of contract negotiations
contract_negotiation_id: ID of the contract negotiation
asset_id: ID of the asset
provider_id: ID of the provider
consumer_id: ID of the consumer
contract_offer_id: ID of the contract offer
Handles the logical representation of the infrastructure provider’s deployment scripts and their relationships, ensuring efficient storage, retrieval, and management.
Entity Descriptions and Attributes
Configuration
Description : Represents a configuration file bound to a deployment script.
Attributes :
id: Unique identifier for the configuration.
file_name: Configuration name.
configuration: Instruction containing the configuration.
script_id: The deployment script that is bound to the configuration.
Deployment Script
Description : Stores details about deployment scripts used for infrastructure provisioning.
Attributes :
id : Unique identifier for the script.
title : Title of the script.
description : A short description of the script.
valid : Indicates if the script is valid.
creation_date : Date when the script was uploaded.
update_date : Last modification date.
cloud_provider_id : Links to the Infrastructure Provider table.
gitea_sha : Hash of the script in the repository.
location : Location in the repository.
content : Content of the deployment script.
file_name : Name of the file that was uploaded.
script_identify_id : Links to the Script Identify table.
Script Trigger
Description : Represents provisioning requests for deployment scripts.
Attributes :
id : Unique identifier for the provisioning request.
status : Status of the provisioning process (e.g., Received, Sent, Running).
resource_status : Status of the provisioned resource (e.g., Provisioning, Activated).
decommissioned_date_time : Decommissioning timestamp.
key_user : Credential retrieval key part 1.
key_vault : Credential retrieval key part 2.
requester_email : Email address of the requester.
provisioned_date_time : Provisioned timestamp.
volume_id : Id of the Virtual Machine’s storage.
datacenter_id : Id of the Datacenter where the Virtual Machine is running.
error_message : Error message related to the provisioning/decommissioning process.
requester_unique_id : Unique identifier for the resource request.
script_id : Links to the Deployment Script table.
Script Identify
Description : Stores metadata for deployment scripts, such as hashes for integrity verification.
Attributes :
id : Unique identifier for the metadata entry.
deployment_script_id : Links to the Deployment Script table.
hash : Hash of the deployment script.
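The integrity check enabled by the stored hash can be sketched as below. The hash algorithm is not stated in this section, so SHA-256 is an assumption, and the function names are hypothetical.

```python
import hashlib
import hmac


def script_hash(content: bytes) -> str:
    """Hex digest of a deployment script (SHA-256 assumed for this sketch)."""
    return hashlib.sha256(content).hexdigest()


def is_unmodified(content: bytes, stored_hash: str) -> bool:
    """Compare the recomputed hash with the stored one in constant time."""
    return hmac.compare_digest(script_hash(content), stored_hash)


original = b"terraform apply -auto-approve\n"
stored = script_hash(original)
```

A provisioning request could be refused whenever is_unmodified returns False for the script it refers to.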
Template :
Description : Stores VM template related data and references.
Attributes :
id : Primary Key, unique identifier for each Template.
cloud_environment_id : Reference to the Cloud Environment where the template will be running.
cloud_provisioner_template_id : Reference to the Cloud Provider Template file and description of VM limits.
name : Template name.
description : Template description.
cpu_core : The number of cores of the VM.
ram : The amount of memory (MB) to be assigned to the VM.
storage : The storage size (MB) of the VM.
os : Name of the operating system to be installed.
active : Indicates if the template is active (True) or inactive (False).
creation_date : Date when template was stored.
script_id : Reference to the deployment script created based on this template.
Component
Description : Represents a component that can be applied to a VM Template.
Attributes :
id : Unique identifier for the component.
name : Name of the component.
description : Description of the component.
content : Content of the component.
creation_date : Date when the component was created.
update_date : Date when the component was last updated.
active : Indicates whether the component is active.
Component Type
Description : Represents the types of components (VM Configuration, Post Configuration and Policies).
Attributes :
id : Unique identifier for the component type.
name : component type name.
Cloud Provider :
Description : Simple and basic attributes for a cloud provider.
Attributes :
id : Primary Key, unique identifier for each Provider
cloud_provider_name : Cloud Provider name.
Cloud Environment :
Description : The main characteristics for one of many environments a cloud provider provides.
Attributes :
id : Primary Key, unique identifier for each Cloud Environment
cloud_provider_id : Reference to Cloud Provider stored in the database (e.g. Ionos, AWS…).
environment_name : Environment name.
environment_description : Environment description.
iac : Infrastructure as Code technology to support the deployed resources.
datacenter_name : Name of the DataCenter where the resources will be provisioned.
datacenter_description : Description of the DataCenter.
location : Cloud Environment location identifier (e.g., us-east-1, europe-west3).
vault_path: Vault path where the cloud environment token is securely stored.
total_cpu_cores : The number of total cores available for the Cloud Environment
used_cpu_cores : The number of cores used by the Cloud Environment
total_ram : The total amount of memory available for the Cloud Environment
used_ram : RAM used by the Cloud Environment
total_storage : The total amount of storage available for the Cloud Environment
used_storage : Storage used by the Cloud Environment
vault_user : Vault identity or role with permissions to access the cloud environment token.
vault_key : Vault path or key where the cloud environment token is securely stored.
active : Indicates whether the Cloud Environment is active.
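The total and used counters on a Cloud Environment lend themselves to a capacity check before a new VM is provisioned. The sketch below is illustrative only. It assumes the requested values use the same units as the environment counters, and the function name is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CloudEnvironment:
    total_cpu_cores: int
    used_cpu_cores: int
    total_ram: int
    used_ram: int
    total_storage: int
    used_storage: int
    active: bool = True


def can_host(env: CloudEnvironment, cpu_core: int, ram: int, storage: int) -> bool:
    """True if the environment is active and has room for the requested VM."""
    if not env.active:
        return False
    return (env.used_cpu_cores + cpu_core <= env.total_cpu_cores
            and env.used_ram + ram <= env.total_ram
            and env.used_storage + storage <= env.total_storage)


env = CloudEnvironment(total_cpu_cores=16, used_cpu_cores=12,
                       total_ram=65536, used_ram=32768,
                       total_storage=1000000, used_storage=250000)
fits = can_host(env, cpu_core=4, ram=8192, storage=51200)
```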
Cloud Provisioner Template :
Description : The provisioner specific data to derive/create templates.
Attributes :
id : Primary Key, unique identifier for each Cloud Provisioner Template.
file_name : Template file title.
file : File content of the Cloud Provider Template (Terraform/Crossplane)
min_cpu_core : Minimum number of cores for the VM
max_cpu_core : Maximum number of cores for the VM
min_ram : Minimum RAM size (MB) for the VM
max_ram : Maximum RAM size (MB) for the VM
min_storage : Minimum storage size (GB) for the VM
max_storage : Maximum storage size (GB) for the VM
os : List of operating systems allowed for the creation of VMs
is_ovh : Flag to identify OVH templates
ovh_flavor : OVH flavor for the VM
ovh_project_id : OVH project id related to the environment of the VM
ovh_os_image_id : OVH OS image id for the VM
ovh_region : OVH region where the VM will be running
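The min and max attributes of a Cloud Provisioner Template bound the values a derived Template may take. A possible validation is sketched below. It is a sketch under assumptions: Template storage is given in megabytes while the provisioner limits are in gigabytes, as the attribute descriptions state, and 1 GB is taken as 1024 MB. The function name is hypothetical.

```python
def within_limits(template: dict, limits: dict) -> bool:
    """Check a Template against Cloud Provisioner Template limits.

    Both arguments are plain dicts keyed by the attribute names listed above.
    """
    storage_gb = template["storage"] / 1024  # assumption: 1 GB = 1024 MB
    return (limits["min_cpu_core"] <= template["cpu_core"] <= limits["max_cpu_core"]
            and limits["min_ram"] <= template["ram"] <= limits["max_ram"]
            and limits["min_storage"] <= storage_gb <= limits["max_storage"])


limits = {"min_cpu_core": 1, "max_cpu_core": 8,
          "min_ram": 1024, "max_ram": 32768,
          "min_storage": 10, "max_storage": 500}
template = {"cpu_core": 4, "ram": 8192, "storage": 51200}
valid = within_limits(template, limits)
```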
Handles the logical representation of schema and update events
Entity Descriptions and Attributes
Schema
Description : Stores details about schema info and publication status
Attributes :
id : Unique identifier for the schema.
creation_date : Creation date of schema
latest_version : Latest version available for the schema
metadata : Metadata info for the schema
name : Name of the schema as created on Schema Management Service GA side
resource_type : Type of resource the schema describes.
status : Publication status of schema
update_date : Last update date of specific schema
Schema Event
Description : Stores details about notification events related to the publication and revocation of schemas, produced by the Schema Management Service
Attributes :
id : Unique identifier for the schema event.
changelog : Changelog info associated to event
event_date : Date the event occurred
event_id : Event id generated by schema management service
event_type : Event type (PUBLISH | REVOKE)
info : Additional information associated with the event.
origin : System originating event
processing_status : Processing status of event
version : Schema version event refers to
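How a Schema Event might update the corresponding Schema record is sketched below. The status values PUBLISHED and REVOKED are hypothetical, since the section does not enumerate them, and the function name is an assumption as well. Records are plain dicts keyed by the attribute names above.

```python
def apply_event(schema: dict, event: dict) -> dict:
    """Return a copy of the schema record updated by a PUBLISH or REVOKE event."""
    updated = dict(schema)
    if event["event_type"] == "PUBLISH":
        updated["status"] = "PUBLISHED"            # hypothetical status value
        updated["latest_version"] = event["version"]
    elif event["event_type"] == "REVOKE":
        updated["status"] = "REVOKED"              # hypothetical status value
    else:
        raise ValueError(f"unsupported event type: {event['event_type']!r}")
    updated["update_date"] = event["event_date"]
    return updated


schema = {"id": 1, "name": "dataset-offering", "latest_version": "1.0",
          "status": "PUBLISHED", "update_date": "2025-01-01"}
event = {"event_type": "PUBLISH", "version": "1.1", "event_date": "2025-03-01"}
after_publish = apply_event(schema, event)
```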
Attributes labelled with “NN” are Not Null.
Contains the export of the physical data model of IAA Microservice. Please refer to the LDM - Domain 1 - Access Control & Trust for a description of entities and fields.
Postgres physical data model of the Onboarding service. It handles the onboarding of a new participant in the Data Space.
mime_type
id : The identifier of the MIME type.
value : A human-readable text that describes the MIME type (e.g. “pdf”, “zip”).
name : The actual MIME type value following the RFC6838 (e.g. “application/pdf”, “application/zip”).
participant_type
id : The identifier of the participant type.
value : The code of the participant type.
label : A human-readable name for the participant type.
onboarding_procedure_template
id : The template identifier.
description : A brief description of the onboarding procedure template (e.g. “The role of this participant in the dataspace is …”).
participant_type_id : The participant type the onboarding procedure template refers to. Foreign key to the participant_type table.
expiration_timeframe : An expiration timeframe after which the onboarding request is considered rejected, expressed in seconds.
expiration_timeframe_timeunit : The time unit for the expiration timeframe (HOUR, DAY, YEAR).
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
onboarding_procedure_template_identity_attribute
onboarding_procedure_template_id : The identifier of the onboarding procedure template. Foreign key to the onboarding_procedure_template table.
identity_attribute_code : The code of the identity attribute mapped to the onboarding procedure template.
document_template
id : The template identifier.
name : The short name of the document template.
description : A brief description of the requested document (e.g. “Business License”, “Proof of Identity”).
mandatory : Specifies if the document template has to be provided or is optional. Defaults to true.
mime_type_id : The document MIME type. Foreign key to the mime_type table.
onboarding_applicant
id : The identifier of the applicant.
username : The username of the user (same as the one in Keycloak).
firstname : User’s first name.
lastname : User’s last name.
onboarding_request
id : The identifier of the onboarding request.
onboarding_procedure_template_id : The onboarding procedure template that the onboarding request refers to. Foreign key to the onboarding_procedure_template table.
onboarding_status_id : The status of the onboarding request. Foreign key to the onboarding_status table.
expiration_timeframe : An expiration timeframe after which the onboarding request is considered rejected.
expiration_timeframe_timeunit : The time unit for the expiration timeframe (HOUR, DAY, YEAR).
participant_type_id : The participant type, copied from the related onboarding procedure template. Foreign key to the participant_type table.
participant_id : The participant’s identifier. Populated when the onboarding request is approved and the participant is created.
rejection_cause : The text explaining why the request is rejected.
onboarding_applicant_id : The identifier of the applicant representative that created the onboarding request. Foreign key to the onboarding_applicant table.
organization : The name of the organisation that opened this onboarding request through the applicant representative.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
onboarding_request_identity_attribute
onboarding_request_id : The identifier of the onboarding request. Foreign key to the onboarding_request table.
identity_attribute_code : The code of the identity attribute mapped to the onboarding request.
document
id : The identifier of the document.
description : A brief description of the requested document (e.g. “Business License”, “Proof of Identity”).
document_template_id : The document template in the onboarding procedure to which this document refers. Foreign key to the document_template table. Can be null if the document is requested during onboarding of the applicant participant.
onboarding_request_id : The identifier of the onboarding request. Foreign key to the onboarding_request entity.
mime_type_id : The document type. Foreign key to the mime_type table.
content : The actual content of the document uploaded by the applicant dataspace participant during the request creation or editing. Null if not uploaded yet.
fileSize : The size of the uploaded file.
filename : The name of the uploaded file.
comment
id : The identifier of the comment.
onboarding_request_id : The identifier of the onboarding request to which the comment belongs. Foreign key to the onboarding_request table.
author : The author of the comment. It’s the username stored in Keycloak.
content : The comment written by the author.
onboarding_status
id : The id of the status.
value : The actual status of an onboarding request.
label : A human-readable label for the status.
event_log
id : The identifier of the event.
onboarding_request_id : The identifier of the related onboarding request. Foreign key to the onboarding_request table.
initiator_user_id : The identifier of the user that caused the event.
initiator_service : The identifier of the component or service that caused the event (e.g. background service monitoring stale requests).
event_type : Type of event (e.g. COMMENT_INSERTED, STATUS_CHANGED).
event_details : Additional JSON metadata that contains details about the event (e.g. new state).
entity_id : The id of the entity related to the event (e.g. the id of the comment if event_type is COMMENT_INSERTED).
creation_timestamp : The creation timestamp of the event.
validation_rule
id : The identifier of the validation rule.
name : The short name of the validation rule.
description : A detailed description of the validation rule.
document_template_id : The identifier of the document template to which the rule applies. Foreign key to the document_template table.
onboarding_procedure_template_id : The onboarding procedure template where the rule has been created. Foreign key to the onboarding_procedure_template table.
valid_since : The date from which the rule becomes valid and must be evaluated.
valid_to : The date until which the rule remains valid and must be evaluated.
active : Boolean parameter indicating if the rule is active. An inactive rule is not evaluated.
type : The validation rule type (CONTENT_CHECK, PRESENCE, COMPOSITE).
auto_approval : Flag indicating that if the rule passes, the related onboarding request is automatically approved.
required : Flag indicating that if the rule does not pass, the related onboarding request is automatically rejected.
content_validation_rule : When the type is CONTENT_CHECK, contains the URL to an external validation service.
strategy : The evaluation strategy for a composite rule. ALL means all child rules must be valid; AT_LEAST_ONE means only one rule must be valid.
parent_id : The id of the parent COMPOSITE rule if the rule is a child rule. Foreign key to the validation_rule table.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
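The interplay of active, valid_since, valid_to, type and strategy can be sketched as follows. This is an illustration under stated assumptions, not the Onboarding Manager implementation: the `check` callable is a hypothetical stand-in for the real check of a leaf rule, child rules are held in memory instead of being resolved through parent_id, and child rules that are not evaluated are assumed to be ignored by their parent.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, List, Optional


@dataclass
class ValidationRule:
    name: str
    type: str                        # CONTENT_CHECK, PRESENCE or COMPOSITE
    active: bool = True
    valid_since: Optional[date] = None
    valid_to: Optional[date] = None
    strategy: Optional[str] = None   # ALL or AT_LEAST_ONE (COMPOSITE only)
    children: List["ValidationRule"] = field(default_factory=list)
    # Hypothetical hook standing in for the real check of a leaf rule.
    check: Optional[Callable[[], bool]] = None


def is_applicable(rule, today):
    """A rule is evaluated only when active and inside its validity window."""
    if not rule.active:
        return False
    if rule.valid_since and today < rule.valid_since:
        return False
    if rule.valid_to and today > rule.valid_to:
        return False
    return True


def evaluate(rule, today):
    """Return True or False, or None when the rule is not evaluated."""
    if not is_applicable(rule, today):
        return None
    if rule.type != "COMPOSITE":
        return bool(rule.check())
    # Assumption: children that are not evaluated do not count either way.
    results = [r for r in (evaluate(c, today) for c in rule.children) if r is not None]
    if rule.strategy == "ALL":
        return all(results)
    if rule.strategy == "AT_LEAST_ONE":
        return any(results)
    raise ValueError(f"unknown strategy: {rule.strategy!r}")
```

The auto_approval and required flags would then be applied to the result by the caller: a passing rule with auto_approval approves the request, and a failing rule with required rejects it.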
validation_rule_execution
id : The identifier of the execution.
validation_rule_id : The identifier of the validation rule used for this execution. Foreign key to the validation_rule table.
document_id : The ID of the document on which the validation was performed. Foreign key to the document table.
onboarding_request_id : The onboarding request this execution is related to. Foreign key to the onboarding_request table.
execution_start_date : The start date and time of the rule execution.
execution_end_date : The end date and time of the rule execution.
status : The outcome of the validation (SUCCESS, IN PROGRESS, ERROR, FAULT).
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
validation_rule_execution_remark
id : The identifier of the remark.
execution_id : The identifier of the validation rule execution to which the remark refers. Foreign key to the validation_rule_execution table.
jsonb : An unstructured field containing the remark in JSONB format.
Postgres physical data model of the User Roles Microservice. It maps tier 1 roles to tier 2 security attributes.
identity_attribute_roles
id : The unique ID of the attribute to role association.
ida_code : The unique identity attribute code.
role_name : The role name mapped to the attribute. The role name references a role defined inside the tier1 authentication provider.
enabled : Flag indicating if the association between the identity attribute and the role is valid.
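As an illustration of how identity_attribute_roles links the two tiers, the sketch below resolves the identity attribute codes that follow from a user's tier 1 roles. The rows are represented as plain dictionaries and the function name is hypothetical; the microservice itself reads them from PostgreSQL.

```python
def identity_attributes_for(user_roles, associations):
    """Collect the ida_code values mapped to any of the user's roles.

    `associations` mirrors identity_attribute_roles rows as dicts with the
    keys ida_code, role_name and enabled. Disabled associations are skipped.
    """
    roles = set(user_roles)
    return sorted(
        {
            row["ida_code"]
            for row in associations
            if row["enabled"] and row["role_name"] in roles
        }
    )
```

For example, a user holding only a role whose sole association is disabled resolves to an empty list of identity attributes.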
role_request
id: The unique ID of the role request.
user_email: The email of the user who requested the roles.
status: The status of the request (open, cancelled, approved, rejected).
reviewed_by: The ID of the user that reviewed the request.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
role_requested
id: The unique ID of the requested role.
role_request_id: The ID of the parent role request. Foreign key to the role_request table.
role: The code of the requested role.
requested_by: Either the ID of the end user that requested the role or the ID of the approver that added the role to the existing role request.
approved: Indicates whether the role was included when the parent role request was approved.
requested_timestamp : The request timestamp.
role
id: The unique ID of the role.
code: The role code.
name: The role’s human-readable name.
description: The role’s human-readable description.
builtin: Boolean indicating if the role is built-in (default) for Simpl-Open.
enabled: Boolean indicating if the role can be assigned to a user and can be included in their session after authentication.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
Postgres physical data model for the Security Attributes Microservice. It contains the association between participants and identity attributes. As the core of the Dynamic Attribute Provisioning approach, it is queried by the identity provider to build the Ephemeral Proofs for onboarded Dataspace Participants. Deployed only by the Data Space Governance Authority.
identity_attribute
id : The unique ID of the identity attribute.
code : The unique identity attribute code.
name : The human-readable identity attribute name.
description : The human-readable identity attribute description.
assignable_to_role : Flag indicating if the identity attribute can be assigned to a role.
enabled : Flag indicating if the identity attribute is enabled for the dataspace.
is_right : Flag indicating if the identity attribute is considered a legal right (currently not used).
built_in: Flag indicating that the identity attribute is built-in (installed with the agent and not modifiable).
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
participant_identity_attribute
participant_id : The ID of the participant (owned by the identity provider component).
identity_attribute_id : The ID of the identity attribute associated with the participant. Foreign key to the identity_attribute table.
Postgres physical data model for the Identity Provider Microservice. It contains the participant information, the participant’s Certificate Signing Request (CSR) and the participant’s tier 1 public key. Deployed only by the Data Space Governance Authority.
participant
id : The unique ID of the participant.
participant_type : The type of the participant.
organization : The organisation name of the participant.
certificate_signing_request_content : The content of the CSR needed to issue a credential to the participant.
tier1_public_key_content : The tier 1 public key (Keycloak public key) of the participant, used to verify the user tier 1 JWTs signed by the participant’s Keycloak.
active_credential_id : The id of the participant’s active credential. Foreign key to the credential table.
is_authority : Boolean indicating that the participant is the Governance Authority of the data space.
renewal_request_timestamp : The timestamp of the renewal request issuance by the participant.
applicant_email : The email of the applicant responsible for the onboarding procedure of the participant.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
credential
id : The id of the credential
participant_id : The id of the participant owning the credential. Foreign key to the participant table.
credential_type: The type of the credential. Currently only X.509 credentials are supported.
certificate_authority : The certificate authority name of the credential factory component (EJBCA)
serial: The serial number of the credential in the credential factory component (EJBCA)
credential_id: The id of the credential inside the credential factory component (EJBCA)
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
auto_renewal_default
id: the id of the auto-renewal configuration.
days_before_expiry: the number of days prior to the credential’s expiration at which the auto-renewal process is triggered by the scheduled job.
modified_by_user: indicates if the default installation configuration has been overwritten by a user of the Governance Authority.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
auto_renewal_participant
participant_id: the id of the participant. Foreign key to the participant table.
days_before_expiry: the number of days prior to the credential’s expiration at which the auto-renewal process for the participant is triggered by the scheduled job.
boolean: indicates if the auto renewal is enabled for the participant.
creation_timestamp : The creation timestamp.
update_timestamp : The update timestamp.
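The relationship between the default and per-participant auto-renewal settings can be sketched as below. The sketch assumes that a participant-level days_before_expiry overrides the default and that a disabled participant flag suppresses renewal altogether; the precedence rules are not spelled out in this specification, and the function and parameter names are illustrative.

```python
from datetime import date, timedelta


def renewal_due(expiry_date, today, default_days,
                participant_days=None, participant_enabled=True):
    """Decide whether the scheduled job should trigger auto-renewal.

    Assumptions: the participant value (auto_renewal_participant) overrides
    the default (auto_renewal_default) when present, and a disabled
    participant flag always wins.
    """
    if not participant_enabled:
        return False
    days = participant_days if participant_days is not None else default_days
    return today >= expiry_date - timedelta(days=days)
```

With a default of 30 days, a credential expiring on 31 January becomes due on 1 January; a participant-level value of 60 days moves that date to 2 December.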
auto_renewal_error
id: the id of the logged error.
participant_id: the id of the participant for whom the autorenewal has failed.
description: the description of the error.
creation_timestamp : The creation timestamp.
Postgres physical data model for the authentication provider microservice. It contains the keypair and the credentials that the participant uses to communicate with other participants. It also contains a local copy of the dataspace identity attributes.
keypair
id : The unique ID of the keypair.
name: The name of the KeyPair, inserted by the user upon creation.
active: Boolean indicating that the keypair is linked to an active credential.
public_key : The keypair public key content.
public_key_hash: The keypair public key hash.
certificate_signing_request: The content of the CSR, needed for requesting the issuance of a new credential linked to the keypair
private_key
id: The ID of the private key.
private_key: The private key encrypted content.
keypair_id: The keypair linked to this key. Foreign key to the keypair table.
credential
id : The unique ID of the credential.
content : The credential (x509 certificate or foreseen SSI Verifiable Credential).
issuance_date: The date and time when the credential was issued.
expiry_date: The expiry date and time of the credential.
keypair_id: the keypair linked to this credential. Foreign key to the keypair table.
identity_attribute
id : The id of the identity attribute
code : The unique code identifying the identity attribute
description : The description of the identity attribute
assignable_to_roles : Boolean indicating if the identity attribute is assignable to roles
enable : Boolean indicating if the identity attribute is enabled or not
assigned_to_participant : Boolean indicating if the identity attribute is currently assigned to the participant.
creation_timestamp : The creation timestamp
update_timestamp : The update timestamp
participant_info
id: the ID of the entry
participant_id: the ID of the participant owning the agent.
organization: the name of the organization of the participant owning the agent
authority_creation_timestamp: the creation timestamp of the participant inside the governance authority
authority_update_timestamp: the update timestamp of the participant inside the governance authority
creation_timestamp: the creation timestamp
update_timestamp: the update timestamp
automatic_renewal_config
id: the ID of the entry
active: indicates if the auto-renewal is enabled for the participant agent.
credential_sync_execution_error
id: the ID of the entry
execution_timestamp: the execution timestamp of the attempted credential synchronisation.
error_message: the error details.
DBML
Table contract_agreements {
contract_agreement_id UUID [primary key]
contract_definition_id uuid
consumer_signature_date timestamptz
provider_signature_date timestamptz
status text
contract_negotiation_id text
asset_id text
provider_id text
consumer_id text
}
Infrastructure Provider - PDM
Table infrastructure_provider {
id BIGINT [pk]
name VARCHAR(255)
}
Table deployment_script {
id BIGINT [pk, increment]
valid BOOLEAN
creation_date DATE
description VARCHAR(255)
file OID
gitea_sha VARCHAR(255)
location VARCHAR(255)
original_file_name VARCHAR(255)
title VARCHAR(100)
update_date DATE
cloud_provider_id BIGINT [ref: > infrastructure_provider.id]
script_identify_id BIGINT [ref: > script_identify.id, unique]
}
Table script_trigger {
id BIGINT [pk, increment]
decommissioned_date_time TIMESTAMP
provisioned_date_time TIMESTAMP
key_user VARCHAR(255)
key_vault VARCHAR(255)
requester_email VARCHAR(255)
requester_unique_id VARCHAR(255)
resource_status VARCHAR(255)
status VARCHAR(255)
script_id BIGINT [ref: > deployment_script.id]
volume_id VARCHAR(255)
datacenter_id VARCHAR(255)
error_message TEXT
}
Table script_identify {
id BIGINT [pk]
deployment_script_id VARCHAR(50) [unique]
hash VARCHAR(255)
}
Table config_file {
id BIGINT [pk, increment]
file_name VARCHAR(255)
file TEXT
script_id BIGINT [ref: > deployment_script.id]
}
Table template {
id bigserial
cloud_environment_id bigint [ref: > cloud_environment.id]
cloud_provisioner_template_id bigint [ref: > cloud_provisioner_template.id]
name varchar(100)
description text
cpu_core integer
ram integer
storage integer
os varchar(50)
active boolean
creation_date date
script_id bigint [ref: > script.id]
}
Table cloud_environment {
id bigserial
cloud_provider_id bigint [ref: > cloud_provider.id]
environment_name varchar(100)
environment_description text
iac varchar(50)
datacenter_name varchar(100)
datacenter_description text
location varchar(100)
vault_path varchar(255)
total_cpu_cores integer
used_cpu_cores integer
total_ram integer
used_ram integer
total_storage integer
used_storage integer
vault_user varchar(255)
vault_key varchar(255)
}
Table cloud_provider {
id bigint
cloud_provider_name varchar(255)
}
Table cloud_provisioner_template {
id bigserial
file_name varchar(100)
file text
min_cpu_core integer
max_cpu_core integer
min_ram integer
max_ram integer
min_storage integer
max_storage integer
os text[]
cloud_provisioner_template_uuid varchar(50)
is_ovh boolean
ovh_flavor varchar(255)
ovh_project_id varchar(255)
ovh_os_image_id varchar(255)
ovh_region varchar(255)
}
SQL
CREATE TABLE IF NOT EXISTS public.schema
(
i_id bigint NOT NULL GENERATED ALWAYS AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 9223372036854775807 CACHE 1 ),
d_creation_date timestamp(6) without time zone NOT NULL,
s_latest_version character varying(50) COLLATE pg_catalog."default" NOT NULL,
s_metadata character varying(255) COLLATE pg_catalog."default" NOT NULL,
s_name character varying(250) COLLATE pg_catalog."default" NOT NULL,
s_resource_type character varying(50) COLLATE pg_catalog."default" NOT NULL,
s_status character varying(25) COLLATE pg_catalog."default" NOT NULL,
d_update_date timestamp(6) without time zone,
CONSTRAINT schema_pkey PRIMARY KEY (i_id),
CONSTRAINT schema_un1 UNIQUE (s_name)
);
CREATE TABLE IF NOT EXISTS public.schema_event
(
i_id bigint NOT NULL GENERATED ALWAYS AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 9223372036854775807 CACHE 1 ),
s_changelog character varying(250) COLLATE pg_catalog."default",
d_event_date timestamp(6) without time zone NOT NULL,
s_event_id character varying(250) COLLATE pg_catalog."default" NOT NULL,
s_event_type character varying(25) COLLATE pg_catalog."default" NOT NULL,
s_info character varying(2000) COLLATE pg_catalog."default",
s_origin character varying(25) COLLATE pg_catalog."default" NOT NULL,
s_processing_status character varying(25) COLLATE pg_catalog."default" NOT NULL,
s_version character varying(50) COLLATE pg_catalog."default",
i_schema_id bigint NOT NULL,
CONSTRAINT schema_event_pkey PRIMARY KEY (i_id),
CONSTRAINT schema_name_un1 UNIQUE (s_event_id),
CONSTRAINT fklpbr0w32che4eibn92v5c464 FOREIGN KEY (i_schema_id)
REFERENCES public.schema (i_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE CASCADE
);
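The effect of the ON DELETE CASCADE foreign key between schema_event and schema can be tried out with the minimal sketch below. It uses an in-memory SQLite database purely for illustration: the two tables are reduced to the columns needed to show the relationship, and the column types are SQLite equivalents rather than the PostgreSQL definitions. The sample values are made up.

```python
import sqlite3


def cascade_demo():
    """Return the schema_event row counts before and after deleting the parent schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE "schema" (
            i_id   INTEGER PRIMARY KEY AUTOINCREMENT,
            s_name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE schema_event (
            i_id        INTEGER PRIMARY KEY AUTOINCREMENT,
            s_event_id  TEXT NOT NULL UNIQUE,
            i_schema_id INTEGER NOT NULL
                REFERENCES "schema" (i_id) ON DELETE CASCADE
        );
        """
    )
    conn.execute('INSERT INTO "schema" (s_name) VALUES (?)', ("example-schema",))
    schema_id = conn.execute('SELECT i_id FROM "schema"').fetchone()[0]
    conn.executemany(
        "INSERT INTO schema_event (s_event_id, i_schema_id) VALUES (?, ?)",
        [("evt-1", schema_id), ("evt-2", schema_id)],
    )
    count_events = "SELECT COUNT(*) FROM schema_event"
    before = conn.execute(count_events).fetchone()[0]
    # Deleting the parent row removes its events through the cascade.
    conn.execute('DELETE FROM "schema" WHERE i_id = ?', (schema_id,))
    after = conn.execute(count_events).fetchone()[0]
    conn.close()
    return before, after
```

Deleting a schema therefore removes its event history as well, which is worth keeping in mind if events must be retained for audit purposes.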
Simpl-Open Technology Architecture develops the target technology architecture that enables the application architecture to be delivered through technology components and technology services. Each application component is mapped to a technology implementing the capabilities.
It identifies technology components through the following views:
| View | Description |
|---|---|
| Technology Components Static View | Provides, per application service, an enriched view of the Application Components Static View by adding technology components that support the implementation of the application components. |
| Technology Components Dynamic View | Provides a dynamic view (sequence diagrams) per business process (or sub-process) of how technology components are used to satisfy different workflows. |
| Technology Deployment View | Provides an aggregated view of how the different technology components (cross BPs and domains) are deployed for all Simpl-Open agent types (Governance Authority, Data Provider, Infrastructure Provider, Application Provider, Consumer). |
Next to these architecture views, the following are provided:
A table of OSS technologies, with the reasons for selecting them and links to existing documentation such as data models and installation guides;
Detailed technical specifications that are particularly relevant for contributing to Simpl-Open and/or implementing it in a data space.
Technology components views are presented per functional domain in the following sub-sections.
For each functional domain, the following are presented:
a static view of the entire domain which enriches the application components view with the technologies that are implementing the components;
a set of dynamic views (sequence diagrams) that present how a subset of the technology components are used to satisfy different (parts of) business processes.
This perspective illustrates the correlation between the architectural elements and the technologies, components, and interfaces intended for use in implementing the application components.
Authorisation
The Authorisation Tier 1 RBAC component is implemented with Spring Cloud Gateway.
The Authorisation Tier 2 ABAC component is implemented with Spring Cloud Gateway.
Security Attributes Provider
The Attributes Management component is implemented as a Java backend application.
The Security Attributes Provider UI component is implemented as an Angular frontend application.
The Attributes Database is implemented in PostgreSQL.
Identity Provider
The Credential Management service is implemented as a Java backend application.
The Credential Verification service is implemented as a Java backend application.
The Identity Provider UI component is implemented as an Angular frontend application.
The Credential Factory component is implemented with Enterprise JavaBeans Certificate Authority (EJBCA).
The Identity Database is implemented in PostgreSQL.
Onboarding
The Onboarding Manager component is implemented as a Java backend application.
The Onboarding UI component is implemented as an Angular frontend application.
The Onboarding Database component is implemented in PostgreSQL.
Document Validation
Tier 1 Authentication Provider
The Users Management component, providing the Agent Users Management and Local IDP Federation services, is implemented with Keycloak.
The Tier 1 Authentication Provider UI component is implemented as an Angular frontend application.
The User Database is implemented in PostgreSQL.
The Authenticator Plugin is a custom Keycloak SPI that adds custom claims to Tier 1 JWTs.
Tier 2 Authentication Provider
The Credential Management component is implemented as a Java backend application.
The Tier 2 Authentication Provider UI component is implemented as an Angular frontend application.
Credentials Database/Vault
User & Roles
The User & Roles Management component is implemented as a Java backend application.
The User & Roles UI component is implemented as an Angular frontend application.
The User & Roles Database is implemented in PostgreSQL.
This perspective illustrates the interactions and the flows between all the technological components.
Applicant Onboarding Request Submission
mermaid diagram
sequenceDiagram
actor applicant as Applicant
participant obui as Onboarding UI
participant t1 as Tier 1 Gateway
participant ob as Onboarding
participant sap as Security Attributes Provider
participant uar as Users & Roles
participant idp as Identity Provider
participant kc as Keycloak
applicant ->> obui: create onboarding request
activate obui
obui ->> t1: create onboarding request
t1 ->> ob: request
ob ->> sap: fetch identity attributes
sap -->> ob: identity attributes
ob ->> ob: save onboarding request
ob ->> uar: create user
uar -> kc: create user with credentials
kc -->> uar: user details
uar -->> ob: user details
ob -->> t1: onboarding request
t1 --> obui: request
obui -->> applicant: ok
deactivate obui
applicant ->> obui: login
activate obui
obui ->> kc: login using temporary credentials
kc -->> obui: login status
obui -->> applicant: ok
deactivate obui
applicant ->> obui: submit
activate obui
obui ->> t1: Submit Onboarding Request
t1 ->> ob: Request
ob -->> t1: Ok, submitted
t1 --> obui: submitted
obui -->> applicant: ok
deactivate obui
box Governance Authority
participant obui
participant t1
participant ob
participant sap
participant uar
participant idp
participant kc
end
Governance Authority Onboarding Review
mermaid diagram
sequenceDiagram
actor gar as GA Representative
participant obui as Onboarding UI
participant t1 as Tier 1 Gateway
participant ob as Onboarding
participant sap as Security Attributes Provider
participant uar as Users & Roles
participant idp as Identity Provider
participant kc as Keycloak
gar ->> obui: login
activate obui
obui ->> kc: login using temporary credentials
kc -->> t1: login status
t1 --> obui: status
obui -->> gar: ok
deactivate obui
gar ->> obui: Approve/Reject
activate obui
alt Reject
obui ->> t1: reject
t1 ->> ob: reject
ob -->> t1: Ok, rejection completed
t1 --> obui: completed
end
alt Request revision
obui ->> t1: request revision
t1 ->> ob: revision
ob -->> t1: Ok, revision requested
t1 --> obui: requested
end
alt Approve
obui ->> t1: Approve
t1 ->> ob: Approve
ob ->> idp: create participant
idp -->> ob: participantId
ob ->> sap: assign identity attributes
sap -->> ob: attribute assigned
ob -->> t1: Ok, approved
t1 --> obui: approved
end
obui -->> gar: review completed
deactivate obui
box Governance Authority
participant obui
participant t1
participant ob
participant sap
participant uar
participant idp
participant kc
end
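The Approve branch of the review can be summarised in code as follows. This is a sketch only: the two callables are hypothetical stand-ins for the Identity Provider and Security Attributes Provider calls, the request is a plain dictionary, and the "APPROVED" status value is an assumption because the actual onboarding_status values are not listed here.

```python
def approve_onboarding_request(request, create_participant, assign_identity_attributes):
    """Approve `request` following the order of calls in the Approve branch.

    `create_participant` and `assign_identity_attributes` are hypothetical
    callables standing in for the Identity Provider and the Security
    Attributes Provider; the real services expose their own APIs.
    """
    # 1. The Identity Provider creates the participant and returns its id.
    participant_id = create_participant(
        request["organization"], request["participant_type_id"]
    )
    # 2. The Security Attributes Provider assigns the identity attributes.
    assign_identity_attributes(participant_id, request["identity_attribute_codes"])
    # 3. participant_id is populated only once the participant exists.
    request["participant_id"] = participant_id
    request["status"] = "APPROVED"  # hypothetical status value
    return request
```

Passing the collaborators in as callables keeps the ordering visible and makes the flow easy to exercise with in-memory fakes.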
Applicant installs credentials
mermaid diagram
sequenceDiagram
participant kcp as Keycloak
participant uarp as Users & Roles
participant hc as Hashicorp Vault
participant authp as Authentication Provider
participant t1p as Tier 1 Gateway
participant pui as Participant Utility UI
actor applicant as Applicant
participant obui as Onboarding UI
participant t1 as Tier 1 Gateway
participant t2 as Tier 2 Gateway
participant sap as Security Attributes Provider
participant uar as Users & Roles
participant idp as Identity Provider
participant kc as Keycloak
participant EJBCA as EJBCA
applicant ->> pui: login
activate pui
pui ->> kcp: login using temporary credentials
kcp -->> pui: login status
pui -->> applicant: ok
deactivate pui
applicant ->> pui: generate keypair
activate pui
pui ->> t1p: generate keypair
t1p ->> authp: generate keypair
authp ->> authp: generate keypair
authp ->> hc: store private key
hc ->> authp: ok
authp -->> t1p: ack keypair generated
t1p -->> pui: ack keypair generated
pui -->> applicant: ack keypair generated
deactivate pui
applicant ->> pui: generate CSR
activate pui
pui ->> t1p: generate CSR
t1p ->> authp: generate CSR
authp ->> authp: generate CSR
authp -->> t1p: CSR
t1p -->> pui: CSR
pui -->> applicant: CSR
deactivate pui
applicant ->> obui: login
activate obui
obui ->> kc: login using temporary credentials
kc -->> obui: login status
obui -->> applicant: ok
deactivate obui
applicant ->> obui: upload CSR
activate obui
obui ->> t1: upload CSR
t1 ->> idp: CSR
idp ->> EJBCA: generateCredential
EJBCA -->> idp: credential
idp -->> t1: credential
t1 --> obui: credential
obui --> applicant: credential
deactivate obui
applicant ->> pui: upload credential
activate pui
pui ->> t1p: upload credential
t1p ->> authp: upload credential
authp ->> hc: store credential
hc -->> authp: ok, credential stored
authp -->> t1p: ok, credential stored
t1p -->> pui: ok, credential stored
pui -->> applicant: ok, credential stored
deactivate pui
authp -->> uarp: credential installed (EVENT)
activate uarp
uarp -> kcp: get Tier 1 public key
kcp -->> uarp: Tier 1 public key
uarp ->> t2: upload Tier 1 public key
t2 ->> idp: upload Tier 1 public key
idp --> t2: ok, public key stored
t2 --> uarp: ok, public key stored
deactivate uarp
box Governance Authority
participant obui
participant t1
participant t2
participant sap
participant uar
participant idp
participant kc
participant EJBCA
end
box Participant
participant t1p
participant pui
participant hc
participant authp
participant uarp
participant kcp
end
This perspective illustrates the interactions and the flows between all the technological components.
Configure identity provider federation
mermaid diagram
sequenceDiagram
actor u as User
participant t1 as Tier 1 Authorization provider<br>(Keycloak)
participant idp as Organization IdP
u ->> t1: login
activate t1
t1 -->> u: login status
deactivate t1
u ->> t1: configure IdP connection
activate t1
t1 ->> idp: IdP federation
idp -->> t1: federation set up completed
t1 -->> u: ok, federation completed
deactivate t1
Configure users and roles
mermaid diagram
sequenceDiagram
actor u as End user
participant urui as Users & Roles UI<br/>(Typescript - Angular)
participant t1 as Tier 1 Gateway<br/>(Java - Spring Cloud Gateway)
participant ur as Users & Roles<br/>(Java - Spring Boot)
participant kc as Tier 1 Authorization provider<br>(Keycloak)
u ->> urui: login
activate urui
urui ->> kc: login using credentials
kc -->> t1: login status
t1 --> urui: status
urui -->> u: ok
deactivate urui
alt Create end user
u ->> urui: create end user
activate urui
urui ->> t1: create end user
t1 ->> ur: create end user
ur ->> kc: create end user
kc -->> ur: ok, end user created
ur -->> t1: ok, end user created
t1 -->> urui: ok, end user created
urui -->> u: ok, end user created
deactivate urui
end
alt Update end user
u ->> urui: update end user
activate urui
urui ->> t1: update end user
t1 ->> ur: update end user
ur ->> kc: update end user
kc -->> ur: ok, end user updated
ur -->> t1: ok, end user updated
t1 -->> urui: ok, end user updated
urui -->> u: ok, end user updated
deactivate urui
end
alt Delete end user
u ->> urui: delete end user
activate urui
urui ->> t1: delete end user
t1 ->> ur: delete end user
ur ->> kc: delete end user
kc -->> ur: ok, end user deleted
ur -->> t1: ok, end user deleted
t1 -->> urui: ok, end user deleted
urui -->> u: ok, end user deleted
deactivate urui
end
alt Enable/Disable end user
u ->> urui: enable/disable end user
activate urui
urui ->> t1: enable/disable end user
t1 ->> ur: enable/disable end user
ur ->> kc: enable/disable end user
kc -->> ur: ok, end user enabled/disabled
ur -->> t1: ok, end user enabled/disabled
t1 -->> urui: ok, end user enabled/disabled
urui -->> u: ok, end user enabled/disabled
deactivate urui
end
alt Assign roles to end user
u ->> urui: assign roles to end user
activate urui
urui ->> urui: select roles
urui ->> t1: assign roles to end user
t1 ->> ur: assign roles to end user
ur ->> kc: assign roles to end user
kc -->> ur: ok, roles assigned
ur -->> t1: ok, roles assigned
t1 -->> urui: ok, roles assigned
urui -->> u: ok, roles assigned
deactivate urui
end
alt Create role
u ->> urui: create role
activate urui
urui ->> t1: create role
t1 ->> ur: create role
ur ->> kc: create role
kc -->> ur: ok, role created
ur -->> t1: ok, role created
t1 -->> urui: ok, role created
urui -->> u: ok, role created
deactivate urui
end
alt Update role
u ->> urui: update role
activate urui
urui ->> t1: update role
t1 ->> ur: update role
ur ->> kc: update role
kc -->> ur: ok, role updated
ur -->> t1: ok, role updated
t1 -->> urui: ok, role updated
urui -->> u: ok, role updated
deactivate urui
end
alt Delete role
u ->> urui: delete role
activate urui
urui ->> t1: delete role
t1 ->> ur: delete role
ur ->> kc: delete role
kc -->> ur: ok, role deleted
ur -->> t1: ok, role deleted
t1 -->> urui: ok, role deleted
urui -->> u: ok, role deleted
deactivate urui
end
alt Assign identity attributes to role
u ->> urui: assign identity attributes to role
activate urui
urui ->> urui: select identity attributes
urui ->> t1: assign identity attributes to role
t1 ->> ur: assign identity attributes to role
ur ->> kc: assign identity attributes to role
kc -->> ur: ok, identity attributes assigned
ur -->> t1: ok, identity attributes assigned
t1 -->> urui: ok, identity attributes assigned
urui -->> u: ok, identity attributes assigned
deactivate urui
end
This perspective illustrates the interactions and the flows between all the technological components.
Role request submission and cancellation
mermaid diagram
sequenceDiagram
actor u as End user
participant urui as Users & Roles UI<br/>(Typescript - Angular)
participant t1 as Tier 1 Gateway<br/>(Java - Spring Cloud Gateway)
participant ur as Users & Roles<br/>(Java - Spring Boot)
participant kc as Tier 1 Authorization provider<br>(Keycloak)
u ->> urui: login
activate urui
urui ->> kc: login using credentials
kc -->> t1: login status
t1 --> urui: status
urui -->> u: ok
deactivate urui
alt Submit role request
u ->> urui: Create role request
activate urui
urui ->> urui: select roles
urui ->> t1: create role request
t1 ->> ur: create role request
ur -->> t1: ok, role request created
t1 -->> urui: ok, role request created
urui -->> u: created
deactivate urui
end
alt Cancel role request
u ->> urui: Cancel role request
activate urui
urui ->> t1: cancel role request
t1 ->> ur: cancel role request
ur -->> t1: ok, role request cancelled
t1 -->> urui: ok, role request cancelled
urui -->> u: ok, role request cancelled
deactivate urui
end
Role request review
mermaid diagram Expand source
sequenceDiagram
actor u as Users Roles Manager
participant urui as Users & Roles UI<br/>(Typescript - Angular)
participant t1 as Tier 1 Gateway<br/>(Java - Spring Cloud Gateway)
participant ur as Users & Roles<br/>(Java - Spring Boot)
participant kc as Tier 1 Authorization provider<br>(Keycloak)
u ->> urui: login
activate urui
urui ->> kc: login using credentials
kc —>> t1: login status
t1 —> urui: status
urui —>> u: ok
deactivate urui
u ->> urui: Approve/Reject
activate urui
alt Reject
urui ->> t1: reject
t1 ->> ur: reject
ur —>> t1: ok, role request rejected
t1 —> urui: completed
end
alt Approve
urui ->> urui: select roles
urui ->> t1: approve
t1 ->> ur: approve
ur ->> kc: assign roles to end user
kc -->> ur: roles assigned
ur -->> t1: ok, role request approved
t1 --> urui: completed
end
urui -->> u: review completed
deactivate urui
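The two flows above (submission or cancellation by the end user, then review by the Users Roles Manager) imply a small lifecycle for a role request. The following is a minimal sketch of that lifecycle, not the Users & Roles implementation: the state names, the `RoleRequest` class and the `assign_roles` callback (a stand-in for the call to the Tier 1 Authorization provider) are all hypothetical, and the rule that only a pending request can be cancelled or reviewed is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Set


class Status(Enum):
    PENDING = "pending"
    CANCELLED = "cancelled"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class RoleRequest:
    user: str
    requested_roles: Set[str]
    status: Status = Status.PENDING
    granted_roles: Set[str] = field(default_factory=set)

    def _require_pending(self) -> None:
        # Assumption: only an open request can be cancelled or reviewed.
        if self.status is not Status.PENDING:
            raise ValueError(f"role request is already {self.status.value}")

    def cancel(self) -> None:
        self._require_pending()
        self.status = Status.CANCELLED

    def reject(self) -> None:
        self._require_pending()
        self.status = Status.REJECTED

    def approve(self, selected_roles: Set[str],
                assign_roles: Callable[[str, Set[str]], None]) -> None:
        self._require_pending()
        # Stand-in for "assign roles to end user" in the Approve branch.
        assign_roles(self.user, selected_roles)
        self.granted_roles = set(selected_roles)
        self.status = Status.APPROVED
```

In this sketch the roles are assigned before the request is marked as approved, which mirrors the order of the messages in the Approve branch.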
Governance Authority Representative revokes, suspends or reactivates credentials
sequenceDiagram
actor garep as Governance Authority<br> Representative
participant obui as Onboarding UI<br><br>(Angular)
participant t1 as Tier 1 Gateway<br><br>(Spring Cloud Gateway)
participant idp as Identity Provider<br><br> (Java - Spring Boot)
participant ejbca as Credential Store<br><br>(EJBCA)
alt revoke
garep ->> obui: revoke credential
activate obui
obui ->> t1: revoke credential
t1 ->> idp: revoke credential
idp ->> ejbca: revoke credential
note right of ejbca: credential must be active
ejbca -->> idp: ok, revoked
idp -->> t1: ok, revoked
t1 --> obui: ok, revoked
obui -->> garep: ok, revoked
deactivate obui
end
alt suspend
garep ->> obui: suspend credential
activate obui
obui ->> t1: suspend credential
t1 ->> idp: suspend credential
note right of ejbca: credential must be active
idp ->> ejbca: suspend credential
ejbca -->> idp: ok, suspended
idp -->> t1: ok, suspended
t1 --> obui: ok, suspended
obui -->> garep: ok, suspended
deactivate obui
end
alt re-activate
garep ->> obui: re-activate credential
activate obui
obui ->> t1: re-activate credential
t1 ->> idp: re-activate credential
idp ->> ejbca: re-activate credential
note right of ejbca: credential must be suspended, not revoked
ejbca -->> idp: ok, re-activated
idp -->> t1: ok, re-activated
t1 --> obui: ok, re-activated
obui -->> garep: ok, re-activated
deactivate obui
end
box Governance Authority
participant obui
participant t1
participant idp
participant ejbca
end
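The notes in the diagram above state the precondition for each action: a credential must be active to be revoked or suspended, and must be suspended (not revoked) to be re-activated. A minimal sketch of these rules as a transition table follows; the enum and function names are hypothetical and do not reflect the Identity Provider or EJBCA APIs.

```python
from enum import Enum


class CredentialStatus(Enum):
    ACTIVE = "active"
    SUSPENDED = "suspended"
    REVOKED = "revoked"


# action -> (status required before the action, status after the action),
# taken from the notes in the sequence diagram.
TRANSITIONS = {
    "revoke": (CredentialStatus.ACTIVE, CredentialStatus.REVOKED),
    "suspend": (CredentialStatus.ACTIVE, CredentialStatus.SUSPENDED),
    "re-activate": (CredentialStatus.SUSPENDED, CredentialStatus.ACTIVE),
}


def apply_action(current: CredentialStatus, action: str) -> CredentialStatus:
    """Return the new status, or raise if the precondition is not met."""
    if action not in TRANSITIONS:
        raise ValueError(f"unknown action: {action}")
    required, result = TRANSITIONS[action]
    if current is not required:
        raise ValueError(
            f"cannot {action}: credential is {current.value}, "
            f"must be {required.value}"
        )
    return result
```

Because no transition starts from `REVOKED`, revocation is final in this sketch, which matches the "not revoked" condition on re-activation.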
Applicant representative requests credential renewal - Governance Authority renews credential
sequenceDiagram
actor garep as Governance Authority<br> Representative
participant obui as Onboarding UI<br><br>(Angular)
participant t1 as Tier 1 Gateway<br><br>(Spring Cloud Gateway)
participant idp as Identity Provider<br><br> (Java - Spring Boot)
participant ejbca as Credential Store <br><br>(EJBCA)
participant t2 as Tier 2 Gateway<br><br>(Spring Cloud Gateway)
participant authp as Authentication Provider<br><br> (Java - Spring Boot)
participant t1p as Tier 1 Gateway<br><br>(Spring Cloud Gateway)
participant putility as Participant Utility<br><br>(Angular)
actor partrep as Participant Representative
alt renewal request
Note right of partrep: keypair and CSR already generated
partrep ->> putility: submit credential renewal request (CSR)
activate putility
putility ->> t1p: credential renewal request (with CSR)
t1p ->> authp: credential renewal request (with CSR)
authp ->> t2: credential renewal request (with CSR)
t2 ->> idp: credential renewal request (with CSR)
idp ->> idp: store credential request (CSR)
idp -->> t2: ok, request accepted
t2 -->> authp: ok, request accepted
authp -->> t1p: ok, request accepted
t1p -->> putility: ok
putility -->> partrep: ok
deactivate putility
end
alt renew
garep ->> obui: renew credential
activate obui
obui ->> t1: renew credential
t1 ->> idp: renew credential
idp ->> idp: fetch latest CSR
idp ->> ejbca: renew credential (CSR)
ejbca -->> idp: ok, renewed
idp -->> t1: ok, renewed
t1 --> obui: ok, renewed
obui -->> garep: ok, renewed
deactivate obui
end
box Governance Authority
participant obui
participant t1
participant idp
participant t2
participant ejbca
end
box Participant
participant putility
participant t1p
participant authp
end
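In the renewal flow above, the Identity Provider stores each incoming CSR and, when the Governance Authority Representative triggers the renewal, fetches the latest one and passes it to the Credential Store. The sketch below illustrates only this "store, then fetch latest" step with an in-memory store; the class names, the per-participant keying and the `issue_certificate` callback (a stand-in for the call to EJBCA) are hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RenewalRequest:
    sequence: int          # monotonically increasing, used to order requests
    participant_id: str
    csr_pem: str


class RenewalRequestStore:
    def __init__(self) -> None:
        self._requests: Dict[str, List[RenewalRequest]] = {}
        self._sequence = count(1)

    def store(self, participant_id: str, csr_pem: str) -> RenewalRequest:
        request = RenewalRequest(next(self._sequence), participant_id, csr_pem)
        self._requests.setdefault(participant_id, []).append(request)
        return request

    def fetch_latest(self, participant_id: str) -> RenewalRequest:
        requests = self._requests.get(participant_id)
        if not requests:
            raise LookupError(f"no renewal request for {participant_id}")
        return requests[-1]


def renew(store: RenewalRequestStore, participant_id: str,
          issue_certificate: Callable[[str], str]) -> str:
    # "fetch latest CSR" followed by "renew credential (CSR)".
    latest = store.fetch_latest(participant_id)
    return issue_certificate(latest.csr_pem)
```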
Governance Authority Representative assigns identity attributes to a participant
sequenceDiagram
actor garep as Governance Authority<br> Representative
participant sapui as Security Attributes Provider UI <br><br>(Angular)
participant t1 as Tier 1 Gateway <br><br>(Spring Cloud Gateway)
participant sap as Security Attributes Provider<br><br> (Java - Spring Boot)
garep ->> sapui: assign identity attribute
activate sapui
sapui ->> t1: assign identity attribute (participantId + identity attributes)
t1 ->> sap: assign identity attribute (participantId + identity attributes)
sap ->> sap: map identity attributes to participant
sap -->> t1: ok, assigned
t1 --> sapui: ok, assigned
sapui -->> garep: ok, assigned
deactivate sapui
box Governance Authority
participant sapui
participant t1
participant sap
end
This perspective illustrates the correlation between the architectural elements and the technologies, components, and interfaces intended for use in implementing the application components.
Catalogue Client Application
The Catalogue Client Application Backend component is implemented as a Java backend application.
The Catalogue Client Application UI component is implemented as an Angular frontend application.
Validation Backend
Contract Consumption Adapter
Catalogue
The Search Engine component is implemented with XFSC.
The Catalogue Database component is implemented in PostgreSQL with Neo4j.
The Vocabulary Datastore component is implemented as a File system.
The Management Service is implemented with XFSC.
The Semantic Validation service is implemented with RDFLib pySHACL.
The Quality rule validation service is implemented with RDFLib pySHACL.
The Syntax Validation service is implemented with RDFLib pySHACL.
Query Mapper Adapter
Connector
The Connector component is implemented as an Eclipse Dataspace Connector.
The Control plane component is implemented as a Java backend application.
The Data plane component is implemented as a Java backend application.
The Infrastructure orchestrator is implemented as a Java backend application.
The Policy engine component is implemented as a Java backend application.
EDC Connector Adapter
Contract Manager Backend
The Contract Manager Backend component is implemented as a Java backend application.
The API interface is implemented as a Kafka consumer (JSON over Kafka).
Contract Manager Orchestrator
The Contract Manager Orchestrator component is implemented as a Java backend application.
The API interface is implemented as a Kafka consumer (JSON over Kafka).
Message Broker
Contract Template Datastore
Orchestration Platform
Orchestration Engine (Dagster Daemons): Several Dagster features, such as schedules, sensors, and run queueing, require a long-running dagster-daemon process to be included in the deployment. The daemons start the RunLauncher as ephemeral processes.
Orchestration Run Worker (K8sRunLauncher): The run launcher is the interface to the computational resources that are used to execute Dagster runs. It receives the ID of a created run and a representation of the pipeline that is about to be executed. We use the K8sRunLauncher, a run launcher that allocates one Kubernetes job per workflow run.
Code Location: A code location is a collection of Dagster definitions that Dagster’s tools, such as the CLI and the UI, can load and access. A code location comprises:
A reference to a Python module that has an instance of Definitions in a top-level variable
A Python environment that can successfully load that module
Orchestration Management UI (Dagit): Dagit is Dagster’s browser-based orchestration console that provides an intuitive, real-time view into your data pipelines and assets. It acts as the operational hub for engineers, analysts, and operators to develop, launch, and monitor jobs: you can visualize graphs of ops, configure runs, inspect logs and intermediate results, manage schedules and sensors, and observe asset materializations, all without leaving the UI.
Orchestration Engine API: Dagster exposes a GraphQL API as its primary programmatic interface to the orchestration engine. This API underpins both Dagit and the Python client libraries, and lets you manage every aspect of your Dagster instance without using the UI. Through it, you can:
Launch and cancel runs (jobs, backfills, re-executions).
Query run states, logs, and event streams for monitoring.
Manage workspace locations (code servers).
Asset Orchestrator: A component developed for Simpl-Open. It connects the data and application offerings from the catalogue with the orchestration engine.
Auth Proxy: Authentication sidecar component that integrates the orchestration platform with the IAA stack in a loosely coupled way.
Repository (Gitea): The repository is not only a logical container for services, workflows, and schedules but also the natural unit of versioning and auditability for your orchestration code. Because a repository is defined in source control (Gitea Repository), every change to a job graph, op implementation, resource configuration, or schedule is captured as a commit with author, timestamp, and diffs. This enables you to:
Version your pipelines: each commit or tag corresponds to a known set of jobs/ops; you can deploy specific versions of the repository image to different environments (dev/test/prod).
Audit changes: by inspecting the repository history you can see who modified a job, added a sensor, or changed a resource definition, providing traceability for compliance.
Roll back safely: if a change breaks a pipeline, you can redeploy the previous repository image and Dagster will run the older job definitions.
Tie runs to code versions: by embedding a Git commit hash or image tag in the run metadata, you get a direct link between a Dagster run and the exact code and configuration that produced it.
This approach shifts “audit and versioning” from being an afterthought in the orchestration layer to being a first-class property of the development workflow, making it straightforward to satisfy governance, reproducibility, and regulatory requirements.
CICD (Gitea Actions): In our orchestration platform, CI/CD is the mechanism that moves changes from development into the production cluster in a controlled, auditable way. It allows automatic tests to run before publication and improves auditability. Rollbacks are straightforward because previous images and manifests are retained. This setup ensures that the version of each Dagster job or op running in production is traceable to a specific commit, that deployments are reproducible, and that production workloads can scale or recover automatically under Kubernetes while still meeting compliance and reliability requirements.
Provisioned Node (Infrastructure Consumer) / Private Network
Is created by the Infrastructure Provisioner (see TCV Static - Infrastructure Provisioning Service) on behalf of the Consumer.
So, in principle, the Consumer has access to the Infrastructure (Provisioned Node) and to the Private Network.
Infrastructure Connector Service (warning: open point)
Currently (2025-09-26) it is not clear what the term “Connector” really means.
No definition was found anywhere in the documentation.
In the “data space universe” a connector typically means:
A (software) component that acts as a trustworthy gateway for data exchange, enabling secure, sovereign, and standardized sharing of data between organizations and systems while enforcing usage policies.
Such connectors/gateways often use frameworks like “Eclipse Dataspace Components (EDC)” and operate at the application level.
Expressed in this picture with all the assets
Warning: the picture may contain typos, and its connection lines are probably wrong.
If the above definition is true, then access to the Infrastructure (Provisioned Node)/Private Network is not part of the connector.
It is therefore not known what this really means for Infrastructure.
Triggering Module
The Script Storage Management Module component is implemented as a Java backend application.
The Script Execution Module component is implemented as a Java backend application.
The Access Management Module component is implemented as a Java backend application.
The Triggering Module UI component is implemented as an Angular frontend application.
The API interface is implemented as a Kafka consumer (JSON over Kafka).
Infrastructure Provisioner
The Infrastructure Provisioner component is implemented with Argo CD and Crossplane.
The API interface is implemented as a Kafka consumer (JSON over Kafka).
Infrastructure Provider Storage
The Database is implemented in PostgreSQL.
The Repository is implemented as a Git-based repository.
Message Broker
Policy Template Datastore
SD Tooling
The SD Manager component is implemented as a Java backend application.
The Validation BE component is implemented as a Java backend application.
The SD Creation Tool component is implemented with XFSC Organisation Credential Manager.
The SD Tooling UI component is implemented as an Angular frontend application.
The Schema Management Backend Service implements:
The Schema Management, which stores the schemas
The Schema Subscription API, through which any service can subscribe to schema updates
The Schema Management Backend API for creating, updating and revoking schemas
The Schema Management UI
The Schema Synch Service implements:
The Schema Synch Adapter API, which receives any updates from the Schema Management Service
The Schema Synch Adapter, which retrieves the schema updates and processes them
Signer Service
Vocabulary Management
The Vocabulary Management Backend component is implemented as a File System.
The Vocabulary Management Frontend component is implemented as a ReactJS application.
Wallet
This perspective illustrates the interactions and the flows between all the technological components.
Schema Synchronization
The figure illustrates the role of the schema-sync-adapter component. This component receives notifications from the Schema Management Service when a schema is published, versioned, or revoked. It then retrieves the schema from the Schema Management Service and stores it in the schema storage (NFS), making it available for use by the SD-Tooling.
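The behaviour described above (receive a notification, retrieve the schema, store it in the schema storage) can be sketched as a single handler. This is an illustration under stated assumptions, not the schema-sync-adapter code: the event fields (`type`, `schemaId`, `version`), the `fetch_schema` callback standing in for the call to the Schema Management Service, the `.ttl` file layout, and the choice to delete the local copy on revocation are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable

STORE_EVENTS = {"published", "versioned"}


def handle_schema_event(event: dict,
                        fetch_schema: Callable[[str, str], str],
                        storage_dir: Path) -> Path:
    """Apply one schema notification to the local schema storage."""
    target = storage_dir / event["schemaId"] / f"{event['version']}.ttl"
    if event["type"] == "revoked":
        # Assumption: a revoked schema is removed from the local storage.
        target.unlink(missing_ok=True)
        return target
    if event["type"] not in STORE_EVENTS:
        raise ValueError(f"unsupported event type: {event['type']}")
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so readers never see a partial schema.
    tmp = target.with_suffix(".tmp")
    tmp.write_text(fetch_schema(event["schemaId"], event["version"]),
                   encoding="utf-8")
    tmp.replace(target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        event = {"type": "published", "schemaId": "data-offering", "version": "1.0"}
        path = handle_schema_event(event, lambda sid, ver: "# SHACL shapes", Path(root))
        print(path.read_text(encoding="utf-8"))
```

Writing to a temporary file and renaming it keeps the shared storage consistent for consumers such as the SD-Tooling that may read a schema while it is being updated.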
This perspective illustrates the interactions and the flows between all the technologies.
The search stack is split into a consumer/provider part and a centralised one.
The first part includes a client that offers a UI to the end user. For the advanced search, the frontend application checks the search parameters against a local instance of the schema cache, which has previously been synced with the instance on the Governance Authority side. This makes it possible to validate the parameters entered in the Advanced search UI and to send the Query Mapper Adapter only queries that are consistent with the schemas of the resources present in the Data Space. Furthermore, both sides run an instance of the Spring Cloud API Gateway, which secures the connection towards the other agent.
In the Governance Authority instance, apart from the components already mentioned, there is the Query Mapper & Filter Adapter, which translates the incoming query into the required query language and applies filtering based on access policies. It then forwards the resulting query to the catalogue API. The XFSC catalogue includes an internal search engine that performs the query on the self-descriptions stored in the underlying Neo4j database. The catalogue is also backed by a PostgreSQL database for managing metadata and ensuring efficient file identification and data consistency.
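Two checks are described above: the client validates advanced search parameters against its locally cached schemas, and the Governance Authority side filters results according to access policies. The sketch below illustrates both in a deliberately simplified form. It assumes a cached schema reduced to a mapping from attribute name to expected Python type (the real schemas are SHACL shapes) and a hypothetical policy model in which each offering lists the identity attributes it requires.

```python
from typing import Dict, Iterable, List, Set


def validate_search_parameters(params: Dict[str, object],
                               cached_schema: Dict[str, type]) -> List[str]:
    """Return a list of problems; an empty list means the query is consistent."""
    problems = []
    for name, value in params.items():
        expected = cached_schema.get(name)
        if expected is None:
            problems.append(f"unknown attribute: {name}")
        elif not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


def filter_by_access_policy(results: Iterable[dict],
                            requester_attributes: Set[str]) -> List[dict]:
    """Keep only offerings whose required attributes the requester holds."""
    return [
        offering for offering in results
        if set(offering.get("required_attributes", ())) <= requester_attributes
    ]
```

Validating on the client side avoids sending queries that the catalogue could not answer, while the policy filter stays on the Governance Authority side, where the access policies are applied.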
This perspective illustrates the interactions and the flows between all the technological components.
This perspective illustrates the interactions and the flows between all the technologies.
The data consumption BP9A is mainly addressed by the EDC connector Java backend. The EDC connector is in charge of various steps of the Dataspace protocol. In particular, the core of the backend is the control plane, which is in charge of contract negotiation and of selecting the correct data plane depending on the type of resource requested, while the actual data transfer is performed by the selected EDC connector Java extension, which connects to the real data source.
In the consumption process, policies should be checked; this check is performed by the Policy Module present in the EDC connector Java backend.
This perspective illustrates the interactions and the flows between all the technologies.
The data consumption BP, which also encompasses the infrastructure provider, is addressed by the infrastructure-related components (see BP8 for details about this part).
The request from the user is directed to the Data Provider EDC connector Java backend, which forwards the request to the custom EDC extension connector that is capable of interacting with the Infrastructure Provider APIs.
This perspective illustrates the correlation between the architectural elements and the technologies, components, and interfaces intended for use in implementing the application components.
This section describes the architecture for Monitoring and Logging within a single node (Simpl-Open agent); it does not (yet) consider an inter-node setup.
This perspective illustrates the correlation between the architectural elements and the technologies, components, and interfaces intended for use in implementing the application components.
The monitoring service is based primarily on the Elastic stack.
Filebeat and Metricbeat are used to collect respectively technical logs and infrastructure metrics.
As Simpl-Open application services are deployed as containers in Kubernetes, both technical logs and infrastructure metrics are collected via the kube-api.
Technical logs are then forwarded to Logstash for processing and potential transformation. Business logs are directly sent by the application services to Logstash.
Metricbeat, Heartbeat and Logstash forward their data (infrastructure metrics, and technical and business logs) to Elasticsearch, which acts as the central log repository.
Kibana is used as a user interface to provide reporting, log visualisation, monitoring space and alerting capabilities. Kibana also queries the health endpoints of the services, exposed as REST/JSON APIs, to display their health in a dashboard.
A custom reporting application exposes a REST/JSON API to query logs for other purposes such as monitoring federation (i.e. forwarding some logs to the Governance Authority) or billing.
The health check component queries and collects health data from components across the Simpl-Open agent and stores it for further health visualisations.
Tracing captures detailed information about request flows across Simpl-Open components to help discover potential bottlenecks.
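As stated above, application services send business logs directly to Logstash, while technical logs are collected by Filebeat. One common way to make such logs easy to process is to emit them as one JSON object per line. The sketch below shows this with the Python standard library only; the field names (`@timestamp`, `log_type`) and the logger name are hypothetical, and the stream handler is a stand-in for whatever transport the Logstash input actually expects.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "@timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Set through `extra=`; defaults to a technical log.
            "log_type": getattr(record, "log_type", "technical"),
        }
        return json.dumps(payload)


def build_business_logger(stream) -> logging.Logger:
    logger = logging.getLogger("simpl.business")
    logger.setLevel(logging.INFO)
    logger.propagate = False           # avoid duplicate output via the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]        # replace handlers so repeated calls do not stack
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_business_logger(buffer)
    log.info("contract %s signed", "c-1", extra={"log_type": "business"})
    print(buffer.getvalue().strip())
```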
The Schema Management Backend Service implements:
The Schema Management, which stores the schemas
The Schema Subscription API, through which any service can subscribe to schema updates
The Schema Management Backend API for creating, updating and revoking schemas
The Schema Management UI
Schema Lifecycle Management (Governance Perspective)
Schema Version Creation: A Governance Administrator creates a new version of a schema by submitting a SHACL file and its associated metadata (e.g., version number, changelog) to the SMS Management API. The SMS validates and stores the new version.
Schema Publication: To make an entire schema family available for use, the Administrator uses the SMS Management API to change the status of the Schema Concept to PUBLISHED.
Event Notification: Upon successfully changing the status, the SMS:
Updates its internal database to reflect the new status.
Publishes a SchemaPublished event. This event contains the schema’s metadata, its new status, and the content of its versions.
Schema Revocation: If a schema family is no longer approved for use, the Administrator changes its status to REVOKED via the API. This triggers a SchemaRevoked event, preventing new data from being validated against any version of this schema.
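The lifecycle above (create versions, publish the Schema Concept, notify subscribers, revoke) can be sketched as a small in-memory service. This is an illustration, not the SMS implementation: the class and method names, the `DRAFT` initial status and the event dictionary layout are hypothetical; only the `PUBLISHED` and `REVOKED` statuses and the `SchemaPublished` and `SchemaRevoked` event names come from the text. SHACL validation of a submitted version is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class SchemaConcept:
    name: str
    status: str = "DRAFT"  # hypothetical initial status
    versions: Dict[str, dict] = field(default_factory=dict)


class SchemaManagementService:
    def __init__(self, publish_event: Callable[[dict], None]) -> None:
        self._concepts: Dict[str, SchemaConcept] = {}
        self._publish_event = publish_event

    def create_version(self, name: str, version: str, shacl: str,
                       changelog: str = "") -> None:
        concept = self._concepts.setdefault(name, SchemaConcept(name))
        if version in concept.versions:
            raise ValueError(f"{name} {version} already exists")
        concept.versions[version] = {"shacl": shacl, "changelog": changelog}

    def set_status(self, name: str, status: str) -> None:
        if status not in ("PUBLISHED", "REVOKED"):
            raise ValueError(f"unsupported status: {status}")
        concept = self._concepts[name]
        concept.status = status  # 1. update the internal state
        event = {                # 2. notify subscribers
            "type": "SchemaPublished" if status == "PUBLISHED" else "SchemaRevoked",
            "schema": name,
            "status": status,
        }
        if status == "PUBLISHED":
            event["versions"] = {v: d["shacl"] for v, d in concept.versions.items()}
        self._publish_event(event)

    def can_validate(self, name: str) -> bool:
        # Only published schema families may be used to validate new data.
        concept = self._concepts.get(name)
        return concept is not None and concept.status == "PUBLISHED"
```

Passing `publish_event` as a callback keeps the sketch independent of the actual messaging technology used to deliver the events.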
The content of this section presents a view of the currently available release for the GA, Data Provider, Infra Provider and Consumer. The Application Provider view falls outside the scope of the current release.
The following Technology Deployment View describes how the different technology components are deployed for all Simpl-Open agent types (Governance Authority, Data Provider, Infrastructure Provider, Application Provider, Consumer):
Simpl-Open is designed to be a container-native application and is provided with all the required deployment artefacts to be deployed on a pre-existing Kubernetes Cluster.
Each agent is deployed inside its own Kubernetes Namespace.
Three types of workloads are used:
Deployment - used for managing a stateless application workload, where any Pod in the Deployment is interchangeable and can be replaced if needed.
StatefulSet - used to run one or more related Pods that do track state somehow (for example, if the workload records data persistently). StatefulSet can match Pods with PersistentVolumes.
DaemonSet - used for Pods that provide facilities that are local to nodes. Every time a node is added to the cluster, and it matches the specification in a DaemonSet, the control plane schedules a Pod for that DaemonSet onto the new node. Each pod in a DaemonSet performs a job similar to a system daemon on a classic Unix / POSIX server.
Kubernetes Services are used to expose certain components, running as one or more pods, behind a single outward-facing endpoint, even when the workload is split across multiple nodes.
The present section is divided into two parts:
A 3-year roadmap with draft considerations about Open-Source Software product selection;
Open-Source Product Decisions, made as the architecture is further analysed and components/interfaces are confirmed.
The Roadmap OSS selection is assumed to be valid until the respective capabilities are confirmed or amended by the Decisions, which will be made in Agile fashion, quarter by quarter.
The following illustration presents the draft 3-year roadmap of the Open-Source Software product selection to implement the functional capabilities required by Simpl-Open.
The table below presents the rationale for selection available today.
As a general process, quarter by quarter and release by release, the Architecture team will further analyse each capability and confirm or amend the selection based on detailed requirements and detailed architecture, including interaction with other technologies/components.
The draft table below provides a first rationale for the selection, identified at a preliminary stage.
| Tools | Description | Rationale |
|---|---|---|
| Eviden Open-Source | Partitum is a proven solution component of the Eviden Clearing House as a Service. This solution is currently running at Athumi (Belgium – Flanders). It is a Data Space intermediary that is responsible for securely exchanging data between the different actors in a Data Space community and for monetisation. The product provides the necessary tools to remove the financial burden for the actors by: onboarding the different actors in the ecosystem and taking care of the contractual and financial agreements necessary to exchange data; clearing transactions based upon contractual agreements between the actors and their risk profile; settling executed transactions between different actors; automatically invoicing through billing or self-billing. | No integrated toolset available in the market matching client requirements. |
| DAPS | Issues dynamic identity attributes based on a scoped request. | It fits the second authentication mechanism described in Annex III of the “Architecture Vision Document”, where identity attributes are dynamically issued by the Identity Attributes Provider along with an ephemeral proof. |
| EJBCA | Public key infrastructure certificate authority software. | It is needed in all the envisioned authentication mechanisms between Participants, as they require the issuance of an X.509 certificate. |
| Keycloak | Identity and Access Management software. | This component will manage the authentication and authorisation of the End Users. It can be easily federated with existing Participants’ identity providers and extended to implement several types of authentication mechanisms (2FA, Digital Wallet, etc.). |
| ELK Stack | https://www.elastic.co/elastic-stack/ | As suggested by Tenders Specifications and based on Market Standard. |
| Prometheus | https://prometheus.io/ | As suggested by Tenders Specifications and based on Market Standard. |
| Grafana | https://grafana.com/ | As suggested by Tenders Specifications and based on Market Standard. |
| MTLS | Mutual TLS (mTLS) is a security practice that provides encrypted communications between every workload and application in your infrastructure, regardless of location. | Recognised protocols by several Open-Source products. |
| Crossplane | Crossplane enables cloud-agnostic infrastructure provisioning and management. | To abstract away cloud-specific APIs, enabling consistent control of resources across various cloud providers. It empowers DevOps teams to define infrastructure as code (IaC) and easily manage multi-cloud environments, enhancing agility and reducing vendor lock-in. |
| Terraform | Terraform automates infrastructure as code, simplifying provisioning and scaling. | For its declarative IaC approach, enabling infrastructure automation through code. Terraform's extensive provider ecosystem ensures broad cloud support and efficient orchestration, facilitating rapid scaling and reducing operational overhead. |
| Ansible | Ansible orchestrates application deployment and configuration with minimal complexity. | Agentless automation for simplified application provisioning and configuration management. Ansible's idempotent playbooks, robust modules, and YAML-based syntax simplify complex tasks, ensuring consistency and efficient operations across infrastructure. |
| Kubernetes | Kubernetes is a container orchestration platform, simplifying application deployment and scaling. | For containerised workload management and orchestration. Its advanced features, including auto-scaling, rolling updates, and service discovery, simplify application lifecycle management and enhance resource utilisation, making it a top choice for container-based applications. |
| UFW | Uncomplicated Firewall (UFW) simplifies firewall management for Linux systems. | Straightforward firewall rule management on Linux. Its user-friendly interface and uncomplicated syntax make it a powerful tool to secure systems against unwanted network traffic while simplifying the configuration of firewall policies. |
| WireGuard | WireGuard offers secure, efficient VPN solutions for network privacy and protection. | To secure network communications with state-of-the-art cryptography. Its lightweight design, minimal attack surface, and dynamic routing capabilities provide robust VPN security, ensuring high-speed, low-latency connections for infrastructure. |
| nftables | nftables is a versatile packet filtering framework for fine-grained network control. | For advanced network filtering and routing. Its expressive syntax and performance optimisations help network administrators to efficiently manage packet filtering, firewall rules, and network address translation (NAT). |
| ModSecurity | ModSecurity provides web application firewall (WAF) protection against online threats. | To secure web applications with robust WAF capabilities. Its comprehensive rule sets and real-time threat detection safeguard applications from web-based attacks, ensuring data integrity and user trust. |
| Ceph | Ceph is a distributed storage system for scalable, reliable data storage. | For cost-effective, highly available storage solutions. Its distributed architecture, erasure coding, and RADOS (Reliable Autonomic Distributed Object Store) technology deliver scalable, fault-tolerant storage, making it ideal for cloud and data-intensive workloads. |
| OKD (OpenShift) | OKD, the open-source version of OpenShift, offers container orchestration and management. | To deploy, manage, and scale containerised applications with Kubernetes simplicity. OKD's developer-friendly features, integrated CI/CD, and extensive ecosystem enhance DevOps workflows and application delivery, without worrying about the infrastructure. |
| OpenStack | OpenStack is an open-source cloud computing platform for building private and public clouds. | To create customisable, private cloud environments. The modular architecture provides flexibility and control over cloud resources, enabling tailored cloud solutions, reducing costs, and avoiding vendor lock-in. |
| Kubeless | Kubeless is a serverless framework for Kubernetes, enabling function-as-a-service (FaaS). | Serverless application development over Kubernetes. Simplifies event-driven, microservices-based architectures, providing rapid scaling and efficient resource utilisation, perfect for modern application workloads. Suitable for providers who are already running Kubernetes. |
| OpenHPC | OpenHPC provides a comprehensive high-performance computing (HPC) stack for clusters. | To build and manage high-performance computing clusters. OpenHPC simplifies the integration of HPC software components, ensuring optimised performance for scientific and computational workloads. |
| OpenWhisk | OpenWhisk is an open-source serverless platform with support for multiple programming languages. | Serverless capabilities for flexible, event-driven application development. OpenWhisk's language-agnostic approach simplifies serverless computing, facilitating faster development and deployment of cloud-native functions. |
| eDelivery | eDelivery helps public administrations to exchange electronic data and documents with other public administrations, businesses and citizens at the national level and across borders, in an interoperable, secure and reliable way. | Part of Digital Building Blocks from European Commission. |
| eSignature | The DIGITAL eSignature Building Block allows public administrations, businesses, and citizens to electronically sign any document, anywhere in Europe, at any time, in line with the eIDAS Regulation for e-signatures, e-seals and related services offered by Trust Service Providers. | Part of Digital Building Blocks from European Commission. |
| eInvoicing | The eInvoicing Building Block aims to promote the successful uptake of electronic invoicing in Europe, respecting the European standard on electronic invoicing and Directive 2014/55/EU on electronic invoicing in public procurement. | Part of Digital Building Blocks from European Commission. |
| eID | The eID Building Block allows public administrations and private service providers to easily extend the use of their online services to citizens from other Member States, in line with the eIDAS Regulation. In the digital age, public administrations and businesses need to carry out fast, secure electronic transactions and validate the identities of those involved with the same legal validity as traditional paper processes. Electronic identification (eID) makes this possible. | Part of Digital Building Blocks from European Commission. |
| Eclipse EDC |
The EDC connector is a software installed by the participating company or a platform thereby providing technical access to the ecosystem. A connector can consist of monolithic or self-contained software. |
As an open source project hosted by the Eclipse Foundation, the EDC provides a growing list of modules for many widely-deployed cloud environments (AWS, Azure, GCP, OTC, etc.) "out-of-the-box" and can easily be extended for more customised environments, while avoiding any intellectual property rights (IPR) headaches. |
| XFSC Federated Catalogue |
The “Federated Catalogue” service includes a catalogue where Gaia-X resources, asset items, and participants can be found by potential consumers and end users. Resources, asset items and participants are provided at Gaia-X using self-descriptions. |
The reference implementation of organisational Federated Catalogue supporting SD according to the Gaia-X Trustmodel. |
| piveau |
piveau is a data management ecosystem for the public sector. |
It provides components and tools to support the entire data processing chain from harvesting, aggregation, provision, and use. It is highly extensible, focuses on open standards and is designed for use in the cloud and reacts reliably and quickly to unforeseen access peaks. |
| XFSC OCM | The “Organisation Credential Manager” service establishes trust between the different participants within the decentralised Gaia-X ecosystem. It includes all trust-related functions required to manage and offer Gaia-X self-descriptions in the W3C Verifiable Credential Format. | The reference implementation of an organisational Credential Manager according to the Gaia-X Trustmodel. |
| XFSC PCM | The “Credential Manager” service enables Gaia-X users to manage their credentials themselves. To do this, the user needs secure storage (user wallet) and presentation capabilities in the authentication and authorisation processes. | The reference implementation of a personal Credential Manager according to the Gaia-X Trustmodel. |
| Apache Spark | Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters. | Apache Spark is widely adopted by thousands of companies. It also integrates with the major frameworks for Data Science and Machine Learning, SQL Analytics and BI, and Storage and Infrastructure. |
| Great Expectations | A powerful platform to uphold data quality. | Great Expectations offers broad flexibility and control when creating data quality tests. It also provides auto-updating documentation that eases the reporting of test suites and results in collaborative environments. |
| Marquez, OpenLineage | OpenLineage is an open platform for collection and analysis of data lineage. It tracks metadata about datasets, jobs, and runs, giving users the information required to identify the root cause of complex issues and understand the impact of changes. | OpenLineage contains an open standard for lineage data collection, a metadata repository reference implementation (Marquez), libraries for common languages, and integrations with data pipeline tools. |
| MLflow | MLflow is an open-source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. | MLflow offers several key components to access, evaluate, process and deploy Large Language Models (LLMs). |
| Jupyter | JupyterLab is a web-based interactive development environment for notebooks, code, and data. | Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning. A modular design invites extensions to expand and enrich functionality. This tool is widely adopted in the data science community. |
| Superset | Apache Superset is an open-source modern data exploration and visualisation platform. | Superset is fast, lightweight, intuitive, and loaded with options that make it easy for users of all skill sets to explore and visualise their data, from simple line charts to highly detailed geospatial charts. It supports a wide range of databases. |
| UVdesk | https://www.uvdesk.com/en/ | open-source ITSM tool selected as the tool best matching the tender requirements. |
| TheHive | https://thehive-project.org/ | Same toolset as the one used by cert.eu and other government institutions. |
| MISP | https://www.misp-project.org/ | Same toolset as the one used by cert.eu and other government institutions. |
| Spring Cloud Gateway | Software component that acts as an API Gateway. | This component will manage the routing of API requests to the services that compose the SMP middleware. It is easily extensible and configurable in order to implement specific cross-cutting concerns such as security and the control of Access & Usage policies. |
| Spring Cloud Circuit Breaker | Library implementing the Circuit Breaker pattern and other HA patterns. | Mitigates high response times and network errors, enhancing system reliability. It implements the Circuit Breaker, Retry and Bulkhead patterns. It is useful for communication inside and outside the SMP Agent perimeter. |
| Webpack Module Federation | Technology enabling the creation of micro-frontends. | A common Application Shell will be implemented, that dynamically loads the several autonomous Front End modules. Each module can be mapped to a specific micro-service and developed independently by the same Team that is in charge of it, increasing the speed of development of distributed and scalable applications. |
| Aruba Consent Management | Consent management service. | It manages consent given by Data Providers to the Consumers. It binds consents to specific versions of a legal text. Data Providers can revoke their consent at any time. Specific events are raised for every notable change in the system, that can be easily reviewed and audited. |
| Spring Cloud Config | https://spring.io/projects/spring-cloud-config | |
| Swagger | https://swagger.io/ | The de facto standard of documentation for REST APIs. |
| Data Mashup Editor (Eng opensource) | The mission of the Data Mashup Editor is to develop a powerful and intuitive graphical tool that simplifies the process of harmonising data from diverse sources, leveraging cutting-edge technologies and intelligent data integration techniques. The Data Mashup Editor is dedicated to ensuring data accessibility, usability, and accuracy, enabling informed decision-making across industries and domains and unlocking the true value of data assets. | The Data Mashup Editor was chosen as one of the tools for data processing building block and data sharing building block due to its ability to seamlessly handle both real-time and batch data streams, while redirecting the output to various entities adopting different technologies and protocols simultaneously. Its internal architecture makes it highly suitable for cloud deployment, ensuring optimal performance and distributed executions. Additionally, it offers an intuitive user experience through its graphical interface, making it easy for users to utilise the tool effectively. |
| Rule Manager (Eng opensource) | The Digital Enabler Rule Manager is a powerful tool designed for managing trigger rules and automated responses based on specific data values within your platform. This tool offers a user-friendly guided wizard for defining and implementing rules for data processing within the platform. | The Rule Manager was chosen as one of the tools for the data processing building block and the data sharing building block due to its capability to create rules of varying complexity based on the data within the system, which makes it possible to add a monitoring layer to the processing steps. It integrates seamlessly with the Data Mashup Editor, providing a comprehensive solution for data manipulation. Its internal architecture is well-suited for cloud deployment, ensuring excellent performance and distributed executions. Furthermore, its graphical interface provides users with an intuitive experience, simplifying the process of effectively utilising the tool. |
| Airflow | Apache Airflow is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows. | Airflow was chosen as the data orchestration component, in the supporting data services building block, due to its exceptional flexibility, allowing the installation of plugins as needed. Moreover, it seamlessly integrates with cloud architectures, providing excellent support for distributed execution in a microservices environment. |
The table below presents the list of Open-Source Software currently used by Simpl-Open.
| Capability | Sub-Capability | Tool | Description | URL | Rationale | Additional Considerations |
|---|---|---|---|---|---|---|
| Discovery | Metadata | SD (GX-Trustframework) | Metadata of Participants and service offerings (App, Data, Infra) described as GAIA-X Self-Description using an ontology. The SD uses a linked data format and allows the definition of constraints and quality rules. | https://gaia-x.gitlab.io/policy-rules-committee/trust-framework/ |
|
Ontology highly adopted in Data Space initiatives. Best choice to convince participants to provide self-descriptions in this way. It can be easily enhanced with sectoral specific parameters. |
| Discovery | Metadata | XFSC SD Tooling | Tooling to create self-descriptions that describe the service offerings (Data, App, Infrastructure). | https://gitlab.eclipse.org/eclipse/xfsc/self-description-tooling |
|
No other FOSS tool is available to create customised SD. Schemas can be created via the LinkML Generator Tool. Fully customisable SD definitions are possible. |
| Discovery | Catalogue at Governance Authority | XFSC Federated Catalogue | Federated Catalogue providing Discovery capability to look up on Self Descriptions of service offerings (Data, App, Infrastructure). | https://gitlab.eclipse.org/eclipse/xfsc/cat |
|
The only FOSS implementation of a federated catalogue supporting SD, i.e. validating SD when they are published and searching for SD through an internal search engine. It also already supports semantic validation. In addition, the search engine is based on NoSQL, which provides the basis for the knowledge search needed for M2M use cases. CKAN uses PostgreSQL as its database and is therefore not well prepared for ontology searches. Plugins such as SPARQL extensions enable limited ontology search capabilities; however, they do not scale and will fail on the complex knowledge graph searches needed for ML algorithms. Either a property graph database such as Neo4j or an RDF triple store such as Apache Jena Fuseki or Virtuoso is needed. |
| Discovery | Credential Manager at Provider | XFSC OCM | The credential manager to store the Self Descriptions on organisational side. It also covers signing of Self Descriptions created by a provider, revoking a credential, verification and retrieval of credentials as microservices. | https://gitlab.eclipse.org/eclipse/xfsc/organisational-credential-manager-w-stack |
|
Created as part of XFSC, it best matches the needs for SD. It can easily be replaced with any other wallet solution providing the same protocols for exchanging credentials (OIDC4VP and OIDC4VC). |
| Access control & trust | Authentication Provider | Keycloak |
Open-Source Identity and Access Management
Add authentication to applications and secure services with minimum effort.
Keycloak provides user federation, strong authentication, user management, fine-grained authorisation, and more. |
https://www.keycloak.org/ |
|
An on-premise solution that is a de facto standard and offers a wide range of features and a native (Java) extensible interface. |
| Provisioning | VM/Container/Storage provisioning | Crossplane | Crossplane is an open-source Kubernetes add-on that allows infrastructure to be defined and automated using Kubernetes-style configuration files. It extends the Kubernetes API so that cloud resources and services from various providers, such as AWS, GCP, Azure and more, can be provisioned and managed in a unified manner. | https://www.crossplane.io/ |
|
Multi-cloud environment operation. Crossplane simplifies infrastructure management by bringing the benefits of the Kubernetes declarative model to cloud provisioning. By using Crossplane, teams can leverage the familiar Kubernetes tools and workflows to manage infrastructure alongside their applications, leading to a more consistent, scalable, and efficient infrastructure management process. Crossplane is favoured over Terraform (https://blog.crossplane.io/crossplane-vs-terraform/), in part because of its more permissive license. |
| Provisioning | VM/Container/Storage provisioning | OpenTofu | OpenTofu is an open-source Infrastructure as Code (IaC) tool and community-driven fork of Terraform. It allows the definition, provisioning, and management of infrastructure using declarative configuration files. OpenTofu supports a wide range of cloud providers like AWS, Azure, and GCP, as well as on-premise systems. It enables infrastructure automation, version control, and consistency across deployments, while ensuring long-term openness and flexibility free from proprietary constraints. | https://opentofu.org/ |
|
Multi-cloud environment operation. OpenTofu simplifies infrastructure management by using a declarative configuration model to provision and manage cloud and on-premise resources. As a community-driven fork of Terraform, OpenTofu enables teams to define infrastructure as code using a consistent language and workflow, ensuring predictable, repeatable, and scalable infrastructure provisioning. This open and transparent approach supports automation, collaboration, and better alignment between development and operations teams, without reliance on proprietary tooling. |
| Provisioning | VM/Container/Storage provisioning | ArgoCD | ArgoCD is a declarative, GitOps continuous delivery tool for Kubernetes. It is designed to simplify the process of deploying and managing applications on Kubernetes clusters. ArgoCD uses a GitOps approach, which means it uses Git repositories as the source of truth for application configurations. | https://argo-cd.readthedocs.io/en/stable/ |
|
ArgoCD is selected for its ability to simplify and automate the deployment and management of Kubernetes applications. Its declarative, GitOps approach ensures consistency and reproducibility across environments, while features like automated rollouts and rollbacks enhance application availability and resilience. By leveraging ArgoCD, it is possible to set up a continuous delivery pipeline and reduce the complexity associated with manual configuration and deployment processes. |
| Provisioning | Workflow Orchestration | Argo Workflows | Argo Workflows is an open-source container-native workflow engine for orchestrating parallel jobs on Kubernetes. It allows defining complex workflows as YAML files and executing multi-step pipelines directly on Kubernetes, ideal for CI/CD, ML pipelines, and data processing. | https://argo-workflows.readthedocs.io/ |
|
Argo Workflows is chosen for its Kubernetes-native support and flexibility in defining scalable, complex workflows. It allows seamless integration with GitOps, supports step-based and DAG-based workflows, and handles parallel execution effectively. Ideal for task orchestration in cloud-native environments. |
| Provisioning | Event-Driven Automation | Argo Events | Argo Events is an open-source event-driven workflow automation framework for Kubernetes. It allows users to trigger workflows (e.g., Argo Workflows) based on events from various sources such as webhooks, Kafka, S3, schedules, and more. | https://argoproj.github.io/argo-events/ |
|
Argo Events complements Argo Workflows by enabling fine-grained, declarative automation based on real-time system and external events. It helps build responsive, scalable pipelines triggered by real-world conditions or scheduled timers in Kubernetes-native environments. |
| Provisioning | Post Configuration / Application Deployment | Cloud-init | Cloud-init is a popular tool for automating the initialisation and configuration of cloud instances. It is designed to simplify the process of deploying and configuring cloud instances, and is widely used in cloud computing environments. | https://cloud-init.io/ |
|
Cloud-init is selected for its ability to automate the initialisation and configuration of cloud instances, as well as run post-provisioning tasks such as deploying or installing applications, automated network configuration, storage setup, and security hardening. Its modular and customisable approach ensures that instances are properly configured and secured, reducing the risk of errors and improving overall reliability. |
| Provisioning | Storage (Repository) | Gitea | Gitea is a lightweight, open-source, and highly extensible repository management tool that provides a simple and intuitive way to manage code repositories. It offers a web-based interface for creating, managing, and organising code repositories, and provides features such as collaboration, version control, and issue tracking. | https://about.gitea.com/ |
License: MIT License
Community Support: Large and active community with many contributors and users. Documentation Available: Extensive documentation available, including user guides, API references, and tutorials. Extensibility: Highly extensible with support for custom plugins and integrations. Adoption by Business: Widely adopted by businesses and organisations, particularly in the open-source and developer communities. |
Gitea is chosen for its ability to provide a lightweight, flexible, and highly extensible repository management solution. Its ease of use, scalability, and customisability make it an ideal tool for managing code repositories. |
| Data Exchange | Data Exchange Service | EDC | The data exchange service implementing the negotiation protocol (Data Space protocol). | https://projects.eclipse.org/projects/technology.edc |
|
Can be replaced with any other IDS connector implementing the IDSA Dataspace Protocol and using ODRL expressions for policy. The EDC connector is chosen because it has good documentation, provides good interfaces and can be easily customised. In addition, two active communities jointly drive its development: Tractus-X and EDC. The first IDS connector to pass the IDSA certification was the TSI connector, which is based on EDC. EDC is also the only available IDS connector that has already implemented the Dataspace Protocol. Other initiatives will follow. |
|
Monitoring, Logging, Reporting, Audit |
Monitoring and Logging | ELK (Elastic, Logstash, Kibana) | Reliably and securely take data from any source, in any format, then search, analyse, and visualise. | https://www.elastic.co/ |
|
The ELK stack is an industry standard for log management and data analysis due to its scalability and powerful features. Elasticsearch handles large volumes of data with real-time search and analytics, Logstash processes and ingests data from various sources, and Kibana provides intuitive visualizations and reporting. Being open-source, it benefits from a large community, continuous improvements, and extensive plugins. Security features like TLS encryption, role-based access control, and audit logging ensure data protection, making ELK a reliable and versatile solution for diverse use cases. |
| Access control & trust | Commons | OpenBao | Used by Keycloak, EJBCA, Spring Cloud Gateway, and when access to stored credentials is needed by a Java Backend. | https://www.vaultproject.io/ |
|
OpenBao is an open source, community-driven fork of HashiCorp Vault managed by the Linux Foundation. It is a secrets management and encryption platform that securely stores, manages, and encrypts sensitive data such as passwords, API keys, and certificates. It provides secure access, auditing, and revocation of secrets across distributed infrastructure, applications, and services, enabling secure development, deployment, and operation of modern systems. |
| Access control & trust | Commons | MinIO |
MinIO is a high-performance, S3 compatible object store. It is built for
large scale AI/ML, data lake and database workloads. It is software-defined and runs on any cloud or on-premises infrastructure. |
https://min.io/ |
|
MinIO is an open-source, Amazon S3-compatible, distributed object storage server for cloud-native and edge computing applications. It provides a highly available, scalable, and performant storage solution with features like erasure coding, bitrot protection, and encryption, making it suitable for a wide range of use cases, from development to production. |
| Access control & trust | Commons | PostgreSQL | used by Keycloak, EJBCA, Spring Cloud Gateway, and when a DB is needed by a Java Backend | https://www.postgresql.org/ |
|
The World's Most Advanced Open-Source Relational Database |
| Access control & trust |
Common Identity provider |
EJBCA | One of the world's most popular PKIs, EJBCA gives you time-proven flexibility and robustness. Unlike other open-source certificate authority and PKI solutions, EJBCA is platform-independent and can be scaled up and down to match your needs. | https://www.ejbca.org/ |
|
The most mature (23 years), widely used and feature-rich Java-based PKI solution in the open-source landscape. |
| Access control & trust |
Commons Authorisation |
Spring Cloud Gateway | It is a spring project that provides libraries for building an API Gateway on top of Spring WebFlux or Spring WebMVC. Spring Cloud Gateway aims to provide a simple, yet effective way to route to APIs and provide cross cutting concerns to them such as: security, monitoring/metrics, and resiliency. | https://spring.io/projects/spring-cloud-gateway |
|
Built on Spring, a widely used Java backend framework, it also offers a reactive implementation that supports a high level of resiliency and extensibility. |
| Message Broker | Commons | Apache Kafka |
Apache Kafka is an open-source distributed event streaming platform designed for building real-time data pipelines and streaming applications. It serves as a high-throughput, fault-tolerant, and horizontally scalable platform that can handle large volumes of data and stream events in real-time. Kafka uses a publish-subscribe model and durable storage for storing and processing streams of records.
Message Brokerage: In addition to its streaming capabilities, Kafka can effectively serve as a message broker, facilitating communication between different components of a system through the asynchronous exchange of messages. It provides features like message queueing, topic partitioning, and consumer group management, making it suitable for implementing a decoupled, event-driven architecture. |
https://kafka.apache.org/ |
|
Apache Kafka's role as a message broker offers several advantages for handling asynchronous events and message-based communication within distributed systems:
Scalability: Kafka's distributed architecture allows for horizontal scaling, enabling high throughput and low latency message processing even under heavy loads. Durability: Messages are stored durably in Kafka, providing fault tolerance and preventing data loss in case of system failures. Reliability: Kafka ensures reliable delivery of messages to consumers through features like message retention and configurable acknowledgment settings. Decoupling: By decoupling producers and consumers through topics, Kafka enables loosely coupled communication between system components, improving flexibility and resilience. Real-time Processing: Kafka's ability to process and react to events in real-time makes it suitable for use cases requiring low-latency messaging, stream processing, and complex event-driven architectures. |
| Cache | Commons | Redis | Redis Cache is an in-memory data structure store widely used as a caching solution to enhance the performance of applications. By storing frequently accessed data in memory, Redis enables faster data retrieval compared to disk-based databases. It supports a variety of data types such as strings, lists, sets, and hashes, making it versatile for different caching needs. Redis is known for its high throughput, low latency, and scalability, often used for caching web pages, session management, real-time analytics, and message brokering. It also supports persistence, replication, and automatic failover for reliability. | https://redis.io/ |
|
Redis Cache offers several advantages for improving application performance and scalability: Performance: As an in-memory data store, Redis delivers extremely low-latency and high-throughput data retrieval, significantly boosting application speed. Scalability: Redis supports horizontal scaling through clustering and partitioning, allowing it to handle large datasets and heavy traffic efficiently. Flexibility: With support for various data structures such as strings, lists, sets, and hashes, Redis can handle diverse caching and real-time data processing use cases. Persistence and Reliability: Redis offers optional persistence mechanisms like snapshots and append-only files, ensuring durability, while replication and automatic failover provide high availability and fault tolerance. Integration: Redis integrates easily with various programming languages and frameworks, making it a popular choice for developers seeking an efficient, easy-to-deploy caching solution. |
| Data Orchestration | Data Orchestration | Dagster OSS | Dagster is a modern data orchestration platform designed to help teams build, run, and observe data workflows in a structured and reliable way. It provides a framework for defining data transformations as modular, testable units called ops , which can be composed into pipelines or jobs. With strong typing, configuration schemas, and built-in observability, Dagster ensures that data workflows are predictable and maintainable, making it easier to catch errors early and manage complex dependencies. It also integrates with scheduling, monitoring, and external resources, enabling seamless automation and coordination of data tasks across diverse environments. | https://dagster.io/ |
|
Support for modern concepts like built-in data lineage, test-first development, and developer-friendly UX. |
| Discovery | Schema Management | Fuseki | Jena is a Java framework for building Semantic Web applications. It provides extensive Java libraries that help developers write code that handles RDF, RDFS, RDFa, OWL and SPARQL in line with published W3C recommendations. Jena includes a rule-based inference engine to perform reasoning based on OWL and RDFS ontologies, and a variety of storage strategies to store RDF triples in memory or on disk. | https://jena.apache.org/documentation/fuseki2/ |
|
The following table links the OSS components to their architecture documentation and installation guide:
This section presents technical implementation details that are particularly relevant for contributing to Simpl-Open and/or implementing it in a Data Space.
The IAA 2-Tier approach in Simpl-Open is already described in the Data Spaces Concepts section of the Simpl-Open High-Level Overview.
Because of the 2-Tier approach, the components are grouped into Tier 1 and Tier 2.
Tier 1 is meant to be under the control of the governance of the organisation that has become a Participant of a Dataspace. Its components are local to the participant agent and are dedicated to enabling and controlling the access of the organisation’s end users to the resources and functionalities offered by the Simpl-Open agent. They are:
The component responsible for identification and authentication is the Tier 1 Authentication Provider realised using an extended version of Keycloak (OpenID Connect Identity Provider) integrated with the User & Roles component.
The User & Roles component is used to define the roles used by the Authorisation Tier 1, manage the role assignments of Tier 1 Authentication Provider end users, and assign identity attributes to roles (described in the Identity Attributes and User Roles sections below).
The Tier 1 authorisation component manages permissions, determining what actions each end user is authorised to perform on a specific Agent resource. It plays a critical role in maintaining system security by ensuring that end users can access only the specific functions they need. It is realised through an API Gateway, more specifically Spring Cloud Gateway, and relies on the Tier 1 Authentication Provider to retrieve the roles of authenticated end users. These roles are used to enforce RBAC (Role-Based Access Control) policies that authorise or deny access to the requested agent resource.
RBAC policies will be applied to check if the end user has the authorisation to access the requested agent resource/functionality based upon its assigned roles.
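The role check described above can be sketched as follows. This is a minimal illustration in Python, not the Spring Cloud Gateway configuration used by Simpl-Open: the route prefixes and the role-to-route mapping are hypothetical, and only the role values (NOTARY, T2IAA_M, T1UAR_M) are taken from this document.

```python
from typing import Iterable

# Hypothetical policy table: route prefix -> roles allowed to call it.
# The paths are illustrative assumptions; only the role values come from this document.
RBAC_POLICY: dict[str, set[str]] = {
    "/onboarding/requests": {"NOTARY", "T2IAA_M"},
    "/users-roles": {"T1UAR_M"},
}


def is_authorised(path: str, client_roles: Iterable[str]) -> bool:
    """Return True if any of the end user's roles is allowed on the requested path."""
    roles = set(client_roles)
    for prefix, allowed in RBAC_POLICY.items():
        if path.startswith(prefix):
            return bool(roles & allowed)
    # Deny by default when no policy matches the requested resource.
    return False
```

In this sketch a request is denied unless a policy explicitly allows one of the caller's roles, which mirrors the "authorise or deny" decision taken at the gateway.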
The Tier 1 credential consists of an OpenID Connect (OAuth 2.0) access token issued by the Tier 1 Authentication Provider, in the form of a JWT (RFC 7519) that contains standard claims extended with the following four custom claims:
The client roles claim is an array containing the list of roles assigned to the end user through the functionalities of the User & Roles component:
client-roles : [ “NOTARY”, “ONBOARDER_M”]
It is included in every Tier 1 access token under the JWT (RFC 7519) claim name “client-roles”.
The participant ID is the unique and immutable ID used to identify the participant in the tier 2 IAA process. It is represented by a GUID formatted as shown in the following example:
participant_id : “02309243-2f77-456a-a1db-d8e8bb006f74”
It is included in every Tier 1 access token under the JWT (RFC 7519) claim name “participant_id”.
Note that the participant ID never changes over time.
The credential ID is the unique ID used to identify the participant's current credential in the Tier 2 IAA process. It is represented by the Base58BTC (https://digitalbazaar.github.io/base58-spec) encoding of the SHA-384 hash of the Participant's X.509 certificate used to communicate in the data space, as shown in the following example:
credential_id : “z8A3E8X4NkhgnFrczqy54SZjrnoiz6At3rqLosWN75WCkKQEgxmkA3yqpCPtPqHSnS9”
It is included in every Tier 1 access token under the JWT (RFC 7519) claim name “credential_id”.
Note that the credential ID changes over time, for example when a credential is compromised and new credentials must be issued.
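The derivation of a credential ID can be sketched as follows. This is an illustration under stated assumptions, not the Simpl-Open implementation: it assumes that the hash is computed over the DER-encoded certificate bytes and that the leading "z" in the example above is the multibase prefix for Base58BTC. The function names are hypothetical.

```python
import hashlib

BASE58_BTC_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc_encode(data: bytes) -> str:
    """Encode bytes with the Bitcoin Base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_BTC_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by the first alphabet character.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def credential_id(certificate_der: bytes) -> str:
    """Derive a credential ID from certificate bytes (assumed to be DER-encoded)."""
    digest = hashlib.sha384(certificate_der).digest()
    # Assumption: the leading "z" is the multibase prefix for Base58BTC.
    return "z" + base58btc_encode(digest)
```

Because the ID is a hash of the certificate, issuing a new certificate yields a new credential ID, while the participant ID stays the same.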
Participant identity attributes are used to enable the specification of access to a subset of functionalities for a participant. In the context of Tier 2 communication, the presence of Identity Attributes ensures ABAC compliance. Specifically, services provided by dataspace participants to other participants can be protected by one or more Attributes.
A subset of those attributes can be assigned to Tier 1 roles (see Tier 1 User Roles), meaning that every end user holding such a role owns the attributes assigned to it. They are represented as in the following example:
identity_attributes : [ “DATA_CONSUMER”, “DATA_ACCESS_LEVL1”]
It is included in every Tier 1 access token under the JWT (RFC 7519) claim name “identity_attributes”.
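The following sketch shows how the four custom claims could be read from a Tier 1 access token and how an attribute check could use them. It is illustrative only: the helper names are hypothetical, the token is an unsigned example built from the values in this section, and the payload is decoded without signature verification, which a real deployment must not do.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_segment = token.split(".")[1]
    # Base64url segments in a JWT omit padding; restore it before decoding.
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def has_required_attributes(claims: dict, required: set[str]) -> bool:
    """ABAC-style check: every required identity attribute must be present."""
    return required <= set(claims.get("identity_attributes", []))


def _b64url(obj: dict) -> str:
    """Serialise a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Illustrative, unsigned token built from the example values in this section.
example_claims = {
    "client-roles": ["NOTARY", "ONBOARDER_M"],
    "participant_id": "02309243-2f77-456a-a1db-d8e8bb006f74",
    "credential_id": "z8A3E8X4NkhgnFrczqy54SZjrnoiz6At3rqLosWN75WCkKQEgxmkA3yqpCPtPqHSnS9",
    "identity_attributes": ["DATA_CONSUMER", "DATA_ACCESS_LEVL1"],
}
example_token = ".".join([_b64url({"alg": "none"}), _b64url(example_claims), ""])
claims = decode_jwt_payload(example_token)
```

A service protected by the attribute DATA_CONSUMER would accept this token in the sketch, while one requiring an attribute the end user does not hold would reject it.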
Tier 1 roles are the core elements on which the RBAC policies are enforced and are also used by the participant governance to assign a subset of Participant Identity Attributes (see Identity Attributes) to its end users.
Here is the updated list of Roles that are used inside Simpl-Open:
| Human Readable Role Name | Role Value | Description | Predefined | Participant | Assigned Identity Attributes | Id Component |
|---|---|---|---|---|---|---|
| Tier 2 authorisation manager | T2IAA_M | In the Dataspace Governance Authority, the role in charge of defining and changing the onboarding procedure itself, such as setting up the mandatory documents and the rules that the onboarding process follows. | true | Governance Authority |
IAA-ONB-FE IAA-ONB-BE |
|
| Tier 2 authorisation operator | NOTARY | Tier 2 authorisation operator, in charge of handling onboarding requests and following their process. This role asks for further documents, comments on onboarding requests, and rejects or approves them. | true | Governance Authority |
IAA-ONB-FE IAA-ONB-BE |
|
| Tier 2 setup administration role | ONBOARDER_M | tier 2 setup administrator role, the one who is in charge of finalising the tier 2 setup of an agent installation. | true | All Participant |
IAA-U&R-FE IAA-U&R-BE |
|
| Tier 2 identity attributes manager | IATTR_M | This role is present only in the Dataspace Governance Authority and its duties are to cover the whole lifecycle of Identity Attributes, from the creation and management to the assignment to participants | true | Governance Authority |
IAA-SAP-FE IAA-SAP-BE |
|
| Tier 1 user and role manager | T1UAR_M | Tier 1 user and roles manager. In the Dataspace Governance Authority, this role will manage local roles and dataspace identity attributes (defining them and assigning them to participant types + defining their assignability). In any dataspace participant, this role will manage local roles and identity attributes assignment to local roles | true | All Participant |
IAA-U&R-FE IAA-U&R-BE |
|
| Applicant Representative | APPLICANT | end user responsible for onboarding an applicant dataspace participant who sign up the public dataspace onboarding site to manage the onboarding request. Applicant's primary scope is to create an onboarding request and react on the Tier 2 authorisation operator (NOTARY) interaction to get the onboarding request approved | true | Governance Authority |
IAA-ONB-FE IAA-ONB-BE |
|
| Ro-MU-CA | Role defined in XFSC Federated Catalogue: Catalogue Administrator | true | Governance Authority | |||
| Ro-MU-A | Role defined in XFSC Federated Catalogue: Participant Administrator | true | Providers |
DATA_PROVIDER_PUBLISHER APP_PROVIDER_PUBLISHER INFRA_PROVIDER_PUBLISHER |
||
| Ro-SD-A | Role defined in XFSC Federated Catalogue: Self-Description Administrator | true | Governance Authority | |||
| Ro-Pa-A | Role defined in XFSC Federated Catalogue: Participant User Administrator | true | Providers |
DATA_PROVIDER_PUBLISHER APP_PROVIDER_PUBLISHER INFRA_PROVIDER_PUBLISHER |
||
| Researcher | RESEARCHER | Researcher who is able to access research only datasets | false | Consumer | DATA_SEARCHER | |
| SD Publisher | SD_PUBLISHER | Role defined for the user who is responsible for creating and publishing the self-description on the catalogue | true | Providers |
DATA_PROVIDER_PUBLISHER DATA_SEARCHER |
|
| SD Consumer | SD_CONSUMER | Tier-1 Role for Consumer | true | Consumer | CONSUMER | |
| Schema Manager Admin | GA_SCHEMA_ADMIN | Tier-1 Role for Schema Admin | true | Governance Authority | ||
| Schema Manager Viewer | GA_SCHEMA_VIEWER | Tier-1 Role for Schema Viewer | true | Governance Authority | ||
| Kibana Business User | KIBANA_BUSINESS_USER | Role for accessing Kibana as a business user (binded to local Kibana user) | true | All Participant | ||
| Kibana Admin | KIBANA_ADMIN | Role for accessing Kibana as an admin (binded to local Kibana user) | true | All Participant | ||
| Data Orchestration Developer | ORCH_DEVELOPER | Role for developing workfows and services for data orchestration | true |
Consumer Provider |
||
| Data Orchestration Admin | ORCH_ADMIN | Role for administration and management of the orchestration, like setting schedules, retry of Workflows or Monitoring | true |
Consumer Provider |
||
| Infrastructure Provider Admin | INFRA_ADMIN | Role defined for management of all the Infrastructure Provider's cloud resources. | true | Provider | ||
| Infrastructure Provider Deployer | INFRA_DEPLOYER | Role defined for deactivation and triggering of the Infrastructure Provider's cloud resources. | true | Consumer |
Tier 2 is under the control of the Dataspace Governance Authority and is used by all participant agents to ensure secured and encrypted communications (see the Encryption and Guaranteed Authenticity/Integrity sections below). Its components are both centralised (in the Authority Agent) and decentralised (local to all agents).
Identity Provider Federation
This component includes functionalities about identity information and Tier 2 credential creation, validation and management.
Starting from the onboarding process, the Identity Provider will be used for:
Create the credential: when an applicant participant is onboarded by approving its onboarding request, a Tier 2 credential is created by the identity provider. The participant installs the credential within its own agent.
Validate the credential: the identity provider verifies the received identity Tier 2 credentials.
Management: during the lifecycle of a credential, it can be either renewed or revoked by the Dataspace Governance Authority.
Security Attribute Provider Federation
To implement ABAC policies, which are used in agent-to-agent communications, a set of valid and known Identity Attributes is needed. The Governance Authority assigns these attributes to each dataspace participant.
The Security Attribute Provider component implements several functionalities:
Identity Attributes management (create, delete and modify identity attributes)
Identity Attributes Participant assignment (both during the Onboarding and after)
Temporary attestation of the participant’s identity attributes in the form of a signed ephemeral proof
Tier 2 Authentication Provider
This component is responsible for keeping the Tier 2 Credential received during the onboarding process and implements all Tier 2 Identification and Authentication functionalities such as:
Safely store the participant agent's Tier 2 Credential and its keypair.
Check and validate, against the Identity Provider Federation, any Tier 2 credentials coming from other participant agents during mTLS authentication.
Check and validate the ephemeral proof received from other participant agents after a successful mTLS authentication.
Check and validate the Tier 1 credential forwarded by other participant agents against the ephemeral proof (which also contains the caller's Tier 1 Authentication Provider public key).
Request an ephemeral proof from the Security Attribute Provider Federation, to be used in secured communications with other participant agents.
Authorisation Tier 2
This component is realised through an API Gateway, more specifically Spring Cloud Gateway. It relies on the Tier 2 Authentication Provider to check the Tier 2 credentials and the ephemeral proof received during the mTLS authentication process, and then enforces ABAC (Attribute-Based Access Control) policies to authorise or deny access to the requested agent resource.
ABAC policies are enforced in every agent-to-agent communication by verifying whether the requestor's attributes permit access to the requested resource. If needed, ABAC policies can be enforced on both the Tier 1 and Tier 2 credentials, to check that the identity attribute is also present in the Tier 1 credential used by the end user of the calling participant agent.
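The enforcement logic described above can be illustrated with a minimal Python sketch. It is not the Spring Cloud Gateway implementation: the `EphemeralProof` class, the resource paths and the rule table are hypothetical names chosen for illustration, and the proof is assumed to have been validated already.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EphemeralProof:
    """Hypothetical, simplified view of an already validated ephemeral proof."""
    participant_id: str
    identity_attributes: frozenset


# Hypothetical ABAC rules: path prefix -> attributes, any one of which grants access.
ABAC_RULES = {
    "/catalogue/publish": frozenset(
        {"DATA_PROVIDER_PUBLISHER", "APP_PROVIDER_PUBLISHER", "INFRA_PROVIDER_PUBLISHER"}
    ),
    "/catalogue/search": frozenset({"DATA_SEARCHER", "CONSUMER"}),
}


def is_authorised(proof: EphemeralProof, path: str,
                  tier1_attributes: Optional[frozenset] = None) -> bool:
    """Return True if the caller holds at least one attribute required for the path.

    If tier1_attributes is given, the attribute must also be present in the
    Tier 1 credential of the calling end user.
    """
    required = next(
        (attrs for prefix, attrs in ABAC_RULES.items() if path.startswith(prefix)),
        None,
    )
    if required is None:
        return False  # deny by default when no rule matches
    granted = proof.identity_attributes & required
    if tier1_attributes is not None:
        granted = granted & tier1_attributes
    return bool(granted)
```

With these assumed rules, a participant holding only `CONSUMER` may call a path under `/catalogue/search` but is denied on `/catalogue/publish`.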
The Tier 2 credential has the form of an X.509 certificate and is issued by a Certificate Authority embedded in the Identity Provider Federation.
Identity attributes are the most powerful and versatile tool at the disposal of the Dataspace Governance Authority to “design” the governance and the rules in the interactions between Dataspace participants. Some attributes are built into Simpl-Open (Built-in = true) and cannot be modified or removed.
Two important properties can be used in the definition of Identity attributes:
Assignable: if true, the governance of any Participant that receives this identity attribute can assign it to any Tier 1 role and thereby give it to its end users; if false, the identity attribute is Participant-wide and is considered assigned to all end users of the participant.
IsRight: if true, the identity attribute should be considered a special centralised right.
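The effect of the Assignable property on an end user can be sketched as follows. This is an illustrative model only: the `IdentityAttribute` class and the `effective_attributes` function are hypothetical and do not mirror the actual Simpl-Open data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityAttribute:
    value: str
    assignable: bool


def effective_attributes(participant_attributes, role_assignments, user_roles):
    """Compute the identity attributes that apply to one end user.

    participant_attributes: attributes granted to the participant.
    role_assignments: Tier 1 role value -> set of attribute values assigned to it.
    user_roles: Tier 1 role values held by the end user.
    """
    # Non-assignable attributes are participant-wide: every end user has them.
    participant_wide = {a.value for a in participant_attributes if not a.assignable}
    # Assignable attributes reach a user only through a Tier 1 role.
    assignable = {a.value for a in participant_attributes if a.assignable}
    via_roles = set()
    for role in user_roles:
        via_roles |= role_assignments.get(role, set()) & assignable
    return participant_wide | via_roles


attrs = [IdentityAttribute("CONSUMER", False), IdentityAttribute("DATA_SEARCHER", True)]
roles = {"RESEARCHER": {"DATA_SEARCHER"}}
researcher_attrs = effective_attributes(attrs, roles, ["RESEARCHER"])
plain_user_attrs = effective_attributes(attrs, roles, [])
```

In this example the researcher ends up with both `CONSUMER` and `DATA_SEARCHER`, while a user without roles only carries the participant-wide `CONSUMER` attribute.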
Here is the updated list of Identity Attributes that are used inside Simpl-Open:
| Human Readable Attribute Name | Identity Attribute Value | Description | Built-in | Assignable | IsRight | Id Component | Component & Endpoint | Location of configuration |
|---|---|---|---|---|---|---|---|---|
| Consumer | CONSUMER | Identity attribute used to tag the consumer participant | true | false | false | | Should be put in tier2-gateway configuration within GA agent as ABAC configuration | tier2-gateway → spring-configmap.yaml |
| Data Provider | DATA_PROVIDER | Identity attribute used to tag the data provider participant | true | false | false | | Should be put in tier2-gateway configuration within GA agent as ABAC configuration | tier2-gateway → spring-configmap.yaml |
| Application Provider | APP_PROVIDER | Identity attribute used to tag the application provider participant | true | false | false | | Should be put in tier2-gateway configuration within GA agent as ABAC configuration | tier2-gateway → spring-configmap.yaml |
| Infrastructure Provider | INFRA_PROVIDER | Identity attribute used to tag the infrastructure provider participant | true | false | false | | Should be put in tier2-gateway configuration within GA agent as ABAC configuration | tier2-gateway → spring-configmap.yaml |
| Data Provider Publisher | DATA_PROVIDER_PUBLISHER | Identity attribute needed for publishing Data Catalogue | true | true | true | | Should be put in tier2-gateway configuration within GA agent as ABAC configuration | tier2-gateway → spring-configmap.yaml |
| Application Provider Publisher | APP_PROVIDER_PUBLISHER | Identity attribute needed for publishing Application Catalogue | true | true | true | | Should be put in tier2-gateway configuration within GA agent as ABAC configuration | tier2-gateway → spring-configmap.yaml |
| Infrastructure Provider Publisher | INFRA_PROVIDER_PUBLISHER | Identity attribute needed for publishing Infrastructure | true | true | true | | Should be put in tier2-gateway configuration within GA agent as ABAC configuration | tier2-gateway → spring-configmap.yaml |
| Data searcher | DATA_SEARCHER | Identity attribute used to tag an end user who can act only as a searcher in the catalogue and cannot start a contract negotiation or transfer process | true | true | true | | Should be put in tier2-gateway configuration within GA agent as ABAC configuration | tier2-gateway → spring-configmap.yaml |
Built-in identity attributes will be available by default in every Simpl-Open dataspace and cannot be modified by the Governance Authority. The Governance Authority can add custom (not built-in) identity attributes based on specific needs. For example, if a Governance Authority needs to define access levels to resources, it could introduce three new identity attributes such as:
| Human Readable Attribute Name | Identity Attribute Value | Description | Built-in | Assignable | IsRight |
|---|---|---|---|---|---|
| Basic Access Level | ACCESS_LEVEL_BASIC | Basic Access Level | false | true | true |
| Medium Access Level | ACCESS_LEVEL_MEDIUM | Medium Access Level | false | true | true |
| Full Access Level | ACCESS_LEVEL_FULL | Full Access Level | false | true | true |
In mTLS (mutual Transport Layer Security) communication, encryption of in-transit data ensures that the information exchanged between a client and a server is protected from interception or tampering. This encryption is achieved through the following process:
TLS Handshake: Both the client and server initiate a TLS handshake, during which they exchange public keys and agree on encryption algorithms.
Mutual Authentication: Unlike regular TLS, in mTLS both the client and server authenticate each other by exchanging digital certificates, confirming the identity of both parties.
Symmetric Encryption: After authentication, a symmetric encryption key is established and used to encrypt all subsequent data transmitted between the client and server.
Through this process, data in transit is securely encrypted, preventing unauthorised access or modification, while ensuring that both the client and server are trusted entities.
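As an illustration of the mutual authentication step, the following sketch configures both sides of an mTLS connection with Python's standard `ssl` module. It is not how the Simpl-Open gateways are configured; the function names and the optional file-path parameters are assumptions, and the certificate files would be the Tier 2 credential, its key and the issuing CA certificate.

```python
import ssl


def build_server_context(cert_file=None, key_file=None, ca_file=None):
    """TLS server context that refuses clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # This is what turns TLS into mutual TLS: the client must present a certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # CA that issued the peer credentials
    return ctx


def build_client_context(cert_file=None, key_file=None, ca_file=None):
    """TLS client context that verifies the server and presents its own certificate."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname verification by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

The symmetric session keys are negotiated by the TLS library during the handshake, so the application code only has to supply the credential and the trust anchor.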
Supports the measures in place to ensure end-to-end data integrity, such that Simpl-Open agents can validate the authenticity of the delivered information.
This capability is achieved by implementing mTLS communication between
agents, ensuring that communication can be established only between
trusted and known participants from the Authority.
The Governance Authority during the onboarding processes creates unique
Identity Credentials for each participant of the Dataspace. Then the
participant uses the credential during the mTLS communication.
This section lists all components, divided into Frontend (FE) and Backend (BE).
| Id Component | Component | Participant | Endpoints published on tier1-gateway | Endpoints published on tier2-gateway | Configuration URL |
|---|---|---|---|---|---|
| IAA-IDPRO-FE | Identity provider FE | Governance Authority | YES | NO | |
| IAA-IDPRO-BE | Identity provider BE | Governance Authority | YES | YES | |
| IAA-SAP-FE | Security Attribute Provider FE | Governance Authority | YES | NO | |
| IAA-SAP-BE | Security Attribute Provider BE | Governance Authority | YES | YES | |
| IAA-ONB-FE | Onboarding FE | Governance Authority | YES | NO | |
| IAA-ONB-BE | Onboarding BE | Governance Authority | YES | NO | |
| IAA-U&R-FE | User & Roles FE | All Participant | YES | NO | |
| IAA-U&R-BE | User & Roles BE | All Participant | YES | NO | |
| IAA-AUTH-FE | Authentication Provider FE | All Participant | YES | NO | |
| IAA-AUTH-BE | Authentication Provider BE | All Participant | YES | YES | |
| | xsfc-advsearch-be | Providers, Consumers | YES | NO | |
| | simpl-edc | Providers, Consumers | NO | YES | |
| | sd-creator-backend | Providers | YES | NO | |
| | xsfc-catalogue | Governance Authority | NO | YES | |
| | catalogue-query-mapper | Governance Authority | NO | YES | |
| | Infra. Deployment Script Management FE | Providers | YES | NO | |
| | Infra. Deployment Script Management BE | Providers | YES | NO | https://code.europa.eu/simpl/simpl-open/development/infrastructure/infrastructure-be/-/tree/develop?ref_type=heads#configure-tier1-and-tier2-business-logs |
| | schema-sync-adapter | Providers, Consumers | NO | YES | |
| | asset-orchestrator | Providers | YES | NO | https://code.europa.eu/simpl/simpl-open/development/orchestration-platform/asset-orchestrator/-/blob/feature/add-tags-to-workflow-list/README.md?ref_type=heads#tier1-configuration |
| | dagster | Providers | YES | NO | https://code.europa.eu/simpl/simpl-open/development/orchestration-platform/dagster/-/blob/feature/oauth2-proxy-integration/README.md?ref_type=heads#tier1-gateway-configuration-participant |
The metadata will be described as self-descriptions. These are described in this section.
The sub-section Self-Description Tooling introduces the tools used to create self-descriptions and visualises the flow of the steps to be considered. The SD Schema Creator enables customised schemas for each Data Space. Schema Definition Properties lists the proposed attributes that any Simpl Data Space should use. Syntax and schema validation are described in SD Tooling Syntax Validation & Schema Validation.
The structure of Self-Descriptions should be based on the Gaia-X Trustframework. There are already Gaia-X powered Data Spaces providing such an SD. This way, the created SD can be easily reused and extended with the specific requirements of each sectoral Data Space.
Base Entities and their relationships according to the Gaia-X Trustframework
Note
Attributes marked in red color are planned, but not yet implemented.
Data Offering:
| Simpl Attribute | Entity | Attribute | Cardinality | Mandatory / Recommended | Data Type | Constraint | Comment |
|---|---|---|---|---|---|---|---|
| Unique identifier | service-offering | id | 1 | Mandatory | xsd:string | | The id of the ServiceOffering, usually referring to a DID. Set automatically. |
| Name | service-offering | name | 1 | Mandatory | xsd:string | sh:maxLength 255 | A human readable name of the service offering |
| Description | service-offering | description | 1 | Mandatory | xsd:string | sh:maxLength 1000 | a short description of the service offering |
| Location of the dataset (e.g. URL, handle) | service-offering | serviceAccessPoint | 1 | Mandatory | xsd:anyURI | sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)" | a list of Service Access Point which can be an endpoint as a mean to access and interact with the resource |
| Keywords | service-offering | keywords | 0..16 | Recommended | xsd:string | sh:maxLength 50 | list of keywords |
| Language (of the metadata, like the title, description) | service-offering | inLanguage | 1 | Mandatory | xsd:string | sh:languageIn ("bg" "hr" "cs" "da" "nl" "en" "et" "fi" "fr" "de" "el" "hu" "ga" "it" "lv" "lt" "mt" "pl" "pt" "ro" "sk" "sl" "es" "sv") | The language of the content or performance or used in an action. Please use one of the language codes from the IETF BCP 47 standard. See also availableLanguage. |
| Version | | | | | xsd:string | | The version of the self-description. Technical property, set automatically. |
| Creation date | | | | | xsd:dateTimeStamp | | The first onboarding date. Technical property, set automatically. |
| Last update date | | | | | xsd:dateTimeStamp | | The last update date. Technical property, set automatically. |
| SD Schema | | | | | xsd:string | | Reference to the used Schema ID (and version). Technical property, set automatically. |
| Data Provider | provider-information | providedBy | 1 | Mandatory | xsd:string | sh:maxLength 255 | Reference to Participant SD. To be Set automatically. |
| Contact point (who to contact in case of questions/issues) | provider-information | contact | 1 | Mandatory | xsd:string | sh:pattern "^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$" | Email address of the contact point |
| License | offering-price | license | 1..n | Mandatory | xsd:anyURI | sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)" sh:maxLength 255 | A list of SPDX identifiers or URL to document |
| Price Type | | priceType | | Recommended | xsd:string | sh:in("free" "commercial") | Link to price in the future. |
| Price (free, under cost) | offering-price | price | 1 | Mandatory | xsd:decimal | sh:minInclusive 0 | |
| Currency | | currency | | Mandatory | xsd:string | sh:in("BGN" "EUR" "CZK" "DKK" "HUF" "PLN" "RON" "SEK" ) | |
| Access policy (to define who can access the dataset) | service-policy | access-policy | 0..n | Recommended | xsd:string | sh:pattern "[:,\{\}\[\]]|(\".*?\")|('.*?')|[-\w.]+" | a list of policy expressed using a DSL (e.g., Rego or ODRL) (access control, throttling, usage, retention, …) |
| Usage policy (to define how a dataset can be used) | service-policy | usage-policy | 0..n | Recommended | xsd:string | sh:pattern "[:,\{\}\[\]]|(\".*?\")|('.*?')|[-\w.]+" | a list of policy expressed using a DSL (e.g., Rego or ODRL) (access control, throttling, usage, retention, …) |
| Compliance: Indicates compliance with relevant data protection regulations and standards. | service-policy | dataProtectionRegime | 0..n | Recommended | xsd:string | sh:pattern "[:,\{\}\[\]]|(\".*?\")|('.*?')|[-\w.]+" | |
| Provenance | dataset-properties | producedBy | 1 | Recommended | xsd:anyURI | sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)" | a resolvable link to the participant self-description legally enabling the data usage |
| Format under which the data is distributed (e.g. csv, xml, …) | dataset-properties | format | 1 | Mandatory | xsd:string | ||
| Schema of the dataset, depends on the type of data for JSON it would be JSON Schema Description that states what fields the data has and the types. | dataset-properties | openAPI | 0..n | Recommended | xsd:anyURI | sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)" | URL of the OpenAPI documentation |
| Additional Information about the dataset | | additionalInfo | | | xsd:anyURI | sh:pattern "[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)" | |
| Related datasets | dataset-properties | relatedDatasets | 0..n | Recommended | xsd:string | ||
| Target users | dataset-properties | targetUsers | 0..n | Recommended | xsd:string | ||
| Data Quality (to include metrics such as completeness, accuracy, timeliness and other) | dataset-properties | dataQuality | 0..n | Recommended | xsd:string | ||
| Encryption: Describes the encryption algorithms and keys used to secure the data. | dataset-properties | encryption | 0..1 | Recommended | xsd:string | ||
| Anonymisation/pseudonymisation: Indicates whether sensitive information has been anonymised or pseudonymised to protect privacy. | dataset-properties | anonymization | 0..1 | Recommended | xsd:string | ||
| Contract template | contract-template | contractTemplate | 1..n | | xsd:string | sh:in ( "Contract Template 1" "Contract Template 2" "Contract Template 3" ) | Referring to an SD of a Contract Template |
Infrastructure Offering:
| Simpl Attribute | Entity | Attribute | Cardinality | Mandatory / Recommended | Data Type | Constraint | Comment |
|---|---|---|---|---|---|---|---|
| Resource Type | infrastructure-properties | | 1 | Mandatory | xsd:string | sh:in("vm" "container" "block_storage" "object_storage" "relational_db" "document_db") | |
| Region and availability zone | infrastructure-properties | | 1..n | Mandatory | xsd:string | sh:in("eu-west-1" "eu-west-2" "eu-west-3" "eu-central-1" "eu-north-1" "eu-south-1" "eu-south-2") | |
| Size and capacity | infrastructure-properties | | 0..1 | Recommended | xsd:string | sh:pattern "\d+(\.\d+)?\s?(B|KB|MB|GB|TB|PB|EB|ZB|YB)" | |
| Operating system and image | infrastructure-properties | | 0..1 | Mandatory | xsd:string | | |
| Network configuration | infrastructure-properties | | 0..1 | Recommended | xsd:string | | |
| Security settings (access control, security groups/firewalls, encryption) | infrastructure-properties | | 0..1 | Mandatory | xsd:string | | |
| Instance type | infrastructure-properties | | 0..1 | Mandatory | xsd:string | | |
| Storage type | infrastructure-properties | | 0..1 | Mandatory | xsd:string | | |
| Backup and redundancy | infrastructure-properties | | 0..1 | Recommended | xsd:string | sh:in("full-backup" "incremental-backup" "differential-backup") | |
| Scalability options | infrastructure-properties | | 0..1 | Recommended | xsd:string | sh:in("dynamic-scaling" "scheduled-scaling" "sharding") | |
| Monitoring and logging | infrastructure-properties | | 0..1 | Recommended | xsd:string | | |
| Tags and metadata | infrastructure-properties | keywords | 0..16 | Recommended | xsd:string | sh:maxLength 50 | |
| External Url | infrastructure-properties | | 1 | Mandatory | xsd:string | sh:maxLength 255 | |
| Deployment script ID | infrastructure-properties | | 0..1 | Mandatory | xsd:string | | |
termsAndCondition structure (defined by Gaia-X Trustframework)
| Attribute | Cardinality | DataType | Comment |
|---|---|---|---|
| URL | 1 | xsd:string | a resolvable link to document |
| hash | 1 | xsd:string | SHA256 of the above document |
dataAccountExport structure (defined by Gaia-X Trustframework)
The purpose is to enable the participant ordering the service to assess
the feasibility to export its personal and non-personal data out of the
service.
This export shall cover account data e.g., account holder’s billing
information, information on the PII held - but also data provided
previously to the service by the user.
| Attribute | Cardinality | DataType | Comment |
|---|---|---|---|
| requestType | 1 | xsd:string | the means to request data retrieval: API, email, webform, unregisteredLetter, registeredLetter, supportCenter |
| accessType | 1 | xsd:string | type of data support: digital, physical |
| formatType | 1 | xsd:string | type of Media Types (formerly known as MIME types) as defined by the IANA. |
Currently, only mandatory quality rules are supported. A Quality Score can only be calculated for recommended quality rules, so score calculation is not supported either.
Mandatory quality rules are always enforced during the creation of a Self-Description (SD) for an offering, to ensure the data quality of the SD. A resource provider cannot publish an SD that does not comply with the mandatory quality rules.
Quality rules are defined in the schema of the self-description (self-descriptions are semantic RDF graphs) and allow data types, constraints and conditions on those RDF graphs to be expressed. Thus, SHACL (Shapes Constraint Language) constraints are intended to be used as the formal notation to express quality rules.
The quality rules that can be defined for an SD property can be based on the data type and/or on a SHACL constraint. Examples of constraints:
Minimum or maximum length of a string value
Value Ranges for Numbers
Non-Negative Numbers
Regular Expressions (Patterns)
List of allowed values
Constraints based on other properties
Data Model (initial)
To define the quality rules, there are three basic entities:
Quality Rule: The (mandatory) quality rule. It is uniquely defined by an id and contains a textual description of the rule in clear text;
Rule Template: The template for the formal definition of the rule. It contains besides the ID a field with SHACL template that is parameterised. The number of parameters and their type is defined in a parameter_schema, i.e., JSON Blob with the parameter and data types;
Quality Dimension: The quality dimension used to group quality rules, for instance FAIR.
Each Quality Rule has exactly one Dimension and one Rule Template associated. The template_assoc also contains the concrete parameterisation for the rule template.
Score Calculation
The score is calculated by dimension.
$$
\delta_{r,s} = \begin{cases}
1 & \text{if the quality rule } r \text{ is fulfilled for the self-description } s \\
0 & \text{otherwise}
\end{cases}
$$

$$
\text{score}(s, d) = \frac{\sum_{r \in R_d} \delta_{r,s} \cdot w_r}{\sum_{r \in R_d} w_r} \cdot 100
$$

$$
s \text{ is valid} \Leftrightarrow \forall d \in D: \text{score}(s, d) \geq \text{min\_score}_d
$$
The Kronecker delta is 1 if the quality rule r is fulfilled for the self-description s, else 0.
The score for a self-description s and quality dimension d is calculated as the sum, over all quality rules r of the dimension d (R_d), of the delta of each rule multiplied by its weight w_r. This is then normalised by the sum of all weights for the dimension. Because a value between 0 and 100 is desired instead of between 0 and 1, it is multiplied by 100.
A self-description is valid exactly if, for every quality dimension, the score is greater than or equal to the specified threshold (min_score).
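The score formula and the validity condition can be sketched in a few lines of Python. This is an illustration of the formulas only, not the query mapper implementation; the `QualityRule` class and the function names are hypothetical, and the set `fulfilled` stands for the rules whose delta is 1 for a given self-description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityRule:
    rule_id: str
    dimension: str
    weight: float


def score(rules, fulfilled, dimension):
    """Weighted share (0 to 100) of fulfilled rules within one quality dimension."""
    in_dimension = [r for r in rules if r.dimension == dimension]
    total_weight = sum(r.weight for r in in_dimension)
    if total_weight == 0:
        raise ValueError(f"no weighted rules for dimension {dimension!r}")
    achieved = sum(r.weight for r in in_dimension if r.rule_id in fulfilled)
    return achieved / total_weight * 100


def is_valid(rules, fulfilled, min_scores):
    """A self-description is valid if every dimension reaches its minimum score."""
    return all(score(rules, fulfilled, d) >= threshold
               for d, threshold in min_scores.items())


rules = [
    QualityRule("r1", "FAIR", 2.0),
    QualityRule("r2", "FAIR", 1.0),
    QualityRule("r3", "FAIR", 1.0),
]
fair_score = score(rules, fulfilled={"r1", "r3"}, dimension="FAIR")
```

Here rules `r1` and `r3` contribute a weight of 3 out of 4, so the FAIR score is 75 and the self-description is valid for any `min_score` of 75 or lower.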
Calculation Process
The calculation of the score (and the validation of the rules) is done during the publication of the self-description s in the query mapper. First, all the active quality rules are retrieved from the database (with the associated SHACL template). All the rules are looped over and validated against the self-description s. The results are added to the quality report for s . After all the rules are processed, all the quality dimensions d are iterated over. For each dimension the score is calculated.
Next, it is checked whether all the mandatory rules are fulfilled and if the score for each dimension is above the defined threshold. If this is not the case, the publication is aborted, and the quality report is returned to the provider. Else, the publication continues, and the quality report is returned to the provider.
The self-description tooling consists of four different components that are all in their respective repositories:
SD Schema Creator:
SD Schemas
This component creates the schemas that describe the form and
content of the self-description. It is used by the Governance
Authority to set the standard for the Self-Description. Technically
it is done by a set of configuration files in the form of
YAML-Documents. Those files are verified and transformed into an
ontology and SHACL Constraints that are used by the other components
to create the wizards. The component is written in Python, and at
least the YAML configuration needs to be adjusted for Simpl-Open.
SD Creation Wizard API:
SD Creation Wizard API
The main API project. Transforms the SHACL shapes from the SD
Schema Creator into JSON forms that are used by the frontend to
allow the provider to write new Self-Descriptions.
SD Creation Wizard Frontend:
SD Creation Wizard Frontend
Frontend with the forms for the provider to create
Self-Descriptions. Written in Angular and NodeJS. The result is an
SD in the form of a JSON-LD document that can be uploaded to the
catalogue.
SD Validation API:
SD Validation API
Validation of the Self-Description against SHACL files. Might be
used for the Quality Rule Validation. Written in Java.
Background
Self-Descriptions in the context of DATA/APP are documents that describe the service offering (either Data, Application, or Infrastructure). The Schema of the Self-Description defines its format, i.e. it describes the fields of the self-description, their data types and whether they are mandatory.
Component Self-Description Schema Creator :
The Schema-Framework is a component that is able to generate the self-description schemas from configuration files. The idea is that from a simple configuration the schemas are generated and later used by the provider to write the self-description.
It should include validation of the schema files (syntax and semantic).
The basis of the implementation is the repository from Gaia-X Context sd-schemas
Context View
The main actor in the SD Schema Creator is the Data Governance Authority. It can configure the schema by changing the YAML files that define how the schemas for the different services should look.
Component View
The input of the system is the SD Schema Configuration. The file uses the LinkML data model and is serialised as a YAML document. After the configuration is changed, a process is triggered that first checks the syntax and the semantics. After validation, the configuration files are transformed into two different files that describe the semantics. One is an ontology, i.e. a formal representation of the knowledge, which is used as a vocabulary for the SD Tool. The other contains constraints in the form of SHACL shapes that are used as a template to build the forms in the SD Tool. Both semantic files are serialised as Turtle files.
Syntax Validation
There is currently no standard for schema validation of YAML files. Therefore, the SD Schema Description is transformed into a JSON serialisation and a JSON Schema description is used for the syntax validation. This JSON Schema is written by the Data Governance Authority.
Semantic Validation
The Semantic Validation uses a Python script which reads some configuration and guidelines (for instance which fields are mandatory in the schema).
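A mandatory-field check of this kind could look like the sketch below. It is not the actual validation script: it assumes the YAML configuration has already been parsed into a dictionary with a LinkML-style `classes` / `attributes` layout, and the guideline structure, the class name `ServiceOffering` and the function name are hypothetical.

```python
def missing_mandatory_fields(schema_config, guidelines):
    """Return, per class, the mandatory attribute names absent from the configuration.

    schema_config: parsed schema configuration (LinkML-style dictionary).
    guidelines: class name -> list of attribute names that must be declared.
    """
    problems = {}
    classes = schema_config.get("classes", {})
    for class_name, mandatory in guidelines.items():
        declared = classes.get(class_name, {}).get("attributes", {})
        missing = sorted(set(mandatory) - set(declared))
        if missing:
            problems[class_name] = missing
    return problems


# Hypothetical configuration and guideline, for illustration only.
config = {
    "classes": {
        "ServiceOffering": {
            "attributes": {
                "name": {"range": "string", "required": True},
                "description": {"range": "string", "required": True},
            }
        }
    }
}
guidelines = {"ServiceOffering": ["name", "description", "inLanguage"]}
report = missing_mandatory_fields(config, guidelines)
```

An empty report means the configuration satisfies the guideline; in the example above the report flags the missing `inLanguage` attribute.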
Runtime View
For the current release, the system is simply deployed as a GitLab repository. A GitLab CI Pipeline starts if the configuration is changed by the data governance authority and generates new files. If the Data Provider starts the SD Tool, the SD-Tool pulls from the repository the current SHACL Constraints and Ontology.
Vocabulary, Schema and Self-Descriptions
The vocabulary is a formal description of an ontology, representing knowledge and relationships between the terminologies, containing inference and integrity rules for reasoning.
A schema is describing a data object with constraints on the content, structure and meaning of a graph. These conditions may constrain the number of values that a property may have, the type of values, numeric ranges, string matching patterns or logical combinations of constraints.
The Self-Description is an instance of a schema object, meaning that values are assigned to the properties.
Syntax Validation
Syntax Validation during the process of creating the SD comprises the following:
Formatting: Check if the file is malformed (e.g. missing brackets etc.);
Data Types: Check that values are correctly applied according to their data types. The allowed data types are defined in dataTypeAbbreviation.yaml.
The syntax validation for data types in the SD Frontend is based on the schema definition, which is the single point of truth.
The Syntax validation on the provider node is based on the schemas that are imposed by Simpl and are intended to guide the user to provide an error free Self-Description.
The Syntax validation on the Governance Authority Node ensures that only valid Self-Descriptions will be published to the catalogue.
Allowed Data Types
xsd:string: ‘http://www.w3.org/2001/XMLSchema#string’
xsd:boolean: ‘http://www.w3.org/2001/XMLSchema#boolean’
xsd:decimal: ‘http://www.w3.org/2001/XMLSchema#decimal’
xsd:float: ‘http://www.w3.org/2001/XMLSchema#float’
xsd:double: ‘http://www.w3.org/2001/XMLSchema#double’
xsd:duration: ‘http://www.w3.org/2001/XMLSchema#duration’
xsd:dateTime: ‘http://www.w3.org/2001/XMLSchema#dateTime’
xsd:time: ‘http://www.w3.org/2001/XMLSchema#time’
xsd:date: ‘http://www.w3.org/2001/XMLSchema#date’
xsd:gYearMonth: ‘http://www.w3.org/2001/XMLSchema#gYearMonth’
xsd:Day: ‘http://www.w3.org/2001/XMLSchema#Day’
xsd:hexBinary: ‘http://www.w3.org/2001/XMLSchema#hexBinary’
xsd:base64Binary: ‘http://www.w3.org/2001/XMLSchema#base64Binary’
xsd:anyURI: ‘http://www.w3.org/2001/XMLSchema#anyURI’
xsd:QName: ‘http://www.w3.org/2001/XMLSchema#QName’
xsd:NOTATION: ‘http://www.w3.org/2001/XMLSchema#NOTATION’
xsd:dateTimeStamp: ‘http://www.w3.org/2001/XMLSchema#dateTimeStamp’
xsd:enum: ‘http://www.w3.org/2001/XMLSchema#enum’
xsd:integer: ‘http://www.w3.org/2001/XMLSchema#integer’
xsd:address: ‘http://www.w3.org/2001/XMLSchema#address’
xsd:nonNegativeNumber: ‘http://www.w3.org/2001/XMLSchema#nonNegativeNumber’
did:example: ‘https://www.w3.org/TR/did-core/#example’
dct:location: ‘http://dublincore.org/usage/terms/history/#Location-001’
trusted-cloud:meaningfulString: ‘class-placeholder-from-dataTypeAbbreviation.yaml’
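The two syntax checks described above (formatting and data types) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SD Tool implementation: `ALLOWED_TYPES` is a hand-copied excerpt of the list above rather than a parse of dataTypeAbbreviation.yaml, and the `@value`/`@type` layout of typed literals is assumed from common JSON-LD usage.

```python
import json

# Excerpt of the allowed data types listed above (assumption: a real tool
# would load the complete list from dataTypeAbbreviation.yaml).
ALLOWED_TYPES = {"xsd:string", "xsd:boolean", "xsd:decimal", "xsd:integer",
                 "xsd:dateTime", "xsd:anyURI"}


def check_syntax(document: str) -> list[str]:
    """Return a list of syntax errors; an empty list means both checks passed."""
    # Formatting check: a malformed file (e.g. a missing bracket) fails to parse.
    try:
        data = json.loads(document)
    except json.JSONDecodeError as exc:
        return [f"malformed document: {exc.msg} at line {exc.lineno}"]

    errors: list[str] = []

    def walk(node, path="$"):
        # Data type check: every typed literal must use an allowed type.
        if isinstance(node, dict):
            if "@value" in node and node.get("@type") not in ALLOWED_TYPES:
                errors.append(f"{path}: data type {node.get('@type')!r} is not allowed")
            for key, value in node.items():
                walk(value, f"{path}.{key}")
        elif isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, f"{path}[{index}]")

    walk(data)
    return errors
```

Returning a list of messages rather than raising on the first problem lets a frontend show all findings to the user at once.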
Semantic Validation
Semantic Validation during the process of creating the SD comprises:
the verification of property patterns;
data ranges;
other constraints;
the cardinality of the properties;
the ontology/vocabulary compliance.
Examples:
Value Ranges;
Length;
Pattern;
Value Comparison;
Memberships;
Logical.
Constraints can be defined using the Shapes Constraint Language (SHACL).
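The kinds of semantic checks listed above (cardinality, value ranges, patterns) can be illustrated with a small sketch. It deliberately uses plain Python instead of a SHACL engine, so it only mirrors the idea of a property shape; the `PropertyRule` class, its field names and the example properties are hypothetical and not part of the Simpl-Open schemas.

```python
import re
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PropertyRule:
    """Hypothetical, simplified stand-in for a SHACL property shape."""
    name: str
    min_count: int = 0                 # cardinality, lower bound
    max_count: Optional[int] = None    # cardinality, upper bound
    pattern: Optional[str] = None      # string matching pattern
    min_value: Optional[float] = None  # numeric range, inclusive
    max_value: Optional[float] = None


def validate(sd: dict[str, list[Any]], rules: list[PropertyRule]) -> list[str]:
    """Check a Self-Description given as {property: [values]} against the rules."""
    errors = []
    for rule in rules:
        values = sd.get(rule.name, [])
        if len(values) < rule.min_count:
            errors.append(f"{rule.name}: expected at least {rule.min_count} value(s)")
        if rule.max_count is not None and len(values) > rule.max_count:
            errors.append(f"{rule.name}: expected at most {rule.max_count} value(s)")
        for value in values:
            if rule.pattern and not re.fullmatch(rule.pattern, str(value)):
                errors.append(f"{rule.name}: {value!r} does not match the pattern")
            if rule.min_value is not None and value < rule.min_value:
                errors.append(f"{rule.name}: {value!r} is below the minimum")
            if rule.max_value is not None and value > rule.max_value:
                errors.append(f"{rule.name}: {value!r} is above the maximum")
    return errors


# Illustrative rules only; the property names are assumptions.
rules = [
    PropertyRule("title", min_count=1, max_count=1),
    PropertyRule("contact", pattern=r"[^@\s]+@[^@\s]+"),
    PropertyRule("sizeGb", min_value=0, max_value=1024),
]
```

In a SHACL-based setup the same rules would roughly correspond to `sh:minCount`, `sh:maxCount`, `sh:pattern`, `sh:minInclusive` and `sh:maxInclusive` on property shapes.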
For this context, both access and usage policies for resources (Data, Application, or Infrastructure) are defined.
The following definitions from the Data Spaces Support Centre (DSSC) are followed:
Access Rules/Policy: define whether access to a resource is allowed or not.
Usage Rules/Policy: define how a resource might or may not be used.
Access control policies control the authorisation to access specific data while the data rights owner retains direct control over the data. Usage policies, including consent, regulate the permissible actions and behaviours related to the utilisation of the accessed data, which means keeping control of data even after the items have left the trust boundaries of the data owner. Policies can only be enforced technically where this is feasible; otherwise, only legal enforcement is possible.
https://dssc.eu/space/BVE/357075567/Access+%26+Usage+Policies+Enforcement#Data-Space-Registry
Following this definition, access policies are checked before the provider hands (at least partial) control over to the consumer. Usage policies describe the behaviour after the consumer has access to the resource (Data, Application or Infrastructure).
Policy Language
A formal and machine-readable way to express and enforce the policies is needed. Open Digital Rights Language (ODRL) is intended to be used to write both access and usage policies. https://www.w3.org/TR/odrl-model/
The key components of an ODRL policy are:
Asset: The digital content or service to which the policy applies;
Permissions: Actions that are allowed with respect to the asset (e.g., read, download);
Prohibitions: Actions that are explicitly forbidden;
Constraints: Conditions or limitations that must be met for the permissions to apply (e.g., time restrictions);
Duties: Obligations that must be fulfilled by the user in order to exercise a permission (e.g., attribution, payment).
Different ways exist to serialise the ODRL expressions, and JSON-LD is intended to be used for this part.
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/123",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "permission": [
    {
      "target": "http://example.com/asset/image123",
      "action": "http://www.w3.org/ns/odrl/2/distribute",
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/purpose",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "http://www.example.com/vocab#nonCommercial"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/payAmount",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "0"
        }
      ],
      "duty": [
        {
          "action": "http://www.w3.org/ns/odrl/2/attribution"
        }
      ]
    }
  ],
  "prohibition": [
    {
      "target": "http://example.com/asset/image123",
      "action": "http://www.w3.org/ns/odrl/2/modify"
    }
  ]
}
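To make the mapping between the key ODRL components and the JSON-LD serialisation concrete, the sketch below loads a shortened copy of the example policy and collects its assets, permitted and prohibited actions, constraints and duties. It is only an illustration: the helper names `short` and `summarise` are hypothetical, and the code treats the policy as plain JSON without any JSON-LD processing.

```python
import json

# Shortened copy of the example policy above, treated as plain JSON.
POLICY = json.loads("""
{
  "@type": "Policy",
  "uid": "http://example.com/policy/123",
  "permission": [{
    "target": "http://example.com/asset/image123",
    "action": "http://www.w3.org/ns/odrl/2/distribute",
    "constraint": [{"leftOperand": "http://www.w3.org/ns/odrl/2/payAmount",
                    "operator": "http://www.w3.org/ns/odrl/2/eq",
                    "rightOperand": "0"}],
    "duty": [{"action": "http://www.w3.org/ns/odrl/2/attribution"}]
  }],
  "prohibition": [{
    "target": "http://example.com/asset/image123",
    "action": "http://www.w3.org/ns/odrl/2/modify"
  }]
}
""")


def short(iri: str) -> str:
    """Return the last path segment of an IRI."""
    return iri.rstrip("/").rsplit("/", 1)[-1]


def summarise(policy: dict) -> dict:
    """Collect the key ODRL components of a policy into a flat summary."""
    permissions = policy.get("permission", [])
    prohibitions = policy.get("prohibition", [])
    return {
        "assets": sorted({rule["target"] for rule in permissions + prohibitions}),
        "permitted": [short(rule["action"]) for rule in permissions],
        "prohibited": [short(rule["action"]) for rule in prohibitions],
        "constraints": [short(c["leftOperand"])
                        for rule in permissions for c in rule.get("constraint", [])],
        "duties": [short(d["action"])
                   for rule in permissions for d in rule.get("duty", [])],
    }
```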
Access Policy
Here is an example of an access policy for a dataset provided by a data provider. The policy will specify who can access the data, under which conditions, and for how long.
Scenario:
The dataset contains research data that can be accessed by different roles:
Researchers: Full access to the data for analysis;
Students: Limited access to anonymised data for study purposes;
External partners: Access to aggregated data for collaboration purposes.
The access is granted for a specific period;
The access is granted only for the geographic location of the EU.
Different datasets for the full data, anonymised data and aggregated data are used.
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/123",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/research123",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "SECURITY_ATTRIBUTE",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      },
      "action": [
        { "name": "http://www.w3.org/ns/odrl/2/read" }
      ],
      "target": "http://example.com/dataset/research123/aggregated",
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/leq",
          "rightOperand": "2024-12-31T23:59:59"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/geq",
          "rightOperand": "2024-01-01T00:00:00"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/spatial",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "http://www.geonames.org/external-partner-location"
        }
      ]
    }
  ]
}
Minimal Access Policy
For the current release, access policies with limited expressive power are planned to be supported. Two different actions can be defined:
http://www.w3.org/ns/odrl/2/read: the attribute holder is able to search for the dataset/application/infrastructure;
http://www.w3.org/ns/odrl/2/use: the attribute holder can consume the dataset/application/infrastructure.
The use action implies read.
Date time constraints are planned to be supported for specifying when the policy is valid.
The placeholders {RESSOURCE_URI}, {POLICY_URI} and {PROVIDER_URI} are later replaced automatically with the correct URIs. {SECURITY_ATTRIBUTE_URI} needs to be specified, as does the action (read for searching, use for consumption, which implies read); documentation listing the available URIs is provided.
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "{POLICY_URI}",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "{RESSOURCE_URI}",
  "assigner": {
    "uid": "{PROVIDER_URI}",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "{SECURITY_ATTRIBUTE_URI}",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      },
      "action": [
        { "name": "http://www.w3.org/ns/odrl/2/{read/use}" }
      ],
      "target": "{RESSOURCE_URI}",
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/leq",
          "rightOperand": "2024-12-31T23:59:59"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
          "operator": "http://www.w3.org/ns/odrl/2/geq",
          "rightOperand": "2024-01-01T00:00:00"
        }
      ]
    }
  ]
}
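The behaviour of the minimal access policy (placeholder replacement, `use` implying `read`, and the date time window) can be sketched as below. This is an assumption-laden illustration rather than the Simpl-Open evaluator: the function names `render` and `is_allowed` and the attribute URI in the sample are hypothetical, and the operator names `leq` and `geq` are taken from the template above as written.

```python
from datetime import datetime

ODRL = "http://www.w3.org/ns/odrl/2/"


def render(template: str, values: dict[str, str]) -> str:
    """Replace {PLACEHOLDER} tokens in a policy template with concrete URIs."""
    for name, value in values.items():
        template = template.replace("{" + name + "}", value)
    return template


def is_allowed(permission: dict, attribute: str, action: str, now: datetime) -> bool:
    """Evaluate one permission of a minimal access policy."""
    if permission["assignee"]["uid"] != attribute:
        return False
    granted = {a["name"].removeprefix(ODRL) for a in permission["action"]}
    if "use" in granted:
        granted.add("read")  # use implies read
    if action not in granted:
        return False
    for constraint in permission.get("constraint", []):
        if not constraint["leftOperand"].endswith("/dateTime"):
            continue  # only date time constraints are evaluated in this sketch
        bound = datetime.fromisoformat(constraint["rightOperand"])
        operator = constraint["operator"].removeprefix(ODRL)
        if operator == "leq" and not now <= bound:
            return False
        if operator == "geq" and not now >= bound:
            return False
    return True


# Sample permission with a hypothetical security attribute URI.
permission = {
    "assignee": {"uid": "urn:example:attribute:researcher"},
    "action": [{"name": ODRL + "use"}],
    "constraint": [
        {"leftOperand": ODRL + "dateTime", "operator": ODRL + "leq",
         "rightOperand": "2024-12-31T23:59:59"},
        {"leftOperand": ODRL + "dateTime", "operator": ODRL + "geq",
         "rightOperand": "2024-01-01T00:00:00"},
    ],
}
```

With this sample, a holder of the researcher attribute may search (`read`) and consume (`use`) the resource during 2024, and is refused outside that window.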
API to get all available attributes, with a description of their semantics. When will this be available, and can a static list be provided so that development can start?
Prioritised before the mTLS;
Availability not clear;
Provide a static list.
API to get the attributes of the searching consumer, used to filter the results of the catalogue search:
over the public key;
from the JWT, where the attributes are in the payload.
How to get the Provider ID when using the Self-Description? Is it possible to get the ID of the provider from an API, in order to add this information to the Self-Description?
the unique ID of the agent is the public key, from the vault (HashiCorp/OCM) or the public endpoint;
in the long run, the self-description of the participants.
Map Policy to ABAC (who is doing it?):
ABAC only for the first layer;
second layer with policy evaluation in EDC.
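One of the options noted above is to read the consumer's attributes from the JWT payload and use them to filter catalogue results. The sketch below illustrates that idea only. The claim name `attributes` and the shape of the `offers` mapping are assumptions, and the payload is decoded without signature verification, which a real service would have to perform first.

```python
import base64
import json


def jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    _header, payload, _signature = token.split(".")
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def consumer_attributes(token: str, claim: str = "attributes") -> set[str]:
    """Read the consumer's attributes from a (hypothetical) payload claim."""
    return set(jwt_payload(token).get(claim, []))


def visible_offers(offers: dict[str, set[str]], attributes: set[str]) -> list[str]:
    """Keep only the offers for which the consumer holds a required attribute."""
    return sorted(name for name, required in offers.items() if required & attributes)
```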
Usage Policy
The IDS Usage Control Language is based on ODRL: https://international-data-spaces-association.github.io/DataspaceConnector/Documentation/v5/UsageControl
The Usage Policy is part of the usage contract, as well as the Self-Description. It contains permissions, prohibitions and obligations.
Usage Policy Examples:
Allow the Usage of the Data
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/UsagePolicy001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/TestData001",
  "action": "http://www.w3.org/ns/odrl/2/use",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/dataConsumer001",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      }
    }
  ]
}
Use Data and Delete it After
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/UsagePolicy001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/TestData001",
  "action": "http://www.w3.org/ns/odrl/2/use",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/consumer001",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      }
    }
  ],
  "constraint": [
    {
      "leftOperand": "http://www.w3.org/ns/odrl/2/deletion",
      "operator": "http://www.w3.org/ns/odrl/2/eq",
      "rightOperand": "after_use"
    }
  ]
}
Restricted Number of Usages
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/UsagePolicy001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/TestData001",
  "action": "http://www.w3.org/ns/odrl/2/use",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/consumer001",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      }
    }
  ],
  "constraint": [
    {
      "leftOperand": "http://www.w3.org/ns/odrl/2/count",
      "operator": "http://www.w3.org/ns/odrl/2/lteq",
      "rightOperand": "10"
    }
  ]
}
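A count restriction such as the one in the policy above has to be backed by some state on the enforcing side. The following sketch shows one possible way to track usages against the limit; the `UsageCounter` class and `count_limit` helper are hypothetical, and the in-memory dictionary stands in for whatever persistent store an actual enforcement component would use.

```python
from typing import Optional


def count_limit(policy: dict) -> Optional[int]:
    """Return the limit of the first 'count' constraint in the policy, if any."""
    for constraint in policy.get("constraint", []):
        if constraint["leftOperand"].endswith("/count"):
            return int(constraint["rightOperand"])
    return None


class UsageCounter:
    """Tracks usages per (consumer, dataset) against a count limit."""

    def __init__(self, limit: int):
        self.limit = limit
        self._used: dict[tuple[str, str], int] = {}

    def try_use(self, consumer: str, dataset: str) -> bool:
        """Record one usage; return False once the limit has been reached."""
        key = (consumer, dataset)
        if self._used.get(key, 0) >= self.limit:
            return False
        self._used[key] = self._used.get(key, 0) + 1
        return True
```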
Duration-restricted Data Usage
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/UsagePolicy001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/TestData001",
  "action": "http://www.w3.org/ns/odrl/2/use",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/consumer001",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      }
    }
  ],
  "constraint": [
    {
      "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
      "operator": "http://www.w3.org/ns/odrl/2/leq",
      "rightOperand": "2024-12-31T23:59:59"
    },
    {
      "leftOperand": "http://www.w3.org/ns/odrl/2/dateTime",
      "operator": "http://www.w3.org/ns/odrl/2/geq",
      "rightOperand": "2024-01-01T00:00:00"
    }
  ]
}
Extended Scenario
The following is another example: an extended usage policy for a dataset provided by a data provider. The policy specifies how a resource can be used once access has been granted.
A dataset contains sensitive health research data. The data provider wants to ensure that this data is used responsibly and in compliance with specific guidelines. The usage policy specifies the following:
The data can only be used for academic research purposes;
The data cannot be shared with third parties;
The data must be deleted after the research project is completed;
The data usage is monitored, and any breach of the policy will result in revocation of access.
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Policy",
  "uid": "http://example.com/policy/usage/001",
  "profile": "http://www.w3.org/ns/odrl/2/odrl.jsonld",
  "target": "http://example.com/dataset/health_research123",
  "assigner": {
    "uid": "http://example.com/provider/dataProvider001",
    "role": "http://www.w3.org/ns/odrl/2/assigner"
  },
  "permission": [
    {
      "assignee": {
        "uid": "http://example.com/roles/researcher",
        "role": "http://www.w3.org/ns/odrl/2/assignee"
      },
      "action": [
        { "name": "http://www.w3.org/ns/odrl/2/use" }
      ],
      "constraint": [
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/purpose",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "http://example.com/purpose/academic_research"
        },
        {
          "leftOperand": "http://www.w3.org/ns/odrl/2/deletion",
          "operator": "http://www.w3.org/ns/odrl/2/eq",
          "rightOperand": "after_use"
        }
      ]
    }
  ]
}
Policy Enforcement
This section presents draft content for a capability that falls outside the scope of the current release; it will be completed at a later time.
Simpl uses the XFSC Federated Catalogue as a catalogue for Data, Apps and Infrastructure (see the architecture document of the XFSC Federated Catalogue).
The Federated Catalogue is not a monolithic application. It consists of multiple components, to reuse existing technology and to allow scaling. Those components can be deployed individually (see section Deployment View).
The components are:
| Name | Responsibility |
|---|---|
| Catalogue | Main component, implementing the core catalogue functionality. |
| Authentication | External component implementing the authentication flow and user management. |
| Graph-DB | Graph database, holding all claims contained in active Self-Descriptions. The Graph database is responsible for executing semantic search queries. |
| File Store | The File store is a blob storage. It holds the Self-Description files and the files for the Schemas. This includes historical versions of the Self-Descriptions and Schemas. |
| Metadata Store | Store for metadata on the Self-Descriptions and Schemas stored in the File Store. |
The architecture of the core component is described in the next sections.
The authentication component is responsible for authenticating users. This is not a central component of the catalogue, as it will be implemented by Lot 1 “Authentication & Authorisation” of the GXFS-DE project. For the catalogue implementation, a mock integration is shown, using common, off-the-shelf software that implements the OpenID Connect standard [14].
The responsibilities of the authentication components are:
Storage of Users;
Storage of user roles for a Participant.
A user belongs to only one Participant, on whose behalf he or she acts (see specification section 2.4 for more details).
For the implementation, Keycloak will be used. It is widely used and is also part of the implementation of other lots, which simplifies the integration of the different lots. The user will get a JSON Web Token (JWT [15]) with user claims and authorities, which is used to authenticate requests to the catalogue REST API.
An alternative implementation would be Lissi [16]. It is not considered further, as it is not as mature as Keycloak.
The graph database holds the claims of verified, active Self-Descriptions. Claims of Self-Descriptions that fail the verification are not added to the graph database. Claims of Deprecated, Expired or Revoked Self-Descriptions will be deleted from the Graph database.
The Graph Database can be considered as a kind of search index. The single source of truth is the active Self-Descriptions, stored in the File Store. This means at any point in time the Graph database can be rebuilt from scratch by reimporting the claims of the Self-Descriptions. This allows the following:
Backup: An explicit backup of the Graph database is not needed. Backing up the Self-Description files (located in the File Storage) and the metadata (located in the Metadata Store) is sufficient to allow the rebuild of the Graph Database;
Scalability: Querying the Graph database might be the most critical part regarding performance. Therefore, the Graph Database can be replicated in the future by multiple, independent instances. Since there are no strict consistency requirements, changes in the Graph can be applied independently. In the control flow, all write operations on Self-Descriptions pass the Metadata Store. Therefore, the consistency can be enforced by that database.
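The idea that the Graph database is a rebuildable search index over the active Self-Descriptions can be sketched as follows. The data shapes here are simplifying assumptions: the File Store and Metadata Store are modelled as dictionaries, a Self-Description is a dictionary with `id` and `claims`, and the status value `"active"` is a hypothetical label.

```python
ACTIVE = "active"  # hypothetical status label

Claim = tuple[str, str, str]  # (subject, predicate, object)


def extract_claims(self_description: dict) -> list[Claim]:
    """Flatten a simplified Self-Description into subject-predicate-object claims."""
    subject = self_description["id"]
    return [(subject, predicate, value)
            for predicate, value in self_description["claims"].items()]


def rebuild_index(file_store: dict[str, dict], metadata_store: dict[str, str]) -> set[Claim]:
    """Rebuild the claim index from scratch, using only active Self-Descriptions."""
    index: set[Claim] = set()
    for sd_hash, status in metadata_store.items():
        if status == ACTIVE:  # deprecated, expired or revoked SDs contribute no claims
            index.update(extract_claims(file_store[sd_hash]))
    return index
```

Because the index is derived entirely from the two stores, dropping and rebuilding it yields the same result, which is why backing up only those stores is sufficient.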
It is not possible to generically return the Self-Description files whose claims influenced a query response. To get the relevant Self-Description files, the query to the Graph Database can be formulated to return the Gaia-X entity that is the credentialSubject of a Verifiable Credential. This entity can then be used as a filter for the Self-Description endpoint, to download the Self-Description file.
Neo4j is used as the implementation of the Graph database.
Limitation: queries to a non-Enterprise Neo4j Graph database return an empty record, rather than an empty list, when no results are found. Even when there is no data in the Graph database, i.e., no claims extracted from Self-Descriptions, there is still a configuration node for the neosemantics module [17], which enables Neo4j to support the RDF data model required here. openCypher queries over all nodes without a WHERE clause or without specifying relationships always return this node, unless access to the configuration node is denied to regular users as follows:
DENY MATCH {*} ON GRAPH neo4j NODES _GraphConfig TO PUBLIC
However, this operation is only supported in Neo4j Enterprise [18]. It was decided not to implement a workaround that involves query rewriting, as this may have harmful side effects.
The File Store is responsible for persisting all file-based content submitted to the catalogue, namely Self-Descriptions and Schemas.
For the sake of simplicity, a folder in the file system is used as the file store. For future scalability, the file store can be realised using an Object Storage or a Database.
The Metadata Store persists the metadata for the elements (Self-Descriptions, Schemas and Trust Anchors). It allows the relevant files in the file storage to be identified efficiently when processing incoming requests.
It is realised as a relational database (e.g., PostgreSQL or MariaDB). Since all write requests are handled by the database, its transactional functionality guarantees the consistency of the data.
The state machine for a Contract Negotiation is visualised in the figure below:
Transitions marked with C indicate a message sent by the Consumer; transitions marked with P indicate a Provider message. Terminal states are final; the state machine may not transition to another state. A new Contract Negotiation (CN) may be initiated if, for instance, the CN entered the TERMINATED state due to a network issue. The message types associated with switching into each state are denoted in the bottom part of each status box. For further information, refer to the contract negotiation protocol section of the specification.
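Since the figure is not reproduced here, the following sketch encodes a contract negotiation state machine as a transition table. Apart from TERMINATED, the state names and the allowed transitions are not stated in this text; they are assumptions taken from the Dataspace Protocol specification and should be checked against the figure and the specification before being relied upon.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "REQUESTED"
    OFFERED = "OFFERED"
    ACCEPTED = "ACCEPTED"
    AGREED = "AGREED"
    VERIFIED = "VERIFIED"
    FINALIZED = "FINALIZED"
    TERMINATED = "TERMINATED"


TERMINAL = {State.FINALIZED, State.TERMINATED}

# (current state, sender) -> states reachable by a message from that sender.
# "C" is the Consumer, "P" the Provider; None means no negotiation exists yet.
TRANSITIONS = {
    (None, "C"): {State.REQUESTED},
    (None, "P"): {State.OFFERED},
    (State.REQUESTED, "P"): {State.OFFERED, State.AGREED},
    (State.OFFERED, "C"): {State.REQUESTED, State.ACCEPTED},
    (State.ACCEPTED, "P"): {State.AGREED},
    (State.AGREED, "C"): {State.VERIFIED},
    (State.VERIFIED, "P"): {State.FINALIZED},
}


def advance(current, sender, target):
    """Return the new state, or raise ValueError if the transition is not allowed."""
    if current in TERMINAL:
        raise ValueError(f"{current.name} is terminal")
    # Either party may terminate a negotiation that is still in progress.
    if target is State.TERMINATED and current is not None:
        return target
    if target not in TRANSITIONS.get((current, sender), set()):
        raise ValueError(f"{sender} cannot move {current} to {target.name}")
    return target
```

Keeping the transitions in a table makes it straightforward to compare the implementation against the diagram, one arrow at a time.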
After successful contract negotiation the Transfer Process can be invoked via the data plane. The state machine for the transfer process is shown in the diagram below:
Any implementation of the Eclipse Dataspace Protocol must implement the state machines shown above, in which the respective contract negotiation and transfer messages trigger the state transitions.
The EDC connector implements the IDS Dataspace Protocol. State Transition Functions can be used to trigger specific actions, such as invoking consent or contract managers. This is described in Contract Negotiation Architecture.
Steps to be done for Contract Negotiations:
Prerequisites a provider has to complete to publish a service offer at a connector (using a provider connector):
Create an Asset on the provider side;
Create a Policy on the provider side;
Create a Contract definition on the provider side.
Steps on the consumer side to request a service offer from a connector (using a consumer connector):
Fetch the catalogue on the consumer side;
Negotiate a contract on the consumer side;
Get the contract agreement ID.
These steps are described in Transfer-01-negotiation.
After a successful negotiation process, the transfer process can be started.
In the first step, the Infrastructure Provider (or Application/Data Providers, as needed) can use the Deployment Script Management UI (and/or API) to add their deployment scripts.
These scripts are either Crossplane or Terraform configuration files that, at the time of execution, will:
A) Provision the infrastructure resources (VM, Container or Storage);
B) Deploy Applications over the infrastructure resources (using Cloud-init, if needed);
C) Load data sets or images on the infrastructure resource (using Cloud-init, if needed).
After the deployment script has been added (via the available UI or the API), the DeploymentScriptID, a unique ID for that deployment script, is returned to the provider.
In the second step, when infrastructure offerings are created (or bundles of app/data + infrastructure, as required for the use cases explained in BP 09A and BP 09B), the DeploymentScriptID is added to the Self-Description (SD).
In the third step, when the offer has been selected and successfully contracted, the Infrastructure Provisioner API (the same API that handles the addition, removal and modification of the Deployment Scripts in step 1) is called (currently via the Data Space Connector Extension), and the DeploymentScriptID is passed to that API for execution.
The DeploymentScriptID is then validated and, if the validation is successful, the deployment script is fetched from the storage and executed. The Infrastructure Provisioner Module will (as explained above):
A) Provision the infrastructure resources (VM, Container or Storage);
B) Deploy Applications over the infrastructure resources (using Cloud-init, at the time of first boot of the instance);
C) Load data sets or images on the infrastructure resource (using Cloud-init, at the time of first boot of the instance).
It then shares the access data back with the consumer. At the time of writing, the access information and credentials are shared by email, but wallet solutions are planned to be used in the future.
The communication between the triggering module and the infrastructure provisioner is done via a message broker, to keep the process asynchronous.
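The three steps and the asynchronous hand-over via a message broker can be illustrated with the sketch below. It is a toy model under explicit assumptions: `queue.Queue` stands in for the message broker, the script store is in memory, the message field names are hypothetical, and no Terraform or Crossplane execution actually takes place.

```python
import queue
import uuid


class DeploymentScriptStore:
    """In-memory stand-in for the deployment script storage (step 1)."""

    def __init__(self):
        self._scripts: dict[str, str] = {}

    def add(self, script: str) -> str:
        script_id = str(uuid.uuid4())  # the DeploymentScriptID returned to the provider
        self._scripts[script_id] = script
        return script_id

    def get(self, script_id: str) -> str:
        if script_id not in self._scripts:
            raise KeyError(f"unknown DeploymentScriptID: {script_id}")
        return self._scripts[script_id]


broker: "queue.Queue[dict]" = queue.Queue()  # stand-in for the message broker


def trigger_provisioning(script_id: str, consumer: str) -> None:
    """Step 3, triggering side: publish a request and return immediately."""
    broker.put({"deploymentScriptId": script_id, "consumer": consumer})


def provision_next(store: DeploymentScriptStore) -> dict:
    """Step 3, provisioner side: validate the ID, fetch the script and 'execute' it."""
    request = broker.get_nowait()
    script = store.get(request["deploymentScriptId"])  # raises if validation fails
    # A real provisioner would hand the script to Terraform or Crossplane here.
    return {"consumer": request["consumer"], "executed": script, "access": "<credentials>"}
```

The triggering side never waits for provisioning to finish, which is the property the message broker is meant to provide.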
The data sharing between two participant agents is done via two connectors based on the Eclipse Dataspace Protocol, which relies on the IDSA Dataspace Protocol. The Dataspace Protocol is divided into two parts: first, Contract Negotiation has to be invoked; after successful negotiation, the Transfer Process can be invoked. Contract Negotiation is done via the contract negotiation protocol for the data exchange service, based on the trust protocol defined by Gaia-X.
The proposed data transaction model scope is compliant with the EDC Dataspace Protocol.
The information model of the dataspace is described here.
The figure sketches two implementations of a participant agent:
For further information refer to the specification section model .
The EDC connector implements the above-mentioned Dataspace Protocol as well as the depicted data and control planes and an additional Management API. This Management API is described in detail here.
The Transfer process is described in the Transfer Process Architecture and is implemented by providing a special extension to back-end systems.
In the current release data orchestration refers to the data plane component responsible for the actual data transfer that takes place after a contract is established between the parties through the connector’s control plane. This orchestration of data flow is a crucial step, as it translates contractual agreements into real actions for exchanging data between a source and a destination.
Although this component will be implemented as an extension of the EDC connector, it is depicted as an external entity in various diagrams and system architectures. This approach underscores the goal of creating an external and independent solution that is agnostic of the specific connector used. Such independence is achievable as long as the connector supports the IDSA Dataspace Protocol, which is a key requirement to ensure interoperability within distributed ecosystems like sovereign data spaces or shared data infrastructures.
The primary role of this orchestrator is to serve as a bridge between the actual data source, located outside of the Simpl system, and the designated destination where the data is intended to flow. It ensures seamless connectivity between these two points, handling the complexities of transferring data across systems that may differ in protocols and technology. Additionally, the orchestrator is designed as a specific component tailored to each type of data source. This specialisation makes it possible to externalise the technical management of the heterogeneous data sources handled in the Simpl scenario, reducing complexity and promoting flexibility in the integration of various data ecosystems.
Steps to be done for transfer process:
Either pull or push pattern can be used for transfer process:
Consumer pull
Provider push
Description according to SAMPLES from EDC: https://github.com/eclipse-edc/Samples/tree/main/transfer
The following diagram presents the state machine for the consumer pull case:
The following diagram presents the sequence diagram for the consumer pull case:
Provider and consumer agree to a contract (not displayed in the diagram);
Consumer initiates the transfer process by sending a DataRequest with destination type HttpProxy;
Provider Data Plane Selector is queried to find a suitable instance;
Provider Control Plane builds a DataAddress of type EDR, whose:
endpoint corresponds to the public API of the selected Data Plane;
auth key is Authorization;
auth code is a signed token generated by the Control Plane with the following claims:
dad claim containing the encrypted DataAddress of the actual data source (provider ecosystem);
cid claim containing the contract id.
This DataAddress is sent to the consumer Control Plane through the DSP protocol;
Consumer Control Plane converts the DataAddress into an EndpointDataReference object and dispatches it through the EndpointDataReferenceReceiverRegistry.
Once this process is completed, the consumer backend applications can use the received EndpointDataReference to query data from the provider Data Plane, simply by providing the received token in the request header.
NOTE: For a Data Plane instance to be eligible for the Consumer Pull transfer, it must:
contain HttpProxy in its allowedDestTypes;
contain a property with key publicApiUrl, whose value is the actual URL of the Data Plane public API.
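The final step of the consumer pull flow, in which a backend application queries the provider Data Plane with the received token, can be sketched as below. The `EndpointDataReference` dataclass is a simplified, hypothetical view of the fields described above, not the EDC class, and the request is only built, not sent.

```python
import urllib.request
from dataclasses import dataclass


@dataclass
class EndpointDataReference:
    """Simplified view of the EDR fields described above (not the EDC class)."""
    endpoint: str   # public API URL of the selected provider Data Plane
    auth_key: str   # name of the request header that carries the token
    auth_code: str  # signed token issued by the provider Control Plane


def build_pull_request(edr: EndpointDataReference, path: str = "") -> urllib.request.Request:
    """Build (but do not send) the HTTP request a consumer backend would issue."""
    request = urllib.request.Request(edr.endpoint.rstrip("/") + path, method="GET")
    request.add_header(edr.auth_key, edr.auth_code)
    return request
```

A caller would pass the returned object to `urllib.request.urlopen` to perform the actual transfer.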
The following diagram presents the state machine for the provider push case:
The following diagram presents the sequence diagram for the provider push case:
Provider and consumer agree to a contract (not displayed in the diagram);
Consumer initiates the transfer process, i.e. sends DataRequest with any destination type other than HttpProxy;
Provider Control Plane retrieves the DataAddress of the actual data source and creates a DataFlowRequest based on the received DataRequest and this data address;
Provider Control Plane asks the selector which Data Plane instance can be used for this data transfer;
Selector returns an eligible Data Plane instance (if any);
Provider Control Plane sends the DataFlowRequest to the selected Data Plane instance through its control API (see DataPlaneControlApi);
Provider Data Plane validates the incoming request;
If the request is valid, Provider Data Plane returns an acknowledgement;
DataPlaneManager of the Provider Data Plane processes the request: it creates a DataSource/DataSink pair based on the source/destination data addresses;
Provider Data Plane fetches data from the actual data source (see DataSource);
Provider Data Plane pushes data to the consumer services (see DataSink).
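The last steps of the provider push flow, in which a DataSource/DataSink pair moves the data, can be illustrated with a minimal sketch. The protocol and class names below echo the terms used in the steps but are not the EDC interfaces; the in-memory source and sink are hypothetical stand-ins for a real data source and consumer service.

```python
from typing import Iterable, Iterator, Protocol


class DataSource(Protocol):
    def read(self) -> Iterator[bytes]: ...


class DataSink(Protocol):
    def write(self, chunks: Iterable[bytes]) -> int: ...


class InMemorySource:
    """Hypothetical source standing in for the provider's actual data source."""

    def __init__(self, payload: bytes, chunk_size: int = 4):
        self.payload, self.chunk_size = payload, chunk_size

    def read(self) -> Iterator[bytes]:
        for start in range(0, len(self.payload), self.chunk_size):
            yield self.payload[start:start + self.chunk_size]


class InMemorySink:
    """Hypothetical sink standing in for the consumer's destination."""

    def __init__(self):
        self.received = bytearray()

    def write(self, chunks: Iterable[bytes]) -> int:
        for chunk in chunks:
            self.received.extend(chunk)
        return len(self.received)


def transfer(source: DataSource, sink: DataSink) -> int:
    """Stream all chunks from the source to the sink and return the byte count."""
    return sink.write(source.read())
```

Because `transfer` depends only on the two protocols, a source or sink for another technology can be added without changing the transfer logic.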
For visualisation, Apache Superset is chosen. Superset is a modern data exploration and data visualisation platform. It integrates well with a variety of data sources and is open source under the Apache License. Out of the box, it comes with features to create dashboards and to explore data. It also provides security configurations. A REST API for user and role management can be enabled, and permissions can be customised. Superset's public REST API follows the OpenAPI specification and is documented here.
The community also provides automated builds for multiple platforms, as well as prebuilt Docker images from a Superset Docker Hub repository.
The following table identifies the different types of logs that can be generated by an IT system together with their definition/description:
| Grouping | Type of logs | Description |
|---|---|---|
| Business logs | Business logs | Record significant events or actions (related to steps within a business process or other functional use cases) that occur within a system, typically used for security, audit and troubleshooting purposes. Ex: Submission of a contract offer by a provider to a consumer. |
| Technical logs | Application logs | Record events and activities generated by an application during its runtime, typically used for troubleshooting, monitoring performance and auditing activities within the application. |
| | Database logs | Record events and activities generated by a database (queries, transactions, schema changes), typically used for troubleshooting, ensuring data integrity (e.g., monitoring transaction rollbacks, deadlocks, or schema violations) and auditing access. |
| | System logs | Record events and activities generated by the operating system (OS) and system-level processes. These logs provide valuable information for monitoring system health, diagnosing issues and ensuring security. System logs can include: low-level system events (kernel event, hardware error), system-level events (service startups/shutdowns/failure), authentication and authorisation events (login attempts, privilege escalation). |
| | Network logs | Record events and activities related to network traffic, devices and communications within a network. These logs are essential for monitoring network health, diagnosing issues and ensuring security. Network logs can include: firewall logs (allowed/denied connections, intrusion detection alerts, security policy violations), router and switch logs (device startups, interface status changes, routing protocol updates), DNS logs (queries/responses, cache activity, DNS server configuration changes and errors), proxy logs (user access, URL requests, content filtering, bandwidth usage) and network traffic logs (packet-level data, including source and destination IP addresses, port numbers, protocols, packet payloads). |
| | Security logs | Security logs are not a distinct type of log; they are a subset of all the other logs listed above that allows security incidents to be detected and responded to effectively. Ex: Intrusion detection alerts, security policy violations, anti-virus scans. |
| Infrastructure metrics | Infrastructure metrics | A metric is a piece of data that has a name, optional labels and a value. It is not a log per se, as it needs to be retrieved by periodically scraping an endpoint of the host system (pull instead of push paradigm). Once retrieved, the information is then persisted as a log. |
| Health check | Health check | A health check is a procedure that helps to determine whether a component is functioning correctly. Just like infrastructure metrics, health checks are not logs per se; each component exposes an API that returns a simple status on the health of the component, and this API is queried periodically. |
For the sake of simplicity, application, database, system, network and security logs are grouped under the more generic term of Technical Logs.
| Use case | Type of logs required | Type of metrics required | Description |
|---|---|---|---|
| Log and monitor business actions, mostly for audit purposes. | Business logs | | A business log in this case represents a specific step in a business process that is relevant/meaningful to be tracked. E.g. Submission of an onboarding request. |
| Log and monitor consumption of a resource (infra/data/app) for various reasons (billing, audit, policy enforcement, regulations compliance ...). | | Infrastructure metrics | Depending on the type of data or infrastructure resource that is being consumed, different metrics can be relevant: CPU, RAM, I/O, transfer speed, ... |
| | Technical logs | | For application usage and for some data usage cases, application and database logs will give information on what is being done with the data/application. |
| Log and monitor the usage of a Simpl-Open agent (of its components) for the purposes of audit and troubleshooting. | Technical logs | | All types of technical logs are relevant for troubleshooting purposes and some may also be relevant for audit. |
| | Business logs | | A business log is generated for each incoming and outgoing operation at the boundaries of the agent (communication towards Tier 1 or Tier 2 users). |
| | | Infrastructure metrics | Infrastructure metrics generated by the deployed components of the agent (CPU, RAM, Disk, ...). |
| Monitor the health of the Simpl-Open agent. | | Health check | Health is not logged but only monitored (the monitoring queries each technical component in real time to get its health status). |
Business logs are generated for each type of operation on the Simpl-Open agent:
For synchronous operations:
Request;
Response.
For asynchronous operations:
Request;
ACK of the request;
Callback;
ACK of the callback.
Business logs are generated in 2 places (in 2 different Elastic indexes):
The Tier 1 API Gateway for Human to Machine interactions;
The Tier 2 API Gateway for M2M interactions.
Business logs contain the following fields:
Timestamp - Date and time at which the log was created;
Origin - Reference to the end-user (Tier 1) or Simpl-Open agent (Tier 2) that initiates the HTTP call;
Destination - Reference to the end-user (Tier 1) or Simpl-Open agent (Tier 2) that is targeted by the HTTP call;
Business Operations - Reference to the operation that is triggered (List to be defined);
Message type - For both sync and async transactions, 4 types: request, request ACK, response, response ACK;
Correlation ID - ID that is automatically generated for the first request in a transaction and reused by the response and ACKs, so that messages belonging to the same transaction can be correlated.
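The way a correlation ID ties the messages of one transaction together can be illustrated with a minimal Python sketch. The key names follow the business log schema shown later in this section; the helper names (`new_correlation_id`, `business_log`, `log_sync_exchange`) and the use of a UUID as correlation ID are assumptions made for illustration, not the actual Simpl-Open implementation.

```python
import uuid
from datetime import datetime, timezone


def new_correlation_id() -> str:
    # Assumption: a random UUID is used as the correlation ID.
    return str(uuid.uuid4())


def business_log(origin, destination, operation, message_type, correlation_id):
    """Build one business log entry with the fields listed above."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "origin": origin,
        "destination": destination,
        "businessOperation": operation,
        "messageType": message_type,
        "correlationId": correlation_id,
    }


def log_sync_exchange(origin, destination, operation):
    """Return the two entries of a synchronous operation (request, response)."""
    correlation_id = new_correlation_id()  # generated once, for the first request
    return [
        business_log(origin, destination, operation, "REQUEST", correlation_id),
        # The response travels in the opposite direction and reuses the same ID.
        business_log(destination, origin, operation, "RESPONSE", correlation_id),
    ]
```

Filtering an index on `correlationId` then returns every message of a given transaction, whichever gateway logged it.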
Business operations reference list:
| Business Process | Step in BP | Business Operation Description | Business Operation | Agent - tier1/tier2 gateway | Documented | Integrated | Technical API call/response | Backing Service | Service owner |
|---|---|---|---|---|---|---|---|---|---|
| BP 03A | BP03A.01 | Submission of an onboarding request by a provider or consumer. | ONBOARDING_REQUEST | Governance Authority - tier1-gateway | DONE | DONE | POST /onboardingApi/v1/onboardingRequests | Onboarding | Onboarding Team |
| | BP03A.02 - Approved | Approval of an onboarding request by the Governance Authority. | APPROVE_ONBOARDING_REQUEST | Governance Authority - tier1-gateway | DONE | DONE | POST /onboardingApi/v1/onboardingRequests/{onboardingRequestId}/approve | Onboarding | Onboarding Team |
| | BP03A.02 - Rejected | Rejection of an onboarding request by the Governance Authority. | REJECT_ONBOARDING_REQUEST | Governance Authority - tier1-gateway | DONE | DONE | POST /onboardingApi/v1/onboardingRequests/{onboardingRequestId}/reject | Onboarding | Onboarding Team |
| | BP03A.09 | Confirmation of successful onboarding of a provider or consumer. | APPROVE_ONBOARDING_REQUEST | Governance Authority - tier1-gateway | DONE | DONE | POST /onboardingApi/v1/onboardingRequests/{onboardingRequestId}/approve | Onboarding | Onboarding Team |
| | BP03A.11 | Confirmation of failed onboarding of a provider or consumer. | REJECT_ONBOARDING_REQUEST | Governance Authority - tier1-gateway | DONE | DONE | POST /onboardingApi/v1/onboardingRequests/{onboardingRequestId}/reject | Onboarding | Onboarding Team |
| BP 05B | BP05B.07 | Submission of a resource description to the catalogue by a provider. | PUBLISH_CATALOG | Governance Authority - tier2-gateway | DONE | | POST /self-descriptions | sd-tooling | Catalogue & Connector Team |
| BP 06 | BP06.01 | Search in the catalogue. | ADVANCED_SEARCH, QUICK_SEARCH | Governance Authority - tier1-gateway | DONE | DONE | POST /xfsc-advsearch-be/v1/selfDescriptions/advanced; GET /xfsc-advsearch-be/v1/selfDescriptions | xfsc-advsearch-be | Catalogue & Connector Team |
| BP 07 | BP07.01 | Submission of a contract request by a consumer to a provider. | ISSUE_CONTRACT | | DONE | | POST /contract/v1/credentials/agreements/{contractAgreementId}/definitions/{contractDefinitionId} | Contract Consumption Service | Contract & Billing Team |
| | BP07.06, BP07.09 | Confirm signing of the Contract Agreement | CONTRACT_TERMINATE, CONTRACT_FINALIZE | | DONE | | POST /agreements/{contractAgreementId}/definitions/{contractDefinitionId}/status | Contract Manager Orchestrator | Contract & Billing Team |
| BP 08 | BP08.01 | Submission of an infrastructure resource request by a consumer. | | | | | POST /transfer/start | Contract Consumption Service | Contract & Billing Team |
| | BP08.02 | Completion of an infrastructure resource deployment by a provider. | TRIGGER_REQUEST | | | | POST /scripts/trigger | Contract Consumption Service | Contract & Billing Team |
| BP 09A | BP09A.01 | Submission of a request to transfer a data resource. | REQUEST_DATA_RESOURCE | | | | POST /transfer/start | Contract Consumption Service | Contract & Billing Team |
| | BP09A.02 | Completion of a data resource transfer by a provider. | TRANSFER_DATA_RESOURCE | | | | POST /transfer/status/{id} | Contract Consumption Service | Contract & Billing Team |
| BP 09B | BP09B.01 | Submission of a request to load data/application on a provider infrastructure by a consumer. | REQUEST_DATA_APPLICATION_RESOURCE | | | | POST /transfer/start | Contract Consumption Service | Contract & Billing Team |
| | BP09B.04 | Confirmation of a data/application resource deployment. | CONFIRM_DATA_APPLICATION_RESOURCE_DEPLOYMENT | | | | POST /transfer/status/{id} | Contract Consumption Service | Contract & Billing Team |
Next to this predefined list of business operations, Simpl logs all incoming and outgoing requests between agents.
Technical implementation
Routes and ABAC/RBAC rules are loaded in the API Gateways through YAML files.
A separate YAML configuration file that maps routes and specific parameters (e.g. an HTTP 200 response code) to business operations will be created. Currently, only static configuration is supported; support for hot configuration changes is planned for a future release.
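As an illustration of such a mapping, the sketch below resolves an HTTP call to a business operation, assuming the configuration associates a method, a path template and an expected response code with an operation name. The rule structure, the `ROUTE_RULES` name and the `resolve_operation` helper are hypothetical; the paths and operation names are taken from the reference list above, and the rules are shown as a Python list rather than parsed from YAML to keep the example self-contained.

```python
import re

# Hypothetical in-memory equivalent of the route-mapping configuration.
ROUTE_RULES = [
    {"method": "POST", "status": 200,
     "path": "/onboardingApi/v1/onboardingRequests",
     "operation": "ONBOARDING_REQUEST"},
    {"method": "POST", "status": 200,
     "path": "/onboardingApi/v1/onboardingRequests/{onboardingRequestId}/approve",
     "operation": "APPROVE_ONBOARDING_REQUEST"},
    {"method": "POST", "status": 200,
     "path": "/onboardingApi/v1/onboardingRequests/{onboardingRequestId}/reject",
     "operation": "REJECT_ONBOARDING_REQUEST"},
]


def _template_to_regex(template):
    """Turn '/a/{id}/b' into a regex in which each {placeholder} matches one path segment."""
    literal_parts = re.split(r"\{[^/{}]+\}", template)
    return re.compile("[^/]+".join(re.escape(part) for part in literal_parts))


def resolve_operation(method, path, status):
    """Return the business operation for a call, or None if no rule applies."""
    for rule in ROUTE_RULES:
        if (rule["method"] == method
                and rule["status"] == status
                and _template_to_regex(rule["path"]).fullmatch(path)):
            return rule["operation"]
    return None
```

With a static configuration, rules of this kind would be loaded once at gateway start-up; hot configuration changes would require reloading them at runtime.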
Consumption of a resource (infrastructure/data/application) is logged and monitored for two main use cases:
Policy enforcement;
Billing.
The following (sub-)processes are considered:
Data Consumption
Direct access to the dataset (BP 09A);
Data is accessible from an infrastructure tenant (BP TBD - possibly extension of 09A);
Data is accessible through a built-in application deployed on the infrastructure tenant (BP 09B);
Infrastructure Consumption (BP 08).
For each of these scenarios, the Data Usage and Infrastructure Usage sections below describe the applicable types of usage policy (which also drive billing) and how consumption can be monitored for each of them.
Direct access to the dataset
In this scenario, the data is shared directly between the provider and the consumer (outside of Simpl-Open) and as such no usage policy can be enforced (only “legal enforcement” possible). It corresponds to the “allow usage of data” and “use data and delete afterwards” policies.
This also implies that billing always happens as a one-time payment, ahead of consumption (possible extension to BP 07).
There is thus nothing that the Simpl-Open agent can log or monitor during consumption.
Data is accessible from an infrastructure tenant
In this scenario, the data provider shares the data on an infrastructure tenant provisioned by an infrastructure provider.
2 types of usage policies are considered, which can be technically enforced and billed:
Based on the number of usages (e.g. access the data 3 times)
Based on the duration (e.g. access the data for 7 days)
In both cases, policy enforcement and billing can be performed based on the logs from the storage.
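A minimal Python sketch of how both policy types could be evaluated from access timestamps extracted from the storage logs is given below. The `UsagePolicy` type and the `is_access_allowed` function are hypothetical, and the sketch assumes that the duration starts at the first recorded access; the actual reference point (e.g. contract start) is not defined in this document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class UsagePolicy:
    max_accesses: Optional[int] = None         # e.g. 3 for "access the data 3 times"
    max_duration: Optional[timedelta] = None   # e.g. 7 days


def is_access_allowed(policy: UsagePolicy, access_times: List[datetime], now: datetime) -> bool:
    """Decide whether a new access is allowed, given the accesses already logged."""
    if policy.max_accesses is not None and len(access_times) >= policy.max_accesses:
        return False
    if policy.max_duration is not None and access_times:
        # Assumption: the allowed period starts at the first logged access.
        if now - min(access_times) > policy.max_duration:
            return False
    return True
```

The same list of access timestamps can feed billing, for instance by counting entries for usage-based pricing.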
Architecture assumptions:
It is assumed that VMs and containers always have an attached storage;
It is assumed that Simpl-Open natively supports only S3-compliant storage but is extensible to support other storage types (offering an API).
The logs (e.g. storage, bandwidth) are collected over HTTP through the S3 logging API (Object Storage: Standardising on the S3 API - Architecting IT).
The exact list of logs that will be collected by Simpl-Open and the mechanism to collect these logs are still to be defined based on what is offered by the S3 logging API.
Data is accessible through a built-in application deployed on the infrastructure tenant
In this scenario, the data provider gives the consumer access to an application that offers restricted viewing (such as read only) or processing capabilities over the data resource. Only Scenario 1 is considered (a stand-alone application will be deployed on a dedicated infrastructure resource per consumer).
1 type of usage policy is considered, which can be technically enforced and billed:
Architecture assumptions:
It is assumed that the application is always deployed and terminated together with the infrastructure resource as part of the deployment script;
It is assumed that Simpl-Open natively supports only applications deployed on Kubernetes but is extensible to support other platforms (offering an API).
In this case, monitoring the status of the underlying infrastructure resource is sufficient.
To do so, the following 2 options exist:
Collecting log files from the infra resource;
Collecting logs from the infrastructure provider API.
The first option could be more restrictive as it requires access to the infrastructure resource itself.
Simpl-Open therefore implements option 2 and collects logs through the kube-api exposed by the infrastructure provider.
2 types of usage policies are considered, which can be technically enforced and billed:
Based on duration (e.g. access to a VM for 7 days);
Based on resource utilisation (e.g. CPU, RAM, storage, bandwidth).
In the first case, monitoring the status of the infrastructure resource is sufficient and in the second case, it requires access to infrastructure metrics of the resource.
Architecture assumptions:
It is assumed that Simpl-Open natively supports only:
S3-compliant storage
Kubernetes container platform
VMware virtual machines
but is extensible to support other platforms (offering an API).
Both the status of the resource and infrastructure metrics can be collected through the infrastructure provider APIs:
S3 API for storage;
kube-api for containers;
VMware API for VMs.
The exact list of logs/metrics that will be collected by Simpl-Open and the mechanism to collect these logs are still to be defined based on what is offered by the APIs.
This section is only a placeholder for capabilities falling outside the scope of the current release and will be completed at a later time.
The existing LogWrapper (Log4J wrapper) was designed at an early stage of project development. As the project progresses, there is increasing demand to add fields to the wrapper in order to build better dashboards for end users.
Nested document for HTTP
| Log Type | Old Schema | New Schema | Comments |
|---|---|---|---|
| Infrastructure | { "timestamp": "2024-08-20T06:20:12.201Z", "level": "INFO", "message": "Application started", "thread": "main", "logger": "eu.simpl.simpl_billing.SimplBillingApplication" } | { "timestamp": "2024-08-20T06:20:12.201Z", "level": "INFO", "message": "Application started", "http": { "method": "GET", "action": "full URL" }, "thread": "main", "customFields": { "<key1>": "value1", "...": "..." } } | (optional) httpMethod - GET/POST/PUT/DELETE/OPTIONS; (optional) httpAction - full URL of the request; customFields - placeholder map in which any application-specific field can be placed. |
| Business | { "timestamp": "2024-08-12T12:43:18.437+0200", "level": "BUSINESS", "message": { "msg": "Network", "messageType": "RESPONSE", "businessOperation": "[operation1, operation2]", "origin": "origin_name", "httpStatus": "200", "destination": "destination_name", "correlationId": "correlation_id", "user": "user_name" }, "thread": "main", "httpRequestSize": "null", "httpExecutionTime": "null" } | { "timestamp": "2024-08-12T12:43:18.437+0200", "level": "BUSINESS", "message": { "msg": "Network", "messageType": "RESPONSE", "businessOperation": "operation name", "origin": "origin_name", "destination": "destination_name", "correlationId": "correlation_id", "user": "user_name", "userIp": "userIp", "customFields": { "<key1>": "value1", "...": "..." } }, "thread": "main", "http": { "method": "GET", "action": "full URL", "status": "200", "requestSize": "null", "executionTime": "null", "responseSize": "null" } } | httpMethod - GET/POST/PUT/DELETE/OPTIONS; httpAction - full URL of the request; customFields - placeholder map in which any application-specific field can be placed. |
Backward compatibility
A new version of the Java package will be published, which will still be able to produce V1 logs. All teams can adopt the new version of the library at their own pace. Backward compatibility is to be handled at Logstash level.
A Python equivalent will be produced for the Python-based applications of the Data2 team.
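To make the difference between the two business log schemas concrete, the sketch below moves the HTTP-related fields of an old-schema entry into the nested `http` document of the new schema. It is an illustration of the field mapping only: the function name `upgrade_business_log` is hypothetical, and the document states that backward compatibility is handled in Logstash, not in application code. Fields that do not exist in the old schema (`method`, `action`, `responseSize`, `userIp`) are left out rather than invented.

```python
import copy


def upgrade_business_log(v1: dict) -> dict:
    """Return a new-schema copy of an old-schema BUSINESS log entry."""
    v2 = copy.deepcopy(v1)          # never mutate the caller's entry
    message = v2.get("message")
    if not isinstance(message, dict):
        return v2                   # not a business log (e.g. plain-text message)
    v2["http"] = {
        "status": message.pop("httpStatus", None),
        "requestSize": v2.pop("httpRequestSize", None),
        "executionTime": v2.pop("httpExecutionTime", None),
    }
    # customFields is the placeholder map for application-specific values.
    message.setdefault("customFields", {})
    return v2
```

The `businessOperation` value is copied unchanged here; whether an old list-style value needs to be split into single operation names is not specified in this document.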
Links
Requirements: SIMPL-2949
epic: SIMPL-4114
Onboarding request for custom fields: SIMPL-18664
code: https://code.europa.eu/simpl/simpl-open/development/contract-billing/common_logging
This document specifies the high-level, service-oriented architecture for the core governance services within the SIMPL framework, with a primary focus on Schema Lifecycle Management. The design is guided by the principles of Separation of Concerns, Decoupling, and Interoperability.
The architecture is defined by the Schema Management Service (SMS), which is the definitive source of truth and lifecycle manager for all schemas and vocabularies. The SMS provides the tools for a Governance Authority to manage, version, and control the status of schemas.
Downstream services, such as a Catalogue Service, act as consumers of these schemas. The interaction between the SMS and its consumers is event-driven. This ensures that consuming services are decoupled, resilient, and performant, as they do not need to query the SMS in real time to perform their functions (e.g., validating resource descriptions).
At its core, the architecture defines two key services:
Schema Management Service (SMS) : The exclusive, authoritative system for managing the entire lifecycle of schemas, from creation and versioning to publication and revocation.
Catalogue Service (Example Consumer) : A consuming service that validates and stores Resource Descriptions. It subscribes to events from the SMS to maintain a local, synchronised registry of published schemas.
The system operates on a clear separation between the management of schemas and their consumption. Management is a direct interaction with the SMS, while consumption is driven by events that the SMS produces.
A. Core Flow: Schema Lifecycle Management (Governance Perspective)
Schema Version Creation : A Governance Administrator creates a new version of a schema by submitting a SHACL file and its associated metadata (e.g., version number, changelog) to the SMS Management API. The SMS validates and stores the new version.
Schema Publication : To make an entire schema family available for use, the Administrator uses the SMS Management API to change the status of the Schema Concept to PUBLISHED.
Event Notification : Upon successfully changing the status, the SMS:
Updates its internal database to reflect the new status.
Publishes a SchemaPublished event. This event contains the schema’s metadata, its new status, and the content of its versions.
Schema Revocation : If a schema family is no longer approved for use, the Administrator changes its status to REVOKED via the API. This triggers a SchemaRevoked event, preventing new data from being validated against any version of this schema.
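The governance flow above can be sketched in a few lines of Python. This is a simplified in-memory model under stated assumptions, not the SMS implementation: the class and method names are hypothetical, the SHACL validation step is omitted, the event dictionaries are reduced to a few keys, and a plain list stands in for the event publisher.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SchemaConcept:
    name: str
    status: str = "PUBLISHED"  # new concepts start as PUBLISHED (see System-Populated Metadata)
    versions: Dict[str, str] = field(default_factory=dict)  # version -> SHACL content


class SchemaManagementService:
    def __init__(self) -> None:
        self.concepts: Dict[str, SchemaConcept] = {}
        self.events: List[dict] = []  # stands in for the event publisher

    def create_version(self, name: str, version: str, shacl: str) -> None:
        concept = self.concepts.setdefault(name, SchemaConcept(name))
        if version in concept.versions:
            raise ValueError(f"Version '{version}' for schema '{name}' already exists.")
        concept.versions[version] = shacl

    def set_status(self, name: str, status: str) -> None:
        if status not in ("PUBLISHED", "REVOKED"):
            raise ValueError(f"Unsupported status: {status}")
        concept = self.concepts[name]
        concept.status = status  # 1. update the internal state
        event_type = "SchemaPublished" if status == "PUBLISHED" else "SchemaRevoked"
        self.events.append({     # 2. publish the corresponding event
            "eventType": event_type,
            "schema": name,
            "status": status,
            "versions": dict(concept.versions),
        })
```

The status lives on the concept, so a single status change affects every version of the schema family.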
B. Example Use Case: Event-Driven Validation (Consumer’s Perspective)
This flow describes how a consuming service, like a Catalogue, leverages the event-driven model.
Subscription & Caching : The Catalogue Service subscribes to events from the SMS. This is typically implemented via a secure webhook where the SMS calls a private endpoint on the Catalogue (e.g., POST /internal/events/schema-published).
Local Registry Update : When the Catalogue receives a SchemaPublished or SchemaRevoked event, it processes the payload and updates its own local, optimized registry of published schemas. The Catalogue is now self-sufficient for validation.
Submission for Publication : A Provider submits a Resource Description (RD) to the Catalogue Service. The RD references a specific schema version.
Local and Fast Validation : The Catalogue Service performs all validation against its local registry:
It checks that the schema family is present in its registry and has an active (PUBLISHED) status.
It uses its local copy of the schema version to perform SHACL validation on the RD.
Crucially, there are no real-time API calls from the Catalogue to the SMS during the validation process.
Processing : If validation is successful, the Catalogue persists the RD. If it fails (either because the RD is invalid or the schema is not in its local published registry), it returns an error.
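The consumer-side flow can be sketched as follows. The `CatalogueRegistry` class, its method names and the simplified event shape (`schema`, `status`, `versions`) are hypothetical, and the SHACL engine is injected as a plain callable so that the example stays self-contained; a real implementation would plug in an actual SHACL validator.

```python
from typing import Callable, Dict, Tuple


class CatalogueRegistry:
    """Local, event-fed registry used to validate Resource Descriptions (RDs)."""

    def __init__(self, shacl_validator: Callable[[str, str], bool]) -> None:
        self._schemas: Dict[str, dict] = {}
        self._validate = shacl_validator  # (rd, shapes) -> True if the RD conforms

    def on_event(self, event: dict) -> None:
        # SchemaPublished and SchemaRevoked both overwrite the local entry.
        self._schemas[event["schema"]] = {
            "status": event["status"],
            "versions": dict(event["versions"]),
        }

    def submit(self, rd: str, schema: str, version: str) -> Tuple[bool, str]:
        entry = self._schemas.get(schema)
        if entry is None or entry["status"] != "PUBLISHED":
            return False, "schema is not published in the local registry"
        shapes = entry["versions"].get(version)
        if shapes is None:
            return False, "unknown schema version"
        if not self._validate(rd, shapes):
            return False, "resource description failed SHACL validation"
        return True, "accepted"
```

No call to the SMS appears in `submit`: every decision is taken from the state built up by `on_event`, which is the decoupling this design aims for.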
Schema Management Service (SMS)
Objective : To be the single, authoritative source of truth and lifecycle manager for all governance-related schemas and vocabularies.
Key Responsibilities :
Providing a secure API for creating and versioning schemas and vocabularies.
Providing an exclusive interface for Governance Administrators to manage the lifecycle status (PUBLISHED/REVOKED) of schema concepts (families).
Publishing events (SchemaPublished, SchemaRevoked) to notify subscribed services of lifecycle changes.
Ensuring the integrity and validity of the schemas it manages.
Making schema content publicly discoverable and retrievable via stable, referenceable URIs for ad-hoc discovery or bootstrapping new subscribers.
Interfaces :
Management API : A private, authenticated RESTful interface for all management tasks. It is the sole entry point for creating versions and changing the lifecycle status of schema concepts.
Resolver Interface : A public, read-only interface that serves the raw RDF content of schemas and vocabularies.
Event Publisher : An internal component that pushes notifications to the registered webhooks of subscribing services.
Catalogue Service (Example Consumer)
Objective : To act as a repository for validated Resource Descriptions, ensuring data quality by enforcing conformance to published schemas.
Key Responsibilities in this Context :
Exposing an endpoint for receiving RD submissions.
Maintaining a local, cached registry of published schemas, synchronized via events from the SMS.
Exposing a private webhook endpoint for the SMS to push SchemaPublished and SchemaRevoked events.
Performing all internal validation of submitted RDs against its local schema registry.
Persisting RDs that successfully pass validation.
Technology : The reference implementation uses Apache Jena Fuseki with a TDB2 backend. This provides a performant, standards-compliant RDF triple store with support for SPARQL queries and updates.
Guiding Principles :
Data Segregation : The data is partitioned into logical datasets to enforce access control and simplify data management tasks like backup and indexing.
Immutability : Published assets (schemas, vocabularies, RDs) are treated as immutable. Changes are handled by creating new versions, not by updating existing ones.
Rich Metadata : Each asset is described with a comprehensive set of metadata properties to support discovery, administration, and provenance tracking.
The data is partitioned across five distinct datasets.
ds_schemas
Purpose : Contains only the raw SHACL content of schemas.
Structure : Each schema version is stored in its own named graph. The URI of the named graph is identical to the schema version’s public, dereferenceable URI.
Management : Managed exclusively by the Schema Management Service (SMS).
ds_schema_metadata
Purpose : Contains only the administrative metadata about schemas (e.g., titles, versions, status, changelogs).
Structure : All schema metadata triples are stored in a single, default graph or a dedicated named graph.
Management : Managed exclusively by the SMS.
ds_vocabularies & ds_vocabulary_metadata
Note : These dataset descriptions would follow the same status model as schemas, where the status is on the concept, not the version.
ds_vocabularies Purpose : Contains only the raw RDF content of vocabularies (e.g., SKOS thesauri).
ds_vocabulary_metadata Purpose : Contains only the administrative metadata about vocabularies.
ds_resource_descriptions
OUT OF SCOPE : This is out of scope for schema management but offers an insight into how schemas could be used by a downstream service like a catalogue. The Catalogue Service would manage this dataset to store validated Resource Descriptions.
Rationale for Granular Segregation
This five-dataset approach prioritises security and clarity of ownership over query performance.
Benefit : It provides the highest level of data isolation. Access control policies can be applied at the dataset level, ensuring, for example, that the Catalogue Service has absolutely no ability to modify schema content or metadata.
Trade-off : It introduces complexity for queries that need to join data across these datasets (e.g., the SMS’s internal validation checking a schema’s use of a vocabulary). Such operations require the service logic to perform multiple queries to different datasets and join the data at the application level, or rely on SPARQL’s SERVICE clause for federated queries, which can have performance implications.
The following specifies the properties used to describe schemas and vocabularies. Prefixes dct, owl, and simpl refer to Dublin Core, OWL, and a custom SIMPL namespace, respectively.
Schema Metadata
Schema Concept (the “family”) :
a simpl:Schema: Declares the resource as a schema concept.
dct:title: A concise, human-readable title.
dct:description: A detailed description of what this schema is used to describe.
simpl:resourceType: A literal classifying the schema’s target. Values: “data”, “infrastructure”, “application”.
simpl:status: A literal with a value of “PUBLISHED” or “REVOKED”. This controls whether any version of the schema can be used for new resource validation.
simpl:latestVersion: An object property pointing to the URI of the most recent version of this schema concept.
Schema Version :
a simpl:SchemaVersion: Declares the resource as a specific version.
dct:isPartOf: Points back to the parent simpl:Schema concept URI.
dct:creator: The URI or identifier for the user who submitted this version.
dct:created: The xsd:dateTime of the submission.
owl:versionInfo: The semantic version string (e.g., “1.0.0”, “1.1.2”).
simpl:changelog: A literal containing a description of changes in this version.
Metadata Field Constraints
To ensure consistency and validity, the following constraints apply to the metadata fields submitted via the API. The schemaName corresponds to the name field submitted during creation, which forms the unique identifier in the API path.
| Field | API field | Type | Constraints | Example | Comments |
|---|---|---|---|---|---|
| Schema Name | name | String | Required. Must be PascalCase. Alphanumeric only. Min 3, Max 64 chars. No spaces or special characters. | ApplicationAsset | |
| Title | title | String | Required. Plain text. Min 10, Max 255 chars. | "Application Asset Schema" | |
| Description | description | String | Required. Plain text. Min 20, Max 2048 chars. | "A schema for describing a software application..." | |
| Resource Type | resourceType | String | Required. Alphanumeric only. Min 3, Max 64 chars. No spaces or special characters. | application | |
| Status | status | String | Required on PATCH. Must be one of: PUBLISHED, REVOKED. | PUBLISHED | Not required on first schema creation, where the status is always PUBLISHED. Will be handled in the story in which a schema is published. |
| Version | version | String | Required. Must follow Semantic Versioning (SemVer) format (X.Y.Z). | "1.2.0" | |
| Changelog | changelog | String | Required. Plain text. Max 1024 chars. | "Added operationalStatus property." | Not required on first schema creation; it is added on version creation. |
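The constraints in the table above translate directly into input validation. The sketch below is one possible way to check the metadata of a schema creation request; the function names are hypothetical, and PascalCase is approximated as "starts with an upper-case letter, alphanumeric only", which is an assumption about how strictly the rule is meant.

```python
import re

NAME_RE = re.compile(r"[A-Z][A-Za-z0-9]{2,63}")    # PascalCase (approximation), 3 to 64 chars
RESOURCE_TYPE_RE = re.compile(r"[A-Za-z0-9]{3,64}")
SEMVER_RE = re.compile(r"\d+\.\d+\.\d+")           # X.Y.Z


def validate_schema_metadata(meta: dict) -> list:
    """Return the list of invalid field names for a schema creation request."""
    errors = []
    if not NAME_RE.fullmatch(meta.get("name", "")):
        errors.append("name")
    if not 10 <= len(meta.get("title", "")) <= 255:
        errors.append("title")
    if not 20 <= len(meta.get("description", "")) <= 2048:
        errors.append("description")
    if not RESOURCE_TYPE_RE.fullmatch(meta.get("resourceType", "")):
        errors.append("resourceType")
    return errors


def is_valid_version(version: str) -> bool:
    """Check the X.Y.Z format required for the version field."""
    return SEMVER_RE.fullmatch(version) is not None
```

An empty list means the request metadata satisfies the constraints; a non-empty list would typically be reported back in a 400 Bad Request response.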
System-Populated Metadata
The following metadata fields are automatically generated by the system and cannot be provided by the user.
| Field | Property | Type | Populated | Description |
|---|---|---|---|---|
| Initial Version | owl:versionInfo | String | On Schema Creation | The system automatically sets the initial version to "1.0.0". |
| Initial Status | simpl:status | String | On Schema Creation | The system sets the initial status of a new schema concept to "PUBLISHED" by default. A "PUBLISH" event needs to be triggered. |
| Created Timestamp | dct:created | xsd:dateTime | On Version Creation | A timestamp conforming to xsd:dateTime. |
| Creator | dct:creator | String | On Version Creation | An identifier of the authenticated user who created the version. |
| Parent Link | dct:isPartOf | xsd:anyURI | On Version Creation | An automatic link back to the parent simpl:Schema concept URI. |
| Latest Version Ptr | simpl:latestVersion | xsd:anyURI | On Version Creation | The parent schema concept is updated to point to the URI of the new version. |
This section specifies the RESTful API for the Schema Management Service (SMS).
Endpoint Base URL : https://api.simpl.space/api/v1
Authorization : API endpoints enforce Role-Based Access Control (RBAC).
governance-admin : Required for write operations (POST, PATCH).
governance-viewer or provider : Required for read operations (GET).
Errors are returned using a structured JSON body compliant with RFC 7807 (Problem Details for HTTP APIs).
Example Error Response:
{
  "type": "https://api.simpl.space/errors/conflict",
  "title": "Conflict",
  "status": 409,
  "detail": "Version '1.0.0' for schema 'DataAsset' already exists.",
  "instance": "/api/v1/schemas/DataAsset/versions"
}
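A small helper can keep error bodies consistent across endpoints. The sketch below builds the five members used in the example above; the function names are hypothetical, while `application/problem+json` is the media type defined by RFC 7807 for such responses.

```python
import json

PROBLEM_CONTENT_TYPE = "application/problem+json"  # media type defined by RFC 7807


def problem_details(type_uri, title, status, detail, instance):
    """Build an RFC 7807 problem details body as a plain dictionary."""
    return {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
    }


def version_conflict(schema_name, version):
    """Reproduce the 409 example above for a duplicate schema version."""
    return problem_details(
        "https://api.simpl.space/errors/conflict",
        "Conflict",
        409,
        f"Version '{version}' for schema '{schema_name}' already exists.",
        f"/api/v1/schemas/{schema_name}/versions",
    )


if __name__ == "__main__":
    print(json.dumps(version_conflict("DataAsset", "1.0.0"), indent=2))
```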
List & Query Vocabularies
Endpoint : GET /vocabularies
Description : Retrieves a paginated list of vocabulary concepts.
Query Parameters :
Success Response (200 OK) : A paginated list of vocabulary concepts.
Get Vocabulary Concept
Endpoint : GET /vocabularies/{vocabName}
Description : Retrieves the metadata for a single vocabulary concept, including its status and a list of all its available versions.
Success Response (200 OK) : A JSON object with the concept’s metadata.
Revoke or Activate Vocabulary Concept
Endpoint : PATCH /vocabularies/{vocabName}
Description : Changes the status of a vocabulary concept to ACTIVE or REVOKED.
Authorization : governance-admin
Request Body : {"status": "REVOKED"} or {"status": "ACTIVE"}
Responses :
200 OK: Returns the full, updated resource representation of the vocabulary concept.
404 Not Found.
Create Vocabulary Version
Endpoint : POST /vocabularies/{vocabName}/versions
Description : Submits a new version of an existing vocabulary concept.
Authorization : governance-admin
Request Content-Type : multipart/form-data
Parts :
metadata: JSON object. {"version": "1.1", "changelog": "Added new status."}
file: The vocabulary content as a .ttl file.
Responses :
201 Created: The Location header is set, and the response body contains the new version’s resource representation.
400 Bad Request, 409 Conflict.
Get Vocabulary Version Metadata
Endpoint : GET /vocabularies/{vocabName}/versions/{version}
Description : Retrieves the metadata for a single vocabulary version.
Responses :
200 OK: Returns the full metadata of the requested version.
404 Not Found.
Create Schema
Endpoint : POST /schemas
Description : Creates a new schema concept. The system automatically creates its initial version (v1.0.0) from the submitted file.
Authorization : governance-admin
Request Content-Type : multipart/form-data
Parts :
metadata: JSON object containing the concept’s properties. {"name": "DataAsset", "title": "Data Asset", "description": "Schema for describing a dataset.", "resourceType": "data"}
file: The schema content as a .ttl file.
Responses :
201 Created: The Location header is set to the new schema concept’s URI. The response body contains the new concept’s resource representation, including the auto-created first version.
400 Bad Request: The file part contains invalid SHACL or the metadata is malformed.
409 Conflict: A schema with the provided name already exists.
List & Query Schemas
Endpoint : GET /schemas
Description : Retrieves a paginated list of schema concepts. This is the primary discovery endpoint for any schema consumer (e.g., Providers, tools, administrators).
Query Parameters :
resourceType (string): Filter by type (data, infrastructure, application).
status (string, default: PUBLISHED): Filter concepts by status (PUBLISHED, REVOKED).
Success Response (200 OK) : A paginated list of schema concepts.
Get Schema Concept
Endpoint : GET /schemas/{schemaName}
Description : Retrieves the metadata for a single schema concept, including its status and a list of all its available versions.
Success Response (200 OK) : A JSON object with the concept’s metadata.
Revoke or Publish Schema Concept
Endpoint : PATCH /schemas/{schemaName}
Description : Changes the status of a schema concept to PUBLISHED or REVOKED. This controls whether the schema family is active for use.
Authorization : governance-admin
Request Body : {"status": "REVOKED"} or {"status": "PUBLISHED"}
Responses :
200 OK: Returns the full, updated resource representation of the schema concept.
404 Not Found.
Create Schema Version
Endpoint : POST /schemas/{schemaName}/versions
Description : Submits a new version of an existing schema concept. The server performs all internal validation. A new version does not have its own status.
Authorization : governance-admin
Request Content-Type : multipart/form-data
Parts :
metadata: JSON object. {"version": "1.1.0", "changelog": "Added operationalStatus property."}. Note that resourceType is part of the concept and is not submitted here.
file: The schema content as a .ttl file.
Responses :
201 Created: The Location header is set, and the response body contains the new version’s resource representation.
400 Bad Request (e.g., for validation failure).
Get All Versions for a Schema
Endpoint : GET /schemas/{schemaName}/versions
Description : Retrieves a list of all available versions for a single schema concept.
Authorization : governance-viewer or higher.
Success Response (200 OK) :
Error Response :
Get Schema Version Metadata
Endpoint : GET /schemas/{schemaName}/versions/{version}
Description : Retrieves the metadata for a single schema version.
Responses :
200 OK: Returns the full resource metadata of the requested version.
404 Not Found.
To support a decoupled, event-driven architecture, the SMS provides a mechanism for consuming services to subscribe to lifecycle events via webhooks. When a registered event occurs (e.g., a schema is published), the SMS will send a POST request to the subscriber’s registered URL with a detailed event payload.
Webhook Management
Endpoint : POST /webhooks
Description : Creates a new subscription to receive event notifications.
Request Body :
{
  "targetUrl": "https://catalogue.example.com/internal/events/simpl-sms",
  "events": ["SchemaPublished", "SchemaRevoked"]
}
Responses :
201 Created: The webhook subscription was created successfully.
{
  "webhookId": "",
  "targetUrl": "https://catalogue.example.com/internal/events/simpl-sms",
  "events": []
}
Endpoint : GET /webhooks
Endpoint : DELETE /webhooks/{webhookId}
Event Payload Structure
When an event is triggered, the SMS will send a POST request to the targetUrl.
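On the subscriber side, the body of such a POST request could be applied to a local registry as in the following minimal sketch. It assumes the payload carries eventType together with data.schema and data.version objects, as in the example payloads in this section; the registry layout and the function name apply_event are hypothetical, and signature verification and HTTP handling are left out.

```python
from typing import Any, Dict

# Hypothetical local registry: schema URI -> {"status": ..., "versions": {version: uri}}
Registry = Dict[str, Dict[str, Any]]

SUPPORTED_EVENTS = {"SchemaPublished", "SchemaRevoked"}


def apply_event(registry: Registry, event: Dict[str, Any]) -> None:
    """Update the local registry from one decoded webhook event payload."""
    event_type = event["eventType"]
    if event_type not in SUPPORTED_EVENTS:
        raise ValueError(f"unsupported event type: {event_type}")
    schema = event["data"]["schema"]
    version = event["data"]["version"]
    entry = registry.setdefault(schema["uri"], {"status": None, "versions": {}})
    if event_type == "SchemaPublished":
        entry["status"] = "PUBLISHED"
        entry["versions"][version["version"]] = version["uri"]
    else:  # SchemaRevoked: the schema family is no longer active for use
        entry["status"] = "REVOKED"


registry: Registry = {}
apply_event(registry, {
    "eventId": "evt_123456789",
    "eventType": "SchemaPublished",
    "data": {
        "schema": {"uri": "https://api.simpl.space/schemas/ApplicationAsset"},
        "version": {"uri": "https://api.simpl.space/schemas/ApplicationAsset/1.0.0",
                    "version": "1.0.0"},
    },
})
```

The sketch derives the new status from eventType rather than from data.schema.status, so the registry stays consistent even if the two fields disagree.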
Example: SchemaPublished Event Payload
This payload is self-contained, providing the consumer with all the information needed to add the schema to its local registry without further API calls.
Schema Published
{
  "eventId": "evt_123456789",
  "eventType": "SchemaPublished",
  "timestamp": "2023-10-27T10:00:00Z",
  "data": {
    "schema": {
      "uri": "https://api.simpl.space/schemas/ApplicationAsset",
      "title": "Application Asset",
      "description": "Schema for describing a software application.",
      "resourceType": "application",
      "status": "PUBLISHED"
    },
    "version": {
      "uri": "https://api.simpl.space/schemas/ApplicationAsset/1.0.0",
      "version": "1.0.0",
      "changelog": "Initial version.",
      "created": "2023-01-15T09:30:00Z"
    }
  }
}
Schema Revoked
{
  "eventId": "evt_123456789",
  "eventType": "SchemaRevoked",
  "timestamp": "2023-10-27T10:00:00Z",
  "data": {
    "schema": {
      "uri": "https://api.simpl.space/schemas/ApplicationAsset",
      "title": "Application Asset",
      "description": "Schema for describing a software application.",
      "resourceType": "application",
      "status": "REVOKED"
    },
    "version": {
      "uri": "https://api.simpl.space/schemas/ApplicationAsset/1.0.0",
      "version": "1.0.5",
      "changelog": "updated field x",
      "created": "2023-01-15T09:30:00Z"
    }
  }
}
This section provides concrete examples of a schema and a corresponding resource description. It illustrates how a schema uses terms from the core simpl vocabulary and walks through the end-to-end validation use case based on the defined event-driven architecture.
Core simpl Vocabulary Excerpt
This excerpt defines the operationalStatus property and the Active, Inactive, and Decommissioned concepts. These terms would be part of the foundational simpl vocabulary managed by the Governance Authority.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix simpl: <https://api.simpl.space/meta#> .
# Definition of the property itself
simpl:operationalStatus a rdf:Property, owl:ObjectProperty ;
    rdfs:label "Operational Status"@en ;
    rdfs:comment "Describes the current operational state of an asset."@en .
# Definition of allowed values (as individual concepts)
simpl:Active a owl:NamedIndividual ;
    rdfs:label "Active"@en .
simpl:Inactive a owl:NamedIndividual ;
    rdfs:label "Inactive"@en .
simpl:Decommissioned a owl:NamedIndividual ;
    rdfs:label "Decommissioned"@en .
Example Schema: ApplicationAsset
This schema describes a software application. It uses standard properties (from dct:) and properties defined within the simpl: vocabulary. The simpl:operationalStatus property is constrained by a sh:in list, which references concepts from the core vocabulary.
File Submitted by User : application-asset-v1.2.0.ttl
Submission Metadata (JSON part) : {"version": "1.2.0", "changelog": "Added operational status."}
Content of application-asset-v1.2.0.ttl :
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix simpl: <https://api.simpl.space/meta#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<https://api.simpl.space/schemas/ApplicationAsset/1.2.0>
    a sh:NodeShape ;
    sh:targetClass simpl:ApplicationResource ;
    sh:property [
        sh:path dct:title ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path simpl:owner ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
    ] ;
    sh:property [
        sh:path simpl:operationalStatus ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        # Constrains the value to be one of the concepts from the core simpl vocabulary
        sh:in ( simpl:Active simpl:Inactive simpl:Decommissioned ) ;
    ] .
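To make the effect of the three property constraints concrete, the following sketch re-expresses the cardinality and sh:in rules of this shape in plain Python. It is not a SHACL engine and ignores sh:datatype; the PropertyConstraint class, the dictionary-based resource representation and the function name check_resource are hypothetical simplifications for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Sequence


@dataclass(frozen=True)
class PropertyConstraint:
    path: str
    min_count: int = 0
    max_count: Optional[int] = None           # None means unbounded
    allowed: Optional[FrozenSet[str]] = None  # mirrors sh:in when set


# Hand-written mirror of the ApplicationAsset 1.2.0 shape shown above.
APPLICATION_ASSET_V1_2_0 = [
    PropertyConstraint("dct:title", min_count=1, max_count=1),
    PropertyConstraint("simpl:owner", min_count=1),
    PropertyConstraint(
        "simpl:operationalStatus", min_count=1, max_count=1,
        allowed=frozenset({"simpl:Active", "simpl:Inactive", "simpl:Decommissioned"}),
    ),
]


def check_resource(resource: Dict[str, List[str]],
                   constraints: Sequence[PropertyConstraint]) -> List[str]:
    """Return human-readable violations; an empty list means conformance."""
    violations = []
    for c in constraints:
        values = resource.get(c.path, [])
        if len(values) < c.min_count:
            violations.append(f"{c.path}: fewer than {c.min_count} value(s)")
        if c.max_count is not None and len(values) > c.max_count:
            violations.append(f"{c.path}: more than {c.max_count} value(s)")
        if c.allowed is not None:
            for value in values:
                if value not in c.allowed:
                    violations.append(f"{c.path}: {value} is not an allowed value")
    return violations


violations = check_resource(
    {"dct:title": ["Customer Relationship Manager"],
     "simpl:owner": ["Sales Department"],
     "simpl:operationalStatus": ["simpl:Active"]},
    APPLICATION_ASSET_V1_2_0,
)
```

A real deployment would delegate these checks to a SHACL engine operating on the parsed RDF graph rather than on prefixed-name strings.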
End-to-End Use Case: Validating a New Resource Description
This walkthrough illustrates the data flow for publishing a new ApplicationResource, reflecting the event-driven architecture where the Catalogue Service is a subscriber to the Schema Management Service (SMS).
Phase 1: Schema Publication and Notification (Pre-requisite)
Administrator Publishes Schema : A Governance Administrator uses the SMS Management API to change the status of the ApplicationAsset schema family to PUBLISHED.
SMS Publishes Event : The SMS successfully updates its database and sends a SchemaPublished event notification to all its subscribers, including the Catalogue Service. The event payload contains all the metadata and SHACL content for all versions of the ApplicationAsset schema.
Catalogue Service Updates Local Registry : The Catalogue Service receives the event, validates its signature, and populates its own local, optimized registry of published schemas. It now has a local copy of the ApplicationAsset schema and knows it is active for validation.
Phase 2: Resource Description Submission and Validation
Provider Creates Resource Description : A Provider authors a Resource Description, ensuring it conforms to a specific version of a published schema.
@prefix simpl: <https://api.simpl.space/meta#> .
@prefix dct: <http://purl.org/dc/terms/> .
a simpl:ApplicationResource ;
    dct:conformsTo <https://api.simpl.space/schemas/ApplicationAsset/1.2.0> ;
    dct:title "Customer Relationship Manager" ;
    simpl:owner "Sales Department" ;
    simpl:operationalStatus simpl:Active .
Provider Submits to Catalogue Service : The Provider sends the Resource Description content in a POST request to the Catalogue Service.
Catalogue Service Performs Local Validation :
The Catalogue Service parses the submitted RDF and extracts the schema URI: https://api.simpl.space/schemas/ApplicationAsset/1.2.0 .
It consults its local registry and confirms that the ApplicationAsset schema family is present and that its status is PUBLISHED.
It retrieves the content for version 1.2.0 from its local cache.
No API call is made to the SMS.
The Catalogue Service loads the provider’s RD and the local schema content into its internal SHACL engine for validation.
Publication and Storage :
Since validation succeeds, the Catalogue Service persists the Resource Description in its own dataset, making it published and discoverable.
If validation had failed (e.g., the RD was invalid, or the schema was not found in the local registry), the service would have returned an error to the Provider without storing the RD.
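The Phase 2 decision logic can be sketched as follows. The sketch assumes a local registry keyed by schema family name and a SHACL engine injected as a callable; both, like the function name process_submission and the returned reason strings, are hypothetical and only illustrate the order of the checks described above.

```python
from typing import Callable, Dict, Tuple

SCHEMA_BASE = "https://api.simpl.space/schemas/"

# Hypothetical registry layout: family name -> {"status": ..., "versions": {version: shacl_text}}
Registry = Dict[str, dict]


def process_submission(rd_text: str, conforms_to: str, registry: Registry,
                       shacl_validate: Callable[[str, str], bool]) -> Tuple[bool, str]:
    """Decide whether a Resource Description may be stored, using local data only."""
    if not conforms_to.startswith(SCHEMA_BASE):
        return False, "unknown schema URI"
    parts = conforms_to[len(SCHEMA_BASE):].split("/")
    if len(parts) != 2 or not all(parts):
        return False, "schema URI must identify a family and a version"
    family_name, version = parts
    family = registry.get(family_name)
    if family is None or family["status"] != "PUBLISHED":
        return False, "schema family not published in local registry"
    shacl_text = family["versions"].get(version)
    if shacl_text is None:
        return False, "schema version not found in local registry"
    # No call to the SMS is made: validation uses the cached schema content.
    if not shacl_validate(rd_text, shacl_text):
        return False, "resource description does not conform to the schema"
    return True, "stored"


registry = {"ApplicationAsset": {"status": "PUBLISHED",
                                 "versions": {"1.2.0": "...shape content..."}}}
ok, reason = process_submission(
    "...resource description...",
    "https://api.simpl.space/schemas/ApplicationAsset/1.2.0",
    registry,
    shacl_validate=lambda rd, shape: True,  # stand-in for the SHACL engine
)
```

Injecting the validator keeps the lookup logic testable independently of any particular SHACL library.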
The Resolver Interface is a public, read-only set of HTTP endpoints designed to provide stable, referenceable URIs for accessing the content of schemas and vocabularies managed by the Schema Management Service (SMS). It is the primary means for ad-hoc discovery and retrieval of schema and vocabulary resources by any client, including developers, tools, or other services bootstrapping their local caches.
The design is guided by the following principles:
Public Accessibility : Unlike the private, authenticated Management API, the Resolver Interface is open to the public for read-only operations.
Dereferenceable URIs : The URIs for schema and vocabulary concepts and versions are stable and can be resolved over HTTP to retrieve their content or metadata.
Content Negotiation : The interface supports content negotiation, allowing clients to request the resource representation that best suits their needs, such as RDF in various serializations or a JSON representation of the metadata.
Statelessness : Each request to the resolver contains all the information needed to process it, adhering to REST principles.
Content negotiation allows a client to request a specific representation of a resource. The Resolver Interface uses the standard HTTP Accept header for this purpose. Clients should specify their desired media type in this header.
Supported media types for schema and vocabulary versions include:
text/turtle: The raw SHACL or RDF content in Turtle format.
application/ld+json: The content in JSON-LD format. (PENDING)
application/rdf+xml: The content in RDF/XML format. (PENDING)
If a client does not provide an Accept header, the interface will respond with a default: text/turtle.
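The selection rule described above could be implemented roughly as in the sketch below. It is a simplified reading of HTTP content negotiation (quality values are honoured, but media-range specificity is not ranked), and the function name negotiate and the default list of supported types are assumptions; only text/turtle is treated as available because the other serializations are marked as pending.

```python
from typing import Optional, Sequence


def negotiate(accept: Optional[str],
              supported: Sequence[str] = ("text/turtle",),
              default: str = "text/turtle") -> Optional[str]:
    """Pick a media type for the response; None means 406 Not Acceptable."""
    if accept is None or not accept.strip():
        return default  # no Accept header: fall back to the default
    candidates = []
    for position, item in enumerate(accept.split(",")):
        fields = [field.strip() for field in item.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        quality = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as not acceptable
        candidates.append((-quality, position, media_range))
    # Highest quality first; ties keep the order given by the client.
    for neg_quality, _, media_range in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 explicitly excludes the range
        for media_type in supported:
            if media_range in ("*/*", media_type):
                return media_type
            if media_range.endswith("/*") and media_type.startswith(media_range[:-1]):
                return media_type
    return None


chosen = negotiate("application/ld+json;q=0.9, */*;q=0.1")
```

Once the pending serializations become available, they would only need to be added to the supported sequence.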
These endpoints provide access to content of schema concepts and their versions.
Resolve Schema Concept
Resolves a schema concept. By default, this endpoint returns the raw SHACL content of the latest published version of the schema, providing a stable URI for clients that always need the most up-to-date version.
Endpoint : GET /schemas/{schemaName}
Description : This endpoint provides a single, stable URI for a schema family. Its behavior depends on the Accept header.
Authorization : None required. This is a public endpoint.
Responses :
200 OK : Returns the appropriate resource based on the Accept header.
404 Not Found : If the {schemaName} does not exist.
406 Not Acceptable : If the server cannot provide a representation in the requested format.
Example 1: Resolving the latest schema content (default)
Request:
GET /schemas/ApplicationAsset HTTP/1.1
Host: api.simpl.space
Accept: text/turtle
Response (Content-Type: text/turtle):
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix simpl: <https://api.simpl.space/meta#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<https://api.simpl.space/schemas/ApplicationAsset/1.2.0>
    a sh:NodeShape ;
    sh:targetClass simpl:ApplicationResource ;
    sh:property [
        sh:path dct:title ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path simpl:owner ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
    ] ;
    sh:property [
        sh:path simpl:operationalStatus ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:in ( simpl:Active simpl:Inactive simpl:Decommissioned ) ;
    ] .
Resolve Schema Version
Retrieves the raw SHACL content for a specific version of a schema.
Endpoint : GET /schemas/{schemaName}/{version}
Description : This endpoint unambiguously resolves to the SHACL file for a specific, immutable version of a schema.
Authorization : None required. This is a public endpoint.
Content Negotiation : Supported. Clients can request different RDF serializations via the Accept header.
Responses :
200 OK : Returns the schema content in the requested or default format.
404 Not Found : If the {schemaName} or {version} does not exist.
406 Not Acceptable : If the server cannot provide a representation in the requested format.
Request:
GET /schemas/ApplicationAsset/1.2.0 HTTP/1.1
Host: api.simpl.space
Accept: text/turtle
The response is identical to the one shown in Example 1 above.
This section outlines the architecture for the notification service, which uses an asynchronous API with Kafka for message queuing.
The following AsyncAPI specification defines the contract for the notification service. It details the channels, messages, and operations for sending notifications.
asyncapi: '3.0.0'
info:
  title: Notification Service API
  version: '1.0.0'
  description: API documentation for a notification service using Kafka.
defaultContentType: application/json
servers:
  production:
    host: 'kafka://localhost:9094'
    protocol: kafka-secure
    description: Kafka server
channels:
  notifications:
    address: "notifications"
    messages:
      EmailNotification:
        $ref: "#/components/messages/EmailNotification"
operations:
  SendNotification:
    action: send
    summary: Sending notification message to Kafka topic 'notifications'
    channel:
      $ref: '#/channels/notifications'
components:
  messages:
    EmailNotification:
      name: EmailNotification
      title: Sending email notification
      payload:
        type: object
        properties:
          channel:
            type: string
            enum:
            description: Type of notification channel.
          message:
            type: string
            description: Body of the message.
          to:
            type: string
            description: Email address of the recipient.
          cc:
            type: array
            items:
              type: string
            description: List of email addresses in CC.
          subject:
            type: string
            description: Subject of the message.
For any service to send notifications, the service will act as a Kafka producer and publish messages to the notifications topic. The notification service will then consume these messages and send out the actual notifications (e.g., emails).
Construct the Message : The service creates a message that conforms to the EmailNotification schema defined in the AsyncAPI specification.
Serialize the Payload : The message payload is serialized into a JSON string.
Publish to Kafka : The serialized payload is sent to the notifications Kafka topic.
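The three steps above can be sketched as follows. The Kafka client is deliberately abstracted behind a send callable, which stands in for whatever producer library the calling service uses; that callable and the helper names build_email_notification and publish_notification are assumptions for illustration. The payload field names follow the EmailNotification schema.

```python
import json
from typing import Callable, Dict, List, Optional


def build_email_notification(channel: str, to: str, subject: str, message: str,
                             cc: Optional[List[str]] = None) -> Dict[str, object]:
    """Step 1: construct a message with the EmailNotification payload fields."""
    if not to:
        raise ValueError("a recipient address is required")
    notification: Dict[str, object] = {
        "channel": channel,
        "to": to,
        "subject": subject,
        "message": message,
    }
    if cc:
        notification["cc"] = list(cc)
    return notification


def publish_notification(send: Callable[[str, bytes], None],
                         notification: Dict[str, object],
                         topic: str = "notifications") -> bytes:
    """Steps 2 and 3: serialize to JSON and hand the bytes to the producer."""
    payload = json.dumps(notification).encode("utf-8")
    send(topic, payload)  # 'send' wraps the real Kafka producer (assumption)
    return payload


# Usage with an in-memory stand-in for the producer.
sent = []
publish_notification(
    lambda topic, payload: sent.append((topic, payload)),
    build_email_notification("notifications", "providers@example.com",
                             "New Schema Published", "A new schema has been published."),
)
```

Keeping the producer behind a callable lets the calling service supply its own broker configuration without changing the message-building code.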
The calling service will need to be configured with the Kafka broker details to connect and publish messages. On dev this is deployed in the common namespace along with other shared services.
We also need to set up a configuration for the ‘to:’ email address used in the notifications.
Here is an example of a notification message that would be sent to providers when a new schema is published.
{
  "channel": "notifications",
  "to": "providers@example.com",
  "subject": "New <ResourceType> Schema Published: <New Schema Name>",
  "message": "A new schema has been published with the following details:\n\n- **Name**: New Schema Name\n- **Title**: Title of the New Schema\n- **Description**: A brief description of what this new schema is about.\n- **Resource Type**: Schema"
}
The scope of Simpl-Open’s security is limited to its role as an agent that facilitates communication between participants (nodes).
An example of a typical end-to-end (E2E) flow is outlined below:
A Consumer decides to access a dataset managed by a Data Provider.
The Data Provider ensures the security of its dataset and compliance with the applicable regulations, at the time of its creation.
Only a portion of the dataset may be made available for sharing.
Simpl-Open does not oversee the control of datasets, which remain entirely under the ownership and management of the Data Provider.
A Consumer reserves an infrastructure tenant from an Infrastructure Provider.
This tenant is a Platform-as-a-Service (PaaS) environment derived from Infrastructure-as-a-Service (IaaS) and PaaS services provided by the Infrastructure Provider.
Both the Data Provider and the Consumer can access this dedicated tenant.
Data is transferred from the Data Provider to the Infrastructure Provider’s infrastructure (dedicated tenant) using the Simpl-Open Agent, which manages:
Contract Establishment between the consumer’s and the provider’s organisations.
Secure Communication: Ensuring safe data transfer from the provider (source) organisation to the consumer (target) organisation.
Access Control: Granting tenant access to authorised personnel only.
This typical end-to-end flow is presented in the following figure.
Simpl-Open functions as middleware, managing Agent-to-Agent communication flows without storing any datasets. The participants (Data Controllers) are the primary decision-makers in data processing, so legal and security responsibilities regarding the data rest solely with them. This includes their obligations to comply with legal and regulatory requirements for both data usage and provision.
As Simpl-Open is a distributed system, the classical end-to-end responsibilities (such as security and operations) are segmented as follows:
Network responsibility: facilitates the pure exchange of information through a network of Simpl-Open agents (i.e. deployment of Simpl-Open).
Local Node responsibility: each participant (node) is accountable for managing its datasets, applications, infrastructure, and workstation in compliance with its local regulations.
This segmentation of responsibilities is depicted in the following figure.
The overall security of a Data Space is the result of contributions from multiple actors, with their respective responsibilities, structured as follows:
Governance Authority : orchestrates the security framework across all participants.
Every Participant : Each participant is required to have local IT security plans and implement measures for their personnel, IT systems, and local deployment of the Simpl-Open agent.
Deployment of Simpl-Open network : provides security capabilities to ensure a robust protection for the node-to-node communication.
Simpl-Open agent : Each agent includes features to comply with the Simpl-Open IT Security Plan, ensuring alignment with the product’s security requirements.
Simpl-Open development : The development process adheres to stringent security measures, ensuring the product is resilient against potential threats.
This section focuses exclusively on the architecture of Simpl-Open as a product.
Separate architecture documents will be created for each deployment of Simpl-Open, including IT security plans tailored to specific Data Spaces and detailing the responsibilities assigned to each participant.
Several security aspects are addressed in the “DevSecOps Approach” section of the Architecture document, in the following areas:
| Domain | Confluence Reference |
|---|---|
| User Management | Audit process (WIP) |
| OVH audit trails | OVH Log Data Platform service is used for K8s audit logs management. |
| Security testing (SAST, SCA, DAST) | SAST, DAST and SCA are implemented as part of the DevSecOps pipelines, as described in the DevSecOps Approach section. |
| Backup and restore | Cluster backups are made using Velero. |
This part covers the aspects related to the production of Simpl-Open as a software product.
The following tables present the features that have already been introduced as part of the security architecture of Simpl-Open.
These features were identified as a consequence of other business features described in SC1 Annex 1, or were implemented based on standard best practices in application architecture.
In future versions of the architecture document, each relevant section could be updated to highlight how the security controls are implemented in Simpl-Open, for example through a dedicated security paragraph describing the specific security control implementation.
The following table presents the features that have been analysed and designed to address the security aspects listed below.
| ID | Domain | Node | Feature | Section of document |
|---|---|---|---|---|
| 1 | Tier 1 Access Control | All | RBAC (Role Based Access Control) | Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation |
| 2 | Tier 2 Access Control | Governance Authority (Management) | ABAC (Attribute Based Access Control) | Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation |
| 3 | Local Directory System | All | Tier 1 Authentication Provider (OpenID Connect), User & Roles | Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation |
| 4 | Tier 1 Authorisation | All | Authorisation Tier 1 | Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation |
| 5 | Tier 2 Authorisation | Governance Authority | Authorisation Tier 2 | Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation |
| 6 | Tier 1 Authentication | All | Tier 1 Authentication Provider (OpenID Connect) | Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation |
| 7 | Tier 2 Authentication | Governance Authority | Identity Provider Federation, Security Attribute Provider | Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation |
| 8 | Tier 2 Authentication | All | Tier 2 Authentication Provider | Simpl-Open Technology Architecture > Detailed Technical Specifications > Identification, Authentication & Authorisation |
| 9 | Communication | All | Encryption, Integrity, Authentication | Simpl-Open High Level Overview > Data Space Concepts (see "Data Space Participant: Tier I and Tier II") |
| 10 | Logging | All | Logging, Monitoring, Reporting | Simpl-Open Technology Architecture > Detailed Technical Specifications > Logging, Monitoring & Reporting |
Technical security includes deployment aspects of the open-source technology components, which are outlined below. These have been implemented based on general secure development guidelines and standard security architecture patterns.
| ID | Domain | Node | Components | Section of document |
|---|---|---|---|---|
| 1 | Local Directory System | All | Keycloak federated with any Local IDP, User & Roles microservice | Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA |
| 2 | Tier 1 Authorisation | All | Spring Cloud Gateway | Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA |
| 3 | Tier 2 Authorisation | Governance Authority | Spring Cloud Gateway | Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA |
| 4 | Tier 1 Authentication | All | Keycloak | Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA |
| 5 | Tier 2 Authentication | Governance Authority | EJBCA, Security Attribute Provider microservice | Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA |
| 6 | Tier 2 Authentication | All | Tier 2 Authentication Provider microservice | Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 1 - Onboarding & IAA |
| 7 | Logging | All | Monitoring Service (ELK) | Simpl-Open Technology Architecture > Technology Components Views > TCV - Domain 3 - Management/Operation of Data Space |
Next to the above features implemented as part of the Simpl-Open product, every participant should consider the following typical infrastructure deployment practices (outside of Simpl-Open scope), including technical security / hardening features, such as:
DMZ protected access and Network Security of the DMZ (DDoS or any other attack), IDS, FW, VPN
2-3 tiers deployment view (Via Container Security or VMs/Network Security)
API Gateway / Front End Layer
Backend Services
Secure Interface towards each Node Applications/Data Sources
These recommended measures are depicted in the following figure:
This section describes the handshake process to establish a secured mTLS connection with another agent.
The initial version of the handshake process assumes that the called endpoint belongs to another Simpl-Open agent (i.e. it is secured by that agent’s Tier 2 Authorisation Gateway). For this reason, every HTTP call performed by the mTLS HTTP Client (the only option for the current release) attempts to establish an mTLS connection. However, if the target endpoint does not belong to a Simpl-Open agent, the communication falls back to standard HTTPS (TLS).
Handshake process steps:
1. The caller mTLS client uses the Tier 2 Authentication Provider to retrieve a valid Ephemeral Proof:
a. from the agent cache, if it already exists there;
b. otherwise, a new proof is requested from the Authority and stored in the cache for subsequent calls until it expires.
2. The caller agent performs an mTLS authentication:
a. the caller agent always performs a credential check (OCSP request) against the Authority Identity Provider;
b. the called agent’s Tier 2 Authentication Gateway performs a credential check (OCSP request) against the Authority Identity Provider only if no validated proof associated with the credential’s public key is found in its cache.
3. The caller agent always sends the ephemeral proof to be validated:
a. the called agent checks the cache to see whether the proof has already been validated (if so, step 3.b is not performed);
b. the proof is validated and stored in the cache with a TTL (Time To Live) calculated according to its expiration time.
4. Call the endpoint.
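The caching behaviour on the called agent's side (a validated proof is remembered until it expires, so validation is skipped for repeated calls) could look like the following sketch. The class ProofCache, the function accept_proof, the use of a proof identifier as cache key and the injected validate callable are all hypothetical; the actual proof format and validation logic are not covered here.

```python
import time
from typing import Callable, Dict


class ProofCache:
    """Remembers validated proofs until their own expiration time."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._expiry_by_proof: Dict[str, float] = {}
        self._clock = clock

    def store(self, proof_id: str, expires_at: float) -> bool:
        """Cache a validated proof; the TTL is derived from its expiration time."""
        ttl = expires_at - self._clock()
        if ttl <= 0:
            return False  # already expired, nothing to cache
        self._expiry_by_proof[proof_id] = expires_at
        return True

    def is_validated(self, proof_id: str) -> bool:
        expires_at = self._expiry_by_proof.get(proof_id)
        if expires_at is None:
            return False
        if expires_at <= self._clock():
            del self._expiry_by_proof[proof_id]  # evict the expired entry
            return False
        return True


def accept_proof(cache: ProofCache, proof_id: str, expires_at: float,
                 validate: Callable[[str], bool]) -> bool:
    """Validate a proof only when it is not already cached as validated."""
    if cache.is_validated(proof_id):
        return True
    if not validate(proof_id):
        return False
    return cache.store(proof_id, expires_at)
```

Injecting the clock makes expiry behaviour testable without waiting for real time to pass.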
The enhanced version of the handshake process first discovers whether the called endpoint belongs to another Simpl-Open agent (i.e. it is secured by that agent’s Tier 2 Authorisation Gateway) and attempts to establish an mTLS connection only in that case. If the target endpoint does not belong to a Simpl-Open agent, the communication falls back to standard HTTPS (TLS). The main optimisations are:
No mTLS authentication is done for non-Simpl-Open endpoints
Only one credential check per called agent per proof is performed
A proof that has already been exchanged is not sent again
Extended Ephemeral Proof Validation (issuance and expiration time are checked to be valid by all agents)
Handshake process steps:
1. The caller agent mTLS client (client/transparent proxy/any other implementation) uses the Tier 2 Authentication Provider to retrieve a valid Ephemeral Proof and its hash:
a. from the agent cache, if it already exists there;
b. otherwise, a new proof is requested from the Authority and stored in the cache for subsequent calls until it expires.
2. The caller agent performs a preflight call (HTTP OPTIONS) to the called agent’s T2 preflight endpoint, passing the proof hash as a query string parameter to announce which proof will be used.
3. The caller agent performs an mTLS authentication:
a. the caller agent tries to retrieve from its cache the credential check response associated with the received credential’s public key;
b. if no credential check response is found in the cache, the caller agent performs a credential check request (OCSP) against the Authority Identity Provider and stores the response in the cache with a specified TTL;
c. the called agent’s Tier 2 Authentication Gateway tries to retrieve from its cache the credential check response associated with the received credential’s public key;
d. if no credential check response is found in the cache, the called agent’s Tier 2 Authentication Gateway performs a credential check request (OCSP) against the Authority Identity Provider and stores the response in the cache with a specified TTL.
4. Only if a 204 - No Content status code was received in step 2 does the caller agent send the ephemeral proof to be validated.
5. Call the endpoint.
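The caller-side preflight decision can be sketched as below. The hash algorithm (SHA-256), the query parameter name proofHash and the example preflight URL are assumptions, since the text above does not specify them; only the rule that the proof is sent when the preflight returns 204 - No Content is taken from the process description.

```python
import hashlib
from urllib.parse import urlencode

HTTP_NO_CONTENT = 204


def proof_hash(proof: bytes) -> str:
    """Hash of the ephemeral proof (SHA-256 is an assumption)."""
    return hashlib.sha256(proof).hexdigest()


def preflight_url(preflight_endpoint: str, proof: bytes) -> str:
    """URL for the HTTP OPTIONS preflight call announcing which proof will be used."""
    # 'proofHash' is a hypothetical parameter name.
    return f"{preflight_endpoint}?{urlencode({'proofHash': proof_hash(proof)})}"


def must_send_proof(preflight_status: int) -> bool:
    """The proof is sent only when the preflight answered 204 - No Content."""
    return preflight_status == HTTP_NO_CONTENT


url = preflight_url("https://agent.example.com/t2/preflight", b"example-proof")
```

Sending only the hash in the preflight lets the called agent recognise an already validated proof without the proof itself being transmitted again.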
The administration of the Simpl-Open agent requires technical accounts that are allowed to:
Perform the first configuration of the agent
Create the accounts for managing the operations of the agent components. A standard structure for these accounts will be proposed but this structure should be tailored to the business and technical organisation of each participant, where rights are assigned to roles, and roles are assigned to people.
This section gives an overview of the architecture for the DevSecOps tools and environments for Simpl-Open. The following diagram is taken over from Specific Contract 1 - Terms of reference, which provides an overall view of the required DevSecOps approach completed with the relevant choices of tools/technologies in our implementation.
The architecture diagram below shows the main components of the DevSecOps toolchain used to comply with the above-mentioned approach for the development of Simpl-Open.
The central CI/CD pipeline to build and test the applications and components, as well as the code repositories, are on a GitLab instance on code.europe.eu.
OVH is used for the different Kubernetes clusters:
Dedicated cluster with various namespaces for development;
Dedicated cluster with various namespaces for integration;
Dedicated cluster with various namespaces for end-to-end testing;
Dedicated clusters for Keycloak identity management, GitLab Runners, DevSecOps tools and DevSecOps tool testing (staging env for DevSecOps tools).
Tickets, test cases and test reports are found in Jira, which is set up along with Xray for test management.
The diagram reflects the current status of the toolchain, with planned elements shown shaded.
The DevSecOps team provides the infrastructure resources to the development teams. It is responsible for the setup of the Kubernetes clusters on OVH and the management of those.
The cluster “Dev-components” is set up as the development environment. Each team gets isolated namespace(s) to run their services.
To avoid vendor lock-in, it is proposed to avoid using managed services on OVH, such as managed databases. This is intended to ensure that the designed solution also works on other cloud platforms without modification.
The DevSecOps team manages the clusters via Rancher and provides access to the projects and namespaces only to the team members of the product streams.
The expected workload for each of the environments is estimated based on the input from the different development teams and used for initial cluster sizing.
The table below shows the different stages and what they are used for.
| Stage | Purpose | Data | Operations Level Agreement | Target User Group (responsible for deployment) | Release Management | Deployment Strategy (when releases are applied) |
|---|---|---|---|---|---|---|
| Dev | | | n/a | | | |
| Int | | | n/a | | | |
| Pre-Prod | | | n/a | | | |
If a Dev-Team needs a new environment for any stage, they need to create an issue in the following GitLab repo: Simpl/Operations/Environment-onboarding.
The DevSecOps team will create the environment with the default tool stack and grant access to it afterwards.
Process Description:
Create project in Rancher in the desired cluster (dev, int, …);
Deploy default toolstack (ingress etc.);
Create project in Argo CD;
Grant access to Rancher project and Argo CD project.
The following best practices are used to secure the environments.
Access is granted / revoked based on the process described below.
Definition of Basic roles (as tracked in PMO master list)
| User role name | Description |
|---|---|
| ADMIN | Role for the operation of the DevSecOps toolchain |
| PSO/ EC | Members of PSO accessing the DevSecOps toolchain for quality assurance purposes |
| DEVELOPER | Developers who will use the DevOps pipeline for development activities |
| LEAD DEVELOPER | Developer with code ownership and elevated security privileges |
| DEVELOPER OPS | Developers with elevated infrastructure privileges |
| LEAD DEVELOPER OPS | Developers with code ownership and elevated security + infrastructure privileges |
| TESTER | Testers who will take part in the testing of developed code |
This table shows the mapping between the basic roles and the internal roles within each tool.
| Basic Role / Tool (with internal Roles) | code.europe.eu | Argo CD (ADMIN, DEV, READ-ONLY) | Rancher (ADMIN, Project Member, Read-Only) | Vault (ADMIN, DEV, DENY-ALL) | Fortify (Security Lead (Admin), Developer, Lead Developer, Tester) | SonarQube (ADMIN, DEV) | Prometheus | Grafana | Loki | Aerokube Moon |
|---|---|---|---|---|---|---|---|---|---|---|
| everyone on PMO master list (no need for any specific DevSecOps role) | DEVELOPER on Simpl group level (subject to self-registration) | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| ADMIN | MAINTAINER is set manually for DevSecOps team | ADMIN for Argo CD | ADMIN | ADMIN | Security Lead (Admin) | ADMIN | ADMIN | ADMIN | ADMIN | ADMIN |
| PSO | DEVELOPER on Simpl group level (subject to self-registration) | READ-ONLY | READ-ONLY | DENY-ALL | n/a | Dev | n/a | VIEWER | n/a | n/a |
| DEVELOPER | DEVELOPER on Simpl group level (subject to self-registration) | Project Member (per project) | n/a | n/a | Developer (per application) | Dev | n/a | VIEWER | n/a | n/a |
| LEAD DEVELOPER | DEVELOPER on Simpl group level, CODEOWNER in repository (subject to self-registration) | Project Member (per project) | n/a | n/a | Lead Developer (including Developer rights) (per application) | Dev | n/a | VIEWER | n/a | n/a |
| DEVELOPER OPS | DEVELOPER on Simpl group level (subject to self-registration) | Project Member (per project) | Project Member (per project) | DEV (per project) | Developer (per application) | Dev | n/a | VIEWER | n/a | n/a |
| TESTER | DEVELOPER on Simpl group level (subject to self-registration) | READ-ONLY | n/a | n/a | Tester | Dev | n/a | VIEWER | n/a | TESTER |
Keycloak is used as the central instance for user management and provides the login mechanisms for the different tools.
Application security scans will be done by Fortify on Demand (FoD). The scope is the following:
Static Application Security Test (SAST);
Software Composition Analysis (SCA);
Container Scanning;
Dynamic Application Security Test (DAST).
SAST (Static Application Security Testing) is a method of testing the source code of an application for security vulnerabilities without executing the application. It’s a type of white-box testing that analyses the application’s internal structures and logic on the code level for flaws that might lead to security risks and vulnerabilities. Main advantages of using SAST:
Enables detection and remediation of vulnerabilities early in the Software Development Lifecycle (SDLC), reducing costs and risks;
Can analyse the entire codebase, providing a comprehensive security assessment;
Helps the project to comply with security standards like OWASP, PCI DSS and others.
WHEN: SAST is performed on every commit/merge
WHAT: the source code is scanned
WHERE: during the development lifecycle
WHO: the pipeline starts the analysis automatically
Fortify is integrated with the central component development pipeline, which triggers a Static Application Security Test (SAST) when new code is merged into the repository by the development or integration teams. For a feature branch, a scan can be requested manually. The result of the scan is shown on the Fortify dashboard. Developers can review the results for their components and handle identified vulnerabilities in the next version of the code. A quality gate set in Fortify must be met before the pipeline merges the code to the main branch.
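The quality gate logic can be illustrated with a minimal Python sketch. It is not the Fortify configuration: the severity names, the threshold values and the function `evaluate_quality_gate` are hypothetical and only show how a pipeline step could turn scan results into a pass/fail decision.

```python
from dataclasses import dataclass

# Hypothetical thresholds: maximum number of open findings tolerated per severity.
# The real gate is configured in Fortify; these numbers are illustrative only.
DEFAULT_THRESHOLDS = {"critical": 0, "high": 0, "medium": 5}


@dataclass(frozen=True)
class GateResult:
    passed: bool
    violations: dict  # severity -> (found, allowed)


def evaluate_quality_gate(findings, thresholds=None):
    """Compare the number of findings per severity with the allowed maximum."""
    thresholds = DEFAULT_THRESHOLDS if thresholds is None else thresholds
    violations = {}
    for severity, allowed in thresholds.items():
        found = findings.get(severity, 0)
        if found > allowed:
            violations[severity] = (found, allowed)
    return GateResult(passed=not violations, violations=violations)


result = evaluate_quality_gate({"critical": 0, "high": 2, "medium": 3})
print("merge allowed" if result.passed else f"merge blocked: {result.violations}")
```

In this sketch a single severity above its threshold blocks the merge, which matches the idea of a gate that must be met before code reaches the main branch.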
SCA (Software Composition Analysis) is a process used to identify and manage risks associated with the use of third-party and Open-Source software components in an application. It is a critical aspect of modern software development, as applications increasingly rely on external libraries and frameworks. Main advantages of using SCA:
Protects the Simpl agent by identifying and addressing vulnerabilities in external components;
Mitigates legal risks from improper use of Open-Source licenses;
Automates tracking and reporting of third-party components.
WHEN: SCA is performed on every commit/merge
WHAT: third-party libraries are scanned
WHERE: during the development lifecycle
WHO: the pipeline starts the analysis automatically
SCA is integrated using the same approach as SAST, through the Debricked service of the Fortify online platform.
Container scanning refers to the process of analysing and inspecting container images for security vulnerabilities, compliance issues, malware and other potential risks before they are deployed in a production environment. Here’s a brief overview:
Purpose: The primary goal is to ensure that containers are free from known security vulnerabilities and adhere to organisational security policies. This helps in maintaining the integrity, confidentiality and availability of applications running inside these containers.
Components Scanned:
Base Images: Checking if the base images from which containers are built have any known vulnerabilities.
Dependencies: Examining all the software libraries and dependencies included within the image for vulnerabilities or outdated versions.
Configuration: Assessing the container’s configuration files for potential security misconfigurations.
WHEN: container scanning is performed in the development lifecycle on every branch and on every commit/merge
WHAT: the containers and images used are scanned
WHERE: during the development lifecycle
WHO: the pipeline starts the analysis automatically
DAST (Dynamic Application Security Testing) is a method used to identify security vulnerabilities in an application by analysing it during runtime. It simulates attacks on a running application, typically from an external perspective, to uncover vulnerabilities that can be exploited in real-world scenarios. Main advantages of using DAST:
Identifies vulnerabilities of the Simpl software as an attacker would exploit them;
Finds issues related to application logic, runtime behaviour and server configuration.
WHEN: at the end of the development lifecycle, after E2E testing
WHAT: DAST is performed on the Agents
WHERE: DAST is performed in the pre-prod environment
WHO: the E2E team is responsible for configuring and running the scans
DAST is implemented using the WebInspect service of the Fortify online platform.
While SAST, SCA and Container Scanning are integrated into the component pipeline and used at component level, DAST will be triggered manually after the integrated Simpl agent has been deployed in the pre-prod environment and the preparatory steps have been completed. This testing type may require a runtime of up to 2-3 days by Fortify. Once the testing is complete, the Fortify dashboard will provide an overview of the results. As with SAST and SCA, developers can review the identified issues and act on them as necessary.
In order to set up a flexible and scalable environment for managing our containerised applications, Kubernetes has been identified as the best fit technology. The primary reasons for the choice are:
vendor neutral platform;
support of microservices infrastructure;
autoscaling capabilities to handle growing and fluctuating workloads;
support for DevSecOps;
support of multi-tenant environments;
high availability.
Main features of Kubernetes:
Network Policies : Network policies are defined to control traffic within the cluster; OVH provides a default set of policies. Inside the Kubernetes cluster, ingress and egress isolation is applied at pod level according to the specific needs of the cluster;
Secret Management : Kubernetes Secrets and/or Vault are used to store sensitive data securely. Secrets used in the pipeline are stored in an external tool like Vault;
Role-Based Access Control (RBAC) : RBAC is implemented to manage access to cluster resources based on user roles. Keycloak groups are mapped to Rancher projects to ensure proper isolation of namespaces, and user roles are mapped to Keycloak groups/roles;
Service Accounts : Service Accounts are used to authenticate and authorise pods;
Image Scanning : The integrity and security of container images are verified in the pipeline before deployment, using Trivy/Fortify triggered by the pipeline;
Regular Updates and Patching : Regular updates keep the Kubernetes distribution and its components up to date with the latest security patches. Since Kubernetes is a managed service, updates are made available by OVH. Admins regularly check the update options and decide whether to stay on the current version or update.
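As an illustration of pod-level ingress and egress isolation, the following Python sketch builds two Kubernetes `NetworkPolicy` manifests: a default-deny policy for a namespace and one explicit ingress allowance. The "deny by default, then allow" pattern, the namespace, the `app` labels and the port are assumptions made for the example; they are not taken from the Simpl-Open cluster configuration.

```python
import json


def default_deny_policy(namespace):
    """Build a NetworkPolicy that blocks all ingress and egress for every pod in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_ingress_from(namespace, app_label, source_label, port):
    """Allow pods labelled app=<source_label> to reach pods labelled app=<app_label> on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{source_label}-to-{app_label}", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": app_label}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": source_label}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical namespace and labels, printed as JSON (which kubectl also accepts).
for policy in (default_deny_policy("example-agent"),
               allow_ingress_from("example-agent", "catalogue", "gateway", 8080)):
    print(json.dumps(policy, indent=2))
```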
This section outlines the implementation of continuous deployment (CD) and GitOps in Simpl-Open using GitLab CI/CD, Helm Charts, Argo CD and multiple environments.
The goal is to automate the release management process, ensuring consistent and reliable deployments across various environments.
The architecture consists of:
GitLab : The source code is managed on the GitLab instance at code.europa.eu, which also runs the CI/CD pipeline;
Helm Charts : Packages that describe Kubernetes applications, managed with the Helm package manager;
Argo CD : A continuous deployment tool for automating the application release process for the development, integration and pre-prod environments;
Fleet Management : A Kubernetes concept and tool to centrally manage DevSecOps tools, agents and components on every cluster in the landscape;
Multiple Environments : The deployment is done in multiple environments.
The project uses GitFlow, a branching strategy for Git repositories designed to streamline collaboration and manage releases in software projects. GitFlow is widely adopted in software development workflows, especially for projects with regular release cycles.
The advantages of the GitFlow approach:
Clearly defines branches for development, features, releases and hotfixes;
Makes it easier for teams to work on different features or issues simultaneously;
Facilitates managing multiple releases and hotfixes.
Artefacts should be versioned according to the Semantic Versioning Concept.
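The versioning rule can be sketched as follows in Python. The example only covers the core MAJOR.MINOR.PATCH part of Semantic Versioning (pre-release and build metadata are left out), and the helper names `parse` and `bump` are hypothetical; they are not part of the Simpl-Open pipeline.

```python
import re
from typing import NamedTuple

# Core MAJOR.MINOR.PATCH only; numeric parts must not have leading zeros.
_CORE = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    def __str__(self):
        return f"{self.major}.{self.minor}.{self.patch}"


def parse(text):
    """Parse 'MAJOR.MINOR.PATCH' into a Version, rejecting anything else."""
    match = _CORE.match(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*map(int, match.groups()))


def bump(version, change):
    """Return the next version for a 'major', 'minor' or 'patch' change."""
    if change == "major":  # incompatible change: lower parts reset to zero
        return Version(version.major + 1, 0, 0)
    if change == "minor":  # backwards-compatible feature
        return Version(version.major, version.minor + 1, 0)
    if change == "patch":  # backwards-compatible fix
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")


print(bump(parse("1.4.2"), "minor"))
```

Because `Version` is a tuple of integers, versions compare numerically rather than as strings, so 1.10.0 sorts after 1.9.3.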
The GitFlow approach in Simpl is depicted in the following diagram:
Explanation for the branches:
main : The main branch, which represents the production-ready code;
develop : The development branch, where new features are developed and tested;
feature/* : Feature branches for specific tasks or fixes;
release : Release branch for release candidates;
hotfix : Forked from tags of the main branch used for urgent fixes.
In GitLab (code.europa.eu) the main and develop branches are set up as protected. Merges to main can be initiated from develop, release and hotfix. Developers remove the release and hotfix branches once the updated code has been merged to main. The develop branch only accepts merges from feature/* branches (for instance feature/SIMPL-1234). Developers remove the feature branch after merging to develop.
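The merge rules for the protected branches can be expressed as a small lookup, sketched here in Python. In practice the rules are enforced by GitLab branch protection; this example only restates them. Two points are assumptions: release and hotfix branches are matched with the patterns `release*` and `hotfix*` so that suffixed names are accepted, and targets without a stated rule are treated as unrestricted.

```python
from fnmatch import fnmatchcase

# Allowed source-branch patterns per protected target, as described in the text.
ALLOWED_SOURCES = {
    "main": ["develop", "release*", "hotfix*"],
    "develop": ["feature/*"],
}


def merge_allowed(source, target):
    """Return True if a merge from `source` into `target` follows the branch rules."""
    patterns = ALLOWED_SOURCES.get(target)
    if patterns is None:
        # Assumption: no rule is stated for unprotected target branches.
        return True
    return any(fnmatchcase(source, pattern) for pattern in patterns)


for source, target in [("feature/SIMPL-1234", "develop"),
                       ("feature/SIMPL-1234", "main"),
                       ("hotfix", "main")]:
    print(f"{source} -> {target}: {merge_allowed(source, target)}")
```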
The pipeline is implemented using GitLab CI/CD. There are multiple steps included to ensure proper testing and security before the deployment:
Pipeline features:
Building the Code : Compile the source code into executable binaries or artifacts;
Perform Unit Testing : Execute automated tests to verify individual components of the codebase for correctness;
Create package for distribution : Bundle the application into a distributable format such as a JAR and publish it to the GitLab artifacts;
Perform Quality Testing with Sonar : Perform static code analysis to identify code quality issues and technical debt;
Build and Push Docker Image : Create a Docker image from the application and push it to the GitLab Container registry;
Perform Image Scan and produce SBOM : Scan the Docker image for vulnerabilities and compliance issues using Trivy, and produce a list of all dependencies (SBOM) with Trivy and Fortify;
Perform SAST (Static Application Security Testing) : Analyse the source code for security vulnerabilities without executing the code, using Fortify;
Perform SCA (Software Composition Analysis) : Identify and analyse Open-Source components (dependency scanning) for known vulnerabilities, using Trivy for image scanning and Fortify for dependencies;
Release reports, Java package and Helm Chart : Update the version, then package and release a Helm chart for Kubernetes deployments in the GitLab Package Registry.
Pipeline runs can be tracked in the GitLab UI. Issues are indicated by the progress diagrams in the UI; details are provided by GitLab based on the logs of the failing jobs.
The release management process will be carried out on two distinct levels (App of Apps concept):
Unitary Component development
Agent Components (integration and pre-prod)
The following diagram shows the overall process:
As shown in the diagram there are multiple stages with different environments:
Development Environment : The development environment is where new features and enhancements are developed and tested on component level;
Integration Environment : The integration environment is dedicated to integration activities and integration testing before components are promoted to pre-prod;
Pre-Prod Environment : The pre-prod environment serves as an environment where features are integrated and tested together as a cohesive unit, end to end. Load testing also takes place here.
A Production environment is not planned for Simpl-Open, only for Simpl-Labs and Simpl-Live.
As an overall concept, the release management process is automated using GitLab CI/CD and Argo CD:
Prepare a Release : A new release version is created following the GitFlow approach on GitLab;
Build and Test : The extended pipeline stages run to validate the release quality, including E2E testing and extensive security scanning;
Deploy : Argo CD deploys the release to the target environment;
Verify : Verify the deployment by running tests and monitoring application logs.
Helm Charts are used to manage components and Kubernetes applications. Similarly, Helm Charts define the application on Agent level.
Chart Management : Helm Charts are managed using GitLab CI/CD, allowing for automated updates and versioning;
Deployment : Helm Charts are deployed to the target environment using Argo CD.
Component development teams release their component as a Helm Chart. Following the App of Apps concept, the individual components (Apps) are defined and managed on the Agent level (App). This is ensured by configuring the Helm Charts in a hierarchical manner (Agent level configuration overrides component level configuration).
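The hierarchical override can be pictured as a recursive merge of value maps, sketched below in Python. The sketch follows the general way Helm merges values (nested maps are merged key by key, any other value is replaced); the keys and values shown are hypothetical and do not come from the Simpl-Open charts.

```python
import copy


def merge_values(component, agent):
    """Return a new mapping in which agent-level values override component-level values."""
    merged = copy.deepcopy(component)
    for key, value in agent.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested maps are merged key by key.
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars and lists from the agent level replace the component default.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical component defaults and agent-level overrides.
component_values = {"replicaCount": 1, "image": {"repository": "example/component", "tag": "1.0.0"}}
agent_values = {"replicaCount": 3, "image": {"tag": "1.2.0"}}

print(merge_values(component_values, agent_values))
```

The component defaults stay untouched, so the same component chart can be reused by several agents with different overrides.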
Benefits of the App of Apps Concept
Scalability : Simplifies managing a large number of components in complex agents;
Centralised Management : Enables a single point of control for all components;
Flexibility : Supports managing components across multiple environments or clusters;
Modularity : Each component remains independently manageable, facilitating updates and troubleshooting;
GitOps Alignment : Integrates seamlessly with GitOps workflows for declarative management.
Argo CD is used to automate the deployment process. It is deployed in each environment to support the release process.
Application Definition : Define applications in Argo CD’s configuration file (application.yaml);
Source Code Management : Argo CD manages automatic deployments. Deployments are triggered by a source code change in the repository, by a change in the package registry, or manually. Development teams are free to configure this for their components;
Deployment Strategies : Choose deployment strategies for each application, such as rolling updates or blue-green deployments.
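A minimal sketch of such an application definition is shown below, built as a Python dictionary and printed as JSON. The field names follow the Argo CD `Application` resource; the application name, project, chart registry URL, chart name, version and namespaces are hypothetical placeholders, and the `argocd` namespace is only the conventional installation default.

```python
import json


def argo_application(name, project, repo_url, chart, version, namespace, auto_sync=True):
    """Build an Argo CD Application manifest that deploys one Helm chart version."""
    app = {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},  # assumed Argo CD namespace
        "spec": {
            "project": project,
            "source": {"repoURL": repo_url, "chart": chart, "targetRevision": version},
            "destination": {"server": "https://kubernetes.default.svc", "namespace": namespace},
        },
    }
    if auto_sync:
        # Automatic deployment on change; omit the policy to keep deployments manual.
        app["spec"]["syncPolicy"] = {"automated": {"prune": True, "selfHeal": True}}
    return app


manifest = argo_application(
    name="example-component",
    project="example-project",
    repo_url="https://charts.example.invalid/helm",
    chart="example-component",
    version="1.2.0",
    namespace="example-dev",
)
print(json.dumps(manifest, indent=2))
```

Passing `auto_sync=False` leaves out the sync policy, which corresponds to the manual trigger mentioned above.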
The testing process is separated into different phases which are shown in the diagram below. The full test process is described in the testing document.
Monitoring and logging tools are used to track application performance and detect issues:
Prometheus : A monitoring tool that collects metrics from the environments/tools and stores them in a time-series database. Deployed as an agent on all clusters.
Grafana : A visualisation tool that displays dashboards based on Prometheus data. Deployed on the DevSecOps-tools cluster as a central instance.
Promtail : Will be deployed on the clusters as the agent that discovers and gathers logs.
Loki : Will be deployed to centrally aggregate and manage collected logs.
Prometheus agents will be deployed to all Kubernetes clusters and connected to the central Prometheus instance (on the DevSecOps tool server) to consolidate metrics. Grafana, deployed on the DevSecOps cluster, will use the data available in this central instance for visualisation.
Metrics data is retained for 1 month.
Currently, an email-based alerting mechanism is set up in Grafana to notify the operators of events configured in the tool.
Further extension of the infrastructure will be done with the deployment of Promtail and Loki.
On the Kubernetes clusters Velero is used for backing up and restoring persistent volumes. The tool is deployed on the following clusters:
| Cluster | Backup policy |
|---|---|
| dev-components | WEEKLY, DAILY |
| devint-agents | WEEKLY, DAILY |
| devsecops-keycloak | NONE |
| devsecops-runners | WEEKLY |
| devsecops-tools | WEEKLY, DAILY |
| devsecops-toolstest | NONE |
| preprod-agents | WEEKLY, DAILY |
Backup is done for the specific Namespaces configured for the process.
Data restoration is carried out with the same tool.
Deprovisioning refers to the process of removing or deleting resources that were initially provisioned. In this context, it involves removing the application from the Kubernetes clusters across different environments and deleting the infrastructure managed by Terraform.
For Simpl there are two steps for the deprovisioning: One for the application and the second for the overall infrastructure to shut down Simpl completely.
The application deployed via Argo CD can be removed by deleting the relevant application resource. This can be achieved using the Argo CD CLI or the Argo CD API.
Please note that this process has to be repeated for every instance of the application running on different Kubernetes clusters within each environment.
Terraform maintains an up-to-date state file that reflects the current state of the infrastructure. To deprovision the infrastructure, it needs to be destroyed using Terraform.
Terraform offers the `destroy` command to delete the infrastructure that was deployed by Terraform. The command destroys all resources that are tracked in the state file for the given configuration.
The command has to be executed for each of the environments separately. Only the admins of the DevSecOps team can do this for each environment.
If the infrastructure including the DevSecOps-Tools-Cluster is destroyed, the managed applications such as Keycloak and Vault are deleted as well. Any necessary data must be backed up before this process is started.
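The two deprovisioning steps can be summarised as an ordered list of CLI calls. The Python sketch below only builds and prints the commands and does not execute anything. The context names, application names and Terraform directories are hypothetical, and the flags used (`argocd app delete --cascade --yes --argocd-context`, `terraform -chdir=... destroy`) should be checked against the installed CLI versions before use.

```python
import shlex


def deprovision_commands(apps_by_context, terraform_dirs):
    """Return the CLI calls for deprovisioning: applications first, then infrastructure."""
    commands = []
    # Step 1: delete every application instance in each Argo CD context (one per environment).
    for context, apps in apps_by_context.items():
        for app in apps:
            commands.append(["argocd", "app", "delete", app,
                             "--cascade", "--yes", "--argocd-context", context])
    # Step 2: destroy the infrastructure of each environment separately.
    # Without -auto-approve, Terraform still asks an admin for confirmation.
    for directory in terraform_dirs:
        commands.append(["terraform", f"-chdir={directory}", "destroy"])
    return commands


# Hypothetical environments; necessary data must be backed up before running anything.
plan = deprovision_commands(
    {"dev": ["example-agent"], "preprod": ["example-agent"]},
    ["environments/dev", "environments/preprod"],
)
for command in plan:
    print(shlex.join(command))
```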
L2 requirements are mapped to functional requirements through the use of components in Jira. The table below provides an extract from this mapping.
| Requirement ID | Summary | Component/s |
|---|---|---|
| SIMPL-402 | Create usage policy | Resource Offering Editor |
| SIMPL-409 | Assign usage policy | Resource Offering Editor |
| SIMPL-415 | Enforce usage policies | Contract Management, Data Space Connector |
| SIMPL-469 | Quick Search | Federated Catalogue, Search |
| SIMPL-500 | Semantic Validation | Federated Catalogue, Vocabulary Management |
| SIMPL-503 | Access policy publication | Resource Offering Editor |
| SIMPL-514 | Assign Contract Template | Contract Management, Resource Offering Editor |
| SIMPL-1610 | Defining preconfigured attributes | IAA |
| SIMPL-1612 | Tier 2 identity attributes configuration | IAA |
| SIMPL-1613 | Tier 2 attributes management - services | Onboarding, IAA |
| SIMPL-1616 | Authentication between participant agents | IAA |
| SIMPL-1619 | Handling different versions of application | Federated Catalogue, Resource Offering Editor |
| SIMPL-1629 | Unified Orchestration Mechanism | Infrastructure Management |
| SIMPL-1630 | Cross-Platform Service Management | Infrastructure Management |
| SIMPL-1655 | Participant offboard operations | Onboarding, IAA |
| SIMPL-1658 | Implement monitoring actions | Observability |
| SIMPL-1672 | View the onboarding process documentation and initiate the onboarding | Onboarding |
| SIMPL-1673 | Register onboarding application | Onboarding |
| SIMPL-1674 | Onboarding request - tracking by applicant | Onboarding, IAA |
| SIMPL-1675 | Onboarding requests - automated tracking and monitoring | Onboarding |
| SIMPL-1676 | Onboarding requests - verification support | Onboarding, IAA |
| SIMPL-1677 | Onboarding requests - manual approval support | Onboarding, IAA |
| SIMPL-1679 | Onboarding requests - rejection support | Onboarding, IAA |
| SIMPL-1681 | Attribute selection | Onboarding, IAA |
| SIMPL-1682 | Create credential request | Onboarding, IAA |
| SIMPL-1683 | Credential creation | Onboarding, IAA |
| SIMPL-1684 | Credential request - tracking by participant | Onboarding, IAA |
| SIMPL-1685 | Credential request - notification of completion | Onboarding |
| SIMPL-1686 | Credentials installation and review - services | Onboarding, IAA |
| SIMPL-1687 | Credentials installation and review - status and information | Onboarding, IAA |
| SIMPL-1688 | Credentials installation and review - identity attributes check | Onboarding, IAA |
| SIMPL-1689 | Users and roles configuration | Onboarding, IAA |
| SIMPL-1696 | Mandatory quality rules | Federated Catalogue |
| SIMPL-1698 | Validation of a resource description - feedback to the provider | Federated Catalogue, Schema Management |
| SIMPL-1699 | Syntax Validation | Federated Catalogue, Schema Management |
| SIMPL-1704 | Creating a resource description | Resource Offering Editor |
| SIMPL-1705 | Uploading a resource description | Federated Catalogue, Resource Offering Editor |
| SIMPL-1715 | Access policy definition | Resource Offering Editor |
| SIMPL-1719 | Advanced Search | Federated Catalogue, Search |
| SIMPL-1728 | Attributes of a self-description for a dataset | Federated Catalogue |
| SIMPL-1729 | Attributes of a self-description for an application | Federated Catalogue |
| SIMPL-1730 | Support for sharing across the Federated Dataspace | Federated Catalogue |
| SIMPL-1731 | Adding a vocabulary | Vocabulary Management |
| SIMPL-1734 | Advance search - Search parameters compliant with constraints and vocabularies | Schema Management, Vocabulary Management |
| SIMPL-1739 | Triggering Mechanism | Data Space Connector, Infrastructure Management |
| SIMPL-1740 | data space IAA Tier 2 customization | IAA |
| SIMPL-1741 | End user authentication process - services | IAA |
| SIMPL-1743 | Identity provider federation initialisation | IAA |
| SIMPL-1744 | Ensure RBAC compliance | IAA |
| SIMPL-1745 | Roles management operations | IAA |
| SIMPL-1746 | Identity provider federation configuration | IAA |
| SIMPL-1747 | Identity provider federation APIs | IAA |
| SIMPL-1748 | End user authentication process - api | IAA |
| SIMPL-1749 | Adding attributes of a self-description for a dataset/application/infrastructure | Federated Catalogue, Vocabulary Management |
| SIMPL-1751 | Update vocabulary | Vocabulary Management |
| SIMPL-1752 | Remove vocabulary | Vocabulary Management |
| SIMPL-1753 | Updating attributes of a self-description for a dataset/application/infrastructure | Schema Management |
| SIMPL-1754 | Selecting shared entries | Federated Catalogue |
| SIMPL-1755 | Selecting dataspaces for catalogue sharing | Federated Catalogue |
| SIMPL-1756 | Publishing shared entries to selected dataspace | Federated Catalogue |
| SIMPL-1757 | Quality dimension and Quality Rules | Federated Catalogue, Resource Offering Editor |
| SIMPL-1758 | Calculation of Quality Score | Federated Catalogue |
| SIMPL-1772 | Storing results | Search |
| SIMPL-1784 | Data sharing | Data Space Connector, Data Transfer |
| SIMPL-1787 | Duplication of source before applying data processing | Data Transfer |
| SIMPL-1788 | Template and Policy Engine for VM | Infrastructure Management |
| SIMPL-1789 | Integration with Cloud APIs through Crossplane | Infrastructure Management |
| SIMPL-2882 | Log infrastructure consumption metrics in the provider agent | Observability |
| SIMPL-2884 | Metrics to log during Infrastructure resource consumption | Observability |
| SIMPL-2889 | Monitoring infrastructure consumption | Observability |
| SIMPL-2894 | Simpl shall log metrics when data is transferred through the Simpl-Open agent | Observability |
| SIMPL-2902 | Monitoring data consumption | Observability |
| SIMPL-2904 | Log all the metrics in a central repository per agent | Observability |
| SIMPL-2906 | Logging amount and type of data transferred through Simpl-Open agent | Observability |
| SIMPL-2907 | Logging the reason for transferring data | Observability |
| SIMPL-2914 | Logs and traces compliant with EU regulations and with the rules set for the audit process | Observability |
| SIMPL-2916 | Pre-configured monitoring dashboard | Observability |
| SIMPL-2917 | Participant to configure custom dashboards | Observability |
| SIMPL-2919 | Monitoring Simpl-Open agent software components technical logs | Observability |
| SIMPL-2921 | Monitoring Simpl-Open agent infrastructure metrics | Observability |
| SIMPL-2924 | Healthcheck endpoint for all of application components | Observability |
| SIMPL-2926 | Application healthchecks in the monitoring dashboard | Observability |
| SIMPL-2929 | Send alert when a component is unhealthy | Observability |
| SIMPL-2930 | Store the alerts | Observability |
| SIMPL-2932 | Make all logged information retrievable in real time from a reporting module | Observability |
| SIMPL-2941 | Simpl shall store technical logs of agent (software) components in a log repository | Observability |
| SIMPL-2945 | Store technical logs of the infrastructure on which Simpl-Open is deployed in a log repository | Observability |
| SIMPL-2946 | Log Simpl agent infrastructure metrics | Observability |
| SIMPL-2949 | Simpl shall log all business actions in the central logs repository | Observability |
| SIMPL-2966 | Simpl shall log the usage of data/application resource on a provider's infrastructure by a consumer | Observability |
| SIMPL-2969 | Simpl shall log all Tier I accesses to the agent | Observability |
| SIMPL-2970 | Simpl shall log all security events generated by its components | Observability |
| SIMPL-3180 | Alert thresholds definition | Observability |
| SIMPL-3182 | Alert triggering | Observability |
| SIMPL-3382 | The Usage Contract Agreement stored in human readable format | Contract Management |
| SIMPL-3835 | Monitoring Simpl-Open agent Tier II transactions | Observability |
| SIMPL-3886 | Monitoring Simpl business logs | Observability |
| SIMPL-3995 | Define the onboarding process documentation | Onboarding |
| SIMPL-4417 | Automated deployment of Simpl-Open pre-configured monitoring dashboard | Observability |
| SIMPL-4421 | Simpl shall log all Tier II transactions | Observability |
| SIMPL-4422 | Monitoring Simpl-Open agent infrastructure technical logs | Observability |
| SIMPL-4423 | Monitoring Simpl-Open agent Tier I accesses | Observability |
| SIMPL-4424 | Monitoring Simpl-Open agent security events | Observability |
| SIMPL-4428 | Monitor Simpl agent infrastructure components health | Observability |
| SIMPL-4494 | Sorting search results | Search |
| SIMPL-4495 | Filter search result based on access policy | Federated Catalogue, Search |
| SIMPL-4889 | Publishing a resource description | Federated Catalogue, Resource Offering Editor |
| SIMPL-5396 | Request a data resource | Data Space Connector, Data Transfer |
| SIMPL-6100 | Requesting an infrastructure resource | Infrastructure Management |
| SIMPL-6109 | Access policy enforcement | Data Space Connector |
| SIMPL-6122 | Data Visualization | Data Transfer, Infrastructure Management |
| SIMPL-10173 | Configure a ruleset for the automatic validation of onboarding request documents | Onboarding |
| SIMPL-10174 | Define identity attributes for an Onboarding Procedure Template | Onboarding, IAA |
| SIMPL-10489 | Onboarding request automated document validation | Onboarding |
| SIMPL-10572 | Governance Authority - Credentials actions | IAA |
| SIMPL-10594 | Participant - Credential Renewal and Deployment | IAA |
| SIMPL-11315 | Governance Authority - retrieving schemas and schema versions | Schema Management |
| SIMPL-11316 | Governance Authority - creating a new schema for a new resource type | Schema Management |
| SIMPL-11318 | Governance Authority - creating a new version of an existing schema for a resource type | Schema Management |
| SIMPL-11320 | Governance Authority - revoking a schema | Schema Management |
| SIMPL-11321 | Governance Authority - retaining a revoked schema for existing resource descriptions | Resource Offering Editor, Schema Management |
| SIMPL-11322 | Governance Authority - ensuring that a revoked schema is not available for publishing a new resource description | Resource Offering Editor, Schema Management |
| SIMPL-11323 | Governance Authority - validating a schema - syntax, semantics, and default properties | Schema Management |
| SIMPL-11328 | Governance Authority - publishing a validated schema | Schema Management |
| SIMPL-11333 | Governance Authority - notifying Providers about schema changes | Schema Management |
| SIMPL-12197 | Governance Authority - Identity attributes assignment to participants | IAA |
| SIMPL-12898 | A Provider consults an overview of its Resource descriptions | Federated Catalogue, Search |
| SIMPL-12903 | A Provider consults the details of one of its own resource descriptions | Federated Catalogue, Search |
| SIMPL-12904 | A Provider consults the version history of one of its own resource descriptions | Resource Offering Editor |