Microservices are organized into three groups:
Core microservices are mandatory. They handle:
The “Discovery” microservice handles the IPFS networks and the cluster identity, and discovers peers. It interacts directly with Kubernetes through CRDs (Custom Resource Definitions).
In front of the Discovery microservice, another service, the “Discovery API”, offers an API for interacting with the Discovery microservice.
The split between “Discovery API” and “Discovery” was made to make it easier to change the Discovery microservice. The Discovery API is only an interface for editing the Kubernetes objects related to the Discovery microservice.
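As a rough sketch, a custom resource managed through the Discovery API could look like the following; the kind, API group, and every field here are hypothetical, made up for illustration rather than taken from the actual CRDs:

```yaml
# Hypothetical custom resource: the kind, group, and fields are invented
# to illustrate how Discovery state could be stored as a Kubernetes object.
apiVersion: discovery.example.org/v1alpha1
kind: PeerDiscovery
metadata:
  name: cluster-identity
spec:
  ipfs:
    swarmKeySecret: ipfs-swarm-key     # Secret holding the private network key
  bootstrapPeers:                      # peers used to join the IPFS network
    - /dnsaddr/peer-1.example.org
    - /dnsaddr/peer-2.example.org
```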
The microservice behind this ugly name is one of the main microservices of the application. Its purpose is to serve the user (or admin) frontends for basic (non-specialized) actions.
The user frontend is the main entrypoint to the solution. It allows a user to:
The admin frontend allows an admin to:
This microservice handles resource consumption and resource prices.
Under development
Resource sharing microservices handle how resources are shared in the network.
The catalog microservice is an aggregator of all the available providers.
Providers are split into three kinds:
The catalog handles the registration of local providers. The generic backend microservice sends requests to the catalog microservices on all the connected clusters and aggregates their responses.
The catalog also manages the visibility of registered resources to others (see the sketch after this list):
public
: Available to anyone

private
: Available only to the current cluster's users

restricted
: Available only to explicitly defined projects

Providers manage access to their resources.
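To make the three levels concrete, a registration entry in the catalog could carry a visibility field along these lines; the shape and field names are illustrative guesses, not the catalog's actual API:

```yaml
# Hypothetical catalog entry; the field names are illustrative only.
name: sentinel-data-provider
kind: data-provider
visibility: restricted       # one of: public | private | restricted
allowedProjects:             # only consulted when visibility is "restricted"
  - earth-observation
  - flood-monitoring
```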
The purpose of data providers is to make data available from different sources (databases, file systems, APIs…).
A data provider can hand out auth information for a service to connect to the data, create auth information for accessing a subset of a database (SQL views), or directly provide the data to be processed (files).
Access should, in principle, be read-only.
For example, an existing implementation of a data provider is the “Sentinel data-provider”, which gives users access to satellite images.
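As an illustration of the credentials case, a grant returned by a data provider could look roughly like this; the payload shape, field names, and values are assumptions, not the Sentinel provider's actual response:

```yaml
# Hypothetical data-provider grant; every field name is illustrative.
source: postgresql://db.example.org:5432/sentinel
access: read-only                 # access should in principle be read-only
scope:
  view: project_x_images          # SQL view exposing only the allowed subset
credentialsSecret: sentinel-ro-creds   # Secret holding the generated user/password
```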
The storage provider gives write access to data stores (databases, file systems…).
This can take the form of auth credentials for connecting to a database, a repository that services connect to, or the provider can directly take in files and push them itself.
Storage providers can be used to store processing results.
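Symmetrically, a hypothetical storage grant could look like the sketch below, this time with write access so that services can push their results; again, the shape is assumed for illustration:

```yaml
# Hypothetical storage-provider grant; field names are illustrative.
target: s3://results-bucket/project-x/   # repository the services write to
access: read-write                       # storage grants allow writing
credentialsSecret: project-x-s3-creds    # Secret holding the access keys
```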
The compute provider defines the resources usable in the local cluster (CPU + RAM).
It also handles the interconnection between clusters via Admiralty, and manages reservations to avoid over-provisioning computing resources.
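In Admiralty's open-source multi-cluster scheduler, a remote cluster is typically declared with a `Target` resource pointing at a kubeconfig Secret, as in the sketch below; the names are examples, and whether the compute provider creates these objects itself is an assumption:

```yaml
# Admiralty Target declaring a remote cluster that the local scheduler may
# delegate pods to; "remote-cluster-kubeconfig" is an example Secret name.
apiVersion: multicluster.admiralty.io/v1alpha1
kind: Target
metadata:
  name: remote-cluster
spec:
  kubeconfigSecret:
    name: remote-cluster-kubeconfig
```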
The service provider allows adding or removing services (processes).
A service is made of a Docker image and a CWL definition, and generally describes a process to run on data (AI processing, image resizing, video encoding, etc.).
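For instance, a minimal CWL definition pairing a Docker image with a command could look like the image-resizing tool below; the image name and command line are examples, not an actual service from the catalog:

```yaml
# Example CWL CommandLineTool: resizes an image to 50% inside a container.
cwlVersion: v1.2
class: CommandLineTool
baseCommand: convert                       # ImageMagick CLI
requirements:
  DockerRequirement:
    dockerPull: registry.example.org/imagemagick:latest   # example image
arguments:
  - { position: 2, valueFrom: "-resize" }
  - { position: 3, valueFrom: "50%" }
  - { position: 4, valueFrom: "resized.png" }
inputs:
  src:
    type: File
    inputBinding: { position: 1 }          # convert <src> -resize 50% resized.png
outputs:
  resized:
    type: File
    outputBinding: { glob: resized.png }
```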
The pipeline orchestrator handles the execution of workflows.
It takes the parameters and workflow submitted by the user and makes sure that everything (remote clusters and providers) is ready. It then converts the submitted workflow and the CWL definitions describing the steps into Argo pipelines and executes the workflow. Admiralty annotations are added to the workflow to manage the distribution of tasks onto the different clusters.
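As a sketch of what the orchestrator's output could resemble, here is a minimal Argo Workflow whose pods carry Admiralty's documented election annotation (`multicluster.admiralty.io/elect`); the workflow content is an example, and the exact annotations this project adds are an assumption:

```yaml
# Minimal Argo Workflow; the annotation lets Admiralty elect the pods
# for multi-cluster scheduling. Step image and command are examples.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: pipeline-
spec:
  entrypoint: resize
  podMetadata:
    annotations:
      multicluster.admiralty.io/elect: ""   # delegate placement to Admiralty
  templates:
    - name: resize
      container:
        image: registry.example.org/imagemagick:latest   # example step image
        command: [convert, in.png, -resize, "50%", out.png]
```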