PUNCH4NFDI results documentation

AAI requirements PUNCH4NFDI

Current version: 2023-04-03

High-level requirements for the PUNCH AAI infrastructure

Technical AAI requirements

Indigo IAM (maintained by INFN; used by WLCG and Belle) vs. Unity IAM (used by Helmholtz AAI and EUDAT/B2ACCESS)

PUNCH AAI

Indigo IAM

Unity IAM

The following features should be implemented in Unity IAM:

Moreover, the issue with the limitation on group-name lengths should be solved. It has turned out that the entitlements contained in the user information become too long to map to local systems (Linux, NextCloud DB, ...).
Example: urn:geant:dfn.de:nfdi.de:punch:group:PUNCH4NFDI#login.helmholtz.de
Current length limit: 32 bytes
Note: this is not an OIDC limitation, which means the service itself may have to solve how the mapping can be accomplished. The group-name length is also related to EOSC compatibility: if group names are shortened too much, they become too generic, and different users may be mapped to the same groups at the service-provider level.
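
One way a service provider could handle the mapping is to derive a short, deterministic local name from the full entitlement. The following is a minimal sketch of that idea (the function name, the punch prefix, and the hash length are illustrative, not part of any PUNCH specification):

    import hashlib

    MAX_LEN = 32  # byte limit for local group names (Linux, NextCloud DB, ...)

    def short_group_name(entitlement: str, prefix: str = "punch") -> str:
        """Map a long entitlement to a local group name of at most MAX_LEN bytes.

        The readable prefix keeps names recognizable; the hash suffix keeps
        distinct entitlements from colliding after shortening.
        """
        digest = hashlib.sha256(entitlement.encode("utf-8")).hexdigest()[:16]
        name = f"{prefix}-{digest}"
        assert len(name.encode("utf-8")) <= MAX_LEN
        return name

    entitlement = "urn:geant:dfn.de:nfdi.de:punch:group:PUNCH4NFDI#login.helmholtz.de"
    print(short_group_name(entitlement))  # 22 bytes, unique per entitlement

A hash-based name is stable and collision-resistant but no longer human-readable, which illustrates the trade-off noted above: shortening without a hash makes names too generic, shortening with a hash makes them opaque.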

General issues:

User stories

Scientist in PUNCH working within multiple experiments runs a calibration workflow

Scientist X is a member of multiple collaborations (a small local experiment A and a large HEP experiment B) and works within PUNCH. As such, he is part of several groups: the PUNCH intra group, the experiment A group, and the experiment B group. In addition, he has "calibration" privileges in experiment A, i.e. he is allowed to create new calibration files.
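
Following the entitlement format shown above, such a membership set could be expressed roughly as follows (the group and role names besides PUNCH4NFDI are illustrative):

    # eduperson_entitlement claim as it might appear in the user information:
    entitlements = [
        "urn:geant:dfn.de:nfdi.de:punch:group:PUNCH4NFDI#login.helmholtz.de",
        "urn:geant:dfn.de:nfdi.de:punch:group:expA#login.helmholtz.de",
        "urn:geant:dfn.de:nfdi.de:punch:group:expB#login.helmholtz.de",
        # role within a group, AARC-style:
        "urn:geant:dfn.de:nfdi.de:punch:group:expA:role=calibration#login.helmholtz.de",
    ]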

Today, he logs in to the PUNCH portal with his web browser and at this point selects that he wants to work in the context of experiment A and create new calibrations. The portal receives an authorization token with these permissions / group and role memberships. Scientist X then selects that he wants to use the JupyterHub service on Compute4PUNCH and read and write data from experiment A. The authorizations are forwarded to the MyToken service and can be used within the compute job launched by the JupyterHub. Inside the Hub, he kicks off the calibration workflow, which launches a further 10 000 jobs on the Compute4PUNCH infrastructure, taking input from Storage4PUNCH. Some input datasets reside on storage endpoints maintained by experiment A, but these endpoints accept authorization tokens from the common PUNCH AAI system as long as the tokens contain the necessary authorization information. All 10 000 jobs, which start simultaneously, carry the necessary authorization information and can access the data.
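
A minimal sketch of what the job-side token handling could look like, assuming a mytoken-style token endpoint (the URLs, the environment variable name, and the storage path are illustrative, deployment-specific details):

    import os

    import requests

    MYTOKEN_ENDPOINT = "https://mytoken.example.org/api/v0/token/access"
    STORAGE_URL = "https://storage4punch.example.org/expA/calib/input-0001.root"

    def get_access_token(mytoken: str) -> str:
        """Redeem the forwarded (long-lived, restricted) mytoken for a
        short-lived OIDC access token inside the compute job."""
        resp = requests.post(
            MYTOKEN_ENDPOINT,
            json={"grant_type": "mytoken", "mytoken": mytoken},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["access_token"]

    # The job finds the forwarded mytoken in its environment
    # (the variable name is illustrative):
    token = get_access_token(os.environ["MYTOKEN"])

    # Use the access token as a standard OAuth2 bearer token against storage:
    data = requests.get(STORAGE_URL,
                        headers={"Authorization": f"Bearer {token}"}, timeout=30)
    data.raise_for_status()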

Behind the scenes, the workflow limits the privileges of the tokens such that each individual job can only read the input files it needs and write the outputs it is supposed to produce. No job can read data it should not, nor write or delete files outside its scope. In particular, the tokens do not carry the permissions of experiment B or of the PUNCH intra group.
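
What "limiting the privileges" could look like in practice: a per-job restriction combining WLCG-style path-scoped capability scopes with mytoken-style restriction fields (the paths, audience, and exact field names are illustrative):

    # Restriction template for one job of the calibration workflow;
    # each of the 10 000 jobs gets its own instance with its own paths.
    job_restriction = [
        {
            # Capability scopes bound to concrete paths (WLCG-style):
            "scope": "storage.read:/expA/calib/input-0001 "
                     "storage.create:/expA/calib/output-0001",
            "audience": ["https://storage4punch.example.org"],
            "exp": 1700000000,   # token expires with the job
            "usages_AT": 10,     # cap on how often it can be redeemed
        }
    ]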

Scientist from a HEP experiment wants to run an analysis job on non-public data

Scientist Y belongs to a large HEP experiment that runs its own token-based authentication and authorization system, and she wants to analyse her data on the PUNCH infrastructure. She logs in to the PUNCH portal and from there additionally triggers a login to the HEP experiment's infrastructure. The resulting experiment token is included in her session and forwarded to the services that require it, such as Compute4PUNCH and Storage4PUNCH, so access works even though the data can only be accessed by members of the HEP collaboration.
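
A sketch of the client-side token selection this implies, assuming the session holds both the PUNCH token and the experiment token and routes them by service host (all names are illustrative):

    from urllib.parse import urlparse

    import requests

    # Tokens held in the portal session (illustrative placeholders):
    tokens = {
        "punch": "<token issued by the PUNCH AAI>",
        "hep": "<token issued by the HEP experiment's AAI>",
    }

    # Which issuer each service trusts (illustrative mapping):
    issuer_for_host = {
        "compute4punch.example.org": "punch",
        "storage.hep-experiment.example.org": "hep",
    }

    def fetch(url: str) -> bytes:
        """Send the token the target service trusts, never the other one."""
        token = tokens[issuer_for_host[urlparse(url).hostname]]
        resp = requests.get(url, headers={"Authorization": f"Bearer {token}"},
                            timeout=30)
        resp.raise_for_status()
        return resp.content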

Scientist wants to copy data into Storage4PUNCH

A scientist wants to make data available on Storage4PUNCH. The experiment cannot provide its own storage endpoints, but the data volume fits Storage4PUNCH. The experiment is still ongoing and producing data, so the scientist wants to set up an automated (but secure) way to upload newly produced files from the command line, using a time-limited token. The providers of the Storage4PUNCH infrastructure offer their storage for such use cases, but must ensure that the write permissions of the authorization tokens are limited to the part of the global namespace (the logical path) that belongs to the experiment. All storage endpoints trust the central AAI instance, which applies path-aware policies when issuing such authorization tokens.
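
A sketch of the automated upload the scientist might script, assuming a short-lived token restricted to the experiment's path prefix has already been obtained (the URLs, paths, and environment variable name are illustrative):

    import os
    import pathlib

    import requests

    # Short-lived token, restricted by AAI policy to write access
    # below the experiment's path prefix (variable name is illustrative):
    TOKEN = os.environ["UPLOAD_TOKEN"]
    BASE = "https://storage4punch.example.org/expA/raw"  # experiment's namespace

    def upload(local: pathlib.Path) -> None:
        """Upload one newly produced file via HTTP PUT (WebDAV-style)."""
        with local.open("rb") as f:
            resp = requests.put(f"{BASE}/{local.name}", data=f,
                                headers={"Authorization": f"Bearer {TOKEN}"},
                                timeout=300)
        resp.raise_for_status()

    # Run periodically (e.g. from cron) over the experiment's outbox:
    for path in sorted(pathlib.Path("outbox").glob("*.dat")):
        upload(path)
        path.unlink()  # remove only after a successful upload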