AAI requirements PUNCH4NFDI
Current version: 2023-04-03
High-level requirements for the PUNCH AAI infrastructure
- Identity Provider AAI using federated AAI infrastructure (DFN-AAI, Helmholtz AAI, eduGAIN)
- single sign-on infrastructure for PUNCH services
- via PUNCH portal
- need (PUNCH) internal infrastructure and policies with (local) resource providers
- we want to define access rights on different levels for PUNCH services (the following points are to some extent requirements to service providers - they are kept in the document for clarification)
- when access rights are granted by bypassing AAI roles and groups, this is not the responsibility of the AAI but of the individual users and services
- service providers should be able to block access immediately when a user has left the project or a VO. The AAI needs to contain this information so that services and resource providers can query it
- read/write access to Intranet, gitlab, DESY Cloud
- different users and groups may have different rights (admins, management, TAs)
- access tokens may have a limitation in time
- job submission to various resources only to those VOs accepted by the resource providers
- data access on storage elements only by accepted VOs
- file permissions on storage resources on granular levels
- services need to be PUNCH AAI and PUNCH AAI group aware
- we want to organise PUNCH users in groups with different access rights
- punch, intra punch, project groups, management, admins, TAs, ...
- the system should provide a web interface or API through which PUNCH AAI administrators can look up the group memberships of individual users (note that retrieving such information may, under certain circumstances, conflict with the GDPR (DSGVO), and that other user-related information is held by the IdPs)
- user/group information needs to be easy to update when required
- AAI admins must be informed when a user leaves the project
- we want to give access rights to external users (e.g. from NFDI and beyond) to restricted PUNCH services
- how to include information about external membership, e.g. in HEP VOs?
- we need to be able to get information about group memberships of external users
- authorizations should be possible directly with tokens, not only based on user identities
- this includes also the necessity for token delegation with reduced rights for security reasons
- this includes the requirement for a UI in which the user can specify the rights to be used
- service registration needs to be possible
- tutorials and documentation
- how to register a (test) service
- which OAuth2/OIDC service URLs of the Helmholtz AAI are to be used for the PUNCH AAI (see e.g. https://hifis.net/doc/helmholtz-aai/howto-services/)?
- The PUNCH URLs all contain "punch", for example https://login-dev.helmholtz.de/punch-oauth2/.well-known/openid-configuration
- which HTTP requests to which URLs are required to retrieve user information such as user name and group memberships?
- a dev environment with the same user and group information as the prod environment is required for proper testing
- There are services which are designed to support multiple user groups. These groups can be managed by separate AAI solutions. This requires a corresponding authorisation policy and an agreement between the domains on the mutual use of resources. It should be transparent to the user via which AAI they log in.
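The open questions above about service URLs and HTTP requests can be illustrated with the standard OIDC discovery flow. The following is a minimal sketch, not a confirmed description of the PUNCH AAI: the discovery URL is the dev URL quoted in this document, `userinfo_endpoint` is the standard OIDC discovery field, and the claim name `eduperson_entitlement` as well as the helper names are assumptions.

```python
import json
import urllib.request

# Discovery URL quoted in this document (dev instance).
DISCOVERY_URL = (
    "https://login-dev.helmholtz.de/punch-oauth2/.well-known/openid-configuration"
)


def build_userinfo_request(discovery: dict, access_token: str) -> urllib.request.Request:
    """Build the GET request for the userinfo endpoint named in the discovery document."""
    return urllib.request.Request(
        discovery["userinfo_endpoint"],  # standard OIDC discovery metadata field
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def fetch_json(request_or_url):
    """Send the request and decode the JSON response (requires network access)."""
    with urllib.request.urlopen(request_or_url, timeout=10) as response:
        return json.load(response)


def groups_from_userinfo(userinfo: dict, claim: str = "eduperson_entitlement") -> list:
    """Return the entitlement values; the claim name is an assumption."""
    value = userinfo.get(claim, [])
    return [value] if isinstance(value, str) else list(value)
```

With a valid access token (obtained elsewhere), a service would call `fetch_json(DISCOVERY_URL)` once, then `fetch_json(build_userinfo_request(discovery, token))`, and pass the result to `groups_from_userinfo`.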
Technical AAI requirements
Indigo (WLCG, Belle, INFN maintained) vs. Unity IAM (Helmholtz, EUDAT/B2ACCESS)
PUNCH AAI
- is currently based on Helmholtz AAI, which is Unity based
- missing features, mainly due to differences between Unity and Indigo IAM, are discussed below
- PUNCH AAI contains only groups; services decide on access rights based on these AAI groups
- the entitlements of a user who logs in via PUNCH AAI also contain the group information of Helmholtz AAI
- PUNCH AAI entitlements have the same structure as the Helmholtz AAI entitlements, but differ in their values
- all group memberships can be collected via Keycloak, but as entitlements rather than as actual groups
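Since group memberships arrive as entitlement strings rather than as groups, a service has to parse them itself. The sketch below assumes the layout visible in the example entitlement quoted in this document (a namespace, the marker `:group:`, a group path, and an authority after `#`); the optional trailing `role=` component and all names in the code are assumptions.

```python
from typing import NamedTuple, Optional


class Entitlement(NamedTuple):
    namespace: str
    group_path: tuple
    role: Optional[str]
    authority: Optional[str]


def parse_entitlement(value: str) -> Entitlement:
    """Split an entitlement string into namespace, group path, role and authority."""
    body, has_authority, authority = value.partition("#")
    namespace, marker, rest = body.partition(":group:")
    if not marker or not rest:
        raise ValueError(f"not a group entitlement: {value!r}")
    parts = rest.split(":")
    role = None
    # Assumption: a role, if present, is the last component and starts with "role=".
    if parts[-1].startswith("role="):
        role = parts.pop()[len("role="):]
    if not parts:
        raise ValueError(f"entitlement has no group: {value!r}")
    return Entitlement(namespace, tuple(parts), role, authority if has_authority else None)
```

A service would typically compare `group_path` and `authority` against its own access policy rather than matching the raw string.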
Indigo IAM
- access rights are included in AAI and thus included in the issued tokens
- token delegation (with potentially weaker access rights) included
- granular /path based authorisation included
Unity IAM
- no authorisation claims
- authorisation claims would enable a decentralised scaling usage model
- no user information queries by resource providers to the central instance would be needed
- group membership and granular authorisation for storage access would be possible
- no tokens with reduced authorizations can be created and / or delegated
- users in PUNCH may have multiple roles in one group. The role needs to be selected and assumed. However, parts of the PUNCH infrastructure may require only reduced permissions, so the delegation of less powerful tokens must also be possible for safety reasons.
- remark: values of entitlements can be reduced already now
- scopes should be added for group and capability selection:
wlcg.groups[:]
storage.read:/punch/somewhere
following the groups, roles and entitlements claims of RFC 9068, section 2.2.3.1. Note that certain software requires tokens smaller than 1 kB; every additional piece of information makes the tokens larger
- how to address this issue is currently under discussion
- note also that it is not yet clear how path-based authorisation and authorisation claims can be implemented in Unity IAM
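To make the capability scopes above concrete, the following sketch shows how a storage service might evaluate a scope such as `storage.read:/punch/somewhere` against a requested path. It assumes that a scope authorises the named path and everything below it and that a storage scope without a path grants nothing; the function name and these semantics are assumptions, not an agreed PUNCH rule.

```python
import posixpath


def scope_permits(granted_scopes: str, action: str, path: str) -> bool:
    """Return True if a storage.<action>:<prefix> scope covers the requested path."""
    # Normalise first so that ".." segments cannot escape the authorised prefix.
    target = posixpath.normpath("/" + path.lstrip("/"))
    for scope in granted_scopes.split():
        name, _, scope_path = scope.partition(":")
        if name != f"storage.{action}" or not scope_path:
            continue  # assumption: a storage scope without a path grants nothing
        base = posixpath.normpath("/" + scope_path.lstrip("/"))
        if base == "/" or target == base or target.startswith(base + "/"):
            return True
    return False
```

Matching on whole path components (the `base + "/"` check) keeps `/punch/somewhere` from also authorising `/punch/somewhere-else`.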
the following features should be implemented in Unity IAM
- authorisation claims, as in RFC9068, section 2.2.3.1
- granular / path based authorisation (also potentially with time limitation included)
- token delegation (with potentially reduced rights)
- authorisation information controllable via requested scopes
- two-factor authentication in Unity IAM is required for Storage4PUNCH and Compute4PUNCH
- this is already included in the Unity implementation as optional for the users but has not yet been activated.
- should this always be done via the AAI, or is it sufficient if it is done at the resource provider?
Moreover, the issue with the limit on group name lengths should be solved. It has turned out that the entitlements contained in the user information are too long to be mapped to local systems (Linux, NextCloud DB, ...)
Example: urn:geant:dfn.de:nfdi.de:punch:group:PUNCH4NFDI#login.helmholtz.de
current length limit: 32 bytes
note: this is not an OIDC limitation, which means that the services may have to work out how the mapping can be accomplished. Moreover, the group name length is also related to EOSC compatibility issues. If group names are shortened too much, they become too generic and users may be mapped to the same groups at service provider level.
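One possible way for a service to map long entitlements to local names within the 32-byte limit, while reducing the risk of two entitlements collapsing into the same generic name, is to combine a readable hint with a hash of the full entitlement. This is only a sketch under stated assumptions: the naming scheme, the 10-character digest length and the function name are hypothetical, and a truncated hash lowers but does not eliminate the chance of collisions.

```python
import hashlib
import re

MAX_LEN = 32  # length limit quoted in this document


def local_group_name(entitlement: str, max_len: int = MAX_LEN) -> str:
    """Derive a short, deterministic local group name from a full entitlement."""
    # The hash covers the whole entitlement, including the authority after "#".
    digest = hashlib.sha256(entitlement.encode("utf-8")).hexdigest()[:10]
    # Use the last component before "#" as a human-readable hint.
    body = entitlement.split("#", 1)[0]
    hint = re.sub(r"[^a-z0-9_]", "_", body.rsplit(":", 1)[-1].lower())
    if not hint or not hint[0].isalpha():
        hint = "g" + hint  # many local systems expect a leading letter
    room = max_len - len(digest) - 1
    return f"{hint[:room]}_{digest}"
```

Because the result is pure ASCII, its length in characters equals its length in bytes.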
general issues:
- remark: it is currently being discussed whether and how tokens can be used across AAI boundaries, i.e. how tokens issued by Indigo IAM can be used by Unity-based AAIs and vice versa.
- Some services may require a way to access authorization policies independently of the login via the PUNCH AAI. It would be beneficial if a unique identifier like the “ePPN” received through another login method could be mapped to the PUNCH authorization policies. As a technical solution, a signed OPA Bundle with the PUNCH authorization policies may be generated for consumption in selected services. PUNCH AAI could also serve customers via redirection. PUNCH4NFDI could also talk and contribute to the edu ID project.
- It should be clarified whether the NFDI can be expected to provide a cross-domain single sign-on infrastructure
user stories
Scientist in PUNCH within multiple experiments runs calibration workflow
Mr. scientist X is a member of multiple collaborations (a small local experiment A and a large HEP experiment B) and works within PUNCH. As such, he is part of several groups: the PUNCH intra group, the experiment A group and the experiment B group. In addition, he has "calibration" privileges in experiment A, i.e. he is allowed to create new calibration files.
Today, he logs in to the PUNCH portal with his web browser, and at this point selects that he wants to work in the context of experiment A and create new calibrations. The portal receives an authorization token with these permissions / group and role memberships. Scientist X then selects that he wants to use the JupyterHub service on Compute4PUNCH and read and write data from experiment A. The authorizations are forwarded to the MyToken service and can be used within the compute job launched by the JupyterHub. Inside the Hub, he kicks off the calibration workflow, which starts a further 10 000 jobs on the Compute4PUNCH infrastructure taking input from Storage4PUNCH. Some input datasets are placed on several storage endpoints maintained by Experiment A, but these accept authorization tokens from the common PUNCH AAI system if the tokens contain the necessary authorization information. All 10 000 jobs, which start simultaneously, have the necessary authorization information and can access the data.
Behind the scenes, the workflow limits the privileges of the tokens such that each single job can only access the input files it needs and write the outputs it should produce. No job can read data it should not, nor write or even delete files which it should not access. In particular, the tokens do not grant the permissions of Experiment B or the PUNCH intra groups.
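The per-job restriction described in this user story amounts to a rule that a delegated token may only narrow, never widen, the scopes of its parent. In practice this would be enforced by the token issuer; the sketch below only illustrates the rule itself. The scope names follow the `storage.read:/path` style used earlier in this document; `storage.modify`, the example paths and the function names are assumptions.

```python
import posixpath


def _split(scope: str):
    """Split 'name:/path' into the name and a normalised path (None if absent)."""
    name, _, path = scope.partition(":")
    return name, (posixpath.normpath("/" + path.lstrip("/")) if path else None)


def _covers(parent: str, child: str) -> bool:
    """True if the child scope is equal to or narrower than the parent scope."""
    p_name, p_path = _split(parent)
    c_name, c_path = _split(child)
    if p_name != c_name:
        return False
    if p_path is None or c_path is None:
        return p_path == c_path  # scopes without a path must match exactly
    return p_path == "/" or c_path == p_path or c_path.startswith(p_path + "/")


def attenuate(parent_scopes: str, requested_scopes: str) -> str:
    """Return the requested scopes if every one is covered by a parent scope."""
    parents = parent_scopes.split()
    requested = requested_scopes.split()
    for scope in requested:
        if not any(_covers(parent, scope) for parent in parents):
            raise PermissionError(f"scope exceeds parent token: {scope}")
    return " ".join(requested)
```

For example, a workflow holding `storage.read:/punch/expA` could hand each job a token limited to its own input file, while a request for `storage.read:/punch/expB` would be rejected.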
Scientist from a HEP experiment wants to run an analysis job on non-public data
Mrs. scientist Y from a large HEP experiment using their own authentication and authorization system (token-based) wants to analyse her data on the PUNCH infrastructure. She logs in to the PUNCH portal, and additionally triggers a login from there to the HEP experiment's infrastructure. The token is included in the session and forwarded to services requiring access, such as Compute4PUNCH and Storage4PUNCH, so access works even though the data can only be accessed by members of the HEP collaboration.
Scientist wants to copy data into Storage4PUNCH
A scientist wants to make data available on Storage4PUNCH. His/her experiment cannot provide its own storage endpoints, but the size of the data fits Storage4PUNCH. The experiment is still ongoing and producing data. He/she wants to set up an automated (but secure) way to upload newly produced files via the command line, using a time-limited token. The providers of the Storage4PUNCH infrastructure offer their storage for these use cases, but must ensure that the write permissions of the authorization tokens used are limited to the logical path in the global namespace which belongs to the experiment. All storage endpoints trust the central AAI instance, which applies policies that take the paths into account when creating such authorization tokens.