.. This work is licensed under a Creative Commons Attribution 3.0 Unported
   License. http://creativecommons.org/licenses/by/3.0/legalcode

================================================================
Fix and improve access rules and interactions with share drivers
================================================================

https://blueprints.launchpad.net/manila/+spec/fix-and-improve-access-rules

One of the major benefits that shared file systems offer is access control.
Manila's implementation of access control offers an easy-to-use interface to
allow and revoke client access to shared file systems. Access is controlled
on the basis of the type of access that makes sense for each storage
protocol. For example, access to NFS shares is controlled with IP-based
rules, and access to CephFS shares is controlled with CephX access rules.
For many releases now, we have had to fix many issues with the access
control APIs, the biggest of which were incorrect API behavior `[1]`_ and
race conditions `[2]`_.

Problem description
===================

Two major design changes in manila affected the access rules implementation:
the introduction of share instances `[3]`_ and bulk access rule updates on
share back ends `[4]`_.

Prior to these changes, the behavior was as follows:

* A tenant could request (or deny) access for a given share to a client.
* A tenant could send any number of such requests in succession.
* A tenant could list access rules on a given share, and each rule had a
  distinct ``state`` that informed the tenant whether the rule was applied
  or denied successfully.

This API behavior was easy to write scripts around, and even made bulk
access rule operations on a given share straightforward. Since we had a
per-share, per-rule ``state``, and every access rule was individually
applied or denied by a share driver, there was better error reporting and
less scope for race conditions.

The current behavior is as follows:

* Shares have instances.
* There is no ``state`` for an access rule for a given share instance;
  instead, the share instance has an ``access_rules_status`` attribute that
  is a combined state across all the access rules of that share instance.
* Tenants are not aware of share instances; access control is still
  performed at the share level. They can add or remove rules for a given
  share via the access control API. We do not support modifying multiple
  access rules at once, so API calls to ``allow_access`` or ``deny_access``
  only ever impact one access rule at a time.
* Tenants can send any number of such requests in succession; however, if
  new requests come in while rules are still being processed for the given
  share instance, the ``access_rules_status`` on the share instance will
  transition from 'active' to 'updating' and from 'updating' to
  'updating_multiple'.
* If there is a request that the back end or the share driver cannot honor,
  an exception is raised and the ``access_rules_status`` is set to 'error'.
  No further rule additions are allowed until all "bad" rules are removed.
* Identifying "bad" rules requires careful supervision on the tenant's
  part, noting each state transition; otherwise, the server pretends that
  all rules are in 'error' state, owing to the fact that there is no
  tracking of the per share instance, per access rule state in the manila
  database.
* When a number of access requests are made on the same share, after a
  point, the share manager ignores the requests (i.e., the share instance's
  ``access_rules_status`` transitions from 'active' to 'updating' on the
  first new rule, and from 'updating' to 'updating_multiple' when the next
  rule comes in while the status is still 'updating'. If any further
  requests come in while the status is 'updating_multiple', they are
  ignored.) `[5]`_
* Further, in an attempt to avoid race conditions in the share manager, the
  whole interaction with the share back end is synchronized with a lock. If
  the share driver or the back end takes a long time, the lock prevents any
  further operations.
* If the share manager service crashes or is bounced while a share
  instance's ``access_rules_status`` is 'updating' or 'updating_multiple',
  users are stuck: no new rules can be added, and error reporting is poor.
  To recover from this state, users have to perform another access rules
  API request (deny access to an existing client) to trigger an update of
  the access rules.
* When share instances were introduced as the underlying database
  representation for share replicas and migration instances, the design was
  conscious of the fact that "instances" would be visible to administrators
  and not end users (replicas are visible to tenants; however, they needn't
  know that replicas are share instances). However, error messages `[6]`_
  seem to break the abstraction.
* When a share has multiple instances, the API sends an RPC request to the
  share manager host corresponding to each instance. Currently, the logic
  is as follows:

  * The user requests access to a share for a client.
  * manila-api performs basic validation on the state of the share.
  * It then commits the rule to the database.
  * It loops over the instances of the share, validating each one and
    shooting off the RPC request.
  * manila-api responds with a ``202 Accepted`` once all the RPC requests
    have been sent out.

  Here, if the validation fails for the second or a subsequent share
  instance, the RPC call has already been made for the first share
  instance, resulting in the user receiving a ``400 Bad Request`` and the
  rule transitioning to 'error', even though the share might be accessible
  via export paths of the first instance.

Use Cases
=========

For a user of shared file systems, access control is arguably the most
basic requirement.
The proposed change preserves the good portions and enhances the user
experience of access control in manila:

* Users can add any number of access rules to a given share in quick
  succession, as far as the API is concerned.
* Users can identify which rule or rules failed to apply.
* Users may continue to apply or deny rules while some rules are in 'error'
  state.
* Users may deny a rule that has not yet been applied.

Elimination of the currently existing race conditions will benefit both
users and cloud administrators.

Proposed change
===============

- Per share instance, per access rule ``state`` will be re-introduced.
- Four new access rule states will be introduced: 'queued_to_apply'
  (currently 'new'), 'queued_to_deny', 'applying' and 'denying'. The state
  transitions are:

  * Adding a new rule:

    **When the request is received by the API**

    share_instance.access_rules_status:

    - 'active' ----> 'out_of_sync'
    - 'error' ----> 'error' (state will be preserved)
    - 'out_of_sync' ----> 'out_of_sync' (state will be preserved)

    share_instance_access_map.state:

    - 'queued_to_apply' ----> 'queued_to_apply' (rule starts out in this
      state)

    **In the share manager, before the request is sent to the share driver**

    share_instance.access_rules_status:

    - 'error' / 'out_of_sync' ----> (state will be preserved)

    share_instance_access_map.state:

    - 'queued_to_apply' ----> 'applying'

    **When the share driver returns with the response**

    share_instance.access_rules_status:

    - 'error' ----> 'error' (state will be preserved)
    - 'out_of_sync' ----> 'active' (if no other 'queued_to_apply' or
      'queued_to_deny' rules)

    share_instance_access_map.state:

    - 'applying' ----> 'active' or 'error'

  * Deleting an existing rule:

    **When the request is received by the API**

    share_instance.access_rules_status:

    - 'active' ----> 'out_of_sync'
    - 'error' ----> 'error' (state will be preserved)
    - 'out_of_sync' ----> 'out_of_sync' (state will be preserved)

    share_instance_access_map.state:

    - 'active' ----> 'queued_to_deny'
    - 'applying' ----> 'queued_to_deny'
    - 'error' ----> 'queued_to_deny'
    - 'queued_to_apply' ----> 'queued_to_deny'

    **In the share manager, before the request is sent to the share driver**

    share_instance_access_map.state:

    - 'queued_to_deny' ----> 'denying'

    **When the share driver returns with the response**

    share_instance.access_rules_status:

    - 'error' ----> 'error' (state will be preserved)
    - 'out_of_sync' ----> 'active' (if no other 'queued_to_apply' or
      'queued_to_deny' rules)

    share_instance_access_map.state:

    - 'denying' ----> 'deleted' or 'error'

  .. note::

     When a share has multiple share instances, all instances of the share
     are expected to have the same access rules.

     In the case of share migration, while existing access rules from the
     source share instance are eventually applied on the destination share
     instance, the destination begins with no access rules. Also, when
     migrating a share, certain back ends may not be able to allow write
     operations on the source share during the migration, for a variety of
     reasons. Host-based migration cannot handle new data being written
     into the share while existing data is being copied over. To ensure
     that all write operations are fenced off, manila casts existing rules
     on the source share to read-only prior to these kinds of migrations.

     In the case of share replication, the expectation that clients have
     access to all replicas still holds, with a clarification of semantics:
     for 'dr' replicas, we track all the access rules for the replica in
     the database; however, there are no export locations for the replica.
     Hence, though the database records these rules as 'active' for the
     given replica, the replica is not accessible. In the case of
     'readable' replicas, any read/write rules are cast by the drivers to
     read-only for the secondary replicas; i.e., a rule's database
     representation does not change from 'r/w'; the 'r/o' semantics on the
     replica are inherently ensured by the share drivers themselves `[7]`_
     `[8]`_. This expectation of the drivers is an exception that exists
     only within this feature; i.e., the logic to ensure read-only
     semantics currently needs to live in each driver, though it is
     something that the share manager could handle uniformly, as we do in
     the case of share migration. See further discussion within `Read-only
     access semantics`_.

  .. important::

     If all share instances are expected to have the same access rules, why
     would we still maintain per access rule, per share instance states?
     While this may be clear from the purpose and design of share
     instances, let's clarify it once more for posterity: each share
     instance is associated with a manila host. Instances can each be
     created at different times and acted upon independently; i.e.,
     consider creating a share, allowing access, and at a later time
     creating a replica for it, or migrating it. Therefore, the ``state``
     of access control should be tracked per rule, per share instance.
     This is also the benefit of having an ``access_rules_status``
     attribute on each share instance: the ``access_rules_status`` of an
     active replica can be 'active', while for a secondary replica where
     rules have not yet been synced, the attribute can be 'out_of_sync'.
     When the user lists the access rules, however, the per share instance
     access rule states will be aggregated. See `Aggregate status fields`_
     for more details.

- Rule status updates will be coordinated between the API service and the
  share manager service. The only state transition for an existing rule in
  the API service, as noted above, is performed while denying rules. The
  API service will acquire the same locks as the share manager service in
  order to make this transition.
- manila-api and manila-share both have access to the database; an access
  rules payload is not necessary in the RPC, since these services can
  individually read the rules from the database and perform the necessary
  state transitions.
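The per-rule state transitions described above can be summarized with small
lookup tables. The sketch below is purely illustrative; the table and
function names are hypothetical and do not correspond to manila's actual
code:

```python
# Illustrative sketch of the per-rule state machine described above.
# All names here are hypothetical; they are not manila's real constants.

# share_instance_access_map.state transitions made when the API service
# receives a request. A brand-new rule has no prior state (None).
API_TRANSITIONS = {
    'allow': {None: 'queued_to_apply'},
    'deny': {
        'active': 'queued_to_deny',
        'applying': 'queued_to_deny',
        'error': 'queued_to_deny',
        'queued_to_apply': 'queued_to_deny',
    },
}

# Transitions made by the share manager before calling the share driver.
MANAGER_TRANSITIONS = {
    'queued_to_apply': 'applying',
    'queued_to_deny': 'denying',
}

# Possible outcomes recorded when the share driver returns.
DRIVER_RESULT_TRANSITIONS = {
    'applying': ('active', 'error'),
    'denying': ('deleted', 'error'),
}


def queue_rule(current_state, event):
    """Return the queued state for a rule, given an API event."""
    try:
        return API_TRANSITIONS[event][current_state]
    except KeyError:
        raise ValueError('invalid transition: %r on %r'
                         % (event, current_state))
```

Note that 'deny' is accepted from every pre-existing rule state, which is
what allows users to deny a rule that has not yet been applied.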
- Thus, the RPCs ``manila.share.rpcapi.ShareAPI.allow_access`` and
  ``manila.share.rpcapi.ShareAPI.deny_access`` will be collapsed into
  ``manila.share.rpcapi.ShareAPI.update_access``.
- When the share manager receives the RPC request to ``update_access``, it
  will react as follows:

  - Acquire a lock and look in the database for any rules in 'applying' or
    'denying' states for the given share instance. If there are any rules
    in these states, the driver is currently processing rule changes; any
    'queued_to_apply' or 'queued_to_deny' rules are `batched` to be applied
    or denied when the driver is done with its current task, and the steps
    below are not executed right away. If there are no rules in 'applying'
    or 'denying' states, set the state of any 'queued_to_apply' rules to
    'applying' and of any 'queued_to_deny' rules to 'denying'. Release the
    lock.

    .. note::

       When the driver is processing an access rules update, any
       'queued_to_apply' or 'queued_to_deny' rules are left alone until the
       driver returns from its current task. The ``update_access``
       interface is designed for a bulk update. There is no order for
       processing rules; if access rules have to be processed in any
       particular order, it is up to the share driver to do so.

  - Call the driver interface ``update_access``, passing the existing rules
    and the "changed" rules (those in 'applying' or 'denying' states).
  - Accept a ``state`` attribute to be set by the driver, allowing rules in
    transitional states to be updated by the driver. If the driver does not
    return a state for each rule in a transitional state, transition
    'applying' rules to 'active' and soft-delete rules in 'denying' state.
    Perform these actions by acquiring a lock to read the current state and
    releasing it at the end of the transaction.
  - Acquire a lock and read any 'queued_to_apply' rules that may have shown
    up. If there are any, repeat the last three steps; else, continue to
    the next step.
  - Acquire a lock and transition the share instance's
    ``access_rules_status`` from ``out_of_sync`` to ``active``.
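The queueing and batching steps above can be sketched as follows. This is a
simplified illustration, not manila's implementation: ``db``, ``driver``
and their methods are hypothetical stand-ins for manila's real database API
and driver interface, and error handling is omitted:

```python
# Hypothetical sketch of the share manager's update_access loop.
# `db` and `driver` are stand-ins, not manila's real APIs.
import threading

lock = threading.Lock()


def update_access(db, driver, share_instance):
    while True:
        with lock:
            busy = db.rules_in_states(share_instance,
                                      ('applying', 'denying'))
            if busy:
                # The driver is already working on this share instance;
                # newly queued rules are batched and picked up when the
                # in-flight update_access call finishes.
                return
            to_apply = db.rules_in_states(share_instance,
                                          ('queued_to_apply',))
            to_deny = db.rules_in_states(share_instance,
                                         ('queued_to_deny',))
            if not to_apply and not to_deny:
                break
            db.set_states(to_apply, 'applying')
            db.set_states(to_deny, 'denying')
        # Bulk update: the driver sees all changed rules at once, outside
        # the lock, so a slow back end does not block other operations.
        driver.update_access(share_instance,
                             access_rules=db.all_rules(share_instance),
                             add_rules=to_apply, delete_rules=to_deny)
        with lock:
            # If the driver did not report per-rule states, assume success.
            db.set_states(to_apply, 'active')
            db.soft_delete(to_deny)
    with lock:
        if db.access_rules_status(share_instance) == 'out_of_sync':
            db.set_access_rules_status(share_instance, 'active')
```

The loop naturally batches rules that were queued while the driver was
busy: each pass drains everything queued so far in a single driver call.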
The ``error`` state of the share instance's ``access_rules_status`` will be
preserved if any access rules are in 'error' state for the given share
instance.

- The database API for retrieving a specific rule, or all rules for a given
  share or a given share instance, will be refactored.
- The coarse lock around the ``update_access`` driver interface (or the
  fallback interfaces) will be removed. A reader-writer lock around
  database calls for these access rules will be introduced, as pointed out
  above, because both the manila-share and manila-api services read from
  and write to the database. For correct behavior, deployers should prefer
  a distributed lock `[9]`_ or a file lock living on a shared file system.
- On restarting the share manager service, the 'recovery' step for access
  rules will be updated: all rules in 'applying' state will be reverted to
  'queued_to_apply' before requesting the driver to sync access rules for a
  given share instance.
- _`Read-only access semantics`: For share migration and share replication,
  access rules of a given instance (the source instance for share
  migration, or secondary replicas of a share) may need to be set to
  ``r/o`` (read-only). Currently, there is code to ensure the read-only
  semantics on the fly for the source share instance of a migration. This
  spec proposes adding a field to the database representation of share
  instances. The field will be called ``cast_rules_to_readonly``. It will
  be set and unset as necessary in migration and replication workflows
  where applicable. With this field, performing a check in the
  ``update_access`` workflow before invoking the driver is as simple as::

      if share_instance['cast_rules_to_readonly']:
          # The share manager will know to cast all rules to 'r/o'
          # before calling the share driver as long as this condition
          # holds.

  The ``cast_rules_to_readonly`` field will be False by default for any
  share instance. The field will be set to True when creating a new replica
  on a share supporting the ``readable`` style of replication.
  It will be unset on a replica when it is promoted to being the primary
  (active) replica, and set to True on a replica that is demoted to being a
  secondary.

  The ``cast_rules_to_readonly`` field will be set to True at the beginning
  of each migration where necessary, and always unset if a migration is
  canceled. Note that migration is not supported for replicated shares;
  replicas have to be deleted before migrating the share.

  The ``cast_rules_to_readonly`` field will not be exposed to tenants. It
  will be exposed to administrators in all APIs that return the share
  instance in "detail". It will not be present in the share model, not even
  as a @property, because it solely belongs to a particular share instance.

  Setting and unsetting of ``cast_rules_to_readonly`` will be synchronized
  by a lock.

  Supporting read-only rules is a minimum requirement within manila. If
  drivers do not support them, this and future evolution of manila features
  will expose differences and break the abstraction. There have been plans
  to act on such discrepancies in terms of policy changes; any action on
  these drivers is out of scope of this spec.
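Casting rules to read-only can then be handled uniformly in the share
manager, before the driver is invoked. A minimal sketch, assuming a
dict-like rule representation (the function name and rule shape are
hypothetical, not manila's actual code):

```python
# Hypothetical sketch: cast all rules to read-only before calling the
# driver when the share instance's flag is set.
import copy


def rules_for_driver(share_instance, access_rules):
    """Return the rules as the share driver should see them.

    The database representation of each rule is left untouched; only the
    copies handed to the driver are altered, matching the semantics
    described above for 'readable' replicas and migrations.
    """
    if not share_instance.get('cast_rules_to_readonly'):
        return access_rules
    readonly_rules = copy.deepcopy(access_rules)
    for rule in readonly_rules:
        rule['access_level'] = 'ro'
    return readonly_rules
```

This keeps the 'rw' level in the database, so promoting a replica (or
completing a migration) restores read/write access without touching the
stored rules.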
_`Aggregate status fields`
--------------------------

To preserve the abstraction at the API as far as users are concerned, we
need to aggregate status fields for both the share instance
``access_rules_status`` as well as the per instance access rule ``state``
attributes:

share_access_map.state
    Not persisted, but reflects the aggregate state from
    share_instance_access_map.state.

    Aggregation: read all share_instance_access_map.state values and prefer
    them in the following order: 'error', 'queued_to_apply',
    'queued_to_deny', 'applying', 'denying', 'active'.

share.access_rules_status
    Not persisted, but reflects the aggregate state from
    share_instance.access_rules_status.

    Aggregation: read all share_instance.access_rules_status values and
    prefer them in the following order: 'error', 'out_of_sync', 'active'.

Alternatives
------------

- Not having a per share instance, per access rule state: this means living
  outside the reality that processes fail, that things go wrong once in a
  while, and that even genuine user errors can upset well-written driver or
  storage system code (there are drivers that error out entire requests on
  not understanding a rule type). If we don't separate the status field
  now, we will have to support annoyed users and document how to get out of
  messy situations.
- Not having transitional states: allow the coarse transitions occurring
  today and live without the added benefit of concurrency control, with a
  poor user experience; i.e., allow all rules to go to 'error'.
- Keep the coarse lock in the share manager: holding locks for long
  durations increases the chances of things going wrong when processes fail
  while holding locks, and has a higher chance of running into deadlock
  situations `[10]`_.
- Don't refactor the API and allow the error messages to remain as they
  are: we would need user documentation and awareness around what share
  instances are and how they work in manila. This breaks the abstraction
  and design, but maybe is not the worst thing in the complicated OpenStack
  ecosystem.
- Don't combine the RPCs: reasonable, but there is no real benefit to
  keeping them separate.
- Don't refactor the database API code: reasonable, as it is a lot of code
  churn; but again, the number of database calls increases, and we would be
  compromising on performance because we don't want to address this
  problem.
- Don't require a shared/distributed lock: the alternative would be to make
  state transitions occur in only one service (either manila-api or
  manila-share). However, the proposed design is superior in terms of
  correctness. We would need to compromise with behavior changes: i.e.,
  disallow denying a rule until it is 'active', or have the manila-data
  service make requests via manila-share or manila-api to create new rules.
  The resulting code may be simpler, but the loss of functionality would be
  a backwards-incompatible API change. (This is perfect reasoning for a
  manila-conductor service, a la nova-conductor.) On the flip side, if a
  deployer does not deploy a DLM as suggested and still distributes the
  manila-api and manila-share services, the chance of running into race
  conditions is higher when the proposed implementation merges. However,
  this may not occur in clouds where a high volume of access requests on
  the same share is rarely, if ever, seen. Note that any scripting of the
  access API may uncover these race conditions. Disclaimer: the developer
  driving this effort does not condone wrong deployment practices.
- Instead of adding the ``cast_rules_to_readonly`` field to share
  instances, we could evolve a set of conditions within the code to ensure
  the read-only semantics that we desire for share instances:

  - For a host-assisted migration, we can refer to the configuration
    parameter that specifies whether the host can support read-only access
    to a given share, and in the ``update_access`` workflow, change all
    access rules to read-only rules before invoking the driver.
  - Similarly, we can ascertain any ``readable`` replicas within the
    ``update_access`` workflow and cast access rules to read-only before
    invoking the driver.
  - For ``writable`` and ``nondisruptive`` migrations, we can carefully
    ensure we never invoke ``update_access`` while migrating.

  While this seems straightforward, it is much more complicated to maintain
  in the light of future feature functionality. It also overloads the use
  of multiple status fields, creating a bottleneck and a potential race
  condition when these status fields are updated by any process within
  manila. Also, we need to ensure consistent behavior on restarting manila
  services. With this motivation, a field in the database seems necessary.
- Require that drivers update the ``state`` attribute of rules: this is a
  reasonable ask. Allowing the drivers to report which rules were not acted
  upon successfully is more beneficial than the share manager setting all
  the transitional rule states to 'error' during a bulk update operation.
  We need code in the share manager that performs rule updates conditional
  on whether or not the driver has decided to return updates for each
  access rule. This is a similar pattern to other existing driver entry
  points. Therefore, while we will encourage drivers to revisit the
  ``update_access`` interface and return updates to the ``state`` field, it
  is not feasible to make these changes en masse as part of this effort,
  not even for the first party drivers.

Data model impact
-----------------

The ``manila.db.sqlalchemy.models.ShareInstanceAccessMapping`` model will
have a new field called ``state``. The database upgrade step will add this
column, populating it with the value of the existing
``manila.db.sqlalchemy.models.ShareInstance.access_rules_status`` column.
Values of the
``manila.db.sqlalchemy.models.ShareInstance.access_rules_status`` column
will be re-mapped to remove 'updating' and 'updating_multiple' as valid
``access_rules_status`` values.
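The value re-mapping in the upgrade step could look roughly like the sketch
below. This is not the actual migration code, and the target value for the
removed statuses is an assumption here: 'out_of_sync' is taken as the
closest new equivalent of an in-flight update.

```python
# Sketch of the value re-mapping performed by the database upgrade step.
# 'updating' and 'updating_multiple' cease to be valid values of
# access_rules_status; mapping them to 'out_of_sync' is an assumption,
# chosen because it is the new status that represents pending updates.
STATUS_UPGRADE_MAP = {
    'updating': 'out_of_sync',
    'updating_multiple': 'out_of_sync',
}


def remap_access_rules_status(old_status):
    """Return the post-upgrade value for an access_rules_status column."""
    return STATUS_UPGRADE_MAP.get(old_status, old_status)
```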
The ``manila.db.sqlalchemy.models.ShareInstance`` model will have a new
field called ``cast_rules_to_readonly``. The database upgrade step will add
this column with a value of ``True`` for all share instances that have
their ``replication_type`` attribute set to ``readable`` and whose
``replica_state`` is not ``active`` (i.e., secondary replicas in a readable
type of replication). For all other share instances, the value will be set
to ``False``.

In the ORM layer, the
``manila.db.sqlalchemy.models.ShareInstanceAccessMapping`` model will now
contain a back-ref to the access rules model
(``manila.db.sqlalchemy.models.ShareAccessMapping``). That way, the
database APIs associated with reading a given share instance access mapping
row will have access to the related access rule data (such as
``access_type`` and ``access_level``).

The database downgrade step will drop the ``state`` column from
``manila.db.sqlalchemy.models.ShareInstanceAccessMapping``. It will not
alter the ``manila.db.sqlalchemy.models.ShareInstance.access_rules_status``
column. The database downgrade step will also drop the
``cast_rules_to_readonly`` column from
``manila.db.sqlalchemy.models.ShareInstance``. As always, this downgrade is
not recommended in a production cloud.

REST API impact
---------------

No new APIs will be added.
While we will bump the micro-version to expose the 'queued_to_deny' state
and the transitional 'applying' and 'denying' states, some other changes
will be made without bumping the micro-version, since the behavior is
currently broken in the API and it is hard to warrant backwards
compatibility given the unpredictable/undesirable behavior:

**Adding an access rule**

POST /v2/{tenant-id}/shares/{share-id}/action

BODY::

    {
        'allow_access': {
            "access_level": "rw",
            "access_type": "ip",
            "access_to": "0.0.0.0/0"
        }
    }

- The policy check will be completed before any other validation.
- If any share instance is invalid, the API will return without creating
  the rule in the database.
- Adding access rules while the share is being migrated will be possible,
  as long as all share instances have a valid host. This behavior will be
  exposed only from a new micro-version of the allow-access API.

**Denying an access rule**

POST /v2/{tenant-id}/shares/{share-id}/action

BODY::

    {
        'deny_access': {
            "access_id": "a25b2df3-90bd-4add-afa6-5f0dbbd50452"
        }
    }

- The policy check will be completed before any other validation.
- If any share instance is invalid, the API will return without denying the
  rule for any other, valid instance.
- Access rules cannot be denied when the share instance status is not
  'available'.

**Listing access rules**

POST /v2/{tenant-id}/shares/{share-id}/action

BODY::

    {
        'access_list': null
    }

- The policy check will be completed before any other validation.
- The transitional statuses introduced will be mapped to pre-existing
  states for API requests with a micro-version lower than the one where
  they were introduced. Mapping 'applying' and 'queued_to_apply' is easy:
  they will be reported as 'new'. Mapping 'queued_to_deny' and 'denying' is
  tough, because the state could have been anything prior to transitioning
  to 'denying'.
  So, we will read the share's aggregate ``access_rules_status``: if it is
  'error', we will map the state of the access rule to 'error'; if it is
  'out_of_sync', the rule's state will be mapped to 'new'; and if it is
  'active', the rule's state will be mapped to 'active'. Note that this is
  exactly the behavior we are trying to change by re-introducing the per
  share instance, per access rule state.

**Share Instance APIs**

- The ``cast_rules_to_readonly`` field will be exposed in the "detail"
  views of share instances (admin-only) as a micro-versioned change.

Security impact
---------------

Bulk access rule updates will no longer be ignored by manila. We will also
support denying access in case access was accidentally granted. Since the
tracking of state changes will be improved, we expect a positive security
impact.

Notifications impact
--------------------

None until `[11]`_ merges and is fully supported in manila. The work to add
user messages will be proposed with a new blueprint and is not part of this
work.

Other end user impact
---------------------

End users will be prompted to prefer the new micro-version when writing
applications against a newer manila release, so as to gain the full benefit
of the more predictable state transitions. When manila is upgraded in their
clouds, they will already benefit from faster API failures compared to
previous versions, no matter what API version they use.

Performance impact
------------------

Changes to the ORM and the database API will reduce redundant calls to the
database, and are expected to have a positive impact on performance.
Instead of reacting to every RPC request and triggering ``update_access``
calls on the driver, the proposed implementation defers un-applied rule
requests while the driver is processing rule requests.
This can lead to batches of requests taking effect at once, reducing the
number of calls to the storage back end and hopefully improving
performance.

Other deployer impact
---------------------

Until tooz `[9]`_ support is added in manila, we expect deployers to run
manila with file locks created with oslo_concurrency. In multi-node
deployments, these file locks must be accessible from all nodes where
manila services are run. Typically, this is achieved by storing the locks
on a shared file system. Once tooz support is added, the synchronization
introduced by this change will automatically benefit from the locking
abstraction introduced by tooz, and deployers may choose to configure a
distributed lock management system underneath manila/tooz.

Since no state is saved in the services, this proposed change does not
introduce any regression affecting `active-active` deployment choices.

Developer impact
----------------

None

Driver impact
-------------

As always, when denying a rule, if the rule does not exist on the back end,
drivers must not raise an exception; they may log that the access rule
never existed and return None, allowing the rule to be deleted from the
database. Raising an exception will set all rules in transitional states
('applying' or 'denying') to 'error', so drivers must carefully consider
exception handling.

As part of this work, we will accept 'error' rule states from the driver in
the response, so drivers can tell the share manager exactly which rules
failed to be applied. This needs to be added to each driver, considering
how each back end can handle error reporting.

Implementation
==============

Assignee(s)
-----------

Primary assignee:

| gouthamr

Other contributors:

| yogesh
| rooneym

Work Items
----------

* Code - For ease of implementation and review, this work will be done in
  multiple related change sets, each of which will partially implement the
  blueprint until all of the proposed items are implemented.
* Functional tests in the server and the client

Dependencies
============

None

Testing
=======

- Negative/API tests will be added for the changes in the API, including
  scheduling invalid shares or invalid replicas and testing the access
  rules interaction at the API.
- Existing access rules tests will be modified to validate the
  ``access_rules_status`` and to wait on the access rule's ``state``
  attribute, in both manila and python-manilaclient.
- The scenario tests spec `[12]`_ proposes adding tests around the
  read-only semantics. This simplification should pass the tests proposed.

Documentation Impact
====================

The following OpenStack documentation will be updated to reflect this
change:

* OpenStack User Guide: will document the changes in state transitions.
* OpenStack API Reference: all API changes will be documented.
* Manila Developer Reference: a state transition diagram will be added for
  access rules' ``state`` and share instances' ``access_rules_status``.

References
==========

_`[1]`: https://bugs.launchpad.net/manila/+bug/1550258

_`[2]`: https://bugs.launchpad.net/manila/+bug/1626249

_`[3]`: https://blueprints.launchpad.net/manila/+spec/share-instances

_`[4]`: https://blueprints.launchpad.net/manila/+spec/new-share-access-driver-interface

_`[5]`: https://bugs.launchpad.net/manila/+bug/1626249

_`[6]`: https://github.com/openstack/manila/blob/4c6ce2f/manila/share/api.py#L1417

_`[7]`: http://docs.openstack.org/developer/manila/devref/share_replication.html#access-rules

_`[8]`: http://docs.openstack.org/admin-guide/shared-file-systems-share-replication.html#access-rules

_`[9]`: https://review.openstack.org/#/c/318336/

_`[10]`: https://en.wikipedia.org/wiki/Dining_philosophers_problem

_`[11]`: https://blueprints.launchpad.net/manila/+spec/user-messages

_`[12]`: https://review.openstack.org/#/c/374731/