OCI Vault backup key rotation fails with KMS error during scheduled jobs

We’re experiencing failures with our scheduled backup jobs that rely on OCI Vault key rotation. The backups run nightly across multiple regions, and ever since we implemented automatic key rotation for our master encryption keys, we’re getting KMS-related errors.

The error appears when the backup service tries to access the rotated key:


Error: KMS.KeyNotAccessible
Message: Key version ocid1.key.oc1..aaa...xyz not found in vault
HTTP Status: 404

Our key rotation is configured for 90-day intervals, and we’ve verified the new key versions exist in the vault. The IAM policies seem correct for single-region access, but we’re wondering if there’s something specific about multi-region key referencing that we’re missing. The backup jobs were working perfectly before we enabled rotation.

Has anyone dealt with OCI Vault key rotation in a multi-region backup scenario? We need to understand the proper way to configure KMS policies so backup services can access rotated keys across regions.

Thanks for the quick response. I checked our backup configuration and you’re right - we were using versioned key OCIDs. I’ve updated the configuration to use the base key OCID, but we’re still getting errors on cross-region backups. The same-region backups now work fine after your suggestion.

The cross-region error is slightly different now - it’s an authorization issue rather than a not-found error. Could this be related to how IAM policies handle vault access across regions?

I’ve seen this exact issue. The problem is usually that your backup service configuration (or the policy attached to it) references a specific key version OCID instead of the key OCID itself. When rotation happens, a new version is created and the pinned version OCID no longer refers to the current version. Check your backup configuration - it should reference the key itself, not one of its versions.

Building on Sarah’s point - I recommend creating a separate dynamic group for your backup instances and then granting cross-region vault permissions to that group. This is cleaner than instance-specific policies. Also verify that your vault replication is properly configured if you’re expecting keys to be available in multiple regions automatically.

Cross-region vault access requires explicit IAM policy statements for each region. The default vault policies are region-specific. You need to add policy statements that allow your backup service to access vaults in other regions where you’re running backups. Make sure the policy includes both ‘use key-delegate’ and ‘read vaults’ permissions for the target regions.

Let me provide a comprehensive solution for OCI Vault key rotation with multi-region backups, addressing all the issues you’ve encountered.

1. OCI Vault Key Rotation Configuration

First, ensure you’re using the base key OCID (not a key version OCID) in all backup configurations. The key OCID has the form ocid1.key.oc1.<region>.<unique_id> and does not identify any particular version. When rotation occurs, OCI automatically uses the latest version for new encryption operations.
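A quick way to catch a pinned version before it reaches a backup configuration is to inspect the resource-type segment of the OCID. The sketch below is only illustrative: it assumes the general OCID layout ocid1.<resource type>.<realm>.[region][.future use].<unique id>, and it assumes key versions carry the resource type `keyversion`. The helper names `parse_ocid` and `is_base_key_ocid` are made up for this example.

```python
from typing import NamedTuple


class Ocid(NamedTuple):
    resource_type: str
    realm: str
    region: str      # empty string when the OCID has no region segment
    unique_id: str


def parse_ocid(ocid: str) -> Ocid:
    """Split an OCID into its main segments.

    Assumed layout: ocid1.<type>.<realm>.[region][.future use].<unique id>
    """
    parts = ocid.split(".")
    if len(parts) < 5 or parts[0] != "ocid1":
        raise ValueError(f"not a recognisable OCID: {ocid!r}")
    return Ocid(parts[1], parts[2], parts[3], parts[-1])


def is_base_key_ocid(ocid: str) -> bool:
    """True when the OCID names a key rather than a key version.

    Assumption: key versions use the resource type 'keyversion'.
    """
    return parse_ocid(ocid).resource_type == "key"
```

Running `is_base_key_ocid` over every key reference in the backup configuration before deployment would flag any value that is not a plain key OCID.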

2. KMS IAM Policy Configuration

You need comprehensive IAM policies that cover multi-region scenarios. Create a dynamic group for your backup instances:


ALL {instance.compartment.id = 'ocid1.compartment.oc1..xxx'}

Then create policies for cross-region vault access:


Allow dynamic-group backup-instances to use key-delegate in tenancy
Allow dynamic-group backup-instances to read vaults in tenancy
Allow dynamic-group backup-instances to use keys in tenancy where target.key.id = 'ocid1.key.oc1..xxx'

The use key-delegate permission is critical for cross-region scenarios as it allows the backup service to delegate key operations across regions.
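If you manage several keys or dynamic groups, generating the statements from one template helps keep them consistent. This is a minimal sketch that only renders the three statements shown above as strings; the function name `vault_policy_statements` and its parameters are hypothetical, and the output still has to be reviewed and applied through your usual policy tooling.

```python
from typing import List, Sequence


def vault_policy_statements(
    group: str, key_ocids: Sequence[str], scope: str = "tenancy"
) -> List[str]:
    """Render the vault policy statements for one dynamic group."""
    statements = [
        f"Allow dynamic-group {group} to use key-delegate in {scope}",
        f"Allow dynamic-group {group} to read vaults in {scope}",
    ]
    # One 'use keys' statement per key, restricted by key OCID.
    for key_ocid in key_ocids:
        statements.append(
            f"Allow dynamic-group {group} to use keys in {scope} "
            f"where target.key.id = '{key_ocid}'"
        )
    return statements


if __name__ == "__main__":
    for line in vault_policy_statements("backup-instances", ["ocid1.key.oc1..xxx"]):
        print(line)
```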

3. Multi-Region Key Referencing

For cross-region backups, you have two options:

a) Vault Replication: Enable vault replication to create replica vaults in target regions. This ensures keys are locally available and reduces latency. Configure replication in the OCI Console under Vault settings.

b) Cross-Region References: If not using replication, your backup configuration must explicitly handle cross-region key access. The backup service needs to authenticate against the home region’s identity service but reference keys in the target region.

4. Handling Propagation Delays

Implement these safeguards:

  • Add a 30-minute buffer between key rotation and backup jobs
  • Implement retry logic with roughly exponential backoff (first retry after 2 minutes, then after 5, 10, and 20 minutes)
  • Monitor IAM policy propagation using OCI Audit logs before triggering backups
  • Use OCI Events to trigger backups only after key rotation completion events
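The retry safeguard from the list above could look roughly like this. It is a sketch under assumptions: `KeyNotReadyError` is a hypothetical exception standing in for whatever your backup step raises on a KMS failure, and the `sleep` parameter exists only so the waiting can be swapped out in tests.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

# Delays in minutes taken from the list above (roughly doubling).
RETRY_DELAYS_MINUTES = (2, 5, 10, 20)


class KeyNotReadyError(Exception):
    """Hypothetical error raised when the key is not yet usable."""


def run_with_backoff(
    operation: Callable[[], T],
    delays_minutes: Sequence[float] = RETRY_DELAYS_MINUTES,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, waiting longer after each KeyNotReadyError."""
    for delay in delays_minutes:
        try:
            return operation()
        except KeyNotReadyError:
            sleep(delay * 60)  # convert minutes to seconds
    # Final attempt after the last wait; any error now propagates.
    return operation()
```

With the default schedule this makes at most five attempts over about 37 minutes. Only the hypothetical `KeyNotReadyError` is retried, so unrelated failures surface immediately.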

5. Best Practices

  • Schedule key rotation during maintenance windows, not immediately before backup jobs
  • Test key rotation in non-production environments first to verify IAM policy effectiveness
  • Enable OCI Monitoring alerts for KMS errors (KMS.KeyNotAccessible, KMS.Unauthorized)
  • Document your key OCIDs and rotation schedules in a central repository
  • Consider using shorter rotation periods (30-60 days) to catch issues earlier

6. Troubleshooting Steps

If you continue experiencing issues:

  1. Verify the key status using `oci kms management key get --key-id <key_OCID> --endpoint <vault_management_endpoint>`
  2. Check IAM policy evaluation using OCI Policy Simulator
  3. Review OCI Audit logs for detailed authorization failures
  4. Confirm vault replication status if using replicated vaults
  5. Ensure your backup service version supports automatic key version resolution
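The first step can also be scripted with the OCI Python SDK. The following is a hedged sketch rather than a tested integration: it assumes the SDK's `KmsManagementClient.get_key` call and the `lifecycle_state` and `current_key_version` attributes on the returned key, and `summarize_key` and `build_client` are names invented for this example. The `oci` import is kept inside `build_client` so the summary logic can be exercised without the SDK installed.

```python
def summarize_key(kms_management_client, key_id: str) -> dict:
    """Return the lifecycle state and current version of a key.

    The client is expected to behave like
    oci.key_management.KmsManagementClient: get_key(key_id) returns a
    response whose .data exposes lifecycle_state and current_key_version.
    """
    key = kms_management_client.get_key(key_id).data
    return {
        "key_id": key_id,
        "lifecycle_state": key.lifecycle_state,
        "current_key_version": key.current_key_version,
        "usable": key.lifecycle_state == "ENABLED",
    }


def build_client(management_endpoint: str):
    """Create a real client from the default OCI config file.

    management_endpoint is the vault's management endpoint URL.
    """
    import oci  # OCI Python SDK, imported lazily

    config = oci.config.from_file()
    return oci.key_management.KmsManagementClient(
        config, service_endpoint=management_endpoint
    )
```

Comparing `current_key_version` before and after a rotation is one way to confirm that the rotation actually completed in the region the backup job talks to.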

The intermittent failures you’re experiencing are most likely due to IAM policy propagation delays combined with the timing of your backup jobs relative to key rotation events. Implementing the retry mechanism and scheduling buffers should resolve most of them.

After making these changes, monitor your backup jobs for at least one full rotation cycle to confirm everything works consistently.

Yes, there’s definitely a propagation delay. IAM policy changes can take 10-15 minutes to propagate across regions, and key rotation events need time to replicate. For production backup jobs, I recommend implementing a retry mechanism with exponential backoff. Also consider scheduling backups at least 30 minutes after any key rotation events.
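The 30-minute buffer is easy to enforce in a scheduler wrapper. This is a small sketch assuming you can obtain the timestamp of the last rotation from somewhere (for example your own rotation job's log); the name `earliest_backup_start` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Buffer suggested above between a key rotation and the next backup run.
ROTATION_BUFFER = timedelta(minutes=30)


def earliest_backup_start(
    scheduled: datetime,
    last_rotation: datetime,
    buffer: timedelta = ROTATION_BUFFER,
) -> datetime:
    """Return the scheduled time, pushed back if it falls inside the buffer."""
    return max(scheduled, last_rotation + buffer)


if __name__ == "__main__":
    rotation = datetime(2024, 1, 1, 1, 50, tzinfo=timezone.utc)
    nightly = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
    # Rotation finished 10 minutes before the job, so the start moves to 02:20 UTC.
    print(earliest_backup_start(nightly, rotation))
```

Use timezone-aware timestamps for both arguments so that rotations and backups in different regions compare correctly.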