I’m looking to start a discussion about database archiving strategies for the subscription management module in ICS 2021. Our subscription database has grown to over 50GB with 5 years of transaction history, and we’re starting to see performance impacts on reporting and month-end processing.
We need to balance several concerns: maintaining audit compliance (7-year retention requirement), keeping active subscription data performant, and managing storage costs. I’m particularly interested in hearing about partitioning strategies, purging approaches for truly obsolete data, and backup considerations when dealing with archived subscription records. What approaches have worked well for others managing large subscription datasets?
From a compliance perspective, make sure your archiving strategy maintains audit trails. We archive subscription transactions older than 3 years to a separate archive database, but keep full transaction logs and audit records accessible for the full 7-year retention period. The archive database is read-only and backed up to tape for long-term storage. This approach balances performance with regulatory requirements.
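To make the cutoff logic concrete, here's a minimal sketch of that kind of move-then-delete job using SQLite in-memory databases as stand-ins for the live and archive databases. The table and column names (`subscription_txn`, `txn_date`, etc.) are illustrative, not the actual ICS schema:

```python
import sqlite3
from datetime import date

ARCHIVE_CUTOFF_YEARS = 3  # transactions older than this move to the archive

def archive_old_transactions(live: sqlite3.Connection,
                             archive: sqlite3.Connection,
                             today: date) -> int:
    """Copy transactions older than the cutoff into the archive database,
    then delete them from the live database. Returns the number of rows
    moved. Schema names here are hypothetical placeholders."""
    cutoff = today.replace(year=today.year - ARCHIVE_CUTOFF_YEARS).isoformat()
    rows = live.execute(
        "SELECT id, sub_id, txn_date, amount FROM subscription_txn "
        "WHERE txn_date < ?", (cutoff,)).fetchall()
    archive.executemany(
        "INSERT INTO subscription_txn VALUES (?, ?, ?, ?)", rows)
    live.execute("DELETE FROM subscription_txn WHERE txn_date < ?", (cutoff,))
    live.commit()
    archive.commit()
    return len(rows)
```

In production you'd wrap the copy and delete in a single transaction (or verify row counts before deleting) so a failure mid-job can't lose records, and the archive connection would point at the read-only archive database rather than an in-memory one.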
For purging, be very careful about what you actually delete versus archive. We only purge truly temporary data like session logs or staging tables. Everything else gets archived even if it seems obsolete. You’d be surprised how often someone needs to reference a 6-year-old subscription record for a legal dispute or customer inquiry. Our purging policy is limited to data with zero business value after 90 days.
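A policy like that is easy to encode as an explicit allow-list, so nothing gets purged by accident. A minimal sketch (the category names are made up for illustration):

```python
from datetime import datetime, timedelta

PURGE_AFTER_DAYS = 90
# Only data with zero business value after the window is eligible for
# permanent deletion; everything else is archived, never purged.
PURGEABLE_CATEGORIES = {"session_log", "staging_row"}

def is_purgeable(category: str, created: datetime, now: datetime) -> bool:
    """True only for truly temporary data older than the 90-day window."""
    if category not in PURGEABLE_CATEGORIES:
        return False
    return (now - created) > timedelta(days=PURGE_AFTER_DAYS)
```

The allow-list is the important design choice: a new data category defaults to "archive", and someone has to deliberately add it to the purgeable set.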
We implemented table partitioning on our subscription tables by year, which made archival much easier. Active subscriptions stay in the current partition, and we can archive older partitions to separate tablespaces or even move them to cheaper storage. The key is setting up the partitioning scheme early before your data gets too large.
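The routing idea behind yearly range partitioning can be sketched in a few lines; real databases do this declaratively (e.g. `PARTITION BY RANGE` on the date column), but the effect is the same, and the field names below are assumptions for illustration:

```python
from collections import defaultdict
from datetime import date

def partition_key(txn_date: date) -> str:
    """Yearly range-partition key, mirroring a PARTITION BY RANGE on year."""
    return f"p{txn_date.year}"

def route(rows):
    """Group rows into per-year partitions. Once a year is closed out,
    its whole partition can be detached and moved to cheaper storage
    as a unit, instead of row-by-row archival."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_key(row["txn_date"])].append(row)
    return partitions
```

The archival win is exactly that last point: dropping or detaching `p2015` is a metadata operation on one partition, not a `DELETE` scanning the whole table.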
Good point about being conservative with purging. I’m leaning toward an archive-first approach where we rarely delete anything permanently. The challenge is making archived data accessible enough for occasional queries without impacting production performance.
That’s interesting. Did you partition by subscription start date or by transaction date? I’m wondering which approach gives better query performance for typical subscription reporting that often spans multiple years of data.