Happy to share the technical approach and code examples. Here’s our complete migration framework:
Phase 1: Data Extraction and Profiling
We started with comprehensive data profiling to understand quality issues:
import pandas as pd
from datetime import datetime
import re
class LeaseDataProfiler:
def __init__(self, legacy_data):
self.df = pd.read_csv(legacy_data)
def profile_data_quality(self):
report = {
'total_contracts': len(self.df),
'missing_values': self.df.isnull().sum().to_dict(),
'duplicate_contract_ids': self.df['contract_id'].duplicated().sum(),
'invalid_dates': self.check_date_validity(),
'orphaned_records': self.check_referential_integrity()
}
return report
This profiling revealed that 12% of contracts had invalid date ranges, 8% had missing tenant information, and 5% were duplicates.
Phase 2: Data Transformation Pipeline
We built a multi-stage transformation pipeline with validation at each step:
class LeaseTransformer:
def __init__(self):
self.validation_errors = []
self.warnings = []
def transform_contract(self, legacy_contract):
# Stage 1: Basic field mapping
odoo_contract = {
'name': self.clean_contract_name(legacy_contract['contract_num']),
'partner_id': self.map_tenant_id(legacy_contract['tenant_id']),
'date_start': self.parse_date(legacy_contract['start_date']),
'date_end': self.parse_date(legacy_contract['end_date']),
'recurring_amount': self.extract_rent_amount(legacy_contract['rent_notes'])
}
# Stage 2: Validation
self.validate_contract(odoo_contract, legacy_contract['contract_id'])
# Stage 3: Payment schedule extraction
if legacy_contract['payment_schedule_notes']:
odoo_contract['invoice_lines'] = self.extract_payment_schedule(
legacy_contract['payment_schedule_notes']
)
return odoo_contract
def extract_payment_schedule(self, notes_text):
# Pattern matching for common formats
patterns = [
r'\$([\d,]+\.\d{2})\s*per\s*month',
r'monthly rent\s*:\s*\$([\d,]+\.\d{2})',
r'([\d,]+\.\d{2})\s*monthly'
]
for pattern in patterns:
match = re.search(pattern, notes_text, re.IGNORECASE)
if match:
amount = float(match.group(1).replace(',', ''))
return self.generate_invoice_lines(amount)
# If no pattern matches, flag for manual review
self.warnings.append({
'type': 'payment_schedule_extraction',
'message': f'Could not extract payment schedule from: {notes_text[:100]}'
})
return None
Phase 3: Multi-Level Validation
We implemented three validation levels to catch issues without being overly restrictive:
class ContractValidator:
def validate(self, contract, contract_id):
errors = []
warnings = []
# BLOCKING ERRORS - Must be fixed before migration
if not contract.get('name'):
errors.append(f"Contract {contract_id}: Missing contract name")
if not contract.get('partner_id'):
errors.append(f"Contract {contract_id}: Missing tenant reference")
if contract.get('date_start') and contract.get('date_end'):
if contract['date_end'] < contract['date_start']:
errors.append(f"Contract {contract_id}: End date before start date")
# WARNINGS - Flag for review but don't block
if contract.get('recurring_amount', 0) < 100:
warnings.append(f"Contract {contract_id}: Unusually low rent amount")
if contract.get('date_start'):
lease_duration = (contract['date_end'] - contract['date_start']).days
if lease_duration < 30 or lease_duration > 3650:
warnings.append(f"Contract {contract_id}: Unusual lease duration")
return {'errors': errors, 'warnings': warnings}
Phase 4: Reconciliation and Quality Assurance
After migration, we ran comprehensive reconciliation comparing legacy vs Odoo data:
class MigrationReconciliation:
def reconcile(self, legacy_df, odoo_df):
reconciliation_report = []
# Check record counts
if len(legacy_df) != len(odoo_df):
reconciliation_report.append({
'check': 'record_count',
'status': 'FAIL',
'legacy_count': len(legacy_df),
'odoo_count': len(odoo_df)
})
# Check financial totals
legacy_total = legacy_df['monthly_rent'].sum()
odoo_total = odoo_df['recurring_amount'].sum()
variance = abs(legacy_total - odoo_total)
reconciliation_report.append({
'check': 'total_monthly_rent',
'status': 'PASS' if variance < 100 else 'FAIL',
'legacy_total': legacy_total,
'odoo_total': odoo_total,
'variance': variance
})
return pd.DataFrame(reconciliation_report)
Migration Strategy
We migrated in phases by property to minimize risk:
- Week 1-2: Pilot migration (200 contracts, 1 property) - Full testing and validation
- Week 3-6: Phase 1 (2,000 contracts, 5 properties) - Monitor issues and refine
- Week 7-10: Phase 2 (6,300 contracts, remaining properties) - Full migration
Between phases, we ran parallel systems and reconciled daily to catch any discrepancies.
Key Success Factors:
- Comprehensive data profiling upfront revealed 85% of issues before writing code
- Three-level validation (errors/warnings/info) prevented blocking valid edge cases
- Automated reconciliation caught discrepancies immediately
- Phased approach allowed learning and refinement between batches
- Clear rollback procedures for each phase gave confidence to proceed
Final Results:
- 8,500 contracts migrated with 99.7% accuracy
- 127 contracts required manual intervention (1.5%)
- Average reconciliation variance: $0.23 per contract
- Zero production incidents post-migration
- 6 months later: All contracts validated, no legacy system dependencies
The code framework is reusable for other entity migrations. We’ve since used modified versions for migrating equipment leases and service contracts with similar success rates.