CloudWatch dashboard missing custom metrics after Lambda runtime upgrade for analytics ETL jobs

After upgrading our Lambda functions from Python 3.7 to 3.9 for ETL analytics workloads, CloudWatch dashboards are no longer showing custom metrics. The functions execute successfully, but the metrics we publish using boto3’s put_metric_data aren’t appearing. I’ve verified the Lambda runtime upgrade was successful, and logs show no errors during metric publishing.

Code snippet that used to work:

import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_data(
    Namespace='Analytics/ETL',
    MetricData=[{'MetricName': 'RecordsProcessed', 'Value': count}]
)

The IAM permissions for metric publishing haven’t changed. Is there something different about how Lambda runtimes handle CloudWatch custom metrics in Python 3.9? Our monitoring has gaps now and we can’t track ETL job performance.

Check your Lambda execution role - does it have cloudwatch:PutMetricData permission? Sometimes role policies get reset during runtime migrations. Also, verify the boto3 version in your Lambda layer or deployment package. Python 3.9 might be using a different boto3 version that requires explicit region configuration.

I’ve resolved this issue multiple times. There are three areas to work through:

Lambda Runtime Upgrade Impact: The runtime upgrade itself doesn’t change how custom metrics work, but each runtime bundles a different boto3/botocore version, and the redeployment that comes with an upgrade can silently change other configuration. Start by adding explicit error handling (and ideally retry logic) so publish failures actually show up in your logs:

try:
    response = cloudwatch.put_metric_data(
        Namespace='Analytics/ETL',
        MetricData=[{'MetricName': 'RecordsProcessed', 'Value': count, 'Unit': 'Count'}]
    )
    print(f"Metric published: {response['ResponseMetadata']['HTTPStatusCode']}")
except Exception as e:
    # Surface the failure in CloudWatch Logs without killing the ETL job
    print(f"Metric publish failed: {e}")
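
The try/except above surfaces failures but doesn’t retry them. Here is a minimal backoff sketch; publish_with_retry is an illustrative helper, not a boto3 API, and in the Lambda publish_fn would wrap the actual put_metric_data call:

```python
import time

def publish_with_retry(publish_fn, max_attempts=3, base_delay=0.5):
    """Call publish_fn, retrying with exponential backoff on failure.

    publish_fn is a zero-argument callable; in the Lambda it would
    close over the boto3 client and the metric payload. In real code,
    catch botocore's ClientError rather than bare Exception.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return publish_fn()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # give up: let the caller see the final error
            delay = base_delay * (2 ** (attempt - 1))
            print(f"Publish attempt {attempt} failed ({exc}); retrying in {delay}s")
            time.sleep(delay)

# Demo with a stub that fails twice, then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("throttled")
    return "ok"

result = publish_with_retry(flaky, base_delay=0.01)
print(result)  # prints "ok" after two retries
```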

CloudWatch Custom Metrics: Double-check the metric format. Neither the Python 3.9 runtime nor any boto3 version makes Timestamp and Unit mandatory, but setting them explicitly is good practice; keep in mind that CloudWatch only accepts data points up to two weeks in the past and about two hours into the future. For example:

from datetime import datetime, timezone

MetricData=[{
    'MetricName': 'RecordsProcessed',
    'Value': count,
    'Unit': 'Count',
    'Timestamp': datetime.now(timezone.utc)  # timezone-aware; utcnow() returns a naive datetime
}]
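
One related point: PutMetricData accepts multiple data points per call but caps the batch size (20 data points per request was the long-standing limit; check the current quota). If the ETL job emits many metrics, batching is much cheaper than one call per point. A small stdlib chunking sketch:

```python
def chunked(items, size=20):
    """Yield successive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Example payload: 45 data points (values here are synthetic)
metric_data = [
    {'MetricName': 'RecordsProcessed', 'Value': n, 'Unit': 'Count'}
    for n in range(45)
]

batches = list(chunked(metric_data))
print([len(b) for b in batches])  # [20, 20, 5]

# In the Lambda, each batch would be sent with:
# for batch in chunked(metric_data):
#     cloudwatch.put_metric_data(Namespace='Analytics/ETL', MetricData=batch)
```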

IAM Permissions for Metric Publishing: Your execution role needs cloudwatch:PutMetricData alongside the standard logging permissions. Add a statement like this to the role policy:

{
  "Effect": "Allow",
  "Action": [
    "cloudwatch:PutMetricData",
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents"
  ],
  "Resource": "*"
}
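
The statement above has to sit inside a complete policy document. A minimal sketch of building and attaching it with boto3 (the role and policy names are placeholders for your own; the attach call is commented out so the skeleton runs without AWS access):

```python
import json

# Full policy document wrapping the statement shown above.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "cloudwatch:PutMetricData",
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents",
        ],
        # PutMetricData does not support resource-level ARNs, so "*" is required here
        "Resource": "*",
    }],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)

# Attach as an inline policy (requires boto3 and iam:PutRolePolicy):
# import boto3
# boto3.client("iam").put_role_policy(
#     RoleName="my-etl-lambda-role",    # placeholder
#     PolicyName="cloudwatch-metrics",  # placeholder
#     PolicyDocument=policy_json,
# )
```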

Key troubleshooting steps:

  1. Enable X-Ray tracing on your Lambda to see if CloudWatch API calls are being made
  2. Check CloudWatch Logs for boto3 client initialization errors - missing region or credential configuration will surface there
  3. Verify your Lambda has internet connectivity (check VPC configuration if applicable)
  4. Test metric publishing with a standalone script using the same IAM role to isolate Lambda-specific issues
  5. Review CloudTrail for PutMetricData API calls - if they’re not appearing, the calls aren’t reaching CloudWatch
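
Step 4 can be sketched as a tiny standalone script that reuses the exact payload the Lambda sends; the AWS call itself is left commented so the skeleton runs anywhere:

```python
# Standalone skeleton for step 4: publish one test metric with the same
# payload shape the Lambda uses. build_test_metric is an illustrative
# helper, not part of any AWS SDK.
def build_test_metric(count):
    return {
        "Namespace": "Analytics/ETL",
        "MetricData": [{
            "MetricName": "RecordsProcessed",
            "Value": count,
            "Unit": "Count",
        }],
    }

kwargs = build_test_metric(123)
print(kwargs["Namespace"], kwargs["MetricData"][0]["Value"])

# With credentials for the same execution role configured locally:
# import boto3
# boto3.client("cloudwatch").put_metric_data(**kwargs)
# Then look for the data point in the console under Analytics/ETL.
```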

In my experience, the most common culprit is VPC networking. If your Lambda was moved into a VPC during the upgrade without proper endpoints or a NAT gateway, CloudWatch API calls will time out silently. Check the Lambda’s VPC configuration and either add a CloudWatch VPC endpoint or make sure a NAT gateway is in place.

Also, check which boto3 version your function actually loads (log boto3.__version__ at startup). Each Lambda runtime bundles its own copy, and the version bundled with Python 3.9 differs from the 3.7 one; if you depend on specific behavior, pin boto3 in your deployment package or a layer rather than relying on the runtime’s bundled copy.

One more networking detail: custom metrics require outbound connectivity to the CloudWatch API (monitoring.<region>.amazonaws.com). Note that CloudWatch Logs uses a separate VPC endpoint (com.amazonaws.<region>.logs) from CloudWatch metrics (com.amazonaws.<region>.monitoring), so the fact that your function’s logs arrive fine doesn’t prove its metric calls can get through.
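
If the metrics endpoint is missing, it can be created with boto3’s EC2 client. A hedged sketch (all IDs below are placeholders, and the call is commented out so the skeleton runs without AWS access):

```python
# Sketch of creating the CloudWatch metrics interface endpoint.
# The service name pattern for CloudWatch metrics is
# com.amazonaws.<region>.monitoring; all IDs below are placeholders.
region = "us-east-1"  # placeholder
endpoint_kwargs = {
    "VpcEndpointType": "Interface",
    "ServiceName": f"com.amazonaws.{region}.monitoring",
    "VpcId": "vpc-0123456789abcdef0",              # placeholder
    "SubnetIds": ["subnet-0123456789abcdef0"],     # placeholder
    "SecurityGroupIds": ["sg-0123456789abcdef0"],  # placeholder
    # Lets the default monitoring.<region>.amazonaws.com hostname resolve privately
    "PrivateDnsEnabled": True,
}
print(endpoint_kwargs["ServiceName"])

# With appropriate EC2 permissions:
# import boto3
# boto3.client("ec2", region_name=region).create_vpc_endpoint(**endpoint_kwargs)
```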