We’re experiencing intermittent routing issues in our IBM Cloud VPC that are breaking our ERP integration. Traffic between our application subnet (10.240.10.0/24) and database subnet (10.240.20.0/24) randomly fails, causing ERP transaction timeouts.
I’ve checked the VPC routing table configuration and noticed multiple custom routes, but I’m not sure if there’s a CIDR overlap issue. Here’s our current setup:
Route 1: 10.240.0.0/16 -> local
Route 2: 10.240.20.0/24 -> 10.240.10.5 (custom)
Route 3: 10.240.0.0/20 -> gateway-id
I’ve enabled Flow Logs but I’m struggling to pinpoint the exact routing decision causing the failures. When the issue occurs, ERP users can’t complete purchase orders or update supplier records. Has anyone dealt with VPC routing table conflicts like this? How do I troubleshoot overlapping routes and identify which rule is taking precedence?
If Route 3 doesn’t serve a specific purpose (like routing that /20 block to a transit gateway or VPN), then yes, remove it. Having overlapping routes creates confusion and can lead to exactly the issues you’re experiencing. Keep your routing table as simple as possible - only add custom routes when you have explicit requirements like connecting to on-premises networks, routing through firewalls, or directing traffic to network appliances. For standard VPC subnet communication, the implicit local routes are all you need and they’re much more reliable.
Update: I tested connectivity using the VPC console tool and confirmed Route 2 was indeed intercepting database traffic. The load balancer at 10.240.10.5 isn’t configured to forward that traffic - it’s only meant for external ERP API calls. I’m going to remove Route 2 today. Should I also clean up Route 3, or is there a valid reason to keep a more specific route alongside the /16 catchall?
Thanks for the quick response. I checked the gateway in Route 3 and it’s active. The IP 10.240.10.5 in Route 2 is supposed to be our application load balancer. But now I’m wondering - should I even have Route 2? The database subnet should be reachable via the default local route, right? I’m going to review our Flow Logs more carefully to see which route is actually being used when failures happen.
Let me provide a comprehensive solution based on the discussion:
Root Cause Analysis:
Your VPC routing table has overlapping CIDR blocks causing traffic misrouting. Route 2 (10.240.20.0/24 → 10.240.10.5) was forcing database traffic through a load balancer that doesn’t forward it, and Route 3 (10.240.0.0/20) overlaps with your VPC-wide Route 1.
Step 1: VPC Routing Table Configuration
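Remove the problematic custom routes (delete Route 3 only after confirming that nothing depends on its gateway, such as a transit gateway or VPN):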
ibmcloud is vpc-routing-table-route-delete VPC_ID ROUTING_TABLE_ID ROUTE_2_ID
ibmcloud is vpc-routing-table-route-delete VPC_ID ROUTING_TABLE_ID ROUTE_3_ID
Your routing table should only contain:
- Default local route: 10.240.0.0/16 → local (implicit, handles all intra-VPC traffic)
- Only add custom routes if you need to route to transit gateways, VPN connections, or network appliances
Step 2: Subnet CIDR Overlap Detection
Validate no overlapping custom routes exist:
ibmcloud is vpc-routing-table-routes VPC_ID ROUTING_TABLE_ID --output json | jq '.[] | select(.destination | startswith("10.240."))'
Ensure no two routes have overlapping CIDR blocks unless intentionally using longest-prefix-match for traffic engineering.
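If you'd rather check the overlaps offline, here is a minimal sketch using Python's standard ipaddress module. The function name find_overlaps is just illustrative, and the CIDR list is copied by hand from the routes in the original post rather than pulled from the CLI.

```python
import ipaddress
from itertools import combinations

def find_overlaps(cidrs):
    """Return every pair of CIDR blocks that share at least one address."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [(str(a), str(b)) for a, b in combinations(nets, 2) if a.overlaps(b)]

# Destinations of Route 1, Route 2 and Route 3 from the original post.
route_cidrs = ["10.240.0.0/16", "10.240.20.0/24", "10.240.0.0/20"]

for a, b in find_overlaps(route_cidrs):
    print(f"overlap: {a} <-> {b}")
```

For these three routes it reports the /16 overlapping both the /24 and the /20. The /24 and the /20 do not overlap each other, because 10.240.0.0/20 ends at 10.240.15.255.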
Step 3: Flow Logs Troubleshooting
Verify traffic flows correctly after cleanup:
- Query Flow Logs for traffic between subnets during your next ERP transaction window:
Filter: src_ip=10.240.10.* AND dst_ip=10.240.20.*
Look for: action=accept (should be present)
action=reject/drop (should be absent)
- Use VPC’s built-in connectivity test: VPC Console → Routing Tables → Test Connectivity
- Source: App subnet interface IP
- Destination: Database subnet IP
- Verify result shows ‘local’ route used
Validation:
After removing Route 2 (and Route 3, if it turns out to be unused), test your ERP integration:
- Purchase order creation should complete without timeouts
- Supplier record updates should succeed consistently
- Monitor Flow Logs for 24-48 hours to confirm no rejected connections
Best Practice:
For subnet-to-subnet communication within a single VPC, always rely on implicit local routing. Only create custom routes when you have explicit requirements like routing through virtual firewalls, connecting to transit gateways, or directing traffic to on-premises networks via VPN. Keep your routing table minimal and document the purpose of every custom route to avoid future confusion.
This should resolve your ERP integration failures, as long as nothing else depends on the routes you removed.
I see the problem - you have overlapping CIDR blocks in your routing table. Route 1 (10.240.0.0/16) and Route 3 (10.240.0.0/20) overlap, and Route 2 overrides traffic to your database subnet. VPC routing uses longest prefix match, so Route 3 with /20 takes precedence over Route 1 for its range (10.240.0.0 to 10.240.15.255, which includes your application subnet but not your database subnet), while Route 2 with /24 should win for database traffic. Check if your gateway in Route 3 is actually reachable. Also, verify Route 2’s next hop (10.240.10.5) - is that IP actually forwarding traffic correctly?
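To make the longest-prefix-match reasoning concrete, here is a rough sketch in Python using the standard ipaddress module. It only models plain longest prefix match over the three routes you posted. It does not model how IBM Cloud treats implicit system routes or route priorities, and select_route is an illustrative name, not anything from the platform.

```python
import ipaddress

# Routes as listed in the original post; next hops are kept as plain labels.
ROUTES = [
    ("Route 1", "10.240.0.0/16", "local"),
    ("Route 2", "10.240.20.0/24", "10.240.10.5"),
    ("Route 3", "10.240.0.0/20", "gateway-id"),
]

def select_route(dest_ip, routes=ROUTES):
    """Pick the matching route with the longest prefix, or None if nothing matches."""
    ip = ipaddress.ip_address(dest_ip)
    matches = []
    for name, cidr, next_hop in routes:
        net = ipaddress.ip_network(cidr)
        if ip in net:
            matches.append((net.prefixlen, name, cidr, next_hop))
    if not matches:
        return None
    # The tuple sorts by prefix length first, so max() is the most specific route.
    _, name, cidr, next_hop = max(matches)
    return name, cidr, next_hop

for dest in ("10.240.20.15", "10.240.10.7", "10.240.30.1"):
    print(dest, "->", select_route(dest))
```

Under this model a database address such as 10.240.20.15 lands on Route 2, an application address such as 10.240.10.7 lands on Route 3, and anything else in the VPC falls back to Route 1.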
To add to the troubleshooting advice, use the VPC Flow Logs to trace the actual path. Filter logs by source IP from your app subnet and destination IP in your database subnet during a failure window. Look for ‘reject’ or ‘drop’ entries. You can also use the ‘Test connectivity’ feature in the VPC console to simulate traffic between subnets and see which route gets applied. This will show you definitively whether Route 2 or Route 3 is intercepting your traffic when it should be using the local route.
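If you export the logs and want to filter them in a script, something along these lines could work. This is only a sketch: the record keys src_ip, dst_ip and action and the values "reject" and "drop" are assumptions borrowed from the shorthand used in this thread, so check the actual field names in your Flow Logs objects and adjust. The sample records are made up for illustration.

```python
import ipaddress

APP_SUBNET = ipaddress.ip_network("10.240.10.0/24")
DB_SUBNET = ipaddress.ip_network("10.240.20.0/24")

# Assumed action values that indicate a failed flow; adjust to your log format.
FAILED_ACTIONS = {"reject", "drop"}

def failed_app_to_db(records):
    """Return records going from the app subnet to the database subnet that were not accepted."""
    hits = []
    for rec in records:
        src = ipaddress.ip_address(rec["src_ip"])
        dst = ipaddress.ip_address(rec["dst_ip"])
        if src in APP_SUBNET and dst in DB_SUBNET and rec["action"] in FAILED_ACTIONS:
            hits.append(rec)
    return hits

# Hypothetical sample records, not real log output.
sample = [
    {"src_ip": "10.240.10.12", "dst_ip": "10.240.20.8", "action": "accept"},
    {"src_ip": "10.240.10.12", "dst_ip": "10.240.20.8", "action": "reject"},
    {"src_ip": "10.240.30.4", "dst_ip": "10.240.20.8", "action": "reject"},
]

for rec in failed_app_to_db(sample):
    print(rec)
```

Run it over records from a failure window and compare against a healthy window to see whether rejected app-to-database flows line up with the ERP timeouts.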