I’m encountering a critical issue with our OCI Compute instance after deploying a custom Oracle Linux 8 image. The instance was running fine on the standard platform image, but after creating and deploying our custom image with pre-configured batch job scripts, the instance fails to boot.
The custom image creation process seemed successful - I used the OCI Console to create the image from our working instance, verified the image was in ‘Available’ state, and launched a new VM.2 instance from it. However, the instance gets stuck in ‘Provisioning’ state and eventually shows as ‘Running’ but is completely unreachable via SSH.
I checked the boot diagnostics through the console, and the serial console output shows the boot process stopping right after kernel initialization. I suspect there might be a kernel compatibility issue, or that something in the boot configuration got corrupted during image creation:
Kernel panic - not syncing: VFS: Unable to mount root fs
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0
Our batch jobs are scheduled to run tonight and this is blocking our entire data processing pipeline. Has anyone experienced similar boot failures with custom OCI images?
Secondary mount points in fstab can definitely cause boot failures if they’re configured without the ‘nofail’ option. If the system tries to mount a volume that doesn’t exist in the new instance, or whose UUID has changed, it can hang or panic during boot. Either add ‘nofail’ to non-critical mounts or reference volumes by label rather than UUID in custom images.
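For reference, a secondary-volume entry in /etc/fstab would look something like this (the label, mount point, and filesystem type are placeholders for whatever you actually use):

# secondary block volume for batch data; nofail keeps a missing volume from blocking boot
LABEL=batchdata   /mnt/batchdata   xfs   defaults,nofail   0 2

You can set a label on an existing XFS filesystem with xfs_admin -L batchdata /dev/<device> while it’s unmounted (e2label for ext4), and for iSCSI-attached block volumes Oracle’s docs also suggest adding _netdev to the mount options.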
I’d recommend checking the cloud-init logs if you can access them through the serial console. Sometimes custom images fail because the cloud-init configuration conflicts with OCI’s initialization process. Also verify that your custom image didn’t include any instance-specific network configurations or SSH keys that would prevent proper initialization in the new instance context. Kernel compatibility between your custom configuration and OCI’s hypervisor is another area to investigate; some custom kernel modules or parameters might not be supported.
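If the serial console never gets you to a shell, one way to inspect those logs is to detach the boot volume, attach it to a healthy instance as a data volume, and mount it read-only. A rough sketch - the device and partition names below are placeholders you’d confirm with lsblk:

lsblk                                            # locate the root partition of the attached boot volume
sudo mkdir -p /mnt/rescue
sudo mount -o ro,nouuid /dev/sdb3 /mnt/rescue    # adjust the partition; nouuid avoids XFS UUID clashes if both volumes came from the same base image
less /mnt/rescue/var/log/cloud-init.log
less /mnt/rescue/var/log/cloud-init-output.log
cat /mnt/rescue/etc/fstab
ls /mnt/rescue/etc/sysconfig/network-scripts/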
Had this exact scenario last month with our batch processing setup. What worked for us was recreating the custom image with a clean preparation process: remove any instance-specific configuration from /etc/fstab, clear the cloud-init state with ‘cloud-init clean’, verify there are no hardcoded network settings in /etc/sysconfig/network-scripts, then stop the instance gracefully and create the image. Also make sure your batch job scripts reference mount points by label, not UUID.
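Roughly the sequence we ran on the source instance before creating the image - adjust labels, editor, and paths to your own setup:

sudo cloud-init clean --logs               # reset cloud-init state so it re-runs fully on the new instance's first boot
sudo vi /etc/fstab                         # add nofail to secondary mounts and switch UUID= entries to LABEL=
ls /etc/sysconfig/network-scripts/         # confirm there are no hardcoded ifcfg-* files left over
sudo shutdown -h now                       # stop gracefully, then create the custom image from the Console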