I’m encountering a critical issue with our OCI Compute instance after deploying a custom Oracle Linux 8 image. The instance was running fine on the standard platform image, but after creating and deploying our custom image with pre-configured batch job scripts, the instance fails to boot.
The custom image creation process seemed successful - I used the OCI Console to create the image from our working instance, verified the image was in ‘Available’ state, and launched a new VM.2 instance from it. However, the instance gets stuck in ‘Provisioning’ state and eventually shows as ‘Running’ but is completely unreachable via SSH.
I checked the boot diagnostics through the console, and the serial console output shows the boot process stopping right after kernel initialization. I suspect there might be a kernel compatibility issue, or that something in the boot configuration got corrupted during image creation:
Kernel panic - not syncing: VFS: Unable to mount root fs
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0
Our batch jobs are scheduled to run tonight and this is blocking our entire data processing pipeline. Has anyone experienced similar boot failures with custom OCI images?
Secondary mount points in fstab can definitely cause boot failures if they’re configured without the ‘nofail’ option. If the system tries to mount a volume that doesn’t exist in the new instance, or whose UUID has changed, it can hang or panic during boot. Either add ‘nofail’ to non-critical mounts or reference volumes by label rather than UUID in custom images.
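For reference, a secondary-volume entry in /etc/fstab would look something like this (the label, mount point, and filesystem type are placeholders for whatever you actually use):

# secondary block volume for batch data; nofail keeps a missing volume from blocking boot
LABEL=batchdata   /mnt/batchdata   xfs   defaults,nofail   0 2

You can set a label on an existing XFS filesystem with xfs_admin -L batchdata /dev/<device> while it’s unmounted (e2label for ext4), and for iSCSI-attached block volumes Oracle’s docs also suggest adding _netdev to the mount options.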
I’d recommend checking the cloud-init logs if you can access them through the serial console. Sometimes custom images fail because the cloud-init configuration conflicts with OCI’s initialization process. Also verify that your custom image didn’t include any instance-specific network configurations or SSH keys that would prevent proper initialization in the new instance context. Kernel compatibility between your custom configuration and OCI’s hypervisor is another area to investigate; some custom kernel modules or parameters might not be supported.
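If the serial console never gets you to a shell, one way to inspect those logs is to detach the boot volume, attach it to a healthy instance as a data volume, and mount it read-only. A rough sketch - the device and partition names below are placeholders you’d confirm with lsblk:

lsblk                                            # locate the root partition of the attached boot volume
sudo mkdir -p /mnt/rescue
sudo mount -o ro,nouuid /dev/sdb3 /mnt/rescue    # adjust the partition; nouuid avoids XFS UUID clashes if both volumes came from the same base image
less /mnt/rescue/var/log/cloud-init.log
less /mnt/rescue/var/log/cloud-init-output.log
cat /mnt/rescue/etc/fstab
ls /mnt/rescue/etc/sysconfig/network-scripts/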
Had this exact scenario last month with our batch processing setup. What worked for us was recreating the custom image with a clean preparation process: remove any instance-specific configuration from /etc/fstab, clear the cloud-init state with ‘cloud-init clean’, verify there are no hardcoded network settings in /etc/sysconfig/network-scripts, then stop the instance gracefully and create the image. Also make sure your batch job scripts reference mount points by label, not UUID.
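Roughly the sequence we ran on the source instance before creating the image - adjust labels, editor, and paths to your own setup:

sudo cloud-init clean --logs               # reset cloud-init state so it re-runs fully on the new instance's first boot
sudo vi /etc/fstab                         # add nofail to secondary mounts and switch UUID= entries to LABEL=
ls /etc/sysconfig/network-scripts/         # confirm there are no hardcoded ifcfg-* files left over
sudo shutdown -h now                       # stop gracefully, then create the custom image from the Console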