Serverless batch processing with OCI Functions and Object Storage for ML inference on compute-intensive datasets

We recently migrated our nightly batch processing workload from traditional compute instances to a fully serverless architecture using OCI Functions and Object Storage. The use case involved processing large CSV files (500MB-2GB) that vendors upload daily to our landing bucket.

The implementation leverages Object Storage event triggers to automatically invoke functions when new files arrive. We packaged our Python-based processing logic into container images deployed as OCI Functions, which parse the CSVs, validate data, and write results to a database.
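
In sketch form, the handler looks something like this. It's simplified and not our exact production code: `validate_and_write` is a hypothetical stand-in for our validation and database logic, and the payload fields shown are the ones the Object Storage create event carries.

```python
import csv
import io
import json

import oci
from fdk import response


def validate_and_write(row: dict) -> None:
    """Hypothetical stand-in for the real schema validation + DB insert."""
    if not row:
        raise ValueError("empty row")


def handler(ctx, data: io.BytesIO = None):
    # The Events payload tells us which object landed where.
    details = json.loads(data.getvalue())["data"]
    namespace = details["additionalDetails"]["namespace"]
    bucket = details["additionalDetails"]["bucketName"]
    object_name = details["resourceName"]

    # Resource principal auth: the function's dynamic group needs
    # read access to the landing bucket.
    signer = oci.auth.signers.get_resource_principals_signer()
    storage = oci.object_storage.ObjectStorageClient(config={}, signer=signer)

    # Stream the CSV rather than buffering the whole file in memory.
    obj = storage.get_object(namespace, bucket, object_name)
    reader = csv.DictReader(io.TextIOWrapper(obj.data.raw, encoding="utf-8"))

    rows = 0
    for row in reader:
        validate_and_write(row)
        rows += 1

    return response.Response(
        ctx, response_data=json.dumps({"object": object_name, "rows": rows})
    )
```

Streaming the body is what keeps memory flat even at the 2GB end of our file size range.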

The serverless scaling has been impressive: functions spin up as files arrive, process in parallel, and shut down automatically. We’ve eliminated manual scheduling and significantly reduced operational overhead. Average processing time dropped from 45 minutes to 12 minutes, largely due to parallel execution.

Happy to share our architecture patterns and lessons learned for anyone considering similar serverless batch implementations.

Also curious about monitoring. With traditional VMs we had standard OS metrics and logs. How do you monitor function performance and troubleshoot failures in this serverless setup? Are you using OCI Logging and Monitoring services, or something else?

The chunking approach is smart. How are you managing the Object Storage event triggers? We’ve had issues with duplicate events in the past when using cloud events. Do you have any deduplication logic in your functions?

Good catch - we did implement idempotency handling. Each function checks a processing status table at the start, using the object name and upload timestamp as a composite key. If a record already exists with status 'processing' or 'completed', the function exits gracefully. We also use Object Storage tags to mark processed files.
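
Here's a simplified sketch of one race-safe way to do that claim step: a unique constraint over the composite key turns check-then-insert into a single atomic insert, so two concurrent invocations can't both pass the check. Table name and driver (python-oracledb) are illustrative; adapt to your own database.

```python
import oracledb  # illustrative driver; any DB-API driver works the same way


def try_claim(conn, object_name: str, upload_time: str) -> bool:
    """Claim (object_name, upload_time) for processing, or back off.

    Assumes a unique constraint over the composite key, so a duplicate
    event delivery trips the constraint instead of racing a separate
    SELECT-then-INSERT check.
    """
    try:
        with conn.cursor() as cur:
            cur.execute(
                """INSERT INTO file_processing_status
                       (object_name, upload_time, status)
                   VALUES (:1, :2, 'processing')""",
                [object_name, upload_time],
            )
        conn.commit()
        return True  # we own this file; proceed with parsing
    except oracledb.IntegrityError:
        conn.rollback()
        return False  # already 'processing' or 'completed'; exit gracefully
```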

For the event triggers themselves, we configured Events Service rules that filter on ObjectCreated events with a prefix match for our landing bucket path. The rule invokes the function through an action. We haven’t seen duplicate processing since adding the idempotency checks, though we do occasionally see duplicate event deliveries logged.
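
For reference, the rule condition is plain JSON along these lines (bucket name and object prefix are placeholders, and the wildcard is what gives us the prefix match):

```python
import json

# Illustrative Events rule condition: match object-create events in the
# landing bucket whose name falls under the incoming/ prefix. Paste the
# resulting JSON into the rule's condition in advanced mode.
condition = json.dumps(
    {
        "eventType": ["com.oraclecloud.objectstorage.createobject"],
        "data": {
            "additionalDetails": {"bucketName": "vendor-landing"},
            "resourceName": "incoming/*.csv",
        },
    },
    indent=2,
)
print(condition)
```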

This sounds like exactly what we need! We’re currently running cron jobs on compute instances for similar file processing. A few questions: How did you handle the container packaging for OCI Functions? Did you use the Fn Project CLI or Docker directly? Also, what’s the maximum execution time you’ve seen for a single function invocation?