A Structured Streaming job deployed to production is incurring higher-than-expected cloud storage costs. At present, during normal execution, each micro-batch of data is processed in less than 3 seconds, and at least 12 times per minute a micro-batch is processed that contains 0 records. The streaming write was configured using the default trigger settings. The production job is currently scheduled alongside many other Databricks jobs in a workspace with instance pools provisioned to reduce start-up time for jobs with batch execution. Holding all other variables constant and assuming records need to be processed in less than 10 minutes, which adjustment will meet the requirement?
Correct Answer: D
Explanation:
Exact extract: "If no trigger is specified, the default processing-time trigger runs micro-batches as fast as possible."
Exact extract: "Trigger once processes all available data once and then stops."
Exact extract: "Job clusters are created for a job run and terminate when the job completes."
Because the default trigger runs micro-batches as fast as possible, the job executes many empty micro-batches (at least 12 per minute here), and each one lists or queries cloud storage, inflating storage and metadata API costs. Switching to trigger(once=True) and scheduling the job to run every 10 minutes processes all available data in a single batch per run and then stops. This meets the 10-minute processing requirement while minimizing both compute (the cluster can shut down between runs) and storage API calls (one batch per run instead of a continual stream of empty batches).
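The adjustment described above can be sketched in PySpark as follows. This is a minimal illustration under stated assumptions, not the exam's reference solution: the function name, the paths, and the Delta sink format are hypothetical, and `stream_df` is assumed to be a streaming DataFrame created elsewhere (for example with `spark.readStream`). The arithmetic at the end uses only the figures given in the question to compare how often storage is touched in each setup. Recent Spark releases also provide `trigger(availableNow=True)`, which serves a similar purpose.

```python
# Sketch only: names and paths are hypothetical, not taken from the question.

def write_available_data_once(stream_df, target_path, checkpoint_path):
    """Start a streaming write that processes all available data, then stops.

    `stream_df` is assumed to be a streaming DataFrame.
    """
    return (
        stream_df.writeStream
        .format("delta")                                # assumed sink format
        .option("checkpointLocation", checkpoint_path)  # tracks progress across runs
        .trigger(once=True)                             # one batch per run, then stop
        .start(target_path)
    )


def batches_per_day(batches_per_minute):
    """Convert a per-minute batch rate into a per-day count."""
    return batches_per_minute * 60 * 24


# Default trigger: the question reports at least 12 empty micro-batches per minute.
empty_batches_default = batches_per_day(12)

# Trigger once, with the job scheduled every 10 minutes: one batch per run.
runs_scheduled = (24 * 60) // 10

print(empty_batches_default, runs_scheduled)
```

With the question's numbers, the default trigger produces at least 17,280 empty micro-batches per day, while the scheduled trigger-once job runs 144 batches per day. The checkpoint location is what lets each scheduled run resume from where the previous run stopped.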