For Structured Streaming to reliably track the exact progress of processing, so that it can handle any kind of failure by restarting and/or reprocessing, which two of the following approaches does Spark use to record the offset range of the data being processed in each trigger?
Correct Answer: A
Structured Streaming uses checkpointing and write-ahead logs to record the offset range of the data being processed in each trigger. Together they let the engine track exactly how far processing has progressed, so that after any kind of failure it can restart and/or reprocess the affected data. Checkpointing saves the state of a streaming query to fault-tolerant storage (such as HDFS) so that the query can be recovered after a failure. Write-ahead logs are files that record the offset range of the data for each trigger; they are written to the checkpoint location before processing of that range starts. After a failure, the engine reads these logs to recover the query state and resume processing from the last recorded offset range. References: Structured Streaming Programming Guide, Fault Tolerance Semantics
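The recovery behaviour described above can be illustrated with a small toy model in plain Python. This is only a sketch under stated assumptions, not Spark's implementation: the class `ToyMicroBatchEngine`, its parameters, and the dictionary used as a sink are hypothetical, and the on-disk layout (an `offsets` directory written before processing and a `commits` directory written after) is a simplified imitation of a checkpoint location.

```python
import json
import tempfile
from pathlib import Path


class ToyMicroBatchEngine:
    """Toy model of offset tracking (hypothetical, not Spark's code)."""

    def __init__(self, checkpoint_dir, source):
        self.source = source
        self.offsets_dir = Path(checkpoint_dir) / "offsets"
        self.commits_dir = Path(checkpoint_dir) / "commits"
        self.offsets_dir.mkdir(parents=True, exist_ok=True)
        self.commits_dir.mkdir(parents=True, exist_ok=True)

    @staticmethod
    def _batch_ids(directory):
        # Each file is named after the batch id it belongs to.
        return sorted(int(p.name) for p in directory.iterdir())

    def _read_range(self, batch_id):
        return json.loads((self.offsets_dir / str(batch_id)).read_text())

    def run_trigger(self, sink, batch_size=2, fail_before_commit=False):
        planned = self._batch_ids(self.offsets_dir)
        committed = set(self._batch_ids(self.commits_dir))

        if planned and planned[-1] not in committed:
            # A range was logged but never committed: replay exactly that range.
            batch_id = planned[-1]
            rng = self._read_range(batch_id)
        else:
            batch_id = planned[-1] + 1 if planned else 0
            start = self._read_range(planned[-1])["end"] if planned else 0
            end = min(start + batch_size, len(self.source))
            if start == end:
                return None  # no new data for this trigger
            rng = {"start": start, "end": end}
            # Write-ahead step: log the offset range BEFORE processing it.
            (self.offsets_dir / str(batch_id)).write_text(json.dumps(rng))

        # "Process" the batch. Keying the sink by batch id makes a replay
        # overwrite the earlier attempt instead of duplicating it.
        sink[batch_id] = self.source[rng["start"]:rng["end"]]

        if fail_before_commit:
            raise RuntimeError("simulated crash before commit")

        # Mark the batch as done only after processing succeeded.
        (self.commits_dir / str(batch_id)).write_text("{}")
        return batch_id


source = ["a", "b", "c", "d", "e"]
sink = {}
checkpoint = tempfile.mkdtemp()

engine = ToyMicroBatchEngine(checkpoint, source)
engine.run_trigger(sink)  # batch 0 is logged, processed, and committed
try:
    engine.run_trigger(sink, fail_before_commit=True)  # batch 1 crashes
except RuntimeError:
    pass

# "Restart": a fresh engine that only knows the checkpoint directory.
restarted = ToyMicroBatchEngine(checkpoint, source)
restarted.run_trigger(sink)  # replays the logged range of batch 1
restarted.run_trigger(sink)  # continues with batch 2
```

Because the offset range is logged before processing begins, the restarted engine in this sketch reprocesses the same range that was in flight when the crash occurred rather than guessing where to resume.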