Databricks-Machine-Learning-Professional exam question, shared by EduDump.com.
A Machine Learning Engineer has a computer vision model in Databricks Model Serving that obscures sensitive data in images. Internal teams use the model throughout the work week when they request access to new files. Recently, model users complained that the model takes much longer to respond in the morning. This coincides with when people arrive at work and request files for the day. When the engineer reviews the endpoint health metrics, they see that P50 model latency peaks at 20 seconds around 9 AM. Request rate also peaks at 9 AM, at 15 requests/second. GPU utilization is over 60%, GPU memory usage is over 50%, and provisioned concurrency stays at 4 throughout the day. What can the engineer do to reduce user wait time when the request rate peaks at 9 AM each morning?
Correct Answer: B
The symptoms indicate a concurrency bottleneck during the 9 AM traffic spike: the request rate rises sharply to 15 requests/second, P50 latency jumps to 20 seconds, and provisioned concurrency stays fixed at 4. With only 4 requests processed at a time, the remaining requests wait in a queue, and that waiting time is what users experience as slowness. Increasing workload_size scales the endpoint horizontally so it can handle more concurrent requests, which reduces queueing and brings down user-perceived wait time during peak demand.
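The queueing argument above can be checked with rough arithmetic based on Little's law (average requests in flight = arrival rate × time per request). The following is a minimal sketch, not a sizing tool: the peak rate and provisioned concurrency come from the question, but the per-request service time is not given there, so `ASSUMED_SERVICE_TIME_S` is a hypothetical value chosen only for illustration, and the function names are likewise made up for this example.

```python
import math


def required_concurrency(request_rate_per_s: float, service_time_s: float) -> int:
    """Little's law: requests in flight = arrival rate * time per request."""
    return math.ceil(request_rate_per_s * service_time_s)


def max_throughput(concurrency: int, service_time_s: float) -> float:
    """Highest sustainable request rate for a given number of concurrent slots."""
    return concurrency / service_time_s


PEAK_RATE_PER_S = 15.0         # from the question
PROVISIONED_CONCURRENCY = 4    # from the question
ASSUMED_SERVICE_TIME_S = 1.0   # hypothetical: not stated in the question

if __name__ == "__main__":
    capacity = max_throughput(PROVISIONED_CONCURRENCY, ASSUMED_SERVICE_TIME_S)
    needed = required_concurrency(PEAK_RATE_PER_S, ASSUMED_SERVICE_TIME_S)
    print(f"Sustainable rate at concurrency 4: {capacity} requests/second")
    print(f"Concurrency needed for the 9 AM peak: {needed}")
```

Under that assumed service time, 4 concurrent slots sustain only about 4 requests/second against an arrival rate of 15, so the queue grows for as long as the peak lasts. This is consistent with the explanation that more concurrency, rather than a faster model, is what reduces the wait.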