Databricks-Machine-Learning-Professional Exam Dumps


Question 72/96

A Machine Learning Engineer has a computer vision model in Databricks Model Serving that obscures sensitive data in images. Internal teams use the model throughout the work week when they request access to new files. Recently, users complained that the model takes much longer to respond in the morning, which coincides with people arriving at work and requesting files for the day.

When the engineer reviews the endpoint health metrics, they see that P50 model latency peaks at 20 seconds around 9 AM. Request rate also peaks around 9 AM, at 15 requests/second. Throughout the day, GPU utilization is over 60%, GPU memory usage is over 50%, and provisioned concurrency is 4.

What can the engineer do to reduce user wait time when the request rate peaks at 9 AM each morning?

A. The endpoint is constrained by its ability to handle simultaneous requests. Enable "scale_to_zero" on the endpoint to spin up additional endpoints quickly to handle the requests.

B. The endpoint is constrained by its ability to handle simultaneous requests. Scale the endpoint horizontally by editing the endpoint workload_size from "Small" to "Medium" to increase the model concurrency.

C. The endpoint is constrained by its ability to handle simultaneous requests. Define a rate_limit on the endpoint to spread out the requests over the course of the day.

D. The endpoint is constrained by the GPU. Scale the endpoint vertically by changing the endpoint workload_type from "GPU_SMALL" to "GPU_MEDIUM".
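
The metrics in the question can be sanity-checked with a little queueing arithmetic. The sketch below applies Little's law (requests in the system = arrival rate × time in the system) to the stated numbers. It assumes the P50 latency is a rough stand-in for mean time in the system and that the reported latency includes time spent waiting; neither is stated in the question, and the function names are hypothetical.

```python
def in_flight_requests(arrival_rate_per_s: float, latency_s: float) -> float:
    """Little's law: average requests in the system = arrival rate * time in system."""
    return arrival_rate_per_s * latency_s


def max_sustainable_rate(concurrency: int, service_time_s: float) -> float:
    """Requests/second that `concurrency` parallel slots can absorb
    when each request occupies a slot for `service_time_s` seconds."""
    return concurrency / service_time_s


# Figures taken from the question.
peak_rate = 15.0      # requests/second at 9 AM
p50_latency = 20.0    # seconds at 9 AM (assumed to approximate the mean)
concurrency = 4       # provisioned concurrency throughout the day

# Roughly how many requests are in the system at the peak.
backlog = in_flight_requests(peak_rate, p50_latency)

# Per-request service time the 4 slots would need to keep up with the peak.
required_service_time = concurrency / peak_rate

print(f"Requests in system at peak: {backlog:.0f}")
print(f"Service time needed to keep up: {required_service_time:.3f} s")
```

Under these assumptions, far more requests are in the system at the peak than there are concurrent slots, so the comparison to make across the options is which change actually raises the number of requests the endpoint can process at once.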


