You are developing a custom image classification model in Python. You plan to run your training application on Vertex AI. Your input dataset contains several hundred thousand small images. You need to determine how to store and access the images for training. You want to maximize data throughput and minimize training time while reducing the amount of additional code. What should you do?
Correct Answer: B
Cloud Storage is a scalable and cost-effective storage service for any type of data. Storing image files in Cloud Storage lets you access them from anywhere and avoids the overhead of managing your own storage infrastructure. However, reading several hundred thousand small files individually from Cloud Storage is slow and inefficient for large-scale training, because each file incurs a separate request. A better option is to use serialized records, such as TFRecord or Apache Avro: binary formats that pack many images and their labels into a single file. Serialized records improve data throughput and reduce per-file network latency, and they also support compression and sharding. You can use the TensorFlow or Apache Beam APIs to create and read serialized records in Cloud Storage. This solution requires minimal code changes and can significantly reduce training time. References:
* Cloud Storage | Google Cloud
* TFRecord and tf.Example | TensorFlow Core
* Apache Avro 1.10.2 Specification
* Using Apache Beam with Cloud Storage | Cloud Storage
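To illustrate why serialized records help, here is a minimal, hedged sketch using only the Python standard library. It is not the actual TFRecord wire format (TFRecord adds CRC checksums and is written via `tf.io.TFRecordWriter`); it is a simplified stand-in that shows the core idea: many small (label, image-bytes) examples are packed into one length-prefixed binary file that can be read back sequentially, so the trainer issues one large read instead of hundreds of thousands of small ones. All function names here are illustrative, not part of any Google Cloud API.

```python
import struct

def write_records(path, examples):
    """Pack (label, image_bytes) pairs into one binary file.

    Simplified stand-in for a TFRecord shard: each record is
    [4-byte signed label][4-byte unsigned payload length][payload].
    """
    with open(path, "wb") as f:
        for label, payload in examples:
            f.write(struct.pack("<iI", label, len(payload)))
            f.write(payload)

def read_records(path):
    """Yield (label, image_bytes) pairs back from a record file."""
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # end of file
            label, length = struct.unpack("<iI", header)
            yield label, f.read(length)
```

In real training code you would instead write `tf.train.Example` protos with `tf.io.TFRecordWriter` (sharded, e.g. a few hundred MB per shard) and stream them with `tf.data.TFRecordDataset` pointed at `gs://...` paths, which applies the same batching-of-small-files principle with prefetching and parallel reads built in.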