DSA-C03 Exam Dumps | A data science team at a retail company is using Snowflake to store customer transaction data'. They

<< Prev Question Next Question >>

Question 92/143

A data science team at a retail company is using Snowflake to store customer transaction data'. They want to segment customers based on their purchasing behavior using K-means clustering. Which of the following approaches is MOST efficient for performing K-means clustering on a very large customer dataset in Snowflake, minimizing data movement and leveraging Snowflake's compute capabilities, and adhering to best practices for data security and governance?

A. Exporting the entire customer transaction dataset from Snowflake to an external Python environment, performing K-means clustering using scikit-learn, and then importing the cluster assignments back into Snowflake as a new table. This approach involves significant data egress and potential security risks.

B. Using a Snowflake User-Defined Function (UDF) written in Python that leverages the scikit-learn library within the UDF to perform K-means clustering directly on the data within Snowflake. Ensure the UDF is called with appropriate resource allocation (WAREHOUSE SIZE) and security context.

C. Implementing K-means clustering using SQL queries with iterative JOINs and aggregations to calculate centroids and assign data points to clusters. This approach is computationally expensive and not recommended for large datasets. Moreover, security considerations are minimal.

D. Using Snowflake's Snowpark DataFrame API with a Python UDF to preprocess the data and execute the K-means algorithm within the Snowflake environment. This approach allows for scalable processing within Snowflake's compute resources with data kept securely within the governance boundaries.

E. Employing only Snowflake's SQL capabilities to perform approximate nearest neighbor searches without implementing the full K-means algorithm. This compromises the accuracy and effectiveness of the clustering results.

Question 92/143

LEAVE A REPLY

Download PDF File