DSA-C03 Exam Dumps | You are tasked with creating a new feature in a machine learning model for predicting customer lifetime

Question 68/143

You are tasked with creating a new feature in a machine learning model for predicting customer lifetime value. You have access to a table called 'CUSTOMER_ORDERS', which contains the order history for each customer. This table contains the following columns: 'CUSTOMER_ID', 'ORDER_DATE', and 'ORDER_AMOUNT'. To improve model performance and reduce the impact of outliers, you plan to bin the 'ORDER_AMOUNT' column using quantiles. You decide to create 5 bins, effectively creating quintiles. You also want to create a derived feature indicating whether the customer's latest order amount falls in the top quintile. Which of the following approaches, or combination of approaches, is most appropriate and efficient for achieving this in Snowflake? (Choose all that apply)

A. Use the 'NTILE' window function to create quintiles for 'ORDER_AMOUNT' and then, in a separate query, check whether the latest 'ORDER_AMOUNT' for each customer falls within the NTILE that represents the top quintile.

B. Calculate the 20th, 40th, 60th, and 80th percentiles of 'ORDER_AMOUNT' using 'APPROX_PERCENTILE' or 'PERCENTILE_CONT', and then use a 'CASE' statement to assign each order to a quantile bin. Then check whether the order on the customer's latest date falls in the top quintile.

C. Use a Snowflake UDF (User-Defined Function) written in Python or Java to calculate the quantiles and assign each 'ORDER_AMOUNT' to a bin. Later, you can use another statement to check for the top-quintile amount in the result set.

D. Create a temporary table storing the quintile information, then join this table to the original table to find the top-quintile order amounts.

E. Use the 'WIDTH_BUCKET' function after finding the quantile boundaries with 'APPROX_PERCENTILE' or 'PERCENTILE_CONT'. Use MAX(ORDER to determine whether the most recent amount is in the top quantile.
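
The window-function idea described in option A can be sketched as follows. This is only an illustration, not a statement of which options are correct: it runs the SQL against an in-memory SQLite database through Python's standard library rather than against Snowflake, on the assumption that the bundled SQLite is version 3.25 or later (the first release with window functions). The sample rows, the helper name latest_order_flags, and the output column names are hypothetical; only the table and column names come from the question. The query itself uses NTILE and ROW_NUMBER, which Snowflake also provides, but it has not been checked there.

```python
import sqlite3

# Hypothetical sample data: (CUSTOMER_ID, ORDER_DATE, ORDER_AMOUNT).
ORDERS = [
    (1, "2024-01-05", 20.0),
    (1, "2024-03-01", 500.0),
    (2, "2024-02-10", 35.0),
    (2, "2024-02-20", 40.0),
    (3, "2024-01-15", 300.0),
    (3, "2024-03-10", 15.0),
    (4, "2024-02-01", 60.0),
    (4, "2024-03-05", 450.0),
    (5, "2024-01-20", 80.0),
    (5, "2024-02-25", 100.0),
]

QUERY = """
WITH binned AS (
    SELECT
        CUSTOMER_ID,
        ORDER_DATE,
        ORDER_AMOUNT,
        -- Quintile 1 holds the smallest amounts, quintile 5 the largest.
        NTILE(5) OVER (ORDER BY ORDER_AMOUNT) AS AMOUNT_QUINTILE,
        -- Rank each customer's orders from newest to oldest.
        ROW_NUMBER() OVER (
            PARTITION BY CUSTOMER_ID ORDER BY ORDER_DATE DESC
        ) AS RECENCY_RANK
    FROM CUSTOMER_ORDERS
)
SELECT
    CUSTOMER_ID,
    ORDER_AMOUNT AS LATEST_ORDER_AMOUNT,
    CASE WHEN AMOUNT_QUINTILE = 5 THEN 1 ELSE 0 END AS LATEST_IN_TOP_QUINTILE
FROM binned
WHERE RECENCY_RANK = 1
ORDER BY CUSTOMER_ID
"""


def latest_order_flags(orders):
    """Return (customer, latest amount, top-quintile flag) for each customer."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE CUSTOMER_ORDERS ("
            "CUSTOMER_ID INTEGER, ORDER_DATE TEXT, ORDER_AMOUNT REAL)"
        )
        conn.executemany("INSERT INTO CUSTOMER_ORDERS VALUES (?, ?, ?)", orders)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_order_flags(ORDERS):
        print(row)
```

With these sample rows, the two largest amounts (450 and 500) form the top quintile, so only customers 1 and 4 are flagged. The sketch computes the quintile over all orders before filtering to the latest one; computing it after the filter would bin only the latest orders, which is a different feature.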
