DSA-C03 Exam Dumps | You are preparing a dataset in Snowflake for a K-means clustering algorithm. The dataset includes features


Question 119/143

You are preparing a dataset in Snowflake for a K-means clustering algorithm. The dataset includes features like 'age', 'income' (in USD), and 'number_of_transactions'. 'Income' has significantly larger values than 'age' and 'number_of_transactions'. To ensure that all features contribute equally to the distance calculations in K-means, which of the following scaling approaches should you consider, and why? Select all that apply:

A. Apply StandardScaler to all three features ('age', 'income', 'number_of_transactions') to center the data around zero and scale it to unit variance.

B. Apply MinMaxScaler to all three features to scale them to a range between 0 and 1.

C. Do not scale the data, as K-means is robust to differences in feature scales.

D. Apply RobustScaler to handle outliers and then StandardScaler or MinMaxScaler to further scale the features.

E. Apply PowerTransformer to transform 'income' and StandardScaler to the other features to handle skewness.
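To make the scale issue behind the question concrete, here is a minimal sketch in plain Python. It is an illustration only, not a Snowflake or scikit-learn API call: the three customer rows are made-up values, and the helper names `euclidean`, `standardize`, and `min_max` are hypothetical. The helpers reimplement the arithmetic usually associated with standard scaling (zero mean, unit variance) and min-max scaling (range 0 to 1), so you can compare Euclidean distances before and after scaling.

```python
import math
import statistics

# Hypothetical customers: (age, income in USD, number_of_transactions).
# These values are invented for illustration only.
rows = [
    (25, 30_000, 5),   # young, low income, few transactions
    (60, 31_000, 40),  # older, similar income, many transactions
    (27, 90_000, 6),   # young, high income, few transactions
]

def euclidean(a, b):
    """Euclidean distance, the metric K-means minimizes within clusters."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def standardize(data):
    """Center each column on zero and scale it to unit (population) variance."""
    cols = list(zip(*data))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(r, means, stds)) for r in data]

def min_max(data):
    """Rescale each column linearly to the range [0, 1]."""
    cols = list(zip(*data))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(r, lows, highs))
        for r in data
    ]

# Unscaled: the income column dominates, so customer 0 looks closest to
# customer 1 even though they differ sharply in age and transactions.
raw_01 = euclidean(rows[0], rows[1])
raw_02 = euclidean(rows[0], rows[2])

# Scaled: every feature contributes on a comparable footing.
z = standardize(rows)
m = min_max(rows)

print("raw       d(0,1), d(0,2):", raw_01, raw_02)
print("standard  d(0,1), d(0,2):", euclidean(z[0], z[1]), euclidean(z[0], z[2]))
print("min-max   d(0,1), d(0,2):", euclidean(m[0], m[1]), euclidean(m[0], m[2]))
```

On this toy data the nearest neighbor of the first customer changes once the features are scaled, which is the effect the question is probing. Note that both helpers assume every column has some spread; a constant column would cause a division by zero and would need special handling.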
