Valid Databricks-Certified-Professional-Data-Scientist Dumps shared by ExamDiscuss.com for Helping Passing Databricks-Certified-Professional-Data-Scientist Exam! ExamDiscuss.com now offer the newest Databricks-Certified-Professional-Data-Scientist exam dumps, the ExamDiscuss.com Databricks-Certified-Professional-Data-Scientist exam questions have been updated and answers have been corrected get the newest ExamDiscuss.com Databricks-Certified-Professional-Data-Scientist dumps with Test Engine here:
You are working with the Clustering solution of the customer datasets. There are almost 40 variables are available for each customer and almost 1.00,0000 customer's data is available. You want to reduce the number of variables for clustering, what would you do?
Correct Answer: C,E
Explanation When you are applying clustering technique and you find that there are quite a huge number of variables are available. Then it is better the find the co-relation among the variables and consider only one or two variables from the highly co-related variables. Because highly co-related variable will have the same effect, while creating the cluster. We can use scatter plot matrix among the variables to find the co-relation. You can also combine several variables into a single variable. For example if you have two values in the dataset like Asset and Debt than by combining these two values like Debt to Asset ratio and use it while creating the cluster.