You are designing a solution that will use Apache HBase on Microsoft Azure HDInsight. You need to design the row keys for the database to ensure that client traffic is directed over all of the nodes in the cluster. What are two possible techniques that you can use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.
Correct Answer: C,D
Explanation/Reference:
Explanation: There are two strategies that you can use to avoid hotspotting:
* Hashing keys: To spread write and insert activity across the cluster, you can randomize sequentially generated keys by hashing the keys or inverting the byte order. These strategies come with trade-offs. Hashing keys, for example, makes table scans for key subranges inefficient, since the subrange is spread across the cluster.
* Salting keys: Instead of hashing the key, you can salt the key by prepending a few bytes of the hash of the key to the actual key.
Note: Salted Apache HBase tables with pre-splitting are a proven, effective way to distribute the workload uniformly across RegionServers and prevent hot spots during bulk writes. In this design, a row key is made of a salt followed by the logical key. One way to generate the salt is to take the hash code of the logical row key (date, etc.) modulo n, where n is the number of regions.
Reference:
https://blog.cloudera.com/blog/2015/06/how-to-scan-salted-apache-hbase-tables-with-region-specific-key-ranges-in-mapreduce/
http://maprdocs.mapr.com/51/MapR-DB/designing_row_keys_for_mapr_db_binary_tables.html
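The two row-key strategies from the explanation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not HBase client code: the function names (`salt_bucket`, `salted_row_key`, `hashed_row_key`, `scan_prefixes`), the choice of MD5 as the hash, and the two-digit `NN-` prefix format are all hypothetical and were picked for readability. The salt is the hash of the logical key modulo the number of regions, as described in the note.

```python
import hashlib


def salt_bucket(logical_key: str, num_regions: int) -> int:
    """Return hash(logical_key) mod num_regions using a stable hash.

    MD5 is an assumption here; any hash that is stable across clients works.
    """
    digest = hashlib.md5(logical_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_regions


def salted_row_key(logical_key: str, num_regions: int) -> bytes:
    """Salting: prepend the bucket number so sequential keys spread out.

    The two-digit prefix assumes num_regions <= 100 (illustrative format).
    """
    bucket = salt_bucket(logical_key, num_regions)
    return f"{bucket:02d}-{logical_key}".encode("utf-8")


def hashed_row_key(logical_key: str) -> bytes:
    """Hashing: replace the key with its digest (original order is lost)."""
    return hashlib.md5(logical_key.encode("utf-8")).digest()


def scan_prefixes(num_regions: int) -> list:
    """Prefixes a reader must scan to cover every salt bucket."""
    return [f"{b:02d}-".encode("utf-8") for b in range(num_regions)]


if __name__ == "__main__":
    # Sequential, date-like keys land in different buckets.
    for day in range(1, 6):
        print(salted_row_key(f"2015-06-{day:02d}", num_regions=8))
```

The trade-off mentioned in the explanation shows up directly: a range scan over logical keys now needs one scan per prefix returned by `scan_prefixes`, because neighbouring logical keys no longer sit next to each other in the table.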