You are creating a data model in BigQuery that will hold retail transaction data. Your two largest tables, sales_transaction_header and sales_transaction_line, have a tightly coupled, immutable relationship. These tables are rarely modified after load and are frequently joined when queried. You need to model the sales_transaction_header and sales_transaction_line tables to improve the performance of data analytics queries. What should you do?
Correct Answer: B
BigQuery supports nested and repeated fields, which are complex data types that can represent hierarchical and one-to-many relationships within a single table. By using nested and repeated fields, you can denormalize your data model and reduce the number of joins required by your queries. This can improve query performance, because joins between large tables are expensive and require shuffling data across nodes. Nested and repeated fields also keep each child row attached to its parent, so the relationship is preserved without duplicating the header data on every line row.

In this scenario, the sales_transaction_header and sales_transaction_line tables have a tightly coupled, immutable relationship: each header row corresponds to one or more line rows, and the data is rarely modified after load. It therefore makes sense to create a single sales_transaction table that holds the sales_transaction_header information as rows and the sales_transaction_line rows as a nested and repeated field. You can then query the sales transaction data without joining two tables, using dot notation to reach nested fields and UNNEST or array functions to work with repeated fields.

For example, the sales_transaction table could have the following schema:

    Field name            Type       Mode
    id                    INTEGER    NULLABLE
    order_time            TIMESTAMP  NULLABLE
    customer_id           INTEGER    NULLABLE
    line_items            RECORD     REPEATED
    line_items.sku        STRING     NULLABLE
    line_items.quantity   INTEGER    NULLABLE
    line_items.price      FLOAT      NULLABLE

To query the total amount of each order, flatten the repeated line_items field with UNNEST before aggregating its subfields:

    SELECT
      t.id,
      SUM(li.quantity * li.price) AS total_amount
    FROM sales_transaction AS t,
      UNNEST(t.line_items) AS li
    GROUP BY t.id;

Reference:
Use nested and repeated fields
BigQuery explained: Working with joins, nested & repeated data
Arrays in BigQuery - How to improve query performance and optimise storage
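
The explanation above describes the combined table but not how it might be populated. The following is a minimal sketch, assuming the two source tables live in a dataset called mydataset and that the line table carries a transaction_id column pointing at the header's id. Both of those names are hypothetical and are not given in the question. The other column names follow the example schema. It uses ARRAY_AGG over STRUCT values to fold each header's line rows into one repeated field.

```sql
-- Sketch only: `mydataset` and `l.transaction_id` are assumed names,
-- not taken from the question.
CREATE TABLE mydataset.sales_transaction AS
SELECT
  h.id,
  h.order_time,
  h.customer_id,
  -- Collapse all line rows of one transaction into a REPEATED RECORD.
  ARRAY_AGG(
    STRUCT(
      l.sku AS sku,
      l.quantity AS quantity,
      l.price AS price
    )
  ) AS line_items
FROM mydataset.sales_transaction_header AS h
-- Inner join: assumes every header has at least one line row,
-- as the scenario states.
JOIN mydataset.sales_transaction_line AS l
  ON l.transaction_id = h.id
GROUP BY h.id, h.order_time, h.customer_id;

-- Example read: list line items with quantity above 1, no table join needed.
SELECT
  t.id,
  li.sku,
  li.quantity
FROM mydataset.sales_transaction AS t,
  UNNEST(t.line_items) AS li
WHERE li.quantity > 1;
```

The join between the two large tables is paid once at load time. Because the data is rarely modified after load, later analytic queries only need to unnest line_items within each row.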