You need to create a data pipeline that streams event information from applications in multiple Google Cloud regions into BigQuery for near real-time analysis. The data requires transformation before loading. You want to create the pipeline using a visual interface. What should you do?
A. Push event information to a Pub/Sub topic. Create a Dataflow job using the Dataflow job builder.
B. Push event information to Cloud Storage, and create an external table in BigQuery. Create a BigQuery scheduled job that executes once each day to apply transformations.
C. Push event information to a Pub/Sub topic. Create a Cloud Run function to subscribe to the Pub/Sub topic, apply transformations, and insert the data into BigQuery.
D. Push event information to a Pub/Sub topic. Create a BigQuery subscription in Pub/Sub.
Correct Answer: A
Pushing event information to a Pub/Sub topic and then creating a Dataflow job using the Dataflow job builder is the most suitable solution. The Dataflow job builder provides a visual interface for designing pipelines, letting you define transformations and load the results into BigQuery. This approach fits streaming pipelines that need near real-time transformation and analysis: it scales across multiple regions and integrates directly with Pub/Sub for event ingestion and BigQuery for analysis.
Here's why option A is the best fit:
* Pub/Sub is ideal for real-time message ingestion, especially from applications in multiple regions.
* Dataflow is built for real-time stream processing and applying transformations as data flows through the pipeline.
* The Dataflow job builder lets you assemble the pipeline with visual tools, fulfilling the requirement for a visual interface.
Why the other options are less suitable:
* B. Push event information to Cloud Storage, and create an external table in BigQuery. Create a BigQuery scheduled job that executes once each day to apply transformations:
  * This is a batch approach, not streaming. Cloud Storage plus a daily scheduled job cannot satisfy the near real-time requirement.
* C. Push event information to a Pub/Sub topic. Create a Cloud Run function to subscribe to the Pub/Sub topic, apply transformations, and insert the data into BigQuery:
  * Cloud Run can perform the transformations, but it requires custom code and is harder to scale and manage than Dataflow for complex streaming pipelines.
  * Cloud Run does not provide a visual interface.
* D. Push event information to a Pub/Sub topic. Create a BigQuery subscription in Pub/Sub:
  * A BigQuery subscription loads Pub/Sub messages directly into BigQuery and offers no way to apply transformations, so it does not meet the transformation requirement.
Therefore, Pub/Sub for ingestion combined with Dataflow and its job builder for visual pipeline creation and transformation is the most appropriate solution.
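For reference, here is a minimal sketch of the same read-transform-write flow expressed in the Apache Beam Python SDK, i.e., the kind of pipeline the Dataflow job builder lets you configure visually instead of in code. The project, topic, table, and field names below are hypothetical placeholders, not values from the exam question.

```python
# Illustrative streaming pipeline: Pub/Sub -> transform -> BigQuery.
# All resource names (my-project, events, analytics.events) are assumptions for the sketch.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def transform_event(message: bytes) -> dict:
    """Parse a Pub/Sub message and reshape it into a BigQuery row."""
    event = json.loads(message.decode("utf-8"))
    return {
        "event_id": event.get("id"),
        "event_type": event.get("type"),
        "region": event.get("region"),
        "payload": json.dumps(event.get("data", {})),
    }


options = PipelineOptions(streaming=True)  # streaming mode is required for Pub/Sub reads

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
        | "Transform" >> beam.Map(transform_event)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            table="my-project:analytics.events",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,  # table assumed to exist
        )
    )
```

On the ingestion side, applications in each region would publish JSON events to the same topic; a rough sketch using the Pub/Sub client library (again with placeholder names) might look like this:

```python
# Hypothetical publisher: each regional application pushes events to one shared topic.
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "events")  # placeholder project/topic
event = {"id": "evt-123", "type": "click", "region": "us-east1", "data": {"page": "/home"}}
publisher.publish(topic_path, json.dumps(event).encode("utf-8")).result()
```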