A data engineer is tasked with loading JSON files containing customer reviews from an external stage into a Snowflake table. The JSON files have varying schemas and nested structures.
Which of the following methods is the MOST efficient and scalable way to ingest and query this data, minimizing the need for upfront schema definition?
Correct Answer: B
Loading the JSON into a VARIANT column is the most efficient approach for data with varying schemas, because the files are loaded as is and no rigid schema has to be defined upfront. Path notation (a colon for the first-level key, then dots for nested keys, for example col:customer.name) and the FLATTEN function let you query nested fields and arrays flexibly. Creating a relational table (Option A) requires defining a schema upfront, which is not ideal when schemas vary. External tables (Option C) still require a schema definition. Stored procedures (Option D) can be complex and less scalable. Using Spark (Option E) adds unnecessary complexity and cost for this scenario.
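The behavior described above can be illustrated with a small plain-Python analogy. This is only a sketch of the semantics, not Snowflake code: the review documents, the field names (review_id, customer, tags), the table name reviews, and the helpers get_path and flatten are all hypothetical. The comments show the kind of Snowflake SQL each helper is meant to mirror.

```python
import json

# Hypothetical review documents with differing shapes, standing in for
# staged JSON files. Field names are illustrative, not from the source.
RAW_FILES = [
    '{"review_id": 1, "customer": {"name": "Ada"}, "rating": 5, "tags": ["fast", "helpful"]}',
    '{"review_id": 2, "rating": 3, "comment": "Average"}',
    '{"review_id": 3, "customer": {"name": "Lin", "tier": "gold"}, "tags": ["late"]}',
]

# Analogue of loading into a single VARIANT column: each document is kept
# whole, with no column list declared in advance.
table = [{"raw": json.loads(text)} for text in RAW_FILES]


def get_path(doc, path):
    """Walk a dotted path and return None when a key is absent.

    Mirrors how a Snowflake path such as raw:customer.name yields NULL
    for rows that lack the field instead of raising an error.
    """
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def flatten(rows, array_path):
    """Yield one (review_id, element) pair per array element.

    Rough analogue of:
      SELECT r.raw:review_id, f.value
      FROM reviews r, LATERAL FLATTEN(input => r.raw:tags) f;
    Rows without the array produce no output, as with FLATTEN's
    default OUTER => FALSE.
    """
    for row in rows:
        values = get_path(row["raw"], array_path)
        if isinstance(values, list):
            for value in values:
                yield get_path(row["raw"], "review_id"), value


names = [get_path(row["raw"], "customer.name") for row in table]
tag_rows = list(flatten(table, "tags"))
print(names)
print(tag_rows)
```

The second document has neither a customer object nor a tags array, yet it loads and queries without error. That tolerance for missing and extra fields is the property that makes a VARIANT column a good fit for files whose schemas vary.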