A data engineer is implementing a data governance strategy in Snowflake and needs to automatically apply the tag 'PII CLASSIFICATION' to all columns containing Personally Identifiable Information (PII). The requirements are:
1. The tag must be applied as close to data ingestion as possible.
2. The tagging process must be automated and scalable.
3. The tag value must be set dynamically, based on a regular expression match against column names and data types.
Which of the following approaches would be MOST effective and efficient in achieving these goals?
Correct Answer: C
Option C is the most effective approach because it uses Snowflake's native event capture mechanisms (Event Tables, Streams, and Tasks) to react to DDL changes in near real time, so tags are applied close to ingestion. It is automated and scalable, and it avoids the overhead of periodic polling. Options A and B rely on periodic scanning, which is less efficient and leaves new columns untagged until the next scan. Option D is manual and does not scale. Option E adds external functions and ML models to a relatively simple pattern-matching task, which increases operational overhead.
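
Requirement 3 (deriving the tag value from a regular expression match on column names and data types) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the rule patterns, the tag values, the helper names `classify_column` and `build_tag_statements`, and the tag identifier `PII_CLASSIFICATION` (written with an underscore so it can be used unquoted). The sketch only builds `ALTER TABLE ... MODIFY COLUMN ... SET TAG` statement strings; it does not connect to Snowflake, and in a real pipeline a task would be expected to execute them.

```python
import re

# Hypothetical rules: (column-name pattern, accepted data types, tag value).
# Both the name AND the data type must match for a rule to apply.
PII_RULES = [
    (re.compile(r"(^|_)e?mail(_|$)", re.I), {"VARCHAR", "TEXT", "STRING"}, "EMAIL"),
    (re.compile(r"(^|_)(ssn|social_security)(_|$)", re.I), {"VARCHAR", "TEXT", "NUMBER"}, "NATIONAL_ID"),
    (re.compile(r"(^|_)(phone|mobile)(_|$)", re.I), {"VARCHAR", "TEXT", "NUMBER"}, "PHONE"),
    (re.compile(r"(^|_)(dob|birth_?date)(_|$)", re.I), {"DATE", "TIMESTAMP_NTZ"}, "DATE_OF_BIRTH"),
]

# Assumed tag identifier; adjust to the tag actually created in the account.
TAG_NAME = "PII_CLASSIFICATION"


def classify_column(name, data_type):
    """Return the tag value for a column, or None if no rule matches."""
    # Strip any length/precision suffix, e.g. "varchar(11)" -> "VARCHAR".
    base_type = data_type.split("(")[0].strip().upper()
    for pattern, types, value in PII_RULES:
        if base_type in types and pattern.search(name):
            return value
    return None


def quote_ident(ident):
    """Double-quote an identifier, escaping embedded double quotes."""
    return '"' + ident.replace('"', '""') + '"'


def build_tag_statements(table, columns):
    """Build one SET TAG statement per column classified as PII.

    `columns` is an iterable of (column_name, data_type) pairs, with names
    given exactly as stored (for example, as read from INFORMATION_SCHEMA.COLUMNS).
    """
    statements = []
    for name, data_type in columns:
        value = classify_column(name, data_type)
        if value is not None:
            statements.append(
                f"ALTER TABLE {table} MODIFY COLUMN {quote_ident(name)} "
                f"SET TAG {TAG_NAME} = '{value}'"
            )
    return statements


if __name__ == "__main__":
    cols = [("CUSTOMER_EMAIL", "TEXT"), ("EMAIL_SENT_COUNT", "NUMBER"), ("ORDER_ID", "NUMBER")]
    for stmt in build_tag_statements("CRM.PUBLIC.CUSTOMERS", cols):
        print(stmt)
```

Checking the data type alongside the name is what keeps a column such as `EMAIL_SENT_COUNT` (a numeric counter) from being tagged as an email address, even though its name matches the email pattern.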