A company stores unstructured text data (PDF and DOCX files) in an external stage (AWS S3). They want to use Snowflake Cortex's PARSE_DOCUMENT function to extract specific information, but are encountering performance issues and high costs. Which of the following strategies could optimize performance and reduce costs when using PARSE_DOCUMENT in this scenario?
Correct Answer: B,C,E
Option B is correct because pre-processing reduces the amount of data that PARSE_DOCUMENT needs to process, and partitioning files in the external stage lets Snowflake retrieve the relevant data more efficiently. Option C is correct because caching prevents redundant processing, and it reduces MAX_FILE_SIZE to a lower value. Option E is correct because error handling ensures that processing continues when individual documents fail, and monitoring provides insight into resource usage. Option A is incorrect: increasing the warehouse size and MAX_FILE_SIZE without other optimizations is a brute-force approach that often fails to address the root cause of performance problems and leads to unnecessary costs. Option D is incorrect: limiting batch size can help with memory issues but does not fundamentally improve the efficiency of document parsing.
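The caching and error-handling ideas behind Options C and E can be sketched in a few lines of Python. This is a minimal illustration, not Snowflake's API: `parse_new_documents`, the `cache` dictionary, and `parse_fn` are all hypothetical names, and `parse_fn` stands in for whatever actually calls PARSE_DOCUMENT. The sketch assumes each staged file comes with a content hash (for example an MD5 or ETag value), so an unchanged file is never parsed twice and one failing file does not stop the batch.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# (relative_path, content_hash) identifies one version of one staged file.
FileKey = Tuple[str, str]


def parse_new_documents(
    files: Iterable[FileKey],
    cache: Dict[FileKey, str],
    parse_fn: Callable[[str], str],
) -> Dict[str, object]:
    """Parse only files that are not cached yet and keep going on errors.

    `parse_fn` is a hypothetical stand-in for the call that runs
    PARSE_DOCUMENT on a single file path and returns the extracted text.
    """
    parsed = 0
    skipped = 0
    failed: List[str] = []

    for path, content_hash in files:
        key = (path, content_hash)
        if key in cache:
            # Same path and same hash: reuse the stored result.
            skipped += 1
            continue
        try:
            cache[key] = parse_fn(path)
            parsed += 1
        except Exception:
            # Record the failure and continue with the remaining files.
            failed.append(path)

    # These counters double as simple monitoring output.
    return {"parsed": parsed, "skipped": skipped, "failed": failed}
```

Running the function a second time with the same `cache` only re-attempts files that failed or whose hash changed, which is where the cost saving comes from.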