A developer wants to extract hidden text from a PDF file. Which output method(s) should be used?
Correct Answer: D
To extract hidden text from a PDF file, use FullText only. FullText is one of the three output methods UiPath offers for scraping text, alongside Native and OCR. (To read a PDF's text directly, the Read PDF Text activity reads all the characters from a specified PDF file and stores them in a string variable [3].) FullText extracts the text as it is, without keeping its formatting or position. It can also extract hidden text, that is, text that is not visible on the screen but can still be copied and pasted into another application [4]. For example, hidden text can include the metadata, comments, or annotations of the PDF file. Because FullText does not depend on the visibility or layout of the text, it is the suitable method here.

The other output methods, Native and OCR, are not suitable, because they rely on the appearance or position of the text on the screen. Native preserves the formatting and position of the text, but it cannot extract text that is not visible or selectable [5]. OCR works from an image of the rendered page and recognizes the text in that image, so it cannot extract text that is not displayed or that the OCR engine fails to recognize [6].
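The contrast between the three output methods can be illustrated with a small toy model. This is only a sketch under stated assumptions: `TextElement`, `full_text`, `native`, and `ocr` are hypothetical names made up for illustration and are not part of any UiPath API, and the "OCR" here is simulated by simply ignoring anything that is not visible.

```python
from dataclasses import dataclass


@dataclass
class TextElement:
    """One piece of text in a document (hypothetical model, not a UiPath type)."""
    text: str
    x: int
    y: int
    visible: bool


def full_text(elements):
    # Reads every element's text, visible or not, and drops position data.
    return " ".join(e.text for e in elements)


def native(elements):
    # Keeps position data, but only for text that is actually rendered.
    return [(e.text, e.x, e.y) for e in elements if e.visible]


def ocr(elements):
    # Simulated: sees only what is on screen. A real engine may also misread text.
    return " ".join(e.text for e in elements if e.visible)


page = [
    TextElement("Invoice 1042", 10, 10, True),
    TextElement("internal note", 10, 30, False),  # hidden text
    TextElement("Total: 99.00", 10, 50, True),
]

print(full_text(page))
print(native(page))
print(ocr(page))
```

In this model only `full_text` returns the hidden "internal note" element, which mirrors why FullText only is the answer, while `native` is the only function that returns coordinates.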
References: [3] Read PDF Text; [4] Extracting Hidden Text from PDF; [5] Native; [6] OCR (UiPath documentation and forum).