What are the two main data extraction methodologies used in document understanding processes?
Correct Answer: B
According to the UiPath documentation, two common data extraction methodologies are used in document understanding processes: rule-based data extraction and model-based data extraction [1][2]. Rule-based data extraction targets structured documents, such as forms, invoices, or receipts, that have a fixed layout and a predefined set of fields. It uses predefined rules, such as regular expressions, keywords, or coordinates, to locate and extract the relevant data from the documents [1]. Model-based data extraction is used to process semi-structured and unstructured documents, such as contracts, emails, or reports, that have a variable layout and a diverse set of fields. It uses machine learning models, such as neural networks, to learn from examples and extract the relevant data from the documents [1]. Both methodologies have advantages and limitations, and depending on the use case they can be used separately or combined in a hybrid approach [2].
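As an illustration only, the rule-based and hybrid ideas can be sketched in plain Python. This is not UiPath's API or implementation: the field names, the regular expressions in `RULES`, and the `model_extract` callback (a stand-in for whatever machine learning extractor is available) are all hypothetical assumptions chosen for the example.

```python
import re

# Hypothetical rules for illustration; real rules depend on the document layout.
RULES = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:Number|No\.?|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "total": re.compile(
        r"\bTotal\s*:?\s*\$?\s*([0-9][0-9,]*\.[0-9]{2})", re.IGNORECASE
    ),
}


def rule_based_extract(text):
    """Apply each predefined regex; a field with no match is set to None."""
    result = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


def hybrid_extract(text, model_extract):
    """Try the rules first, then ask a model only for the fields the rules missed.

    model_extract is an assumed callable taking (text, field_names) and
    returning a dict of predicted values; it stands in for an ML extractor.
    """
    fields = rule_based_extract(text)
    missing = [name for name, value in fields.items() if value is None]
    if missing:
        predicted = model_extract(text, missing)
        for name in missing:
            fields[name] = predicted.get(name)
    return fields
```

With a fixed-layout text such as "Invoice No: INV-2041" followed by "Total: $1,250.00", the rules alone fill both fields. When a document phrases the total in free text, the rule returns nothing for that field and the hybrid function hands it to the model callback.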
References:
1: Data Extraction Overview
2: Document Processing with Improved Data Extraction