Typical document-based workflows require hundreds of man-hours of data entry. Wissen’s in-house Document Data Extractor can automate manual data entry workflows by accurately locating and extracting data from fixed and variable locations in a PDF. The solution uses a heuristics-based approach that can learn each format from a few examples. Spatial features and typological information of the PDF text are used to model a layout parser.
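As an illustration of how spatial features can drive such a parser, here is a minimal sketch in Python. It is not Wissen's implementation: the `Word` record, `learn_offset`, and `extract` are hypothetical names, and the sketch assumes the PDF text has already been reduced to words with page coordinates. It learns the typical offset from a label (such as "Order") to its value from a few examples, then applies that offset to a new document.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Word:
    """A piece of PDF text with the coordinates of its top-left corner (assumed input)."""
    text: str
    x: float
    y: float


def learn_offset(examples, label):
    """Learn the median (dx, dy) from the label word to the value word.

    `examples` is a list of (words, value_text) pairs, one per training document.
    """
    dxs, dys = [], []
    for words, value_text in examples:
        anchor = next(w for w in words if w.text == label)
        target = next(w for w in words if w.text == value_text)
        dxs.append(target.x - anchor.x)
        dys.append(target.y - anchor.y)
    return median(dxs), median(dys)


def extract(words, label, offset):
    """Return the text of the word closest to where the value is expected."""
    anchor = next(w for w in words if w.text == label)
    expected_x = anchor.x + offset[0]
    expected_y = anchor.y + offset[1]
    candidates = [w for w in words if w is not anchor]
    nearest = min(
        candidates,
        key=lambda w: (w.x - expected_x) ** 2 + (w.y - expected_y) ** 2,
    )
    return nearest.text
```

Because the offset is measured relative to the label rather than the page, the same learned rule still applies when the field moves around between documents of the same format.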
Training takes a couple of minutes and requires minimal training data
Extensible to documents that come in a standardised, fixed format
Extracts complex information from both headers (like order number) and line items (like price, quantity, and multi-line values spanning other columns in the table)
Language agnostic
Flexibility: lets you choose which data points or line items you want to extract
Unique layout parser for each document type
Can be seamlessly integrated with ERP or CRM systems
Designed to scale: batch process documents or parse them in real time. Once trained, our systems can convert thousands of PDF forms to Excel or CSV within a couple of minutes.
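The multi-line line items and CSV export mentioned in the list above can be sketched as follows. This is a simplified illustration under stated assumptions, not the product's actual logic: `merge_multiline_rows` and `to_csv` are hypothetical helpers, table rows are assumed to be already extracted as lists of cell strings, and a row with an empty key column is assumed to be a continuation of the row above it.

```python
import csv
import io


def merge_multiline_rows(rows, key_column=0):
    """Fold continuation rows (empty key cell) into the preceding line item."""
    merged = []
    for row in rows:
        if merged and not row[key_column].strip():
            previous = merged[-1]
            # Append each non-empty continuation cell to the cell above it.
            merged[-1] = [
                (p + " " + c.strip()).strip() if c.strip() else p
                for p, c in zip(previous, row)
            ]
        else:
            merged.append([c.strip() for c in row])
    return merged


def to_csv(header, rows):
    """Serialise the header and line items as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()
```

For batch processing, the same two functions could be applied to each parsed document in turn, with the CSV text written to one file per document.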