Ping’s concept of Data Readiness combines field-level confidences, careful weighting of the most important data, and structural analysis of the source documents into an overall Readiness analysis that indicates how much work is required before a submission is ready for underwriting.
A confidence is assigned to every data point. This includes a postal code, limit value, square footage, or construction data; anything extracted from a source document has a confidence attached. The confidence is usually based on simple questions: Does the data meet expectations for data of that type? Was extracting it risky or lossy in some way? Is there any reason to suspect that it’s incorrect on its own? Do any independent sources corroborate the information?
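As a rough illustration of how the four questions above could turn into a number, here is a minimal Python sketch. It is not Ping’s actual scoring method: the `FieldChecks` structure, the `field_confidence` function, and the equal weighting of the four answers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FieldChecks:
    """Answers to the four questions for one extracted value (hypothetical structure)."""
    meets_type_expectations: bool  # does the value look like data of its type?
    extraction_was_clean: bool     # was extraction free of risky or lossy steps?
    no_reason_to_doubt: bool       # is there no reason to suspect the value on its own?
    corroborated: bool             # does an independent source agree?


def field_confidence(checks: FieldChecks) -> float:
    """Return a confidence in [0, 1]: the fraction of checks passed.

    Equal weighting is an assumption; a real system would likely weight
    the checks differently or use a learned model.
    """
    answers = [
        checks.meets_type_expectations,
        checks.extraction_was_clean,
        checks.no_reason_to_doubt,
        checks.corroborated,
    ]
    return sum(answers) / len(answers)


# A postal code that looks valid and was extracted cleanly,
# but that no independent source corroborates.
postal_code = FieldChecks(True, True, True, False)
print(field_confidence(postal_code))  # 0.75
```

Keeping each answer as a separate field, rather than storing only the final number, would also make it possible to explain why a given value scored low.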
A reliability rating of High, Medium, Low, or No Data is assigned to every major category of data for each building. The rating combines the field-level confidence scores with the expectations for, and importance of, the data in that category. Key categories of data may include:
Automated processing of a submission results in an initial readiness determination of Pass-through Ready, Good, Consider Review, or Needs Review, along with notes explaining any detected concerns.
Determining whether everything was extracted safely from a submission is a complex problem. A number of independent approaches feed the final determination. For example, the confidences and reliabilities described above are taken into account, weighted by the importance of the data to the client and the submission. Independently, ML models that analyze the structure of the input documents contribute their confidence in a number of areas: Was the overall structure relatively typical? Was all the data found? Were some rows or headers possibly titles or totals? Did the document employ layout techniques that create ambiguity or are beyond current capabilities?
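The combination of importance-weighted data confidence with an independent structural confidence can be sketched as follows. This is only an illustration under stated assumptions: the function names, the weights, the numeric thresholds, and the choice to let the weaker of the two signals decide the outcome are all hypothetical and are not taken from Ping’s implementation.

```python
def weighted_score(confidences: dict[str, float], weights: dict[str, float]) -> float:
    """Importance-weighted average of field confidences, in [0, 1].

    `weights` stands in for how much each field matters to the client
    and the submission (assumed values).
    """
    total_weight = sum(weights[name] for name in confidences)
    return sum(confidences[name] * weights[name] for name in confidences) / total_weight


def readiness(data_score: float, structure_score: float) -> str:
    """Map two independent scores to one of the four readiness levels.

    Taking the minimum (assumption) means either signal alone can
    trigger a review. The thresholds are illustrative only.
    """
    score = min(data_score, structure_score)
    if score >= 0.95:
        return "Pass-through Ready"
    if score >= 0.80:
        return "Good"
    if score >= 0.60:
        return "Consider Review"
    return "Needs Review"


# Hypothetical field confidences and importance weights for one submission.
confidences = {"postal_code": 0.9, "limit": 1.0, "square_footage": 0.5}
weights = {"postal_code": 3.0, "limit": 2.0, "square_footage": 1.0}

data_score = weighted_score(confidences, weights)  # about 0.87
structure_score = 0.7  # e.g. a model's confidence that the layout was typical

print(readiness(data_score, structure_score))  # Consider Review
```

In this example the extracted data scores well on its own, but the lower structural confidence pulls the result down to Consider Review, which mirrors the idea that independent approaches each feed the final determination.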