The Challenge of Collecting Quality Data for AI in Underwriting

Machine learning (ML) and artificial intelligence (AI) are revolutionizing the underwriting process in the insurance industry. However, AI systems can only be as effective as the data they are trained on. High-quality, comprehensive data is essential for accurate risk assessment, yet gathering this data is becoming increasingly challenging.
The Data Dilemma
Obtaining detailed and accurate data about clients is fraught with difficulties. Clients are increasingly reluctant to share personal information due to privacy concerns, stricter legislation is being enacted, and data providers are tightening their controls. This creates a significant barrier for insurers looking to leverage AI for underwriting.
Insights from Kaggle: A Case Study
We recently came across an interesting project on Kaggle about responsible AI in predictive underwriting. The dataset included basic parameters such as smoking status, weight, and age.

While these are undoubtedly important, they raise several questions:
  • Data Accuracy: If the data is self-reported through a questionnaire, there's a risk that people might provide false information to influence their policy costs.
  • Limited Scope: Smoking and weight are critical factors, but what about alcohol consumption, lifestyle, and diet? Including these could significantly improve risk assessments, particularly for diabetes and other chronic diseases.
The Role of Expert Underwriters and Actuaries
Currently, the gap in data quality is often bridged by the high skill levels of underwriters and actuaries. However, a truly accurate risk assessment requires a deep understanding of a person's health status. Internal expertise is crucial for configuring models and evaluating the impact of various symptoms on potential risks.

For instance:
  • Symptom Evaluation: A set of moderate symptoms indicating parasitic infection might not significantly increase the risk of visiting a doctor or developing a chronic condition in the short term. Conversely, symptoms pointing to an ulcer should heavily influence the risk assessment.
  • Holistic Analysis: Consider the nature of symptoms, their number (relative to potential illnesses), and their frequency. This approach allows us to digitize a person's health condition and integrate it into the final risk assessment model.
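One way to picture this digitization is the minimal sketch below. It is an illustration under stated assumptions, not a description of any production model: the `Condition` class, the severity weights, the symptom lists, and the scoring formula (severity × share of matched symptoms × average frequency) are all hypothetical, and in practice they would be set by underwriters and medical experts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """A potential illness as an underwriter might configure it (hypothetical)."""
    name: str
    symptoms: frozenset   # symptoms associated with the condition
    severity: float       # assumed expert-assigned weight in [0, 1]


def condition_score(condition, reported):
    """Score one condition from reported symptoms.

    `reported` maps a symptom name to how often it occurs,
    on an assumed scale from 0.0 (never) to 1.0 (daily).
    """
    matched = [s for s in condition.symptoms if s in reported]
    if not matched:
        return 0.0
    # Number of symptoms relative to the potential illness.
    coverage = len(matched) / len(condition.symptoms)
    # Average frequency of the matched symptoms.
    mean_frequency = sum(reported[s] for s in matched) / len(matched)
    # Nature of the symptoms, expressed through the condition's severity.
    return condition.severity * coverage * mean_frequency


def health_risk_score(conditions, reported):
    """Use the highest-scoring condition as the health component of the risk."""
    return max((condition_score(c, reported) for c in conditions), default=0.0)


# Illustrative values only; real weights would come from medical experts.
ulcer = Condition(
    "ulcer",
    frozenset({"stomach pain", "heartburn", "nausea", "bloating"}),
    severity=0.8,
)
parasitic = Condition(
    "parasitic infection",
    frozenset({"fatigue", "bloating", "nausea", "itching"}),
    severity=0.2,
)
reported = {"stomach pain": 0.5, "heartburn": 1.0, "nausea": 0.5}

print(round(condition_score(ulcer, reported), 3))
print(round(condition_score(parasitic, reported), 3))
print(round(health_risk_score([ulcer, parasitic], reported), 3))
```

With these made-up numbers, the ulcer pattern dominates the score while the single symptom that overlaps with a parasitic infection barely registers, which mirrors the distinction drawn in the example above.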
Enriching Existing Data Models
The aim is not to create an entirely new model but to enhance existing datasets. These include questionnaires, chronic disease records, third-party data, and medical histories. Once we understand which diseases can be predicted, we can overlay data on how often people with those diseases visit a doctor and use it to derive additional coefficients for the model.
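As a rough sketch of what such an overlay could look like, the example below adjusts a risk value from an existing model using predicted disease probabilities and typical doctor visit frequencies. Everything here is an assumption made for illustration: the function names, the baseline of 4 visits per year, the linear form of the coefficient, and the sample figures are hypothetical and not taken from any real underwriting model or dataset.

```python
def expected_extra_visits(predicted, visits_per_year):
    """Expected additional doctor visits per year.

    `predicted` maps a disease to its predicted probability;
    `visits_per_year` maps a disease to the assumed number of
    extra visits it typically causes per year.
    """
    return sum(p * visits_per_year.get(d, 0.0) for d, p in predicted.items())


def enrichment_coefficient(predicted, visits_per_year, baseline_visits=4.0):
    """Multiplier applied to the risk produced by the existing model.

    Assumes a simple linear relationship: expected extra visits
    relative to a baseline number of visits per year.
    """
    if baseline_visits <= 0:
        raise ValueError("baseline_visits must be positive")
    return 1.0 + expected_extra_visits(predicted, visits_per_year) / baseline_visits


def enriched_risk(base_risk, predicted, visits_per_year, baseline_visits=4.0):
    """Overlay the coefficient on the existing model's output."""
    return base_risk * enrichment_coefficient(
        predicted, visits_per_year, baseline_visits
    )


# Illustrative numbers only.
predicted = {"type 2 diabetes": 0.25, "hypertension": 0.5}
visits_per_year = {"type 2 diabetes": 4.0, "hypertension": 2.0}

print(expected_extra_visits(predicted, visits_per_year))
print(enrichment_coefficient(predicted, visits_per_year))
print(round(enriched_risk(0.1, predicted, visits_per_year), 4))
```

The point of the design is that the existing model stays untouched: its output is only scaled by a coefficient built from the new data, so the enrichment can be added or removed without retraining.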
Collecting Accurate Data
How can we collect such comprehensive data directly and ensure its accuracy? The solution lies in offering something valuable in return. For instance, providing personalized vitamin recommendations can be highly effective. Over 80% of Americans take vitamins, with half of them choosing their vitamins independently. Offering your clients personalized recommendations not only adds value for them but also allows you to gather truthful, necessary data for your models.
A Practical Example: Welly Quizzes
Consider Welly, which uses expertly crafted quizzes to evaluate 12 major body systems and provide a comprehensive assessment, predicting potential diseases. By integrating this approach, insurers can collect accurate, first-hand data from clients, enriching their underwriting models and improving risk assessment accuracy. Detailed information on what data can be collected with Welly can be found here.
Conclusion
Collecting quality data for AI in underwriting is a growing challenge, but it's not insurmountable. By leveraging innovative approaches like personalized health recommendations, insurers can gather the necessary data directly from clients. This not only enhances the accuracy of AI models but also provides added value to customers, creating a win-win situation. As the industry continues to evolve, combining expert insights with advanced data collection techniques will be key to staying ahead in the competitive landscape.