Imagine that you're a consultant hired by Allstate, one of the nation's largest insurance providers, to help the company understand its customers' purchase behavior and act on it. The company recognizes that the insurance landscape is becoming increasingly competitive, with customers having more choices than ever before. Allstate wants to leverage data science to enhance customer relationships, tailor product offerings, and ultimately increase sales. However, the company also recognizes that the many factors behind a purchase decision make this difficult.
You are tasked with analyzing customer data – demographics, past interactions, policy quotes, and purchase history – to build a model that predicts which coverage options customers are most likely to select. By identifying the factors that drive customer choices, you will be able to provide Allstate with actionable insights for personalizing marketing messages, streamlining the quoting process, and developing new product bundles that better meet customer needs. Your success will translate into increased customer satisfaction, improved retention rates, and a significant boost to Allstate's overall sales performance.
Goal: Build a model that predicts the coverage options a customer purchases, using that customer's shopping history.
The dataset contains transaction history for customers who ended up purchasing a policy. It includes customer information, quoted policy information, and costs. The training set contains the entire quote history for each customer, with the last row containing the purchased coverage options. The test set contains a partial history of the quotes, and the task is to predict the purchased coverage options.
customer_ID: A unique identifier for the customer
shopping_pt: Unique identifier for the shopping point of a given customer
record_type: 0=shopping point, 1=purchase point
day: Day of the week (0-6, 0=Monday)
time: Time of day (HH:MM)
state: State where shopping point occurred
location: Location ID where shopping point occurred
group_size: How many people will be covered under the policy (1, 2, 3 or 4)
homeowner: Whether the customer owns a home or not (0=no, 1=yes)
car_age: Age of the customer's car
car_value: How valuable was the customer's car when new
risk_factor: An ordinal assessment of how risky the customer is (1, 2, 3, 4)
age_oldest: Age of the oldest person in customer's group
age_youngest: Age of the youngest person in customer's group
married_couple: Does the customer group contain a married couple (0=no, 1=yes)
C_previous: What the customer formerly had or currently has for product option C (0=nothing, 1, 2, 3, 4)
duration_previous: How long (in years) the customer was covered by their previous issuer
A,B,C,D,E,F,G: the coverage options
cost: Cost of the quoted coverage options

Data Source: Kaggle Allstate Purchase Prediction Challenge
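Because the training set mixes quote rows and the final purchase row for each customer, a useful first step is to separate the two using record_type. The following is a minimal sketch with pandas, not a prescribed pipeline. The helper name split_history and the tiny hand-made frame are illustrative assumptions, not part of the dataset.

```python
import pandas as pd

COVERAGE_COLS = ["A", "B", "C", "D", "E", "F", "G"]


def split_history(df):
    """Return (quotes, purchases) from a training frame.

    quotes: every shopping-point row (record_type == 0).
    purchases: the purchased options A-G, indexed by customer_ID
               (record_type == 1).
    """
    quotes = df[df["record_type"] == 0].copy()
    purchases = df[df["record_type"] == 1].set_index("customer_ID")[COVERAGE_COLS]
    return quotes, purchases


# Hand-made toy data (hypothetical values, only for illustration).
rows = [
    # customer_ID, shopping_pt, record_type, A, B, C, D, E, F, G
    (1, 1, 0, 0, 0, 1, 1, 0, 0, 1),
    (1, 2, 0, 1, 0, 2, 2, 0, 1, 2),
    (1, 3, 1, 1, 0, 2, 2, 0, 1, 2),
    (2, 1, 0, 2, 1, 3, 3, 1, 2, 3),
    (2, 2, 1, 2, 1, 3, 3, 1, 2, 4),
]
toy = pd.DataFrame(
    rows, columns=["customer_ID", "shopping_pt", "record_type"] + COVERAGE_COLS
)

quotes, purchases = split_history(toy)
print(quotes.shape)   # shopping-point rows only
print(purchases)      # one row of purchased options per customer
```

With the real data, the same function would be applied to the frame loaded from the training file, leaving quotes as the model input and purchases as the target.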
Your task is to predict the seven coverage options (A, B, C, D, E, F, G) that each customer will end up purchasing.
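Before fitting any model, it can help to have a reference point to beat. One simple candidate, sketched below, predicts that each customer buys exactly the options from their most recent quote. This baseline and the all-seven-options-correct accuracy measure are assumptions made for illustration, since the brief does not specify a method or a metric. The function names and the toy values are hypothetical.

```python
import pandas as pd

COVERAGE_COLS = ["A", "B", "C", "D", "E", "F", "G"]


def last_quote_baseline(quotes):
    """Predict, for each customer, the options from their latest shopping point."""
    last = (
        quotes.sort_values(["customer_ID", "shopping_pt"])
        .groupby("customer_ID")
        .tail(1)                      # keep the most recent quote per customer
        .set_index("customer_ID")
    )
    return last[COVERAGE_COLS]


def exact_match_accuracy(pred, truth):
    """Share of customers for whom all seven predicted options are correct."""
    truth = truth.loc[pred.index, COVERAGE_COLS]
    return float((pred.values == truth.values).all(axis=1).mean())


# Hand-made quote history (hypothetical values, only for illustration).
quotes = pd.DataFrame(
    [
        (1, 1, 0, 0, 1, 1, 0, 0, 1),
        (1, 2, 1, 0, 2, 2, 0, 1, 2),
        (2, 1, 2, 1, 3, 3, 1, 2, 3),
    ],
    columns=["customer_ID", "shopping_pt"] + COVERAGE_COLS,
)

# Hand-made purchased options for the same two customers.
truth = pd.DataFrame(
    [[1, 0, 2, 2, 0, 1, 2], [2, 1, 3, 3, 1, 2, 4]],
    index=pd.Index([1, 2], name="customer_ID"),
    columns=COVERAGE_COLS,
)

pred = last_quote_baseline(quotes)
print(pred)
print(exact_match_accuracy(pred, truth))
```

In this toy example the first customer buys their last quoted plan and the second changes option G, so the baseline gets one of two customers fully right. A trained model would then be judged on whether it improves on this figure.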
Python Libraries: Pandas, NumPy, scikit-learn, Matplotlib/Seaborn.