As a marketing analyst at a successful advertising agency, you've observed inconsistencies in media campaign costs across different projects. Some campaigns unexpectedly exceed their budgets, while others underperform in terms of reach and engagement. The current process for estimating campaign costs is largely based on industry averages and doesn't account for the unique characteristics of each client or target audience. The agency needs a more precise and data-driven approach to budget allocation.
You have access to a comprehensive dataset that captures various aspects of past media campaigns, including reach, demographics, channels, and expenditure. Your task is to construct a robust regression model that accurately predicts campaign costs based on these factors. By identifying key cost drivers and building a reliable forecasting tool, you will enable the agency to allocate budgets more efficiently, negotiate better rates with media outlets, and maximize the return on investment for their clients' marketing campaigns.
Goal: Build a regression model that accurately predicts the cost of media campaigns.
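The dataset describes past media campaigns through sales figures, customer household details (children and cars at home), product attributes (weight, packaging, fat content), store size and amenities, and the campaign cost, which is the target variable.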
id: Unique identifier
store_sales: Store sales (in millions)
unit_sales: Unit sales (in millions)
total_children: Total children
num_children_at_home: Number of children at home
avg_cars_at home: Average cars at home
gross_weight: Gross weight
recyclable_package: Recyclable package
low_fat: Low fat
units_per_case: Units per case
store_sqft: Store square footage
coffee_bar: Coffee bar
video_store: Video store
salad_bar: Salad bar
prepared_food: Prepared food
florist: Florist
cost: Cost (target variable)
Data Source: Kaggle Playground Series S3E11
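As a starting point, the field list above can be turned into a small helper that separates the predictors from the target. This is a minimal sketch, not a prescribed method: the column names are copied from this brief and may not match the header of the downloaded CSV exactly, and the helper name `split_features_target` and the file name `train.csv` are assumptions made for illustration.

```python
import pandas as pd

# Column names as listed in this brief (assumption: the raw CSV header may
# spell some of them differently, so adjust this list to match the file).
FEATURES = [
    "store_sales", "unit_sales", "total_children", "num_children_at_home",
    "avg_cars_at home", "gross_weight", "recyclable_package", "low_fat",
    "units_per_case", "store_sqft", "coffee_bar", "video_store",
    "salad_bar", "prepared_food", "florist",
]
TARGET = "cost"


def split_features_target(df: pd.DataFrame):
    """Return (X, y) after checking that every expected column is present."""
    missing = [c for c in FEATURES + [TARGET] if c not in df.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    # The id column is only a row identifier, so it is left out of X.
    return df[FEATURES], df[TARGET]


# Usage with a real file (the file name is an assumption):
# df = pd.read_csv("train.csv")
# X, y = split_features_target(df)
```

Failing early on a missing column makes a header mismatch visible before any model is trained.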
Your task is to build a regression model to predict media campaign costs.
Python Libraries: Pandas, NumPy, scikit-learn, Matplotlib/Seaborn.
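One possible way to begin with these libraries is to compare a linear baseline against a tree ensemble on a hold-out split. The sketch below is illustrative only: the function name `evaluate_models`, the choice of the two models, the 80/20 split, and RMSE as the metric are assumptions, and the demo data is randomly generated placeholder data, not the campaign dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def evaluate_models(X, y, seed=0):
    """Fit two baseline regressors and return their hold-out RMSE."""
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "linear": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        preds = model.predict(X_val)
        # RMSE is the square root of the mean squared error.
        scores[name] = float(np.sqrt(mean_squared_error(y_val, preds)))
    return scores


# Placeholder data so the sketch runs without the real file.
# These values are synthetic and say nothing about actual campaigns.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    "store_sqft": rng.uniform(20_000, 40_000, 300),
    "total_children": rng.integers(0, 6, 300),
    "florist": rng.integers(0, 2, 300),
})
y_demo = (
    50
    + 0.001 * X_demo["store_sqft"]
    + 5 * X_demo["florist"]
    + rng.normal(0, 1, 300)
)

scores = evaluate_models(X_demo, y_demo)
print(scores)
```

With the real data, `X_demo` and `y_demo` would be replaced by the feature columns and the `cost` column; comparing the two scores gives a first indication of whether the cost drivers are roughly linear.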