Action Prediction — Reference
The Action Prediction model estimates the probability that each customer will perform a specific action, such as making a purchase, renewing a contract, or subscribing to a newsletter. It is a predictive model that outputs one probability score (0 to 1) per user.
Input data: see AI Model Data Requirements → Action Prediction for the User/Event table requirements and minimum data volumes.
Key concepts
- Targets: defined using filters for positive, negative, and unknown outcomes. Filters can be based on User tables (e.g. deal_stage = Won) or Event tables (e.g. Checkout Completed).
- Features: must come from User tables. To use event-derived signals (behavioural, transactional), pre-aggregate them into user-level fields first; the Fields Builder can do this for you.
- Output: predictions are written to a dedicated table linked to bpp_user_id, refreshed on the daily schedule.
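To illustrate what pre-aggregation means for the Features concept above, here is a minimal Python sketch that rolls raw event rows up to one feature row per user. It only shows the logic; on the platform the Fields Builder handles this. The event names ("Page Viewed", "Added To Cart") and the row layout are assumptions made for the example, while bpp_user_id, pageviews_last_30d, and add_to_cart_count are taken from this page.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=30)

def aggregate_events(events, now):
    """Roll event rows up to one dict of user-level features per user."""
    cutoff = now - WINDOW
    features = defaultdict(lambda: {"pageviews_last_30d": 0, "add_to_cart_count": 0})
    for event in events:
        row = features[event["bpp_user_id"]]
        # Hypothetical event names; substitute the ones tracked in your project.
        if event["name"] == "Page Viewed" and event["timestamp"] >= cutoff:
            row["pageviews_last_30d"] += 1
        elif event["name"] == "Added To Cart":
            row["add_to_cart_count"] += 1
    return dict(features)

utc = timezone.utc
sample_events = [
    {"bpp_user_id": "u1", "name": "Page Viewed", "timestamp": datetime(2024, 6, 20, tzinfo=utc)},
    {"bpp_user_id": "u1", "name": "Page Viewed", "timestamp": datetime(2024, 4, 1, tzinfo=utc)},
    {"bpp_user_id": "u1", "name": "Added To Cart", "timestamp": datetime(2024, 6, 21, tzinfo=utc)},
    {"bpp_user_id": "u2", "name": "Page Viewed", "timestamp": datetime(2024, 6, 29, tzinfo=utc)},
]
user_features = aggregate_events(sample_events, now=datetime(2024, 6, 30, tzinfo=utc))
```

Each resulting key (for example pageviews_last_30d) corresponds to a column that could then be listed under data_src.features for a User table.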
JSON configuration reference
| Field | Type | Description | Example / best practice |
|---|---|---|---|
| train | STRING | Training mode. Usually "auto". | "auto" |
| client | STRING | Client/project identifier. | "ExampleClient" |
| data_src.region | STRING | Cloud region of the dataset (auto-populated). | "europe-west8" |
| data_src.features | OBJECT | Tables and columns used as features. User tables only. | "users_bpp": ["plan_quantity", "browser", "pageviews_last_30d"] |
| dataset_id | STRING | BigQuery dataset ID (auto-populated). | "bpp_tables" |
| project_id | STRING | GCP project ID (auto-populated). | "example-bi-data" |
| filter_positive | OBJECT | Filters defining positive cases (action performed). | deal_stage IN (Won, Renew) |
| filter_negative | OBJECT | Filters defining negative cases (action not performed). | deal_stage IN (Lost, Churn) |
| filter_unknown | OBJECT | Filters to exclude ambiguous cases from training. | Users still in "Negotiation" |
| action_name | STRING | Name of the modelled action. | "purchase", "renewal" |
| bucket_name | STRING | Cloud Storage bucket for intermediate artifacts (auto-populated). | "bpp_models" |
| train_preset | STRING | Training quality preset. | "medium_quality", "high_quality" |
| train_test_split | FLOAT | Ratio of data used for the test set. | 0.2 |
| level_score_top_k | INT | Number of score levels for probability buckets. | 10 |
| feature_max_pvalue | FLOAT | Feature-selection threshold. Lower = stricter. | 0.1 |
| fail_on_model_not_found | BOOL | Stop execution if no suitable model is found. | true |
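Each filter object has the shape { "filters": [ { "field", "table", "value", "data_type", "comparison_operator" } ], "operator": "AND" | "OR" }. In the examples below, dataset_id, project_id, and the three filter objects are nested inside data_src, alongside region and features.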
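As a sanity check before submitting a configuration, the filter shape described above can be verified with a few lines of Python. This is a sketch under assumptions, not the platform's own validation: the function name is hypothetical, and treating an empty filters list or a missing key as an error is a choice made for the example.

```python
REQUIRED_KEYS = {"field", "table", "value", "data_type", "comparison_operator"}

def validate_filter_object(obj):
    """Return a list of problems found in a filter object (empty if none)."""
    problems = []
    if obj.get("operator") not in ("AND", "OR"):
        problems.append("operator must be 'AND' or 'OR'")
    filters = obj.get("filters")
    # Assumption: a filter object with no conditions is treated as invalid.
    if not isinstance(filters, list) or not filters:
        problems.append("filters must be a non-empty list")
        return problems
    for index, condition in enumerate(filters):
        if not isinstance(condition, dict):
            problems.append(f"filters[{index}] must be an object")
            continue
        missing = sorted(REQUIRED_KEYS - set(condition))
        if missing:
            problems.append(f"filters[{index}] is missing {missing}")
    return problems

filter_positive = {
    "filters": [
        {
            "field": "deal_stage",
            "table": "users_bpp",
            "value": ["Won", "Renew"],
            "data_type": "STRING",
            "comparison_operator": "IN",
        }
    ],
    "operator": "AND",
}
problems = validate_filter_object(filter_positive)  # no problems expected
```

Returning a list of messages rather than raising on the first error makes it easier to report every issue in a configuration at once.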
Example — target from a User table
```json
{
  "train": "auto",
  "client": "ExampleClient",
  "data_src": {
    "region": "europe-west8",
    "features": {
      "users_bpp": [
        "plan_quantity",
        "number_of_employees",
        "device_class",
        "browser",
        "distinct_countries_last_90d",
        "avg_time_to_checkout",
        "pageviews_last_30d",
        "add_to_cart_count",
        "completed_orders_count",
        "avg_order_value",
        "support_tickets_opened",
        "contract_age_months"
      ]
    },
    "dataset_id": "bpp_tables",
    "project_id": "example-bi-data",
    "filter_unknown": {
      "filters": [
        { "field": "deal_stage", "table": "users_bpp", "value": ["In Progress", "Negotiation"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_negative": {
      "filters": [
        { "field": "deal_stage", "table": "users_bpp", "value": ["Lost", "Churn"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_positive": {
      "filters": [
        { "field": "deal_stage", "table": "users_bpp", "value": ["Won", "Renew"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    }
  },
  "action_name": "purchase",
  "bucket_name": "bpp_models",
  "train_preset": "medium_quality",
  "train_test_split": 0.2,
  "level_score_top_k": 10,
  "feature_max_pvalue": 0.1,
  "fail_on_model_not_found": true
}
```
Example — target from an Event table
When the target action is tracked as an event (e.g. "Checkout Completed"), the filters apply to the event table:
```json
{
  "train": "auto",
  "client": "ExampleClient",
  "data_src": {
    "region": "europe-west8",
    "features": {
      "contacts_bpp": [
        "industry",
        "company_size",
        "source_campaign",
        "clicks_on_emails",
        "opened_emails",
        "total_pageviews",
        "avg_pages_per_session",
        "distinct_devices",
        "distinct_browsers"
      ]
    },
    "dataset_id": "bpp_tables",
    "project_id": "example-project",
    "filter_unknown": {
      "filters": [
        { "field": "event_status", "table": "events_bpp", "value": ["In Progress", "Offer Preparation"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_negative": {
      "filters": [
        { "field": "event_status", "table": "events_bpp", "value": ["Lost", "Expired"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_positive": {
      "filters": [
        { "field": "event_status", "table": "events_bpp", "value": ["Won", "Checkout Completed"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "OR"
    }
  },
  "action_name": "checkout",
  "bucket_name": "bpp_models",
  "train_preset": "medium_quality",
  "train_test_split": 0.2,
  "level_score_top_k": 10,
  "feature_max_pvalue": 0.1,
  "fail_on_model_not_found": true
}
```
Feature engineering suggestions
Combine behavioural, transactional, and contextual signals. Examples by use case:
Lead qualification (B2B) — source campaign/medium, meetings booked in first 14 days, distinct countries of access, device & browser, engagement (opened emails, clicks, webinar attendance).
Newsletter subscription — pageviews last 30 days, content categories browsed, avg session duration, bounce rate, distinct devices, speed from landing to subscription.
E-commerce purchase — days since last visit, add-to-cart rate, checkout speed (cart → payment), distinct countries/IPs, failed transactions, number of categories purchased.
Contract renewal (SaaS) — contract age in months, logins per week, support-ticket open/resolve ratio, payment history, distinct device classes, geodiversity of access.
Upsell / cross-sell — current plan tier vs available tiers, interaction with upsell banners/emails, time on premium features, product lines purchased, active vs inactive modules.
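Speed features such as the checkout speed (cart → payment) mentioned above are only reliable when both timestamps are compared in the same timezone. The following sketch shows one way to compute such a feature in Python. The function names are hypothetical, the inputs are assumed to be ISO 8601 strings, and treating timestamps without an offset as UTC is an assumption you should confirm against your own data.

```python
from datetime import datetime, timezone

def to_utc(timestamp):
    """Parse an ISO 8601 string and return a timezone-aware UTC datetime."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)

def checkout_speed_minutes(cart_timestamp, payment_timestamp):
    """Minutes elapsed between add-to-cart and payment."""
    delta = to_utc(payment_timestamp) - to_utc(cart_timestamp)
    return delta.total_seconds() / 60

# The cart event was logged at 10:00 in UTC+2, which is 08:00 UTC.
speed = checkout_speed_minutes("2024-05-01T10:00:00+02:00", "2024-05-01T08:12:30+00:00")
print(speed)  # 12.5
```

Without the conversion, the two raw clock times would suggest the payment happened before the item was added to the cart.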
Interpreting training results
After each successful run, the model instance detail page shows a performance summary (see Model Instances → Training results). How to read it:
- F1-Score: balances precision and recall, from 0 (poor) to 100 (excellent). It is the main quality indicator; for business-critical use cases, aim for F1 > 70.
- Confusion matrix: the split of true/false positives and negatives. Use it to see whether errors skew toward false positives (riskier for high-cost campaigns) or false negatives.
- Feature importance: which inputs drive the prediction most. Monitor these features for data quality; a low F1 often means weak or missing features.
- Influential levels: for categorical features (e.g. job_role, lead_source), which specific values push the prediction up or down. Validate these against domain knowledge; spurious correlations warrant a data-quality check.
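To connect the F1-Score with the confusion matrix, the sketch below computes precision, recall, and F1 from the matrix counts using the standard binary definitions, scaled to the 0 to 100 range used on this page. How the platform picks its decision threshold or averages the score is not documented here, so treat this as an aid for reading the numbers rather than a reproduction of the platform's calculation.

```python
def f1_score_percent(true_positives, false_positives, false_negatives):
    """Standard binary F1, scaled to 0-100. True negatives do not enter the formula."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0
    if precision + recall == 0:
        return 0.0
    return 100 * 2 * precision * recall / (precision + recall)

# 80 correct positives, 20 false alarms, 20 missed positives:
# precision = 0.8 and recall = 0.8, so F1 is about 80.
balanced = f1_score_percent(80, 20, 20)

# Same number of correct positives, but many more missed positives:
# precision = 0.8 and recall = 0.5, so F1 falls to about 61.5.
low_recall = f1_score_percent(80, 20, 80)
```

The second case shows why the confusion matrix is worth checking alongside the headline score: precision is unchanged, and the drop comes entirely from false negatives.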
Best practices
- Define non-overlapping positive / negative / unknown filters.
- Pre-aggregate event features into user-level fields for modelling.
- Include device and geodiversity signals (browser, device class, distinct locations).
- Track speed features (e.g. time from add-to-cart to checkout, signup to first purchase).
- Validate event timezones (UTC) to avoid mislabeled targets.
- Retrain regularly to capture new behavioural trends, and target campaigns at high-probability users while monitoring error rates.
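The first practice, non-overlapping filters, can be partly checked before training. The sketch below looks for literal values that appear in more than one of the three target filters for the same table and field. It is a heuristic with a hypothetical function name: it compares the listed values only, so it will not detect overlaps that arise from different fields or from operators other than equality or IN.

```python
from itertools import combinations

LABELS = ("filter_positive", "filter_negative", "filter_unknown")

def filter_triples(filter_object):
    """Collect (table, field, value) triples from one filter object."""
    triples = set()
    for condition in filter_object["filters"]:
        value = condition["value"]
        values = value if isinstance(value, list) else [value]
        triples.update((condition["table"], condition["field"], v) for v in values)
    return triples

def find_overlaps(data_src):
    """Map each pair of filter names to the triples they share, if any."""
    triples = {label: filter_triples(data_src[label]) for label in LABELS}
    overlaps = {}
    for first, second in combinations(LABELS, 2):
        shared = triples[first] & triples[second]
        if shared:
            overlaps[(first, second)] = shared
    return overlaps

def deal_stage_filter(values):
    # Helper for the example only; mirrors the filter shape used on this page.
    return {
        "filters": [{"field": "deal_stage", "table": "users_bpp", "value": values,
                     "data_type": "STRING", "comparison_operator": "IN"}],
        "operator": "AND",
    }

data_src = {
    "filter_positive": deal_stage_filter(["Won", "Renew"]),
    "filter_negative": deal_stage_filter(["Lost", "Negotiation"]),  # deliberate mistake
    "filter_unknown": deal_stage_filter(["In Progress", "Negotiation"]),
}
overlaps = find_overlaps(data_src)
```

Here "Negotiation" is reported as shared between filter_negative and filter_unknown, which would make the label for those users ambiguous.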