Action Prediction — Reference

The Action Prediction model estimates the probability that each customer will perform a specific action, such as making a purchase, renewing a contract, or subscribing to a newsletter. It outputs one probability score (0–1) per user.

Input data: see AI Model Data Requirements → Action Prediction for the User/Event table requirements and minimum data volumes.


Key concepts

  • Targets — defined using filters for positive, negative, and unknown outcomes. Filters can be based on User tables (e.g. deal_stage = Won) or Event tables (e.g. Checkout Completed).
  • Features — must come from User tables. To use event-derived signals (behavioural, transactional), pre-aggregate them into user-level fields first — the Fields Builder can do this for you.
  • Output — predictions are written to a dedicated table linked to bpp_user_id, refreshed on the daily schedule.
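The pre-aggregation step mentioned under Features can be pictured with a minimal Python sketch. It is an illustration only, not how the Fields Builder works internally: the event field names `event_name` and `timestamp`, the helper `pageviews_last_30d`, and the sample data are all hypothetical; only `bpp_user_id` comes from this page.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def pageviews_last_30d(events, as_of):
    """Count 'pageview' events per user in the 30 days up to `as_of`.

    `events` is a list of dicts with hypothetical keys
    'bpp_user_id', 'event_name' and 'timestamp' (timezone-aware).
    """
    window_start = as_of - timedelta(days=30)
    counts = Counter()
    for event in events:
        in_window = window_start < event["timestamp"] <= as_of
        if event["event_name"] == "pageview" and in_window:
            counts[event["bpp_user_id"]] += 1
    return dict(counts)


as_of = datetime(2024, 3, 31, tzinfo=timezone.utc)
events = [
    {"bpp_user_id": "u1", "event_name": "pageview",
     "timestamp": datetime(2024, 3, 30, tzinfo=timezone.utc)},
    {"bpp_user_id": "u1", "event_name": "pageview",
     "timestamp": datetime(2024, 3, 10, tzinfo=timezone.utc)},
    {"bpp_user_id": "u1", "event_name": "pageview",
     "timestamp": datetime(2024, 1, 5, tzinfo=timezone.utc)},   # outside the window
    {"bpp_user_id": "u2", "event_name": "add_to_cart",
     "timestamp": datetime(2024, 3, 29, tzinfo=timezone.utc)},  # not a pageview
]

# One value per user, ready to be stored as a user-level field.
user_features = pageviews_last_30d(events, as_of)
print(user_features)
```

Users with no qualifying events are absent from the result; when writing the field to a User table, `user_features.get(user_id, 0)` would fill them with zero.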

JSON configuration reference

| Field | Type | Description | Example / best practice |
|---|---|---|---|
| train | STRING | Training mode. Usually "auto". | "auto" |
| client | STRING | Client/project identifier. | "ExampleClient" |
| data_src.region | STRING | Cloud region of the dataset (auto-populated). | "europe-west8" |
| data_src.features | OBJECT | Tables and columns used as features. User tables only. | "users_bpp": ["plan_quantity", "browser", "pageviews_last_30d"] |
| dataset_id | STRING | BigQuery dataset ID (auto-populated). | "bpp_tables" |
| project_id | STRING | GCP project ID (auto-populated). | "example-bi-data" |
| filter_positive | OBJECT | Filters defining positive cases (action performed). | deal_stage IN (Won, Renew) |
| filter_negative | OBJECT | Filters defining negative cases (action not performed). | deal_stage IN (Lost, Churn) |
| filter_unknown | OBJECT | Filters to exclude ambiguous cases from training. | Users still in "Negotiation" |
| action_name | STRING | Name of the modelled action. | "purchase", "renewal" |
| bucket_name | STRING | Cloud Storage bucket for intermediate artifacts (auto-populated). | "bpp_models" |
| train_preset | STRING | Training quality preset. | "medium_quality", "high_quality" |
| train_test_split | FLOAT | Ratio of data used for the test set. | 0.2 |
| level_score_top_k | INT | Number of score levels for probability buckets. | 10 |
| feature_max_pvalue | FLOAT | Feature-selection threshold. Lower = stricter. | 0.1 |
| fail_on_model_not_found | BOOL | Stop execution if no suitable model is found. | true |

Each filter object has the shape { "filters": [ { "field", "table", "value", "data_type", "comparison_operator" } ], "operator": "AND" | "OR" }.
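To make the filter shape concrete, here is a small Python sketch that builds a filter object and evaluates it against a single row. The helper names (`make_filter`, `make_filter_group`, `matches`) are hypothetical, and the evaluation logic is an assumption about how `IN` combined with `AND` / `OR` would behave; the platform's actual evaluation is not described on this page.

```python
def make_filter(field, table, value, data_type="STRING", comparison_operator="IN"):
    """Build one filter entry with the five keys listed above."""
    return {
        "field": field,
        "table": table,
        "value": value,
        "data_type": data_type,
        "comparison_operator": comparison_operator,
    }


def make_filter_group(filters, operator="AND"):
    """Wrap filter entries in the { "filters": [...], "operator": ... } shape."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return {"filters": list(filters), "operator": operator}


def matches(group, row):
    """Assumed semantics: every (AND) or any (OR) filter must match the row.

    Only the IN operator is handled in this sketch.
    """
    results = []
    for f in group["filters"]:
        if f["comparison_operator"] != "IN":
            raise NotImplementedError(f["comparison_operator"])
        results.append(row.get(f["field"]) in f["value"])
    return all(results) if group["operator"] == "AND" else any(results)


filter_positive = make_filter_group(
    [make_filter("deal_stage", "users_bpp", ["Won", "Renew"])]
)

print(matches(filter_positive, {"deal_stage": "Won"}))
print(matches(filter_positive, {"deal_stage": "Lost"}))
```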


Example — target from a User table

{
  "train": "auto",
  "client": "ExampleClient",
  "data_src": {
    "region": "europe-west8",
    "features": {
      "users_bpp": [
        "plan_quantity",
        "number_of_employees",
        "device_class",
        "browser",
        "distinct_countries_last_90d",
        "avg_time_to_checkout",
        "pageviews_last_30d",
        "add_to_cart_count",
        "completed_orders_count",
        "avg_order_value",
        "support_tickets_opened",
        "contract_age_months"
      ]
    },
    "dataset_id": "bpp_tables",
    "project_id": "example-bi-data",
    "filter_unknown": {
      "filters": [
        { "field": "deal_stage", "table": "users_bpp", "value": ["In Progress", "Negotiation"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_negative": {
      "filters": [
        { "field": "deal_stage", "table": "users_bpp", "value": ["Lost", "Churn"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_positive": {
      "filters": [
        { "field": "deal_stage", "table": "users_bpp", "value": ["Won", "Renew"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    }
  },
  "action_name": "purchase",
  "bucket_name": "bpp_models",
  "train_preset": "medium_quality",
  "train_test_split": 0.2,
  "level_score_top_k": 10,
  "feature_max_pvalue": 0.1,
  "fail_on_model_not_found": true
}
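Before submitting a configuration like the one above, the numeric training parameters can be sanity-checked locally. The sketch below is a hypothetical helper, not part of the product, and the accepted ranges (a split strictly between 0 and 1, a p-value threshold in (0, 1], a positive integer number of levels) are assumptions inferred from the field descriptions rather than documented limits.

```python
import json


def check_training_params(config):
    """Return a list of problems; an empty list means none were found."""
    problems = []

    def is_number(x):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(x, (int, float)) and not isinstance(x, bool)

    split = config.get("train_test_split")
    if not (is_number(split) and 0 < split < 1):
        problems.append("train_test_split should be a number strictly between 0 and 1")

    pvalue = config.get("feature_max_pvalue")
    if not (is_number(pvalue) and 0 < pvalue <= 1):
        problems.append("feature_max_pvalue should be a number in (0, 1]")

    top_k = config.get("level_score_top_k")
    if not (isinstance(top_k, int) and not isinstance(top_k, bool) and top_k > 0):
        problems.append("level_score_top_k should be a positive integer")

    if not isinstance(config.get("fail_on_model_not_found"), bool):
        problems.append("fail_on_model_not_found should be true or false")

    return problems


raw = """
{
  "train_test_split": 0.2,
  "level_score_top_k": 10,
  "feature_max_pvalue": 0.1,
  "fail_on_model_not_found": true
}
"""
problems = check_training_params(json.loads(raw))
print(problems or "no problems found")
```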

Example — target from an Event table

When the target action is tracked as an event (e.g. "Checkout Completed"), the filters apply to the event table:

{
  "train": "auto",
  "client": "ExampleClient",
  "data_src": {
    "region": "europe-west8",
    "features": {
      "contacts_bpp": [
        "industry",
        "company_size",
        "source_campaign",
        "clicks_on_emails",
        "opened_emails",
        "total_pageviews",
        "avg_pages_per_session",
        "distinct_devices",
        "distinct_browsers"
      ]
    },
    "dataset_id": "bpp_tables",
    "project_id": "example-project",
    "filter_unknown": {
      "filters": [
        { "field": "event_status", "table": "events_bpp", "value": ["In Progress", "Offer Preparation"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_negative": {
      "filters": [
        { "field": "event_status", "table": "events_bpp", "value": ["Lost", "Expired"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_positive": {
      "filters": [
        { "field": "event_status", "table": "events_bpp", "value": ["Won", "Checkout Completed"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "OR"
    }
  },
  "action_name": "checkout",
  "bucket_name": "bpp_models",
  "train_preset": "medium_quality",
  "train_test_split": 0.2,
  "level_score_top_k": 10,
  "feature_max_pvalue": 0.1,
  "fail_on_model_not_found": true
}

Feature engineering suggestions

Combine behavioural, transactional, and contextual signals. Examples by use case:

Lead qualification (B2B) — source campaign/medium, meetings booked in first 14 days, distinct countries of access, device & browser, engagement (opened emails, clicks, webinar attendance).

Newsletter subscription — pageviews last 30 days, content categories browsed, avg session duration, bounce rate, distinct devices, speed from landing to subscription.

E-commerce purchase — days since last visit, add-to-cart rate, checkout speed (cart → payment), distinct countries/IPs, failed transactions, number of categories purchased.

Contract renewal (SaaS) — contract age in months, logins per week, support-ticket open/resolve ratio, payment history, distinct device classes, geodiversity of access.

Upsell / cross-sell — current plan tier vs available tiers, interaction with upsell banners/emails, time on premium features, product lines purchased, active vs inactive modules.
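As an illustration of a speed feature such as checkout speed (cart → payment), the following Python sketch computes the elapsed time between two timestamps and averages it per user. The function names and sample timestamps are hypothetical; the only assumption about the data is that timestamps are timezone-aware, so they can be compared safely in UTC.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean


def checkout_speed_seconds(add_to_cart_at, payment_at):
    """Seconds from add-to-cart to payment, with both timestamps normalised to UTC."""
    for ts in (add_to_cart_at, payment_at):
        if ts.tzinfo is None:
            raise ValueError("timestamps must be timezone-aware")
    delta = payment_at.astimezone(timezone.utc) - add_to_cart_at.astimezone(timezone.utc)
    return delta.total_seconds()


def avg_time_to_checkout(pairs):
    """Average checkout speed over a user's (add_to_cart_at, payment_at) pairs."""
    return mean(checkout_speed_seconds(cart, paid) for cart, paid in pairs)


cest = timezone(timedelta(hours=2))  # a local UTC+2 offset, for illustration
pairs = [
    # 10:00 at UTC+2 is 08:00 UTC, so this checkout took 30 minutes.
    (datetime(2024, 5, 1, 10, 0, tzinfo=cest),
     datetime(2024, 5, 1, 8, 30, tzinfo=timezone.utc)),
    # This one took 10 minutes.
    (datetime(2024, 5, 3, 9, 0, tzinfo=timezone.utc),
     datetime(2024, 5, 3, 9, 10, tzinfo=timezone.utc)),
]

first_speed = checkout_speed_seconds(*pairs[0])
user_avg = avg_time_to_checkout(pairs)
print(first_speed, user_avg)
```

Rejecting naive timestamps keeps a local-time value from being silently compared with a UTC one, which would distort the feature by the timezone offset.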


Interpreting training results

After each successful run, the model instance detail page shows a performance summary (see Model Instances → Training results). How to read it:

  • F1-Score — balances precision and recall, from 0 (poor) to 100 (excellent). It's the main quality indicator; for business-critical use cases aim for F1 > 70.
  • Confusion matrix — the split of True/False Positives and Negatives. Use it to see whether errors skew toward false positives (riskier for high-cost campaigns) or false negatives.
  • Feature importance — which inputs drive the prediction most. Monitor these features for data quality; a low F1 often means weak or missing features.
  • Influential levels — for categorical features (e.g. job_role, lead_source), which specific values push the prediction up or down. Validate these against domain knowledge; spurious correlations warrant a data-quality check.
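For reference, the standard binary F1 can be derived from the confusion-matrix counts as shown in this short Python sketch, scaled to the 0 to 100 range used on this page. How the platform itself computes the displayed value (for example, the decision threshold or any averaging) is not stated here, so treat this as the textbook formula rather than the product's exact calculation. The counts are made up.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual positives that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score_percent(tp, fp, fn):
    """Harmonic mean of precision and recall, scaled to 0-100."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 100.0 * 2 * tp / denominator


# Hypothetical confusion-matrix counts.
tp, fp, fn = 80, 20, 30
f1 = f1_score_percent(tp, fp, fn)
print(round(precision(tp, fp), 3), round(recall(tp, fn), 3), round(f1, 1))
```

With these counts there are more false negatives (30) than false positives (20), so recall is lower than precision: the model is more likely to miss a real buyer than to flag a non-buyer.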

Best practices

  • Define non-overlapping positive / negative / unknown filters.
  • Pre-aggregate event features into user-level fields for modelling.
  • Include device and geodiversity signals (browser, device class, distinct locations).
  • Track speed features (e.g. time from add-to-cart to checkout, signup to first purchase).
  • Check that event timestamps use a consistent timezone (UTC) to avoid mislabeled targets.
  • Retrain regularly to capture new behavioural trends.
  • Target campaigns at high-probability users, and keep monitoring error rates.
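The first practice, non-overlapping filters, can be partly checked with a short script. The Python sketch below is a hypothetical helper that only compares the value lists of `IN` filters on the same table and field; it does not reason about other comparison operators or about how `AND` / `OR` combine several filters, so a clean result is not a full guarantee.

```python
from itertools import combinations


def in_values(group):
    """Map (table, field) to the set of values listed by the group's IN filters."""
    values = {}
    for f in group["filters"]:
        if f["comparison_operator"] == "IN":
            values.setdefault((f["table"], f["field"]), set()).update(f["value"])
    return values


def find_overlaps(groups):
    """Return {(name_a, name_b, table, field): shared_values} for every pair of groups."""
    overlaps = {}
    for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
        values_a, values_b = in_values(a), in_values(b)
        for table, field in values_a.keys() & values_b.keys():
            shared = values_a[(table, field)] & values_b[(table, field)]
            if shared:
                overlaps[(name_a, name_b, table, field)] = shared
    return overlaps


def group(values):
    """Build a one-filter group on users_bpp.deal_stage (illustrative data)."""
    return {
        "filters": [{"field": "deal_stage", "table": "users_bpp", "value": values,
                     "data_type": "STRING", "comparison_operator": "IN"}],
        "operator": "AND",
    }


groups = {
    "filter_positive": group(["Won", "Renew"]),
    "filter_negative": group(["Lost", "Churn", "Renew"]),  # "Renew" is listed twice by mistake
    "filter_unknown": group(["In Progress", "Negotiation"]),
}
overlaps = find_overlaps(groups)
print(overlaps)
```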