Action Prediction — Reference

The Action Prediction model estimates the probability that each customer will perform a specific action, such as making a purchase, renewing a contract, or subscribing to a newsletter. It outputs one probability score between 0 and 1 per user.

Input data: see AI Model Data Requirements → Action Prediction for the User/Event table requirements and minimum data volumes.

Key concepts

Targets — defined using filters for positive, negative, and unknown outcomes. Filters can be based on User tables (e.g. deal_stage = Won) or Event tables (e.g. Checkout Completed).
Features — must come from User tables. To use event-derived signals (behavioural, transactional), pre-aggregate them into user-level fields first — the Fields Builder can do this for you.
Output — predictions are written to a dedicated table linked to bpp_user_id, refreshed on the daily schedule.
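
Because features must come from a User table, event data has to be rolled up to one row per user before training. The Fields Builder does this inside the platform; the sketch below only illustrates the idea in plain Python. The event dictionaries, the `event_name` values, and the output field names are hypothetical and chosen to mirror the feature names used in the examples.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def aggregate_events(events, now):
    """Roll event rows up to one feature row per user.

    Each event is a dict with 'bpp_user_id', 'event_name' and a
    timezone-aware 'timestamp'. All names are illustrative only.
    """
    cutoff = now - timedelta(days=30)
    features = defaultdict(lambda: {"pageviews_last_30d": 0, "add_to_cart_count": 0})
    for event in events:
        row = features[event["bpp_user_id"]]
        if event["event_name"] == "pageview" and event["timestamp"] >= cutoff:
            row["pageviews_last_30d"] += 1  # only pageviews inside the window
        elif event["event_name"] == "add_to_cart":
            row["add_to_cart_count"] += 1  # lifetime count, no window
    return dict(features)


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
events = [
    {"bpp_user_id": "u1", "event_name": "pageview", "timestamp": now - timedelta(days=2)},
    {"bpp_user_id": "u1", "event_name": "pageview", "timestamp": now - timedelta(days=45)},
    {"bpp_user_id": "u1", "event_name": "add_to_cart", "timestamp": now - timedelta(days=45)},
    {"bpp_user_id": "u2", "event_name": "pageview", "timestamp": now - timedelta(days=10)},
]
user_features = aggregate_events(events, now)
```

Each resulting row can then be stored as user-level fields and listed under `data_src.features`.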

JSON configuration reference

Field	Type	Description	Example / best practice
`train`	STRING	Training mode. Usually `"auto"`.	`"auto"`
`client`	STRING	Client/project identifier.	`"ExampleClient"`
`data_src.region`	STRING	Cloud region of the dataset (auto-populated).	`"europe-west8"`
`data_src.features`	OBJECT	Tables and columns used as features. User tables only.	`"users_bpp": ["plan_quantity", "browser", "pageviews_last_30d"]`
`data_src.dataset_id`	STRING	BigQuery dataset ID (auto-populated).	`"bpp_tables"`
`data_src.project_id`	STRING	GCP project ID (auto-populated).	`"example-bi-data"`
`data_src.filter_positive`	OBJECT	Filters defining positive cases (action performed).	`deal_stage IN (Won, Renew)`
`data_src.filter_negative`	OBJECT	Filters defining negative cases (action not performed).	`deal_stage IN (Lost, Churn)`
`data_src.filter_unknown`	OBJECT	Filters to exclude ambiguous cases from training.	Users still in "Negotiation"
`action_name`	STRING	Name of the modelled action.	`"purchase"`, `"renewal"`
`bucket_name`	STRING	Cloud Storage bucket for intermediate artifacts (auto-populated).	`"bpp_models"`
`train_preset`	STRING	Training quality preset.	`"medium_quality"`, `"high_quality"`
`train_test_split`	FLOAT	Fraction of the data held out for the test set.	`0.2`
`level_score_top_k`	INT	Number of score levels for probability buckets.	`10`
`feature_max_pvalue`	FLOAT	Maximum p-value a feature may have to be kept during feature selection. Lower values are stricter.	`0.1`
`fail_on_model_not_found`	BOOL	Stop execution if no suitable model is found.	`true`

Each filter object has two keys: `filters`, a list of conditions (each with `field`, `table`, `value`, `data_type`, and `comparison_operator`), and `operator`, which is `"AND"` or `"OR"` and sets how the conditions are combined.
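
To make the shape concrete, here is a minimal sketch of how such a filter object could be evaluated against a single row. It is an illustration, not the platform's implementation: the `matches` helper is hypothetical, it supports only the `IN` operator (the only one that appears in the examples in this reference), and it assumes the row already comes from the table named in `table`.

```python
def matches(row, filter_obj):
    """Return True if `row` (a dict of column -> value) satisfies the filter object."""
    results = []
    for f in filter_obj["filters"]:
        if f["comparison_operator"] != "IN":
            raise ValueError(f"unsupported operator in this sketch: {f['comparison_operator']}")
        results.append(row.get(f["field"]) in f["value"])
    # "AND" requires every condition to hold; "OR" requires at least one.
    return all(results) if filter_obj["operator"] == "AND" else any(results)


filter_positive = {
    "filters": [
        {
            "field": "deal_stage",
            "table": "users_bpp",
            "value": ["Won", "Renew"],
            "data_type": "STRING",
            "comparison_operator": "IN",
        }
    ],
    "operator": "AND",
}

won = matches({"deal_stage": "Won"}, filter_positive)    # satisfies the filter
lost = matches({"deal_stage": "Lost"}, filter_positive)  # does not
```

With a single condition in `filters`, `"AND"` and `"OR"` give the same result; the operator only matters once a filter object lists several conditions.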

Example — target from a User table

{
  "train": "auto",
  "client": "ExampleClient",
  "data_src": {
    "region": "europe-west8",
    "features": {
      "users_bpp": [
        "plan_quantity",
        "number_of_employees",
        "device_class",
        "browser",
        "distinct_countries_last_90d",
        "avg_time_to_checkout",
        "pageviews_last_30d",
        "add_to_cart_count",
        "completed_orders_count",
        "avg_order_value",
        "support_tickets_opened",
        "contract_age_months"
      ]
    },
    "dataset_id": "bpp_tables",
    "project_id": "example-bi-data",
    "filter_unknown": {
      "filters": [
        { "field": "deal_stage", "table": "users_bpp", "value": ["In Progress", "Negotiation"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_negative": {
      "filters": [
        { "field": "deal_stage", "table": "users_bpp", "value": ["Lost", "Churn"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_positive": {
      "filters": [
        { "field": "deal_stage", "table": "users_bpp", "value": ["Won", "Renew"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    }
  },
  "action_name": "purchase",
  "bucket_name": "bpp_models",
  "train_preset": "medium_quality",
  "train_test_split": 0.2,
  "level_score_top_k": 10,
  "feature_max_pvalue": 0.1,
  "fail_on_model_not_found": true
}
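
Before submitting a configuration like the one above, a quick local sanity check can catch out-of-range values. The sketch below is not part of the product: `check_config` is a hypothetical helper, and the accepted ranges are assumptions inferred from the field descriptions in the reference table.

```python
def check_config(config):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    split = config.get("train_test_split")
    if not isinstance(split, (int, float)) or not 0 < split < 1:
        problems.append("train_test_split should be a fraction between 0 and 1")
    pvalue = config.get("feature_max_pvalue")
    if not isinstance(pvalue, (int, float)) or not 0 < pvalue <= 1:
        problems.append("feature_max_pvalue should be in (0, 1]")
    top_k = config.get("level_score_top_k")
    if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
        problems.append("level_score_top_k should be a positive integer")
    data_src = config.get("data_src", {})
    # Assumed here: training needs at least one positive and one negative condition.
    for name in ("filter_positive", "filter_negative"):
        if not data_src.get(name, {}).get("filters"):
            problems.append(f"data_src.{name} should contain at least one filter")
    for name in ("filter_positive", "filter_negative", "filter_unknown"):
        operator = data_src.get(name, {}).get("operator", "AND")
        if operator not in ("AND", "OR"):
            problems.append(f"data_src.{name}.operator should be AND or OR")
    return problems


config = {
    "train_test_split": 0.2,
    "level_score_top_k": 10,
    "feature_max_pvalue": 0.1,
    "data_src": {
        "filter_positive": {"filters": [{"field": "deal_stage"}], "operator": "AND"},
        "filter_negative": {"filters": [{"field": "deal_stage"}], "operator": "AND"},
    },
}
problems = check_config(config)                                   # no problems expected
bad_problems = check_config({**config, "train_test_split": 1.5})  # one problem expected
```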

Example — target from an Event table

When the target action is tracked as an event (e.g. "Checkout Completed"), the filters apply to the event table:

{
  "train": "auto",
  "client": "ExampleClient",
  "data_src": {
    "region": "europe-west8",
    "features": {
      "contacts_bpp": [
        "industry",
        "company_size",
        "source_campaign",
        "clicks_on_emails",
        "opened_emails",
        "total_pageviews",
        "avg_pages_per_session",
        "distinct_devices",
        "distinct_browsers"
      ]
    },
    "dataset_id": "bpp_tables",
    "project_id": "example-project",
    "filter_unknown": {
      "filters": [
        { "field": "event_status", "table": "events_bpp", "value": ["In Progress", "Offer Preparation"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_negative": {
      "filters": [
        { "field": "event_status", "table": "events_bpp", "value": ["Lost", "Expired"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "AND"
    },
    "filter_positive": {
      "filters": [
        { "field": "event_status", "table": "events_bpp", "value": ["Won", "Checkout Completed"], "data_type": "STRING", "comparison_operator": "IN" }
      ],
      "operator": "OR"
    }
  },
  "action_name": "checkout",
  "bucket_name": "bpp_models",
  "train_preset": "medium_quality",
  "train_test_split": 0.2,
  "level_score_top_k": 10,
  "feature_max_pvalue": 0.1,
  "fail_on_model_not_found": true
}

Feature engineering suggestions

Combine behavioural, transactional, and contextual signals. Examples by use case:

Lead qualification (B2B) — source campaign/medium, meetings booked in first 14 days, distinct countries of access, device & browser, engagement (opened emails, clicks, webinar attendance).

Newsletter subscription — pageviews last 30 days, content categories browsed, avg session duration, bounce rate, distinct devices, speed from landing to subscription.

E-commerce purchase — days since last visit, add-to-cart rate, checkout speed (cart → payment), distinct countries/IPs, failed transactions, number of categories purchased.

Contract renewal (SaaS) — contract age in months, logins per week, support-ticket open/resolve ratio, payment history, distinct device classes, geodiversity of access.

Upsell / cross-sell — current plan tier vs available tiers, interaction with upsell banners/emails, time on premium features, product lines purchased, active vs inactive modules.
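
As an illustration of a speed feature such as checkout speed (cart → payment), the sketch below computes the time between a user's first add-to-cart and the first checkout that follows it. The event names, the input layout, and the output unit (minutes) are assumptions made for the example.

```python
from datetime import datetime, timezone


def minutes_to_checkout(events):
    """Minutes from a user's first add-to-cart to their first later checkout.

    `events` is a list of (event_name, timestamp) tuples for one user.
    Returns None when no checkout follows an add-to-cart.
    """
    cart_time = None
    for name, ts in sorted(events, key=lambda e: e[1]):
        if name == "add_to_cart" and cart_time is None:
            cart_time = ts
        elif name == "checkout_completed" and cart_time is not None:
            return (ts - cart_time).total_seconds() / 60
    return None


user_events = [
    ("pageview", datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)),
    ("add_to_cart", datetime(2024, 5, 1, 9, 5, tzinfo=timezone.utc)),
    ("checkout_completed", datetime(2024, 5, 1, 9, 35, tzinfo=timezone.utc)),
]
speed = minutes_to_checkout(user_events)  # 30.0 minutes
no_cart = minutes_to_checkout(user_events[:1])  # None: no add-to-cart, no checkout
```

Averaging this value over a user's sessions would give a user-level field comparable to `avg_time_to_checkout` in the first example configuration.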

Interpreting training results

After each successful run, the model instance detail page shows a performance summary (see Model Instances → Training results). How to read it:

F1-Score — balances precision and recall, from 0 (poor) to 100 (excellent). It's the main quality indicator; for business-critical use cases aim for F1 > 70.
Confusion matrix — the split of True/False Positives and Negatives. Use it to see whether errors skew toward false positives (riskier for high-cost campaigns) or false negatives.
Feature importance — which inputs drive the prediction most. Monitor these features for data quality; a low F1 often means weak or missing features.
Influential levels — for categorical features (e.g. job_role, lead_source), which specific values push the prediction up or down. Validate these against domain knowledge; spurious correlations warrant a data-quality check.
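
The relationship between the confusion matrix and the F1-Score can be worked through by hand. The sketch below uses the standard definitions of precision, recall, and F1, scaled to the 0 to 100 range described above. The counts are made up for illustration, and the platform's exact computation (for example, any averaging across classes) is not specified in this reference.

```python
def f1_from_confusion(tp, fp, fn):
    """Compute precision, recall and F1 (each on a 0-100 scale) from counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return round(precision * 100, 1), round(recall * 100, 1), round(f1 * 100, 1)


# Hypothetical counts: 80 true positives, 20 false positives, 40 false negatives.
precision, recall, f1 = f1_from_confusion(tp=80, fp=20, fn=40)
# precision = 80.0, recall = 66.7, f1 = 72.7
```

In this example the F1 of about 72.7 sits just above the F1 > 70 guideline, while the recall of 66.7 shows that a third of the users who actually performed the action were missed.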

Best practices

Define non-overlapping positive / negative / unknown filters.
Pre-aggregate event features into user-level fields for modelling.
Include device and geodiversity signals (browser, device class, distinct locations).
Track speed features (e.g. time from add-to-cart to checkout, signup to first purchase).
Check that event timestamps are stored in UTC; inconsistent timezones can shift events across date boundaries and mislabel targets.
Retrain regularly to capture new behavioural trends.
Target campaigns at high-probability users, and keep monitoring error rates.
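
The first practice in this list, non-overlapping filters, can be checked mechanically when all three filters use `IN` on the same field, as in the examples in this reference. The helper below is a hypothetical sketch under that assumption; it does not handle other operators or filters that combine several fields.

```python
from itertools import combinations


def overlapping_values(data_src):
    """Find values that appear in more than one of the three target filters.

    Only compares `IN` conditions that refer to the same (table, field) pair.
    Returns a dict mapping (filter_a, filter_b) to the set of shared values.
    """
    seen = {}
    for name in ("filter_positive", "filter_negative", "filter_unknown"):
        for f in data_src.get(name, {}).get("filters", []):
            if f["comparison_operator"] == "IN":
                key = (f["table"], f["field"])
                seen.setdefault(name, {}).setdefault(key, set()).update(f["value"])
    overlaps = {}
    for a, b in combinations(seen, 2):
        for key in seen[a].keys() & seen[b].keys():
            shared = seen[a][key] & seen[b][key]
            if shared:
                overlaps[(a, b)] = overlaps.get((a, b), set()) | shared
    return overlaps


def in_filter(values):
    """Build a one-condition filter object on users_bpp.deal_stage (example only)."""
    return {
        "filters": [{"field": "deal_stage", "table": "users_bpp", "value": values,
                     "data_type": "STRING", "comparison_operator": "IN"}],
        "operator": "AND",
    }


data_src = {
    "filter_positive": in_filter(["Won", "Renew"]),
    "filter_negative": in_filter(["Lost", "Churn", "Renew"]),  # "Renew" repeated by mistake
    "filter_unknown": in_filter(["In Progress", "Negotiation"]),
}
overlaps = overlapping_values(data_src)
# {("filter_positive", "filter_negative"): {"Renew"}}
```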
