
Interest Analysis — Event Table Requirements

This page describes the requirements for event tables used in the Interest Analysis model. BPP supports three types of interest prediction:

  • Product Interest — based on an explicit product classification already in the event row.
  • Custom Interest — inferred from content/page analysis using a custom taxonomy.
  • IAB Interest — inferred using the industry-standard IAB taxonomy.

General requirements for event tables

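Every event table used for Interest Analysis shares the columns below. The first two are mandatory for every interest type; event_label and page_url are required only for specific types, as described in the following sections.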

Column name     | Type     | Description
----------------|----------|--------------------------------------------------------------------
event_timestamp | DATETIME | When the event occurred (mandatory, UTC)
hashed_email    | STRING   | A user identifier (mandatory; any consistent identifier works)
event_label     | STRING   | Interest classification label (optional, see below)
page_url        | STRING   | Full URL of the visited page (optional, used for content classification)

All timestamps must be in UTC. The user identifier must be consistent across tables.
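
The general requirements above can be checked before a table is loaded. The following is a minimal sketch, not part of BPP: the function name check_event_row and the timestamp layout are assumptions, chosen to match the example rows on this page.

```python
from datetime import datetime

# Columns that are mandatory for every interest type.
MANDATORY_COLUMNS = ("event_timestamp", "hashed_email")

# Assumed textual layout of event_timestamp, matching the example rows.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def check_event_row(row):
    """Return a list of problems found in one event row (empty if none)."""
    problems = []
    for column in MANDATORY_COLUMNS:
        if not row.get(column):
            problems.append(f"missing {column}")
    timestamp = row.get("event_timestamp")
    if timestamp:
        try:
            datetime.strptime(timestamp, TIMESTAMP_FORMAT)
        except ValueError:
            problems.append("event_timestamp is not in the assumed format")
    return problems
```

The check cannot tell whether a naive timestamp is really in UTC, so that conversion still has to happen upstream.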


Product Interest

Scenario: the interest label is already available in the event row (e.g. from product/category tracking).

Required columns: event_timestamp, hashed_email, event_label.

Label format: hierarchical, using the double-pipe separator (||):

Electronics||Smartphones
Home & Garden||Furniture||Chairs
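
To illustrate how the double-pipe format encodes a hierarchy, here is a small sketch that splits a label into its levels and lists each ancestor path. The helper names split_label and ancestor_paths are hypothetical and only serve the example.

```python
SEPARATOR = "||"


def split_label(label):
    """Split a hierarchical label into its levels, from broadest to narrowest."""
    return [level.strip() for level in label.split(SEPARATOR)]


def ancestor_paths(label):
    """Return every prefix of the hierarchy, ending with the full label."""
    levels = split_label(label)
    return [SEPARATOR.join(levels[:depth]) for depth in range(1, len(levels) + 1)]


print(split_label("Home & Garden||Furniture||Chairs"))
print(ancestor_paths("Electronics||Smartphones"))
```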

Example row

event_timestamp | hashed_email | event_label
---------------------|-----------------|---------------------------
2024-05-01 12:01:00 | a1b2c3@hash.com | Fashion||Shoes||Sneakers

IAB & Custom Interest

Scenario: the event lacks explicit categorization. BPP infers it through content classification.

Required columns: event_timestamp, hashed_email, page_url (must include the protocol, e.g. https://).
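
A quick way to catch page_url values without a protocol is to parse them with Python's standard library. This is a sketch under one assumption: that http and https are the only acceptable protocols, which this page does not state explicitly.

```python
from urllib.parse import urlparse

# Assumed set of acceptable protocols; the page only gives https:// as an example.
ALLOWED_SCHEMES = ("http", "https")


def has_protocol(page_url):
    """Return True if page_url carries an allowed protocol and a host name."""
    parsed = urlparse(page_url)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)
```

For example, has_protocol("www.example.com/blog/laptops") is False because the value has no protocol, while the same URL prefixed with https:// passes.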

How it works

  1. A topic classification phase analyses the content of page_url.
  2. An interest label is generated and written into event_label.
  3. Interest modelling then uses event_label exactly as in Product Interest.

The taxonomy used for classification depends on the interest type:

  • IAB Interest uses the predefined IAB taxonomy.
  • Custom Interest uses a client-specific taxonomy defined during onboarding.
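
The flow above can be sketched as follows. This only illustrates the data flow: classify_page is a hypothetical stand-in that matches a keyword in the URL, whereas the real classification phase analyses page content against the IAB or client taxonomy.

```python
def classify_page(page_url):
    """Hypothetical stand-in for the topic classification phase."""
    # Toy keyword lookup, for illustration only.
    keyword_to_label = {"laptops": "Technology||Computers"}
    for keyword, label in keyword_to_label.items():
        if keyword in page_url:
            return label
    return None


def add_event_label(event):
    """Return a copy of the event with event_label filled in if it was missing."""
    labelled = dict(event)
    if not labelled.get("event_label"):
        labelled["event_label"] = classify_page(labelled["page_url"])
    return labelled


event = {
    "event_timestamp": "2024-06-10 09:45:00",
    "hashed_email": "xyz789@hash.com",
    "page_url": "https://www.example.com/blog/laptops",
}
print(add_event_label(event)["event_label"])
```

Once event_label is populated, the row has the same shape as a Product Interest row and can be handled the same way downstream.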

Before classification

event_timestamp | hashed_email | page_url
---------------------|-----------------|------------------------------------------
2024-06-10 09:45:00 | xyz789@hash.com | https://www.example.com/blog/laptops

After classification

event_timestamp | hashed_email | page_url | event_label
---------------------|-----------------|---------------------------------------|------------------------
2024-06-10 09:45:00 | xyz789@hash.com | https://www.example.com/blog/laptops | Technology||Computers

Notes

  • event_label is optional at ingestion, but required before interest models can run.
  • For Product Interest, event_label must be present at load time.
  • For IAB/Custom Interest, event_label is generated automatically.

Best practices

  • Use consistent, meaningful hierarchies in event_label (max 3 levels).
  • Avoid duplicate or conflicting labels for the same event.
  • Ensure all page_url values are reachable and crawlable for classification.
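
The first two practices lend themselves to an automated check. The sketch below makes one assumption not stated on this page: that an event is identified by the pair (event_timestamp, hashed_email). The function names are hypothetical.

```python
from collections import defaultdict

MAX_LEVELS = 3  # maximum hierarchy depth recommended above


def too_deep(label):
    """Return True if the label has more levels than recommended."""
    return len(label.split("||")) > MAX_LEVELS


def conflicting_events(rows):
    """Map each event key to its labels when more than one distinct label exists."""
    labels_by_event = defaultdict(set)
    for row in rows:
        # Assumed event key: timestamp plus user identifier.
        key = (row["event_timestamp"], row["hashed_email"])
        labels_by_event[key].add(row["event_label"])
    return {
        key: sorted(labels)
        for key, labels in labels_by_event.items()
        if len(labels) > 1
    }
```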

Summary

Interest type | Required columns                           | Label source           | Notes
--------------|--------------------------------------------|------------------------|-----------------------------------
Product       | event_timestamp, hashed_email, event_label | From event tracking    | Label must be present at load time
IAB           | event_timestamp, hashed_email, page_url    | Generated via scraping | Applied only to pageview events
Custom        | event_timestamp, hashed_email, page_url    | Generated via scraping | Taxonomy defined per client
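
The summary table can be expressed as a lookup that reports which required columns a row is missing at load time. This is an illustrative sketch; the keys "product", "iab" and "custom" and the function name are assumptions, not BPP identifiers.

```python
# Required columns per interest type, taken from the summary table.
REQUIRED_AT_LOAD = {
    "product": ("event_timestamp", "hashed_email", "event_label"),
    "iab": ("event_timestamp", "hashed_email", "page_url"),
    "custom": ("event_timestamp", "hashed_email", "page_url"),
}


def missing_at_load(row, interest_type):
    """Return the required columns that are absent or empty in the row."""
    return [
        column
        for column in REQUIRED_AT_LOAD[interest_type]
        if not row.get(column)
    ]
```

A row that carries only event_timestamp and hashed_email would report event_label as missing for Product Interest and page_url as missing for IAB or Custom Interest.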