Written By
Steven Chen
When Quantcast customers design their ad campaigns, they can choose from trillions of different combinations of audience traits, inventory attributes, and campaign goals. Ara, the AI and machine learning engine that powers the Quantcast Platform, can reach specific audiences across hundreds of metro areas, thousands of domains, and millions of user interests, to name just a few dimensions. With so many possible dimensions, it can be challenging for our customers to immediately understand the scope of the audience they are trying to reach and how their campaign’s performance may change with their selected configuration. To make this easier, pre-campaign forecasting gives our customers an estimate of key performance indicators, such as how many impressions their ad sets will likely deliver and what their audience reach will be with a given configuration. At Quantcast, we use our technical expertise and modeling techniques to build a robust pre-campaign forecasting pipeline that can return a forecast based on trillions of data points within a few seconds.
What exactly are we trying to forecast?
While building an ad set, our customers are primarily interested in the ad set’s potential reach and in the impressions and reach it is likely to deliver.
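When forecasting how an ad set will deliver under a customer’s campaign constraints, we want to align as closely as possible with how Ara would optimize that ad set’s performance in real time. These constraints fall into two groups according to how Ara meets them: binary constraints, which a bid opportunity either satisfies or does not (for example, geography, interests, or domain), and probabilistic constraints, for which Ara predicts a probability at bid time (for example, viewability or demographics).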
Forecasting potential reach
When forecasting potential reach, we consider only binary constraints, because Ara predicts probabilities for probabilistic constraints at bid time. These binary constraints can be represented as Boolean logic conditions, such as (“from Utah” OR “from Ohio”) AND (“interested in tacos” OR “interested in burritos”) AND (“on cnn.com” OR “on bbc.com”). Previous approaches have attempted to forecast potential reach under binary constraints by imposing conditional independence assumptions between categories of constraints or by approximating the distribution of potential reach with tree-based models. Fortunately, at Quantcast we can take a direct approach with very few assumptions. Using our Kamke database, we can compute intersections and unions across billions of bid opportunities from the past week to project the number of bid opportunities or users that satisfy the full condition, for example (“from Utah” OR “from Ohio”) AND (“interested in tacos” OR “interested in burritos”) AND (“on cnn.com” OR “on bbc.com”). Kamke can return this estimate within a second.
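To make the union-and-intersection idea concrete, here is a minimal sketch in Python. It is not how Kamke is implemented: plain Python sets stand in for whatever index Kamke uses, and the trait table, the `potential_reach` function, and the `sample_rate` parameter are all hypothetical names with made-up toy data.

```python
# Toy illustration only. Plain sets stand in for Kamke's internal structures,
# which the article does not describe. All IDs below are made up.
TRAIT_USERS = {
    "from Utah": {1, 2, 3, 4},
    "from Ohio": {5, 6, 7},
    "interested in tacos": {1, 2, 5, 8},
    "interested in burritos": {3, 6, 9},
    "on cnn.com": {1, 3, 5, 9},
    "on bbc.com": {2, 6, 8},
}


def potential_reach(groups, trait_users, sample_rate=1.0):
    """Count users matching (OR within each group) AND (across groups).

    sample_rate is an assumed knob: the fraction of traffic the sets cover,
    used to scale the sampled count up to a projection.
    """
    matched = None
    for group in groups:
        # OR: union of every trait listed in this group.
        union = set()
        for trait in group:
            union |= trait_users.get(trait, set())
        # AND: intersect with what has matched so far.
        matched = union if matched is None else matched & union
    count = len(matched) if matched is not None else 0
    return count / sample_rate


groups = [
    ["from Utah", "from Ohio"],
    ["interested in tacos", "interested in burritos"],
    ["on cnn.com", "on bbc.com"],
]
reach = potential_reach(groups, TRAIT_USERS)
```

Because the count comes straight from set operations on observed data, no independence assumption between geography, interest, and domain is needed, which is the property the direct approach relies on.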
Forecasting impressions and reach
The dynamics and models that make up Ara’s multi-goal optimization controller for bidding are in continuous development. To avoid training against a moving target, we decouple and simplify our pre-campaign forecasting pipeline by treating Ara’s controller as a black box that delivers a number of impressions given a daily budget and a set of probabilistic constraints. Concretely, we fit a model to the following relationship: F(budget, constraint_0, constraint_1, …, constraint_n) → impressions.
The campaign goal also has a significant impact on this relationship (e.g., video-view optimized campaigns usually produce higher costs per impression than conversion-optimized campaigns), so we partition our training data by campaign goal and train a separate model for each goal.
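The partition-by-goal idea can be sketched as follows. The article does not say which model family is used for F, so this example swaps in the simplest possible stand-in: a least-squares line through the origin on budget alone, fitted separately per goal. The rows, goal names, and function names are hypothetical, and the probabilistic constraints are left out for brevity.

```python
from collections import defaultdict

# Made-up training rows: (campaign_goal, daily_budget, delivered_impressions).
ROWS = [
    ("video_view", 100.0, 5_000.0),
    ("video_view", 200.0, 10_000.0),
    ("conversion", 100.0, 20_000.0),
    ("conversion", 300.0, 60_000.0),
]


def fit_per_goal(rows):
    """Fit impressions ~ slope * budget separately for each campaign goal."""
    by_goal = defaultdict(list)
    for goal, budget, impressions in rows:
        by_goal[goal].append((budget, impressions))

    models = {}
    for goal, points in by_goal.items():
        # Closed-form least squares for a line through the origin.
        sxy = sum(b * i for b, i in points)
        sxx = sum(b * b for b, _ in points)
        models[goal] = sxy / sxx
    return models


def forecast_impressions(models, goal, budget):
    """Look up the goal-specific model and apply it to a daily budget."""
    return models[goal] * budget


models = fit_per_goal(ROWS)
```

In this toy data the video-view slope is lower than the conversion slope, mirroring the observation that video-view campaigns tend to cost more per impression. A real F would take the constraint features as additional inputs.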
To account for the probabilistic constraints, we project each constraint into a richer feature space by estimating the resulting reduction in impressions over a continuous range for that constraint (e.g., a viewability rate of 0%-100%). For demographics, each demographic category (age, gender, education level, etc.) is projected into its own space, and its possible values are translated into a continuous range based on the empirical distribution of that category’s groups (per country). Training sets for these models are taken from campaigns optimized to maximize viewability rates and demographic compositions. We then combine these projections to compute our final impression forecast as: F(budget, constraint_0, constraint_1, …, constraint_n, p1(viewability), p2(age), …) → impressions.
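One plausible reading of "translated into a continuous range" is sketched below, purely as an illustration: a selection of demographic groups is mapped to the share of the population those groups cover in a given country, which yields a value between 0 and 1. The article does not spell out the actual mapping, so the `AGE_SHARE` table, its numbers, and the `selected_share` function are assumptions.

```python
# Hypothetical empirical share of each age group in one country.
AGE_SHARE = {"18-24": 0.125, "25-34": 0.25, "35-44": 0.25, "45+": 0.375}


def selected_share(selected_groups, share_by_group):
    """Map a set of selected groups to a continuous value in [0, 1].

    The value is the fraction of the category's empirical distribution
    covered by the selection (an assumed encoding, not a confirmed one).
    """
    total = sum(share_by_group.values())
    covered = sum(share_by_group[group] for group in selected_groups)
    return covered / total


age_feature = selected_share(["18-24", "25-34"], AGE_SHARE)
```

A feature like `age_feature` could then be fed to a per-category model such as p2(age), in the same way a target viewability rate feeds p1(viewability).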
Finally, to forecast reach, we predict the ad set’s frequency (impressions per unique browser) and then divide our impressions forecast by that frequency. For ad sets without frequency goals, we derive the frequency from the ad set’s explicitly set frequency cap or, if no cap is set, from a global frequency computed over all campaigns without frequency caps. For ad sets with frequency goals, we use historical data compiled from campaigns optimized towards frequency goals to train a model that estimates the change in reach induced by the frequency goal: G(impressions, frequency_goal) → frequency, where reach = impressions / frequency.
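For ad sets without frequency goals, the final step reduces to simple arithmetic, sketched here under the simplifying assumption that the frequency equals the cap when one is set and the global frequency otherwise. The function and parameter names are hypothetical, and the learned model G for ad sets with frequency goals is not shown.

```python
from typing import Optional


def forecast_reach(
    impressions: float,
    frequency_cap: Optional[float],
    global_frequency: float,
) -> float:
    """Convert an impressions forecast into a reach forecast.

    Assumes frequency = cap if a cap is set, else the global frequency.
    """
    frequency = frequency_cap if frequency_cap is not None else global_frequency
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    # reach = impressions / frequency (impressions per unique browser).
    return impressions / frequency


capped = forecast_reach(12_000, frequency_cap=4, global_frequency=3)
uncapped = forecast_reach(12_000, frequency_cap=None, global_frequency=3)
```

With 12,000 forecast impressions, a cap of 4 gives a reach of 3,000 browsers, while falling back to a global frequency of 3 gives 4,000.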