Ad testing in Google Ads can get tricky. This post will help you find your way around the metrics and answer the question: “Which of my ads is better?” I covered this topic in more depth at the biggest PPC event in the CEE region; you can check my presentation. This is also the last post in the PPC Ad series:
- Introduction to PPC Ads
- Automated Ad Creation with PPC Bee
- PPC Ads from Excel
- Ad Testing and Analysis
I hope this series helped you to better understand the universe behind Google Ads. As a bonus, you can download the Excel file which automatically segments the ads the way you need. It takes just a few clicks, thanks to Power Query.
Filter and Segment!
The core principle is to compare like with like. The Debunking Ad Testing series by Martin Roettgerding describes perfectly how easy and common it is to use the data the wrong way.
The Dark Side of Search Partners
Here comes Simpson’s paradox. When you check the aggregated data, you would pick Ad A from the picture below because of its higher CTR. However, when you split the data by placement, the second ad turns out to be the winner. It had a better CTR on Google.com and, thanks to its better quality, it won 4 times more ad auctions and got far more impressions on Search partners. A situation like this happens even with the “rotate evenly” setup in Google Ads. The system rotates the ads evenly when entering ad auctions, not when winning them: if the worse ad loses the auction, it automatically loses the impression. So, in the end, the worse ad has 4 times fewer impressions than the better ad.
This example is from Jan Zdarsa (Google, United Arab Emirates). As he says: “if it works for Google.com, it works for Search partners as well”. Therefore, I automatically filter out the Search partners data (also in the Excel template that you can download below).
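The effect is easy to reproduce. Here is a minimal Python sketch with made-up numbers (they are not the data from the picture; the ad names and network labels are only illustrative). Ad B wins on both placements, yet Ad A looks better in the aggregate because Ad B collected 4 times more low-CTR Search partners impressions.

```python
# Hypothetical (impressions, clicks) per ad and network, for illustration only.
stats = {
    "Ad A": {"Google search": (1000, 50), "Search partners": (1000, 5)},
    "Ad B": {"Google search": (1000, 60), "Search partners": (4000, 40)},
}


def ctr(impressions, clicks):
    """Click-through rate; 0.0 when there are no impressions."""
    return clicks / impressions if impressions else 0.0


def aggregated_ctr(ad):
    """CTR over all networks together (the misleading view)."""
    impressions = sum(i for i, _ in stats[ad].values())
    clicks = sum(c for _, c in stats[ad].values())
    return ctr(impressions, clicks)


def segment_ctr(ad, network):
    """CTR for a single network (the comparable view)."""
    return ctr(*stats[ad][network])


for ad in stats:
    print(ad, "aggregated:", f"{aggregated_ctr(ad):.2%}")
    for network in stats[ad]:
        print("   ", network, f"{segment_ctr(ad, network):.2%}")
```

Filtering out Search partners before comparing removes exactly the segment that distorts the aggregate here.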
Devices Differ
Check the following graph. The CTRs of paid ads on mobile and desktop are drifting further apart, partly because Google Shopping appears more often in the SERP. In any case, you can easily fall into Simpson’s trap again.
Therefore, I separate the results between mobile and desktop devices (even in the Excel template).
What about average position?
Featured snippets and Google Shopping have changed what the average position actually tells you. That is why Google killed this metric. Bloomarty describes the inaccuracy of the SERP location in his Ad series, and I borrowed the following pictures from him.
What should we use instead of the average position?
We have Segment Top vs. Other:
Top: Google Ads text ads that appear directly above Google search results.
Other: Any Google Ads text ads that don’t appear directly above Google search results are categorized as “Google search: Other.” (Google support)
And we have new metrics:
Search top impression rate [Impr. (Top) %]: how often your ad was in the Top section (above the organic results), divided by all impressions.
Search top impression rate = Impressions on top / all impressions
- Search partners are included here as well
- “All impressions” means the impressions in Top plus the impressions in Other
You can create your own [Impr. (Top) %] just for Google.com, without Search partners.
Search absolute top impression rate [Impr. (Abs. Top) %]: how often your ad was in the first position (the highest in the Top section, or absolute top), divided by all impressions.
Search absolute top impression rate = Impressions on the absolute top/Impressions
Segment | Impressions | Impr. (Top) % | Impr. (Abs. Top) % |
TOTAL | 1500 | 67% | 10% |
Abs. top | 150 | | |
Top | 1000 | | |
Other | 500 | | |
Thanks to these two metrics, you can calculate how many impressions were in the first position of the top section. I call this my new metric:
Absolute top impressions [Abs. top Impr.] (my metric)
Absolute top impressions = Impressions × Impr. (Abs. Top) %
And this number is the first trick that can help you to solve the ad analysis problem. But Google doesn’t give you this metric so easily.
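As a worked example, here is a small Python sketch. It assumes the metric is derived by multiplying total impressions by Impr. (Abs. Top) %, which matches the table above (1500 × 10% = 150). The function name is hypothetical.

```python
def absolute_top_impressions(impressions, abs_top_rate):
    """Abs. top Impr. = Impressions x Impr. (Abs. Top) %.

    abs_top_rate is a fraction (0.10 for 10%). The result is rounded
    because Google Ads reports the rate as a rounded percentage.
    """
    return round(impressions * abs_top_rate)


# The TOTAL row from the table above: 1500 impressions, 10% absolute top.
print(absolute_top_impressions(1500, 0.10))  # 150
```

Because the reported percentages are rounded, the result for small segments can be off by an impression or two.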
You don’t even get the Impr. (Abs. Top) % and Impr. (Top) % metrics when you segment by Top vs. Other. So you need to download the data twice: first segmented by Network (with search partners) and Device, and then segmented by Network (with search partners), Device and Top vs. Other. Then you calculate it yourself.
OK, so now you know how many impressions you had in the top segment and how many in the very first position, by device and by network, for each ad in each ad group.
Now it is time for the second trick: another new metric.
You can segment the ad groups by how often an impression in the Top section was also on the absolute top:
Absolute top impressions / Top impressions
That would be called something like “Absolute top impressions from Top impressions”; let’s use [Abs. Top from Top %].
Segment | Impressions | Impr. (Abs. Top) % | Abs. Top from Top % |
TOTAL | 1500 | 10% | 15% |
Abs. top | 150 | | |
Top | 1000 | | |
Other | 500 | | |
But this metric helps you only if you use it to segment ad groups into clusters, such as ad groups that had 100-71% of their Top impressions on the absolute top, then 70-41%, and 40-0%.
So you basically transform the metric into a new dimension.
You can then group all ad groups by this dimension and calculate any impression-based metric, like CTR.
So now I can evaluate the ads like in the good old times before featured snippets.
Power Query in the Excel template below splits Abs. Top from Top % into just 3 parts:
- 100-71 – most of the Top impressions were in the very first position
- 70-41 – something in between: the Top impressions were neither mostly first nor mostly at the bottom
- 40-0 – almost none of the impressions were on the very top; mostly the bottom position in the Top section
If you have more data, you can definitely divide the ad groups into more segments. However, this split works in most cases. Now you can really compare apples to apples.
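The bucketing can be sketched in a few lines of Python. This is not the Power Query code from the template, only an illustration of the idea. The rows are hypothetical, the bucketing is applied per ad within an ad group, and the handling of fractional values (for example, 70.5% goes to the 100-71 bucket) is an assumption.

```python
from collections import defaultdict


def position_bucket(abs_top_impr, top_impr):
    """Turn Abs. Top from Top % into one of the three clusters."""
    share = 100 * abs_top_impr / top_impr if top_impr else 0.0
    if share > 70:
        return "100-71"
    if share > 40:
        return "70-41"
    return "40-0"


# Hypothetical rows:
# ad group, ad, abs. top impr., top impr., impressions, clicks
rows = [
    ("Group 1", "Ad A", 900, 1000, 1200, 60),
    ("Group 1", "Ad B", 850, 1000, 1300, 78),
    ("Group 2", "Ad A", 150, 1000, 1500, 30),
    ("Group 2", "Ad B", 200, 1000, 1400, 35),
]


def ctr_by_bucket(rows):
    """Sum impressions and clicks per (bucket, ad) and return the CTR."""
    totals = defaultdict(lambda: [0, 0])  # (bucket, ad) -> [impressions, clicks]
    for _group, ad, abs_top, top, impressions, clicks in rows:
        key = (position_bucket(abs_top, top), ad)
        totals[key][0] += impressions
        totals[key][1] += clicks
    return {
        key: clicks / impressions
        for key, (impressions, clicks) in totals.items()
        if impressions
    }


for (bucket, ad), value in sorted(ctr_by_bucket(rows).items()):
    print(bucket, ad, f"{value:.2%}")
```

Within each bucket the ads were shown in roughly the same position, so their CTRs can be compared directly.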
I believe this makes it easier to visualize the results and decide which ad to choose. That is the reason why I built it: at the iPrice group we wanted to test the best-performing ads (in terms of CTR) in the SEO world as meta descriptions and titles.
Learning from the experiments:
- Do not forget to use the current meta description or title (because it really happens that it works the best).
- Test even “crazy” ideas. I was lucky because the volume we generated through Google Ads was enough to test many ad copies at once. Those we thought would win lost, and some we hesitated to test won in the end.
- Check the segmented data and compare the volume of impressions, because more impressions mean a higher volume of won ad auctions.
Pro tip: the ValueTrack parameter for ad position, {adposition}, can be appended to the Final URL as a parameter (so you can still track the position even after the end of average position).
What is wrong with impression-based metrics?
CTR is based on impressions. However, if an ad performs better, it wins more ad auctions (even the less relevant ones) and therefore ends up with a worse CTR and PPI. You can check this post from Bloomarty about why any impression-based metric can be misleading.
But is there any solution?
You can create a campaign that has only exact match keywords. So, theoretically, there will be no less relevant search terms for which your ad with a high ad rank would be shown. However, now that exact match is not that exact anymore, it can be hard to keep exact campaigns really exact. And in bigger accounts it is almost impossible to add negative keywords for every search term that is not exactly the keyword.
Waiting for the data
“How long should I wait for the data?”
“Till you get 95% statistical significance”
No, that’s not enough. You can get to this significance in a few clicks, as you can see in the table below from a webinar with Brad Geddes.
However, many SEM specialists just hunt for statistical significance instead of using logic. Bloomarty wrote another article about an A/A test experiment in which he found a “winner” with statistical significance even higher than 95%.
I am not saying that using statistical significance is bad. High significance just doesn’t mean an ad is better when you have a small dataset.
OK, I shouldn’t be hasty. But I don’t want bad ads ruining my performance. So how long should I wait, and how much data do I need?
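To see how quickly 95% can be reached, here is a rough Python sketch using a pooled two-proportion z-test. The choice of test and the numbers are assumptions for illustration; they are not taken from the webinar or from Bloomarty’s article.

```python
import math


def two_proportion_confidence(clicks_a, impr_a, clicks_b, impr_b):
    """Two-sided confidence (1 - p-value) that two CTRs differ.

    Uses a pooled two-proportion z-test with a normal approximation,
    which is itself unreliable at very low click counts.
    """
    p_a = clicks_a / impr_a
    p_b = clicks_b / impr_b
    pooled = (clicks_a + clicks_b) / (impr_a + impr_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / impr_a + 1 / impr_b))
    if se == 0:
        return 0.0
    z = abs(p_a - p_b) / se
    # 1 - p = 2 * Phi(z) - 1 = erf(z / sqrt(2))
    return math.erf(z / math.sqrt(2))


# Hypothetical: 8 clicks vs. 1 click, 100 impressions each.
confidence = two_proportion_confidence(8, 100, 1, 100)
print(f"{confidence:.1%}")
```

With only 9 clicks in total the result is already above 95%, which says more about the tiny sample than about the ads.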
Minimum confidence levels
Since I want to segment the data before making a decision, I need a lot of data to be convinced by the result. I like the approach recommended by Markéta Kabátová from uLab: increase the clicks by one and check how it affects the CTR. If the change is minor, you are ready to analyze.
This is an example for better understanding:
impressions | clicks | CTR | CTR (clicks +1) | CTR change | Should I wait? |
100 | 1 | 0.01 | 0.020 | 100% | wait |
1000 | 10 | 0.01 | 0.011 | 10% | so-so |
1000 | 20 | 0.02 | 0.021 | 5% | ready |
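The check in the table can be written as a short Python sketch. The relative CTR change after one extra click is ((clicks + 1) / impressions − clicks / impressions) / (clicks / impressions), which simplifies to 1 / clicks, so only the number of clicks matters. The thresholds (5% for “ready”, 10% for “so-so”) are read off the table and are an assumption, not a fixed rule.

```python
def relative_ctr_change(clicks):
    """Relative CTR change caused by one extra click (equals 1 / clicks)."""
    return float("inf") if clicks == 0 else 1 / clicks


def should_i_wait(clicks, ready_at=0.05, so_so_at=0.10):
    """Classify a row; the thresholds are assumptions taken from the table."""
    change = relative_ctr_change(clicks)
    if change <= ready_at:
        return "ready"
    if change <= so_so_at:
        return "so-so"
    return "wait"


# The three rows from the table above: (impressions, clicks).
for impressions, clicks in [(100, 1), (1000, 10), (1000, 20)]:
    print(impressions, clicks,
          f"{relative_ctr_change(clicks):.0%}", should_i_wait(clicks))
```

With these thresholds, an ad needs at least 20 clicks in a segment before it counts as ready.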
Power Query template
There is a video on how to use the Excel sheet. It’s done in just 3 minutes! I selected sampled data (conversion < 1) so no client is affected by sharing this with you 🙂
Input data
Filter the campaigns and ad groups that you want to analyze and select the following columns:
- Ad, Campaign, Ad group, Status and Label – for filtering
- Impr., Clicks, Cost, Conversions, Conv. value and Bounce rate – for metrics
- Impr. (Abs. Top) % and Impr. (Top) % – for segmentation
Now download all the ads with the following segments: Device and Network. Once you have that, download the report again, this time also with the Top vs. Other segment.
You might wonder why I don’t use the Google Sheets connector. Well, Google Sheets works for smaller accounts, but since the segmentations increase the number of rows, you can easily hit the Google Sheets cell limit.
Later you need to add the path to the .csv files from Google Ads. It is set in the query group Input, in the queries Ad report - device and network and Ad teport topvsother (check the following screenshots).
Change of ad variants
If you use PPC Bee, Power Query automatically extracts the ad template names from the URL (as you defined them in PPC Bee).
- Check the steps XXX – Ad variant and XXX – Ad variant 2, where I select the ad label. If the ad doesn’t have any label, Power Query checks whether there is a variant string in the URL (?ppcbee-adtext-variant= or &ppcbee-adtext-variant)
- Then it labels the ads based on those aspects in a step called XXX – Ad variant Final
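The same fallback logic (label first, then the URL parameter) can be sketched in Python. This is only an illustration, not the Power Query steps from the template. The function name and the example URLs are hypothetical; the parameter name ppcbee-adtext-variant comes from the text above.

```python
from urllib.parse import parse_qs, urlsplit


def ad_variant(label, final_url, param="ppcbee-adtext-variant"):
    """Return the ad label if present, else the variant from the URL.

    Returns None when neither the label nor the URL parameter exists.
    """
    if label:
        return label
    values = parse_qs(urlsplit(final_url).query).get(param)
    return values[0] if values else None


# Hypothetical examples
print(ad_variant("Variant A", "https://example.com/?ppcbee-adtext-variant=B"))
print(ad_variant("", "https://example.com/?utm_source=google&ppcbee-adtext-variant=C"))
print(ad_variant(None, "https://example.com/"))
```

Parsing the query string handles both the `?` and the `&` form of the parameter without matching them separately.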
Outcome and visualization
The final outcome is in the query Main metrics, which offers a lot of metrics to use. This table is loaded into your Excel once you refresh the data. After that, use a Pivot table for visualization. Power Query is a great tool, but it’s not really handy for visualization. That is why your final step is to select the desired columns in the Pivot table.
Next steps
You choose the winner and send it over to the SEO team to change the meta titles, or you pause the badly performing templates and create new ones. Do not forget to write down the results of the test, as I stress in the first part of the PPC Ad series.
If you are using PPC Bee, the structure now comes in handy. I have 2 ads for each category in PPC Bee. If I added another ad in PPC Bee, the structure would be lost: the new ads would be at the bottom and the whole PPC Bee UI would be a huge mess. How to avoid it? Just overwrite the badly performing ad with the new text and test it again. Later on, you can use the same Excel sheet, with its predefined Pivot table and data sources, so next time you will literally just click refresh.
Summary
- Think about ad testing as a contest to find the best combination of selling points that really resonates with your target group (and performs the best)
- This contest never ends.
- Maintain a database of your former tests: what was tested and the result
- Think about how to test: an A/B test, an A/B/C test or a multivariate ad test?
- Don’t be hasty but don’t wait too long
- Segment and do not trust search partners
- Don’t stop testing. And if you really don’t have time to evaluate the tests, at least leave the rotation to Google, but add new ads regularly.
I would appreciate your feedback on this PPC Ad series. Happy testing!