Feature image for 27.04.2018



Brighton SEO: George Karapalidis – Using machine learning and statistical models to predict revenue potential for search

This article was updated on: 07.02.2022

What if you could predict when a user was going to convert? Or what if you knew that a user was unlikely to convert on their own, but would return to the site and convert if they were exposed to a remarketing ad? This is the digital marketer’s dream, and in 2018 it is within our grasp thanks to machine learning.

George Karapalidis’ talk at Brighton SEO 2018 discussed how machine learning can be used to forecast ecommerce transactions and aid our digital marketing efforts.

What is machine learning? Put simply, machine learning is technology that learns from the trends and patterns in the data it has been exposed to, and builds algorithms from them that can make predictions about new data.

George’s goal is to find opportunities to increase organic revenue with the help of machine learning. SEO is changing, and we need to embrace the mindset of a data scientist to find the best opportunities.

The fact is that great SEO opportunities are often well hidden. There is always low-hanging fruit, but deciding which opportunity is worth the most money is the crucial next step if we don’t want to waste our time!

Thankfully, George has come up with a way of using Google Analytics data with Google Search Console data to identify where the opportunities lie for our websites. From Analytics, we get data about the performance of our landing pages, but what’s missing is information on which queries led users to land on those pages.

This gap is filled by Search Console, which gives us information about the queries that lead people to land on our sites, but doesn’t tell us how those landing pages are performing beyond the CTR, and definitely doesn’t tell us how much revenue we’re getting from them. The best plan is to match the data from both platforms to get a more complete picture.

We need to combine the metrics into one table. This table can show us page performance and, importantly, allow us to statistically calculate what percentage of a page’s revenue a particular search query is contributing. We need to add this to the table as another data column. At the import stage it’s also important to clean out non-commercial pages and anything else that we know we don’t need to pay attention to.
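As a rough illustration of the combined table, here is a minimal sketch in plain Python. The talk recap does not say how the revenue percentage is calculated, so this sketch assumes that a query’s share of a page’s revenue is proportional to its share of that page’s Search Console clicks. The row layouts, the example figures and the `NON_COMMERCIAL_PREFIXES` list are all hypothetical.

```python
from collections import defaultdict

# Hypothetical Analytics export: one row per landing page.
analytics_rows = [
    {"page": "/roast-chicken", "revenue": 5000.0},
    {"page": "/whole-chicken", "revenue": 2000.0},
    {"page": "/blog/news", "revenue": 0.0},
]

# Hypothetical Search Console export: one row per (page, query) pair.
search_console_rows = [
    {"page": "/whole-chicken", "query": "tesco whole chicken", "clicks": 300},
    {"page": "/whole-chicken", "query": "whole chicken", "clicks": 100},
    {"page": "/roast-chicken", "query": "roast chicken", "clicks": 200},
    {"page": "/blog/news", "query": "company news", "clicks": 50},
]

# Assumption: non-commercial pages can be recognised by their URL prefix.
NON_COMMERCIAL_PREFIXES = ("/blog/",)


def build_query_table(analytics, search_console):
    """Join both exports on landing page and add a revenue-share column."""
    # Clean out non-commercial pages at the import stage.
    revenue_by_page = {
        row["page"]: row["revenue"]
        for row in analytics
        if not row["page"].startswith(NON_COMMERCIAL_PREFIXES)
    }

    # Total clicks per page, needed to turn clicks into a share.
    clicks_by_page = defaultdict(int)
    for row in search_console:
        if row["page"] in revenue_by_page:
            clicks_by_page[row["page"]] += row["clicks"]

    table = []
    for row in search_console:
        page = row["page"]
        if page not in revenue_by_page or clicks_by_page[page] == 0:
            continue  # page was filtered out or has no click data
        share = row["clicks"] / clicks_by_page[page]
        table.append({
            "page": page,
            "query": row["query"],
            "clicks": row["clicks"],
            "revenue_share": share,
            "estimated_revenue": share * revenue_by_page[page],
        })
    return table


query_table = build_query_table(analytics_rows, search_console_rows)
for entry in sorted(query_table, key=lambda e: e["estimated_revenue"], reverse=True):
    print(entry["query"], round(entry["revenue_share"], 2), entry["estimated_revenue"])
```

Sorting by the estimated revenue column gives a first, crude ranking of queries. A click-weighted split is only one possible attribution rule, and a real implementation would likely use a more careful statistical estimate.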

With this data, we can begin to model where we might see the biggest gains. A search query that contributes a high percentage of revenue to a high-value page is likely to give us a much better ROI than a search query that contributes a very low amount of revenue overall.

However, this is just the first part of the puzzle. We also need to factor in search intent. George plots search intent in a matrix that has 2 sets of variables:

  • Brand aware vs brand unaware
  • Product aware vs product unaware

For example, ‘tesco whole chicken’ would be both brand and product aware, but ‘what meat to use for roast dinner’ would be both brand and product unaware. Most search queries, if not all, can be categorised in this way. Transactional intent – the intent to make a purchase from your site – is highest when the search query is both brand and product aware. Brand awareness always leads to higher intent than brand unawareness, because brand-unaware users still need to be convinced.
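To make the matrix concrete, here is a minimal rule-based sketch that places a query in one of the four quadrants. It is not the trained model from the talk: it simply looks for known terms, and the `BRAND_TERMS` and `PRODUCT_TERMS` vocabularies are hypothetical placeholders. Sorting brand first and product second reflects the ordering described above; ranking product-aware above product-unaware within the brand-unaware group is an assumption.

```python
# Hypothetical vocabularies; a real site would build these from its own
# brand names and product catalogue.
BRAND_TERMS = {"tesco"}
PRODUCT_TERMS = {"whole chicken", "chicken"}


def contains_term(query, terms):
    """True if any term appears in the query as a whole word or phrase."""
    padded = " " + " ".join(query.lower().split()) + " "
    return any(" " + term + " " in padded for term in terms)


def classify_query(query):
    """Return (brand_aware, product_aware) for a search query."""
    return (
        contains_term(query, BRAND_TERMS),
        contains_term(query, PRODUCT_TERMS),
    )


queries = [
    "what meat to use for roast dinner",
    "whole chicken",
    "tesco whole chicken",
]

# Tuples of booleans sort brand first, then product, so reverse order puts
# the "brand and product aware" quadrant at the top.
for query in sorted(queries, key=classify_query, reverse=True):
    print(classify_query(query), query)
```

A keyword lookup like this breaks down on misspellings and unseen products, which is one reason a learned model is attractive for large query sets.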

By training a deep learning model (a type of machine learning based on neural networks, which are loosely modelled on the human brain), we can use this matrix to identify the transactional intent behind a query automatically, allowing us to sort large numbers of queries by intent without having to look at them individually. The more data we use and the more tests we run, the better our neural networks will become at correctly assigning transactional intent to large volumes of search queries. Ultimately, each search term is given a probability, based on brand awareness, of how likely it is to convert users.

Another important factor to bear in mind is that branded CTR is very high in position 1, but drops significantly for positions 2 and 3, whereas generic CTR does decline, but is more evenly spread.

We can then use this data in our search queries table mentioned above to inform a revenue prediction model that includes page performance, click through rate and intent.
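The recap does not give the formula behind the revenue prediction model, so the following is only a sketch of how the three inputs could be combined. The CTR values in `BRANDED_CTR` and `GENERIC_CTR` are made-up placeholders that merely mimic the shapes described above (a steep drop after position 1 for branded queries, a flatter decline for generic ones). The function name, its parameters and the use of the intent probability as a simple weight are all assumptions.

```python
# Placeholder CTR curves by ranking position. These numbers are NOT from
# the talk; replace them with CTRs measured in your own Search Console data.
BRANDED_CTR = {1: 0.50, 2: 0.15, 3: 0.08}
GENERIC_CTR = {1: 0.25, 2: 0.15, 3: 0.10}


def predicted_revenue_gain(impressions, current_position, target_position,
                           branded, conversion_rate, average_order_value,
                           intent_probability):
    """Estimate the extra revenue from moving a query to a better position.

    Combines click-through rate (CTR curve), page performance (conversion
    rate and average order value) and intent (used here as a weight).
    """
    curve = BRANDED_CTR if branded else GENERIC_CTR
    if current_position not in curve or target_position not in curve:
        raise ValueError("position is outside the CTR curve")
    extra_clicks = impressions * (curve[target_position] - curve[current_position])
    return extra_clicks * conversion_rate * average_order_value * intent_probability


# Hypothetical generic query moving from position 3 to position 1.
gain = predicted_revenue_gain(
    impressions=10_000,
    current_position=3,
    target_position=1,
    branded=False,
    conversion_rate=0.02,
    average_order_value=50.0,
    intent_probability=0.5,
)
print(round(gain, 2))
```

Running this for every row of the combined query table would give a comparable estimate per query, which can then be ranked.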

Although this data can tell us which queries might be good to focus on, we also need to use common sense and more statistical analysis to factor in the time of year. If a page converts very well in summer and very poorly in winter, there’s no point spending resources to optimise it in November. Optimise the pages that will give you the best return over the next few months. This is a great way to distinguish between multiple opportunities that all seem important.
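One simple way to factor in the time of year is a seasonal index: the average revenue a page earned in the upcoming months, divided by its average monthly revenue across the whole year. This is a sketch of that idea rather than the analysis used in the talk; the `seasonal_index` helper, the three-month horizon and the revenue figures are hypothetical.

```python
def seasonal_index(monthly_revenue, start_month, horizon=3):
    """Relative strength of the next `horizon` months for a page.

    monthly_revenue: 12 values, January to December, from a previous year.
    start_month: first month of the window, 1 (January) to 12 (December).
    Returns a value above 1 if the window is stronger than an average month.
    """
    if len(monthly_revenue) != 12:
        raise ValueError("expected 12 monthly values")
    yearly_mean = sum(monthly_revenue) / 12
    if yearly_mean == 0:
        return 0.0
    # Wrap around the end of the year, e.g. November, December, January.
    upcoming = [
        monthly_revenue[(start_month - 1 + offset) % 12]
        for offset in range(horizon)
    ]
    return (sum(upcoming) / horizon) / yearly_mean


# Hypothetical page that sells well in summer and poorly in winter.
summer_page = [10, 10, 10, 20, 40, 80, 100, 100, 60, 20, 10, 10]

print(round(seasonal_index(summer_page, start_month=11), 2))  # window from November
print(round(seasonal_index(summer_page, start_month=6), 2))   # window from June
```

Multiplying each predicted opportunity by its page’s index for the coming months pushes out-of-season pages down the list.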

George tried some optimisation based on this data and saw transactions improve by 15% and revenue by 7%. This was lower than he expected, so he used the result to inform the margin for error in his ML model. The optimisation techniques he used were primarily about siloing more content under the categories, including FAQs and informational content in the category folders of the website.

Final tips for SEOs:

  • Become a data scientist.
  • Be great at data collection.
  • Automate the analysis: if it currently takes you a month to spot a great opportunity, aim to find opportunities in less than an hour.

Closing thoughts from Impression

This talk was marketed at SEOs, but Helen Freeman, one of our PPC executives, also attended and thought there were some great takeaways for PPC too. The potential for ML to help with search intent is useful across the board in search engine marketing, not only for organic search.