
Information Gain in SEO

Sara Taher
Google Patent on Information Gain

Every once in a while, there's a new buzzword in SEO. This time it's "information gain". The term is not new and has been in SEO for a while. It dates back to a patent that was filed by Google in 2018 and published in 2020.

I believe this is an important concept that can definitely up your SEO content game. Gone are the days when all you needed to do was skyscraper the top 10 results and call it a day. It's time to approach content creation differently and update our content optimization playbooks because:

But how do we tackle those waves of change coming our way? One way to do that, definitely, is to leverage the "information gain" concept.

💡
If you don't want to read through the theory, click here to jump directly to tips on how to improve the information gain score for your content.

What is information gain in SEO?

Simply put, an information gain score is like a rating that tells you how much new and useful information a webpage contains. If a page has a high score, it means it has a lot of new stuff you haven't seen before. If it has a low score, it might not have much new information for you.

Here's an example of an information gain score: a user is searching the internet for information on "how to choose running shoes". When the user clicks on a webpage, they gain new information on "running shoes". The "information gain score" is the measure of how much new information they're learning about "running shoes" from that webpage compared to what they already learned from other pages they've seen before on that topic.

There's some confusion about "information gain" in SEO and in ML. While the term is the same, the two concepts are very different. I usually don't include a lot of theory in my blogs, but I believe this is important to clear up the confusion I'm seeing about what information gain is in SEO and how it differs from the concept in Machine Learning.

Information gain in SEO vs. in Machine Learning

In Machine Learning, "information gain is a measure used to select the most informative feature (attribute) for splitting data when building decision trees. For instance, consider an email spam filter, where we have several features to help us decide whether an email is spam or not. Some of those could be the sender's domain, subject line, presence of certain words like 'free' or 'offer', number of misspellings, time of day it was sent, and more. We calculate the IG for each feature, select the one with the higher score as a root node to split, and iterate the process until we select all the features for splitting. IG is closely related to the concept of entropy, which measures the amount of impurity in a dataset. Therefore, we can relate IG to a reduction in the entropy," says Nirmal Budhathoki, Senior Data Scientist at Microsoft.
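To make the Machine Learning definition concrete, here is a minimal Python sketch of the entropy-based calculation described in the quote. The four toy emails and the feature names `has_free` and `sent_at_night` are made up for illustration; real decision tree libraries handle this for you and work on far larger datasets.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy of the target minus the weighted entropy after splitting on `feature`."""
    parent = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - remainder


# Hypothetical toy dataset: two features and a spam label per email.
emails = [
    {"has_free": True, "sent_at_night": True, "spam": True},
    {"has_free": True, "sent_at_night": False, "spam": True},
    {"has_free": False, "sent_at_night": True, "spam": False},
    {"has_free": False, "sent_at_night": False, "spam": False},
]

features = ["has_free", "sent_at_night"]
gains = {f: information_gain(emails, f, "spam") for f in features}

# The feature with the highest gain would become the root split.
best_feature = max(gains, key=gains.get)
print(gains, best_feature)
```

In this toy data the word "free" separates spam from non-spam perfectly, so it removes all the entropy, while the time of day tells us nothing. That is the sense in which IG is a reduction in entropy.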

In SEO, information gain refers to the Google patent titled "Contextual estimation of link information gain". The patent discusses the information gain score for a given document. According to this patent, the information gain score is indicative of additional information that is included in the document beyond information contained in documents that were previously viewed by the user. In some implementations, the information gain score may be determined for one or more documents by applying data from the documents across a machine learning model to generate an information gain score. Based on the information gain scores of a set of documents, the documents can be provided to the user in a manner that reflects the likely information gain that can be attained by the user if the user were to view the documents. If you'd like to dig deeper into how the information gain score is calculated, check this post by the late Bill Slawski.
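To build some intuition for the SEO version, here is a rough Python sketch. It is not the patent's method: the patent describes applying document data across a machine learning model, while this toy stand-in simply counts how many distinct words in a candidate page the user has not already seen. The function names `novelty_score` and `rank_by_novelty` and the sample texts are my own assumptions for illustration.

```python
import re


def tokens(text):
    """Lowercased set of distinct words in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def novelty_score(candidate, already_seen):
    """Share of the candidate's distinct words found in none of the seen documents.

    A crude word-overlap proxy, NOT the machine learning model the patent describes.
    """
    seen_words = set()
    for doc in already_seen:
        seen_words |= tokens(doc)
    candidate_words = tokens(candidate)
    if not candidate_words:
        return 0.0
    return len(candidate_words - seen_words) / len(candidate_words)


def rank_by_novelty(candidates, already_seen):
    """Order candidate documents from most to least new information."""
    return sorted(candidates, key=lambda d: novelty_score(d, already_seen), reverse=True)


# Hypothetical example: what the user already read vs. two candidate pages.
seen = ["running shoes need good cushioning"]
pages = [
    "running shoes need good cushioning",
    "trail running shoes need rock plates",
]

for page in rank_by_novelty(pages, seen):
    print(round(novelty_score(page, seen), 2), page)
```

The page that repeats what the user already read scores zero, and the page that adds trail-specific details ranks first. A real system would need to judge meaning rather than raw words, but the ordering idea is the same as the one the patent describes.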

How to improve the information gain score for your content

Marketing is a team sport, and so is SEO. You'll need to work closely with the content team/person on this one. Here are a few things you can do to improve the information gain score for your content:

  • I love this tip from Steve Toth. Every copywriter (and SEO writing tool) is using the top 10 results to build the skeleton for their content. We don't want that! We want unique content. How do you find unique content ideas to include in your blog? Search for [keyword] filetype:pdf or [keyword] filetype:ppt to find new content and information that's different from what is already in SERPs. Here's an example:
How to use filetype google search operator
  • Another tip by Eric Lancheres is simply adapting and repositioning existing content as new information. For example: if the common knowledge on Google is "1 in 10 people love salmon", this information can be repositioned as "10% of people like salmon".
  • Use perplexity.ai. I love this search engine. It does incorporate Google ranking signals, but it also has its own ranking system. The biggest advantage of this conversational search engine is that it lists the sources for its answers, and sometimes they are not your traditional top 10 results on Google. It's definitely worth checking out and familiarizing yourself and the content team with it.
Example of a search on Perplexity.ai for "what is perplexity.ai"
  • Another tip, by Manny Diaz, that Steve Toth mentioned in a recent newsletter is using consensus.app to find new information. This tool answers your questions by summarizing academic papers! Cool, right?
Consensus AI search engine for Research
  • Very unpopular opinion: give your copywriters more freedom to structure the content. The more you define it for them using the traditional skyscraper method the more likely their output will be just another version of what's in SERPs. Encourage them to add their personal twist, thoughts, and experiences.
  • Create your own surveys so you have your own data to use on your blogs. This can be a simple poll on LinkedIn or Twitter, a Google Form, customer surveys, or simply analyzing trends in sales calls. Whichever way you choose, the point is to have your own unique data.
  • Another option, if you have the budget, is to use statista.com to buy the data you need. This is not only useful for improving your information gain scores, but can also attract quality backlinks!
Statista data marketplace
  • Finally, you can always reach out to subject matter experts for a quote on their perspective or opinion. This will enrich your content and improve its EEAT, and experts will share your content with their network as well. A recipe for success!
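If you research many keywords, the filetype tip from the list above is easy to script. Here is a small Python sketch that builds the queries and matching Google search URLs to open in a browser. The helper names `filetype_queries` and `google_search_url` are my own, and the sketch only builds links; it doesn't fetch or scrape results.

```python
from urllib.parse import urlencode


def filetype_queries(keyword, filetypes=("pdf", "ppt")):
    """Build one '[keyword] filetype:ext' query per file type."""
    return [f"{keyword} filetype:{ft}" for ft in filetypes]


def google_search_url(query):
    """Turn a query string into a URL-encoded Google search link."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Example keyword taken from the running shoes example earlier in this post.
for query in filetype_queries("how to choose running shoes"):
    print(query)
    print(google_search_url(query))
```

You can paste the printed links into a spreadsheet for your writers, or extend the `filetypes` tuple with other formats you want to check.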

Do you have any more tips? Feel free to share them with me and I'll include them here! Thanks for reading :)
