Discovering, Crawling, Extracting and Indexing at Bing with Fabrice Canel

Fabrice was on the podcast last year talking about JavaScript and the new indexing API. That was very interesting and he shared quite a few insights…

This Episode Takes the Conversation a BIG step Further

This conversation is on a whole different planet. Fabrice is head of the entire discovery-crawling-extracting-indexing process. Think about how much that involves. And how important he and his team are to the process of getting your content to the top of the results.

You cannot hope to get your content into search results if it isn’t found, crawled, extracted and indexed… and since he manages every single one of those steps, he is a person we really need to listen to.

Bingbot and Googlebot Function in Much the Same Way

Obviously they don’t function exactly the same way down to the tiniest detail. But close enough …

  1. the process is exactly the same
    (discover, crawl, extract, index)
  2. the content they are indexing is exactly the same
  3. the problems they face are exactly the same
  4. the underlying technology they use is the same

So the details of exactly how they achieve each step will differ. But they are faced with the same environment and aim to do the same thing – index the web effectively. So, we can safely assume Google deals with the discovery-crawling-extracting-indexing process in a manner very, very close to Bing.
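To make those four steps concrete, here is a toy sketch in Python. It only illustrates the general discover-crawl-extract-index loop and is in no way how Bingbot or Googlebot are actually built. The tiny in-memory "web" (`FAKE_WEB`), the `PageParser` class and the `run_pipeline` function are all made up for this example, and a real crawler would fetch pages over HTTP, respect robots.txt, render JavaScript and much more.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" (URL -> HTML) so the sketch runs offline.
FAKE_WEB = {
    "https://example.com/": (
        '<html><body><h1>Home</h1>'
        '<a href="https://example.com/about">About</a></body></html>'
    ),
    "https://example.com/about": (
        "<html><body><h1>About Bing</h1>"
        "<p>Crawling and indexing</p></body></html>"
    ),
}


class PageParser(HTMLParser):
    """Extracts outgoing links and lowercase words from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Collect href values from anchor tags (used for discovery).
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Collect visible text as lowercase words (used for indexing).
        self.words.extend(word.lower() for word in data.split())


def run_pipeline(seed):
    """Runs discover -> crawl -> extract -> index starting from one seed URL."""
    index = defaultdict(set)  # word -> set of URLs containing it
    queue = deque([seed])
    seen = {seed}
    while queue:
        url = queue.popleft()
        html = FAKE_WEB.get(url)  # crawl: "fetch" the page
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)  # extract: pull out text and links
        for word in parser.words:  # index: record which page has which word
            index[word].add(url)
        for link in parser.links:  # discover: queue URLs not seen before
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = run_pipeline("https://example.com/")
print(sorted(index["bing"]))
```

Notice that the "about" page is only ever found because the home page links to it. If a page can't be discovered, none of the later steps happen.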

Just think about whatever industry you are in – details differ, but every competitor uses the same foundation. Easy to forget, but this is just another industry. So same here.

Google functions much the same way as Bing. And vice versa. Close enough for us not to need to worry too much about the differences.

Stunning Insights. I Learned sooooo Much.

The conversation with Frédéric Dubut that kicked off this series (this episode recorded at UnGagged LA) suddenly looks tame and unrevealing. A simple ‘mise en bouche’, as we say in French.

Listen and Learn

  • Google collaborate with Bing on Chromium
  • They discover 70 billion new webpages every day
  • Bingbot pre-filters to store only the ‘best’ content
  • New technology is coming out for rendering (Machine Learning + JavaScript)
  • Standardised HTML is powerful
  • Bing (and we can safely assume Google) is getting exponentially better at extracting information
  • The process of storing the content is MUCH more important than you probably imagine
  • Every candidate set team at Bing relies on Bingbot
  • Nofollow has always been just a hint
  • Sitemaps and RSS are incredibly important
  • Indexing includes annotation, and annotations are fundamentally important to all the other teams and their algos
  • Indexing includes classification, and classification is fundamentally important to all the other teams and their algos

In short, as SEOs, we all depend on Fabrice and his team to an extent most of us will probably only start to grasp after watching the episode. This is the foundation of ranking in search. Everything else depends on this.

Fabrice is a truly lovely guy who wants to help you as a website manager… if only you’d help him help you. Here he tells you what he (and, presumably, his equivalent at Google) wants from you so that he can help you get your content to rank.

Help them overcome their problems, and you WILL be rewarded. Groovy !
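One practical way to help, given that sitemaps are incredibly important: keep an accurate XML sitemap. Below is a minimal sketch in Python that builds one using the public sitemaps.org format (`urlset`, `url`, `loc`, `lastmod`). The `build_sitemap` function, the example URLs and the dates are invented for illustration and nothing here comes from Fabrice or from Bing's documentation.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Builds a sitemap XML string from (url, last-modified date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # YYYY-MM-DD
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, purely for illustration.
pages = [
    ("https://example.com/", "2020-01-15"),
    ("https://example.com/about", "2020-01-10"),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

In practice you would generate this from your CMS or database, so that `lastmod` reflects when each page really changed.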

Catch the rest of the Bing Series:

  1. How Ranking Works at Bing – Frédéric Dubut, Senior Program Manager Lead, Bing
  2. Discovering, Crawling, Extracting and Indexing at Bing – (this episode) Fabrice Canel, Principal Program Manager, Bing
  3. How the Q&A / Featured Snippet Algorithm Works – Ali Alvi, Principal Lead Program Manager AI Products, Bing
  4. How the Image and Video Algorithm Works – Meenaz Merchant, Principal Program Manager Lead, AI and Research, Bing
  5. How the Whole Page Algorithm Works – Nathan Chalmers, Program Manager, Search Relevance Team, Bing

By Jason BARNARD

Jason Barnard has over two decades of experience in digital marketing.

He currently teaches Brand SERP optimisation to students at Kalicube.pro and writes regularly for leading marketing publications such as Search Engine Journal, SEMrush, OnCrawl and Searchmetrics. He also appears regularly on digital marketing webinars and speaks at major conferences around the world, such as BrightonSEO, PubCon, SMX London and YoastCon.