Help! Google Search isn't indexing my pages
2024-08-20 · en-j3PyPqV-e1s manual
SPEAKER: Eventually, Googlebot might get around to crawling it. That's the moment when it fetches the page from your server and processes it further to potentially index it. Once it gets to crawling, the URL would move on to the "Crawled-- currently not indexed," or the page gets indexed. [MUSIC PLAYING]
Today we will dive into Google Search Console's "Discovered-- currently not indexed" status in the Page Indexing report. When using Google Search Console-- and you should use it-- you probably went into the Page Indexing report and perhaps saw these kinds of reasons for pages not being indexed. One of the most frequent questions we are getting about this is the "Discovered-- currently not indexed" status. Let's see what it means and what you could do about it.
First and foremost, Google will almost never index all content from a site. This isn't an error and not even necessarily a problem that needs looking into. It's a note on the status of these pages mentioned there. To understand what this means, we need to look at how a page proceeds through the systems and processes that make up Google Search.
At the very beginning, Googlebot finds a URL somewhere. That can be a sitemap or a link, for example. Googlebot has now discovered that this URL exists. Googlebot basically puts it into a to-do list of URLs to visit and possibly index later on. In an ideal world, Googlebot would immediately get to work on this URL. But as you probably know from your own to-do list, that isn't always possible. And that's the first reason why you might see this in Google Search Console. Googlebot simply didn't get around to crawling the URL yet, as it was busy with other URLs. So sometimes it's just a matter of a bit more patience on your end to get this resolved.
Eventually, Googlebot might get around to crawling it. That's the moment when it fetches the page from your server and processes it further to potentially index it.
Once it gets to crawling, the URL would move on to the "Crawled-- currently not indexed," or the page gets indexed. But what if it doesn't get crawled and stays in "Discovered-- currently not indexed"? Well, that usually either has to do with your server or with your website's quality. Hmm.
Let's look at potential technical reasons first. Say you have a web shop and just added a thousand new products. Googlebot discovers all these products at the same time and would like to crawl them. In previous crawls, however, it has noticed that your server gets really slow or even overwhelmed when it tries to crawl more than 10 products at the same time. It wants to avoid overwhelming your server, so if it decides to crawl, it might do so over a longer period of time-- say, 10 products at a time over a few hours rather than all the thousand products within the same hour. That means that not all 1,000 products get crawled at the same time. Googlebot will take longer to get around to these products then.
It makes sense to look at the Crawl Stats report and the response section in there to see if your server responds slowly or with HTTP 500 errors when Googlebot tries to crawl. Note that this usually only matters for sites with very large amounts of pages-- say, millions or more-- but server issues can happen to smaller sites too. It makes sense to check with your hosting company what to do to fix these performance issues if they arise.
The other far more common reason for pages staying in "Discovered-- currently not indexed" is quality, though. When Google Search notices a pattern of low-quality or thin content on pages, they might be removed from the index and might stay in "Discovered." Googlebot knows about these pages but is choosing not to proceed with them. If Google Search detects a pattern in URLs with low-quality content on your site, it might skip these URLs altogether, leaving them in "Discovered" as well.
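[Editor's note: The server check mentioned above (slow responses or HTTP 500 errors when Googlebot crawls) can also be approximated from your own access logs. The following is a minimal Python sketch, not something shown in the video. The log format (a combined-style line with the response time in seconds as the last field), the sample lines, the function name `summarize_googlebot`, and the 1-second threshold are all assumptions made for illustration. Matching on the user-agent string alone can be fooled by clients pretending to be Googlebot, so treat the numbers as a rough signal.]

```python
import re

# Hypothetical access-log lines. The format is an assumption: a combined-style
# log line with the response time in seconds appended as the last field.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = "\n".join([
    '66.249.66.1 - - [20/Aug/2024:10:00:01 +0000] "GET /products/1 HTTP/1.1" 200 5120 "-" "' + GOOGLEBOT_UA + '" 0.21',
    '66.249.66.1 - - [20/Aug/2024:10:00:02 +0000] "GET /products/2 HTTP/1.1" 200 4980 "-" "' + GOOGLEBOT_UA + '" 2.75',
    '66.249.66.1 - - [20/Aug/2024:10:00:03 +0000] "GET /products/3 HTTP/1.1" 500 312 "-" "' + GOOGLEBOT_UA + '" 3.10',
    '203.0.113.7 - - [20/Aug/2024:10:00:04 +0000] "GET /products/1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 Firefox/128.0" 0.10',
    '66.249.66.1 - - [20/Aug/2024:10:00:05 +0000] "GET /products/4 HTTP/1.1" 503 0 "-" "' + GOOGLEBOT_UA + '" 0.05',
])

# Pulls the path, status code, user agent, and response time out of one line.
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)" (?P<seconds>\d+(?:\.\d+)?)$'
)


def summarize_googlebot(log_text, slow_threshold=1.0):
    """Count Googlebot requests, 5xx responses, and slow responses."""
    total = server_errors = slow = 0
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        # Skip unparseable lines and requests from other user agents.
        if match is None or "Googlebot" not in match.group("agent"):
            continue
        total += 1
        if match.group("status").startswith("5"):
            server_errors += 1
        if float(match.group("seconds")) > slow_threshold:
            slow += 1
    return {"total": total, "server_errors": server_errors, "slow": slow}


if __name__ == "__main__":
    print(summarize_googlebot(SAMPLE_LOG))
```

[A high share of server errors or slow responses among Googlebot requests would be a reason to raise the issue with your hosting company, as the speaker suggests.]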
If you care about these pages, you might want to rework the content to be of higher quality and make sure your internal linking relates this content to other parts of your existing content. See our episode on internal linking for more information on this.
So, in summary, most sites will have some pages that won't get indexed, and that's usually fine. If you think a page should be indexed, then you should consider checking the quality of the content on these pages that stay in "Discovered-- currently not indexed." Make sure as well that your server isn't giving Googlebot signals that it is overwhelmed when it's crawling.
Please leave us a comment if you want more technical content on Google Search Central and what other topics we should cover. Thanks for watching, and see you soon.
And I now proclaim myself Google Search. I'm not Googlebot. Googlebot is gone. I'm the new Googlebot. Googlebot is on vacation. I'm the substitute Googlebot. Ask me anything. [LAUGHS] [MUSIC PLAYING]