Help! Google Search isn't indexing my pages

2024-08-20 · en-j3PyPqV-e1s manual
SPEAKER: Eventually, Googlebot might get around to crawling it.
That's the moment when it fetches the page
from your server and processes it further to potentially index
it.
Once it gets to crawling, the URL would move
on to the "Crawled-- currently not indexed,"
or the page gets indexed.
[MUSIC PLAYING]
Today we will dive into Google Search Console's "Discovered--
currently not indexed" status in the Page Indexing report.
When using Google Search Console--
and you should use it-- you probably went
into the Page Indexing report and perhaps saw
these kinds of reasons for pages not being indexed.
One of the most frequent questions we are getting about
this is the "Discovered-- currently not indexed" status.
Let's see what it means and what you could do about it.
First and foremost, Google will almost never index all
content from a site.
This isn't an error and not even necessarily
a problem that needs looking into.
It's a note on the status of these pages mentioned there.
To understand what this means, we
need to look at how a page proceeds
through the systems and processes that
make up Google Search.
At the very beginning, Googlebot finds a URL somewhere.
That can be a sitemap or a link, for example.
Googlebot has now discovered that this URL exists.
Googlebot basically puts it into a to-do list
of URLs to visit and possibly index later on.
In an ideal world, Googlebot would immediately
get to work on this URL.
But as you probably know from your own to-do list,
that isn't always possible.
And that's the first reason why you might see
this in Google Search Console.
Googlebot simply didn't get around to crawling the URL yet,
as it was busy with other URLs.
So sometimes it's just a matter of a bit more patience
on your end to get this resolved.
Eventually, Googlebot might get around to crawling it.
That's the moment when it fetches the page
from your server and processes it further to potentially index
it.
Once it gets to crawling, the URL would move
on to the "Crawled-- currently not indexed"
or the page gets indexed.
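As an illustration of the stages just described, here is a minimal sketch that maps what has happened to a URL onto the status wording used in the report. The function name `report_status` and its two boolean flags are hypothetical and exist only for this example; this is a toy model of the description above, not how Google Search decides anything internally.

```python
def report_status(crawled: bool, indexed: bool) -> str:
    """Toy model: pick the report status for a URL that Googlebot already knows about.

    Both flags are assumptions made for this sketch:
    - crawled: Googlebot has fetched the page from the server
    - indexed: the page ended up in the index
    """
    if indexed:
        return "Indexed"
    if crawled:
        return "Crawled - currently not indexed"
    # Known to Googlebot (for example from a sitemap or a link) but not fetched yet.
    return "Discovered - currently not indexed"


if __name__ == "__main__":
    print(report_status(crawled=False, indexed=False))
    print(report_status(crawled=True, indexed=False))
    print(report_status(crawled=True, indexed=True))
```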
But what if it doesn't get crawled and stays
in "Discovered-- currently not indexed"?
Well, that usually either has to do with your server
or with your website's quality.
Hmm.
Let's look at potential technical reasons first.
Say you have a web shop and just added a thousand new products.
Googlebot discovers all these products at the same time
and would like to crawl them.
In previous crawls, however, it has
noticed that your server gets really slow
or even overwhelmed when it tries
to crawl more than 10 products at the same time.
It wants to avoid overwhelming your server,
so if it decides to crawl, it might
do so over a longer period of time-- say, 10 products at
a time over a few hours rather than all the thousand
products within the same hour.
That means that not all 1,000 products get crawled at the same
time.
Googlebot will then take longer to get around to these products.
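To make the arithmetic in this example concrete, the sketch below splits 1,000 newly discovered URLs into groups of 10. The helper `plan_batches` and the `shop.example` URLs are invented for illustration; Googlebot's real scheduling is not public and is certainly more involved than fixed-size batches.

```python
def plan_batches(urls, max_at_once):
    """Split a list of URLs into consecutive groups of at most max_at_once items."""
    if max_at_once < 1:
        raise ValueError("max_at_once must be at least 1")
    return [urls[i:i + max_at_once] for i in range(0, len(urls), max_at_once)]


if __name__ == "__main__":
    # Hypothetical product URLs for the web shop in the example.
    products = [f"https://shop.example/products/{n}" for n in range(1, 1001)]
    batches = plan_batches(products, max_at_once=10)
    # 1,000 URLs at 10 per batch means 100 separate crawl rounds.
    print(len(batches), "batches of up to", len(batches[0]), "URLs each")
```

With 100 rounds instead of one, the last products in the list are reached much later than the first ones, which is why some of them can sit in "Discovered" for a while.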
It makes sense to look at the Crawl Stats report and the response
section in there to see if your server responds
slowly or with HTTP 500 errors when Googlebot tries to crawl.
Note that this usually only matters for sites
with very large amounts of pages--
say, millions or more--
but server issues can happen to smaller sites too.
It makes sense to check with your hosting company what
to do to fix these performance issues if they arise.
The other far more common reason for pages staying
in "Discovered-- currently not indexed" is quality, though.
When Google Search notices a pattern of low-quality or thin
content on pages, they might be removed from the index and might
stay in "Discovered."
Googlebot knows about these pages
but is choosing not to proceed with them.
If Google Search detects a pattern in URLs with low-quality
content on your site, it might skip these URLs altogether,
leaving them in "Discovered" as well.
If you care about these pages, you
might want to rework the content to be of higher quality
and make sure your internal linking relates
this content to other parts of your existing content.
See our episode on internal linking
for more information on this.
So, in summary, most sites will have
some pages that won't get indexed,
and that's usually fine.
If you think a page should be indexed,
then you should consider checking the quality
of the content on these pages that stay in "Discovered--
currently not indexed."
Make sure as well that your server
isn't giving Googlebot signals that it is
overwhelmed when it's crawling.
Please leave us a comment if you want more technical content
on Google Search Central and what other topics
we should cover.
Thanks for watching, and see you soon.
And I now proclaim myself Google Search.
I'm not Googlebot.
Googlebot is gone.
I'm the new Googlebot.
Googlebot is on vacation.
I'm the substitute Googlebot.
Ask me anything.
[LAUGHS]
[MUSIC PLAYING]