Optimizing login-page content for Google Search

2025-09-04 · en automatic

[Music]
Hello and welcome to a new episode of
Search Off the Record, a podcast coming
to you from the Google Search team where
we will talk all about search and maybe
have some fun along the way. My name is
Martin and I am a developer advocate or
search advocate on the Search Relations
team here at Google. And with me is
John. Hi, John.
>> Hi, Martin.
>> John, I have a question.
>> Uh-oh.
>> No. Um, I read something and I
thought about it and I'm not sure what
to think about it. So, someone said
online somewhere that they don't need to
do SEO or they don't need to worry about
SEO because what they do, like the
stuff they have on their website, is
behind a login. And I'm not sure if that
means that really they don't have to do
SEO because I think they might still
have to do some amount of SEO. What do
you think?
>> I think the real answer is: it
depends.
>> All right. Okay.
>> I'm so sorry.
>> I'm going to step in here as Barry
Schwartz. What does it depend on?
>> I think
if you really don't care about what is
indexed, then like do whatever you want
kind of thing. Like maybe something will
be indexed, maybe it won't, but I have
zero care about what is visible in
search and nobody can access my content
anyways, so probably doesn't matter. If
you care a little bit about what is
visible in search, then maybe you should
think about how you set things up.
>> Okay. How would I know how to set it up?
I mean, I guess I want my
website to show up in search somehow,
but I guess I don't want to just show
the login page, right?
>> Yeah. I mean, there are a variety of
different directions that you could go
there. The variations I usually see are
things like paywalled content, where
basically you do want Google to index
things but the content itself might be
behind a paywall or a login page or
something like that. So when a user
comes, they would see kind of the
interstitial to log in. We have a bit
of documentation on how to set up
paywalled content.
So
perhaps that's not really what the
person that was asking you was
asking about, because it also sounds
like they don't want to show the content
to Google either, which is fine.
>> Hm, yeah. Um, so with paywalled
content, what usually happens is you try
to recognize when Google is crawling
>> and you serve Google the content
that you want to make available and you
add the paywalled structured data to the
page to make it clear to Google that, hey,
actually this content is not available
to everyone. There are some limitations,
and that could be maybe you require a
login, maybe you require payment, maybe
after a certain number of iterations
you're like, "Oh, this is enough free
content, now you have to pay for it."
There are lots of variations with
regards to paywalled content.
>> It also doesn't have to be something
that's behind like a clear payment
thing. It can just be something like a
login or some other mechanism that
basically limits the visibility of the
content.
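
Recognizing when Google is crawling usually means verifying the crawler rather than trusting the user-agent string. The following is a minimal Python sketch, assuming the reverse-then-forward DNS check described in Google's public documentation. The function name `is_verified_googlebot` and the injectable `reverse` and `forward` resolver parameters are illustrative, and the hostname suffix list is an assumption that should be checked against the current documentation.

```python
import socket
from typing import Callable, List

# Hostname suffixes assumed for Google's crawlers; verify against current docs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def _reverse_dns(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward_dns(hostname: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = _reverse_dns,
    forward: Callable[[str], List[str]] = _forward_dns,
) -> bool:
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
        if not hostname.endswith(GOOGLE_SUFFIXES):
            return False
        # The hostname must resolve back to the IP that made the request.
        return ip in forward(hostname)
    except OSError:
        # DNS lookup failures are treated as "not verified".
        return False
```

The resolvers are parameters only so the check can be exercised without network access; in a real request handler the defaults would be used, ideally with caching.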
>> Ah, so for instance, if I have to like
watch a video or like click on an ad or
something to get to the rest of the
content, that's kind of also fine?
>> I don't know about fine, but that kind
of falls into the category of, well,
there's something that needs to
be done before this content is actually
visible. And then you would use the
paywalled structured data.
>> Okay. Also, if you have something like
different thresholds, where you say
some people get to view
five pages for free and others have the
whole content available for free because
you're doing A/B testing, maybe about the
prices or things like that,
>> then you'd want to use the paywalled
structured data just to make sure that
when Google is looking at it, they
realize that sometimes this content is
not available.
>> Okay. Got it. Got it. And the paywalled
structured data helps us to understand
that users might see something different,
and that's totally fine.
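
For readers who have not seen the paywalled structured data, here is a rough Python sketch that builds the JSON-LD. It assumes the schema.org `isAccessibleForFree` and `hasPart`/`cssSelector` properties that Google's paywalled-content documentation describes. The `NewsArticle` type, the helper names, and the `paywalled` CSS class are placeholders chosen for illustration, not values taken from the episode.

```python
import json


def paywalled_article_jsonld(headline: str, paywalled_css_class: str) -> str:
    """Build JSON-LD marking one section of a page as not freely accessible."""
    data = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",  # assumption: any suitable CreativeWork type
        "headline": headline,
        "isAccessibleForFree": False,
        "hasPart": {
            "@type": "WebPageElement",
            "isAccessibleForFree": False,
            # Selector for the element that wraps the gated section.
            "cssSelector": "." + paywalled_css_class,
        },
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld: str) -> str:
    """Wrap JSON-LD in the script element used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


if __name__ == "__main__":
    print(as_script_tag(paywalled_article_jsonld("Example headline", "paywalled")))
```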
>> Okay.
>> I think there's one thing maybe to watch
out for with paywalled content:
when a user looks at your page, you
don't load the content into the HTML,
but rather you make sure that it's
really not loaded into the page's
DOM. So that if a browser has
something like a, um, what is it, the
text reader, speech reader
>> uh, screen reader, yeah
>> screen reader, that the screen reader
doesn't go off and read all of this text
that you're trying to hide,
>> those kinds of things. So that would
be kind of the thing that I would
watch out for: if it's
really paywalled content or limited
content, make sure you don't load it
into the browser and use JavaScript to
turn it on, but rather that it's really
only served to the user when you want to
make it available.
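
One way to picture the advice above is to make the access decision on the server, so the gated text never reaches the browser at all. This is a minimal Python sketch under that assumption. The `Article` type, `render_article`, and the `viewer_has_access` flag are hypothetical names, and a real site would plug in its own session and entitlement checks.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Article:
    title: str
    teaser: str
    full_text: str


def render_article(article: Article, viewer_has_access: bool) -> str:
    """Render HTML; the gated text is only emitted for entitled viewers."""
    parts = [
        f"<h1>{escape(article.title)}</h1>",
        f"<p>{escape(article.teaser)}</p>",
    ]
    if viewer_has_access:
        parts.append(f'<div class="paywalled">{escape(article.full_text)}</div>')
    else:
        # No hidden copy of the text: nothing for scripts or screen readers to find.
        parts.append('<p class="paywall-notice">Log in to read the rest.</p>')
    return "\n".join(parts)
```

The contrast is with hiding the text via CSS or JavaScript, where the full content is still present in the DOM.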
>> Okay, but that's like one specific kind
of content that you are hiding away or
like making not immediately accessible.
But what if I have, I don't know, a
website where I share
apartment ads, for instance, or
apartments to rent, and I want people to
log in to see the apartment, or to
interact with the apartment, or to apply
for the apartment. How would I go about
that? Do I just immediately show
a login page, or are there better ways
of doing that?
>> I guess the question
would then be: do you want this content
to be visible in Google or not? If it's
visible in Google, then that would be
kind of the model of paywalled content.
If you don't want it visible in Google,
um, maybe this is a private
forum or a private community where
you're sharing things. Or maybe you have
something like, I don't know, a private
service where
>> people who have a subscription they have
access to this content but it's not
shown in Google or specific tools or
something like I don't know you have a
spreadsheet that runs in a browser kind
of thing where everyone has their own
private content and they all have URLs.
>> Okay, fine. All right. How are
bigger services doing it? Do they
just show the login page,
or how do they do
this? I don't think they all use
paywalled structured data.
>> Now, I
think if you're looking at a
service like, I don't know, Search
Console or Google Drive, where you
have kind of this private content
that is hosted online with a specific
URL, then fundamentally, in order for
someone to see that content, they have
to log in.
>> Mhm.
>> And usually, I guess it depends on
how they set it up, but oftentimes
when you try to access a page like that,
it'll redirect you to a login page. And
I think how they deal with the login
page determines a little bit how
things could potentially end up being
indexed.
So for example, with Search Console,
one of the things that they do is
that they have a set of marketing pages
that are freely available, and if you
try to access a Search Console URL
directly without being logged in, it'll
redirect you to a marketing page, which
has a link that says you can sign in
here to actually get the full
information.
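
The redirect pattern described here can be sketched in a few lines of Python. Everything specific in this sketch is an assumption made for illustration: the `/app/` prefix for private URLs, the `MARKETING_URL` value, the `respond` helper, and the choice of a 302 status (the episode does not say which redirect type Search Console uses).

```python
from typing import Dict, Optional, Tuple

# Hypothetical public page that explains the service and links to sign-in.
MARKETING_URL = "https://example.com/about"


def respond(path: str, session_user: Optional[str]) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request, redirecting anonymous
    visitors of private URLs to the marketing page."""
    is_private = path.startswith("/app/")
    if is_private and session_user is None:
        return 302, {"Location": MARKETING_URL}
    return 200, {}
```

With this shape, a private link shared by accident leads both users and crawlers to a page with real, indexable content instead of a bare form.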
>> That makes sense.
>> And I think from an
SEO point of view, that's
fantastic, because you search for Search
Console and you can find these marketing
pages, and if anyone accidentally links
to their private Search Console URL,
like, oh, I want to look at the
performance report for my site, and they
share that with people, then that
URL will redirect to the marketing page.
So on the one hand, users who
find that link randomly end up
on the marketing page and know what
it's about, and search engines
will find this marketing page and
they'll be like, oh, okay, this has
indexable content, we will just index
this.
>> Okay. So, we are seeing it depends
a little bit on what kind of content
you're hiding, and there are in-between
bits and pieces, like you don't
have to just completely redirect to a
login page, I guess. Okay. Interesting.
>> I
think one of the things we
noticed over the years, specifically
around login pages, is that if you have a
very generic login page, we will see all
of these URLs that show that login page
that redirect to that login page as
being duplicates. Like, if whenever you
access a private URL it just says
username and password, then we will think
all of these individual private URLs are
actually the same. So we'll fold them
together as duplicates and we'll focus
on indexing the login page, because
that's kind of what you give us to
index. And that means in the search
results that login page is going to be
very popular, because all of these random
links keep redirecting to it or
keep showing the same login page.
So if someone is searching for your
service and wants to know more
about your service, and the only thing or
the primary thing they find in search is
"here is how to log in", that might
be kind of a weird experience for
them.
>> Okay, that is, uh, yeah, I mean, yeah,
that's not great. So should they then,
for instance, check if it's a legitimate
Googlebot and then just give the
actual content of the URL, or at least
some sample content? Or how would
you fix that specific problem where
everything gets deduped?
>> I think if this is private content, you
don't want to share that with
Googlebot.
>> Well, okay. If it's private
content, sure. If it's private
content, you don't. But then how do you
keep Googlebot from putting it in the
index? You just put a noindex on it, or
block the URL in robots.txt so that we are
not even crawling it? Like, what's the
idea?
>> Okay. So I think for these situations
where you want to show a login page,
it's good to have some context on the
login page. So the Search
Console model is basically: we'll show a
marketing page instead of the login
page. But if you have a generic login
page, put some information about what
your service is on that login page,
which could be enough to just have a
sample of text like, oh, you're
accessing
Martin's furniture lookup site, or, I
don't know, some intranet thing
where maybe some private content is. And
then if you have some information on
that login page, then we can index
that information. And if you have
different types of services that use the
same login page, then those different
services will have slightly adjusted
login pages. So if you're searching for,
I don't know, maybe we'll just stick
with Google Drive. Like, if you're
searching for Google Docs, you'll find a
login page maybe for Google Docs.
If you're searching for Google Sheets,
you'll find a login page maybe for
Google Sheets. So having a little bit
of information on there is important.
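
As a small illustration of a shared login page that still carries per-service context, here is a hedged Python sketch. The service keys, names, and blurbs in `SERVICES` and the `render_login_page` helper are invented for the example; the point is only that each service gets its own title and a sentence of descriptive text above the same form.

```python
from html import escape
from typing import Dict, Tuple

# Hypothetical services sharing one login form: key -> (name, description).
SERVICES: Dict[str, Tuple[str, str]] = {
    "docs": ("Example Docs", "Write and share documents online."),
    "sheets": ("Example Sheets", "Build and share spreadsheets online."),
}

DEFAULT_SERVICE = ("Example Account", "Sign in to manage your Example services.")


def render_login_page(service_key: str) -> str:
    """Render the shared login form with service-specific context."""
    name, blurb = SERVICES.get(service_key, DEFAULT_SERVICE)
    return (
        "<!doctype html>\n"
        f"<title>Sign in to {escape(name)}</title>\n"
        f"<h1>{escape(name)}</h1>\n"
        f"<p>{escape(blurb)}</p>\n"
        '<form method="post" action="/login">\n'
        '  <input name="email" type="email">\n'
        '  <input name="password" type="password">\n'
        "  <button>Sign in</button>\n"
        "</form>\n"
    )
```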
The other thing you mentioned is
whether all of this should just be
blocked by robots.txt,
>> which is another common strategy for
dealing with things that you don't want
to have indexed.
>> The problem, I think, with doing that
is the URLs could become indexable.
So we wouldn't see the contents of the
login page, but rather we would just see,
like, oh, people are linking to this
specific Google Doc and we can't access
it, but maybe we should show it in the
search results if someone is searching
for something similar. And also this
could be visible if someone does
something like a site: query for your
site, and it's like, oh, tell me all
of the URLs that are indexed for this
hidden section of a website, and then
Google and other search engines might be
like, oh, I know about all of
these URLs. I don't have any
information on what's on there, but
feel free to try them out,
essentially, which is probably a bad
idea. And if you have random hashes in
the URL, so a collection of random
characters, it's not a bad thing, or not
a terrible thing. But if you have things
like usernames or email addresses in the
URL,
>> then of course all of those could become
indexable. So if it's private content,
serve it with a noindex or redirect
it to a login page somewhere. Don't
use robots.txt,
>> and optimally don't leak private details
in URLs.
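
The reason robots.txt and noindex do not combine well can be shown with Python's standard `urllib.robotparser`: a crawler that is disallowed from fetching a URL never sees the noindex served with it. The sketch below is only an illustration. `crawler_can_see_noindex`, `PRIVATE_HEADERS`, and the example rules are assumptions, and `X-Robots-Tag: noindex` is shown as one common way to send the directive as an HTTP header.

```python
from urllib.robotparser import RobotFileParser

# Header a private page could be served with to keep it out of the index.
PRIVATE_HEADERS = {"X-Robots-Tag": "noindex"}


def crawler_can_see_noindex(
    robots_txt: str, url: str, user_agent: str = "Googlebot"
) -> bool:
    """A noindex is only visible if robots.txt lets the crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    blocked = "User-agent: *\nDisallow: /private/\n"
    url = "https://example.com/private/doc1"
    # Blocked: the URL can still be discovered through links,
    # but the noindex on it is never fetched.
    print(crawler_can_see_noindex(blocked, url))
    # Not blocked: the crawler fetches the page and can honor the noindex.
    print(crawler_can_see_noindex("", url))
```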
>> Sure. Yeah, of course. I
think that's always a good
practice. But sometimes you have
things like your form submission
parameters in a URL somewhere, and they
get stuck there.
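
A quick way to audit URLs for leaked details like the ones mentioned above is a small check over paths and query strings. This Python sketch is an assumption-heavy illustration: the `SENSITIVE_PARAMS` list and the email pattern are deliberately simple placeholders, not a complete detector.

```python
import re
from typing import List
from urllib.parse import parse_qsl, unquote, urlsplit

# Simplified email pattern; good enough for a rough audit only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical parameter names that should not appear in shareable URLs.
SENSITIVE_PARAMS = {"email", "user", "username", "token", "password"}


def private_details_in_url(url: str) -> List[str]:
    """List reasons a URL may leak private details if it became indexable."""
    findings = []
    parts = urlsplit(url)
    if EMAIL_RE.search(unquote(parts.path)):
        findings.append("email address in path")
    # parse_qsl percent-decodes keys and values for us.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key.lower() in SENSITIVE_PARAMS:
            findings.append(f"sensitive parameter: {key}")
        elif EMAIL_RE.search(value):
            findings.append(f"email address in parameter: {key}")
    return findings
```

A URL built from random characters passes this check, which matches the point that random hashes in URLs are not a terrible thing.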
>> Mhm. Okay. any other common problems
you're seeing with login pages
specifically or with content that is
behind some sort of login?
>> Yeah, I think the other question
that I sometimes run across is whether
or not the login page should be
indexable by itself.
>> Mhm.
>> And I think that depends a bit on the
nature of the content that you have
behind the login page. For example,
if you have kind of an intranet
that is available publicly but where only
your employees can access it, then you
probably don't need that login page
indexed in search, because your
employees should be able to find the
URLs for your private content on their
own, hopefully. So in a case like that
you probably can just serve, I don't
know, the login page with an error code,
or use server-side authentication,
or put a noindex on the login page
so that if it does get found, then at
least it won't be indexable like that.
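
The three options just listed for an intranet login page can be laid side by side in a short Python sketch. The mapping is an interpretation, not something stated in the episode: "error code" is shown as 403, "server-side authentication" as HTTP authentication with a 401 challenge, and "noindex" as an `X-Robots-Tag` header. The strategy names and `intranet_login_response` are hypothetical.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def intranet_login_response(strategy: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for an intranet login URL under one strategy."""
    if strategy == "error_code":
        # Serve the login form with an error status so it is not indexed.
        return HTTPStatus.FORBIDDEN.value, {}
    if strategy == "http_auth":
        # Let the server challenge for credentials before any page is served.
        return (
            HTTPStatus.UNAUTHORIZED.value,
            {"WWW-Authenticate": 'Basic realm="intranet"'},
        )
    if strategy == "noindex":
        # Serve the page normally but ask search engines not to index it.
        return HTTPStatus.OK.value, {"X-Robots-Tag": "noindex"}
    raise ValueError(f"unknown strategy: {strategy}")
```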
So that's, I think, one aspect as
well which I've seen in the past
every now and then, that people's
intranets end up getting indexed.
There's a login form, but you
probably don't want people to
accidentally run across your intranet
URLs. Now, I think those are kind of the
primary aspects, and showing a login
page is generally fine. Whether or
not you redirect to a login page or show
the login page directly ultimately, I
think, is more a technical decision on
your side. Sometimes there are security
implications around that.
>> Oh, security implications.
>> Well, I think like cookies, for
example, right?
>> Okay, fair.
>> So maybe you have something like a
login.yourdomain.com
and everything gets routed through
there; then you want to redirect to the
login page there.
>> That makes sense. Yeah.
Okay. Yeah. Sure. Okay. I see what you
mean with security implications. Okay.
>> And I think this is a problem pretty
much for any site that has kind of
private sections on the site
which are accessible through individual
URLs. But definitely a problem for
sites like Google Drive or all of
the various Google services where
you end up having a lot of
content that is private
to the user and where you have a lot of
different login pages. And specifically,
if you have multiple services that go
through the same login page, then it's
worthwhile to think about how
you actually want your service to be
findable in the search results.
>> Yeah.
>> And for the most part, you do want
things findable. And if people link to
something private, you do want something
smart to happen there. So it's kind
of good to think about how you
should combine things. And we regularly
see Google services getting this wrong,
I mean, not necessarily
getting it wrong in that you can access
the private content, but wrong in the
sense that we index things that
probably we shouldn't be indexing like
that,
>> and then all you get is a login page.
Yeah. That's not great.
>> Yeah. Yeah. I think Search Console
used to have that problem before they
moved to having the marketing
pages as a redirect target, where you
would search for Search Console and you
would find someone's Search Console URL
in the search results, and it's indexed
as, like, "sign in here" kind of thing,
which is, like, it's a login page. Of course,
you can reach Search Console that way,
but it's not really the best way to
show Search Console in the search
results. And because Google has so many
different services and so many different
teams working on these services,
you invariably run across
situations like that.
>> I mean, for some of the services it's also
tricky. If I have a Google Doc that I
make public, kind of like a website,
so to speak, and then it gets indexed
and then it is visible and people
actually can use the content, and then I
delete the file or I make it private
again, then it is still indexed. It will
take a while until it falls out of the index.
So there will be
>> yeah, surprises, let's put it that way. I
mean, surprises in the sense that if
you're not prepared, sure. But I
think it just makes it hard for
search engines to go and actually index
or find content on Google Docs, where
it's like, um, maybe there's something
here. Maybe all of this is private.
Probably it's private, but maybe I
should check anyway.
>> Yeah. But you know, I think the
other aspect that's kind of interesting
is that internally we don't give SEO
advice on these kinds of things.
>> Mhm.
>> So every now and then someone with a
public service will ping us internally
at Google and be like, oh, how do I make
sure my service is indexed properly? And
essentially we have to point them at our
public documentation. Maybe we'll point
them at this podcast in the future.
Yeah, but it's something that
just comes up every now and then. And I
think larger websites, especially those
that have private content, probably
have similar things. Even e-commerce
sites, where you have something like you
can look up your account or the
orders that you had in the past. They
will have a specific URL, and maybe
someone will link to that and a search
engine will try to index it. And how you
handle that kind of depends on what is
actually shown in the search results.
>> Yep. And whatever makes sense for a user
who might land on that or want to land
on that. Yeah, that makes sense.
>> All right.
>> Yeah.
>> Okay. Would you say there's
something that people should do to make
sure they are doing this right for them?
Is there a top tip that you want
everyone who has to deal with logins
to take away?
>> I think the most important part is that
you understand how things are
currently working for your site. So the
way I would do that is I'd open an
incognito window in a browser, where
you're basically not logged in to any of
the services that you usually use, and
then you search for something associated
with your site. That could be, like, if
the primary content is behind a login
page, then you could search for your
name or the name of the
service. So you could search for, I
don't know, Search Console or Google
Docs or something like that, and then you
click on maybe the top couple of results
to see what actually comes up there. And
if the top result is something like a
login page and there's no other information
on this page at all, then
probably that's something that you can
improve. Whereas if the top results are
kind of reasonable marketing-y content
for people who are not logged in yet,
then that seems okay. And I think with
regards to more specific sections of a
site, that gets a little bit harder,
because you have to search for those
parts specifically. You almost have to
know that there's something that could
be found. For example, on an
e-commerce site, if you have a page that
shows your orders, you could search for
that URL pattern or specific words that
might be on a page like that and see
what comes up. And just kind of from
there, while you're not logged in, in an
incognito window, see: is there
actually reasonable content that comes
up? Does it do a reasonable kind of
redirect to a login page or, I don't
know, login experience, if you want to add
more content to those pages? Or is this
kind of jarring for the user, who is
like, what am I doing here, and why
did I end up on this page that's asking
for a password now? I'm not trying to
hack this website, kind of thing. So
that's kind of the direction I would go
there. And if you see that
things are okay in the search results,
then probably you're already doing
things properly. If you see that things
are not going okay, then I would
recommend digging into those specific
URLs, trying to figure out where
they are coming from. What happens when
you use Search Console's URL
testing tool to look at those pages?
Does it show what you see, or does
it show something different? And based
on that, you would try to make a plan
for improving things.
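
The manual check described above could be roughly approximated in code by looking at the HTML an anonymous visitor receives. The Python sketch below is a heuristic built on assumptions: the 30-word threshold, the category labels, and the `classify_anonymous_view` helper are all made up for illustration, and it does not replace looking at the actual search results.

```python
from html.parser import HTMLParser


class _PageScan(HTMLParser):
    """Count visible words and detect a password field."""

    def __init__(self) -> None:
        super().__init__()
        self.has_password_field = False
        self.words = 0
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        if tag == "input" and (dict(attrs).get("type") or "").lower() == "password":
            self.has_password_field = True

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words += len(data.split())


def classify_anonymous_view(html: str, min_words: int = 30) -> str:
    """Label what a logged-out visitor (or crawler) gets for a URL."""
    scan = _PageScan()
    scan.feed(html)
    scan.close()
    if scan.has_password_field and scan.words < min_words:
        return "bare login page"
    if scan.has_password_field:
        return "login page with context"
    return "public content"
```

Pages labeled "bare login page" are the ones worth a closer look, for example by adding descriptive text or redirecting to a marketing page.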
>> Okay, that sounds pretty good. And I
think that's pretty actionable advice,
especially checking how your
service currently presents in search to
someone who's not logged in. That is probably
a very good first stage to make
sure you have a good customer experience
in the end. So that makes sense. All
right. I think that's
pretty much that sorted. Huh? Do
you have anything else you want to say
about login pages?
>> I could tell you more, but you have to
log in first, Martin.
>> Uh,
how does that show up in the index?
>> Give me some sample content first. I
want to know if I want to. Is
that a paywall? Do you want me to pay
for this information?
>> No, but you should
subscribe to this podcast and then
we'll tell you more.
>> And that's actually free. So, definitely
do subscribe. Leave us a comment.
Have you seen any services that are
screwing this up in the search results,
or do you have more questions regarding
login pages or paywalls? Let us know
in the comments below. And we'll
probably be talking about these specific
issues more in depth. You can also
submit to the office hours as well if
you have a specific question. But if
it's a broader thing, then we might
discuss it here in the podcast. Awesome.
Well, John, it has been a pleasure.
Thank you so, so much for being here. And
I think I've never thought this much
about login pages. I don't know. For
me, they're always just username and
password, or email and password, and then
a button, and that's it. But yeah,
there's more to it. Thanks a lot for
joining me. Thanks to all the
listeners out there. That's it for this
episode. I do hope people enjoyed that a
lot. John, if they want to talk to you,
where do you hang out online these days,
behind or not behind a login? Where can
people reach out to you?
>> I don't know. Sometimes it's hard.
I'm mostly
active on Bluesky nowadays, so people
can drop me a note there or send me a
private message if they log in. So
that would potentially be a good place.
>> Okay, so everyone follow John on
Bluesky, and thanks a lot for listening.
Please do like and subscribe if you
enjoyed this episode, and goodbye. Bye.
[Music]
We've been having fun with these podcast
episodes and we hope that you, the
listener, have found them both
entertaining and insightful, too. Feel
free to drop us a note on LinkedIn or
chat with us at one of the next events
that we go to if you have any thoughts.
And of course, don't forget to like and
subscribe. Thank you and goodbye.
[Music]