pk_synths
02-02-2006, 04:15 PM
I have a huge site, close to 2 million pages, and I've never had an issue getting Google to index my pages fairly quickly. Recently I uploaded 2k new pages, and for some reason Googlebot has decided to index those pages with the session id attached.
The strange thing is that this is only happening on BigDaddy!! Instead of having 2k new pages indexed, BigDaddy is showing 13k, because most are indexed numerous times with only the session id being different.
I checked Google "classic" and it has 700 of the 2k pages indexed and NONE have session ids, which looks and sounds normal. But this BigDaddy thing is very strange. My system is set up to not serve Googlebot any session ids: it looks at the request string for "Googlebot", and any other request gets a session id. So I'm guessing that whatever Google is using to spider my site now either isn't identifying itself as Googlebot and is getting the session id served to it, or it's dropping the "Googlebot" identification after the first request.
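For anyone comparing setups, here is a minimal sketch of the kind of check described above. It is not my actual code: it assumes the "request string" being inspected is the User-Agent header, that the match is case-insensitive, and that the session id travels as a query parameter. The names `is_googlebot`, `page_url`, and the `sid` parameter are made up for illustration.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: the crawler is recognised purely by this substring in its User-Agent.
BOT_MARKER = "googlebot"


def is_googlebot(user_agent: Optional[str]) -> bool:
    """Return True when the User-Agent contains 'Googlebot' (case-insensitive)."""
    return bool(user_agent) and BOT_MARKER in user_agent.lower()


def page_url(path: str, session_id: str, user_agent: Optional[str]) -> str:
    """Build the link for a page; only non-Googlebot requests get a session id."""
    if is_googlebot(user_agent):
        return path
    # 'sid' is a hypothetical parameter name for the session id.
    return f"{path}?{urlencode({'sid': session_id})}"


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(page_url("/page.html", "abc123", bot))
    print(page_url("/page.html", "abc123", "SomeOtherCrawler/1.0"))
```

With a check like this, any crawler that sends a User-Agent without "Googlebot" in it falls through to the session id branch, which is the behavior I suspect I'm seeing.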
If anyone out there has a site specifically set up to not serve Googlebot session ids, can you please check BigDaddy and confirm whether your pages are getting indexed with the session id too?
I'd hate to get smacked with a dup content penalty because Google decided to change how their spider identifies itself.
Thanks,