Lesigner Girl
Minerva
Head of Runboard staff
Registered: 11-2005
Posts: 9606
Karma: 132 (+147/-15)
Re: Bingbot crawling errors, causing 404s
I'm glad you both like it. It's almost too bad that page hasn't been triggered yet. Bingbot made 5 more requests for nonexistent pages, though.
6/18/2013, 7:36 pm
Lesigner Girl
Best viewed in Chrome.
Firefox will do that same spinning animation, but the video won't work when it does.
6/19/2013, 12:33 am
Lesigner Girl
Wow. 15 attempts for the same non-existent page (/administrator/) in 13 seconds. I guess the bot got confused without some kind of error code.
IP: 198.50.174.237
OS: Windows 7
Browser: Chrome 22.0.1229.79
What do you think? Should I serve them a 503 or some other error code?
Then again, I could create a fake admin page with a form that sends their entries to my email and see what they fill in. Edit: No, that would probably end up sending me too much mail. Maybe a log instead.
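For the log idea, here is a rough sketch of how repeated hits like these could be spotted in Python. The `find_bursts` name, the threshold of 10 hits, the 30-second window, the 4:40 am start time, and the second IP are all made up for illustration. Only the IP, the path, and the 15-in-13-seconds pattern come from the post.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def find_bursts(hits, threshold=10, window=timedelta(seconds=30)):
    """Return the (ip, path) pairs requested at least `threshold` times within `window`.

    `hits` is an iterable of (timestamp, ip, path) tuples.
    """
    by_key = defaultdict(list)
    for ts, ip, path in hits:
        by_key[(ip, path)].append(ts)

    bursts = set()
    for key, times in by_key.items():
        times.sort()
        start = 0
        for end, ts in enumerate(times):
            # Move the left edge forward until the span fits inside the window.
            while ts - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                bursts.add(key)
                break
    return bursts

# Hypothetical log entries: 15 hits spread evenly over 13 seconds, plus one ordinary hit.
t0 = datetime(2013, 6, 19, 4, 40, 0)  # assumed start time
hits = [
    (t0 + timedelta(seconds=i * 13 / 14), "198.50.174.237", "/administrator/")
    for i in range(15)
]
hits.append((t0, "203.0.113.5", "/index.html"))  # assumed unrelated visitor

print(find_bursts(hits))
```

Grouping by IP and path together keeps a normal visitor who loads many different pages from being flagged.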
Last revised by Lesigner Girl, 6/19/2013, 4:49 am
6/19/2013, 4:48 am
Lesigner Girl
You're looking at the 100 status. Here's the 503:
503 Service Unavailable
The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.
14.37 Retry-After
The Retry-After response-header field can be used with a 503 (Service
Unavailable) response to indicate how long the service is expected to
be unavailable to the requesting client. This field MAY also be used
with any 3xx (Redirection) response to indicate the minimum time the
user-agent is asked wait before issuing the redirected request. The
value of this field can be either an HTTP-date or an integer number
of seconds (in decimal) after the time of the response.
Retry-After = "Retry-After" ":" ( HTTP-date | delta-seconds )
Two examples of its use are
Retry-After: Fri, 31 Dec 1999 23:59:59 GMT
Retry-After: 120
In the latter example, the delay is 2 minutes.
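To make the two value forms concrete, here is a small sketch of how a client might turn a Retry-After value into a number of seconds. The `retry_after_seconds` name and the `now` parameter are assumptions for illustration, not part of the spec.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def retry_after_seconds(value, now=None):
    """Convert a Retry-After header value into a delay in seconds (never negative)."""
    value = value.strip()
    if value.isdecimal():
        # delta-seconds form, e.g. "120"
        return int(value)
    # HTTP-date form, e.g. "Fri, 31 Dec 1999 23:59:59 GMT"
    when = parsedate_to_datetime(value)
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0, int((when - now).total_seconds()))

print(retry_after_seconds("120"))  # -> 120

# Pretend the response arrived two minutes before the date in the header.
received = datetime(1999, 12, 31, 23, 57, 59, tzinfo=timezone.utc)
print(retry_after_seconds("Fri, 31 Dec 1999 23:59:59 GMT", now=received))  # -> 120
```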
I think I'll stick this in for now and tell bots to try back in a week. Then, I'll take a look at this blackhole trap for bad bots.
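A week works out to 604800 seconds. As a sketch only, here is what a 503 with that Retry-After could look like as a tiny Python WSGI app. The `come_back_later` name and the message text are invented for the example, and this says nothing about how any particular host or server configuration handles custom error pages.

```python
ONE_WEEK = 7 * 24 * 60 * 60  # 604800 seconds

def come_back_later(environ, start_response):
    """WSGI app that answers every request with a 503 and a one-week Retry-After."""
    body = b"Nothing to see here. Try again next week.\n"  # assumed message text
    headers = [
        ("Retry-After", str(ONE_WEEK)),
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("503 Service Unavailable", headers)
    return [body]
```

To try it locally, `wsgiref.simple_server.make_server("", 8000, come_back_later).serve_forever()` from the standard library will serve it on port 8000.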
6/19/2013, 4:23 pm
Lesigner Girl
Dang. A 503 prevents my custom page from showing, so I removed it.
6/19/2013, 4:27 pm
Queenyforever
Ignore me.
Registered: 01-2007
Province: Just north of the clouds...
Posts: 1467
Karma: 48 (+48/-0)
Oh well... it was a good idea!
This, however, is brilliant: virtual Blackhole trap!
---
“Freedom and democracy are dreams you never give up.”
6/19/2013, 5:55 pm
Lesigner Girl
After reading it, I see that all it does is lure bad bots via a hidden link and serve up a 403 (Forbidden) error. The bots I'm dealing with are making up their own URLs rather than following links, so it would be worthless here. And since one of them accessed that page 58 times in less than 9 minutes when I had the 403 in place, it would probably be counterproductive.
When the description said it would ban bots, I thought it meant the bots would be prevented from receiving any data back from the server at all.
Edit to add: I suppose it could still be good for bad search engine bots that don't follow the instructions in the robots.txt file.
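As a sketch of that last idea, Python's standard `urllib.robotparser` can tell whether a requested path was off-limits under robots.txt, which is one way to flag a crawler that ignored the rules. The robots.txt contents, the `/trap/` path, the bot name, and the `requested_disallowed_path` helper are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that puts a trap directory off-limits to every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""

def requested_disallowed_path(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if robots.txt told this user agent to stay away from `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)

print(requested_disallowed_path("BadBot/1.0", "/trap/page.html"))  # -> True
print(requested_disallowed_path("BadBot/1.0", "/index.html"))      # -> False
```

A bot that invents its own URLs, like the ones in this thread, would only be caught this way if it happened to guess a disallowed path.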
Last revised by Lesigner Girl, 6/19/2013, 6:55 pm
6/19/2013, 6:54 pm
Queenyforever
Well, worth keeping for future reference. But not terribly practical for the bots you are dealing with...
6/20/2013, 8:55 pm