Google Bot now crawls arbitrary JavaScript sites
Just spotted something while looking through Apache logs:
66.249.67.106 ... "GET /ajax/xr/ready?x=clcgvsgizgxhfzvf HTTP/1.1" ...
This is an Ajax request issued from the document.ready() callback on one website's pages. This means that the bot now executes the JavaScript on the pages it crawls. A reverse DNS lookup of 66.249.67.106 returns crawl-66-249-67-106.googlebot.com, and the A record for that name points back to the same IP, so this is in fact a Google Bot.
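For anyone who wants to repeat the check, here is a minimal sketch of that reverse-then-forward DNS lookup in Python. The function name is_verified_googlebot and the injectable reverse/forward hooks are my own illustration, and the suffix list only covers the googlebot.com domain seen above, so treat it as an assumption rather than a complete rule.

```python
import socket

# Assumption: only the domain seen in the log above is accepted here.
TRUSTED_SUFFIXES = (".googlebot.com",)


def is_verified_googlebot(ip, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check for an IP claiming to be the bot.

    `reverse` maps an IP to a host name and `forward` maps a host name to a
    list of IPs. Both default to real DNS lookups but can be replaced,
    which keeps the logic testable without network access.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip).lower().rstrip(".")
        # Step 1: the PTR name must sit under a trusted domain.
        if not host.endswith(TRUSTED_SUFFIXES):
            return False
        # Step 2: the A record of that name must point back to the same IP.
        return ip in forward(host)
    except OSError:
        # Covers socket.herror and socket.gaierror (failed lookups).
        return False
```

Calling is_verified_googlebot("66.249.67.106") with the defaults performs live lookups, so the answer depends on what DNS returns at the time. The second step matters because anyone who controls the reverse zone for their own IP can make it claim a googlebot.com name.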
Moreover, a few lines below there is this:
66.249.67.106 ... "GET /content/halloc/index.html?&x=clcgvsgizgxhfzvf ...
This is a URL that is fetched via Ajax by a JavaScript function in response to a menu item click. Also, note the x argument: it is added dynamically, and only by that specific function. This means that the bot now emulates a user clicking around the site and then sees which actionable items lead to which additional pages.
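To look for the same pattern in other logs, something like the following rough Python sketch could group requests by the value of that x argument. The helper name requests_by_token is hypothetical, and the parser only assumes that each line starts with the client IP and contains a quoted GET request; the "..." parts are the fields elided above and are left as they are.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Client IP at the start of the line, then the target of the quoted GET.
REQUEST_RE = re.compile(r'^(\S+) .*?"GET (\S+)')

# The two log lines quoted in this post, elisions included.
SAMPLE = [
    '66.249.67.106 ... "GET /ajax/xr/ready?x=clcgvsgizgxhfzvf HTTP/1.1" ...',
    '66.249.67.106 ... "GET /content/halloc/index.html?&x=clcgvsgizgxhfzvf ...',
]


def requests_by_token(lines, param="x"):
    """Group (ip, path) pairs by the value of one query parameter."""
    groups = {}
    for line in lines:
        match = REQUEST_RE.match(line)
        if not match:
            continue  # not a GET request line we understand
        ip, target = match.groups()
        parts = urlsplit(target)
        # parse_qs skips the empty field produced by the leading "&".
        for value in parse_qs(parts.query).get(param, []):
            groups.setdefault(value, []).append((ip, parts.path))
    return groups


if __name__ == "__main__":
    for token, hits in requests_by_token(SAMPLE).items():
        print(token, hits)
```

Both requests end up under the same token, which ties the document.ready() call and the menu-driven fetch together in the log.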
This is pretty incredible in and of itself.
It also means that Google Bot has smartened up to the point where it can crawl an arbitrary Ajax-ified website and not just "some dynamic comments" as before.
Good stuff, never liked that escaped_fragment contraption anyway :)