Global IP Alliance

Sharing the experience in the Communications Industry

Something like

Absolute constraints to indexing via web-crawlers due to communications overhead as content explodes

The idea is that content production is racing ahead of our ability to index it, and that there are fundamental processing and bandwidth limits.

E.g. if you have a site with 1 million pages, Googlebot has to hit your site about 12 times per second to re-crawl every page once a day (1,000,000 pages / 86,400 seconds ≈ 11.6). At some point bandwidth restrictions and response times will simply not allow for a complete index. The threshold is probably far lower; ergo, the Internet is becoming opaque.
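A rough sketch of the arithmetic behind that figure, assuming "about 12 times per second" means re-fetching every page once per day. The function names and the one-day refresh window are illustrative assumptions, not anything Google publishes.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def required_fetch_rate(pages: int, recrawl_days: float = 1.0) -> float:
    """Fetches per second needed to revisit every page once per window.

    recrawl_days is an assumed refresh window, not a known crawler setting.
    """
    return pages / (recrawl_days * SECONDS_PER_DAY)


def max_fresh_pages(fetches_per_second: float, recrawl_days: float = 1.0) -> int:
    """Largest site that can be fully revisited at a given sustained rate."""
    return int(fetches_per_second * recrawl_days * SECONDS_PER_DAY)


if __name__ == "__main__":
    # 1 million pages, refreshed daily
    print(round(required_fetch_rate(1_000_000), 2))  # 11.57
    # A site that tolerates only 1 fetch per second
    print(max_fresh_pages(1.0))  # 86400
```

Read the other way around, a site that can only tolerate one crawler request per second tops out at 86,400 pages per day, which is one way to put a number on "the threshold is probably far lower".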

Any comments on that?



Replies to This Discussion

If you think about it, crawling the web by polling is just about the most inefficient way possible to find out about something. It would be much better if sites published their complete index and kept it updated. Of course, this immediately runs into a host of other problems: we have eliminated the discovery element, and we could quickly run into index spamming.
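To put a rough number on the polling-versus-publishing point, here is a minimal sketch comparing fetch counts per refresh cycle. It assumes a hypothetical site-published change list that costs one request to retrieve; the 1% change rate is a made-up figure for illustration, and the sketch ignores the discovery and spam problems mentioned above.

```python
def polling_fetches(pages: int) -> int:
    """Blind polling: every page is fetched just to learn whether it changed."""
    return pages


def published_index_fetches(pages: int, changed_fraction: float) -> int:
    """Hypothetical published change list: one fetch for the list itself,
    plus one fetch per page the list reports as changed."""
    changed = round(pages * changed_fraction)
    return 1 + changed


if __name__ == "__main__":
    pages = 1_000_000
    print(polling_fetches(pages))                # 1000000
    print(published_index_fetches(pages, 0.01))  # 10001
```

Under those assumptions the crawler's load scales with how much content changes rather than with how much content exists, but only if the published list can be trusted.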
So would you expect to have search caching based on tag lines?
Not really expecting anything in particular, just thinking that as the public internet grows and grows there will be an absolute limit to indexing all the content, and beyond that limit it becomes opaque, kind of like dark matter. This may force a move to IPv6 so that sites can expose their content more directly to web crawlers and similar technology.

Just speculating on the limits of the current situation. Right now the possibility of a constrained search space really works in Google's favor, because if indexing does have an upper bound, getting your URIs into Google will be more important than ever.

Mainly though I was wondering if anyone had any thoughts about this.


A gathering point look at the future of the Internet on all its support networks.

© 2009   Created by Carl Ford on Ning.