Posted on 2009-03-17 17:19
Robert Su
Category: Engineering
http://www.freepint.com/gary/direct.htm#top
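Most search engines have a very big problem, and many people have already become aware of it.
The problem is that the vast Web contains "invisible pages" that general-purpose search engines such as Google and Baidu cannot crawl.
These pages make up a fairly large share of the Web,
and with the heavy use of AJAX and RIA in particular, crawlers face a considerable challenge...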
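
To make the AJAX point concrete, here is a minimal sketch of a crawler's link-extraction step, assuming the crawler only parses the HTML it downloads and never executes scripts. The sample page, the URLs in it, and the names `LinkExtractor` and `extract_static_links` are all made up for illustration; this is not how any particular search engine works.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags found in the static markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_static_links(html):
    """Return the links a non-JavaScript crawler would see in this page."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page: one plain link, plus one link that exists only
# after the script has run in a browser.
SAMPLE_PAGE = """
<html><body>
<a href="/about.html">About</a>
<div id="results"></div>
<script>
  // An AJAX-style page would fetch data, then build this link client-side.
  var el = document.getElementById("results");
  el.innerHTML = '<a href="/catalog/item-42">Item 42</a>';
</script>
</body></html>
"""

if __name__ == "__main__":
    print(extract_static_links(SAMPLE_PAGE))
```

The parser treats the contents of the script element as opaque text, so only `/about.html` is collected. The link to `/catalog/item-42` is never discovered, and any page reachable only through it stays invisible to this kind of crawler.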
To be continued
There's a big problem with most search engines, and it's one many
people aren't even aware of. The problem is that vast expanses of the
Web are completely invisible to general purpose search engines like
AltaVista, HotBot and Google. Even worse, this "Invisible Web" is in
all likelihood growing significantly faster than the visible Web
you're familiar with.
So what is this Invisible Web and why aren't search engines indexing
it? To answer this question, it's important to first define the
"visible" Web, and describe how search engines compile their indexes.