ColinMcEnroe

"the only people for me are the mad ones, the ones who are mad to live, mad to talk, mad to be saved, desirous of everything at the same time..." -- Kerouac

Thursday, September 08, 2005

Wall Street Journal this week

New Search Engines Help Users Find Blogs

Users Say Google and Yahoo Fail to Locate Latest Postings; A Guide to the Top Sites
By VAUHINI VARA
Staff Reporter of THE WALL STREET JOURNAL
September 7, 2005; Page D1

The race is on to become the Google of blogs.

Web logs, online diaries written and published by everyone from college students to big media companies, are being created and updated at an astonishing rate -- and established search companies such as Google Inc. and Yahoo Inc. don't always catch them fast enough. Now, a handful of closely held upstarts such as Technorati Inc., Feedster Inc. and IceRocket.com LLC see an opportunity: Build a search engine that can track the information zipping through blogs, nearly in real time.


Search engine Technorati tracks about 16.5 million blogs.


The new sites are gaining traction with users looking to sample what people are talking about online, from the fallout from Hurricane Katrina to silly celebrity gossip. As free tools make it easier for even the most technophobic to publish online, there's a growing demand for services to sift through the clutter.

The new services, some of which are less than a year old, aren't without their glitches. The technology is still evolving and companies are still looking for the best way to track and sort blogs. Some services miss large numbers of blogs, while others pull up irrelevant sites.

Still, the tech-savvy are flocking to them. Julie Meloni, a 31-year-old Web designer in San Jose, Calif., often uses Google to find how-to guides for design tricks. But to learn what other Web designers are saying about a new development in the industry, she turns to Technorati to search blogs. "You can hear what the unofficial word is," she says. "You can watch the buzz happen."

Just a few years ago, the term "blog" didn't exist. Now, many people follow blog postings as they would a favorite television show. Others turn to them for news. No one knows exactly how many blogs exist. But the number of them tracked by Technorati has doubled every five months or so to, most recently, about 16.5 million. The rapid proliferation has made it increasingly frustrating for Web users to find what they're looking for.
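
To make the "doubled every five months or so" figure concrete, here is a minimal Python sketch of the arithmetic. The function name and the assumption of a perfectly steady doubling rate are illustrative only; the article gives just the 16.5 million count and the approximate doubling period, and makes no forecast.

```python
def blogs_tracked(current_count: float, months_from_now: float,
                  doubling_months: float = 5.0) -> float:
    """Estimate a blog count assuming steady doubling (an idealization).

    Negative months_from_now looks backward in time.
    """
    return current_count * 2 ** (months_from_now / doubling_months)


CURRENT = 16.5e6  # Technorati's tracked blogs, per the article

five_months_earlier = blogs_tracked(CURRENT, -5)  # half the current count
five_months_later = blogs_tracked(CURRENT, 5)     # double, if the pace held

print(f"{five_months_earlier:,.0f}")
print(f"{five_months_later:,.0f}")
```

Under that idealized assumption, the count five months earlier would have been about half the current figure, and it would double again in another five months only if the pace continued.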

For those who want just a small taste of what prominent bloggers are saying, DayPop is a good place to go. It culls its search results from fewer than 60,000 blogs chosen by editors. That means it's likely to offer up relatively few links, mostly to well-known bloggers like Andrew Sullivan and Dan Gillmor. Sites like Technorati, Feedster, IceRocket and BlogPulse scour far more blogs -- between 15 million and 20 million each -- so searches on those sites deliver far more results, often from obscure sources. While Technorati and BlogPulse focus exclusively on blogs, other sites -- Feedster and IceRocket included -- offer the option to bring in mainstream news sources.

Search results often vary widely between sites. A blog search yesterday afternoon for "William Rehnquist" and "John Roberts" returned more than 2,000 results on Feedster, with 85 posted that day. BlogPulse, meanwhile, offered 704 results, with just five from that day -- three from the same Web site. Technorati and IceRocket fell somewhere in between with 845 and 1,295 results, respectively, with about 50 on each site posted that day. Of all the recently posted blog entries -- from sources ranging from confused college students to well-regarded political pundits -- few showed up on the first page of results for more than one search site.


The big general search engines such as Google, Yahoo and Microsoft Corp.'s MSN do include blog pages in Web and news searches, but so far don't allow users to conduct blog-only searches. For news searches, the sites update their listings of articles several times an hour. But for Web searches, they build their indexes by sending automated Web crawlers scurrying through the Internet, taking snapshots of all the Web pages they visit. They then sort their results based heavily on relevance, using complex (and guarded) formulas that display results based, in part, on how popular a particular site is.

The process means that the big sites don't always deliver the freshest Web search results. The day after MTV's annual Video Music Awards show, the first result in a Google search for "video music awards" was the official MTV Web site. The second result was a blog entry about gadgets toted by celebrities at last year's show. The third: A review of the awards show -- written three years ago. The blog search sites, meanwhile, offered links to chatty -- and timely -- gossip about stars' outfits and rambling acceptance speeches this year.

While Google, Yahoo and Microsoft search billions of Web pages, blog search sites typically focus on between 10 million and 20 million blogs. But, in many ways, the upstarts are as different from each other as they are from the giants. Technorati, for instance, relies mostly on a mechanism called "pinging" to monitor blogs. Most bloggers maintain their journals through blog publishing services like Blogger or LiveJournal, which have features that can automatically send out a "ping" to notify search services when a user's blog has been updated. David Sifry, chief executive of Technorati, says his company gets an edge from exclusive deals in which some blog-hosting companies ping Technorati before anyone else. After receiving a heads-up, Technorati visits the blog and updates its database.
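
The ping-then-visit flow described above can be sketched in a few lines of Python. This is a toy model, not Technorati's system: the class name, the `fetch` callback, and the `fake_web` dictionary standing in for the live Web are all hypothetical, and the article does not say which protocol or data structures the company uses.

```python
from typing import Callable, Dict, List


class PingDrivenIndex:
    """Toy blog index that re-fetches a blog only when it receives a ping."""

    def __init__(self, fetch: Callable[[str], List[str]]) -> None:
        self._fetch = fetch                    # how to retrieve a blog's posts
        self._posts: Dict[str, List[str]] = {}  # blog URL -> indexed posts

    def receive_ping(self, blog_url: str) -> None:
        # A ping is only a heads-up; the index then visits the blog itself.
        self._posts[blog_url] = self._fetch(blog_url)

    def search(self, term: str) -> List[str]:
        needle = term.lower()
        return sorted(
            url for url, posts in self._posts.items()
            if any(needle in post.lower() for post in posts)
        )


# Hypothetical stand-in for the live Web.
BLOG = "http://example.com/blog"
fake_web = {BLOG: ["First post about web design"]}

index = PingDrivenIndex(lambda url: list(fake_web[url]))
index.receive_ping(BLOG)

# The blogger publishes a new post, but no ping has been sent yet.
fake_web[BLOG].append("Notes on Hurricane Katrina")
before_ping = index.search("katrina")

# The publishing service pings; the index re-visits and updates.
index.receive_ping(BLOG)
after_ping = index.search("katrina")
```

In this sketch the new post is invisible to searchers until the ping arrives, which is why being pinged first could translate into fresher results.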

Feedster monitors pings, too, but also sleuths for new entries on its own by automatically combing through news feeds, which are summaries of blog entries that can include just a few paragraphs. But the use of news feeds means that Feedster might miss a blog entry that mentions, say, Hurricane Katrina in the last paragraph. IceRocket relies a little less on pings, and more on automated Web crawlers, which surf from site to site looking for new entries. The crawlers can distinguish blogs from other Web pages because most blogs look the same, with chronologically arranged entries, separate headings for each one, and so on.
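
The news-feed limitation can be illustrated with a short Python sketch. It assumes, purely for illustration, that a feed carries only the first two paragraphs of an entry; the helper names and the sample entry are made up, and real feeds vary in how much text they include.

```python
def feed_summary(entry_text: str, max_paragraphs: int = 2) -> str:
    """Return only the first few paragraphs, as a summary-style feed might."""
    paragraphs = [p.strip() for p in entry_text.split("\n\n") if p.strip()]
    return "\n\n".join(paragraphs[:max_paragraphs])


def mentions(text: str, term: str) -> bool:
    """Case-insensitive check for a search term."""
    return term.lower() in text.lower()


# Hypothetical blog entry whose last paragraph mentions the search term.
entry = "\n\n".join([
    "Spent the morning redesigning the site header.",
    "The new stylesheet loads much faster.",
    "Also, thinking of everyone affected by Hurricane Katrina.",
])

found_in_full_text = mentions(entry, "Hurricane Katrina")
found_in_feed = mentions(feed_summary(entry), "Hurricane Katrina")
```

A search built on the truncated feed misses the entry, while one that reads the full page, as a crawler would, finds it.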

The new blog-search sites draw only a sliver of the visitors that Google, Yahoo and Microsoft's MSN do. Most of them didn't have enough traffic in July to register on the radar of Internet-tracking firm Nielsen/NetRatings. Technorati did, with 642,000 unique visitors. But its traffic still made up less than 1% of Google's visitors that month.
