What next for the Internet: Real Time Web

On 21st October, Google and Microsoft announced their tie-ups with Twitter for its feed. Twitter has signed deals to put messages sent via the microblogging service into the Microsoft and Google search indexes. The deals will see messages, or tweets, show up in Bing and Google search results almost as soon as they appear on Twitter. The deals underscore the growing importance of real-time search, but why is there so much fuss about it? The reason is that we are moving beyond Web 2.0 towards the “Real Time Web”, and search is likely to be its cornerstone. The Real Time Web would let you get information as it is being created, whether you are looking for it or not. It is more potent now because there are many more sources of information, such as blogs, microblogs and social media sites, and many more access devices, such as desktops and mobile phones.

Real Time Web is about

  1. Search or discovery of real time content
  2. Transport of data in real time between sites – Semantic Web that would enable people to share content beyond the boundaries of applications and websites
  3. Live activity without requiring refresh (e.g. feed in Friendfeed)

The need for the Real Time Web arises from the enormous amount of data and information that has been created. It needs to be kept structured and made more user-friendly. The video below is a great attempt at showing the scale of the problem that we are talking about:

Video: Did You Know 3.0 – From Meeting in Rome this Year

Google is the undisputed leader of web search, with 67% of all searches conducted on the internet. Google provides better search results by way of its PageRank technology: as a site gets more links from other sites, it rises in the rankings and can ultimately reach the top of the results. This is a slow process, but it gives very good results, which is reflected in the popularity of Google. The search engines crawl the web, and by the time they index the results the content is already stale: you get information as of yesterday, last week or last month. New content may be an improvement on old content, but it is always the old content that has the advantage in SEO-driven search engines. If this is the case, then what is the point of creating new, improved content?
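
To make the "links take time to pay off" point concrete, here is a minimal sketch of the textbook, simplified form of PageRank, computed by repeated iteration. It is not Google's production system; the function name, the damping value of 0.85 and the five-page toy web are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank computed by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: established sites already link to "old_story",
# while nobody links to the freshly published "new_story" yet.
toy_web = {
    "site_a": ["old_story"],
    "site_b": ["old_story"],
    "site_c": ["old_story", "site_a"],
    "old_story": [],
    "new_story": ["old_story"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy web the old story comes out on top simply because links already point to it, while the new story has to wait until other sites link to it.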

The Real Time Web is about getting information in real time, which is why search becomes so important. People want to catch up on the latest and are unwilling to wait for a particular story to rise in the page rank. The popularity of Twitter search is a live example of real-time search. Before I move further, I want to make it clear that I am not suggesting that Google is going to lose its leadership position or that the number of searches on its network is going to come down. I only intend to make a case for a new, emerging stream of real-time search. People need different things from the Internet at different points in time, so today’s web is not going to disappear; it is going to co-exist with the Real Time Web.

Did you try searching for Michael Jackson on Google the day he died? Chances are that either your query was blocked (Google thought that it had been hit by spam) or you got results related to his life and albums. A new generation of search engines like Tweetmeme, OneRiot, Topsy, Scoopler, and Collecta is trying to redefine what makes a piece of information important. The pictures of Michael Jackson or the Hudson River plane crash appeared on social media in real time, as events happened, faster than in the traditional media. This is the Real Time Web. Its final shape and form are still hazy, but the initial green shoots are visible. I am not sure what it will look like either, but one thing is certain: it will be a potent combination of “Social Media + Real Time Search + Collaboration”. Paul Buchheit (of Gmail and Adsense fame) is working on FriendFeed and believes he has just brought to market the next big form of communication online: flowing, multi-person, real-time conversations. Is this another manifestation of the Real Time Web?

Real-time search is the concept of searching for and finding information online as it is produced. Advancements in web search technology, coupled with the growing use of social media, enable online activities to be queried as they occur. We have all seen how we got the latest news about Michael Jackson, the Iran protests or the Mumbai terror attacks, and if that is a pointer, then real-time search is here to stay.

People provide explicit and implicit signals about what they like and what they do not. These signals are not always reflected in the traditional search engines. When someone tweets a link or submits a link on Stumbleupon, we know who is doing it and how credible that person is. If the index is built on the authority of the people who liked a particular story, then the results will be more human than machine-generated results optimized on page rank and inbound links. Other signals, like the actual usage data of sites, can be used as another input for the index. The results are thus likely to be more relevant. It is important to recognize that real-time search is not only about getting the newest content but also about getting the most relevant content faster.
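
As a rough illustration of indexing on the authority of the people who share a story, the sketch below scores each link by adding up the authority of everyone who shared it and compares that with a plain share count. The authority values, user names and URLs are invented for the example; how a real engine would estimate a person's credibility is left open here.

```python
from collections import Counter, defaultdict


def rank_by_sharer_authority(shares, authority, default_authority=0.1):
    """Score each URL by the summed authority of the users who shared it.

    shares is a list of (user, url) pairs; authority maps user -> weight.
    Users without a known weight get default_authority (an assumption).
    """
    scores = defaultdict(float)
    for user, url in shares:
        scores[url] += authority.get(user, default_authority)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical authority weights between 0 and 1.
authority = {
    "veteran_reporter": 0.9,
    "telecom_analyst": 0.7,
    "bot_1": 0.01,
    "bot_2": 0.01,
    "bot_3": 0.01,
}

# Hypothetical shares: three low-credibility accounts push one link,
# two credible people share another.
shares = [
    ("bot_1", "http://example.com/cheap-pills"),
    ("bot_2", "http://example.com/cheap-pills"),
    ("bot_3", "http://example.com/cheap-pills"),
    ("veteran_reporter", "http://example.com/breaking-story"),
    ("telecom_analyst", "http://example.com/breaking-story"),
]

by_count = Counter(url for _, url in shares).most_common()
ranked = rank_by_sharer_authority(shares, authority)

print("By raw share count:", by_count[0][0])
print("By sharer authority:", ranked[0][0])
```

Counting shares alone puts the heavily pushed link first, while weighting by who shared it puts the story from credible people first.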

Many people vent their frustration over poor service delivery by tweeting about it or by updating their status on Facebook. Companies can get great consumer insights if they use real-time search, and they can attempt to solve consumer problems proactively. The positive momentum created by doing so can help retain the consumer for the long term.

The key real-time search engines are:

  1. Twitter Search
  2. Facebook
  3. Scoopler
  4. Collecta
  5. Crowdeye
  6. OneRiot
  7. Topsy
  8. Tweetmeme

Google itself has declared “Real-Time Search” to be its biggest challenge. At Google’s Searchology event, held in May 2009, Marissa Mayer listed the following as the hardest unsolved problems in search:

  1. Finding the most recent information
  2. Expressing that you want just one type of result
  3. Assessing which results are best
  4. Knowing what you’re looking for
  5. Expressing your searches in keywords

Why is Google bothered?

Imagine a situation where Collecta, a real-time search engine, throws up results that are shared on social media sites like Stumbleupon, Digg, etc. That piece of information could rise up the charts on the social media sites and generate a lot of traffic. Google juice (traffic from Google) has been the largest source of revenue for Google Adsense, and that could be under threat if alternative traffic-generating sites emerge.

The other reason why Google needs real time search is to remain comprehensive. Google wants to organize all of the world’s information, including the real-time conversations.

Google has struggled in Web 2.0 and is worried that it may not be able to maintain in the Real Time Web the dominant position it has in the traditional web. Microsoft was a power in IT but could not replicate its success on the web; similarly, Google may not be able to retain its top slot in the Real Time Web. Google Wave is largely an attempt by Google to keep pace with the Real Time Web.

Issues with Real-Time Web

The real-time communications within social networks and microblogging sites like Twitter and Facebook’s Friendfeed have introduced a new immediacy to online interaction. Twitter search is the best place to look for trending real-time information. Yesterday, Twitter reported its 5 billionth tweet; imagine the amount of data that has been generated. Similarly, Facebook, with its 300 million users, has rich data through status updates and Friendfeed. Twitter and Facebook are the key sources of data for the new types of search engines. Other social media sites also contribute by feeding information on the likes and dislikes of surfers to the real-time search engines.

However, there is a lot of garbage in what people write on Twitter or in Facebook status updates. The search engines need to slice and dice the data so that meaningful results can be thrown up. Real-time search is plagued by spam. Finding relevant, fresh information quickly is a challenge that all the search engines are struggling with. The tools to use this information effectively are missing today.

Real-time search is currently focused on social media. It needs to expand to the rest of the web to be able to provide relevant results. Moreover, as a user, I am not sure which search engine I should use. I tried to get results for “Telecom” on Collecta and Topsy and I got completely different results. It’s confusing!

Images: Collecta, Topsy

The other problem with real-time search is that the algorithms are pretty rudimentary. Even Twitter throws up results ordered by freshness rather than by an elaborate algorithm that takes into account the source or authority of the information. It is still early days for real-time search, and new start-up companies working on this kind of search are emerging fast.
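
The difference between ordering purely by freshness and an ordering that also weighs the source can be sketched in a few lines. This is only an illustration under stated assumptions: the one-hour half-life, the authority scores and the sample posts are all invented, and no real engine is claimed to work this way.

```python
def freshness_score(age_seconds, half_life_seconds=3600.0):
    """Exponential decay: the score halves every half_life_seconds."""
    return 0.5 ** (age_seconds / half_life_seconds)


def blended_score(age_seconds, source_authority, half_life_seconds=3600.0):
    """Combine freshness with the authority of the source (0 to 1)."""
    return source_authority * freshness_score(age_seconds, half_life_seconds)


# Hypothetical posts about the same event.
posts = [
    {"id": "anonymous_rumour", "age_seconds": 60, "authority": 0.05},
    {"id": "news_agency_update", "age_seconds": 900, "authority": 0.9},
    {"id": "old_expert_post", "age_seconds": 86400, "authority": 0.95},
]

# Freshness only: the newest post wins, whoever wrote it.
by_freshness = sorted(posts, key=lambda post: post["age_seconds"])

# Freshness and authority: a recent post from a credible source wins.
by_blend = sorted(
    posts,
    key=lambda post: blended_score(post["age_seconds"], post["authority"]),
    reverse=True,
)

print([post["id"] for post in by_freshness])
print([post["id"] for post in by_blend])
```

With these numbers the minute-old rumour leads the freshness-only list, while the fifteen-minute-old agency update leads the blended list and the day-old post drops to the bottom of both.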

The commercialization of the Real Time Web is doubtful, as only a fraction of real-time conversations are of commercial value. There could be some commercial value, but finding the right business model to monetize it may not be easy. We have already seen that many seemingly successful internet ventures are struggling to get the cash registers ringing (Youtube and Twitter also fall in this category).

In summary, real-time is an emerging phenomenon, which means much of the value we may draw from it in the future is unknown. The Real Time Web’s value emerges from the fact that a number of people are simultaneously talking about something that could be of interest to others. They share and talk about a wide range of topics that are current, valuable and human. In the Real Time Web, we should expect faster information, faster technology, and more filters to help us control it. The missing links in the Real Time Web are coming together very fast, but till then let’s sit back and enjoy the rivalry between Google and Facebook.

Video: Google and the Real-Time Web

Google Seattle engineering director Peeyush Ranjan outlined the challenges Google faces as it tries to incorporate new and updated pages into its index to make sure people get the latest information — trying to figure out what’s truly important to surface at any given moment, while still giving appropriate weight to well-established pages. Recorded Aug. 27, 2009 at Google’s Seattle engineering “tech talk” event.

Video: Kevin Kelly – “Web 3.0”
