Digg This: ISI Computer Scientist Forecasts Social Network Behavior

Next, Kristina Lerman hopes to learn how a story goes viral.

August 11, 2010

Computer scientist Kristina Lerman of the Information Sciences Institute recently took a look at Digg and found that watching the behavior of relatively few superusers foretold the fate of newly posted stories.

Digg is the news aggregation web site that posts 25,000 new stories every day. Lerman and Tad Hogg of the Institute for Molecular Manufacturing analyzed postings in the site's "upcoming" list -- stories that are held in a queue waiting to be "promoted" to the main pages of the site -- trying to predict which stories would become popular.

The pair presented their resulting paper, "The Social Dynamics of Digg," at the 4th International Conference on Weblogs and Social Media, held recently in Washington, D.C.

Lerman is a Project Leader at the Information Sciences Institute and holds a joint appointment as a Research Assistant Professor in the USC Viterbi School of Engineering's Computer Science Department. She and Hogg used mathematical operators similar to the ones used by biologists to describe the collective behavior of social insects to study the behavior of web users.

They found that not all users on Digg are equally influential in promoting a story. "The top 30 users -- the so-called 'superusers' -- were responsible for the vast majority of the stories posted to the front page of Digg," said Lerman.

Such superusers are linked to hundreds or even thousands of other users, so when they recommend a story, their linked users can in turn promote it by voting for it until it ends up on one of the main pages of Digg.

Lerman and Hogg hypothesized that by observing the reactions to a story shortly after it was posted on the site, they could predict how fast a particular news item would be promoted to the main Digg front page.
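To make the idea of forecasting from early reactions concrete, here is a toy sketch. It is not Lerman and Hogg's model, which the article does not spell out. It assumes, purely for illustration, that a story's final vote count follows a power law in its early vote count, and fits that relationship by least squares on log-transformed counts. The function names and the training numbers are hypothetical.

```python
import math

def fit_log_linear(early_votes, final_votes):
    """Least-squares fit of log(final) = a + b * log(early).

    Assumption for illustration only: this power-law form is not
    taken from the paper described in the article.
    """
    xs = [math.log(v) for v in early_votes]
    ys = [math.log(v) for v in final_votes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope in log-log space
    a = mean_y - b * mean_x  # intercept in log-log space
    return a, b

def predict_final_votes(a, b, early):
    """Map an early vote count to a predicted final vote count."""
    return math.exp(a + b * math.log(early))

# Made-up training data: votes shortly after posting vs. final votes.
early = [5, 10, 40]
final = [50, 100, 400]

a, b = fit_log_linear(early, final)
estimate = predict_final_votes(a, b, 20)
print(round(estimate))
```

In this made-up data every story ends with ten times its early votes, so the fitted slope is 1 and a story with 20 early votes is forecast to reach about 200. Real vote histories would be far noisier.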

"We can then use this 'crowd sourcing' to predict whether the posted news item will go viral," says Lerman.

A key finding of her work is that the popularity of a particular item posted on a site like Digg depends less on the content of the posting than on the links of the person who posts it.

Lerman decoded the 'friendship' links on sites like Digg to figure out who the superusers are. Then by following the postings and the Digg network's reaction to the initial appearance of an item, a prediction model can anticipate how popular the item will eventually become on the site.
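The article does not say exactly how the friendship links were decoded. One simple way to picture the first step, under the assumption that the best-connected users are the ones with the most fans, is to count incoming links per user and keep the top few. The function name, the (fan, followed) pair format, and the sample links below are all hypothetical.

```python
from collections import Counter

def top_linked_users(fan_links, k):
    """Return the k users with the most fans.

    fan_links is an iterable of (fan, followed) pairs. Ranking by fan
    count is an illustrative assumption, not the published method.
    """
    # Deduplicate pairs so a repeated link is counted only once.
    fan_counts = Counter(followed for _fan, followed in set(fan_links))
    return [user for user, _count in fan_counts.most_common(k)]

# Made-up friendship links: (fan, user the fan follows).
links = [
    ("ann", "zed"),
    ("bob", "zed"),
    ("cat", "zed"),
    ("ann", "bob"),
    ("zed", "bob"),
    ("cat", "ann"),
]

top = top_linked_users(links, 2)
print(top)
```

Here "zed" has three fans and "bob" has two, so they come out on top. On a real site the same count would run over many thousands of links, and the cutoff (30 in Lerman's description) would be a choice made by the analyst.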

What use would it be for people to see, right after posting to the web, how their message is being received and then passed on to other users?

Lerman says: "Marketing could know ahead of time: is my campaign working or not? The political people could be spreading messages and asking: Are my messages working or not?"

"The new social media sites offer a glimpse into the future of the web," she says. "Rather than passively consuming information, users will actively participate in creating, evaluating, and disseminating information."

One of the key points of the work is to leverage what's learned from sites like Digg and its use of "crowd sourcing" to perceive other social networking patterns.

"Social media sites, such as Digg, show that it is possible to exploit the activities of others to solve hard information processing problems. We expect progress in this field to continue to bring novel solutions to problems and information processing, personalization, search and discovery," says Lerman.

While superusers can help move stories to Digg's front page, they have relatively little influence in getting a news item to go viral -- that is, to become ubiquitous on numerous networks, not just Digg.

"When we looked at the viral stories," says Lerman, "all we can say is that the content (of these viral stories) is unpredictable."

Lerman is now trying to change this situation.

"We are using these mathematical models to understand how a group of web users reacts to a story, and then use this information to predict whether the story will go viral."