
Digital library project poses tech challenges

CIOL Bureau

BANGALORE: The sheer scale and complexity of the ambitious Universal Digital Library project, which hopes to digitize a million books worldwide, is posing some interesting problems for IT researchers and academicians.


“The logistics of digitizing so many publications is a considerable problem since this is essentially a university undertaking,” said Dr Raj Reddy, professor of computer science and robotics, Carnegie Mellon University, who leads the project.

He said the project has thrown up interesting issues, such as building a system that can give millions of people access to the publications, solving system design problems, and designing an easy-to-use interface for multilingual users.

The initiative is a collaborative effort involving research organizations in the US, India and China. So far, around 400,000 publications have been scanned and indexed in China and around 200,000 in India. The project aims to build a free, searchable digital library on the Internet. The publications are available in various languages, including Sanskrit, Tamil, Telugu, Kannada, Hindi, Chinese and Arabic, to name a few.


As part of the Digital Library project, various universities, including Carnegie Mellon University, are researching challenges such as text mining, machine translation, summarization, image processing, user interface design, classification and anticipatory analysis.

The key challenge, according to Dr Reddy, is to make relevant information available at the right time. He cited the example of the tsunami disaster that struck South Asia. “The challenge is about how to detect something new in streaming data and being able to extract, categorize and summarize the data in a relevant way,” he said.

Highlighting problems peculiar to India, he said, “Language technologies are very important in India especially since we have so many languages. This is the only country to have 15 official languages. Though most of the languages have the same sounds, there are different scripts. It is unfortunate that we have not been able to solve these problems.”

CyberMedia News
