Internet has a non-English future too!

CIOL Bureau

MUMBAI, INDIA: For Internet users worldwide, the Internet would be of little use without familiarity with the English language. Those who do not know English are at a considerable disadvantage when searching for data and information, as most of the vast resources available on the Internet are in English.

Interestingly, the problem is not limited to people; machines have the same handicap. It is quite difficult for a computer to process a query in another language and return results. As of now, the power of the Internet is directly tied to the power of English.

Naturally, as the Internet has become an integral part of life, it is imperative to break this language barrier, both for man and for machine.

To address such critical issues as language-based search and results, cross-lingual queries, machine translation and related technologies, the Indian Institute of Technology Bombay (IITB) has organized the fifth International Conference of the Global WordNet Association (GWC) in Mumbai.

Speaking to CyberMedia News, Professor Pushpak Bhattacharya from the Department of Computer Science and Engineering at IIT Bombay said that the Global WordNet Association, which was formed a decade ago, works on lexical resources and words across languages, and on linking them with computers so that machines can provide query results across languages.

Professor Bhattacharya, who is also the national coordinator of the India-based Indo WordNet Group, explained that the group comprises scholars, experts, professors and researchers from IIT Bombay and several educational institutions across India.

“The Indo WordNet Group, formed five years ago, mainly focuses on machine translation between Indian languages and English. The group is the brainchild of IIT Bombay and is presently led by 16 members from the institute,” he added.

Professor Bhattacharya also agreed that the Internet has far more content in English than in local languages. The volume of content in Indian languages is quite low compared to English content, which puts non-English speakers at a disadvantage.

“There is a need for cross-lingual search, which means asking queries in one language and receiving results in other languages. In simple terms, if a Punjabi-speaking person types or searches for information in Punjabi, then the machine needs to translate the query and also provide the results in Punjabi and other languages,” he explained.
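
For readers curious about what this looks like in practice, the idea can be illustrated with a toy sketch in Python. It is not the Indo WordNet Group's implementation: the lexicon, the concept IDs, the sample documents and the function names below are all hypothetical, chosen only to show how words from different languages can be mapped to a shared concept so that a Punjabi query matches English and Hindi text.

```python
from typing import Dict, List, Set

# Hypothetical mini-lexicon: concept ID -> language code -> word.
# Real linked wordnets are far richer; this only mimics the idea
# that words in different languages point to one shared concept.
LEXICON: Dict[str, Dict[str, str]] = {
    "c_water": {"en": "water", "pa": "ਪਾਣੀ", "hi": "पानी"},
    "c_book": {"en": "book", "pa": "ਕਿਤਾਬ", "hi": "किताब"},
}

# Hypothetical document collection in more than one language.
DOCUMENTS: List[Dict[str, str]] = [
    {"id": "d1", "lang": "en", "text": "clean water for every village"},
    {"id": "d2", "lang": "hi", "text": "पानी बचाओ"},
    {"id": "d3", "lang": "en", "text": "a book about rivers"},
]


def concepts_for(word: str, lang: str) -> Set[str]:
    """Return the concept IDs whose entry for `lang` equals `word`."""
    return {cid for cid, forms in LEXICON.items() if forms.get(lang) == word}


def cross_lingual_search(query: str, query_lang: str) -> List[str]:
    """Return IDs of documents sharing at least one concept with the query."""
    query_concepts: Set[str] = set()
    for token in query.split():
        query_concepts |= concepts_for(token, query_lang)

    hits: List[str] = []
    for doc in DOCUMENTS:
        doc_concepts: Set[str] = set()
        for token in doc["text"].split():
            doc_concepts |= concepts_for(token.lower(), doc["lang"])
        # A document matches when it shares a concept, whatever its language.
        if query_concepts & doc_concepts:
            hits.append(doc["id"])
    return hits


if __name__ == "__main__":
    # A Punjabi query for "water" retrieves the English and Hindi documents.
    print(cross_lingual_search("ਪਾਣੀ", "pa"))
```

Matching on concepts rather than on surface words is what lets the query and the documents be in different languages; a production system would also need tokenization, word-sense disambiguation and translation of the results themselves, none of which this sketch attempts.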

Professor Bhattacharya further said that the Indo WordNet Group has already developed Hindi WordNet, a cross- and multi-lingual search tool that is available in licensed and free versions and is used commercially by many organizations and Internet companies in India.

Hindi WordNet has been accepted by the Linguistic Data Consortium (LDC) at the University of Pennsylvania and was developed at the Center for Indian Language Technology, Computer Science and Engineering Department, IIT Bombay.

Moreover, he added that the group is presently developing WordNets in Marathi, Sanskrit and Assamese, as well as south Indian languages, at various education and technology institutes across India.

“Language-based word search will be the future of the Internet in the coming years. The World Wide Web Consortium (W3C) has also approved a standard for knowledge representation on the web, the Web Ontology Language (OWL), which is linked with WordNet, word translations and results,” Professor Bhattacharya concluded.