MeitY, along with CFILT, IIT Bombay, has announced the English-Marathi Parallel Corpus Creation for Machine Translation (MPaCT) challenge. As part of the National Language Translation Mission, the challenge aims to encourage the advancement of Machine Translation technology in Indian languages.
About the Challenge
The challenge statement notes that Machine Translation (MT) is one of the most-used language technologies today. With the spread of the internet and a globalised economy, the demand for translation between languages keeps growing. Over the last two decades, MT technology has taken significant strides forward thanks to the adoption and advancement of data-driven approaches to natural language processing. Data is now the key driver of progress in MT: training the Machine Learning models used in MT requires large volumes of training data in the form of parallel corpora, that is, texts paired with their translations.
Unfortunately, large parallel corpora are not available for many Indian languages. This has been the major barrier to progress in MT technology for those languages.
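For readers unfamiliar with the term, a parallel corpus can be pictured as a list of aligned sentence pairs. The Python sketch below is only an illustration under stated assumptions: the function name `build_parallel_corpus` and the two sample sentence pairs are hypothetical and are not taken from the challenge data or its submission format.

```python
from typing import List, Tuple


def build_parallel_corpus(
    english_lines: List[str], marathi_lines: List[str]
) -> List[Tuple[str, str]]:
    """Pair line i of the English text with line i of the Marathi text.

    Hypothetical helper for illustration; the challenge does not
    prescribe this format.
    """
    if len(english_lines) != len(marathi_lines):
        raise ValueError("Both sides must have the same number of lines")

    pairs = []
    for en, mr in zip(english_lines, marathi_lines):
        en, mr = en.strip(), mr.strip()
        # Skip pairs where either side is empty, since they carry
        # no usable translation signal.
        if en and mr:
            pairs.append((en, mr))
    return pairs


# Illustrative sentences only, not challenge data.
english = ["My name is Ram.", "This is a book."]
marathi = ["माझे नाव राम आहे.", "हे पुस्तक आहे."]

pairs = build_parallel_corpus(english, marathi)
for en, mr in pairs:
    print(f"{en}\t{mr}")
```

An MT system learns from many such pairs, which is why the size and quality of the corpus matter so much.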
Addressing the gap in Marathi and MT
To address the data gap in the Marathi language, CFILT, IIT Bombay is setting up the English-Marathi parallel corpus creation challenge. It is open to participants from industry and academic institutions.
As part of the challenge, a two-pronged approach will be taken to create a high-quality English-Marathi parallel corpus:
1. Translation: IIT Bombay will provide text documents in English, and participants must produce high-quality Marathi translations.
2. Community contribution: Participants are further encouraged to contribute parallel data from any domain with the goal of collectively building a large multi-domain English-Marathi parallel corpus.
Timeline of the event
Applications opened on 5 August 2020. The last date to apply is 12 August 2020, and the results will be announced on 15 September. The full terms and conditions are available here.
Incentives for participation
CFILT, IIT Bombay will commission a subset of top-ranking participants to help build a large English-Marathi parallel corpus for its MT system. The commissioned participants will be compensated for their services as per the norms of the Government of India.
How to participate?
You can participate by registering here. Alternatively, you can log in here and participate.