We are building indigenous datasets and AI models, says IT minister Ashwini Vaishnaw

Union minister Ashwini Vaishnaw tells the Rajya Sabha that India’s AI mission will develop local datasets, multilingual models and research hubs to address domestic challenges and expand economic and employment opportunities.

Union Electronics and IT Minister Ashwini Vaishnaw on Friday outlined the government’s strategy for artificial intelligence (AI), saying it is aimed at making AI resources accessible to all, addressing India-specific challenges and creating economic opportunities.

Replying to questions in the Rajya Sabha, Vaishnaw said the approach builds on the existing technology ecosystem, which the government estimates will generate over $280 billion in annual revenue this year and employs more than six million people.

He said there are more than 1,800 in the country, of which over 500 work on AI and that 89% of founded last year were AI-powered. Citing ‘s AI rankings, he said India ranks among the top countries in AI skills, capabilities and policy frameworks and is the second-largest contributor to AI projects on GitHub.

According to the minister, the IndiaAI Mission, launched in 2024, is intended to create an “inclusive AI ecosystem” aligned with national development goals. A key element, he said, is the development of high-quality datasets relevant to local conditions.

Several platforms and initiatives are being used to build these datasets. Vaishnaw said AIKosh — a data platform that integrates government and non-government datasets — currently offers more than 1,200 India-specific datasets and 217 AI models in sectors such as health, agriculture and education.

Other resources, he said, include farmer query data from Kisan Call Centres, geological datasets from states, and medical imaging data for diagnosing brain lesions.

The minister said AIKosh is supported by the Bharat Data Exchange, an extension of the Open Government Data platform, which serves as a repository for shareable government data in machine-readable formats.

Language technology is another area of focus. Under the National Language Translation Mission’s Bhashini platform, citizens contribute voice, text and translations in 22 Indian languages via the BhashaDaan portal.

Vaishnaw said the data reflects India’s linguistic diversity, including dialectal and regional variations, and is used to build AI models for speech recognition, machine translation and other tools. Over 70 research institutions are involved in curating these resources, he said.

The minister also detailed work under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS), which has established 25 Technology Innovation Hubs in academic institutions focusing on AI, machine learning, IoT, robotics, cybersecurity and quantum technologies.

He said the IIIT Hyderabad hub has developed more than 105 India-specific datasets, digitised over 2,000 pathology images, and created an India Driving Dataset downloaded in more than 30 countries.

The BharatGen consortium — involving IIT Bombay, IIT Madras, IIT Kanpur and others — has assembled a large India-centric corpus, including trillions of text tokens, thousands of hours of multilingual speech and millions of local documents.

At IISc Bengaluru’s ARTPARK, the Vaani dataset contains 16,000 hours of audio in 54 languages from 80 districts, while the MIDAS project has developed medical imaging datasets for public health.

In the health sector, Vaishnaw said the Indian Council of Medical Research has created a Health Research Data Repository to provide secure access to clinical datasets, in compliance with WHO, ISO and national health protocols.

This includes data from the National NCD Monitoring Survey, a 12-year diabetes study covering over 113,000 participants, tuberculosis treatment trials, an antimicrobial resistance network and chest radiograph datasets.

Other measures cited include ₹1,000 crore allocated under the IMPRINT and Uchhatar Avishkar Yojana for AI curriculum development and joint ; the “AI-for-Science” initiative of the Anusandhan National Research Foundation, which applies AI to research in physics, chemistry and biology; and the India AI Open Stack, which offers a base AI architecture tailored for Indian researchers.

Vaishnaw said these efforts are intended to create datasets and AI models that are locally relevant, multilingual and usable across sectors, while enabling startups, academia and industry to work with standardised resources.

Tech Observer Desk
Tech Observer Desk at TechObserver.in is a team of technology reporters, led by a senior editor, that brings the latest updates and developments from the world of technology.