Union Electronics and IT Minister Ashwini Vaishnaw on Friday outlined the government’s strategy for artificial intelligence (AI), saying it is aimed at making AI resources accessible to all, addressing India-specific challenges and creating economic opportunities.
Replying to questions in the Rajya Sabha, Vaishnaw said the approach builds on the existing technology ecosystem, which the government estimates will generate over $280 billion in revenue this year and which employs more than six million people.
He said there are more than 1,800 global capability centres in the country, of which over 500 work on AI, and that 89% of startups founded last year were AI-powered. Citing Stanford University’s AI rankings, he said India ranks among the top countries in AI skills, capabilities and policy frameworks and is the second-largest contributor to AI projects on GitHub.
According to the minister, the IndiaAI Mission, launched in 2024, is intended to create an “inclusive AI ecosystem” aligned with national development goals. A key element, he said, is the development of high-quality datasets relevant to local conditions.
Several platforms and initiatives are being used to build these datasets. Vaishnaw said AIKosh — a data platform that integrates government and non-government datasets — currently offers more than 1,200 India-specific datasets and 217 AI models in sectors such as health, agriculture and education.
Other resources, he said, include farmer query data from Kisan Call Centres, geological datasets from states, and medical imaging data for diagnosing brain lesions.
The minister said AIKosh is supported by the Bharat Data Exchange, an extension of the Open Government Data platform, which serves as a repository for shareable government data in machine-readable formats.
Language technology is another area of focus. Under the National Language Translation Mission’s Bhashini platform, citizens contribute voice, text and translations in 22 Indian languages via the BhashaDaan portal.
Vaishnaw said the data reflects India’s linguistic diversity, including dialectal and regional variations, and is used to build AI models for speech recognition, machine translation and other tools. Over 70 research institutions are involved in curating these resources, he said.
The minister also detailed work under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS), which has established 25 Technology Innovation Hubs in academic institutions focusing on AI, machine learning, IoT, robotics, cybersecurity and quantum technologies.
He said the IIIT Hyderabad hub has developed more than 105 India-specific datasets, digitised over 2,000 pathology images, and created an India Driving Dataset downloaded in more than 30 countries.
The BharatGen consortium — involving IIT Bombay, IIT Madras, IIT Kanpur and others — has assembled a large India-centric corpus, including trillions of text tokens, thousands of hours of multilingual speech and millions of local documents.
At IISc Bengaluru’s ARTPARK, the Vaani dataset contains 16,000 hours of audio in 54 languages from 80 districts, while the MIDAS project has developed medical imaging datasets for public health.
In the health sector, Vaishnaw said the Indian Council of Medical Research has created a Health Research Data Repository to provide secure access to clinical datasets, in compliance with WHO, ISO and national health protocols.
This includes data from the National NCD Monitoring Survey, a 12-year diabetes study covering over 113,000 participants, tuberculosis treatment trials, an antimicrobial resistance network and chest radiograph datasets.
Other measures cited include ₹1,000 crore allocated under IMPRINT and the Uchhatar Avishkar Yojana for AI curriculum development and joint R&D; the “AI-for-Science” initiative of the Anusandhan National Research Foundation, which applies AI to research in physics, chemistry and biology; and the India AI Open Stack, which offers a base AI architecture tailored for Indian researchers.
Vaishnaw said these efforts are intended to create datasets and AI models that are locally relevant, multilingual and usable across sectors, while enabling startups, academia and industry to work with standardised resources.

