
We create High-Quality Data
for AI Models.

We create natural, diverse, high-quality datasets that capture true human cognition. Unlock the full potential of your language and agent models with diverse and natural data.

Humble Beginnings 

Founded in December 1994 with a vision to impart computer literacy in rural India.
In July 2000, SoftAge became a limited company, attracting diverse projects in data management.
Set up operations in the UAE, Kenya, Zambia, and Chad in 2011.
Grew from 6 to 14,000 employees in two decades.
Processed 2.5 billion data for 150 clients.
Employed a workforce of 200K across 650 offices over the years.
Values employees as crucial assets for both the company and the client.

In 2022, SoftAge diversified into AI data services.

What we do

Language Agent


Tool Agent

Range of Indian



Human-to-human conversations

Casual chats

Call center recordings



Human reading

Video capture

Record actions

Enterprise AI

Document tagging
using computer vision

Automatic redaction of ID numbers in Aadhaar card documents

Colorization of grayscale documents

We partner with

Torrent Power
Airtel Payments Bank

AI data generation

We have expertise in creating custom datasets including text, images, audio, and video.

Text Data


  • Transcription

  • Summarization

  • Keyword Extraction

  • Text Augmentation

  • Sentiment Analysis

Image Data


  • OCR

  • Image Enhancement

  • Image Recognition

Audio Data

  • Audio Annotation

Video Data

  • Editing

  • Captioning

  • Video Annotation

Other data services

Quality Assurance & Testing

  🎯 Bias Assessment

  🛠️ Model Robustness Testing

Data Pre-processing & Cleaning

  📝 Text Tokenization

  🔇 Noise Reduction for Audio

  🌄 Image De-noising

Data Cleansing

  🔄 Data Imputation

  📊 Data Normalization

Data Collection & Crawling

  📰 News Article Scraping

  🌐 Social Media Data Collection

  📦 Product Catalog Scraping

Custom Solutions

  🤖 Chatbots: Customer Service Automation

We Implemented


May 2002 - July 2002

Outsourced 6 data entry operators (DEOs) for data activation.

Collected 25K forms daily for 3 months from 40 distributors.

2003 - 2008

Segregated 4 million documents in 2.5 months.

Consolidated operations in KYC management and warehousing in more regions.

Verified 3.2 million subscriber addresses in 3 months.


2009 - 2013

Segregated and re-audited 10 million documents.

Established 450 activation centres within 30 days.

Added 200 centres, reaching a total of 650.

2014 - 2018

TeleDoc processed 375K documents per day from 15,000 points of sale (PoS) with a turnaround time (TAT) of 4 hours.

Introduced Digital Know Your Customer (DKYC).


July 2019 - March 2023

Processed 583 million documents for SIM activation.

Scanned and warehoused 1.4 billion pages.

Reconciled 310 million documents for shredding through our custom-built application.

January 2023 - Present

Recorded 12K+ hours of actions for AI model training.

Expanded the team with 150+ skilled professionals.

Processed 115K+ prompts for our clients.



A person dedicated to bringing the "Document Management System" to India at a time when it was unheard of. He has 36 years of well-rounded experience in the IT/ITES industry, mainly in digitization and allied services. With passion and commitment, he has built a world-class document management organization. A person with a simple approach, his expertise lies in acquiring large-scale projects, managing consortiums, and delivering assignments globally.

Yasin Ozair



She leads the company's overall strategic direction and fosters a culture of innovation and automation. Her intuitive sense of direction and strategy supports excellence in execution. Her strong and fair decision-making constantly motivates employees and helps with talent retention and building a healthy work culture. She has 29 years of work experience.

Fahmeda Ozair



He completed his master's degree in computer science at the University of Texas. After working with Mr. Cooper in the USA for 2 years, he joined SoftAge as a director. He has 10 years of experience in technology and 3 years of experience in running a startup.

Danyal Ozair



Harsh has vast experience in the IT consulting field, including auditing, training, and project and program management. He has worked in organizations such as NIIT and CSC. He has excellent development and delivery experience in both the software and BPO space. Having worked with international customers in the US, Europe, and Japan, Harsh has gained insight into the different aspects of portfolio management. A Six Sigma Black Belt, he has been associated with CMMi, ISO 9001:2015, and ISO 27001:2013 certifications.

Harsh Tikku



Most recently, Sherjil Ozair was a Staff Research Scientist at Tesla, focusing on the development of neural networks for complete self-driving capabilities. Previously, he made significant contributions as a Senior Research Scientist at DeepMind, working on Large Language Models (Gemini), offline reinforcement learning (StarCraft II Unplugged), multi-agent reinforcement learning (Stratego), and model-based planning (VQM, stochastic MuZero).


He is also renowned for co-inventing Generative Adversarial Networks. He has a distinguished academic background: he completed a Ph.D. in Machine Learning at Mila under Prof. Yoshua Bengio and holds a Bachelor's and Master's in Computer Science from IIT Delhi. A comprehensive list of his publications can be found on his Google Scholar page.




Aaihsa studied Computer Science at IIT Delhi. With her natural penchant for computing, she has excelled in creating and delivering high-quality data.

Aaihsa's curiosity led her to explore the realms of artificial intelligence, laying the foundation for her future endeavours.

Her approach involves reading research papers, extracting insights, and synthesizing them into datasets. This attention to detail ensures that the data she produces is not only accurate but also relevant to the specific needs of her clients.
She now handles various AI projects, from client interaction and requirements gathering to implementing data curation processes and delivery. She is an expert in LLM data.




Address: 204, Phase IV, Udyog Vihar, Sector 18, Gurugram, Haryana - 122002, India

Contact us for more information
