Ghosts! The very term is evocative. Ghosts are widely perceived as parasites feeding off the human body, and the term generates a strong unconscious urge to get rid of them.

The hard part though is to convince people that ghosts exist because otherwise they would not be too willing to deploy the steps required to remove them.

This article describes how UIDAI performed the magic act of conjuring duplicates in various welfare schemes out of thin air to justify the Aadhaar project to the Supreme Court of India in September 2015.

But first the magic act needs to be described.

UIDAI’s magic act

The first ingredient of the magic is deduplication.

There are so many people who loot the government by enrolling themselves multiple times with any welfare scheme, thus depriving the truly eligible. Once Aadhaar linking is done, it is easy to find the duplicates because they will have the same Aadhaar number. This is the magic story.

What, then, are the official deduplication rates claimed by the government for various welfare schemes? The following table shows the government’s claims made before the Supreme Court in September 2015.

Government claims on ghosts and duplicates made in the Supreme Court in 2015

The columns from the SC affidavit

  1. Name of the state
  2. Number of beneficiaries in this state
  3. The percentage of beneficiaries who have linked their Aadhaar numbers (called “seeding”)
  4. The total number of duplicates found
  5. The total savings from removing the duplicates

For now, we can ignore the savings column and convert the percentage in the third column to absolute numbers. We get the following table.

SC affidavit de-duplication rate

The percentage of duplicates is indeed high in PDS and Pensions. Typically, data is analysed to produce these results. Why would the government not put out the data and analysis to strengthen their case in the public domain?
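The conversion behind the table above can be sketched in a few lines. This is a minimal illustration, not the spreadsheet used for the article: the function name and the sample figures are hypothetical, and it assumes the duplicate rate is measured against seeded beneficiaries, since a duplicate can only be detected once an Aadhaar number is linked.

```python
def dedup_rate(beneficiaries: int, seeded_pct: float, duplicates: int) -> tuple[int, float]:
    """Return (seeded count, duplicates as a percentage of the seeded count)."""
    # Convert the affidavit-style seeding percentage into an absolute number.
    seeded = round(beneficiaries * seeded_pct / 100)
    if seeded == 0:
        return 0, 0.0
    return seeded, 100 * duplicates / seeded

# Hypothetical figures for illustration only; not taken from the affidavit.
seeded, rate = dedup_rate(beneficiaries=1_000_000, seeded_pct=80.0, duplicates=40_000)
print(seeded, round(rate, 2))
```

Measuring against the seeded count rather than all beneficiaries gives the higher of the two possible rates, so it is the reading most favourable to the government's claim.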

Nothing truly vanishes

When authentication data started disappearing from UIDAI’s website, I made a regular habit of scraping and archiving it.

So it was an unexpected bonanza to discover that UIDAI had actually conducted a study in August 2014 to calculate the deduplication rates across various programs. Collating the figures reported by UIDAI’s own study across the specific schemes gives us the following table.

UIDAI study deduplication figures

Unlike the SC affidavit, this study reports no 25%+ duplicates. However, these are summary tables, while the SC affidavit showed state-wise numbers. To do a true comparison, the same state-wise approach must be adopted.

(Note: ‘Distincts’ should be ‘Uniques’ + ‘Duplicates’, but the UIDAI study has deviations from this rule, which are clearly marked out in yellow).
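The consistency rule in the note above is easy to check mechanically. The sketch below is only an illustration of that rule: the row layout, scheme names and numbers are hypothetical and are not taken from the UIDAI study.

```python
def distincts_mismatch(distincts: int, uniques: int, duplicates: int) -> int:
    """Return distincts - (uniques + duplicates); zero means the row is consistent."""
    return distincts - (uniques + duplicates)

# Hypothetical rows for illustration: (scheme, distincts, uniques, duplicates)
rows = [
    ("Scheme A", 1_000, 990, 10),    # consistent: 990 + 10 == 1,000
    ("Scheme B", 2_000, 1_950, 40),  # inconsistent: 1,950 + 40 != 2,000
]

# Collect the schemes whose figures deviate from the rule.
flagged = [name for name, d, u, dup in rows if distincts_mismatch(d, u, dup) != 0]
print(flagged)
```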


First the SC affidavit:
And then the data from the UIDAI study:

The duplicate percentages in the last column differ by more than 10×, and the absolute numbers of duplicates do not match either. Since the SC affidavit was filed in September 2015 while the internal study was done in August 2014, newly seeded accounts could have increased the number of duplicates found.

Let’s look at what has changed from August 2014 to September 2015:

PDS extra duplicates

The anomaly is obvious. Newer duplicates exceed the newly seeded cards. This is impossible because duplicates can be detected only when cards are seeded, and hence the maximum number of duplicates can only be equal to the number of seeded cards.
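The rule stated above, that new duplicates cannot exceed newly seeded cards, can be written as a small plausibility check. This is a sketch under stated assumptions: the function name is hypothetical and the sample figures are made up for illustration, not taken from either the affidavit or the study.

```python
def extra_duplicates_plausible(seeded_before: int, seeded_after: int,
                               dups_before: int, dups_after: int) -> bool:
    """True if the growth in duplicates is no larger than the growth in seeded cards."""
    newly_seeded = seeded_after - seeded_before
    new_duplicates = dups_after - dups_before
    # A duplicate can only be detected on a seeded card, so each new
    # duplicate needs at least one newly seeded card behind it.
    return new_duplicates <= newly_seeded

# Hypothetical figures for illustration only.
plausible = extra_duplicates_plausible(100_000, 120_000, 1_000, 6_000)    # +5,000 dups, +20,000 seeded
implausible = extra_duplicates_plausible(100_000, 101_000, 1_000, 9_000)  # +8,000 dups, +1,000 seeded
print(plausible, implausible)
```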

Delhi is the only exception to this and hence it is probably accurate, while the extra duplicates from Andhra, Telangana and Puducherry appear to be a fabrication in the SC affidavit.


While the SC affidavit has a state-wise summary for all scholarships, the UIDAI study shows that there are three scholarship programs:

  1. SC scholarship
  2. ST scholarship
  3. Minorities scholarship

We have to check the matching numbers across all the three programs for these states. First, the SC affidavit:

SC affidavit on Scholarships

Next, the summation of all the scholarship schemes for the above states from the UIDAI analysis. (Yellow cells are explained at the end of the section.)

Scholarship data from UIDAI Study

Telangana first. Seeded accounts are higher on 15 August 2014 with no duplicates, but they are lower in October 2015 with 36,900 duplicates. Punjab repeats that trend.

However, seeding figures don’t reduce over time. There has been a sustained push to seed accounts with Aadhaar across all the programs, with a threat of service denial, while newer beneficiaries are typically never enrolled unless they give their Aadhaar.

Andhra does not have any data at all in the UIDAI study, which makes comparison impossible.

The yellow cells for Punjab on minority scholarship require an explanation. There was an error in the original data put out by UIDAI, where the seeded figure was lower than the valid figure. Validity is affirmed only when the Aadhaar number exists in the CIDR, so valid accounts cannot outnumber seeded ones. To make a comparison possible, the seeded figure was adjusted to match the valid figure.

Scholarship seeding adjustments


While the SC affidavit has a state-wise summary for all pensions, the UIDAI study shows that there are three types of pension programs:

  1. Old Age Pensions (IGNOAPS)
  2. Disability Pensions (IGNDPS)
  3. Widow Pensions (IGNWPS)

We have to check the matching numbers across all the three programs for these states. First the SC affidavit:

SC affidavit on Pensions

Then the summation of all the pension schemes for the above states from the UIDAI study:

UIDAI Study on Pensions

In Chandigarh, the SC affidavit claims that for a net addition of 300 seeded accounts, there were 1,877 extra duplicates, which is of course impossible.

In Puducherry, even after more beneficiaries were added by October 2015, the total number of seeded accounts has fallen, which is also not possible.

There are still a few schemes as per the UIDAI study that have a high percentage of duplicates needing explanation, such as PDS, scholarships and widow pensions (shown here in bold red).

UIDAI study unexplained duplicates

SC Scholarship Duplicates

SC scholarships: notice the Himachal Pradesh duplicates

Out of 15,923 duplicates, Himachal Pradesh alone contributes 15,692, which is anomalous, to put it mildly. However, UIDAI’s study made no attempt to ascertain the cause of this anomaly or to follow up, because of the inherent belief that the duplicates were a result of corruption.

Incidentally, a follow-up was done for LPG when the results did not match expectations: 20%+ duplicates were expected, but the actual result was 0.32% (despite uploading data more than once, according to the UIDAI study).

ST Scholarship Duplicates

ST scholarships: notice the Himachal Pradesh duplicates again

Out of 3,153 duplicates, Himachal alone contributes 3,132, and again UIDAI’s study made no attempt to follow up and ascertain the reality.
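To put the Himachal Pradesh concentration in numbers, the share of each scheme's duplicates coming from that one state can be computed from the totals quoted above. The helper name is hypothetical; the inputs are the figures given in the text.

```python
def share_pct(part: int, total: int) -> float:
    """Return part as a percentage of total, rounded to one decimal place."""
    return round(100 * part / total, 1)

# Figures quoted in the text for Himachal Pradesh versus all states.
sc_share = share_pct(15_692, 15_923)  # SC scholarship duplicates
st_share = share_pct(3_132, 3_153)    # ST scholarship duplicates
print(sc_share, st_share)  # 98.5 99.3
```

In both schemes, more than 98% of the duplicates come from a single state, which points to a data problem in that state rather than a nationwide pattern.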

Widow Pension Duplicates

Widow pension duplicates anomaly

Out of 10,221 duplicates, 9870 (9539 + 18 + 4) came from a single state, Andhra Pradesh, that had a 100% seeding ratio.

PDS duplicates are at 3%, much lower than the 25%+ figures that the government was publicly claiming in the SC affidavit. It is important to understand why even these figures do not represent corruption on the ground, and the UIDAI study actually helps on that account as well.

Duplicates do not mean corruption automatically

To understand duplicates, an understanding of the seeding process is required, where one links their Aadhaar number to any scheme. There are two ways in which it is typically done.

  • Organic seeding is when the beneficiary links their own number with the scheme.
  • Inorganic seeding is when the linking is done automatically by UIDAI’s seeding viewer tool. It is very error-prone and is not recommended by UIDAI, but is nevertheless done at a huge scale (see here).

Even with organic seeding, there is a human data entry process involved where an operator adds the number to the database, or the beneficiaries do it themselves. An accidental incorrect entry could be flagged as a duplicate.

Duplicates detected through seeding can only be considered “potential duplicates”, and physical verification via door-to-door checks is required to ascertain whether they are genuine beneficiaries. Only then can they be called confirmed duplicates, deleted from the scheme and counted towards savings. (Failure to do this leads to exclusion of the underprivileged, who aren’t always in a position to contest the exclusion.)

Confirmed duplicates are usually far lower than suspected duplicates.

Behind the emerald curtain

Now that the magic trick of deduplication has been outed, it is important to understand its true purpose. What does it distract the audience (citizens) from?

The answer is simple. It distracts citizens from the fact that the Aadhaar project is truly broken in both concept and implementation, and causes massive exclusion in welfare delivery. But again what is the proof? Where is the data? Let’s dive in.

Data entry errors cause exclusion

  1. Invalid Aadhaar numbers linked with ration cards cause exclusion.
  2. Wrongly linked Aadhaar numbers to get 100% seeding cause exclusion.
  3. Name mismatches in Provident Fund prevent the beneficiary from withdrawing funds.

UIDAI’s study points out how big these mismatches were across a variety of schemes in August 2014.

The percentages of mismatches and invalid numbers (intentional or accidental) are staggering. As pointed out, these errors cause exclusion. For perspective, the potential duplicates as per UIDAI’s own study are reproduced here from above.

It is obvious by just visual comparison that even in August 2014, the cost of the project through exclusion is far higher than the benefits of deduplication for all of the above welfare schemes.


The reason why UIDAI did not put its own study in the SC affidavit: it would have proved conclusively that there exists neither an economic nor a welfare justification for the Aadhaar project in welfare delivery. And the study was expressly commissioned to prepare a report for the Prime Minister’s office:

The timing of the UIDAI study is exceptionally interesting given this ET report, which described the Prime Minister’s change of heart about the Aadhaar project on 24 July 2014.

The obvious question: what exactly was the Prime Minister told? Was he given the true picture of Aadhaar duplicates, or was he given something else?


The Excel sheet from which the screenshots were taken is available here.

Thanks to Kiran Jonnalagadda. Public domain.

Anand Venkatanarayanan
