June 1st 2024 brought the curtains down on the penultimate act of the Great Indian Election Carnival with the completion of the 7th and last phase of the General Elections. There were still some days left for the final act of Counting and Results, but who has the patience in this day and age? In came the many exit polls to predict the Winner so that the Junta remained engaged till then. When I saw those numbers, I found something off. This happens when you have spent the last decade and a half looking at numbers and data and drawing inferences from them. I made a mental note to write something on this if the numbers turned out to be inaccurate. This is an attempt to do so.

Why did the Exit Polls of June 1st fail so spectacularly on June 4th? Before answering this question, let us try to understand how these Exit Polls are usually conducted. Exit Polls, as the name suggests, are polls of people who have just voted. They are usually conducted face to face outside the polling booth but may also be conducted at a later stage, either face to face or telephonically. The later follow-up could be done for many reasons; making the sample more representative and bolstering the sample size are the two that come to mind immediately.

Now that we know the general methodology, let us understand how we can get an accurate result from these polls. One thing is very clear: we cannot reach out to every person who has voted to find out whom they voted for. Even if we could, it would be a wasteful exercise. One, we would literally be carrying out an exercise similar to an election; two, it would require resources and bandwidth grossly out of proportion to the returns; three, a census is not required to get a definite trend anyhow.

So, an exit poll is conducted on not all but a few of the people who have voted, to get an idea of what the larger population would have done and then to predict the vote share a particular political entity would get. In the Indian context, our agencies go a step further and predict a range for the number of seats this entity can win. For this to hold true, among many other things, two very critical criteria should be met:

  1. The Sample should be truly representative of the people who have voted
  2. Respondents should be telling the truth

Now let us explore why these two are so critical to the outcome of any exit poll conducted on the ground.

What is truly representative?

A truly representative sample is a group which matches the characteristics of its population as a whole. I just googled the female % in the Indian population: it is roughly 48%. So, to be truly representative on this basis, a sample of 100 should have 48 females and 52 males. As a concept this is easy and appealing but, as we will see, it is very hard to achieve.

For our purpose, let us add another dimension to the task at hand. Suppose we know the socio-economic split of the population. For simplicity, let the categories be Poor, Middle Class and Rich. If we have 10% Rich, 30% Middle Class and 60% Poor, then the sample requirement would look something like this:

| Category | Male | Female |
| --- | --- | --- |
| Total | 52 | 48 |
| Rich (10%) | 5 | 5 |
| Middle Class (30%) | 16 | 14 |
| Poor (60%) | 31 | 29 |

Category Wise Split
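
The split above can be reproduced with a few lines of Python. This is only a sketch: the function name `quota_table` and the way the shares are passed in are my own choices for illustration, and the percentages are simply the ones assumed in this example.

```python
def quota_table(sample_size, gender_share, class_share):
    """Allocate interviews to each class x gender cell in proportion
    to the population shares (shares are whole percentages)."""
    table = {}
    for cls, cls_pct in class_share.items():
        for gender, gender_pct in gender_share.items():
            # Proportional allocation, rounded to the nearest whole person.
            table[(cls, gender)] = round(sample_size * cls_pct * gender_pct / 10_000)
    return table


# Shares assumed in the example above.
gender_share = {"Male": 52, "Female": 48}
class_share = {"Rich": 10, "Middle Class": 30, "Poor": 60}

quotas = quota_table(100, gender_share, class_share)
for (cls, gender), n in quotas.items():
    print(f"{cls:<13} {gender:<7} {n}")
```

With these particular shares the rounded cells add back up to 100. That is not guaranteed in general, so a real quota sheet would need some rule for handing out the leftover interviews.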

Now, for the same male to female ratio, the interviewer would need to interview 5 rich, 14 middle class and 29 poor females and, similarly, 5 rich, 16 middle class and 31 poor males. This shows that the mere addition of one layer (or factor) makes the sample requirement much harder to meet. Now, think of all the factors which are required for the sample of voters to be truly representative.

  1. Gender (Already Discussed)
  2. Religion
  3. Caste/ Sub Caste
  4. Age
  5. Socio Economic Status (Already discussed)
  6. Geography

These 6 immediately come to mind. I think Geography needs a bit of explanation. In a constituency, there are multiple polling booths. Each polling booth is mapped to a distinct geographic area. During exit polls, due to bandwidth and resource constraints, one cannot cover all the polling booths. Also, all the polling booths will not vote identically even when the other demographic parameters are the same. So, one will not be able to capture all the variation in vote preferences due to geography.

One must also include the concept of randomness while conducting the survey. An interviewer, to complete the number of surveys assigned to him, may approach people in groups. People in a group are likely to vote in the same fashion. This would result in bias creeping into the data.
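
A rough simulation can show why interviewing people in groups hurts. It is a sketch under made-up assumptions: a population of 1,000 groups of 5 people each, every member of a group voting the same way, and each group backing Candidate A or B with equal chance. The function name `simulate` and all the parameters are hypothetical. It compares 100 interviews picked at random with 100 interviews collected from 20 whole groups.

```python
import random
import statistics


def simulate(trials=500, n_groups=1000, group_size=5, sample_size=100, seed=42):
    """Return the spread (standard deviation) of the estimated share for
    Candidate A under individual sampling and under group sampling."""
    rng = random.Random(seed)
    # Each group votes as a bloc: 1 = Candidate A, 0 = Candidate B.
    group_vote = [rng.randint(0, 1) for _ in range(n_groups)]
    people = [vote for vote in group_vote for _ in range(group_size)]

    individual_estimates, grouped_estimates = [], []
    for _ in range(trials):
        # 100 people picked independently of their group.
        picked = rng.sample(people, sample_size)
        individual_estimates.append(sum(picked) / sample_size)
        # 100 people obtained by interviewing 20 whole groups.
        groups = rng.sample(group_vote, sample_size // group_size)
        grouped_estimates.append(sum(groups) / len(groups))

    return (statistics.pstdev(individual_estimates),
            statistics.pstdev(grouped_estimates))


individual_sd, grouped_sd = simulate()
print(f"Spread with individual sampling: {individual_sd:.3f}")
print(f"Spread with group sampling:      {grouped_sd:.3f}")
```

Under these assumptions the group-based estimate swings much more from one run to the next, because 100 interviews from 20 like-minded groups carry roughly the information of 20 independent interviews. Real groups are not perfectly uniform, so the effect on the ground would be milder than in this extreme case.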

This goes to show that achieving a truly representative sample is a herculean task.
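
To get a feel for the scale, one can multiply out the number of categories in each of the six factors listed earlier. The category counts and the sample size below are purely illustrative assumptions of mine, not figures from any real survey design; the point is only how quickly the combinations grow.

```python
import math

# Hypothetical number of categories per factor (illustrative assumptions).
levels = {
    "Gender": 2,
    "Religion": 5,
    "Caste/Sub Caste": 4,
    "Age": 4,
    "Socio Economic Status": 3,
    "Geography": 10,
}

# Every combination of categories is a separate cell that needs respondents.
cells = math.prod(levels.values())
sample_size = 5000  # also an assumed figure

print(f"Distinct cells to fill: {cells}")
print(f"Average respondents per cell: {sample_size / cells:.2f}")
```

Even with these modest counts there are 4,800 cells, so a sample of 5,000 would leave roughly one respondent per cell on average.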

Why is true representation important?

Suppose the sample is not truly representative. Instead of speaking to 52 males and 48 females, the poll was conducted on 70 males and 30 females. Suppose there are two candidates in the fray: Candidate A and Candidate B. Also suppose that 55% of males prefer Candidate A while 60% of females prefer Candidate B. If the sample were truly representative (52:48), the pollster would conclude that Candidate B is ahead of Candidate A. However, when males are oversampled (70:30), the data predicts a close fight with Candidate A slightly ahead, as shown in the tables below:

| Situation 1 | Male | Female | Total Votes |
| --- | --- | --- | --- |
| Sample | 52 | 48 | 100 |
| Candidate A Preference | 55% | 40% | |
| Total Votes for Candidate A | 29 | 19 | 48 |
| Candidate B Preference | 45% | 60% | |
| Total Votes for Candidate B | 23 | 29 | 52 |

Candidate B wins with 52% votes while Candidate A gets 48% votes

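| Situation 2 | Male | Female | Total Votes |
| --- | --- | --- | --- |
| Sample | 70 | 30 | 100 |
| Candidate A Preference | 55% | 40% | |
| Total Votes for Candidate A | 39 | 12 | 51 |
| Candidate B Preference | 45% | 60% | |
| Total Votes for Candidate B | 32 | 18 | 50 |

Candidate A has a slender lead over Candidate B (the male figures are rounded up from 38.5 and 31.5, which is why the totals add up to 101)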
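
The arithmetic behind the two situations can be checked with a short Python sketch. The helper `predicted_share` is a hypothetical name used only for this illustration; it works with the unrounded figures, so the numbers differ slightly from the rounded ones in the tables.

```python
def predicted_share(sample, preference_for_a):
    """Return (share_a, share_b) in percent.

    sample:            {group: number of respondents}
    preference_for_a:  {group: percent of that group preferring Candidate A}
    """
    total = sum(sample.values())
    votes_a = sum(n * preference_for_a[group] / 100 for group, n in sample.items())
    share_a = 100 * votes_a / total
    return share_a, 100 - share_a


# Preferences assumed in the example: 55% of males and 40% of females prefer A.
preference_for_a = {"Male": 55, "Female": 40}

representative = predicted_share({"Male": 52, "Female": 48}, preference_for_a)
oversampled = predicted_share({"Male": 70, "Female": 30}, preference_for_a)

print("52:48 sample -> A %.1f%%, B %.1f%%" % representative)
print("70:30 sample -> A %.1f%%, B %.1f%%" % oversampled)
```

Without rounding, the representative sample gives A 47.8% and B 52.2%, while the male-heavy sample gives A 50.5% and B 49.5%. Nobody changed their vote; only the mix of people interviewed changed.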

Now we know that if the sampling is faulty, we cannot expect the correct result from the process. So, is it the end of the game? Not exactly. There is something called weighting, which tries to correct this imbalance by adjusting the weight given to the responses of different groups of respondents. So, in this case, where males are oversampled, the idea would be to proportionally reduce the weightage of their responses and increase the weightage of the females' responses. But in a setup as complex as an election, with many factors to balance at once, this may not work entirely.
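
A minimal sketch of this weighting idea, using the same male-heavy example: each group gets a weight equal to its share among voters divided by its share in the sample. The raw counts are hypothetical; I have scaled the 70:30 sample up to 1,000 respondents only so that every count is a whole number.

```python
# Hypothetical raw data: the 70:30 sample scaled up to 1000 respondents
# so that every count is a whole number (an assumption for illustration).
respondents = {"Male": 700, "Female": 300}
said_a = {"Male": 385, "Female": 120}  # 55% of males, 40% of females

# Share of each group among actual voters, as assumed in the example.
population_share = {"Male": 0.52, "Female": 0.48}

total = sum(respondents.values())

# Weight = share among voters / share in the sample, for each group.
weights = {
    group: population_share[group] / (respondents[group] / total)
    for group in respondents
}

raw_a = 100 * sum(said_a.values()) / total
weighted_a = (
    100
    * sum(said_a[group] * weights[group] for group in respondents)
    / sum(respondents[group] * weights[group] for group in respondents)
)

print(f"Unweighted share for A: {raw_a:.1f}%")
print(f"Weighted share for A:   {weighted_a:.1f}%")
```

Here the weights pull Candidate A back from 50.5% to 47.8%, the figure the representative sample would have given. The catch is that this works only for factors you weight on, and only if you know the true mix of people who actually voted.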

To make samples more representative, some pollsters may reach out, either face to face after polling is done or telephonically, to categories of people which they feel have been underrepresented in the sample. But this is again based on luck, because one must match the profile of the people who have actually voted, not of those who are on the voting list. Getting hold of this data would be very difficult, to say the least.

Now we know that getting accurate data on voting patterns from the field is not an easy task. Let us add the complexity of voters not responding truthfully.

Why will Voters not tell the truth?

Many reasons come to mind. A few are listed below.

  1. In India, many people consider their vote a personal choice which should not be shared with anyone. This set of people would simply refuse to respond to an exit poll. If they do agree, they are likely not to divulge the truth.
  2. You are being interviewed by a stranger. People are not very comfortable revealing sensitive information to strangers. Many would also not trust the interviewer.
  3. If a person believes that (s)he is going against the tide or against the people in his or her locality, (s)he may not come clean about his or her voting choice.
  4. Members of marginalized communities who are not vocal may also not divulge their vote to the interviewer.

Now that we have combined the two factors, it becomes clear that even predicting the vote share within a satisfactory margin of error is a huge achievement for the pollsters. Predicting actual seats is an even more dangerous task.

Are there other factors?

But these are not the only two factors because of which exit polls may go wrong. Some other factors which also affect the outcome are listed below:

  1. If you have read this far, you would agree that Exit Polls require a lot of expertise. All the major news channels want an exclusive poll for TRP. Since there are not many competent agencies, many exit polls would go wrong simply for lack of competency.
  2. This problem is compounded because there is no accountability for either the channels or the agencies. One can publish whatever (s)he wants and there would be no repercussions this time or the next time around.
  3. Because of the above two, it is possible for the stakeholders to not present the true picture and still go unpunished.
  4. Competency is required not only at the data analysis level but also at the field level, where the data is collected. Data quality is of prime importance and the agencies may not have access to competent fieldwork resources.
  5. Even with quality data, one needs time to arrive at the right conclusion. Often, in haste, this is thrown into the dustbin.

Why are Exit Polls right many times?

If there is a wave election where most of the voters are thinking and acting in a given direction, then even with all the things going wrong, one would be able to capture the trend and probably get the extent right too. I think this happened in 2014 and 2019. Even this time, it was easier for the pollsters to predict the outcomes in Andhra and Odisha (both Assembly and General Elections) and in states like Gujarat, Madhya Pradesh and Himachal Pradesh.

Many a time, it could be that something has gone right and something has gone wrong, and the two cancel each other out. So, the net effect is that the prediction is correct but the process is not. This is like an MCQ where a student may have marked the right answer for entirely the wrong reason.

Another reason could be that maybe, just maybe, a particular pollster has almost mastered both the art and the science of getting the data and the prediction right. There could be some misfires here and there, but on most occasions the predictions would be close. This is very, very difficult but possible.

Should Exit Polls be banned?

Considering that the actual results would be out at most 3-4 days after the Exit Polls, many argue that Exit Polls should be banned. They lead to unnecessary speculation which may affect unsuspecting people undesirably (as happened in the stock markets this time around). I personally do not consider this to be an essential exercise.

Having said that, I also believe that data from a high quality exit poll is very beneficial. It reveals the political preferences of different groups and, if designed properly, can also shed light on what motivates or triggers each group. This data can allow political parties to identify their core voters, to engage more meaningfully with the electorate and to curate policies and programmes which resonate with them.