What is “Likely” and “Unlikely” in Polling

The recent VPR poll was conducted like any other general population public opinion poll. We used the largest available telephone sampling frame, in this case a dual-frame sample of landline and cell phone numbers, and weighted the data to reflect U.S. Census estimates for Vermont’s adult population on age and gender. The data were also weighted to reflect county-level populations proportionately.
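To make the idea of weighting to Census targets concrete, here is a minimal sketch of one common approach, cell-based post-stratification. It is an illustration under assumptions, not a description of the Polling Institute's actual procedure, which may use a different method such as raking. The function name, the age and gender categories, and every number below are hypothetical.

```python
from collections import Counter

def poststratification_weights(respondents, population_shares):
    """Return one weight per respondent so weighted cell shares match the targets.

    respondents: list of (age_group, gender) tuples, one per completed interview.
    population_shares: dict mapping (age_group, gender) to that cell's share
        of the adult population (shares should sum to 1).
    """
    n = len(respondents)
    sample_counts = Counter(respondents)
    weights = []
    for cell in respondents:
        sample_share = sample_counts[cell] / n
        # Under-represented cells get weights above 1, over-represented below 1.
        weights.append(population_shares[cell] / sample_share)
    return weights

# Hypothetical targets and sample, for illustration only (not Census or VPR figures).
targets = {
    ("18-44", "F"): 0.20, ("18-44", "M"): 0.20,
    ("45+", "F"): 0.31, ("45+", "M"): 0.29,
}
sample = (
    [("18-44", "F")] * 10 + [("18-44", "M")] * 10
    + [("45+", "F")] * 45 + [("45+", "M")] * 35
)
weights = poststratification_weights(sample, targets)
```

In this toy sample, younger respondents make up 20 percent of interviews but 40 percent of the target population, so each of them receives a weight of 2, while older respondents receive weights below 1.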

All of the data related to issues, job performance ratings, and the 2016 Vermont gubernatorial race were weighted to reflect the views of the general population. During data collection, the Polling Institute works the sample to achieve the highest response rates possible given time and budget constraints. In the end, the general population weights are relatively small and do not distort the original data a great deal.

The data reflecting preferences in the upcoming Vermont presidential primary are weighted to reflect the population of likely voters in each party’s primary. Weighting to the general population is far easier than weighting to likely voters because we have hard data from the Census Bureau describing the general population. The general population actually exists at the time of the poll; this is not the case when considering likely voters. The voting population does not yet exist; there is no pre-existing measure of which citizens (or poll respondents) will actually cast a ballot on March 1 (or before by absentee ballot).

Weighting to the voting population is weighting to a population that is still speculative. That is why we refer to likely voters as opposed to actual voters. But if we want to estimate what voters may do on election day, we have to recognize that the entire adult population does not vote, and in a primary, the proportion of voters will be lower than that found in a general election.

So, we develop a separate weight to help us understand what voters may do on March 1 as they cast their votes in the presidential primaries. The first step in estimating the voting population for the upcoming primary was to eliminate the views of those poll respondents we think are unlikely to vote at all. To do that, we built a model (using a second data set) that excluded all of those respondents who

  1. Are not registered to vote;
  2. Do not follow news about the presidential race either “very closely” or “somewhat closely”; and,
  3. Say that they are either “not too likely” or “not at all likely” to vote in the Vermont Presidential Primary.
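The screen above can be sketched in a few lines of code. This is a minimal illustration that assumes a respondent is dropped if any one of the three conditions holds; the wording of the list could also be read as requiring all three, and the text does not settle the question. The field names and the response labels other than those quoted in the list are hypothetical.

```python
from dataclasses import dataclass

# Response labels quoted in the text; any other label is an assumption.
FOLLOWS_NEWS = {"very closely", "somewhat closely"}
UNLIKELY_TO_VOTE = {"not too likely", "not at all likely"}

@dataclass
class Respondent:
    registered: bool
    news_attention: str    # how closely they follow news about the race
    vote_likelihood: str   # self-reported likelihood of voting in the primary

def is_screened_out(r: Respondent) -> bool:
    """True if the respondent meets any exclusion condition (assumed 'any')."""
    return (
        not r.registered
        or r.news_attention not in FOLLOWS_NEWS
        or r.vote_likelihood in UNLIKELY_TO_VOTE
    )

def apply_screen(respondents):
    """Keep only the respondents who pass the screen."""
    return [r for r in respondents if not is_screened_out(r)]

# Four made-up respondents: only the first passes every condition.
sample = [
    Respondent(True, "very closely", "very likely"),
    Respondent(False, "very closely", "very likely"),
    Respondent(True, "not too closely", "somewhat likely"),
    Respondent(True, "somewhat closely", "not at all likely"),
]
kept = apply_screen(sample)
print(len(kept), "of", len(sample), "respondents pass the screen")
```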

Using those criteria, we eliminated 258 actual respondents (unweighted), bringing us to an unweighted base of 637 records, or 71 percent of the original data set. We then worked with those remaining records to devise a variable that would give greater weight to the respondents who are most likely to vote in the presidential primaries, since we know that turnout will not be as high as 71 percent. In fact, we estimate that turnout will be 40 to 45 percent of registered voters.

In order to differentiate among the remaining respondents who are most likely to vote, we gave points to respondents meeting the following conditions:

  1. Follow news about the election “very closely”
  2. Say that they are “very likely to vote”
  3. Identify with one of the major political parties
  4. Have a college degree or more education
  5. Responded to the poll after the New Hampshire primary (Feb. 9)

These criteria were used to generate weights for each individual case that were then applied to the general population weights to devise a new weight variable defining our “likely voter model.” The first two criteria rely on what respondents tell us about their interest in the election and how likely they are to participate, while criteria 3 and 4 draw on demographic characteristics that are associated with voting participation. The last criterion takes into account that candidate preference shifted measurably after the New Hampshire primary showed that Trump and Sanders can win and that Kasich may be more of a contender than earlier thought.
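A rough sketch of how such a point system might feed into a weight is shown below. The text does not say how many points each condition earns or how points are converted into a case weight, so this example assumes one point per condition and uses a made-up multiplier, (1 + points) / 6, purely for illustration. All function and parameter names are hypothetical.

```python
from datetime import date

NH_PRIMARY = date(2016, 2, 9)   # date of the New Hampshire primary
MAX_POINTS = 5                  # one point per condition (an assumption)

def likely_voter_points(follows_very_closely, very_likely_to_vote,
                        major_party_identifier, college_degree,
                        interview_date):
    """Count how many of the five conditions a respondent meets."""
    conditions = [
        follows_very_closely,
        very_likely_to_vote,
        major_party_identifier,
        college_degree,
        interview_date > NH_PRIMARY,   # interviewed after the NH primary
    ]
    return sum(bool(c) for c in conditions)

def likely_voter_weight(general_population_weight, points):
    """Scale the general population weight by a hypothetical multiplier.

    The mapping from points to a multiplier is NOT from the source; it only
    shows the idea of applying a case-level factor to the existing weight.
    """
    multiplier = (1 + points) / (1 + MAX_POINTS)
    return general_population_weight * multiplier

# Made-up respondent: meets four of five conditions, general weight 1.2.
points = likely_voter_points(True, True, False, True, date(2016, 2, 12))
weight = likely_voter_weight(1.2, points)
```

Under this assumed mapping, respondents who meet more conditions keep more of their general population weight, and everyone who passed the screen keeps a weight above zero.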

Applying the likely voter model to the reduced data set left us with a data set that represents 58 percent of the originally weighted sample. That figure is higher than our turnout estimate, but within the remaining sample, respondents who meet the likely voter criteria carry greater weight.
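One plausible way to compute a "share of the weighted sample" like this is to divide the sum of the likely voter weights by the sum of the original general population weights. The sketch below assumes that reading, which the text does not confirm, and uses made-up weights. The unweighted retention figure uses the counts reported earlier (637 records kept, 258 eliminated).

```python
def weighted_share(likely_voter_weights, general_population_weights):
    """Share of the originally weighted sample represented after reweighting.

    Respondents removed by the screen are given a likely voter weight of 0.
    """
    return sum(likely_voter_weights) / sum(general_population_weights)

# Unweighted retention after the screen, from the counts in the text.
unweighted_share = 637 / (637 + 258)
print(f"{unweighted_share:.0%}")   # 71%

# Hypothetical weights for four respondents (not the poll's data).
general = [1.0, 1.0, 1.0, 1.0]
likely = [1.0, 0.5, 0.0, 0.5]
share = weighted_share(likely, general)
```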

The estimates for how Vermont would vote if the election were held during the time we were in the field are shown in the following two figures:

Figure 1. Vermont 2016 GOP Presidential Primary Preferences, based on a likely voter model


Figure 2. Vermont 2016 Democratic Presidential Primary Preferences, based on a likely voter model