A New Study Indicates Humans Self-Generate Misinformation


A new study into sources of misinformation suggests that humans self-generate it on a regular basis by misrecalling information they’ve previously learned in ways that fit already-existing opinions and biases.

The term misinformation is defined by Merriam-Webster as “incorrect or misleading information.” It is distinct from disinformation, which is defined as “false information deliberately and often covertly spread (as by the planting of rumors) in order to influence public opinion or obscure the truth.” One of the major differences between misinformation and disinformation is motive. Disinformation campaigns are always deliberate, while misinformation can be spread in good faith.

The sources of misinformation matter a great deal if your goal is to deepen people’s understanding of the facts and improve the quality of public discourse. If you think about how information is distributed, you probably picture some version of a top-down model: Something happens, eyewitnesses and journalists converge on it, and the information they collectively report filters down to all of us through whatever media we use to consume it. The education system uses more or less the same model.

Typically, when we think about fighting misinformation, we think about it in terms of fact-checking sources and ensuring the data in an article or textbook is as complete and up-to-date as possible. I check facts like die sizes, launch dates, and benchmark results on a regular basis to make certain that what I’m reporting is accurate.

A new paper published in Human Communication Research suggests, however, that we’ve been overlooking a significant source of misinformation, and one that will be far more difficult to fix: Humans appear to self-generate misinformation even when they’ve been given the facts. This study focused on numerical misinformation, i.e., the mistransmission of specific figures that study participants had been given. The fundamental goal of the experiment was to measure whether people remember numbers better when the claims they are given are consistent with their existing beliefs than when they are inconsistent with them.

To test this, individuals were presented with data on topics like support for same-sex marriage in the US, gender preferences for one’s boss, the number of Mexican immigrants in the United States, and the total number of white people killed by police in 2016 versus the total number of black people. The individuals being tested were polled for their own pre-test expectations on these topics, and the data presented to them was either consistent with what they believed would be true or chosen to present facts they were less likely to believe. Table 1, shown below, shows the framing for the experiment:

Where polling of the test group showed that the data aligned with participants’ expectations, the data is called “schema consistent.” In the case of Mexican immigrants, people expected there to be more immigrants in 2014 than in 2007, when in fact the opposite was true. The first group of participants was asked to answer questions based on the data they had just seen. Their answers were then used to inform the questions that were shown to a second group of people. The answers from that group were used to inform the questions asked of a third group of people.

The image above shows how the system worked. The test was administered in two versions, one using numerical sliders to give answers and one using text input. Effectively, this replicates a game of telephone: Each person transmits the version of the data they remember. Before you look at the next slide, let’s quickly review: Americans generally expect there were more Mexican immigrants in the US in 2014 than in 2007, they believe police killed more black people than white people in 2016, they prefer a male boss to a female boss, and they support same-sex marriage. Now, look at what the test results showed. The values on the far left of the graph are the actual statistics, in every case. Wave 1 indicates the answers of the first group, Wave 2 the second group, etc.

When presented with data that conflicted with their own previously held beliefs, humans get really bad at math. The drop in Mexican immigrants that occurred from 2007 – 2014 reverses in Wave 1. The very first people who saw the data literally couldn’t remember the answer correctly and flipped the values, associating 2007 with fewer immigrants and 2014 with more. Importantly, these results continue to diverge when transmitted to Wave 3. In other words, it’s not just that people think that the overall Mexican immigrant population must have risen because of the passage of time. Wave 1 overestimated the number of Mexican immigrants by 900,000. Wave 3 overestimated it by 4.1 million. In this case, the initial figure of total immigrants doesn’t drop all that much and most of the inaccuracy is introduced by grossly inflated estimates of how many Mexicans moved to the US over this period.

With police shootings, Wave 1 manages to remember that more whites than blacks were shot, even if both values are wrong. Starting with Wave 2, we get the same crossover that we saw in Wave 1 of the immigration data, except in this case, the initial value keeps being shoved lower.

The data on police shootings shows a little more staying power. While the absolute values both moved towards reversing, Wave 1 still remembered which group was larger. By Wave 2 (remember, that’s the group that used the answers Wave 1 gave), that relationship has completely reversed. This time, however, both numbers have come unmoored from their original data points in both versions of the test.

But if you give people data they do expect, they show completely different mental patterns — not so much necessarily in terms of absolute accuracy, but at least in terms of relationships. In the case of percentage of Americans who prefer a male versus a female boss, the percentages climb towards the group-reported estimate of belief rather than maintaining the initial levels given, even though the initial percentages show clear preference for male over female bosses (aligning with general group preference). In the last case, the number of Americans who favored same-sex marriage was underestimated, while the percentage opposed declined in Wave 1 and then moved back towards the actual value.

Participants in the NIH ResearchMatch version of the study were told that numerical percentages could not exceed 100 percent in the slider version, and also told that the total number of immigrants did not exceed 20 million, which may explain some of the differences, but the charts are in general agreement.
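To get a feel for why errors can compound from one wave to the next, here is a toy sketch of a telephone-style chain in Python. It is not the model the study’s authors used. It simply assumes each participant recalls the figure they were shown, nudged a fixed fraction of the way toward what they already expected. The function names, the `pull` parameter, and every number below are hypothetical and chosen only for illustration.

```python
def recall(received, prior, pull):
    """One participant's recalled value: the figure they were shown,
    nudged toward their prior expectation.
    `pull` is an assumed bias strength (0 = perfect recall, 1 = prior only)."""
    return received + pull * (prior - received)


def run_chain(true_value, prior, pull, waves=3):
    """Pass a figure through `waves` participants in sequence.
    Each wave sees only the previous wave's recalled value."""
    values = [true_value]
    for _ in range(waves):
        values.append(recall(values[-1], prior, pull))
    return values


# Hypothetical inputs: the true figure is 10, but everyone expects about 20.
chain = run_chain(true_value=10.0, prior=20.0, pull=0.3)

# Error of each wave relative to the true figure (index 0 is the truth itself).
errors = [value - chain[0] for value in chain]

for wave, (value, error) in enumerate(zip(chain, errors)):
    print(f"wave {wave}: recalled {value:.2f}, off by {error:.2f}")
```

Under these assumptions, the error grows with every wave even though no single participant distorts the figure very much, because each wave starts from the previous wave’s already-shifted number rather than from the original data.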

People Remember Facts Less Well if They Disagree With Them

There are two interesting findings here. First, there’s further evidence that people literally remember facts less well if they don’t agree with them. For all the people who claim they change their mind if confronted with facts, the reality is that people tend to change their facts, not their opinions, even when asked to answer questions about information they literally just read.

This has serious implications for how we think, as a society, about the transmission of information from one mind to another. About a year ago, I wrote a story debunking some rumors about AMD’s then-future 7nm Ryzen CPUs. At the time, some individuals were arguing that AMD’s 7nm CPUs would simultaneously deliver huge price cuts, more cores, large clock speed increases, and a giant leap in IPC. My debunk article wasn’t 100 percent accurate (I guessed that AMD might not use chiplets for desktop Ryzen and would reserve them for Epyc instead), but the final chips AMD launched bear absolutely no resemblance to the rumored configurations.

I addressed this topic several times over six months because this set of rumors simply would not die. I bolstered my arguments with historical CPU data, long-term CPU clock scaling trends, AMD’s statements to investors, AMD’s statements to the press, and long-term comparisons on the relationship between AMD’s margins and its net profits. I discussed increasing wafer costs and how chiplets, while a great innovation, were also a symptom of the problems AMD was facing.

Now, let me be clear. I’m not arguing that everyone who read those stories was somehow automatically obligated to agree with me. My prognostication record is anything but perfect, and reasonable people can disagree on how they read broad industry trends. There’s a difference, however, between “I think 7nm clocks might come in a little higher than you do,” and “I think AMD will simultaneously slash prices, slash power consumption, and revolutionize semiconductors with generational performance gains we haven’t seen in almost a decade,” particularly when there was literally no evidence to support any of the latter claims.

If you showed up to argue the former, or something that even reasonably looks like it, I’m not talking about you. I’m talking about the vocal minority of people who showed up to argue that AMD was about to launch the Second Coming in silicon form. Those who didn’t predict my firing often suggested I’d be writing a tearful apology at some later date.

My point in bringing this up isn’t to rehash old arguments or toot my own horn. My point is that there’s a real-life example of this very phenomenon that you can go and read about. I don’t know where these rumors started, but once they took hold, they proved quite tenacious. As good as Ryzen is (and 7nm Ryzen is great), the rumors about it were better than the CPU could ever possibly be. When confronted with this, some people got angry.

Short of giving the planet some in-depth training in overcoming cognitive bias, it’s not clear how to reduce the spread of person-to-person misinformation, and the authors conclude that more study is needed here. As important as it is to ensure the factual accuracy of primary sources, the fact that humans appear to generate misinformation in an effort to make that data align with pre-existing schemas means focusing solely on the primary source problem will never address its full scope.
