Why Do Some Interns Fake Their Data?

Let me apologize right now for the negative tone of many recent blog posts. I hope my recent negativity does not obscure what has always been a stimulating, productive and (I hope) useful field research project. We have discovered a trove of exciting, odd, and often unexpected behavioral patterns in a species whose breeding system was thought to be monotonously monogamous. We have also described important ecological patterns — such as natal site imprinting and ecological traps — which might help us conserve loons. Along the way, I have met and worked with scores of wonderful people, including over 70 field interns. Most of these folks used the experience of loon research to help mold their career plans, aiming eventually for grad school in wildlife or ecology, or for jobs in conservation at the local, state, or federal level. Every year I get a charge out of meeting a new crop of young people eager to contribute to the project, learn field techniques and lay the groundwork for their careers.

However, not all interns work out. This past field season was my most difficult ever. In 2017 I failed to motivate one of my field assistants to complete data collection at all of her assigned lakes each day, and her complaints and poor performance hurt team morale. But last year was doubly trying, because a second field assistant falsified her data. In truth, this was the third instance of falsification that I had observed during the study.

One can dismiss data falsification by one or two field workers as flukes. When we found that Laura was not going to some of the lakes on her circuit in 1996 and was faking her arrival and departure times at some others, we were horrified. Having never encountered this problem before, though, we eventually decided that something must be wrong with Laura for her to have betrayed our trust. Margaret, my head field assistant that year, announced, “I always thought she was a spoiled brat,” and I quickly agreed. When Frank, another team leader, caught Chelsea faking her data in 2005, I allowed myself to reach a similar conclusion. The problem was Chelsea, not us. I assured Frank that I would be more careful in screening potential field assistants in the future.

The third instance of faking data, which occurred last year, forced me to confront an unpleasant possibility: something about the way we select student assistants or go about data collection makes our project prone to data falsification. One aspect of this problem is obvious. While we are able to cover more ground and collect more data per observer than most other studies by having each observer work independently, solitary data collection opens the door to cheating. A single weak person in a weak moment who decides to fake data has no check on their behavior, whereas neither of two observers working in a pair would be likely to risk discovery by proposing data falsification to their teammate. And solitary work is hard: day after day of rising at 4 a.m., facing all sorts of weather (especially strong wind), and maintaining the focus to collect high-quality data.

Fortunately, the design of our data collection also makes it easy to detect falsification. You see, we systematically rotate observer visits to lakes to limit the impact of observer bias. Observer bias — the innocent and natural tendency of each individual observer to detect and record certain events in their environment and not others — occurs in all observational studies. We limit such bias by making sure that all observers visit all study lakes and that no two consecutive visits to a given lake are by the same observer. This protocol gives us a means to detect faking of data, because any occasion when one visitor’s observations at a lake fail to match those of the previous visitor — like one observer reporting a failed nest and a different observer a week later reporting chicks — is immediately flagged for further scrutiny. Rotation of lake visits should also discourage cheating, because field assistants know that the lakes that they are assigned to visit on a certain day will be visited by a teammate in a week’s time. In effect, we are all constantly checking on each other’s observations. We are like a community of paranoiacs!
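For readers who like to see such a cross-check spelled out, here is a minimal Python sketch of the idea. It is not the software we actually use. The record fields, lake names, observer labels, dates, and status vocabulary are all hypothetical, and the only consistency rule it applies is the one example given above (chicks reported after a nest was reported failed).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Visit:
    lake: str
    observer: str
    day: date
    nest_status: str  # assumed vocabulary: "active", "failed", "chicks"


def flag_visits(visits):
    """Return (kind, earlier_visit, later_visit) tuples that deserve scrutiny."""
    # Group visits by lake, in date order within each lake.
    by_lake = {}
    for v in sorted(visits, key=lambda v: (v.lake, v.day)):
        by_lake.setdefault(v.lake, []).append(v)

    flags = []
    for lake_visits in by_lake.values():
        # Compare each visit with the one immediately before it at the same lake.
        for prev, curr in zip(lake_visits, lake_visits[1:]):
            # Rotation rule: consecutive visits should be by different observers.
            if prev.observer == curr.observer:
                flags.append(("same_observer", prev, curr))
            # Illustrative mismatch: chicks reported after a failed nest.
            if prev.nest_status == "failed" and curr.nest_status == "chicks":
                flags.append(("status_conflict", prev, curr))
    return flags


# Hypothetical records, invented purely for illustration.
visits = [
    Visit("Lake A", "obs1", date(2020, 6, 1), "failed"),
    Visit("Lake A", "obs2", date(2020, 6, 8), "chicks"),
    Visit("Lake B", "obs2", date(2020, 6, 1), "active"),
    Visit("Lake B", "obs2", date(2020, 6, 8), "active"),
]

for kind, prev, curr in flag_visits(visits):
    print(kind, prev.lake, prev.observer, curr.observer)
```

A flag from a check like this says only that two reports disagree or that the rotation was broken. It does not say which observer was mistaken, so each flagged pair still needs a person to look into it.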

Our unintentional safeguard against data falsification is lucky, but it only makes the repeated occurrence of cheating more puzzling. Our field assistants are young people, eager to gain experience and to show their field skills to scientists who might write them strong reference letters. Their applications to work with me must include the names of three references and considerable information about their academic background and training. Why would such people risk fallout from a severe violation of academic integrity by falsifying data? We can never fully solve this puzzle, I suppose. Needless to say, exit interviews with data cheats are not feasible. In the three instances of falsification: 1) Laura simply denied faking data, despite having been caught red-handed doing so, 2) Chelsea admitted falsifying data and commented, “I cannot believe that I took that chance,” and 3) our faker from last year also admitted her wrongdoing, but insisted that she faked data only on the single occasion when she was caught.

A few bad apples will not ruin the study. Our protocol is built to detect such misbehavior, which allows us to toss out anything suspect and preserve the integrity of the data. But it seems worthwhile to reflect on how things have turned sour for a few field assistants so that we can prevent it down the road. I plan to work to prevent cheating by two means. First and foremost, and after consulting with a sociologist and a past field intern about an earlier draft of this blog post, I will work harder in the future to give interns a stake in the work. That is, I will try harder to inform them of the scientific questions we are asking and give them the opportunity to ask questions themselves about loon behavior and ecology on side projects, like the one done by Gabby a few years ago. Second, I think it is important to talk about data falsification explicitly and let folks know: 1) that falsification is very harmful to the project and 2) that they are welcome to take an extra day off here and there, if they feel themselves getting tired of the daily grind of rising early, paddling many miles a day, and typing their data into the computer.

I am a long way from fully understanding why some students falsify data. I probably never will. Perhaps a few adjustments can reduce the frequency of this problem and help keep me focused on all of the positives that have come from our work.