Press release 2023/201 from

Novels and poems often contain descriptions of plants or animals – sometimes more, sometimes less detailed. The extent to which flora and fauna feature in a literary work also depends on who wrote it and under what circumstances. For example, female authors tend to use more species names when they write. This is the conclusion of a research team from Leipzig University, the German Centre for Integrative Biodiversity Research (iDiv) and Goethe University Frankfurt, who examined around 13,500 literary works by approximately 2,900 authors. The study is an example of how methods from the natural sciences and the humanities can be combined using digital techniques.

In a study published around two years ago, the research team demonstrated that biodiversity in western literature has been steadily declining since the 1830s. The researchers have now published a follow-up study in which they explain how factors such as an author's gender, place of residence or age influence the importance given to nature in their works. According to their findings, it makes a difference whether a literary work was written, for example, by a young woman from a US village or by a middle-aged man from a European city.

The study involved researchers from the digital humanities, biology and literary studies. The researchers again used the Project Gutenberg library for their analysis. They linked the works contained therein – mostly western literature from Europe and North America – to biographical information about the authors, which they gathered from online sources such as Wikidata and then manually categorised. In the end, 13,493 works from 1705 to 1969 by 2,847 authors were analysed using machine learning methods.

In the 2021 study, the researchers had already developed metrics that make it possible to measure biodiversity in literary works. For example, they counted the number of terms used to describe animals or plants in each work, or calculated the variety of vocabulary used to describe living things. Now they have used an algorithm to relate those values to the biographical information about the authors.
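
To make the two kinds of metric mentioned above more concrete, the following minimal sketch counts how often plant and animal terms occur in a text and how many different terms appear. It is an illustration only, not the researchers' implementation: the term list `TAXON_TERMS`, the function name `biodiversity_metrics` and the simple tokenisation are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, tiny term list for illustration only;
# the vocabulary used in the study is not given in this text.
TAXON_TERMS = {"oak", "fox", "sparrow", "rose", "trout"}


def biodiversity_metrics(text, terms=TAXON_TERMS):
    """Return simple occurrence and variety measures for one text."""
    # Naive tokenisation: lowercase runs of letters, no plural handling.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t in terms)
    occurrences = sum(counts.values())
    return {
        # How many times any listed term appears.
        "occurrences": occurrences,
        # How many different listed terms appear (variety of vocabulary).
        "distinct_terms": len(counts),
        # Occurrences relative to text length, so long and short works compare.
        "share_of_tokens": occurrences / len(tokens) if tokens else 0.0,
    }


sample = "The fox slept under the oak while a second fox watched the rose."
metrics = biodiversity_metrics(sample)
print(metrics)
```

In this sketch, "fox" counts twice towards the occurrences but only once towards the distinct terms, which is the difference between frequency and variety. Plurals and multi-word species names would need additional handling.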

They found that, on average, works written by women contained more biodiversity than those written by men across all the periods analysed. Where the authors came from and where they lived also played a role: the researchers found more occurrences of nature in the works of North American authors than in European works. In addition, writers from smaller towns tended to describe more biodiversity in their work than those living in larger cities.

In terms of age, the picture was mixed: on average, young authors under 25 and older authors over 70 wrote about plants and animals more often than middle-aged authors. According to the analysis, however, whether the writer had children had no influence on the occurrences of biodiversity in their works. In addition to these five core variables, the researchers included many other aspects in the analysis, such as the authors’ level of education, the literary genre and the intended purpose of the works.

“The results are statistically highly significant,” says Lars Langer, a doctoral researcher at the Institute of Computer Science at Leipzig University and lead author of the study. “However, it is important to stress that these are statistical statements, which means that in individual cases the situation can be completely different or even the opposite.”

The study does not provide any direct answers to the question of why the authors’ personal circumstances affect the occurrences of biodiversity in their works. But Langer makes an assumption: “Almost all the correlations we can find can be traced back indirectly to the corresponding education and socialisation. High standards of general education promote an appreciation of nature.” The findings therefore also have implications when it comes to educating specific target groups within society and raising their awareness of biodiversity issues.

Almost as a by-product of the study, a new resource was created for future use by the scientific community. According to the research team, the text corpus, enriched with biographical information, is a valuable new source for further research projects at the intersection of literary studies and the digital humanities.

Original publication in “People and Nature”:
“The relation between biodiversity in literature and social and spatial situation of authors: Reflections on the nature–culture entanglement”,