Institute of Education

Research & Expertise to Make a Difference in Education & Beyond

What Do Digital Traces Have to Offer for the Study of Psychological Wellbeing?

The round table on ‘Psychological Wellbeing in the Digital Age’ brought together a range of scholars and one industry professional to talk about how a user’s digital footprint—or ‘digital traces’—can be used to discern a person’s psychological state, predict their behavior, and, potentially, even improve their psychological wellbeing.

What Are Digital Traces?

When you think about your digital traces and what is done with them, usually what comes to mind are the creepy tactics of companies such as Google or Yandex, which use your digital data to implement targeted marketing and manipulate your consumption habits. But at the round table, David Garcia of Complexity Science Hub Vienna, Maksim Skryabin, Senior Research Fellow at HSE’s Institute of Education, and Ivan Smirnov, a laboratory head and lecturer at HSE’s Institute of Education, discussed the potential of digital traces to enhance the field of psychology, develop more productive online educational platforms, study the psychological wellbeing of students, or identify students who need help, as well as the ethical implications of these endeavors.

Also participating in the round table was Valery Babushkin of X5 and Yandex, who provided important context about how digital traces are currently used in the tech industry as well as what constitutes a trace and what doesn’t. ‘If I have found something in my browser and left something in my search engine, it’s a digital trace. If I have changed a profile picture or tweeted something, it’s a digital trace. If I have purchased something at an online store, it’s a digital trace. But, if I buy something at a store offline, is it still a digital trace? If I use a store loyalty card, then yes. There’s a very fine line between what constitutes a digital trace and what doesn’t.

‘Data describes a lot. At Yandex we have a history of what websites each user has visited. Just by knowing the sequence of websites a user has visited, we can describe him or her. In addition, this information alone can be used to make a clusterization. Just a simple one, of 4 or 5 people. Why do we do this? We assume that people in the same group are close to each other in some way and people not in the same group are different. So when we apply this information to a product metric, a user in one group can differ from a user in another group by a factor of twenty. In this way, web history is behavior (in terms of money, interaction, relationship with a product), and it could differ 20 times from that of another user. And then based on this behavior, I can predict how much money you’re going to spend or even how much time you will spend using a product and so on. And that’s only based on web history, not even considering the information we have on what the user has searched for.’
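To make the grouping idea concrete, here is a minimal sketch of clustering users by the sites they visit and then comparing a product metric across the resulting groups. It is an illustration only and not a description of how Yandex does this: the user IDs, site names, spending figures, and helper functions (`site_vector`, `kmeans`, `mean_spend_by_cluster`) are all hypothetical, the order of visits is ignored (each history is reduced to the share of visits per site), and a plain k-means with a fixed seed choice stands in for whatever method is used in practice.

```python
from collections import Counter


def site_vector(history, vocabulary):
    """Share of a user's visits that went to each site in `vocabulary`."""
    counts = Counter(history)
    total = len(history) or 1
    return [counts[site] / total for site in vocabulary]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors seed the centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every user to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def mean_spend_by_cluster(users, labels, spend):
    """Average of the (hypothetical) product metric within each cluster."""
    totals, sizes = Counter(), Counter()
    for user, label in zip(users, labels):
        totals[label] += spend[user]
        sizes[label] += 1
    return {label: totals[label] / sizes[label] for label in sizes}


# Hypothetical browsing histories and spending, invented for this example.
histories = {
    "u1": ["news", "shop", "shop", "shop"],
    "u2": ["news", "news", "video"],
    "u3": ["shop", "shop", "news"],
    "u4": ["video", "news", "news", "news"],
}
spend = {"u1": 200.0, "u2": 10.0, "u3": 180.0, "u4": 9.0}

users = sorted(histories)
vocabulary = sorted({site for h in histories.values() for site in h})
vectors = [site_vector(histories[u], vocabulary) for u in users]

labels = kmeans(vectors, k=2)
means = mean_spend_by_cluster(users, labels, spend)
ratio = max(means.values()) / min(means.values())
print(labels, means, ratio)
```

With this toy data the two shopping-heavy users land in one group and the two news-heavy users in the other, and the groups' average spending differs by a large factor, which is the kind of between-group gap the quote describes.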

An Indicator of a User’s Psychological State

In terms of connecting digital traces with a person’s psychological state, ‘It’s a problem of a confounding variable,’ Valery Babushkin says. ‘Because there is a mood, the psychological state of the person, and this state clearly affects what the person does, what he buys, what he writes, and how he behaves and expresses himself both in the web space and in real space. And if we were to put this into a psychological graph, there would be a two-way inference: from how you feel to what you do, and then from what you did back to how you feel. And usually in the industry we make these inferences not only because we want to know this information but because we want to use it [for marketing purposes]. But of course there is also the ethical question. Are digital traces indicative of a person’s psychological state? Yes. Can we use digital traces? Yes. Shall we? That’s another question.’

Using Digital Traces Beyond the Tech Industry

Sofia Dokuka, a research fellow at HSE’s Institute of Education and the moderator of the round table, pointed to the complexity and limitations of working with digital traces. ‘It is a topic with a lot of different dimensions including machine learning and a lot of technological options on the one hand, and different ethical issues on the other, such as how we get the data, and then what we do with it.’

The question of what can be done with digital traces has been a central focus for David Garcia, who is a computer scientist by training but has been working with psychologists for the past decade to investigate how this digital data can be used to measure a person’s psychological wellbeing. While digital traces can encompass both active data (information that users post publicly on social media) and passive data (information collected from users without their knowing), Professor Garcia focuses mainly on publicly available socio-behavioral data in order to study psychological behavior. ‘This is a treasure trove of information, but we are still determining what we can quantifiably get from this kind of data,’ he says. In his work he takes concepts, experiments, or surveys from the field of psychology and translates them into metrics that can be applied to digital traces. But this tactic is not without limitations: ‘Most people do not want to participate in a 3-year study in a fast-paced environment being asked constantly, “How do you feel? How do you feel?”’ With more passive metrics, meanwhile, he sees broad potential: not for replacing psychology, but for enhancing it and helping psychologists see more.

Maksim Skryabin is also pursuing the potential of digital traces to help measure and/or improve users’ wellbeing, specifically in the context of massive open online courses (MOOCs). ‘My definition of digital traces is narrower, because I focus on behavioral data: not what people write publicly or have purchased online, but how they behave on social media. We can use them to learn more about positive attitudes, as well as negative effects, such as bullying and trolling, that can happen in an educational environment or otherwise.’ Additionally, when it comes to applying digital traces in the field of education, ‘one use is to identify students who need help and to use digital traces to offer them a better service, and this is one of the options that we want to develop.’

Big Potential for Science

Ivan Smirnov sees two promises that digital traces have for science. ‘Using traditional methods, we can study traits like income (something that is stable) or academic performance (something that is stable over time) and a lot of other traits. But, as has already been said, if you want to study emotions, it’s impossible to do this with certainty, and this includes not only emotional states, but also behavior. This is especially important for psychological wellbeing, because it’s a very complex topic and it is related to the behavioral field. For example, we know that psychological wellbeing is related to sleep patterns. If you’re depressed, then you have problems with sleep. Sleep is also very difficult to study, because you need to know each day when a person goes to sleep…and all kinds of related behavior. Another example of something potentially related to psychological wellbeing that is difficult to measure is movement. If you ask people how many kilometers they walk each day, they probably don’t know. We have services now that measure this (like Fitbit), but before, people simply could not have told you. In a pilot study my colleagues and I conducted, we asked people to share their Google location history with us, so that we could learn how and how far they move and see if this behavior is related somehow to psychological wellbeing.’

The second promise of digital traces Ivan Smirnov sees for science is prediction. ‘Some people say that science is about prediction and that the problem with social science is that it’s very bad at making predictions—that it’s kind of not real. It seems, however, that with digital traces we can make models with better prediction than traditional models, because we have these huge dimensions and large amounts of data. But there are also some problems. With some things, we can get very good predictions—so much so that it surprises us—while with other things we cannot.’

But Not Without Roadblocks

The panelists all agreed, however, that the path to successful predictions is still marked by challenges. One challenge is that once people know predictions are being made about them based on their behavior, they change that behavior. One example of this is when Google was predicting the winners of Eurovision by comparing the number of search queries related to contestants. Once users learned of this tactic, they started Googling their favorite contestants more often in order to improve their standing in the prediction rankings, thereby skewing the results.

Additionally, if a researcher is using behavior data from social networks to make predictions about the population at large, the social network may not provide a true reflection. ‘People active on Twitter, for example, are probably more extroverted than those who are not,’ says David Garcia.

Another challenge that remains for researchers is ethics. One important factor is consent, of users and non-users alike. ‘It is common knowledge that if a user has Facebook on their phone, Facebook then has access to all of the people on that user’s phone’s contact list, regardless of whether those people are on Facebook or not,’ says David Garcia. ‘Therefore, when we talk about an individual’s digital traces on social media, we can’t view that data as something that has resulted from a conscious decision on the part of that particular individual to join the social network. It is not an individual decision. It is a collective decision. And this is an important distinction with which researchers must contend.’

By the close of the round table discussion, it was clear that the potential uses for digital traces are much richer than one might initially think—but achieving the benefits of these uses comes with the challenges of unreliability as well as ethical problems that are not easily solved.