Using AI and ML for the greater good of society

If we can use artificial intelligence and machine learning to create a public health care system that allows each citizen to live a longer and healthier life and stay independent in old age, what is there to hold us back? But what if that system is created on the basis of people’s personal data – is that okay? How do we implement a system that needs data without compromising individual privacy and breaking the code of data ethics?

Artificial Intelligence (AI) and Machine Learning (ML) are all about making proficient use of data. That is, if the data are ample. But how do AI and ML work in practice? How do we make the best possible use of the data being offered to us? And how do we make sure we are working with the right data?

Those are some of the questions that Sasmita Kusumastuti will give her take on when she speaks at the European Forum on AI & Data Ethics 2020 conference, which will be held online and in person in Copenhagen on the 21st and 22nd of October 2020.

Sasmita is an assistant professor at the Section of Epidemiology, Department of Public Health and Center for Healthy Aging, at the University of Copenhagen. Here, she does research within the field of ageing, epidemiology, and public health. She did her PhD studies on predicting clinical outcomes in older persons. On top of that, Sasmita also coordinates the CHALLENGE platform, which is a large research project titled “Harnessing the Power of Big Data to Address the Societal Challenge of Ageing.”

At the conference, Sasmita will talk about her project of developing a system to predict older persons at risk of declining health. She hopes that her participation will contribute to the discussion of the potentials and pitfalls of implementing AI in practice.

According to Sasmita, both AI and data ethics are ever-evolving concepts that we can use to improve our lives.

“Recent advances in AI open the world to endless possibilities. Like a knife or any other tool, however, whether it is good or bad depends on the one operating it. Therefore, we need to call attention to data ethics to prevent chaos. We have to ask ourselves: What is right, just, and appropriate?” Sasmita says, and continues:

“I have a feeling that we are currently living in a pivotal era where the discussions raised, the decisions made, and the policies implemented now will have lasting consequences for the future. I really like that the European Forum on AI & Data Ethics 2020 conference involves a diverse range of topics and stakeholders. Therefore, I hope that this conference will be an influential platform steering discussions, decisions, and policies going forward,” Sasmita comments.


Dansk IT has had the pleasure of meeting Sasmita for a talk about her research and how she uses data in her work.

Please tell us a little about yourself and your research.

As a researcher, I am interested in ageing from a public health perspective. I focus not only on physical health such as diseases, disabilities, and deaths, but also on the psychosocial aspects such as people’s wellbeing and loneliness.

At the University of Copenhagen, I am part of a research group that is developing a system to predict which older persons are at risk of declining health, to help municipalities optimize their prevention strategies.

We use high-quality Danish register data to follow a person’s life history over the years, e.g. socioeconomic conditions, living conditions, diagnoses, hospitalizations, and medicines.

Furthermore, we develop machine learning techniques to recognize patterns in a person’s life history, so that we can accurately predict whether a person will need home care in the future.
As indicated, with this AI system, we hope to help Danish municipalities offer early prevention programs that will result in individuals living long and healthy lives and staying independent in old age.

Please explain some favourite aspects of your profession.

My main responsibilities are research and teaching. In my opinion, the best thing about being a researcher is the intellectual challenges along the journey of generating new knowledge. It usually starts with some sparks of curiosity such as “Why is it this way?” or “How can we solve this problem?”. Then I embark on a journey of designing a study to answer a specific research question, implementing the study, documenting the results, and outlining how this new knowledge adds to what is already known. I find it fascinating to be able to connect the dots between what is known and unknown.

As for teaching, the best thing is being able to interact with curious minds. I enjoy presenting ideas and letting the students think freely, question, and challenge these ideas. It is a joy to see their expressions change when they experience that “Aha!” moment and become even more engaged. After classes, I often get follow-ups from students who are still curious and want to explore the topic further. That is certainly very rewarding for me.

How do you use big data in your work?

My colleagues and I mostly use register data from Statistics Denmark covering the whole Danish population over time. Here we have access to information on many aspects of life, e.g. social, economic, and biomedical conditions and geographical location, over time. Such big data present a lot of noise, so to speak. Thus, we must find ways to turn down the noise and pick up the meaningful signals. Therefore, I collaborate with data scientists and statisticians, and we spend a lot of time and effort on how to analyse these data efficiently.

How do you make sure that you are using the “right” data?

In my work, I collaborate with people from a diverse range of professions, including medical doctors, municipality officers, data scientists, statisticians, programmers, and ethics experts.
The subject matter experts, e.g. medical doctors and other relevant specialists, will have an idea of which kind of information is necessary for the study.

The data experts, e.g. data scientists, will identify which data capture the information that we are looking for, and the statisticians will figure out how best to analyse the data according to the research aim.

Depending on the context, the ethics experts weigh in on whether the data are used appropriately with regard to the purpose in question.

This is evaluated at every step of the way to ensure that we are using the right data in the right way.

What if – for the sake of the future – you had free, unlimited access to every Dane’s personal data (like health data) for any kind of purpose? Would that make your research easier?

“Free and unlimited access,” in my opinion, implies freedom from accountability, and when one is not held accountable, it is “the beginning of the end.” This can be dangerous and may damage the trust that society has in researchers.

Regardless of the ethical implications, having access to everyone’s personal data would not necessarily make my research easier.

In research, we want to find a meaningful signal, i.e. a contrast, to see what works and what does not work, and so find the answers to our research question. With so many different kinds and such large volumes of data available, the result is often a noisy mess, and the hard work is separating the random noise from the signal that we want to learn more about.

For example, using information from everyone’s personal data does not make sense because everyone is so different, and these variations make it more difficult to pick up the signal. There is also the issue that the purpose of the data collection may not align with the purpose of the research itself. For example, register data from general practitioners are collected mostly for the purpose of reimbursing service costs. Therefore, the character of the information is different and does not necessarily fit research questions about physical health, e.g. diseases, though this information may be useful for other researchers who are specifically investigating health care costs.

So even if everyone’s personal data were accessible to me, it would not necessarily help me with my research.

Furthermore, in my opinion, people have the right to be informed about, and to consent to, what their data is being used for. Thus, I do not think it is ethically correct to give anyone free access to every Dane’s personal data. Using people’s data in this way, without any accountability, will do much more harm than good.

In your opinion, is it likely that the Danish authorities will ever make every citizen’s data available for research?

No. There is no main argument or any kind of urgent pressure from researchers to do so. Besides, we already have access to a lot of data in Denmark compared to other countries like Sweden and Norway. Nonetheless, there is of course a set of regulations that researchers must follow in order to get access to data. Furthermore, if anyone breaks the rules, there are dire consequences. As there should be.

As someone coming from abroad, I see Denmark as a country where these kinds of discussions are encouraged. There is a feeling of shared responsibility to do your civic duty for the greater good. This is important, especially because we have to keep up with ever-changing technologies and their influence on society. Thus, I have seen many calls for better data and privacy protection, and now we will have to find the common ethical framework that can be implemented and reinforced to protect citizens’ personal data and privacy.

Sasmita will speak on Day 1 at the European Forum on AI & Data Ethics 2020 conference.
