The University of British Columbia
Keynote Title : Combining Large Language Models and Discourse Processing with a focus on HealthCare Applications
Keynote Abstract : Despite the great success of Large Language Models (LLMs), they still suffer from several serious flaws, which are especially problematic in high-stakes domains like healthcare. For instance, they struggle with tasks involving multiple documents, they hallucinate, they are not interpretable, and they are not able to develop complex plans for either problem solving or text generation. It also seems that simply making them more domain-specific or scaling them up is problematic. In this talk, I will argue that more powerful discourse processing could come to the rescue, but only if two key challenges are addressed. First, we need to be able to train modern discourse parsers (NLU) and generators (NLG) across domains and genres, for both monologues and dialogues, without requiring substantial human annotation. Second, we need to better understand what discourse information is missing in LLMs and how to inject it into current LLMs. Throughout the talk, I will highlight how this research could greatly benefit specific applications of LLMs in healthcare.
Biography : Giuseppe Carenini is a Professor in Computer Science and Director of the Master in Data Science at UBC (Vancouver, Canada). His work on natural language processing and information visualization to support decision making has been published in over 160 peer-reviewed papers (including best paper at UMAP-14, ACM-TiiS-14 and Sigdial-24). Dr. Carenini has served as area chair for many conferences, including recently ACL'21 in “Natural Language Generation”, and as Senior Area Chair for NAACL'21 in “Discourse and Pragmatics”. Dr. Carenini was also the Program Co-Chair for IUI 2015 and for SigDial 2016. In 2011, he published a co-authored book on “Methods for Mining and Summarizing Text Conversations”. In his work, Dr. Carenini has also collaborated extensively with industrial partners, including Microsoft, Salesforce and IBM. He was awarded a Google Research Award in 2007 and a Yahoo Faculty Research Award in 2016.
Université de Fribourg
Keynote Title : Causal Reasoning and Constrained Generation with LLMs
Keynote Abstract : In this talk, we present a number of recent results involving GenAI models and their capabilities. First, we delve into the apparent ability of LLMs to perform causal reasoning or grasp uncertainty. We investigate whether these abilities are measurable outside of tailored prompting and Multiple-Choice Questions (MCQs). We define scenarios with multiple possible outcomes and compare the predictions made by LLMs through prompting (their Stated Answer) to the probability distributions they compute over these outcomes during next-token prediction (their Revealed Belief). Our findings suggest that the Revealed Belief of LLMs differs significantly from their Stated Answer and hint at multiple biases and misrepresentations that their beliefs may yield in many scenarios and outcomes. We then turn to the capabilities of LLMs to handle structured content. We show how LLMs can be leveraged to identify and correct errors in semi-structured documents. We then investigate how using a limited number of in-context examples can significantly improve their ability to produce structured responses while alleviating the distortion effects of constrained generation. Finally, we briefly discuss how one could apply these results to medicine, e.g., in the context of patient-doctor dialogue.
Biography : Philippe Cudré-Mauroux is a Full Professor at the University of Fribourg in Switzerland and a Research Council Member at the Swiss National Science Foundation (where he is responsible for Applied Computer Science). He received his Ph.D. from EPFL, where he won both the Doctorate Award and the EPFL Press Mention in 2007. Before joining the University of Fribourg, he worked on information management infrastructures at IBM Watson (NY), Microsoft Research Asia and Silicon Valley, and MIT. He recently won the Verisign Internet Infrastructures Award, a Swiss National Center in Research award, a Google Faculty Research Award, as well as a 2 million Euro grant from the European Research Council. His research interests are in next-generation infrastructures for non-relational data and AI.
Aalborg University
Keynote Title : On generating privacy-preserving health data
Keynote Abstract : Sharing and processing large amounts of health data is crucial for studying diseases and researching potential solutions. However, health data contains sensitive information that, if exploited, may harm patients; such data is therefore protected by data protection regulations like GDPR and HIPAA. Privacy-preserving synthetic data generation has emerged as a solution to share data while safeguarding confidentiality. Intuitively, these techniques extract meaningful patterns and statistical properties from datasets, and exploit them to generate new data points. In this talk, I will present existing research on privacy-preserving data generation. I will first introduce basic concepts like differential privacy and synthetic data generation. Next, I will explain the main approaches to achieving privacy-preserving synthetic data generation from tabular and relational data, focusing on neural networks and probabilistic graphical models, e.g., PATE-GAN and PrivBayes. Finally, I will delve into existing challenges, especially benchmarking and scalability, which I am investigating at the moment. I will discuss ongoing initiatives and projects at Aalborg University, highlighting the solutions achieved so far.
Biography : I am an associate professor in the Data, Knowledge and Web Engineering (DKW) group at Aalborg University. My research focuses on the management of dynamic knowledge graphs, with a particular interest in querying, processing, and privacy. Currently, I am exploring how knowledge graphs can facilitate privacy-preserving federated analytics on health data through the EU HEREDITARY project. Previously, I was a postdoctoral researcher at the University of Zurich, and a research fellow at the University of Aberdeen. I earned a PhD from Politecnico di Milano, where my thesis defined a formal reference model to capture the behaviour of existing stream reasoning solutions. During my PhD, I also interned at IBM Research Ireland, and was a visiting student at WU Vienna. My research was recognised with an IBM PhD Fellowship award for 2014/15. From 2013 to 2016, I contributed to the W3C Community Group on RDF Stream Processing. Earlier in my career, I worked as a junior researcher and consultant at CEFRIEL, participating in Smart City research activities of the LarKC EU project and in Web services and recommender systems research for the SOA4All and the Service Finder EU projects.
The University of British Columbia
Keynote Title : NLP tools for Understanding and Extracting from Clinical Documents and User Generated Content
Keynote Abstract : In the first half of this talk, we will explore the use of NLP, enhanced by Large Language Models, to advance the cancer care system. We will share how NLP is being used to accelerate processes within the British Columbia Cancer Registry (BCCR) to detect reportable tumours from unstructured pathology reports and to provide expedited triaging for triple-negative breast cancer patients. In the second half of this talk, we will turn our attention to user-generated content. By analyzing the content provided by Canadian university students, we will discuss how aspect-based mood detection has helped students manage their mental wellness.
Biography : Raymond Ng is the Canada Research Chair on data science and analytics. He is also the founding Director of the UBC Data Science Institute, and an elected fellow of the Royal Society of Canada. For both 2022 and 2023, he was named one of the world’s top-75 academic data science leaders by the MIT-based CDO magazine. Ng’s main research area for the past three decades has been data mining, with a specific focus on health informatics, text mining, and Natural Language Processing. He has published over 230 peer-reviewed publications on those topics. He is the recipient of two best paper awards: one from the 2001 ACM SIGKDD conference, the premier data mining conference in the world, and one from the 2005 ACM SIGMOD conference, one of the top database conferences worldwide. He is also the recipient of the 2024 Genome BC Award on Scientific Excellence. For the past 15 years, he has co-led several large-scale genomic projects funded by Genome Canada, Genome BC and industrial collaborators.
TU Delft
Keynote Title : Human-in-the-loop Generative AI for Health
Keynote Abstract : Generative AI technologies are used more and more in society, from healthcare to transport and finance. Along with the rapid increase in their adoption come increased concerns about the inherent robustness issues of such technologies and their social and ethical implications. To create generative AI systems that can properly serve humans, it is crucial to put humans at the center of the process, such that the resulting system behaves in a way that fits the cognition and values of people in the contexts of use. This poses new challenges: how can we build systems that can be understood by humans and that can align their behaviour with human values? Tackling these challenges requires new ways of looking at the computational roles humans can and should play in both developing and using generative AI systems. In this talk, I will present our recent work on human-in-the-loop approaches for understanding and improving the robustness of generative AI systems in the context of health and beyond.
Biography : Jie Yang is an assistant professor in the Web Information Systems group at TU Delft and manager of the ICAI national lab GENIUS on Generative AI development and use in large organizations. Before joining TU Delft, Jie was a scientist at Amazon (Seattle, US) and a senior researcher at the University of Fribourg (Switzerland). His research focuses on human-centered approaches for trustworthy AI, especially NLP. His work has received six “best paper” awards or nominations at premier AI conferences, including ACM TheWebConf/WWW (both 2022 and 2023), AAAI/ACM AIES (2023), AAAI HCOMP (2022), ACM SIGIR (2024), and ACM HT (2017). The work finds application across a wide range of societal domains, via collaboration with medical centers, libraries, banks, etc., and partly through projects funded by NWO or EU such as the MSCA-DN project ANT where Jie is the scientific coordinator. He serves as an associate editor for Frontiers of Artificial Intelligence and the Journal of Human Computation, and regularly serves on the senior program committees of TheWebConf/WWW, AAAI, and CIKM.