Hi, I'm going to present our paper, "User Characteristics in Explainable AI: The Rabbit Hole of Personalization", which was the work of my master's student Robert Nimmo and myself, Simone Stumpf, at the University of Glasgow, in collaboration with Marios Constantinides, Ke Zhou, and Daniele Quercia from Nokia Bell Labs in Cambridge, UK.

Why were we doing this work? We have seen a rise in research that focuses on explainable AI, or XAI, in order to open up black-box AI models and help users with understanding and appropriately trusting the decisions that an AI makes. Previous work, for example by Ehsan et al., Jiang, myself, and others, has provided some evidence that user characteristics might matter in system use and also in explanations. Age, previous experience, gender, and personality traits have been repeatedly suggested as important. Some work, for example by Conati, has also attempted to personalize XAI to a user based on their characteristics, following the notion that explanations should be tailored to specific user groups. So the overall aim of our study was to establish how users' characteristics are related to measures such as user engagement, perceived and actual understanding, and trust when these users are exposed to explanations.

We conducted an empirical user study, which can be characterized in four steps. We recruited 149 participants from Prolific, with a median age of 36, of whom 74 were women. We situated our task in hate speech detection and employed the well-known and publicly available Jigsaw hate speech data set. This contains 220,000 Wikipedia edit comments which have been labeled by human raters for toxic behavior. We then used Detoxify, an existing multi-class classifier trained with a BERT-based model, and then used LIME to explain our classifier. Our participants were asked to moderate the content by checking whether a Wikipedia comment is toxic or not. The user study took roughly 30 minutes to complete.

The participants completed two pre-task questionnaires. First, the Ten-Item Personality Inventory, or TIPI, was employed to measure each participant's Big Five personality traits in terms of their scale on the dimensions of openness, conscientiousness, extraversion, agreeableness, and neuroticism. This is a shortened personality test which is commonly used if time is of the essence, as we would imagine it might be if we want to personalize an experience for users. We then also measured the level of experience with AI systems, machine learning, and explanations, using a scale of expertise developed by McBeth.

In the main task, the participants were presented with an interface in which they could check 100 comments, of which 61 were toxic. The classifier's accuracy was 98%, so pretty good, but it still made some misclassifications. The explanations consisted of highlighted words in the comments as well as a graph of the feature weights; more information about the UI can be found in our paper. Participants were not expected to examine every comment, but rather to take the time to think thoroughly through each comment they did examine. They could confirm or relabel comments and provide feedback on feature weights, and any interactions were logged to measure the level of user engagement with the explanations.

Finally, after the main task, participants were asked to fill out a post-task questionnaire to measure their trust, perceived understanding, and actual understanding, the latter in the form of a mental model score.

So what did we find? First, we ran multiple linear regression models using age, gender, and previous experience as independent variables against each of our outcome variables. We found that none of the models fit for engagement, perceived understanding, and trust, and none of the user characteristics were significant in these models either. The only significant result we found was for actual understanding, where the linear regression model fit with a probability of .01 and where age significantly influenced actual understanding with a weight of -.275. This means the older you are, the lower your actual understanding score. We then ran
