Collaboration Technologies and Social Computing

This book constitutes the refereed proceedings of the 10th International Conference on Collaboration Technologies, CollabTech 2018, held in Costa de Caparica, Portugal, in September 2018. The 12 full papers presented in this book together with 4 short papers were carefully reviewed and selected from 36 submissions. The papers focus on topics such as: Communication Enhancement, Inter-Cultural Collaboration, Learning Support System, Entertainment System, Social Studies, and UI and UX.




LNCS 11000

Hironori Egi · Takaya Yuizono · Nelson Baloian · Takashi Yoshino · Satoshi Ichimura · Armanda Rodrigues (Eds.)

Collaboration Technologies and Social Computing
10th International Conference, CollabTech 2018
Costa de Caparica, Portugal, September 5–7, 2018
Proceedings


Lecture Notes in Computer Science
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, Lancaster University, Lancaster, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zurich, Zurich, Switzerland
John C. Mitchell, Stanford University, Stanford, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan, Indian Institute of Technology Madras, Chennai, India
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbrücken, Germany


More information about this series at http://www.springer.com/series/7407


Editors
Hironori Egi, Department of Informatics, The University of Electro-Communications, Chofu, Tokyo, Japan
Takaya Yuizono, Graduate School of Knowledge Science, Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan
Nelson Baloian, University of Chile, Santiago, Chile
Takashi Yoshino, Wakayama University, Wakayama City, Japan
Satoshi Ichimura, Otsuma Women’s University, Tokyo, Japan
Armanda Rodrigues, Departamento de Informatica, Universidade Nova de Lisboa, Caparica, Portugal

ISSN 0302-9743
ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-319-98742-2
ISBN 978-3-319-98743-9 (eBook)
https://doi.org/10.1007/978-3-319-98743-9
Library of Congress Control Number: 2018950648
LNCS Sublibrary: SL1 – Theoretical Computer Science and General Issues

© Springer Nature Switzerland AG 2018

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Message from the General Chairs

We are delighted with the brilliant success of the 10th International Conference on Collaboration Technologies (CollabTech 2018). CollabTech 2018 offered a unique forum for academics and practitioners to present and discuss innovative ideas, methods, or implementations related to collaboration technologies, whose contributions to the successful completion of various routine collaboration activities have been enhanced by recent advances in networking, computing, and interaction technologies.

CollabTech conferences were held in Tokyo in 2005, Tsukuba in 2006, Seoul in 2007, Wakayama in 2008, Sydney in 2009, Sapporo in 2012, Santiago in 2014, Kanazawa in 2016, and Saskatoon in 2017. Following the success of the joint organization of CollabTech conferences with CRIWG 2014, CRIWG 2016, and CRIWG 2017, CollabTech 2018 was co-located and organized with CRIWG 2018 in Costa de Caparica, Portugal. We believe that our selection of this venue guaranteed the success of this technical conference, which was enriched by the culture and scenery of Portugal. Although the CRIWG and CollabTech communities had similar research topics and goals, they had been geographically located in different regions. Therefore, we believed this joint endeavor would provide an interesting opportunity for the two communities to meet and get to know each other.

As the conference chairs of CollabTech 2018, we know that the success of the conference ultimately depends on the efforts of many people who worked with us in planning and organizing the conference. We thank the program co-chairs for their wise counsel and brilliant suggestions regarding the organization of the Program Committee to ensure that it conducted a thorough and timely review of papers, and our sponsors, who helped us to make CollabTech 2018 affordable for all its participants. In addition, we attribute the success of the conference to the efforts of Universidade NOVA de Lisboa, the Special Interest Group (SIG) on Groupware and Network Services of the IPSJ, the SIG on Cyberspace of the Virtual Reality Society of Japan, and the SIG on Communication Enhancement of the Human Interface Society.

Our technical program was diverse and encompassed approximately 16 technical papers. Further, we provided the participants with numerous opportunities for informal networking. We are pleased that the conference was fruitful for all the participants and significantly contributed to the development of academic interest in this research field.

September 2018

Takashi Yoshino
Satoshi Ichimura
Armanda Rodrigues

Message from the Program Chairs

After nine events of the International Conference on Collaboration Technologies series, we had the tenth edition (CollabTech 2018) in Costa de Caparica, Portugal. The following topics on collaboration technologies were discussed:

– Communication Enhancement
– Inter-Cultural Collaboration
– Learning Support System
– Entertainment System
– Social Studies
– UI and UX

For this conference, we received 36 submissions (26 full papers, ten work-in-progress papers) and assigned three reviewers per full paper or two reviewers per work-in-progress paper. As a result, we had 12 full papers, and four work-in-progress papers. The acceptance rate was 44%. Because of the high quality of submissions, many excellent papers were not among those accepted. We hope that the detailed technical review comments we provided are helpful. Without our distinguished Program Committee members, we could not have maintained our high standards. We truly appreciated their devotion. Finally, we hope that these proceedings serve as a reference for future researchers in this rapidly evolving field.

September 2018

Hironori Egi
Takaya Yuizono
Nelson Baloian

Organization

Conference Co-chairs
Takashi Yoshino, Wakayama University, Japan
Satoshi Ichimura, Otsuma Women’s University, Japan
Armanda Rodrigues, Universidade NOVA de Lisboa, Portugal

Program Co-chairs
Hironori Egi, University of Electro-Communications, Japan
Takaya Yuizono, Japan Advanced Institute of Science and Technology, Japan
Nelson Baloian, Universidad de Chile, Chile

Publication Chair
Junko Ichino, Tokyo City University, Japan

IPSJ SIG GN Liaison
Noriaki Saito, Tokyo Online University, Japan

VRSJ SIG CS Liaison
Kazuyuki Iso, NTT, Japan

HIS SIG CE Liaison
Takashi Yoshino, Wakayama University, Japan

Steering Committee
Hideaki Kuzuoka, University of Tsukuba, Japan
Ken-ichi Okada, Keio University, Japan
Jun Munemori, Wakayama University, Japan
Minoru Kobayashi, Meiji University, Japan
Hiroaki Ogata, Kyoto University, Japan
Tomoo Inoue, University of Tsukuba, Japan


Program Committee
Gwo-Dong Chen, National Central University, Taiwan
Hui-Chun Chu, Soochow University, Taiwan
Kinya Fujita, Tokyo University of Agriculture and Technology, Japan
Atsuo Hazeyama, Tokyo Gakugei University, Japan
Gwo-Jen Hwang, National Taiwan University of Science and Technology, Taiwan
Tomoo Inoue, University of Tsukuba, Japan
Yutaka Ishii, Okayama Prefectural University, Japan
Kazuyuki Iso, NTT, Japan
Marc Jansen, University of Applied Sciences Ruhr West, Germany
Jongwon Kim, Gwangju Institute of Science and Technology, South Korea
Hyungseok Kim, Konkuk University, South Korea
Wim Lamotte, Hasselt University, Belgium
Yuan Tian, Singapore Management University, Singapore
Chen-Chung Liu, National Central University, Taiwan
Wolfram Luther, University of Duisburg-Essen, Germany
Hideyuki Nakanishi, Osaka University, Japan
Mamoun Nawahdah, Birzeit University, Palestine
Masayuki Okamoto, Toyota, Japan
Masaki Omata, University of Yamanashi, Japan
Nobuchika Sakata, NARA Institute of Science and Technology, Japan
Yoshiaki Seki, Tokyo City University, Japan
Hidekazu Shiozawa, Tamagawa University, Japan
Daniel Spikol, Malmo University, Sweden
Hao-Chuan Wang, National Tsing Hua University, Taiwan
Kazushi Nishimoto, Japan Advanced Institute of Science and Technology, Japan
Shin Takahashi, University of Tsukuba, Japan
Satoshi Nakamura, Meiji University, Japan

Contents

Communication Enhancement

Discussion Map with an Assistant Function for Decision-Making: A Tool for Supporting Consensus-Building
Ryunosuke Kirikihira and Kazutaka Shimada

An Integrated Support System for Disaster Prevention Map-Making Using Town-Walk Information Gathering
Sojo Enokida, Takashi Yoshino, Taku Fukushima, Kenji Sugimoto, and Nobuyuki Egusa

Concealment-Type Disaster Prevention Information System Based on Benefit of Inconvenience
Satoko Shigaki and Takashi Yoshino

Development of a Stroll Support System Using Route Display on a Map and Photograph Sharing Service
Junko Itou, Takaya Mori, Jun Munemori, and Noboru Babaguchi

Inter-Cultural Collaboration

Machine Translation Usage in a Children’s Workshop
Mondheera Pituxcoosuvarn, Toru Ishida, Naomi Yamashita, Toshiyuki Takasaki, and Yumiko Mori

Learning Support System

A Presentation Supporting System for Programing Workshops for Elementary School Students
Koki Ito, Maki Ichimura, and Hideyuki Takada

Gamifying the Teaching and Learning Process in an Advanced Computer Programming Course
Mamoun I. Nawahdah

Designing a System of Generating Sound Environment for Promoting Verbal Communication in Classroom
Riri Sekine, Yasutaka Asai, and Hironori Egi

“Discuss and Behave Collaboratively!” – Full-Body Interactive Learning Support System Within a Museum to Elicit Collaboration with Children
Mikihiro Tokuoka, Hiroshi Mizoguchi, Ryohei Egusa, Shigenori Inagaki, Fusako Kusunoki, and Masanori Sugimoto

Entertainment System

Can Social Comments Contribute to Estimate Impression of Music Video Clips?
Shunki Tsuchiya, Naoki Ono, Satoshi Nakamura, and Takehiro Yamamoto

Detection of Football Spoilers on Twitter
Yuji Shiratori, Yoshiki Maki, Satoshi Nakamura, and Takanori Komatsu

Social Studies

Analysis of Facilitators’ Behaviors in Multi-party Conversations for Constructing a Digital Facilitator System
Tsukasa Shiota, Takashi Yamamura, and Kazutaka Shimada

Consideration of a Method to Support Face-to-Face Communication Using Printed Stickers Featuring a Picture of a Character Expressing a Mood
Yuri Nishimura and Minoru Kobayashi

UI and UX

Migaco: Supporting Young Children’s Tooth Brushing with Machine Learning
Satoshi Ichimura

Improving Visibility and Reducing Resistance of Writers to Fusion of Handwritten and Type Characters
Mikako Sasaki, Junki Saito, and Satoshi Nakamura

PopObject: A Robotic Screen for Embodying Video-Mediated Object Presentations
Kana Kushida and Hideyuki Nakanishi

Author Index

Communication Enhancement

Discussion Map with an Assistant Function for Decision-Making: A Tool for Supporting Consensus-Building

Ryunosuke Kirikihira and Kazutaka Shimada

Department of Artificial Intelligence, Kyushu Institute of Technology, 680-4 Kawazu Iizuka, Fukuoka 820-8502, Japan
{r kirikihira,shimada}@pluto.ai.kyutech.ac.jp

Abstract. In this paper, we propose a tool for supporting consensus-building in conversations with multiple participants. We call it “Discussion Map with Assistant (DMA)”. It consists of nodes and links. We classify the nodes into two types: alternatives and criteria. Alternatives represent what the participants are choosing between. Criteria are used to judge the alternatives. Each criterion contains an importance value. Each link between nodes also contains an importance value. The system estimates a ranking list of alternatives among participants from each map. We introduce a forgetting function to the model. The system also supports the decision-making process by using discussion maps from participants. It generates sentences and charts that describe the current state of the discussion. We experimentally evaluate the effectiveness of the discussion map system with DMA in a decision-making task.

Keywords: Discussion map · Decision support system · Consensus estimation · Support with charts and sentences

1 Introduction

Supporting consensus-building in conversations with multiple participants is a very important task in intelligent systems, and such support is expected to be applied in a wide range of fields. Participants in a discussion often struggle to identify the most suitable solution for a decision on a meeting agenda because there are generally many alternatives and criteria related to making the decision. It is important for the participants to visualize the discussion state in order to make a good decision. The visualization also helps each participant to understand temporal opinions.

Such a supporting system for consensus-building can play a very important role in many situations. For example, in education, problem-based learning (PBL) has recently become a highly regarded approach to learning [6]. Discussion among participants has a critical role in a PBL environment. However, discussions among students tend to be limited and not very effective, and they often fail to reach a satisfying decision. This leads to the failure of the discussion. To conduct smooth, active and productive discussions, they need a facilitator who controls the discussion appropriately. However, it is impractical to assign a good facilitator to each group in the PBL environment because of a lack of human resources. Similar cases also appear in other situations, such as business meetings and group discussions. Although a project manager needs to appropriately handle a discussion in business meetings, he/she might not have a remarkable skill in terms of discussion facilitation. Ordinary people in a group discussion might subconsciously need help from others to generate a good decision. Therefore, supporting such discussions is an important task.

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 3–18, 2018. https://doi.org/10.1007/978-3-319-98743-9_1

Fig. 1. Discussion map for decision-making tasks. [Figure labels: overhead-view camera; discussion map; microphone and multi-camera images. Roles of the discussion map: (1) management of personal opinions and information sharing; (2) feedback for consensus-building.]

In this paper, we propose a tool for supporting consensus-building in conversations with multiple participants in various environments, such as meetings and PBL situations. The goal of our study is to construct a system that supports consensus-building and management of conversation for high-quality discussions. We call it “digital facilitator”. Figure 1 shows the outline of our system. Here we focus on laptop PCs or tablet terminals as an input tool for discussions. The prototype system in this paper is developed as a Web application. We propose a discussion map which consists of nodes and links among nodes constructed by each participant. Participants can express their ideas and opinions by using the discussion map. Each discussion map is a kind of visualization of each participant’s thought. The visualization helps participants to easily understand their own thoughts. In addition, discussion maps are more manageable for computer systems than natural language texts, images, and sounds because they are based on a graph structure. Our system estimates the discussion state and a latent consensus from the discussion maps of each participant, and then explains the circumstances to each participant.
This paper makes two contributions: (1) our system estimates the individual opinions of participants by using a scoring method with a decay factor, and (2) our method explains the discussion state to participants by using charts and sentences, namely a decision-making assistant function (DMA). We evaluate these two points through experiments in this paper.

2 Related Work

There are many fields related to our method. One of the targets in our research is the PBL environment. Several researchers have proposed collaborative learning support systems and approaches based on constructive interaction [11,15]. Utilizing graph structures, such as a concept map, is one of the most effective approaches for educational support. Villalon and Calvo [19] have reported a definition of concept map mining and a framework for its evaluation. Yamasaki et al. [20] proposed the Kit-Build method based on a concept map. The purposes of these studies were to extract a concept map automatically or to compare the goal map and each learner map. In contrast, our purpose is to manage opinions and estimate the consensus of participants in a discussion. Takagi and Shimada [18] have proposed a system for collaborative learning with tablet terminals. However, their purpose was just to estimate a level of understanding among participants. Suzuki et al. [17] have proposed a collaborative learning tool using tablet terminals, XingBoard. Masukawa [10] has proposed a web-based notebook for collaborative learning. The purposes of these studies were to construct a framework to support the discussion environment. On the other hand, the goal of our study is to develop a digital facilitator based on a computer and support interaction itself. The target of our research is not limited to the PBL situation. El-Assady et al. [4] have developed a visual analytics framework for multi-party discourse data. However, the purpose of this study was to visualize transcribed text interactively for content analysis. Ito et al. [7] have developed an open on-line workshop system called COLLAGREE that has facilitator support functions for an internet-based town meeting. They incorporated an incentive mechanism for large-scale collective discussions into the system [8]. The system is effective.
However, the purpose of this system is to gather opinions from many participants on the web and extract discussion points from them. The purpose of our system is to visualize real-time thoughts and opinions of participants by using discussion maps, and then support the decision-making. Katsura et al. [9] have proposed an argumentation support tool which displays justified arguments to participants. The purpose was to judge whether an argument is logically correct or not. Our target does not always require the correctness of the consensus in a discussion. Nagao et al. [13] have proposed a system for discussion mining. The system generated meeting summaries semi-automatically, retrieved the discussion contents, and generated an answer to a certain question based on the accumulated discussion contents. Nagao [12] has also proposed a creative activity support system. These systems are useful and the motivation is similar to our purpose. One major issue of these systems was to support task execution and evaluation results, namely the plan-do-check-act (PDCA) cycle. On the other hand, we focus on supporting a discussion by an assistant function based on our discussion map system for making the best decision. ValueCharts [2] and LineUp [5] are good tools to make the best decision from several alternatives and criteria. Participants can add weight to the criteria, and then easily identify the best choice by using these tools. Alonso et al. [1] have proposed a consensus visualization method based on a clustering algorithm with several measures. It generated a consensus diagram based on the opinions of each participant. The motivation of these studies is to visualize the situation and consensus. On the other hand, the main purpose of our study is to support discussions by providing advice that is estimated from the current discussion maps.

Fig. 2. The outline of our discussion map system with assistant functions. Each map shares nodes. However, links are different. Therefore, each participant doesn’t know other participants’ maps. Only the system knows the whole situation and generates sentences and charts for decision-making success.

3 Discussion Map System with Assistant

In this section, we explain our discussion map system. First, we describe the discussion map (DM) itself. Then, we explain a consensus estimation function from DMs. We introduce a decay factor to the function. Finally, we describe a facilitator function, namely Discussion Map with Assistant (DMA). DMA explains the current discussion state to participants and provides advice that is estimated from the current DMs of all participants. Figure 2 shows the outline of the system.

Fig. 3. Two DMs from two participants (P1) and (P2). Each participant expresses own thoughts on own map. Other participants cannot see the map. (Color figure online) [Panels: the DM of participant P1 and the DM of participant P2, each labeled “Add a node” and “Estimation”.]

Fig. 4. The interface of the weighting to a criterion. [Annotations: The system pops up the menu when a participant selects a criterion. The left one is a link button to an alternative. The middle one is a button to the weighting. The right one is a button to delete the node. There are 3 grades; low, middle and high.]

3.1 Discussion Map

The purpose of DMs is to visualize the concepts, ideas, and opinions of each participant. Each DM consists of several nodes and links. Figure 3 shows an example of DMs from two participants, P1 and P2, about “Which programming environment is the best for Web applications?” Nodes are classified into two roles: alternatives and criteria. Alternatives are displayed as blue nodes and represent what the participants are choosing between. In the figure, “slim” and “rails” are alternatives created by participants in the discussion. Criteria are displayed as green nodes and are used to judge the alternatives. In the figure, “easy-to-use” and “# of documents” are criteria. Each criterion has a weight as an importance value. The importance value has three grades: high, middle and low. They are expressed by gradations of color on the map: deep green for “high” and light green for “low.” Each participant assigns their own importance value to each criterion. In the figure, the participant who created the DM thinks that “# of documents” is more important than “easy-to-use.” Figure 4 shows the interface to weight a criterion.
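The node model described above can be illustrated with a small data-structure sketch. This is only one possible representation, not the authors' implementation: the class names, field names, and the default grade are hypothetical, and the grades given to P1's criteria are illustrative values chosen to agree with the statement that “# of documents” is more important than “easy-to-use.”

```python
from dataclasses import dataclass, field
from enum import Enum


class Grade(Enum):
    """The three importance grades a participant can give a criterion."""
    LOW = "low"
    MIDDLE = "middle"
    HIGH = "high"


@dataclass(frozen=True)
class Alternative:
    """A blue node: one of the options the participants choose between."""
    name: str


@dataclass
class Criterion:
    """A green node: a viewpoint used to judge the alternatives."""
    name: str
    importance: Grade = Grade.MIDDLE  # default grade is an assumption


@dataclass
class DiscussionMap:
    """One participant's map (hypothetical structure): nodes plus own weights."""
    owner: str
    alternatives: list[Alternative] = field(default_factory=list)
    criteria: dict[str, Criterion] = field(default_factory=dict)

    def more_important(self, a: str, b: str) -> bool:
        """True if criterion a has a strictly higher grade than criterion b."""
        order = [Grade.LOW, Grade.MIDDLE, Grade.HIGH]
        grade_a = self.criteria[a].importance
        grade_b = self.criteria[b].importance
        return order.index(grade_a) > order.index(grade_b)


# Illustrative map for participant P1 (grades are assumed, not read off Fig. 3).
dm = DiscussionMap(owner="P1")
dm.alternatives += [Alternative("slim"), Alternative("rails")]
dm.criteria["easy-to-use"] = Criterion("easy-to-use", Grade.MIDDLE)
dm.criteria["# of documents"] = Criterion("# of documents", Grade.HIGH)
```

Keeping the criteria in a dictionary keyed by name reflects that a participant re-weights an existing criterion rather than creating a new one; the links between alternatives and criteria are left out of this sketch.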

Fig. 5. The interface of the weighting to a link. [Figure labels: Value; Comments.]

An alternative often has some links to criteria¹. A link specifies a 5-grade evaluation value between an alternative and a criterion: −−, −, 0, + and ++. Each participant expresses the importance of each link by using the evaluation value. In the DM (P1) of Fig. 3, the participant thinks that “slim” is very good in terms of “easy-to-use” while “rails” is weak. Figure 5 shows the interface to set the weight of a link.

We now explain the concept of DMs in detail. In DMs, nodes, namely alternatives and criteria, are shared among participants. On the other hand, links depend on each participant. Therefore, the structures of the DMs created by each participant are different. Please see Fig. 3 again. It contains two DMs from two participants P1 and P2. From the DMs, we can infer the following points:

– P1 attaches importance to “# of documents”.
– P2 attaches importance to “easy-to-use”.
– P2 probably does not attach importance to “# of documents” because the weight is low and there is no link to any alternatives.
– The evaluation values for “easy-to-use” for “slim” exhibit a similar tendency for both P1 and P2; ++ and +, namely positive evaluation.

From these points, our system can estimate that the consensus (the best choice) should be “slim” in this discussion.

3.2 Consensus Estimation

The purpose of the DM system is to support consensus-building by using each map of the participants. For this purpose, we need to estimate the participants’ opinions and thoughts from DMs. In other words, we need to determine which alternative is preferred among the participants. If the system can estimate some alternatives with high importance in the discussion, it can support the decision-making in the last stage of the discussion by using the estimated results. For example, the system can say “the current ranking of alternatives is as follows; 1st: A3, 2nd: A1 and 3rd: A2.” The information is useful for the participants to make a decision. In this section, we explain a rank estimation model of alternatives as a consensus estimation model.

¹ Note that not all alternatives have a link. It depends on the participant that creates the DM.

The basic idea is based on the summation of the evaluation value between an alternative and each criterion. The score of an alternative $A_j$ of a participant $P_i$ is computed as follows:

$$\mathrm{Score}(P_i, A_j) = \sum_{k}^{N} w_{C_k} \times ev(P_i, A_j, C_k) \tag{1}$$

where $C_k$ is a criterion on the map. $N$ is the number of criteria on the map. $w_{C_k}$ is the weight of $C_k$, namely the importance value of a node. We set 1.0, 0.5 and 0.1 to “High”, “Middle” and “Low” in Fig. 4. $ev(P_i, A_j, C_k)$ is the evaluation value between $A_j$ and $C_k$ of $P_i$, namely the value of a link between nodes (5 grade scores; 0, 2, 4, 8 and 10 for −−, −, 0, + and ++ in Fig. 5). These values are determined heuristically.

Here we consider a temporal characteristic of discussion. Participants need to generate alternatives and criteria on the basis of divergent thinking in the early stage of the discussion. Then, alternatives with low importance are culled while alternatives with high importance are discussed continually. In other words, in the current stage of a discussion, a topic that is discussed continually is more important than a topic that is discussed sporadically. In our DM system, we assume that $A_j$ and $C_k$ that each participant operates frequently are important. To incorporate this characteristic to the score, we introduce a forgetting function [3]. Ebbinghaus has formulated a current memory level $b$ after $t$ minutes² as follows:

$$b = \frac{100k}{(\log_{10} t)^{c} + k} \tag{2}$$

where $c$ and $k$ are constant; $c = 1.25$ and $k = 1.84$ in [3]. From the Eq. (2), we compute the saving level $sv$ on a current $t$ as follows:

$$sv = \frac{b}{100} \tag{3}$$

We apply the saving level $sv$ into the Eq. (1).

$$\mathrm{MemScore}(P_i, A_j) = \sum_{k}^{N} w_{C_k} \times ev(P_i, A_j, C_k) \times sv(A_j) \times sv(C_k) \tag{4}$$

where $sv(A_j)$ and $sv(C_k)$ denote the saving levels of $A_j$ and $C_k$. The $t$ of these saving levels is the elapsed time since the last participant’s action to $A_j$ or $C_k$. Here the action denotes

– the creation of $A_j$ or $C_k$,
– the change of the evaluation value of a link between $A_j$ and $C_k$,
– the change of the importance value of $C_k$, or
– the move of $A_j$ or $C_k$ on the discussion map.

² The initial memory level is 100 in this formulation.
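Equations (1) to (4) can be traced with a short Python sketch. It is a hedged illustration rather than the authors' code: the function names and the dictionary-based inputs are hypothetical, criteria without a link to the alternative are assumed to contribute nothing, and the saving level is assumed to stay at 1.0 for elapsed times of one minute or less, where $\log_{10} t$ is not positive. The numeric weights and link values are the ones stated in the text.

```python
import math

# Values stated in the text (determined heuristically by the authors).
CRITERION_WEIGHT = {"high": 1.0, "middle": 0.5, "low": 0.1}
LINK_VALUE = {"--": 0, "-": 2, "0": 4, "+": 8, "++": 10}

C, K = 1.25, 1.84  # constants of the forgetting function in Eq. (2)


def saving_level(t_minutes: float) -> float:
    """Eq. (2)-(3): sv = b / 100 with b = 100k / ((log10 t)^c + k).

    Assumption: for t <= 1 minute the level is clamped to 1.0, because
    log10(t) is not positive there and the power is not well defined.
    """
    if t_minutes <= 1.0:
        return 1.0
    b = 100.0 * K / (math.log10(t_minutes) ** C + K)
    return b / 100.0


def score(weights: dict, links: dict) -> float:
    """Eq. (1) for one participant and one alternative.

    weights: criterion -> "high" / "middle" / "low"
    links:   criterion -> "--", "-", "0", "+" or "++" (linked criteria only)
    """
    return sum(CRITERION_WEIGHT[weights[c]] * LINK_VALUE[g]
               for c, g in links.items())


def mem_score(weights: dict, links: dict, t_alt: float, t_crit: dict) -> float:
    """Eq. (4): Eq. (1) with each term damped by sv(A_j) * sv(C_k).

    t_alt is the time in minutes since the last action on the alternative;
    t_crit maps each criterion to the time since the last action on it.
    """
    sv_a = saving_level(t_alt)
    return sum(CRITERION_WEIGHT[weights[c]] * LINK_VALUE[g]
               * sv_a * saving_level(t_crit[c])
               for c, g in links.items())


# Hypothetical map of one participant for the alternative "slim".
p1_weights = {"easy-to-use": "middle", "# of documents": "high"}
slim_links = {"easy-to-use": "++", "# of documents": "+"}

print(score(p1_weights, slim_links))  # 0.5 * 10 + 1.0 * 8 = 13.0
print(mem_score(p1_weights, slim_links, t_alt=10,
                t_crit={"easy-to-use": 1, "# of documents": 10}))
```

With these definitions, a node that has just been touched keeps its full contribution, while a node untouched for ten minutes is scaled by roughly 0.65, so recently operated alternatives and criteria dominate the ranking.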

Fig. 6. The overview of DMA (DM assistant). [Left side: the chart “Display overall and user’s evaluation”, which compares “Overall” and “You” for each alternative, followed by the collapsed charts “Alternatives with criteria”, “Radar chart on criteria” and “Radar chart on alternatives”. Right side, “Current state”: AKIYOSHI-DAI and YUHUIN obtain high scores. AKIYOSHI-DAI and YUHUIN are valued in terms of “summer-like” and “price”, respectively. “Feedback on overall opinion”: Which is better in terms of “distance”, AKIYOSHI-DAI or YUHUIN? How about discussing “summer-like”? “Feedback on user’s opinion”: a sentence beginning “You value”.]

The final score of an alternative $A_j$ is computed as follows:

$$FS(A_j) = \frac{\sum_{i}^{N_p} \mathrm{MemScore}(P_i, A_j)}{N_p} \tag{5}$$

where $N_p$ is the number of participants in the discussion.

3.3 Discussion Map with Assistant: DMA

Our system contains a support function for discussions, Discussion Map with Assistant (DMA). The purpose of DMA is to support divergent thinking of each participant in the middle stage of a discussion and consensus-building in the last stage of the discussion. DMA generates some charts and sentences as a current summary in a discussion. Charts and sentences are generated from discussion maps of each participant on the basis of the consensus score computed by Eq. (5) in Sect. 3.2. Figure 6 shows the interface of DMA. This example is an output of a participant about a discussion topic, “Where is the best location for a travel with laboratory members?” The left side of DMA consists of four types of charts³. They are opened and closed by using the icons “+” and “−”, respectively. The right side of DMA consists of three sentence parts; a summary of the consensus candidates on the current state, feedback based on alternatives and criteria with high scores from all participants and feedback based on alternatives and criteria with high scores from the user that sees this DMA interface.

³ Note that three of them are hidden in this figure.

Fig. 7. Charts generated from DMA.

The roles of each chart and sentence are as follows:

1. Overall and user’s evaluation (the top-left in Fig. 6): The purpose of this chart is to understand the whole discussion and own opinions. In addition, the user understands the gap between the majority view and his/her opinion. For example, in Fig. 6, the user’s preference, namely HUIS TEN BOSCH, is not valued by other participants while a candidate AKIYOSHI-DAI is valued by other participants. 2. Alternatives with criteria ((a) in Fig. 7): This chart is used in the last stage of the discussion. Some alternatives often contain high scores in the last stage. As a result, participants struggle to determine the final decision. In this situation, each user can understand the evaluation points of each alternative by using this chart. For example, assume that the participants need to select AKIYOSHI-DAI or YUHUIN in (a) of Fig. 7. From the chart, we can understand that the criterion “distance” is the approximately same value. In other words, this situation implies that “distance” may be the most important criterion to select one from two alternatives because it becomes clear about comparative merits and demerits of two alternatives if the criterion has a difference between them. This becomes a good trigger for the discussion. In addition, DMA displays his/her ratio of criteria of each alternative by clicking the button on the topside, “overall” and “you”. By using this own chart, he/she can organize his/her thoughts. 3. Radar chart on criteria ((b) in Fig. 7): Participants can easily compare all criteria about selected alternatives by using this chart. In this example, the most different points between AKIYOSHI-DAI and YUHUIN are visualized, namely “summer-like” vs. “price” in two alternatives.


R. Kirikihira and K. Shimada

4. Radar chart on alternatives ((c) in Fig. 7): Participants often want to know the evaluation values of all alternatives when they discuss some criteria. This chart is used in that situation. Each participant can easily understand the differences among all alternatives on the selected criteria. In this example, the score of TSUNOSHIMA for “distance” is higher than the others and that of YUHUIN for “price” is higher than the others.
5. Sentence feedback to the user (the right side in Fig. 6): This process is based on a template-based sentence generation scheme. To explain the current state, our system first selects the top alternative based on the score of Eq. (5). Then, it searches for another alternative whose score is close⁴ to that of the best alternative. The templates are:
– [A1] and [A2] obtain high scores.
– [A1] and [A2] are valued in terms of [C1] and [C2], respectively.
where Ai is an alternative and Ci is a criterion related to Ai.

The purpose of the feedback on the overall opinion is to provide a trigger that leads to a good final decision. Therefore, our system suggests a topic, namely a good criterion, for the two alternatives. The template is:
– Which is better in terms of [Cvar], [A1] or [A2]?
where Cvar is a criterion that links to both A1 and A2 and whose score variance is the smallest among all criteria.

To encourage a wide-ranging discussion, our system also suggests another criterion that is barely mentioned in the discussion, if such a criterion exists. The template is:
– How about discussing [Clow]?
where Clow is the criterion whose number of links to alternatives is the lowest in the discussion maps. The criterion must, however, link to more than half of the alternatives. If none of the conditions mentioned above for feedback on the overall opinion is satisfied, the system outputs the sentence “Let’s keep up the discussion.”

Our system also generates feedback to individuals. It points out the gap between the user’s preference and the current consensus by using the following templates:
– You value [Aown] highly although the current consensus is [A1].
– How about providing comments about [Cgap1] and [Cgap2] to other participants?
where Aown is the best alternative of the user (Aown ≠ A1). Cgap1 and Cgap2 are criteria to which the user gives high scores and whose scores are smaller than the scores computed from all participants. Our system also generates “Let’s keep up the discussion.” if these conditions are not satisfied.
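The template selection for the overall opinion described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `overall_feedback`, the dictionary layouts, the reading of "5% or less" as relative to the top score, and the choice of the highest-scored criterion as C1 and C2 are all assumptions made for the example.

```python
from statistics import pvariance

FALLBACK = "Let's keep up the discussion."


def overall_feedback(consensus, links, close_ratio=0.05):
    """Fill the overall-opinion templates (hypothetical sketch).

    consensus: {alternative: consensus score}, a stand-in for Eq. (5).
    links: {alternative: {criterion: score}}, an assumed view of the
        criteria linked to each alternative in the discussion maps.
    close_ratio: "5% or less", read here as relative to the top score
        (an assumption; the paper does not state the reference value).
    """
    sentences = []
    ranked = sorted(consensus, key=consensus.get, reverse=True)
    if len(ranked) >= 2:
        a1, a2 = ranked[0], ranked[1]
        top = consensus[a1]
        close = top > 0 and (top - consensus[a2]) / top <= close_ratio
        if close and links.get(a1) and links.get(a2):
            # C1, C2: assumed to be the highest-scored criterion of each alternative.
            c1 = max(links[a1], key=links[a1].get)
            c2 = max(links[a2], key=links[a2].get)
            sentences.append(f"{a1} and {a2} obtain high scores.")
            sentences.append(
                f"{a1} and {a2} are valued in terms of {c1} and {c2}, respectively."
            )
            # C_var: shared criterion whose scores vary least between A1 and A2.
            shared = sorted(set(links[a1]) & set(links[a2]))
            if shared:
                c_var = min(
                    shared, key=lambda c: pvariance([links[a1][c], links[a2][c]])
                )
                sentences.append(f"Which is better in terms of {c_var}, {a1} or {a2}?")
    # C_low: fewest links, but linked to more than half of the alternatives.
    counts = {}
    for criteria in links.values():
        for c in criteria:
            counts[c] = counts.get(c, 0) + 1
    eligible = sorted(c for c, n in counts.items() if n > len(links) / 2)
    if eligible:
        c_low = min(eligible, key=counts.get)
        sentences.append(f"How about discussing {c_low}?")
    return sentences or [FALLBACK]


# Made-up scores, using alternative and criterion names from the paper's example.
example_consensus = {"AKIYOSHI-DAI": 0.50, "YUHUIN": 0.48, "TSUNOSHIMA": 0.20}
example_links = {
    "AKIYOSHI-DAI": {"summer-like": 0.9, "distance": 0.5, "food": 0.3},
    "YUHUIN": {"price": 0.8, "distance": 0.5, "food": 0.6},
    "TSUNOSHIMA": {"distance": 0.7},
}
for sentence in overall_feedback(example_consensus, example_links):
    print(sentence)
```

Keeping the templates as plain format strings reflects the point made later in the paper that a template-based scheme is easy to port to other languages: only the strings would need to change.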

4

Experiment

In this paper, we evaluate two parts: the accuracy of the consensus estimation explained in Sect. 3.2 and the effectiveness of DMA explained in Sect. 3.3.

⁴ The difference of the score is 5% or less.


Table 1. The final numbers of alternatives and criteria.

              D1   D2   D3   D4
Alternatives   7    9   12    6
Criteria       5    5    4    5

Table 2. The experimental result: RMSE for all ranks.

Method     D1     D2     D3     D4     Ave
Baseline   0.76   2.26   1.58   0.82   1.35
Proposed   0.00   2.11   1.58   0.58   1.07

4.1

Evaluation of Consensus Estimation

We evaluated the consensus estimation method with four real discussions (D1, D2, D3, and D4). Each discussion involved four participants and lasted 20 min. The topics of the discussions were:

(T1) Where is the best location for a seminar camp?: D1 and D2
(T2) What store/restaurant/cafe do you want in this university?: D3 and D4

Each participant created a DM in the discussion. The statistics of the DMs in each discussion are shown in Table 1. The participants made the decision at the end of the discussion. After the discussion, each participant ranked each alternative. We evaluated our method with the root mean square error between the ranks given by the consensus estimation method and the participants’ ranks, as follows:

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(SysRank_i - PartRank_i\right)^2} \qquad (6)$$

where N is the number of alternatives in the discussion. SysRank_i and PartRank_i are the i-th ranked alternative by our system and the i-th ranked alternative by participants⁵, respectively. We compared our method with a baseline, which is a method without the forgetting function, namely Eq. (5) using Eq. (1) instead of Eq. (4). Table 2 shows the experimental result. Our method slightly improved the RMSE as compared with the baseline (1.07 vs. 1.35 on average). However, the difference was not statistically significant.

⁵ This is based on the average ranks among participants.

Alternatives in the low ranks are not essentially important for evaluating the methods. Therefore, we also evaluated these methods with the top 3 alternatives:

$$Top3Acc = \frac{\sum_{i=1}^{3} agree(SysRank_i, PartRank_i)}{3} \qquad (7)$$

where SysRank_i and PartRank_i are the same as in the RMSE calculation. The function agree(SysRank_i, PartRank_i) is 1 if there is agreement between SysRank_i and PartRank_i, and 0 otherwise. Table 3 shows the result. The proposed method obtained a higher accuracy than the baseline on average (0.750 vs. 0.333). However, McNemar’s test gave p = 0.125, so this result does not show a statistically significant difference. We confirmed the limited success of our method with the forgetting function. We need to evaluate our method with a larger test dataset.

Table 3. The experimental result: accuracy of top 3.

Method     D1      D2      D3      D4      Ave
Baseline   0.000   0.333   0.667   0.333   0.333
Proposed   1.000   0.333   0.667   1.000   0.750

4.2

Evaluation of DMA

We evaluated the effectiveness of our discussion map system with an assistant function (DMA), described in Sect. 3.3, with four groups (G1, G2, G3, and G4). The participants in each group were university students not related to this work. Each group discussed two topics (T3 and T4)⁶.

(T3) Decide one product that boosts sales of a convenience store. As a condition, the target convenience store is one that all participants know.
(T4) Decide one prefecture where you want to travel with a foreign student. As conditions, it is a trip of three days and two nights and there is no limit on traveling expenses.

To evaluate the effect of the presence or absence of our system with DMA, each group discussed one topic with our system⁷ and the other topic without our system, namely a normal discussion without any system. To avoid bias based on the order of the presence or absence of our system, each group discussed in the order shown in Table 4.

⁶ As conditions for the final decision, each discussion needs more than four alternatives and more than two criteria.
⁷ The assistant function, DMA, becomes active after five minutes, although groups with our system can use the DM system from the start.

Table 4. The experimental settings.

GroupID   1st discussion      2nd discussion
G1        T3 without system   T4 with system
G2        T4 without system   T3 with system
G3        T4 with system      T3 without system
G4        T3 with system      T4 without system

Although there are many possible evaluation points for the system, we evaluated its effectiveness on the following points: (1) time until the final decision, (2) the satisfaction level about the discussion, and (3) the satisfaction level about the final decision. The range of the satisfaction levels was 1 (bad) to 10 (good).

Table 5 shows the discussion time until the final decision for each group. The average total time shows that discussions without our system reached a solution faster than those with our system (24.5 vs. 26.5 min). Therefore, it seems that our system did not work well. However, the discussion time depends on the number of alternatives: if the number of alternatives is larger, the time to discuss them naturally becomes longer. Therefore, we calculated the average discussion time per alternative (the numbers in parentheses in the table). From the table, our system reduced the discussion time per alternative (3.64 vs. 3.02 min). In addition, our system contributed to an increase in the number of alternatives that participants discussed⁸. This is a good point for reaching the final decision that participants want. Visualizing alternatives and criteria by using our discussion map system led to an improvement of the decision-making environment.

⁸ For instance, for G1, the number of alternatives with our system was 9 (19/2.11) while that without our system was 7 (21/3).

Table 5. The experimental result: total time (minutes). The numbers in parentheses denote time per alternative.

GroupID   Without       With
G1        21 (3.00)     19 (2.11)
G2        24 (4.00)     33 (3.30)
G3        23 (2.56)     21 (3.00)
G4        30 (5.00)     33 (3.67)
Average   24.5 (3.64)   26.5 (3.02)

Next, we discuss the satisfaction levels of the test subjects. Table 6 shows the results; sat.discussion denotes the satisfaction level about the discussion and sat.decision denotes the satisfaction level about the final decision.

Table 6. The experimental result: the satisfaction levels about the discussion and the final decision. The dagger denotes a significant difference on the T-test.

          sat.discussion     sat.decision
GroupID   Without   With     Without   With
G1        8.25      9.00     9.00      9.75
G2        6.50      4.75     8.25      8.75
G3        4.25      6.00     7.25      8.75
G4        7.00      5.50     6.75      7.25
Average   6.50      6.31     7.81      8.63†

For the satisfaction level about the proceedings and the contents of the discussion from the test subjects (sat.discussion), there is no difference between discussions with and without our system (6.50 vs. 6.31). The role of the current system was to support the consensus-building process. Therefore, the current system did not always contribute to the early stage of each discussion, namely stimulating divergent thinking. In addition, DMA occasionally did not work well as an assistant. DMA generates feedback comments when there are some alternatives with high scores or there are gaps between overall and individual preferences. In other words, the system just outputs “Let’s keep up the discussion” when none of the conditions mentioned above is satisfied. This led to a decrease of the satisfaction level about the discussion, as a support system. As a result, the system did not obtain higher scores on this evaluation point, sat.discussion.

On the other hand, we obtained a good result for the satisfaction level about the final decision (sat.decision). There was a significant difference between the situations with and without our system (p = 0.041 on the T-test). The reason why the sat.decision score improved was that each participant could understand the difference between his/her own thoughts and the whole opinion via the outputs from DMA. DMA contributed to a loose sharing of opinions, in a positive sense. This result shows the effectiveness of our system with an assistant function for group discussion.

After the experiment, we took a survey of the test subjects. The following are the positive and negative opinions:

Positive opinions:
– It helped to easily understand the own ranking and the whole ranking for an alternative.
– DMA’s suggestion was sometimes effective to break the silence in the stagnant discussion.
– Visualization was effective to be clear about own thoughts.

Negative opinions:
– Sometimes, I struggled to concentrate on the discussion because of operations for the DM system.
– It was difficult to understand the map when the number of nodes became larger.

We obtained positive opinions about the visualization and the assistant function. On the other hand, the operability of our system needs to be improved. This is important future work.
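The evaluation measures used in this section, namely the RMSE of Eq. (6), the top-3 accuracy of Eq. (7), and the time per alternative reported in Table 5, can be sketched in a few lines. The function names are hypothetical, and two readings are assumed here: that the RMSE compares the rank positions that the system and the participants assign to the same alternative, and that agree(·,·) compares the alternatives found at the same position in the two orders.

```python
import math


def rmse(sys_ranks, part_ranks):
    """Eq. (6): root mean square error between two lists of rank positions."""
    if len(sys_ranks) != len(part_ranks) or not sys_ranks:
        raise ValueError("rank lists must be non-empty and of equal length")
    n = len(sys_ranks)
    return math.sqrt(sum((s - p) ** 2 for s, p in zip(sys_ranks, part_ranks)) / n)


def top3_accuracy(sys_order, part_order):
    """Eq. (7): share of the top-3 positions on which both orders agree."""
    return sum(s == p for s, p in zip(sys_order[:3], part_order[:3])) / 3


def time_per_alternative(total_minutes, n_alternatives):
    """Value in parentheses in Table 5: discussion time divided by alternatives."""
    return total_minutes / n_alternatives


# G1 in Table 5: 21 min over 7 alternatives without the system,
# 19 min over 9 alternatives with it (numbers taken from footnote 8).
print(time_per_alternative(21, 7), round(time_per_alternative(19, 9), 2))
```

Normalizing by the number of alternatives is what reverses the comparison in Table 5: the raw totals favor discussions without the system, while the per-alternative values favor discussions with it.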

5

Conclusions

In this paper, we proposed a tool for supporting consensus-building in conversations with multiple participants. Our method estimated the ranking of alternatives in each discussion by using the discussion maps. We introduced a forgetting function into the consensus estimation model. In the experiment, the method achieved limited success as compared with a naive score calculation.


We also implemented feedback functions for participants during discussions: charts and sentences. Visualizing information about the current state of the discussion was useful for the participants in making the decision. Although we evaluated our system with sat.decision and sat.discussion in this paper, other evaluation points also need to be examined. The input for estimating the consensus in the current system is the discussion map on tablet terminals or laptop PCs. However, conversations contain many characteristics, both verbal and nonverbal. We have studied several aspects of discussion situations, such as top-view images [14] and facilitators’ behaviors [16]. Integrating this discussion map system with these characteristics, namely a multi-modal interpretation approach, is interesting future work. In the experiment, the topics of the discussion were just a case study. Evaluation in real PBL situations or in decision-making tasks is also important future work. The current system handles Japanese. However, the techniques in the system are easily extensible to other languages. The discussion map system itself is essentially language-independent. The assistant function for sentence generation is based on a template-based generation scheme. Therefore, we can easily create templates for other languages. Experiments with a multilingual system are interesting future work.

Acknowledgment. This work was supported by JSPS KAKENHI Grant Number 17H01840.

References

1. Alonso, S., Herrera-Viedma, E., Cabrerizo, F.J., Chiclana, F., Herrera, F.: Visualizing consensus in group decision making situations. In: IEEE International Conference on Fuzzy Systems, FUZZ-IEEE 2007, pp. 1–6 (2007)
2. Bautista, J., Carenini, G.: An integrated task-based framework for the design and evaluation of visualizations to support preferential choice. In: Proceedings of AVI 2006, pp. 217–224 (2006)
3. Ebbinghaus, H.: Memory: A Contribution to Experimental Psychology. Dover Publications, New York (1885)
4. El-Assady, M., Hautli-Janisz, A., Gold, V., Butt, M., Holzinger, K., Keim, D.: Interactive visual analysis of transcribed multi-party discourse. In: Proceedings of ACL 2017, System Demonstrations, pp. 49–54 (2017)
5. Gratzl, S., Lex, A., Gehlenborg, N., Pfister, H., Streit, M.: LineUp: visual analysis of multi-attribute rankings. IEEE Trans. Vis. Comput. Graph. 19(12), 2277–2286 (2013)
6. Hmelo-Silver, C.E.: Problem-based learning: what and how do students learn? Educ. Psychol. Rev. 16, 235–266 (2004)
7. Ito, T., Imi, Y., Ito, T., Hideshima, E.: COLLAGREE: a facilitator-mediated large-scale consensus support system. In: Proceedings of the 2nd Collective Intelligence Conference (2014)
8. Ito, T., Imi, Y., Sato, M., Ito, T., Hideshima, E.: Incentive mechanism for managing large-scale internet-based discussions on COLLAGREE. In: Proceedings of the 3rd Collective Intelligence Conference (2015)
9. Katsura, Y., Okada, S., Nitta, K.: Dynamic argumentation support tool using argument diagram. In: Proceedings of the 29th Annual Conference of the Japanese Society for Artificial Intelligence (2015). (in Japanese)
10. Masukawa, H.: Development of the reflective collaboration note: ReCoNote. In: Proceedings of the 29th Annual Conference of JSET (2013). (in Japanese)
11. Miyake, N., Shirouzu, H.: The dynamic jigsaw: repeated explanation support for collaborative learning of cognitive science. In: The Meeting of the 27th Annual Meeting of the Cognitive Science Society (2005)
12. Nagao, K.: Meeting analytics: creative activity support based on knowledge discovery from discussions. In: Proceedings of the 51st Hawaii International Conference on System Sciences, pp. 820–829 (2018)
13. Nagao, K., Kaji, K., Yamamoto, D., Tomobe, H.: Discussion mining: annotation-based knowledge discovery from real world activities. In: Aizawa, K., Nakamura, Y., Satoh, S. (eds.) PCM 2004. LNCS, vol. 3331, pp. 522–531. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30541-5_64
14. Sakaguchi, K., Shimada, K.: Cooperation level estimation of pair work using top-view image. In: Kim, S., Jung, J.-W., Kubota, N. (eds.) Soft Computing in Intelligent Control. AISC, vol. 272, pp. 77–87. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-05570-1_9
15. Scardamalia, M., Bransford, J., Kozma, B., Quellmalz, E.: New assessments and environments for knowledge building. In: Griffin, P., McGaw, B., Care, E. (eds.) Assessment and Teaching of 21st Century Skills, pp. 231–300. Springer, Dordrecht (2012). https://doi.org/10.1007/978-94-007-2324-5_5
16. Shiota, T., Yamamura, T., Shimada, K.: Analysis of facilitators’ behaviors in multi-party conversations for constructing a digital facilitator system. In: Proceedings of the 10th International Conference on Collaboration Technologies (2018)
17. Suzuki, H., Funaoi, H., Kubota, Y.: Supporting “assemble & disperse” style collaborative learning using tablet terminals. Technical report of IEICE-ET2013-26, pp. 41–46 (2013). (in Japanese)
18. Takagi, H., Shimada, K.: Understanding level estimation using discussion maps for supporting consensus-building. Procedia Comput. Sci. 35, 786–793 (2014)
19. Villalon, J.J., Calvo, R.A.: Concept map mining: a definition and a framework for its evaluation. In: IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology 2008, pp. 357–360 (2008)
20. Yamasaki, K., Fukuda, H., Hirashima, T., Funaoi, H.: Kit-build concept map and its preliminary evaluation. In: Proceedings of the 18th International Conference on Computers in Education, ICCE 2010, pp. 290–294 (2010)

An Integrated Support System for Disaster Prevention Map-Making Using Town-Walk Information Gathering

Sojo Enokida¹, Takashi Yoshino¹(B), Taku Fukushima², Kenji Sugimoto², and Nobuyuki Egusa¹

¹ Wakayama University, Sakaedani 930, Wakayama-City, Japan
[email protected], [email protected]
² Osaka Institute of Technology, Osaka 573-0196, Japan

Abstract. Throughout Japan, numerous disaster prevention maps have been made over time using the town-walk method. This method assists in disaster prevention and enables a greater understanding of the disaster prevention field. The development of these maps increases disaster awareness, improves the self-help ability of individuals, and increases cooperation amongst local communities. Currently, there are support systems for developing conventional disaster prevention maps; however, none has been designed for the town-walk method. Therefore, in this paper, we present an integrated support system for making disaster prevention maps using the information gathered by the town-walk method. We conducted a comparison experiment between the proposed system and the conventional method (using paper maps). This experiment demonstrated that a consistent support function would be effective in developing disaster prevention maps using the town-walk method. We also confirmed that the proposed system improves disaster awareness among individuals and their understanding of the area as well as the conventional method does.

Keywords: Disaster prevention map · Town-walk · WebGIS · Disaster prevention awareness

1

Introduction

The Great East Japan earthquake caused damage to the local governments. This earthquake paved the way for the concept of “public help from the central and local governments” in a wide-area disaster. It is often said that disaster-prevention measures should ideally be a combination of public help from central and local governments, mutual help from local communities, and every individual’s self-help. Map-making using the town-walk technique is held in various places in Japan. Its purpose is to improve the awareness of disaster prevention among the participating people and to help them understand the area where they live. It has been confirmed that the making of disaster prevention maps contributes to better awareness regarding disaster prevention among individuals. It can be expected to improve self-help and mutual help in local communities. Disaster prevention map-making is not just about collecting information; it is also about improving the awareness of the whole region through communication, creation of disaster prevention maps, and presentations at workshops.

The following steps constitute the rough procedure of making a disaster prevention map using the town-walk method:

– While walking through a town, take pictures with a digital camera and jot down relevant information on a map or in a notebook.
– The individuals organize the information gathered during the town walk by making a disaster prevention map on the desk.
– The representatives of the group present the completed map.

The problems faced while making a disaster prevention map are as follows:

– Since the making process of the disaster prevention map is time-bound, the time to organize the information is restricted.
– Even if a lot of information is collected while walking through a town, the information that is finally put on the map is limited by the time limitation and the area restriction of the paper map.
– It is difficult for all individuals to take the paper map home, which makes its reusability low.

There are many WebGIS systems that support disaster prevention maps. However, no system has been proposed that provides consistent support for information gathering during town walks, disaster prevention map-making, and presentation. Therefore, we have developed an integrated support system to facilitate the making of disaster prevention maps using the information gathered by town walking. The goal of this research is to improve the disaster prevention awareness of individuals while they make a disaster prevention map using this method. The purpose of this paper is to provide consistent work support for the efficient creation of a disaster prevention map through this system.

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 19–34, 2018. https://doi.org/10.1007/978-3-319-98743-9_2

2

Related Works

From the studies conducted by Ushiyama [1], it has been confirmed that developing disaster prevention maps using conventional paper maps improves the disaster prevention awareness of individuals. According to Ushiyama, map-making can be expected to produce the following effects:

1. Individuals assume a disaster on a map and learn about the possibility of damage.
2. Individuals learn about the necessity and method of evacuation in the event of a disaster.
3. Individuals share information through discussion.


With our proposed system, we expect to raise the same level of awareness regarding disaster prevention as with paper maps. Many systems have been proposed to provide disaster prevention information using WebGIS [2–4]. However, none of these systems is intended for the creation of a disaster prevention map through town walking. Some systems are experimental or simply apply WebGIS to real-world fields [5–8]. These studies have developed GISs that can be used in the community; however, they provide only limited support for the development of disaster prevention maps. The proposed system supports the whole process, from the gathering of information by town walking, through the making of the disaster prevention map, to the final presentation.

3

An Integrated Support System for Disaster Prevention Maps

3.1

Outline of the Integrated Support System for Disaster Prevention Map

The term “integrated support” in the integrated support system for disaster prevention maps means supporting the whole process: collecting disaster prevention information while walking around a town, organizing that information, and putting together information regarding evacuation routes for the presentation. The system is assumed to be used in disaster prevention map-making events sponsored by a voluntary disaster prevention organization. The number of participants is expected to be about 20 to 30. In addition, the range of the town walk is assumed to be what individuals can cover on foot in about an hour.

3.2

System Configuration

Figure 1 shows the system configuration of the integrated support system for disaster prevention maps. This system is a WebGIS web application that uses HTML5 and JavaScript and runs in the browser. The server side consists of the disaster prevention information server, built with PHP and PostgreSQL (PostGIS), and the Google Maps server. When the user registers any disaster prevention information, it is stored in the disaster prevention information server. The disaster prevention information sent by the disaster prevention information server and the map data sent from the Google Maps server are combined and displayed on Google Maps on the user’s device via the Google Maps JavaScript API. The web application works wherever a browser is available, so it can be operated on a smartphone, tablet, or PC.

When walking through a town, individuals use the registration function to register disaster prevention information and photos on their personal smartphones and tablets. When creating a disaster prevention map, individuals can modify the information they have already registered using the editing function, register the evacuation route using the registration function, and register additional detailed information, which could not be entered at the time of the town walk, with the word-of-mouth function. Afterwards, the individuals give a presentation using the disaster prevention map that they have created.

In addition, the registration of disaster prevention information is assisted by location, hazard map, and elevation display functions. The location display function reduces the need for users to find their current location. The hazard map display function makes it easy to understand the damage caused by a disaster in the region. The elevation display function is especially used to learn about the possibility of flooding. Furthermore, the function to modify the registered disaster prevention information has an editing function and a coupling function.

Fig. 1. System configuration.

3.3

Registration Function of Disaster Prevention Information

There are four types of disaster prevention information that can be registered: “placemark”, “line”, “area”, and “text.” A “placemark”, “line”, or “area” is registered as disaster prevention information together with the name of the information, the name of the registrant, and, if necessary, a photograph. Figure 2 shows examples of disaster prevention information. The user chooses the type of disaster prevention information to enter on the screen in Fig. 2(a). The user then taps the point on the map where the information is to be placed, and a mark is placed at that location. Figure 2(b) is a registration example of a shelter.
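The registered items described above could be modeled roughly as follows. This is only a sketch for illustration: the actual system is a PHP/PostgreSQL (PostGIS) web application whose schema is not given in the paper, so the class name `DisasterInfo`, its fields, the minimum point counts per type, and the coordinates are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

LatLng = Tuple[float, float]  # (latitude, longitude)

# Assumed minimum number of map points needed to draw each type.
MIN_POINTS = {"placemark": 1, "text": 1, "line": 2, "area": 3}


@dataclass
class DisasterInfo:
    """One registered item of disaster prevention information (hypothetical)."""

    kind: str                    # "placemark", "line", "area", or "text"
    name: str                    # name of the disaster prevention information
    registrant: str              # person who registered the item
    points: List[LatLng]         # tapped map positions
    photo: Optional[str] = None  # photo file name, attached only if necessary

    def __post_init__(self):
        if self.kind not in MIN_POINTS:
            raise ValueError(f"unknown type: {self.kind}")
        if len(self.points) < MIN_POINTS[self.kind]:
            raise ValueError(
                f"{self.kind} needs at least {MIN_POINTS[self.kind]} point(s)"
            )


# Illustrative values only; the coordinates are made up.
shelter = DisasterInfo("placemark", "Shelter", "participant A", [(34.0, 135.0)])
route = DisasterInfo(
    "line", "Evacuation route", "participant A", [(34.0, 135.0), (34.001, 135.002)]
)
print(shelter.kind, len(route.points))
```

Validating the point count at construction time mirrors the tap-to-place interaction: a line or area only becomes meaningful once enough points have been tapped on the map.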

Fig. 2. Examples of disaster prevention information: (a) sign registration screen, (b) shelter registration, (c) line registration, (d) area registration.

Figure 2(c) shows an example of line registration. Figure 2(d) shows an example of area registration.

3.4

Word of Mouth Function

Additional information, such as photos and comments (word of mouth), can be added to the disaster prevention information registered by a user. Figure 3 shows examples of the comment (word-of-mouth) registration function. In Fig. 3(a), a balloon window is placed at a dangerous point. Tapping this window opens a word-of-mouth screen like that in Fig. 3(b), where the post form can be used to register information and photos. The registered word of mouth is displayed in a timeline format, as shown in Fig. 3(c).

3.5

Expected Effects of Disaster Prevention Maps

By using this system, we can expect the following effects from supporting each stage of town-walk disaster prevention map-making.

– Gathering information by walking around the town: When walking through a town, individuals can register their location information and disaster prevention information on the spot using a smartphone. At the time of developing the disaster prevention map, individuals can confirm and edit the registered information and compensate for any lack of information.
– Disaster map making: Since the information is entered while walking in the town, the individuals can use the work time to confirm the registered information, to revise it, and to add missing information. Moreover, since there is no restriction on the area, unlike paper maps, it is possible to add a lot of disaster prevention information and upload multiple photos, thereby increasing the amount of information on the disaster map.
– Presentation: Unlike a paper map, this map can be enlarged, so the individuals can make a presentation that is easier to understand and explain while confirming the registered information.

Fig. 3. Examples of a comment (word of mouth) registration: (a) icon showing dangerous point, (b) post form, (c) timeline.

4

Experiment

4.1

Purpose of the Experiment

The purpose of the experiment is as follows:

– To find out whether the proposed system can achieve the same effect as that of making a disaster prevention map using conventional paper maps.
– To find out how effective the proposed system is compared with the conventional paper map (conventional method).

4.2

Group Composition

We conducted an experiment in the Susa district, Wakayama City, Wakayama Prefecture, on April 15, 2017. The 16 participants in the experiment were students from Wakayama University, of whom 9 were graduate students. For the experiment, we divided the 16 students into four groups: two groups made a disaster prevention map using a conventional paper map (Paper group) and the remaining two groups made a disaster prevention map using the proposed system (System group). In the experiment, we had the cooperation of one adviser and four residents of the district (hereafter, guides). One guide was assigned to each group, and one adviser was assigned to the making of the disaster prevention maps.


4.3


Experimental Procedure

The experiment was carried out according to the procedure of making a real disaster prevention map. The procedure followed on the day of the experiment is as follows:

1. Explanation of disaster prevention maps by the adviser (about 10 min): The adviser explained the assumed damage from a landslide disaster in the district, as would be done when making a normal disaster prevention map, and then explained the procedure for making a disaster prevention map.
2. Group meeting (about 10 min): Each group confirmed in advance how to walk through the district and the kinds of disaster information available.
3. Gathering information by walking around the town (approx. 1 h): The individuals collected disaster information while listening to the guide. A paper map was given to the groups carrying out the experiment using the conventional method (paper groups). The other groups registered the disaster information on their smartphones.
4. Creating a disaster map on the desk (about 1 h): Each paper group used an A2-size paper map and each system group used one notebook PC to create a disaster prevention map. The notebook PC and the smartphones run the same web system; hence, there is no difference in their functions and displayed content.
5. Presentation of the disaster map by representatives of each group (approximately 5 min per group): The paper groups put a paper map on the wall and gave a presentation. The system groups projected the system screen with a projector and presented it.
6. Comments by the adviser and guides (approximately 10 min).
7. Completion of the questionnaire (approximately 10 min).

4.4

About the Disaster Map that Has Been Created

Figure 4 shows the map made by Paper group A using a paper map. On this disaster map, many photographs are arranged on an outline map. The evacuation route is drawn with a color pen, and the disaster prevention information is presented using sticky notes. The paper group used red sticky notes for earthquake-related information, light blue sticky notes for water-related information, and green sticky notes for information on safety and countermeasures. Figure 5 shows the disaster prevention map of System group A. In the wide-area display shown in Fig. 5, many place-marks overlap. However, because the proposed system can change the display scale, the disaster prevention information can be viewed by zooming in to a narrow-area display of the region where the marks overlap.


Fig. 4. Disaster prevention map (Paper group A). (Color figure online)

4.5 The Difference Between the Work of the System Group and the Paper Group

This section describes the differences between the work of the system groups and the paper groups.

The Difference in Information Gathering During Town Walking. During the town walk, the paper group made notes on the paper map when disaster information was found, while one member of the group took pictures with a digital camera. The system group registered information in the system directly at the spot where the disaster prevention information was found. As a registration procedure, one member of the group registered a place-mark and everyone in the group wrote down information using the word-of-mouth function. The members of the system group used their smartphone cameras to take pictures and uploaded them. Figure 6 shows the system group during the town walk. The person on the left side of Fig. 6 is registering or viewing disaster prevention information, and the person on the right is taking pictures with a smartphone. The paper group was able to mark only rough positions because they had to carry a large map. The system group registered the disaster information using the location display function while walking in the town.

Differences in the Creation of a Disaster Map on a Desk. The paper group worked on an A2-size paper map and summarized the disaster information noted while walking around the town. At this time, the participants chose photographs taken with the digital camera, printed them, and pasted them onto the paper map. They used sticky notes to show the main disaster information, and recorded the evacuation route and dangerous areas with a color pen.


Fig. 5. Disaster prevention map (System group A).

The time needed by the paper group to print the photographs became an issue, and the number of photos that they were able to paste on the paper map was limited. Because the system group had entered the disaster information during the town walk, they only had to correct this information. Moreover, since the system group, unlike the paper group, had no space constraint when entering additional information using the word-of-mouth function, a lot of information was entered. Numerous photos were also included using the word-of-mouth function. A paper map is not easy to correct once information has been written on it, whereas information in the system is easy to modify. The system group also used the Street View function in Google Maps to recheck the disaster prevention information. Within the system groups, a difference was observed in how each group worked. System group A used a laptop computer and discussed the disaster prevention map. System group B did not have a lot of disaster information on the map at first; two of the members used a laptop computer to enter detailed information, and the remaining two members registered information using the word-of-mouth function from their smartphones. Figure 7 shows System group B at work.

Differences in the Presentation. The system group gave the presentation while zooming the map in and out on the system. Specifically, the presenter showed a narrow-area view when explaining a part with a lot of disaster information, and showed a wide-area view when it was necessary to show the whole district, for example to show the evacuation route.


Fig. 6. Photograph of System group town-walking.

Fig. 7. A photograph of disaster prevention map making (System group B).

In the paper group, the presenter looked at the map, recalled what information it contained, and was able to present it as a story. In the system group, on the other hand, a lot of information was registered as word-of-mouth, so the presenter also had to check the information while presenting it. It was easy for the audience to confirm the information because the system could show the images during the presentation. In addition, System group A displayed the map using the satellite photograph of Google Maps, on the assumption that it would be easier to understand visually.

Differences in the Amount of Information on Disaster Maps. Table 1 shows the amount of information on the disaster maps of the paper and system groups. Compared with the paper groups, the system groups provided considerably more disaster prevention information.


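Table 1. Amount of information on the disaster prevention maps of the Paper groups and System groups.

| Group | Sticky notes | Line | Area | Photo | Total |
|---|---|---|---|---|---|
| Paper A | 20 | 1 | 3 | 30 | 54 |
| Paper B | 12 | 4 | 8 | 16 | 38 |

| Group | Mark (Word of mouth) | Line | Area | Photo | Total |
|---|---|---|---|---|---|
| System A | 56 (25) | 6 | 3 | 52 | 117 |
| System B | 49 (41) | 5 | 0 | 36 | 90 |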

Paper groups are generally limited in the amount of information they can put on a paper map. They had to select from the disaster information collected while walking through the town, and this selection takes time. Because the time available to make a disaster prevention map was limited, we assume that the disaster prevention information of the paper groups was limited by the need to finish the map within the specified time. The system groups registered the location information and the disaster prevention information at the exact spots while walking through the town, so while making the disaster prevention map they could confirm and correct the registered information and add detail, which may explain the larger amount of information. In this system, additional photos and explanations can be added using the word-of-mouth function, and much additional information was registered in this way.
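As a rough illustration of the comparison above, the following minimal sketch averages the totals reported in Table 1 for each method. Only the four totals come from the table; the dictionary layout and the helper name `mean_total` are assumptions made for this example.

```python
# Reported totals from Table 1 (number of items placed on each group's map).
reported_totals = {
    "Paper A": 54,
    "Paper B": 38,
    "System A": 117,
    "System B": 90,
}


def mean_total(prefix):
    """Average the reported totals of all groups whose name starts with prefix."""
    values = [v for name, v in reported_totals.items() if name.startswith(prefix)]
    return sum(values) / len(values)


paper_mean = mean_total("Paper")
system_mean = mean_total("System")
ratio = system_mean / paper_mean

print(f"Paper mean:  {paper_mean}")
print(f"System mean: {system_mean}")
print(f"System / Paper: {ratio:.2f}")
```

With only two groups per method, this ratio is a descriptive summary and not a statistical result.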

5 Questionnaire Survey

After the experiment, we conducted a questionnaire survey. The subjects of the survey were the 16 students who participated in the experiment. Table 2 shows the results of the questionnaire survey. “System” denotes the system group and “Paper” denotes the paper group. Each question item has a free description column for writing the reason for the evaluation. The distribution, median, and mode (the most frequent value of the evaluations) show that there is no significant difference in the results between the paper and system groups, and that the system has the same degree of effect as the paper map.
– Evaluation: 1: Strongly disagree, 2: Disagree, 3: Neutral, 4: Agree, 5: Strongly agree
– “Evaluation” is the number of people who gave each rating
– Group: System G means a group that used the proposed system. Paper G means a group that used the conventional method.
– The significance probability was calculated by using the Wilcoxon signed rank test.
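The median and mode reported in Table 2 can be recomputed from the rating counts. The sketch below does this for question (1), assuming the counts are listed in order for ratings 1 to 5; the helper names `expand` and `summarize` are illustrative and not part of the original study.

```python
from statistics import median, multimode


def expand(counts):
    """Turn counts for ratings 1..5 into a flat list of individual ratings."""
    return [rating for rating, n in enumerate(counts, start=1) for _ in range(n)]


def summarize(counts):
    """Return (median, list of modes) for one row of rating counts."""
    ratings = expand(counts)
    return median(ratings), multimode(ratings)


# Question (1), "It was easy to use the map": counts for ratings 1..5.
system_q1 = [0, 2, 3, 1, 2]
paper_q1 = [0, 4, 1, 3, 0]

print(summarize(system_q1))
print(summarize(paper_q1))
```

`multimode` returns every most frequent rating, so a row with tied counts yields more than one mode.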

Table 2. Questionnaire survey results.

| Questionnaire | Group | 1 | 2 | 3 | 4 | 5 | Median | Mode value | Significance probability |
|---|---|---|---|---|---|---|---|---|---|
| (1) It was easy to use the map | System | 0 | 2 | 3 | 1 | 2 | 3 | 3 | 0.425 |
| | Paper | 0 | 4 | 1 | 3 | 0 | 2.5 | 2 | |
| (2) It was easy to identify the disaster prevention information on the map | System | 0 | 1 | 2 | 5 | 0 | 4 | 4 | 0.567 |
| | Paper | 0 | 1 | 4 | 3 | 0 | 3 | 3 | |
| (3) It was easy to identify my location on the map | System | 2 | 2 | 3 | 1 | 0 | 2.5 | 3 | 0.547 |
| | Paper | 0 | 3 | 1 | 2 | 0 | 2.5 | 2 | |
| (4) I was interested in the danger and the safety of the region through the making of the disaster prevention map | System | 0 | 0 | 0 | 3 | 5 | 5 | 5 | 0.315 |
| | Paper | 0 | 0 | 0 | 6 | 2 | 4 | 4 | |
| (5) I was able to get to know the town better through the making of the disaster prevention map | System | 0 | 0 | 0 | 3 | 5 | 5 | 5 | 0.0853 |
| | Paper | 0 | 1 | 0 | 6 | 1 | 4 | 4 | |
| (6) I was able to communicate with other participants through the creation of a disaster prevention map | System | 0 | 0 | 1 | 5 | 2 | 4 | 4 | 0.870 |
| | Paper | 0 | 1 | 0 | 4 | 3 | 4 | 4 | |
| (7) I learned about the disaster prevention information by making a disaster prevention map | System | 0 | 0 | 0 | 4 | 4 | 4.5 | 4.5 | 0.180 |
| | Paper | 1 | 0 | 2 | 3 | 2 | 4 | 4 | |

5.1 Ease of Working

For the question items related to ease of working in Table 2, the participants of the system group who answered “Strongly agree” gave the following reasons.
– I am glad I had less work to do, because I could put photos and comments at the actual place.
– I was able to do the work because I could write information while actually checking my position with the GPS.
The participants of the system group who answered “Neutral” gave the following reasons.


– Because the screen of the smartphone was small, I felt that the work was a little hard to do.
– I want to work while grasping the big picture, but the screen is small on a smartphone.
The participants of the paper group who answered “Agree” gave the following reasons.
– I took a lot of pictures, so I was able to look back at the map after I walked.
– It’s easy to understand the roads.
The participants of the paper group who answered “Disagree” gave the following reasons.
– It was difficult to think about the layout when considering the space needed to put the photographs together.
– There were cases where matching a photograph to the relevant place did not go well.

In the paper group, matching a photograph to its position likely takes considerable effort when participants cannot recall what kind of disaster information it showed. In contrast, because the system group can register disaster information at the spot where it is found, the burden of making the disaster prevention map at the desk is reduced. Moreover, because the area of a paper map is limited, participants must make an effort to select the information to be included. The system has no limit on the amount of information that can be registered, so the amount of information did not hinder the ease of work. However, it is difficult to work on the screen of a smartphone.

5.2 Change of Interest in Risk and Safety

For the question item “I was interested in the danger and the safety of the region through the making of the disaster prevention map,” the ratings were high in both groups. The participants of the system group responded as follows:
– Because I was consciously making a map, I noticed places that I would not have noticed if I had only taken a walk. I want to see my own area from this point of view.
– I realized that things I do not usually pay attention to are unexpectedly connected to danger and safety. I decided to try to be more conscious of them in daily life.
– On the roads that I usually use, I had never been conscious of the various dangers along them until I made a disaster map. I thought that it might be better to make a map in a town I do not know, where I have no preconceptions.


The participants of the paper group responded as follows.
– Because I had never looked at the town from the viewpoint of a disaster, I was able to realize that such places were dangerous. I want to look for the same kinds of places around me.
– When I walked around the area with disaster prevention in mind, I felt that unexpectedly dangerous places stood out and that one could have a painful experience there.
– I felt that making a disaster map was a good opportunity to learn about the area.

In both groups, we found that the participants’ interest in disaster prevention increased. Since there is no significant difference between the responses of the system group and the paper group, it can be said that using the system yields the same effect as making a disaster prevention map with a conventional paper map.

6 Effects and Issues of Integrated Support System for Disaster Prevention Maps

This section describes the effects and issues of the integrated support system for disaster prevention maps.

6.1 Effect of Integrated Support System for Disaster Prevention Map

We confirmed that the integrated support system can improve efficiency at each stage of making a disaster prevention map.
– Gathering information by walking around the town. We confirmed that location and disaster information can be linked at the specific spot while gathering information during a town walk. This is more efficient than the conventional method because it leaves more working time in the next stage, making the disaster prevention map, for confirming and editing the registered information and supplementing insufficient information.
– Disaster map making. In this system, information gathering in the town is directly connected with the creation of the disaster prevention map. Therefore, the time for making the disaster prevention map could be used for confirming the registered information, correcting it, and supplementing missing information. The ability of multiple people to work concurrently in this system was essential for efficient work. Moreover, unlike a paper map, the system allows a lot of information to be added and multiple photos to be uploaded without restriction, which made it possible to increase the amount of information on the disaster prevention map.


– Presentation. The presenter used this system to check the map and the contents of the presentation. We confirmed that the system supports a more comprehensible presentation by zooming in on the information displayed on the screen.

The purpose of the town-walking type disaster prevention map is to improve awareness of disaster prevention and understanding of the local community. The proposed system achieved good results in this respect, similar to the conventional method, and the questionnaire results showed no significant difference between the two methods.

6.2 Challenges of Integrated Support System for Disaster Prevention Maps

We describe the problems in using the integrated support system for disaster prevention maps.
– Use of information equipment. When the system is used, smartphones are used for information gathering, a PC and smartphones are used for making the disaster prevention map, and a PC and a projector are used for the presentation. To use the system, these devices must be prepared. In addition, participants need to be accustomed to everyday smartphone operations such as character input and photography.
– Communication between participants. Communication between participants may decrease when the system is used. With a paper map, participants can see the state of each other’s work. When the system is used on multiple devices at the same time, communication may be reduced because participants cannot fully grasp what the other members are doing.

7 Conclusion

In this paper, we developed an integrated support system for disaster prevention maps and conducted an experiment for verification. The experiment revealed that the proposed system can provide consistent support for information gathering, disaster prevention map making, and presentation. We also confirmed that the proposed system contributed to efficient work. The results of the questionnaire survey indicate that, as with making a disaster prevention map using a conventional paper map, the proposed system can be expected to raise interest in the dangers and safety of the local community and to improve knowledge of the town. In the future, we will use the proposed system to make a disaster prevention map with the help of the local voluntary disaster prevention organization.


Acknowledgment. In the development and evaluation of the system, we received a great deal of cooperation from Mr. Akio Nakasuji of the Research Center for Disaster Science Education, Wakayama University. We express our gratitude here.

References

1. Ushiyama, M., et al.: Basic information on the community-based disaster map creation workshop. Tsunami Engineering Technical report 21, pp. 83–91 (2004). (in Japanese)
2. Kobayashi, I., Hoshino, Y., Furuta, N.: Browsing emergency evacuation information using free map services. In: Proceedings of the 75th National Convention of IPSJ, vol. 4, pp. 537–538 (2013). (in Japanese)
3. Kusano, K., Izumi, T., Nakatani, Y.: Disaster information sharing system using pictograms examination to input two-dimensional information. In: Proceedings of the 76th National Convention of IPSJ, vol. 4, pp. 561–562 (2014). (in Japanese)
4. Denki, T., et al.: Geofence check-in application for disaster prevention and asynchronous evacuation drills. In: Proceedings of the 78th National Convention of IPSJ, vol. 4, pp. 999–1000 (2016). (in Japanese)
5. Murakoshi, T., Yamamoto, K.: Study on a social media GIS to support the utilization of disaster information: for disaster reduction measures from normal times to disaster outbreak times. Socio-Inform. 3(1), 17–30 (2014). (in Japanese)
6. Tanaka, T., Uchihira, T.: Application of mobile GIS equipped with GPS to field survey with public participation. J. Architect. Build. Sci. 14(27), 199–204 (2008). (in Japanese)
7. Masafumi, K., Funakoshi, H., Utsu, K., et al.: Introduction of MGRS code into disaster-related information sharing system. IPSJ SIG Technical report, vol. 2016GN-98, no. 14, pp. 1–8 (2016). (in Japanese)
8. Kubota, S., Soga, K., Sasaki, Y., et al.: Development and operational evaluation of regional social networking service as public participation GIS. Theory Appl. GIS 20(2), 35–46 (2012). (in Japanese)

Concealment-Type Disaster Prevention Information System Based on Benefit of Inconvenience

Satoko Shigaki and Takashi Yoshino

Wakayama University, Sakaedani 930, Wakayama City, Japan

Abstract. Using a hazard map and a disaster-preparedness system during normal times is typically recommended in order to cope with a disaster appropriately and promptly. However, a hazard map can cause information overload, and disaster-preparedness systems have a low utilization rate. To increase the number of users of disaster-preparedness systems, an approach different from that of conventional disaster prevention systems is required. The proposed system “Crimap” presents only the information near the user on the map and hides everything outside the user’s surroundings. Thus, we can reduce the amount of information and create awareness of the disaster-preparedness information in daily life. This is based on the “benefit of inconvenience.” It is inconvenient not to see the whole map. However, it is “useful” to be aware of the disaster-preparedness information. From the usefulness verification experiment of the map concealment, we found that reducing the amount of information enabled the participants to recognize the information presented.

Keywords: Disaster prevention awareness · Information overload · Benefit of inconvenience · Evacuation map

1 Introduction

Japan is a naturally disaster-prone country. To act accurately and promptly in the event of a disaster, it is essential to consider disaster prevention measures during normal times, so that people do not merely rely on the information provided but can also take appropriate action during the disaster. Countermeasures include the use of hazard maps and disaster prevention systems, including disaster prevention applications. Hazard maps are important in disaster prevention because disaster information can be obtained beforehand, and using them during normal times improves disaster prevention awareness [1]. However, according to a survey conducted by the Ministry of Land, Infrastructure and Transport, hazard maps place a large burden on municipalities and are often abandoned. In addition, a hazard map contains approximately 23 kinds of information, including flood ranges, evacuation areas, and knowledge of evacuation procedures. Hazard maps have therefore been shown to carry a risk of information overload [2].

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 35–47, 2018. https://doi.org/10.1007/978-3-319-98743-9_3

Many disaster prevention apps are currently being distributed. These systems focus on convenience, such as the immediate understanding of current disaster information and an abundance of information. Although many apps exist, according to a hearing survey conducted in 2017 by the Ministry of Internal Affairs and Communications, the awareness of disaster prevention applications is low, and the percentage of people who use these applications remains at approximately 50% [3]. The percentage of people who responded that they wanted to use a disaster prevention app is approximately 80%, and few people have negative feelings about such apps. However, many apps are not actually used. In other words, existing disaster prevention applications are insufficiently used.

We believe that, to increase the number of users of disaster prevention systems, an approach different from that of conventional disaster prevention systems is required. In this study, we propose a disaster prevention system that focuses on “inconvenience” rather than “convenience.” The proposed system “Crimap” presents only the information in the vicinity of the user on the map and hides everything outside the user’s surroundings. This method reduces the amount of information presented and creates awareness of the information about disasters. This approach is based on the idea of the “Benefit of Inconvenience” [4], which is to receive some benefit from a properly devised inconvenience. The method is “useful” because the inconvenient state, in which nothing outside the user’s surroundings can be seen, makes it easy to notice the disaster prevention information around the user. In this paper, we describe the outline of Crimap and the usefulness verification experiment of the map concealment.

2 Related Work

We present systems that use maps, systems that adopt the benefit of inconvenience, and systems that create awareness, and then clarify the position of this research.

2.1 Disaster Prevention System for Map Use

Amano developed a flood hazard map app for smartphones [5]. This system is used for checking hazard information during normal times and for desk training. It allows users to view flood hazard maps of a wide area by using open data. However, it does not consider the amount of information provided. In the proposed system, only the information around the user is provided to avoid information overload.

Fukada developed a tsunami evacuation support system using a tablet PC [6]. This system records the evacuation trajectory during evacuation training in normal times and supports evacuation through the evacuation navigation function during a disaster. This system supports evacuation behavior, whereas the proposed system creates awareness of disaster prevention information.

Ahmed developed an evacuation information system for flood evacuation [7]. Australia is often affected by floods, and efficient and effective evacuation is necessary to minimize damage. This system targets the Australian Capital Territory (ACT) and Queanbeyan in its vicinity. It simulates evacuation plans based on the flood area, geographical data, and population data, including the warning time and evacuation time. This system optimizes evacuation plans in normal circumstances, and its target users differ from those of our system.

Murakoshi developed a social media GIS that integrates Web-GIS and SNS and incorporates a function for classifying posted information [8]. This system raises awareness of disaster prevention among local residents by collecting disaster information through SNS and accumulating local disaster information. During a disaster, automatically classified disaster information is immediately displayed on a map to support evacuation behavior. It is similar to the proposed system in that it is designed not to overload users with information and assumes use during normal times. This system collects information in normal circumstances and enables viewing of the disaster information during normal times.

2.2 A System that Incorporates the Benefit of Inconvenience

Tanaka developed a tourist navigation system that shows the route to the destination [9]. This system incorporates the idea of the benefit of inconvenience as follows:
– It does not provide complete map information.
– It does not always display the current location on the tourist navigation.
– It uses landmarks and photos for navigation.
With these methods, the user does not passively follow the road to a destination; instead, the system supports the user in enjoying the trip independently and actively.

Gouko developed a robot that encourages people to clean up objects on a table [10]. In this system, the robot and a human cooperate to clean up the tools scattered on the table. By leaving the carrying of objects to the human rather than having the robot do it directly, the human gains the habit of tidying up and becomes skilled at carrying.

These systems resemble the proposed system in that they incorporate the benefit of inconvenience. The major difference is the field of application: the proposed system supports disaster prevention.

2.3 System to Create Awareness

Shirai classified human information acquisition behavior into browsing (information acquisition behavior when the required information is unclear) and retrieval (information acquisition behavior when the required information is clear) [11]. This system supports efficient information browsing by using a device seamlessly depending on each mode. The aim of presenting information that provides awareness is similar to the purpose of our system.

In the field of “Shikakeology,” the example of the print of a fly often attached to a urinal is given [12]. This example uses the psychology of “I want to aim” to minimize scattering; it works by drawing attention to the print of the fly. Although using awareness to accomplish a purpose is similar to our system, our system differs in that it creates awareness of disaster prevention information.

3 Crimap

In this section, we describe the configuration and the screens of the proposed system “Crimap.”

3.1 System Overview

Crimap is a system for disaster prevention information that runs on an Android device. Crimap is used while commuting. It supports awareness of disaster prevention information in the vicinity of the commuting route, as well as in the vicinity of the home and travel destinations. When a huge amount of information is displayed on the map, users are considered unlikely to notice the disaster information around them. In this system, we present only the information in the vicinity of the user, hide everything outside the user’s surroundings, and thus reduce the amount of information. This method makes users aware of the disaster prevention information around them. It is based on the “benefit of inconvenience” [4], whose idea is to identify a good inconvenience and adopt it as a guideline for designing a new system. The “inconvenience” of this system is that only a small portion of the map information is visible. The “useful” point of this system is that, because the amount of information is small, it becomes easy to notice the disaster information around users.

3.2 System Configuration

Figure 1 shows the system configuration; it consists of a server and an Android application used by each user. The server has a database containing the location, “name of evacuation facility,” and “type of evacuation facility” of each shelter. When the user moves, the application obtains the location information and sends it to the server (Fig. 1(1)). The Android device then receives from the server the evacuation facility data corresponding to the user’s location (Fig. 1(2)). The evacuation facility data sent to the Android device are displayed as markers on the map. When the user taps a marker, the evacuation facility information is presented.
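The exchange in Fig. 1 can be sketched as a server-side lookup that returns the facilities near a reported location. This is only an illustration under stated assumptions: the paper does not describe how the server selects facilities, so the great-circle distance filter, the 500 m radius, the `Facility` record, and the sample rows below are all hypothetical.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


@dataclass(frozen=True)
class Facility:
    name: str   # "name of evacuation facility"
    kind: str   # "type of evacuation facility"
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def facilities_near(user_lat, user_lon, facilities, radius_m=500.0):
    """Step (2): return the facilities within radius_m of the reported location."""
    return [f for f in facilities
            if haversine_m(user_lat, user_lon, f.lat, f.lon) <= radius_m]


# Hypothetical database rows; names and coordinates are made up for illustration.
DB = [
    Facility("Example Elementary School", "designated shelter", 34.2660, 135.1510),
    Facility("Example Community Hall", "temporary evacuation site", 34.3000, 135.2000),
]

# Step (1): the device reports the user's location; the server answers with step (2).
nearby = facilities_near(34.2655, 135.1515, DB)
print([f.name for f in nearby])
```

In this sketch the second facility lies several kilometres away and is filtered out, so only the first would be sent to the device and shown as a marker.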


Fig. 1. System configuration. The Android device of the user sends (1) user location information to the evacuation facility information server and receives (2) evacuation facility information.

3.3 System Screen

Figure 2 shows an example of the system screen. This system hides the map information by covering it with the black tiles shown in Fig. 2(a). Each tile is 129 m on a side, for the following reasons. The shortest tsunami arrival time in the Nankai Trough huge earthquake (an earthquake expected to occur in the near future) is 3 min. Elderly people walk uphill at a speed of 43 m/min. Each tile therefore serves as a guide to the range that can be covered within 3 min (43 m/min × 3 min = 129 m). By using this system, users can visually understand how far they can escape from the current location within 3 min. When a user passes over a tile, the tile disappears and the hidden map becomes visible. First, when the system is started, it obtains the user’s current location. Then, the tile at the user’s position disappears, and the system screen is displayed as shown in Fig. 2(1). Next, when the user moves in the direction of the arrow in Fig. 2(c), the tiles in the range passed disappear, and the visible range of the map increases. When the user keeps advancing in the direction of the arrow in Fig. 2(c), the screen reaches the state shown in Fig. 2(2). When the user enters the range of the tile hiding the evacuation facility shown in Fig. 2(d), the tile disappears, and an evacuation facility icon is displayed.


Figure 2(3) shows the overall view of the user’s moving path, which is indicated in Fig. 2(e). Figure 3 shows an information window for an evacuation facility. The information window appears when the evacuation facility marker is tapped. It presents the name and type of the evacuation facility.
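The tile behaviour described in this section can be sketched as follows. The walking speed and arrival time are taken from the text; the local metric grid, the `ConcealedMap` class, and the sample positions are assumptions made for illustration and do not reproduce the actual Android implementation.

```python
import math

WALKING_SPEED_M_PER_MIN = 43   # elderly person walking uphill (from the text)
TSUNAMI_ARRIVAL_MIN = 3        # shortest assumed tsunami arrival time (from the text)
TILE_SIZE_M = WALKING_SPEED_M_PER_MIN * TSUNAMI_ARRIVAL_MIN  # side length of a tile


def tile_of(x_m, y_m):
    """Map a position on an assumed local metric grid to its tile index."""
    return (math.floor(x_m / TILE_SIZE_M), math.floor(y_m / TILE_SIZE_M))


class ConcealedMap:
    """Tracks which tiles have been uncovered by the user's movement."""

    def __init__(self):
        self.revealed = set()

    def update_position(self, x_m, y_m):
        """Reveal the tile under the user; return True if it was newly revealed."""
        tile = tile_of(x_m, y_m)
        newly_revealed = tile not in self.revealed
        self.revealed.add(tile)
        return newly_revealed

    def is_visible(self, x_m, y_m):
        """A facility icon is shown only when its tile has been revealed."""
        return tile_of(x_m, y_m) in self.revealed


concealed = ConcealedMap()
facility = (300.0, 50.0)          # hypothetical facility position in metres
for x in (10.0, 140.0, 270.0):    # the user walks east along y = 50 m
    concealed.update_position(x, 50.0)

print(TILE_SIZE_M, sorted(concealed.revealed), concealed.is_visible(*facility))
```

Keeping the revealed tiles in a set means that revisiting a tile changes nothing, which matches the description that a tile stays uncovered once the user has passed over it.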

Fig. 2. Examples of Crimap screen.

4 Verification Experiment of the Usefulness of Map Concealment

In this section, we describe the verification experiment of the usefulness of the map concealment.

4.1 Outline of the Experiment

The purpose of this experiment was to investigate whether hiding the map provides awareness of the disaster prevention information around the user. We used evacuation facilities as the disaster information. In this comparative experiment, we divided the participants into two groups: a no-tile group, whose map was not hidden, and a tile group, whose map was hidden with tiles. We investigated which method made participants more aware of the evacuation facilities. We used the following factors as indices of awareness:
– Number of evacuation facility icons tapped during the experiment.
– Number of evacuation facilities remembered immediately after the experiment.

Concealment-Type Disaster Prevention Information System


Fig. 3. Example of the evacuation information on the information window.

Figures 4(a) and (b) show the startup screens of the systems used by the no-tile group and the tile group, respectively. In the system of the no-tile group, the map is not obscured, and all nearby evacuation facility markers are displayed on the screen. By contrast, the system of the tile group shows only the icons of evacuation facilities in the vicinity of the user. The experiment took place in an area where a number of evacuation facilities are scattered. In total, there were 11 participants from Wakayama University, of whom seven were undergraduate students and four were graduate students (21 to 24 years old, average 22.1 years; five males and six females). Four of the 11 participants used their own Android devices. The participants were assigned IDs from A to K: A to E formed the no-tile group (five people), and F to K formed the tile group (six people).

4.2 Flow of Experiment

Each participant carried an Android device and a map showing the experimental range. We asked the participants to walk freely and individually during the one-hour experiment. Figure 5 shows the experimental range, which is the area enclosed by the orange line. The experimental range included nine evacuation facilities. We set up some shelters so that the participants could be easily located. We did not explain the purpose of the experiment or the specification of the system to the participants beforehand. We asked them to follow these four instructions:
1. Do not use the system on a smartphone while walking. However, check the screen of the system occasionally.
2. Walk around as much as possible.
3. Do not look at the screens of other participants during the experiment, and do not talk about the system.
4. Do not write anything on the map that was distributed.
In addition, during the experiment we contacted the participants by e-mail and telephone to tell them that they may touch the screen. However, this announcement was not communicated to participant H. We conducted a questionnaire survey after the experiment. Figure 6 shows the situation before the start of the experiment and during the experiment.

Fig. 4. Examples of system startup screen in each group.

5 Experimental Results and Discussions

In this section, we provide the results and discussions of the usefulness verification experiment.

5.1 Reaction Rate

We focused on the “reaction rate” as an index of the number of evacuation facility icons that were tapped during the experiment. The reaction rate is defined as follows.

\[
\text{Reaction rate} = \frac{\text{Number of evacuation facility icons tapped}}{\text{Number of evacuation facility icons displayed on the screen}}
\]

Note that, for the no-tile group, the number of evacuation facility markers displayed on the screen is an estimate derived from the path that each participant walked.
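As a small worked example of the definition above, the following sketch computes the reaction rate for one participant. The function name and the counts are hypothetical and are not data from the experiment; the sketch also assumes that each icon is counted at most once, however often it is tapped.

```python
def reaction_rate(icons_tapped: int, icons_displayed: int) -> float:
    """Share of displayed evacuation facility icons that the participant tapped."""
    if icons_displayed <= 0:
        raise ValueError("at least one icon must have been displayed")
    if not 0 <= icons_tapped <= icons_displayed:
        raise ValueError("tapped icons must be between 0 and the number displayed")
    return icons_tapped / icons_displayed


# Hypothetical participant: 6 icons appeared on the screen, 3 of them were tapped.
example_rate = reaction_rate(icons_tapped=3, icons_displayed=6)
```

Because the denominator is the number of icons that actually appeared for that participant, the index stays comparable between groups that saw very different numbers of icons.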


Fig. 5. Map distributed in the experiment. (Color figure online)

Table 1 presents the reaction rates of the no-tile and tile groups. The mean reaction rate of the no-tile group was 42.3% (standard deviation 0.39), and that of the tile group was 73.8% (standard deviation 0.26). A t-test that does not assume equal variances gave p = 0.087 < 0.10. The difference therefore does not reach significance, but it indicates a significant trend toward a higher reaction rate in the tile group. This suggests that covering the map can make users conscious of the evacuation facilities.

5.2 Correct Number and Accuracy Rate

We focused on the “correct number” and the “accuracy rate” as indices of the number of evacuation facilities that were remembered immediately after the experiment. In the questionnaire survey, we asked the participants, “Please answer the evacuation facilities discovered by using the system.” First, we focused on the number of correct answers. Table 2 presents the number of correct answers for the no-tile and tile groups. The mean number of correct answers was 2.8 (standard deviation 1.64) for the no-tile group and 3.0 (standard deviation 1.90) for the tile group. A t-test that does not assume equal variances gave p = 0.428 > 0.10, so there is no significant difference between the no-tile and tile groups. Next, we focused on the accuracy rate. The accuracy rate is defined as follows:

\[
\text{Accuracy rate} = \frac{\text{Number of evacuation facilities that were correct}}{\text{Number of evacuation facility icons displayed on the screen}}
\]

As with the reaction rate, the number of evacuation facility markers displayed on the screen for the no-tile group is an estimate derived from the path that each participant walked. Table 3 presents the accuracy rates of the no-tile and tile groups. The mean accuracy rate of the no-tile group was 26.0% (standard deviation 0.16), and that of the tile group was 47.8% (standard deviation 0.30). A t-test that does not assume equal variances gave p = 0.082 < 0.10. The difference does not reach significance, but it indicates a significant trend toward a higher accuracy rate in the tile group. This result suggests that users are more likely to remember each evacuation facility shown to them when the map is covered. We found it useful to cover the map and reduce the amount of information.

Fig. 6. Photograph of the experiment.

5.3 Comments of the Participants

In the questionnaire survey conducted after the experiment, we collected free descriptions of the pros and cons of the system. Based on these descriptions, we considered the participants' awareness of the evacuation facilities, which this experiment aimed to examine. Each participant used only one of the two systems.

Table 1. Reaction rates on the no-tile and tile groups.

Table 2. Number of correct answers on the no-tile and tile groups.

The pros of the system from the no-tile group were:
– I thought it was easy to see where the shelter was.
– I was conscious of the evacuation place.
– Because the only information shown is the evacuation places, there is not too much information and it is easy to see.

The pros of the system from the tile group were:
– I was inspired to walk where I could find shelter.
– I felt that I had the pleasure of making a map by myself.
– I felt that it was interesting to see the map from a black screen.
– I was able to take a walk to answer the quiz (I looked for shelter). I was able to remember the location of the evacuation.

The opinions obtained from the no-tile group show that the participants were conscious of the evacuation facilities. In the tile group, the opinion “I was inspired to walk where I could find shelter” likewise shows that the participant was conscious of the evacuation facilities. We found that both groups became aware of the evacuation facilities by using the system. The comments of the tile group also suggest that Crimap (the tile-based system) played a gamification role and that the users had fun while using the system.

Table 3. Accuracy rates on the no-tile and tile groups.

The cons of the system from the no-tile group were:
– It is difficult to identify the building of the evacuation site.
– The information window sometimes prevented you from seeing the contents.

The cons of the system from the tile group were:
– I thought that it was hard to judge whether I entered the system tile.
– I suspected that, when the map was black, it was an obstacle to the Internet.
– I thought that it was better to see only the evacuation place in the part not seen.

In the tile group, the opinion “I thought that it was better to see only the evacuation place in the part not seen” shows that this participant was conscious of the evacuation facilities. Both groups were thus aware of the evacuation facilities. The comments from the tile group also indicate that hiding the map was inconvenient. Nevertheless, because the reaction rate and accuracy rate of the tile group were higher than those of the no-tile group, we consider that the benefit of inconvenience was obtained.
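The group comparisons in Sects. 5.1 and 5.2 use a t-test that does not assume equal variances. The sketch below recomputes the corresponding test statistic and the Welch-Satterthwaite degrees of freedom from the reported means and standard deviations only. It rests on assumptions: that the reported standard deviations are sample standard deviations on a 0 to 1 scale, and that all five no-tile and all six tile participants entered the test. The function name `welch_t` is hypothetical, and p-values are not computed because the Python standard library has no t-distribution function.

```python
import math
from typing import Tuple


def welch_t(mean_a: float, sd_a: float, n_a: int,
            mean_b: float, sd_b: float, n_b: int) -> Tuple[float, float]:
    """Welch's t statistic for (b - a) and Welch-Satterthwaite degrees of freedom."""
    var_a = sd_a ** 2 / n_a  # squared standard error of group a
    var_b = sd_b ** 2 / n_b  # squared standard error of group b
    t = (mean_b - mean_a) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df


# Reported summary statistics: no-tile group (assumed n = 5) vs. tile group (assumed n = 6).
reaction_t, reaction_df = welch_t(0.423, 0.39, 5, 0.738, 0.26, 6)
accuracy_t, accuracy_df = welch_t(0.260, 0.16, 5, 0.478, 0.30, 6)
```

Under these assumptions both statistics come out at roughly 1.5 with about 7 to 8 degrees of freedom, which is the kind of value one would expect for differences that show a trend without reaching significance in groups this small.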

6 Conclusion

In this paper, we proposed a disaster prevention system that focuses on inconvenience as an alternative approach to conventional disaster prevention systems. We described the outline of the system and the experiment verifying the usefulness of map concealment. The proposed system, “Crimap,” presents only the information in the vicinity of the user on the map and hides everything outside that vicinity. With this method, we reduced the amount of information and raised awareness of disaster prevention information. This is based on the “benefit of inconvenience.” We observed that the “inconvenient” state, in which the whole map cannot be seen, was “useful” because it made the surrounding disaster information easier to notice. In the experiment verifying the effectiveness of map concealment, we compared the tile and no-tile groups. We investigated awareness of the evacuation facilities from three viewpoints: “reaction rate,” “number of correct answers,” and “accuracy rate.” The experimental results showed no significant difference in the number of correct answers. However, we observed a significant trend in the reaction and accuracy rates. Users can thus become more aware of the evacuation facilities when the map is obscured. From the participants' comments, we found that both the no-tile and tile groups were aware of the evacuation facilities.

References
1. Ministry of land, infrastructure, transport and tourism, promotion of soft measures for safety and security. http://www.mlit.go.jp/kisha/kisha06/01/010629/03.pdf. Accessed 21 July 2017. (in Japanese)
2. Tanaka, K., Kato, T.: Cognitive psychological analysis of flood hazard map design. In: Proceedings of the Fuzzy System Symposium, vol. 27, p. 145 (2011)
3. Fire and disaster management agency, report on the function of the evacuation support app March 31, 2015. http://www.fdma.go.jp/neuter/about/shingi kento/ h28/hinanshien appli/houkoku/houkoku.pdf. Accessed Apr 2018. (in Japanese)
4. Kawakami, H.: Toward system design based on benefit of inconvenience. Hum. Interface 11(1), 125–134 (2009). (in Japanese)
5. Amano, T.: Development of a floods hazard map application for iPhone using open data. Theory Appl. GIS 23(2), 37–42 (2015). (in Japanese)
6. Fukada, H., Hashimoto, Y., Akabuchi, A., Oki, M., Okuno, Y.: Proposal of Tsunami evacuation support system using a tablet PC. In: 2013 Multimedia, Distributed, Cooperative, and Mobile Symposium (DICOMO), pp. 1938–1944 (2013). (in Japanese)
7. Elsergany, A.T., Griffin, A.L., Tranter, P., Alam, S.: Development of a geographic information system for riverine flood disaster evacuation in Canberra, Australia: trip generation and distribution modelling. In: Proceedings of the 12th International Conference on Information Systems for Crisis Response and Management, ISCRAM 2015, pp. 1–13 (2015)
8. Murakoshi, T., Yamamoto, K.: Study on a social media GIS to support the utilization of disaster information: for disaster reduction measures from normal times to disaster outbreak times. Socio-Inform. 3(1), 17–30 (2014)
9. Tanaka, K., Nakatani, Y.: Invisible tourist navigation system without detailed map information. In: Proceedings of the 73th National Convention of IPSJ, Vol. 3, pp. 85–86 (2011). (in Japanese)
10. Gouko, M., Kim, C.H.: Study of robot behavior that encourages human to tidy up disordered things on table top. Trans. Jpn. Soc. Artif. Intell. 32(5), 1–8 (2017). (in Japanese)
11. Shirai, Y., Matsushita, M., Ohguro, T.: HIEI projector: augmenting a real environment with invisible information. In: 11th Workshop on Interactive Systems and Software (WISS 2003), pp. 115–122 (2003). (in Japanese)
12. Matsumura, M.: Shikakeology: how to create an idea that moves people, pp. 27–30. Toyo Keizai Inc. (2016). (in Japanese)

Development of a Stroll Support System Using Route Display on a Map and Photograph Sharing Service

Junko Itou¹, Takaya Mori¹, Jun Munemori¹, and Noboru Babaguchi²

¹ Wakayama University, 930, Sakaedani, Wakayama 640-8510, Japan
² Osaka University, Suita, Osaka 565-0871, Japan

Abstract. This paper proposes a stroll support system that uses a photo sharing service and route display with panoramic images to guide tourists to places of interest within a limited timeframe available for sightseeing. This system obtains photographs from a photo sharing service based on position information and places the pictures on a map with corresponding panoramic images. This enables users to understand what kind of landscape is near their position. Additionally, this system records a user's walking route and duration, subsequently providing the record as memories of their stroll. Comparison experiment results suggest that the pictures displayed on the map and the panoramic image were very useful in helping tourists choose places of interest.

Keywords: Sightseeing support system · Stroll · Photograph · Location information · Downtime

1 Introduction

Combining global positioning systems (GPS) with social networking services (SNS) has become useful for collecting information and providing navigation to sightseeing spots as a tourist moves from one place to another. However, existing sightseeing support systems [1–4] are tailored to famous landmarks only, and many of them do not provide any information on other nearby points of interest. There is also a lack of information regarding the type of experience that one may enjoy in such places. This is especially true for tourists who want to view the same landscape as in a picture on an SNS, or who want to explore the area around the destination within a certain time limit.

In this article, we focus on effective utilization of downtime at unfamiliar travel destinations, as well as scenic photographs near sightseeing spots. The purpose of this study is to develop a smartphone service to support about 20–30 min of strolling time around sightseeing spots using pictures of nearby places of interest, panoramic images, and location/route information. Our system displays pictures taken near a user's current location, as well as panoramic images of the immediate area, enabling the user to set a tentative destination based on impressions derived from the pictures and their remaining time. Subsequently, a user can freely stroll through multiple places of interest around a sightseeing spot within a limited period.

This paper is organized as follows: In Sect. 2, we describe related services for sharing sightseeing information. In Sect. 3, we explain our proposed system, which supports strolls in a travel destination using photographs, panoramic images, and maps. A comparison test for our system is presented in Sect. 4. Finally, we discuss conclusions and future work in Sect. 5.

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 48–55, 2018. https://doi.org/10.1007/978-3-319-98743-9_4

2 Related Work

2.1 Optimal Route Recommendation

A common trend among existing sightseeing support services is to determine the most efficient way to travel by providing an optimal route to the destination. This results in a lack of information about other places of interest around sightseeing spots, which can be important for certain users. Google Places is one information-sharing service that uses a map [1]. Users share information such as addresses, photographs, and comments about popular locations. In some cases, pieces of information may overlap, making it difficult for users to find the information they seek.

2.2 Recommendation Based on Online Information

Approaches using geotagged photo data posted to a photo-sharing service have also been proposed [2,3]. These studies analyzed large quantities of photographs and extracted travel information such as activities and typical sightseeing patterns. The resulting systems provide users with information about attractive areas or example itineraries using visual analysis results. Their primary emphasis is on presenting famous places of interest in a given area and/or generating an efficient tour route. Therefore, there is a lack of information regarding additional places of interest located along the route between famous sites.

Tiwari et al. proposed a sightseeing spot recommender system using enrichment information including weather and traffic conditions [4]. They assembled a database containing location data and contextual information input by registered users. Users can obtain detailed graphical information about a tourist area using these systems. Fujii et al. proposed a method to analyze tourists' behaviors automatically from travel blog entries [5]. By using this method, tourist information regarding souvenirs and sightseeing spots was extracted with high accuracy. In these systems, recommended sightseeing routes are provided to the user based on route preferences from other users. However, these systems still have trouble providing route recommendations for less famous sightseeing spots due to a lack of user preference data.

CT-Planner introduces routes starting at a specific point and then moving within a certain area, without a predetermined destination [6]. Users can therefore design their tour interactively. The system also allows users to register new sightseeing spots and give recommendations. However, it is not clear whether the place that the traveler actually wants to visit is included in the recommended route.

2.3 Study Approach

In this study, we propose a sightseeing support system in which pictures taken near a tourist's current location are displayed on a map. In addition, panoramic images of the immediate surrounding area are displayed on the user's smartphone, enabling the user to set a tentative destination based on their impressions of the pictures. Users can set a place that they are interested in as a temporary destination, and the map shows the time required to get there. The system also allows the tentative destination to be changed at any time. The primary utility of the proposed system is that it allows a user to take a stroll through multiple places of interest around a famous sightseeing spot within a limited time.
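The idea of showing the time required to reach a tentative destination within a limited stroll can be illustrated as follows. This is only a sketch under stated assumptions, not the authors' method: the walking pace of 80 m/min, the straight-line (haversine) distance, the function names, and the coordinates in the usage example are all hypothetical.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000.0
WALK_SPEED_M_PER_MIN = 80.0  # assumed walking pace; not taken from the paper

LatLon = Tuple[float, float]


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in metres between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def minutes_to_reach(current: LatLon, target: LatLon,
                     speed_m_per_min: float = WALK_SPEED_M_PER_MIN) -> float:
    """Rough one-way walking time along a straight line."""
    return haversine_m(current, target) / speed_m_per_min


def reachable_photos(current: LatLon, photos: List[Tuple[str, LatLon]],
                     remaining_min: float) -> List[Tuple[str, float]]:
    """Photo spots reachable within the remaining time, nearest first."""
    timed = [(name, minutes_to_reach(current, pos)) for name, pos in photos]
    return sorted((item for item in timed if item[1] <= remaining_min),
                  key=lambda item: item[1])


# Illustrative coordinates only.
here = (34.2277, 135.1717)
spots = [("garden", (34.2287, 135.1717)), ("distant temple", (34.2777, 135.1717))]
candidates = reachable_photos(here, spots, remaining_min=25)
```

A straight-line estimate understates real walking time on winding paths, so a deployed service would more plausibly rely on a routing service for the estimate; the filter-and-sort step would stay the same.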

3 Proposed Stroll Support System

3.1 Goal

Our goal is to develop a system that provides scenery pictures to guide tourists to places of interest when they take a stroll, as well as information that can be used to plan for spare time at or near a user's travel destination. Our target users are not familiar with the sightseeing area in advance. In contrast to conventional navigation systems that display the shortest route to the sightseeing destination, the user can freely plot his or her own course, with as many detours as desired.

3.2 Design Method

The proposed system is implemented as a web service that can be accessed from a smartphone. Pictures are placed as markers on a map and on panoramic images. The design method is as follows.

1. Setting the destination based on pictures. The user sets the destination of interest based on their impressions of the pictures of nearby places. The time, date, and location information of each picture are displayed in the application. Pictures of scenery located elsewhere are also displayed at times. This allows the user to select a destination that is visually attractive.
2. Overlaying pictures taken around the current location. Once the destination is set, the shortest route is not displayed; instead, a group of pictures of nearby places is displayed, along with the season, time, and location of each picture. This is accompanied by a rough estimate of the time required to reach the point of interest in the picture. This enables visual selection of a route as the user looks for sightseeing spots other than the destination.
3. Recording time spent by users at each point. The proposed system records a user's position information every minute. When the user shops or takes a picture, the user remains in the same position for a longer time than when merely walking past a point. The system estimates the time spent at each spot and, after the stroll is over, displays the user's route and activities on a map. This allows the user to look back on the shops and scenery found by chance during the stroll.

Fig. 1. Overview of a client screen.
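The third design point, estimating how long a user stayed at each spot from positions recorded every minute, can be sketched as follows. The paper does not state how stays are detected, so the approach here is an assumption: consecutive samples that remain within a fixed radius of the first sample of a run are treated as one stay. The radius, the minimum stay length, and all names are hypothetical.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000.0
SAMPLE_INTERVAL_MIN = 1  # the paper states that the position is recorded every minute
STAY_RADIUS_M = 30.0     # assumed: samples within this radius count as the same spot
MIN_STAY_MIN = 3         # assumed: shorter runs are treated as walking past

LatLon = Tuple[float, float]


def distance_m(p: LatLon, q: LatLon) -> float:
    """Approximate ground distance in metres (equirectangular, fine for short hops)."""
    mean_lat = math.radians((p[0] + q[0]) / 2)
    dx = math.radians(q[1] - p[1]) * math.cos(mean_lat)
    dy = math.radians(q[0] - p[0])
    return EARTH_RADIUS_M * math.hypot(dx, dy)


def detect_stays(track: List[LatLon]) -> List[Tuple[LatLon, int]]:
    """Return (position, minutes) for each run of samples that stays near one spot."""
    stays = []
    start = 0
    for i in range(1, len(track) + 1):
        # Close the current run at the end of the track or when the user moves away.
        if i == len(track) or distance_m(track[start], track[i]) > STAY_RADIUS_M:
            minutes = (i - start - 1) * SAMPLE_INTERVAL_MIN  # first to last sample
            if minutes >= MIN_STAY_MIN:
                stays.append((track[start], minutes))
            start = i
    return stays


# Illustrative track: five samples at spot A, two while walking, four at spot B.
spot_a, spot_b = (34.2270, 135.1710), (34.2300, 135.1710)
track = [spot_a] * 5 + [(34.2280, 135.1710), (34.2290, 135.1710)] + [spot_b] * 4
stays = detect_stays(track)
```

Measuring a stay from its first to its last sample gives a lower bound on the real duration, which seems a reasonable bias for a summary shown to the user after the stroll.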

3.3 System Overview

The system is implemented as a web service that can be accessed from a smartphone. The system clients are smartphones running on iOS or Android, with GPS and PHP-enabled web browsers installed. The Google Maps API¹ is used to display maps, and the pictures are placed as markers. All map data in this article is based on Google Maps. All picture data is acquired through the Flickr API². Services such as Google Earth can also display uploaded photographs as panoramic images. However, in the proposed system, users can set conditions such as seasons, times of day, and directions, as well as change the displayed picture. In addition, tags are attached to photographs acquired from Flickr. The tags include geographical names, as well as related persons and historical events that occurred at the point of interest. These tags are also displayed as picture information.

¹ Google Maps API: https://developers.google.com/maps/.
² Flickr API: https://www.flickr.com/services/api/.
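To make the photo acquisition step more concrete, the sketch below builds a request for geotagged photographs around a position and derives a season from a photograph's capture date. The paper does not say which Flickr method or parameters the system uses; the sketch assumes the public `flickr.photos.search` method with its `lat`, `lon`, `radius`, `has_geo`, and `extras` arguments. The helper names, the search radius, the month-to-season mapping (northern hemisphere), and the placeholder API key are all assumptions, and no network request is made.

```python
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_photo_search_url(api_key: str, lat: float, lon: float,
                           radius_km: float = 0.5, per_page: int = 50) -> str:
    """Build a flickr.photos.search request for geotagged photos near a point."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "lat": lat,
        "lon": lon,
        "radius": radius_km,                     # Flickr interprets this in km by default
        "has_geo": 1,
        "extras": "geo,date_taken,tags,url_s",   # position, capture date, tags, thumbnail
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


def season_of(date_taken: str) -> str:
    """Map a capture date such as '2017-04-02 10:15:00' to a season (assumed mapping)."""
    month = int(date_taken[5:7])
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    if month in (9, 10, 11):
        return "autumn"
    return "winter"


# Placeholder key and illustrative coordinates.
url = build_photo_search_url("YOUR_API_KEY", lat=34.2277, lon=135.1717)
```

Requesting the capture date and tags together with the position in a single call would let a client filter by season and show tag information without a second request per photograph.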


Fig. 2. Panoramic image and overlapping pictures.

Figure 1 shows screenshots of the proposed system in operation. The map with markers for the current location and the destination is displayed on the right. The left and middle images show panoramas from the location identified in the right image. Users can switch between the map and the panoramas. The pictures obtained from Flickr are overlaid on the panoramic image. In the middle image, the selected picture is displayed along with supporting information, which includes the direction in which the picture was taken, as well as the date and season of the photograph. The system displays pictures along the path from the current location to the final destination as thumbnails or markers on the map, or in a list. The user can decide the next destination or transit point after browsing the pictures.

3.4 Presentation of Photographic Information

Photographs are placed on the map and on the panoramic image as markers or thumbnails. Figure 2 shows screenshots of pictures overlaid on a panoramic image of a Japanese castle. The size of a photograph decreases with the distance from the point of interest, so nearby photographs are displayed as larger images. As a user changes the display position of the panorama, the overlapping photographs change. The transmittance of the photographs is set to 30%. When the user taps a photograph to select it, its transmittance changes to 0%. Users can set a marker indicating a picture of interest as a tentative destination. When the route reaches a branching point, the user can check the pictures along possible avenues beyond the branching point by manipulating the panoramic images. Since the pictures displayed for each route are different, the user can either select the shortest route to the destination or one that includes detours to other places of interest.
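The display rules in this subsection (nearby photographs drawn larger, 30% transmittance by default, 0% when selected) can be expressed as two small functions. The paper gives only the direction of the size rule and the two transmittance values, so the pixel sizes, the distance cap, and the linear interpolation below are assumptions chosen for illustration, and the function names are hypothetical.

```python
MAX_SIZE_PX = 160       # assumed thumbnail size for a photograph at distance 0
MIN_SIZE_PX = 40        # assumed smallest thumbnail size
MAX_DISTANCE_M = 500.0  # assumed distance beyond which the size no longer shrinks

DEFAULT_TRANSMITTANCE = 0.30   # stated in the text
SELECTED_TRANSMITTANCE = 0.0   # stated in the text


def thumbnail_size_px(distance_m: float) -> int:
    """Shrink the thumbnail linearly with distance, clamped to the assumed range."""
    ratio = min(max(distance_m, 0.0), MAX_DISTANCE_M) / MAX_DISTANCE_M
    return round(MAX_SIZE_PX - ratio * (MAX_SIZE_PX - MIN_SIZE_PX))


def css_opacity(selected: bool) -> float:
    """Convert transmittance to the opacity value a web client would apply."""
    transmittance = SELECTED_TRANSMITTANCE if selected else DEFAULT_TRANSMITTANCE
    return 1.0 - transmittance
```

Clamping the distance keeps far-away photographs tappable instead of letting them shrink toward zero, which matters given that participants later reported that small photographs were hard to view.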

4 Evaluation Experiments

4.1 Experiment Outline

We performed comparison experiments to investigate the proposed system. The experiment site was a Japanese castle that is a famous sightseeing spot. The participants in our experiment were eight college students, divided into two groups. Some participants had previously strolled through the site. The means of guidance used for comparison was tourist attraction brochures³. We conducted two experiments at an interval of six days. In the first experiment, four participants used the brochures while the other four participants used the proposed system. In the second experiment, the participants switched roles to account for order effects. In other words, the participants who used the proposed system in the first experiment used the brochures during the second experiment.

The participants used their own Android or iOS phones equipped with a GPS function. All participants were accustomed to handling a smartphone. The participants tested the system on their smartphone's web browser before the experiment. Their smartphones were iPhone 7, iPhone 8, or Android phones with 4.7- to 5.5-inch displays. The transmission and processing speeds were sufficient. We explained the functions and usage of the system to the participants in advance. All participants started the stroll from the same point. They could stroll freely, and we did not specify any particular destination or route. The experiment site has an area of approximately 400 m² and includes a castle tower, bridge, garden, zoo, and shrine. The participants' position data were recorded using GPS functionality. The participants were asked to complete a questionnaire after the experiment ended.

4.2 Experiment Results and Discussion

The experiment results are presented in Table 1. The results of items (i) and (iii) show that participants who used the proposed system were able to find places of interest based on the information provided by the pictures. The Wilcoxon signed rank test comparing the two conditions showed a significant difference in item (i) and a significant trend in item (iii). We can conclude that the proposed system was more useful as a support method for making new discoveries within the experiment site. The results of items (v) and (vi) in Table 1 suggest that, although the understanding of the route to each spot was similar for both navigation methods, the participants were able to find points of interest more easily using our proposed system. Considering the results of items (vii), (viii), and (ix), participants clearly explored places not introduced in the system more actively than they explored places not introduced in the brochure. Additionally, they utilized photographs taken in different seasons. The Wilcoxon signed rank test indicated a significant difference between expp and expc in items (vii) and (ix). These results suggest that the presentation of information by the proposed system leads users to actively seek out new points of interest.

³ Pamphlet download-Wakayama castle-: http://wakayamajo.jp/riyou/pamphlet.html.

Table 1. Questionnaire results obtained in the two experiments.

| Questionnaire item | Exp | 1 | 2 | 3 | 4 | 5 | Median | Mode | P value |
|---|---|---|---|---|---|---|---|---|---|
| (i) I was able to discover something I did not know on the castle | expp | 0 | 0 | 1 | 3 | 4 | 5.0 | 5 | 0.033 |
| | expc | 1 | 1 | 3 | 2 | 1 | 3.0 | 3 | |
| (ii) Necessary information for strolling was found | expp | 0 | 1 | 1 | 3 | 3 | 4.0 | 4, 5 | 0.149 |
| | expc | 0 | 1 | 3 | 4 | 0 | 3.5 | 4 | |
| (iii) I obtained information that attracted my interests | expp | 0 | 0 | 2 | 5 | 1 | 4.0 | 4 | 0.076 |
| | expc | 1 | 1 | 3 | 3 | 0 | 3 | 3, 4 | |
| (iv) I was able to decide quickly where to go for a stroll | expp | 0 | 0 | 1 | 4 | 3 | 4.0 | 4 | 0.085 |
| | expc | 0 | 2 | 2 | 3 | 1 | 3.5 | 4 | |
| (v) I was able to understand the route to each spot | expp | 0 | 1 | 2 | 4 | 1 | 4.0 | 4 | 0.580 |
| | expc | 1 | 1 | 2 | 3 | 1 | 3.5 | 4 | |
| (vi) I was able to find the point of interest | expp | 0 | 0 | 0 | 6 | 2 | 4.0 | 4 | 0.015 |
| | expc | 0 | 1 | 3 | 4 | 0 | 3.5 | 4 | |
| (vii) I consciously looked for the place introduced in the system/brochure | expp | 0 | 1 | 0 | 5 | 2 | 4.0 | 4 | 0.011 |
| | expc | 0 | 3 | 4 | 1 | 0 | 3.0 | 3 | |
| (viii) I wanted to find places that were not introduced in the system/brochure | expp | 0 | 1 | 2 | 2 | 3 | 4.0 | 5 | 0.084 |
| | expc | 1 | 3 | 2 | 1 | 0 | 2.5 | 2 | |
| (ix) I used information when the seasons did not match | expp | 0 | 2 | 2 | 1 | 3 | 4.0 | 5 | 0.046 |
| | expc | 2 | 3 | 2 | 1 | 0 | 2.0 | 2 | |

Columns 1 to 5 give the number of participants who chose each evaluation value.
Evaluation value: 1: strongly disagree, 2: disagree, 3: neither, 4: agree, 5: strongly agree.
expp: The experiment was performed using the proposed system.
expc: The experiment was performed using the brochure.
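The Median and Mode columns of Table 1 summarize how many participants chose each evaluation value. As a worked example, the sketch below derives both values from the response counts of two rows of the table, items (ii) and (iii) under expc. The helper names are hypothetical, and `statistics.multimode` requires Python 3.8 or later.

```python
from statistics import median, multimode
from typing import List, Tuple


def expand(counts: List[int]) -> List[int]:
    """Turn counts for the evaluation values 1..5 into the individual ratings."""
    return [rating for rating, n in enumerate(counts, start=1) for _ in range(n)]


def summarize(counts: List[int]) -> Tuple[float, List[int]]:
    """Return the median and the list of modes for one row of response counts."""
    ratings = expand(counts)
    return median(ratings), multimode(ratings)


item_ii_expc = summarize([0, 1, 3, 4, 0])   # median 3.5, mode 4
item_iii_expc = summarize([1, 1, 3, 3, 0])  # median 3, modes 3 and 4
```

Reporting all modes, as `multimode` does, matches the table's convention of listing two values when two ratings are tied for the most responses.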

In expc, it is evident that the participants mainly referred to the maps to choose routes. In contrast, tags, landscapes, and photographs of different seasons were utilized in expp. In the free description field, one participant stated that he/she knew the details of the popular places mentioned in the brochure but did not feel like visiting other places. On the other hand, for the proposed system, one participant mentioned that he/she was able to discover new scenic spots even though he/she had visited the main point of interest several times. The brochure mainly provides information on the castle tower, garden, and zoo. Many participants took a walk around the castle tower and the zoo. With the proposed system, however, the same number of participants also visited the shrine, which is not directly related to the castle.


Based on the questionnaire results, we can conclude that users reading the brochures focused primarily on the places mentioned in the brochure. In contrast, the proposed system was more useful to the participants because it provided information such as photographs and tags; it proved especially useful when the current season and that of the panoramic images were not the same. Some participants pointed out that it was difficult to tap the buttons and that the photographs were small and hard to view. In the future, it is necessary to improve the interface so that it provides photograph information on the map more comfortably and makes it easier for users to select a destination.

5 Conclusions

This study proposes a system that supports strolls to a destination using a photo sharing service and route display on a panoramic image. Unlike existing sightseeing support systems, the proposed system enables a user to freely select a destination or a route to the destination from pictures and supporting information overlaid on a panoramic image of the user's current location. We conducted a comparison experiment in the same sightseeing spot using the proposed system and tourist attraction brochures. The results suggest that participants were able not only to see the scenery that they wanted to see but also to discover new places of interest. In future work, we plan to improve the interface so that users can more easily view and utilize photographs and supporting information.

Acknowledgments. This work was supported by JSPS KAKENHI Grant Number 16K00371.

References
1. Google Developers: Google places API. https://developers.google.com/places/. Accessed 28 Feb 2017
2. Kisilevich, S., Krstajic, M., Keim, D., Andrienko, N., Andrienko, G.: Event-based analysis of people's activities and behavior using Flickr and Panoramio geotagged photo collections. In: 2010 14th International Conference on Information Visualisation (IV), pp. 289–296 (2010)
3. Popescu, A., Grefenstette, G., Moellic, P.: Mining tourist information from user-supplied collections. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM), pp. 1713–1716 (2009)
4. Tiwari, S., Kaushik, S.: Information enrichment for tourist spot recommender system using location aware crowdsourcing. In: IEEE 15th International Conference on Mobile Data Management (MDM), vol. 2, pp. 11–14 (2014)
5. Fujii, K., Nanba, H., Takezawa, T., Ishino, A., Okumura, M., Kurata, Y.: Travellers' behaviour analysis based on automatically identified attributes from travel blog entries. In: Workshop on Artificial Intelligence for Tourism (AI4Tourism), PRICAI 2016 (2016)
6. Kurata, Y.: Collecting tour plans from potential visitors: a web-based interactive tour-planner and a strategic use of its log data. In: Egger, R., Gula, I., Walcher, D. (eds.) Open Tourism. TV, pp. 291–297. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-642-54089-9_20

Inter-Cultural Collaboration

Machine Translation Usage in a Children’s Workshop

Mondheera Pituxcoosuvarn1(✉), Toru Ishida1, Naomi Yamashita2, Toshiyuki Takasaki3, and Yumiko Mori3

1 Kyoto University, Kyoto, Japan [email protected]
2 NTT Communication Science Labs, Kyoto, Japan
3 NPO Pangaea, Kyoto, Japan

Abstract. Machine translation (MT) enables a group of people who do not share a common language to work together as a team. Previous studies have investigated the characteristics of MT-mediated communication in laboratory settings and suggested various ways to improve it. Yet few studies have investigated how MT is actually used outside the lab, and we still lack an understanding of how MT is used in real-world settings, particularly in face-to-face situations. In this paper, we report on an ethnographic study of a multilingual children’s workshop in which the participants used MT to communicate with each other in the real world. We studied how the children used various communication methods, such as gestures and the internet, to compensate for the mistranslations of MT. For example, children tried to understand poorly translated messages by reading alternative translations and used web browsers to search for pictures of unknown objects. Finally, we propose design implications based on our findings.

Keywords: Children’s collaboration · Machine translation · Multilingual workshop · Field study

1 Introduction

Language differences are the main barrier to collaboration in multilingual groups. Machine translation (MT) services are now available and have been used as support systems [1]. They allow a multilingual team to work together without a shared language. Many researchers have tried to support multilingual communication by evaluating [2, 3] and improving MT quality [4, 5]. Some researchers have studied how MT is used in general [6, 7], but how MT supports users in face-to-face communication remains an unstudied area.

Pangaea, a non-profit organization (NPO), organizes an event called Kyoto Intercultural Summer School of Youth (KISSY) once a year. Its goal is to encourage children to develop social bonds across boundaries and to motivate them to communicate with children from different countries who speak different languages. At KISSY, children from different countries collaborate on a shared project using the KISSY tool, a machine translation tool that augments the face-to-face communication among children and staff with different language backgrounds.

© Springer Nature Switzerland AG 2018 H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 59–73, 2018. https://doi.org/10.1007/978-3-319-98743-9_5



Hida [8] studied the KISSY workshops of 2014 and 2015 and found problems in the children’s communication and collaboration. One such problem was that some messages were incomprehensible because of low MT accuracy. However, that work did not examine how the children overcame the problems caused by MT errors.

In this paper, we report an ethnographic study of KISSY with narrative. The objective is to understand how users collaborated using the MT embedded in the KISSY tool. At KISSY 2017, our team conducted a four-day ethnographic study involving 2 teams, a total of 16 users. From our observations, we identify the solutions the children used when MT failed to help them fulfill their communication goals. Knowing how the children solved communication problems should allow us to better understand the communication difficulties raised by MT. Our research question, ‘How did the children solve the communication problems they encountered when using MT?’, is intended to allow better support tools to be designed in the future, especially for users of low-resource languages. Low-resource languages are less-studied languages, minority languages, or languages with few technological support resources and corpora. Our results should help HCI researchers to better understand users’ problems and behaviors when using MT, and thus to design more effective support systems for multilingual collaboration. Based on the results, we suggest design features for a multilingual tool that improves overall communication.

2 Related Work

2.1 Multilingual Communication Support

A variety of studies have aimed at supporting multilingual communication. Imoto [9] introduced a tool that translates sentences and displays possible answers based on the question’s intention type; the authors claim that their system can be used in some specific settings, e.g., hospitals. Many researchers are trying to support multilingual communication by improving the quality of MT, and some have suggested that involving humans in the translation process can improve translation quality and user understanding [5, 10]. For example, Avramidis [10] integrated a human interpreter into the process of rating and post-editing machine translated messages. Morita [11] proposed a method that uses monolinguals to boost the fluency and adequacy of both sides of two-language machine translation. Other existing works consider back-translation [12], which was originally created to investigate translation quality and has also been adopted for MT. Shigenobu’s study [6] indicates that showing back-translation output can improve the accuracy of outward translation.

Previous studies have developed various multilingual collaboration support tools that use MT systems, including AnnoChat [4], Langrid Chat [13], and the Online Multilingual Discussion Tool (OMDT) [14]. These support systems were designed for adults and mainly used for general communication. The YMC system, an existing MT system designed for children [15], was created for multi-language knowledge communication between children and adult experts.
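The back-translation idea mentioned above can be illustrated with a minimal sketch. It is not taken from any of the cited systems: the `translate` callable, the toy phrase table with its placeholder language code `xx`, and the word-overlap score are all assumptions introduced here for illustration. A real tool would call an MT service and might judge the round trip quite differently.

```python
from typing import Callable, Dict, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]

# Toy phrase table standing in for a real MT service (illustrative only).
# The round trip is deliberately lossy, as MT output often is.
TOY_TABLE: Dict[Tuple[str, str], Dict[str, str]] = {
    ("en", "xx"): {"which role do you want": "<xx:role-question>"},
    ("xx", "en"): {"<xx:role-question>": "which role should i take"},
}


def toy_translate(text: str, src: str, dst: str) -> str:
    """Look the text up in the toy table; return it unchanged if unknown."""
    return TOY_TABLE.get((src, dst), {}).get(text, text)


def back_translate(
    text: str, src: str, dst: str, translate: Translator
) -> Tuple[str, str]:
    """Return (outward translation, back-translation into the sender's language)."""
    outward = translate(text, src, dst)
    return outward, translate(outward, dst, src)


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lower-cased word sets; a crude similarity signal."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)
```

In such a design, the sender would see the back-translation next to the original message and could rephrase before sending when the two diverge; the overlap score is only one hypothetical way to flag a divergence automatically.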

2.2 Difficulties of Using Machine Translation

Although MT is a useful tool for multilingual communication, it can still create difficulties due to its unreliable quality. Yamashita et al. [4] studied how MT affects human communication. They gave pairs of users two sets of ten tangram figures placed in different sequences and instructed the users to match the arrangements using an MT tool. They found that using MT led to asymmetries in the machine translation process, which caused trouble in identifying tangrams and sequences through the expressions used and accepted. The asymmetric quality of each MT service can also cause difficulties, especially for low language resource users who cannot converse very well, because they cannot understand messages and communicate correctly. A group of researchers [16] proposed a method that makes the choice of communication channel dependent on MT quality and users’ language skill; it helps to balance the opportunity for participation in the conversation. However, many MT problems remain to be studied.

3 Kyoto Intercultural Summer School for Youth (KISSY)

Children aged 8 to 14 from different countries gathered at a university to participate in a workshop and collaborate with each other; no foreign language skills were required. The main task of the workshop was to create a short clay animation using clay figures. The participants were asked to create a story from one or two given objects: one brown rectangular block, and one white clay piece shaped like a bottle gourd. Each team had to create a scenario, model the clay, take photos, draw backgrounds, record sound effects, assemble the results, and edit the videos.

The workspaces of the teams were separate but within the same hall; partitions were not used. Each team had a main table for discussions around laptop PCs; the children sat in a U-shape facing the middle of the table, see Fig. 1. There was a shared screen linked to the PC of the team leader, who sat next to the screen. There was a table for clay sculpting next to the photo booth. Each group had its own editing table with two PCs for sound and video editing. The positions of the photo booth, clay work table, and editing table differed slightly from team to team. The seating order of the participants within the group was changed a few times during the four-day workshop. In addition to the team work spaces, there was a space for administrative use, for example to distribute equipment to different parts of the activity hall. This area also had a small tent, some distance from the team work spaces, for making sound and voice recordings.

3.1 KISSY Tool

Fig. 1. Sketch of team work space

Fig. 2. A screen shot of KISSY tool team chatroom

The KISSY tool is a web application with various functions to support multilingual collaboration; it was created specifically for KISSY. In order to use the system, each participant was provided with a laptop PC and an internet connection. Each user was given a username and a password to access the system. Users could choose the interface language that they were comfortable with, from English, Khmer (the Cambodian language), Korean, and Japanese. The main function was the multilingual chatroom, where a user could input text in her/his language and see everything in her/his language while the other users saw the text in their own languages. The team chatroom has two interfaces for sending and reading messages. One is a general chat interface in which the arrival of a new message pushes out the oldest message being shown, see Fig. 2. The other gives every participant her/his own box, and messages from each participant appear only in that person’s box, as in Fig. 3. This interface makes it easy to see the messages of all users at once.
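The two chatroom views described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the KISSY tool’s implementation: the class `Chatroom`, its method names, the `limit` parameter, and the `tagging_translator` stand-in for an MT service are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]


@dataclass
class Chatroom:
    translate: Translator
    languages: Dict[str, str] = field(default_factory=dict)   # user -> language code
    log: List[Tuple[str, str]] = field(default_factory=list)  # (sender, original text)

    def join(self, user: str, lang: str) -> None:
        self.languages[user] = lang

    def post(self, sender: str, text: str) -> None:
        # Messages are stored in the sender's language and translated on display.
        self.log.append((sender, text))

    def _render(self, sender: str, text: str, viewer: str) -> str:
        src, dst = self.languages[sender], self.languages[viewer]
        return text if src == dst else self.translate(text, src, dst)

    def timeline(self, viewer: str, limit: int = 20) -> List[Tuple[str, str]]:
        """General chat view: only the newest `limit` messages are shown."""
        recent = self.log[-limit:] if limit > 0 else []
        return [(s, self._render(s, t, viewer)) for s, t in recent]

    def boxes(self, viewer: str) -> Dict[str, List[str]]:
        """Per-participant view: each sender's messages appear in that sender's box."""
        view: Dict[str, List[str]] = {user: [] for user in self.languages}
        for sender, text in self.log:
            view[sender].append(self._render(sender, text, viewer))
        return view


def tagging_translator(text: str, src: str, dst: str) -> str:
    # Stand-in for a real MT service: it only tags the text with the language pair.
    return f"[{src}->{dst}] {text}"
```

Translating at display time, per viewer, is one possible design choice; it keeps the original text available, which matters later when a message has to be re-read in a different language.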


Fig. 3. A screen shot of discussion screen

Fig. 4. A screen shot of ideaboard

Figure 4 shows the ideaboard, the interface of another frequently used function of the KISSY tool. The team leader could use the ideaboard to pose a question, and the children could express their ideas by typing responses in their own language. Each member could vote (click) for one favorite idea per question.

The MT services used by the KISSY tool were provided through the Language Grid [17, 18]. The services selected for each pair of languages were provided by Google Translate.
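The ideaboard rule of one vote per member per question can be sketched as below. The class and method names are hypothetical, and the behavior on a second click (moving the vote rather than rejecting it) is an assumption, since the text does not say how the tool handled repeated clicks.

```python
from collections import Counter
from typing import Dict, List, Tuple


class IdeaboardQuestion:
    """One question on a hypothetical ideaboard."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.ideas: List[str] = []        # ideas in the order they were posted
        self.votes: Dict[str, int] = {}   # member -> index of the idea voted for

    def add_idea(self, idea: str) -> int:
        """Store an idea and return its index."""
        self.ideas.append(idea)
        return len(self.ideas) - 1

    def vote(self, member: str, idea_index: int) -> None:
        if not 0 <= idea_index < len(self.ideas):
            raise IndexError("no such idea")
        # Assumption: a second click moves the member's vote instead of adding one,
        # so each member holds at most one vote per question.
        self.votes[member] = idea_index

    def tally(self) -> List[Tuple[str, int]]:
        """Return (idea, vote count) pairs in posting order."""
        counts = Counter(self.votes.values())
        return [(idea, counts.get(i, 0)) for i, idea in enumerate(self.ideas)]
```

Keying the votes by member is what enforces the limit: however often a member clicks, only one entry per member exists for a given question.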


4 Method

We conducted an ethnographic study by observing the participants and staff at KISSY. Ethnography is the most basic form of social research [19]. It is a method, most often used in anthropology, that involves encountering, respecting, recording, understanding, and representing human experience [20]. This method is now practiced in various disciplines.

Of the five teams participating in KISSY, two teams were observed for this study. Each team consisted of one adult facilitator, called the team leader, and seven participants. Team Red had a Japanese team leader, two Korean children, four Japanese children, and one Cambodian child. Team Green had a Japanese team leader, three Korean children, three Japanese children, and one Cambodian child. The Korean children could communicate in simple English, while most Japanese children could not communicate in English. The Cambodian children also spoke very little English.

Videos were recorded from afar to minimize interference with the activities and to encourage the participants to relax and react in a natural manner. The videos were analyzed in our laboratory. After the event, we conducted face-to-face and online interviews. For each team, we interviewed two children and the team leader. We also interviewed bilingual and trilingual staff who were not part of any team but helped with the translation.

5 Coping with Mistranslation

During the workshop, when a mistranslation occurred, users tried to solve the problem both by themselves and by asking a human interpreter for help, as shown in Fig. 5. The chart is discussed in detail in the following subsections.


Fig. 5. Behavior of users when MT translated message was not understandable

5.1 Alternative Channels of Communication

When an MT problem occurred and users shared a language, even if their skill in that language was low, they turned to face-to-face communication. As shown in Fig. 5, the alternative methods included using a shared language and other media, including the screen, drawings, gestures, pictures, and language learning books. Often they used more than one communication channel at a time, usually including their shared language even though their skill in it was low. Here is an example of MT failure and the users’ response. The paragraphs in italic are from the ethnography transcript.

During role decision of Team Green, the team leader (TL2) asked everybody a question via the ideaboard function. He typed this question in Japanese, which basically means ‘what role do you want to take?’. It appeared on the screen of the Cambodian child (C2) as ‘A position that is this good?’ in Khmer. C2 did not understand. Since she sat next to TL2, she immediately turned to him, and TL2 noticed that she could not understand the question. He switched his tool into English language mode and tried to speak with her in English. She understood. The translation in English was shown as ‘Which role should I take?’. TL2 noticed the mistranslation and immediately corrected it by saying ‘Not me, but you’.

In this situation, they both tried to communicate, using gesture when C2 turned to TL2, the alternative translation on TL2’s screen, and then English.

5.2 Asking Others

Many times, the MT tool failed to enable users to communicate, especially the users of a low-resource language (the Cambodian children). There was only one Cambodian staff member to support three Cambodian children in three different groups.

Fig. 6. Cambodian staff helped a Cambodian child by reading English translation instead of Khmer translation.


In the interview, the Cambodian staff member stated, “They only understood Khmer language and the translations on the screen in Khmer weren’t correct. It was almost correct if English was translated into Khmer. But they typed in Korean or Japanese so this is the problem for translation, I think. The kids need more help from me in this activity.”

As shown in Fig. 6 and the interview, the Cambodian staff member helped the children to understand messages by switching the interface to English and reading the English texts instead of the Khmer ones. When the translated messages were incomprehensible, the communication changed from MT among the participants and team leader to face-to-face communication with a human interpreter.

When the bilingual staff member helped, he normally used one of two options. We asked how he helped the children when a Khmer message was difficult to understand. He answered, “Sometimes I checked in English, if I still did not understand because of translation from Japanese or Korean to English, I asked the team leader to explain it to me in English”. First, he tried to understand the message by reading its translation into another language. For example, the Cambodian staff member, who could speak both Khmer and English, helped the Cambodian child by reading messages in English and translating them for her, since the English translation was of higher quality and thus easier to understand. Second, if he still could not understand the message, he asked the team leader directly in their shared language and relayed his understanding to the child.

6 Understand Culturally Dependent Context

Sometimes the children could not understand each other when the messages or words depended on the culture of the originator. At the beginning of the workshop, our staff showed a brown rectangular block, as shown in Fig. 7, to the children and asked them what they thought this block looked like.

Fig. 7. A clay block was shown to the children. (Color figure online)


One of the Japanese children used the KISSY tool to say that it looked like ‘Anko’, or Japanese style red bean paste. People from a different culture could not understand the comment, because red bean paste in other countries does not look like a block. Moreover, the Khmer MT output was a phrase meaning ‘something made by Japanese’, and Khmer speakers could not understand the whole translated phrase. Since the MT translated message was incomprehensible, face-to-face communication and a human interpreter were needed to help the children reach a mutual understanding of the object. As shown in Fig. 8, the Cambodian staff member helped the Cambodian participant to search for photos of Anko. The photos helped them understand why the Japanese participant referred to the object as red bean paste.

Fig. 8. Cambodian staff helped the Cambodian child to search for ‘Anko’ picture

A similar situation arose when the name of a thing in one language meant something different in another language. During the self-introduction period, when the children were talking about movies, MT could translate the movie names correctly, but many movies have specific names in different languages. As a consequence, the users had trouble understanding which movie was being referred to. In this case as well, they solved the problem by searching for images or posters of the movie to show the others.

7 Tool Navigation and Instruction

During the workshop, the children could use different software or different windows in the KISSY tool. The tool includes many functions, for example, the ideaboard and the chatroom. Working as a team requires that everyone be on the same page for collaboration to succeed.

In the workshop, TL2 had just finished creating a new question on the ideaboard, but one Korean child (K1) was using a webpage outside the KISSY tool. Since K1 was not on the chatroom page, she did not know what was going on and which page should be viewed at that moment. TL2 told an English-Korean interpreter (I3) to tell K1 “Go to the next question”.


As shown in Fig. 9, the team leader wanted everybody to look at the ideaboard after he finished entering a question. Because K1 was not viewing the chatroom screen, it was impossible for her to read messages even if TL2 sent a message to the chatroom for everybody to navigate to ideaboard. In this case, direct face-to-face communication or communication via human interpreter was needed. Because MT cannot cope with this kind of situation, an alternative communication channel was used, and an interpreter was needed.

Fig. 9. Interpreter (I3), on the left-hand side, helped team leader (TL2), the man on the right, to talk to (K1), the girl who is using a PC on the left, about what to do next.

Sometimes, the team leader communicated non-verbally by showing the page on the shared screen and pointing to it so that the children could follow him to the same page. In another example, when the users were not familiar with the tool or faced difficulties with it, they shifted to face-to-face communication, as in Fig. 10.

At the very beginning of the workshop, while the other children were voting for the idea they liked, C2 was still not sure what to do. TL2 had to point at her screen and ask her “Can you choose one?” verbally in English.

In this case, users could not communicate using MT in the KISSY tool. Hence, conversation in a shared language was needed when there was no human interpreter.

Fig. 10. Team leader (TL2), on the right sitting next to the shared screen, communicated directly to the Cambodian Child (C2), the girl next to him.


8 Substituting Machine Translation

In many situations, MT was not used. This section describes situations wherein other communication methods were preferred over MT.

8.1 Using Common Words and Signs

In the workshop, when a word was common to all the languages, or simple enough to be understood by everyone, face-to-face communication was often used instead of MT. Examples include “Okay” and the names of objects that everybody knew, such as “Soba” and “Sushi”. Non-verbal communication was also used. For example, pointing the index finger down at the keyboard meant “Vote!”, and clapping hands expressed that something had been done. Gestures of this kind can be easily understood by every member.

8.2 Involving Physical Objects in Communication

When the communication involved physical objects, MT was used less often. The following situation is from the video taken on the morning of the third day of the workshop.

Team Red had a short meeting before working separately. In the meeting, before the team leader (TL1) explained the work plan of that day using a physical board with written papers on it, he called a Korean-Japanese interpreter (I1) and a Khmer-English interpreter (I2) to help him. Then he explained the work, mostly in Japanese and sometimes in English, while pointing at the board from time to time. I1 translated what TL1 said from Japanese to Korean for the Korean children in parallel. TL1 spoke English later, but not for all of the messages he had given in Japanese.

Instead of using MT, TL1 decided to ask for the interpreters’ help and speak directly in his mother language. In this case, using machine translation would have made the use of gesture and the physical board difficult, especially as the MT tool requires typing. However, not using MT can cause inequality in the successful receipt of messages. In this situation, I2 could not speak Japanese, so he had to wait until TL1 spoke English to him. TL1 spoke much shorter sentences in English due to his limited ability to communicate in English, and this prevented C1 from understanding the whole meeting, while the Japanese and Korean children could understand more quickly what was going on.

9 Discussion

Our investigation of KISSY 2017 revealed problems that go deeper than the conversation and translation level.

9.1 Low Language Resource User Support and Problem Detection

In good collaboration, team members participate equally and actively in the conversation and activity. In KISSY, the low language resource users, the Cambodians, faced the biggest barrier to participation. One problem was the quality of Khmer MT. Since the language is low resource for translation, the messages translated from and to Khmer were difficult to understand and sometimes incomprehensible. When the messages on the screen do not make sense, it is difficult to know what people are talking about, and it is almost impossible to talk or express one’s ideas.

A consequent problem is that it is difficult to detect when a user needs help. If the children know they need help and ask for it, their problem can be solved more easily, but many times they did not recognize that they needed help. During the workshop, the team leaders often had to identify who needed help and then ask the interpreters to help those children. The interpreters also looked around and checked whether any of the children needed help. However, if help is not promptly available, the children’s participation suffers.

9.2 Human Interpreter Task Overload

When the children had to solve their understanding problems, whether by using alternative communication methods or by asking the interpreter for translation help, the communication was usually one-to-one. Unfortunately, the KISSY multilingual chat is inherently unsuitable when the goal of communication is one-to-one. At the workshop, when the team leader and the Cambodian child wanted to communicate one-to-one, the Cambodian interpreter was called to help with the translation.

The number of available staff is limited, especially for minority languages. One-to-one communication without multilingual tool support increased the need for human interpreters. Because of the problems caused by MT and other factors, the Cambodian children needed much more help than the other children. The Cambodian interpreter also reported that he could not manage to help all the children at the same time. Many times, the children needed his help but saw him busy with others; they did not want to disturb him and so waited until he was free. The time spent waiting delayed their participation.

10 Design Implication

10.1 Design Implication to Support Communication with Low MT Quality

Image Browser in Multilingual Chatroom
As mentioned with regard to culturally-dependent context, the participants tried to search for images and show them to the other participants to help their understanding. However, it was not convenient to search for and share images, since no shared display was provided other than that controlled by the team leader. Adding a shared image browsing function to the tool could save time and raise user effectiveness. It would also encourage the users to use more photos or figures to express their ideas and to understand the others. This design guideline can also help to solve the problem of understanding culturally dependent context that cannot be explained easily in words, for example, travel attractions and ethnic foods.

Interpreter Calling Function with Prioritization
One of the main problems for minority users was the paucity of MT and human resources, since MT quality is low for low-resource languages and it is difficult to find speakers of the minority language to support the children. Predicting the help needed would be a useful function, as children often failed to notice that they needed help or were too shy to ask for it. A prediction model could be built by timing the periods of inactivity of each participant. Raising the priority of users who have been idle or who need more help might be useful. Developing a language profile of each member of the team is also possible. If two children are waiting for help but one of them speaks better English as a shared language with the team leader, that child might need less help. The flag for help can also be sent to the team leader, since the team leader sometimes checked the progress of minority users and tried to communicate directly or called for the interpreter to help them.

Showing Translated Result in Known Foreign Language in Parallel
In the workshop, a staff member who was a non-native English speaker helped the children to understand written messages by reading the messages in English, instead of reading the poorly translated messages in his own language, and translating them for the children.
Even though a user might have limited skill in a second language, messages translated into that second language could still be more readable than low quality messages in the main language. Hence, showing both results, those in the user’s language and those in the user’s foreign language, could increase the probability of understanding messages.
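Two of the functions proposed in this subsection, the interpreter calling function with prioritization and the parallel display of translations, can be sketched briefly. The sketch is illustrative only: the `Participant` fields, the 60-second idle threshold, the scoring formula, and all function names are assumptions made here, not features of the KISSY tool or values derived from the study.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Participant:
    name: str
    idle_seconds: float           # time since the participant last typed or clicked
    shared_language_skill: float  # 0.0 (none) to 1.0 (fluent), language shared with the leader
    mt_quality: float             # 0.0 (unusable) to 1.0 (good) for the participant's language


def help_priority(p: Participant, idle_threshold: float = 60.0) -> float:
    """Hypothetical score: a higher value means the interpreter should come sooner."""
    idle_part = min(p.idle_seconds / idle_threshold, 2.0)  # cap the effect of long idling
    need_part = (1.0 - p.shared_language_skill) + (1.0 - p.mt_quality)
    return idle_part * need_part


def interpreter_queue(
    participants: List[Participant], idle_threshold: float = 60.0
) -> List[str]:
    """Flag participants idle for at least the threshold, most urgent first."""
    flagged = [p for p in participants if p.idle_seconds >= idle_threshold]
    flagged.sort(key=lambda p: help_priority(p, idle_threshold), reverse=True)
    return [p.name for p in flagged]


def parallel_view(
    translations: Dict[str, str], first_lang: str, second_lang: str
) -> List[str]:
    """Show a message in the user's first language and, if available, a second one."""
    lines = [f"{first_lang}: {translations[first_lang]}"]
    if second_lang != first_lang and second_lang in translations:
        lines.append(f"{second_lang}: {translations[second_lang]}")
    return lines
```

In this sketch, a child who shares more English with the team leader, or whose language is better served by MT, receives a lower score for the same idle time, which mirrors the language-profile idea above. The same queue could be shown to the team leader as a flag for help.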

10.2 Design Implication for More Convenient Communication

MT for 1:1
Even though collaboration tools should focus on group communication, one-to-one communication is sometimes also needed to run the team activity, as mentioned already with regard to human interpreter task overload. Providing 1:1 translation on stand-alone portable devices would be extremely useful if permitted by the group activities. Such a function would allow members to share messages directly without having to involve everybody. This would help to reduce the costs incurred by relying on human interpreters.

Graphic Signs and Keywords for Changing Method of Communication
We already noted that users sometimes used common words and signs as a substitute for MT. Better communication requires the greater use of a common language. Images yield better and easier understanding even if the users do not speak the same language. Thus, ideograms might be useful for multilingual collaboration without a shared language. For example, showing the chatroom logo to the children would ensure that they turned to the chatroom.


Having some basic shared keywords is another way to make communication easier. We can give the participants a list of keywords, with translations and pronunciations in their languages, before the workshop starts. The keywords can be those that are often used in the event; for instance, at KISSY, “Vote” for “Let’s vote!”.

11 Conclusion

We reported on a field study of machine translation (MT) usage in a social collaboration event for children. The children were asked to conduct a project using the KISSY tool, an MT embedded system for multilingual communication. In the workshop, participants and staff faced various types of problems due to and related to the use of MT. They chose alternative communication methods when they could not understand the translated messages. The alternative methods involved using a shared language, screen sharing, drawings, gestures, pictures, etc. When problems arose due to cultural differences or culturally-dependent words, they turned to an interpreter for help and/or used web browsers to search for related photos to increase understanding or to confirm the understanding of the others. They also needed to communicate face-to-face when, for example, one or more users were not on the chatroom page and therefore could not read the instruction messages.

Some problems have yet to be solved. Better support for low language resource users is still needed. The interpreters can become overloaded, particularly for low-resource languages.

Finally, we drew a few design implications based on our study. We suggest that future designs consider the inclusion of image browsers to assist user understanding, a 1:1 translation function in addition to the group chat, an interpreter calling function with priority, and the use of common keywords or images together with MT. Showing the translation results in the user’s second language in parallel with her/his mother language could also be effective if the user’s first language is a low-resource language or machine translation quality is low.

Acknowledgments. This research was partially supported by a Grant-in-Aid for Scientific Research (A) (17H00759, 2017–2020) from Japan Society for the Promotion of Science (JSPS), and the Leading Graduates Schools Program, “Collaborative Graduate Program in Design” by the Ministry of Education, Culture, Sports, Science and Technology, Japan.

References

1. Ishida, T.: Intercultural collaboration and support systems: a brief history. In: Baldoni, M., Chopra, A.K., Son, T.C., Hirayama, K., Torroni, P. (eds.) PRIMA 2016. LNCS (LNAI), vol. 9862, pp. 3–19. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-44832-9_1
2. Gaudio, R.D., Burchardt, A., Branco, A.: Evaluating machine translation in a usage scenario. In: Proceedings of the Language Resources and Evaluation Conference 2016 (2016)
3. Scarton, C., Specia, L.: A reading comprehension corpus for machine translation evaluation. In: Proceedings of the Language Resources and Evaluation Conference 2016 (2016)
4. Yamashita, N., Ishida, T.: Effects of machine translation on collaborative work. In: Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work, pp. 515–524. ACM Press, New York (2006)
5. Chunqi, S., Lin, D., Ishida, T.: Agent metaphor for machine translation mediated communication. In: Proceedings of the 2013 International Conference on Intelligent User Interfaces, pp. 67–74. ACM Press, New York (2013)
6. Shigenobu, T.: Evaluation and usability of back translation for intercultural communication. In: Aykin, N. (ed.) UI-HCII 2007. LNCS, vol. 4560, pp. 259–265. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73289-1_31
7. Hara, K., Iqbal, S.T.: Effect of machine translation in interlingual conversation: lessons from a formative study. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 3473–3482. ACM Press, New York (2015)
8. Hida, S.: Supporting multi-language communication in children’s workshop. Master’s thesis. Kyoto University, Kyoto, Japan (2016)
9. Imoto, K., Sasajima, M., Shimomori, T., Yamanaka, N., Yajima, M., Masai, Y.: A multi modal supporting tool for multi lingual communication by inducing partner’s reply. In: Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 330–332. ACM Press, New York (2006)
10. Avramidis, E., Burchardt, A., Federmann, C., Popovic, M., Tscherwinka, C., Vilar, D.: Involving language professionals in the evaluation of machine translation. In: Proceedings of the Language Resources and Evaluation Conference 2012, pp. 1127–1130 (2012)
11. Morita, D., Ishida, T.: Collaborative translation by monolinguals with machine translators. In: Proceedings of the 14th International Conference on Intelligent User Interfaces, pp. 361–366. ACM Press, New York (2009)
12. Brislin, R.W.: Back-translation for cross-cultural research. J. Cross-Cult. Psychol. 1(3), 185–216 (1970)
13. Inaba, R.: Usability of multilingual communication tools. In: Aykin, N. (ed.) UI-HCII 2007. LNCS, vol. 4560, pp. 91–97. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73289-1_11
14. Nakaguchi, T., Otani, M., Takasaki, T., Ishida, T.: Combining human inputters and language services to provide multi-language support system for international symposiums. In: Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT 2016), pp. 28–35 (2016)
15. Kita, K., Takasaki, T., Lin, D., Nakajima, Y., Ishida, T.: Case study on analyzing multilanguage knowledge communication. In: The International Conference on Culture and Computing (ICCC 2012) was Organized with a Symposium on Digital Media and Digital Heritage to Show the Latest Research and Development Results in the State of the Art on Cultural Computing Technologies And Traditional Culture, pp. 35–42 (2012)
16. Pituxcoosuvarn, M., Ishida, T.: Enhancing participation balance in intercultural collaboration. In: Yoshino, T., Yuizono, T., Zurita, G., Vassileva, J. (eds.) CollabTech 2017. LNCS, vol. 10397, pp. 116–129. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63088-5_11
17. Ishida, T., Murakami, Y., Lin, D., Nakaguchi, T., Otani, M.: Language service infrastructure on the web: the language grid. IEEE Comput. 51(6), 72–81 (2018)
18. Ishida, T. (ed.): The Language Grid: Service-Oriented Collective Intelligence for Language Resource Interoperability. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21178-2
19. Hammersley, M., Atkinson, P.: Ethnography: Principles in Practice. Routledge, Abingdon (2007)
20. Willis, P., Trondman, M.: Manifesto for ethnography. Ethnography 1(1), 5–16 (2000)

Learning Support System

A Presentation Supporting System for Programing Workshops for Elementary School Students

Koki Ito, Maki Ichimura, and Hideyuki Takada

Faculty of Information Science and Engineering, Ritsumeikan University, 1-1-1 Noji-higashi, Kusatsu, Shiga 525-8577, Japan
k [email protected], [email protected]
http://www.cm.is.ritsumei.ac.jp/lab/top.html

Abstract. Interest in learning to program is growing, and many programming workshops are now held for elementary school students. In these workshops, presentations are carried out to share completed projects with other participants. A common presentation method is to show static slides containing an explanation of the project on a screen, which diminishes the dynamic feel of the project. We assume that it is important to demonstrate the project directly to the audience rather than to use static slides. However, it is not easy for elementary school students to speak while keeping the key points of their projects in mind during a demonstration. To improve this situation, we have developed a presentation support system for programming workshops. The system enables students to easily create and view presentation material through the provided interface. Applying the system to an actual workshop confirmed that presenting while browsing material about the completed project was effective and led students to speak spontaneously and voluntarily.

Keywords: Programming education · Computational thinking · Presentation

1 Introduction

In Japan, programming will become a compulsory part of elementary education from 2020 [3]. Many programming workshops are held for students by various organizations, including non-profit organizations and companies, and interest in learning to program is increasing in Japan. The purpose of programming education is to foster elementary school students’ computational thinking [5,6]. Teaching methods that aim to nurture computational thinking have therefore been implemented.

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 77–88, 2018. https://doi.org/10.1007/978-3-319-98743-9_6


In programming workshops for elementary school students, visual languages such as Scratch [7] are often used. One way for students to present their completed projects is to prepare slides and then explain the projects using them. However, even though the completed project is a tangible result of actual programming, a presentation built on static slides loses the dynamic feel of the project and tends to be boring. In this research, we consider a presentation that shows the completed work in action to be more effective than one that uses only slides. On the other hand, demonstrating the running project is difficult for elementary school students because they need to think about what to say while presenting.

To overcome this problem, we developed a presentation support system for programming workshops and evaluated its effectiveness at programming workshops for elementary school students. The system helps students prepare material summarizing the content of their presentations and view it as auxiliary material while presenting. During the presentation, students explain the project displayed on the screen while checking the prepared material. For material creation, we provide a function that makes it easier to summarize the presentation by focusing on the behavior of the objects in the project. If elementary school students can prepare material summarizing the content of the presentation, speaking is expected to become easier for them.

The rest of this paper is organized as follows. Section 2 presents the background of this research. Section 3 describes the presentation support system for programming workshops. Section 4 evaluates the proposed system. Section 5 concludes the paper and discusses future work.

2 Background

2.1 Programming Education

The Ministry of Education, Culture, Sports, Science and Technology held a meeting to discuss the significance of programming education in primary education. The purpose of the meeting was to gather expert knowledge from various fields and to build a shared understanding of the significance of programming education at the elementary school level, so that every elementary school can smoothly carry out programming classes in the future. In programming classes at elementary schools, students are expected to realize that computers are used in daily life and that solving problems requires certain procedures, while experiencing that users can instruct a computer to perform an intended process. In addition, students are expected to acquire basic “computational thinking” based on the thinking skills gained through primary education, along with a positive attitude toward using computers in their daily lives.

2.2 Problems of Presentation

Many programming workshop organizers hold a presentation session in their workshops. This is an effective way to share projects among the participants. Sharing projects is part of the “creative thinking spiral,” proposed as a process model for fostering creative thinking skills [7]. The spiral nurtures creative thinking by repeating the process of “create,” “play,” “share,” and “reflect.” A presentation fulfills part of the “share” role, and for the spiral to work, the presentation has to fulfill that role well.

One of the major presentation methods is to use presentation tools such as PowerPoint and Keynote, with slides that include media such as text, images, and movies. However, from the viewpoint of sharing the content of projects in a programming workshop, showing the completed programming work itself is more effective than using static slides. This presentation method has the problem that it is difficult for elementary school students to explain their projects. To present the work directly, students need to speak while anticipating how the project will behave next. Therefore, students need to summarize what they want to present about the project before making the presentation.

2.3 Related Research

Research has been conducted on a system that promotes the sharing and reviewing of elementary school students’ projects in workshops using a visual language [4]. In that research, the authors applied a system that allows projects to be shared within a visual programming environment to programming workshops and observed the sharing activities of the elementary school students. They found that sharing projects with other students increased motivation and led students to review their own projects.

Research has also been conducted on interfaces for creating presentation material and giving presentations aimed at IT beginners [2]. This research showed that IT beginners who are unfamiliar with a keyboard and a mouse need a material preparation function that they can use easily. The authors cite the document editing functions of PowerPoint as an example of an interface that is not easy for IT beginners to use. We consider that elementary school students face the same difficulty in creating material as IT beginners, and that existing presentation tools are therefore not suitable for them. For elementary school students, it is necessary to design an easy-to-understand interface for material preparation and to build a system that enables them to create material easily.

3 Presentation Support System for Programming Workshops

In this section, we describe the requirements definition, functions, and implementation of the proposed system, along with a usage example.

3.1 Requirements Definition

As shown in Fig. 1, we propose a system that supports presentations in which elementary school students show their programming projects directly to the audience. After a project is completed, the proposed system helps the participant prepare material for the presentation. The material can be checked on the material browsing PC, which supports the presenter’s speech while he or she controls the project displayed on the project demonstration PC.

Scratch is a programming learning environment in which programs are created by combining blocks. In Scratch, an object that holds programs created from blocks is called a “sprite,” and each sprite holds images called “costumes.” In a presentation, presenters explain the intention of the project and which parts of the work they consider well made. Because sprites are the objects for which programs are created in Scratch, the presentation is also likely to concern sprite behavior. Therefore, material that summarizes explanations of sprite behavior can support the presentation. In addition, other sprites are sometimes involved in the behavior of a sprite. In such cases, the material may include a description of an action that results from the interaction of two sprites.

The interface for material preparation allows presenters to summarize the content of the presentation around the behavior of the sprites contained in their projects. Furthermore, we set a guideline for the presentation content: we encourage presenters to describe the content by reviewing their projects from two viewpoints, namely the characteristics of the project and the functions to be developed in the future.

[Figure 1 labels: 1 Material creation; 2 Presentation; “Where did you put the most effort?”; Material browsing PC; Project demonstration PC; control]

Fig. 1. System overview

3.2 Functions

Figure 2 shows the structure of the screens displayed when creating and viewing the material. There are three pages to fill in: the first page is for “name” and “title,” the second page is for “Where did you put the most effort?”, and the third page is for “Where you want to improve.” The second and third pages are shown in Fig. 3. These pages provide a sprite display, a paint tool, a save button, and page transfer.

Sprite Display. The sprite display shows images of the sprites that are explained in the material. As shown in Fig. 4, the displayed sprite can be switched by clicking on the area where the sprite is displayed. When workshop participants prepare their material, the sprite images from each participant’s project are displayed on the page. Because a description of sprite behavior usually involves one or two sprites, up to two sprites are displayed on the page. If a participant wants to describe only one sprite, a white rectangular image is displayed in place of the other sprite.

Paint Tool. As shown in Fig. 5, the paint tool has a drawing space for writing an explanation of sprite behavior, along with a button for switching between a pen and an eraser, buttons for switching among four colors, and a canvas clear button. Touching the drawing space with a mouse or a pen activates the pen tool for freehand drawing. Switching to the eraser makes it possible to erase what has been drawn. Pressing the canvas clear button erases the entire contents of the drawing space.

Save Button. When the save button is pressed, the sprites displayed on the screen and the drawing in the drawing space are saved in the database.

Page Transfer. When the page transfer button is pressed, the screen switches among the three pages.

3.3 Implementation of the Proposed System

An overview of the implementation of this system is shown in Fig. 6. This system is implemented as a web application. Users prepare material on the client terminal. Each client terminal has a unique ID for differentiation. Up to 16 client terminals can be used in this implementation. In addition, a server is provided with an image folder allocated for each ID and a database using MySQL for saving the drawing contents of each terminal. The web application is implemented in JavaScript using the library deck.js [1]. Requests for sprite data and saving the contents of the created material to the server are implemented using PHP.
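The per-terminal storage described above can be pictured with a small sketch. It is not the authors' PHP and MySQL code: it uses Python with an in-memory SQLite database as a stand-in, and the table name, column names, and function names are all hypothetical. It only illustrates the idea of storing a drawing as Base64 text keyed by terminal ID and page, and restoring it later.

```python
import base64
import sqlite3


def open_store():
    """Create an in-memory stand-in for the server database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE drawings ("
        " terminal_id INTEGER NOT NULL,"
        " page INTEGER NOT NULL,"
        " image_b64 TEXT NOT NULL,"
        " PRIMARY KEY (terminal_id, page))"
    )
    return conn


def save_drawing(conn, terminal_id, page, image_bytes):
    """Encode the drawing as Base64 text and store it, replacing any earlier save."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    conn.execute(
        "INSERT OR REPLACE INTO drawings (terminal_id, page, image_b64)"
        " VALUES (?, ?, ?)",
        (terminal_id, page, encoded),
    )
    conn.commit()


def restore_drawing(conn, terminal_id, page):
    """Return the stored drawing as bytes, or None if nothing was saved."""
    row = conn.execute(
        "SELECT image_b64 FROM drawings WHERE terminal_id = ? AND page = ?",
        (terminal_id, page),
    ).fetchone()
    return None if row is None else base64.b64decode(row[0])
```

Keying each row by terminal ID and page keeps one saved drawing per page per terminal, so saving again simply overwrites the earlier version.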


Fig. 2. Pages for creating materials

[Figure 3 labels: “Where did you put the most effort?”; Sprite display; Page transfer; Paint tool]

Fig. 3. Functions creating materials

[Figure 4 labels: “Where did you put the most effort?”; Click; Change a sprite]

Fig. 4. Function viewing sprites

[Figure 5 labels: save; clear]

Fig. 5. Button of page for creating materials. (Color figure online)

Transfer Sprite Image to Server. When a user saves a project created with Scratch, a compressed file with the sb2 filename extension is output. The sb2 file contains the image files and audio data of the project in compressed form. The images decompressed from the sb2 file are transferred to the server so that the sprite images can be displayed in the material. In addition, because the images in the file are saved in either PNG or SVG format, the system converts SVG images to PNG so that all images share one format. Ideally, decompressing the sb2 files and transferring and converting the images would all be done on the client terminals. In this implementation, however, we prepared a separate PC for uploading images to the server.

Drawing Web Pages. On the server, the sprite images are stored separately according to the ID of the client terminal. The images and the web application are transferred to each client terminal that has connected to the server and logged in. In addition, the server saves the contents drawn in the web application. The drawing space is created using the HTML canvas element, and its contents are saved as an image. The drawn image is converted to a Base64 character string and saved in the database. Information about which sprites are displayed is saved in the database as well. After the drawing contents are saved from the web application, the client terminal can request the server to restore the drawing contents and the sprite images.

3.4 Usage Example

Users create and store their projects on Scratch. When the web application is activated, the login page will be displayed, and users input the specified login ID. Next, the material creation page will be displayed on the client terminal, and users create and save the material according to the title of the material preparation page. After preparing the material, users confirm the material on the material browsing PC and make a presentation while operating the project demonstration PC.

[Figure 6 labels: PC for uploading images (Shellscript: decompression of project file, picking up images); client (Scratch, web application, JavaScript, restoration); server (PHP + MySQL); data: images, save, ID + drawing data, drawing data]

Fig. 6. Structure of implementation
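The "decompression of project file" and "picking up images" steps in Fig. 6 are done by a shell script in the authors' setup. As a rough illustration only, the same idea can be sketched in Python, relying on the fact that a Scratch 2 `.sb2` file is a ZIP archive. The function name, the folder layout (one subfolder per terminal ID), and the parameters are assumptions made for this sketch, and the SVG-to-PNG conversion step is left out.

```python
import zipfile
from pathlib import Path

# Image formats that the paper says an sb2 project may contain.
IMAGE_SUFFIXES = {".png", ".svg"}


def extract_sprite_images(sb2_path, image_root, terminal_id):
    """Copy the PNG/SVG members of an sb2 archive into a per-terminal folder.

    Hypothetical helper: the layout image_root/<terminal_id>/ is an assumption.
    """
    target = Path(image_root) / str(terminal_id)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(sb2_path) as archive:
        for member in archive.namelist():
            name = Path(member).name  # ignore any directory part of the member
            if Path(name).suffix.lower() in IMAGE_SUFFIXES:
                out = target / name
                out.write_bytes(archive.read(member))
                saved.append(out)
    return sorted(saved)
```

Audio files and the project description inside the archive are skipped, so only files that could serve as sprite images reach the per-terminal folder.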

4 Experiment

This section describes the contents and results of the experiments conducted using the proposed system.

4.1 Outline of Experiments

To verify the effects of the proposed system on presentations, evaluation experiments were conducted at programming workshops organized by the NPO Super Science Kids (SSK). Two experiments were carried out. The proposed system was evaluated through the behavior of the participants, observation of the created material, and a questionnaire conducted after each experiment. We provided each participant with a laptop computer of the kind used in the regular workshops. Each laptop has a touch panel and a pen.

4.2 Experiment Contents

At the regular programming workshops sponsored by SSK, participants first complete a project with Scratch in about three hours and then give a presentation of about two minutes on the project. In the evaluation experiments, we allocated ten minutes for material preparation with the proposed system before the presentations. Each participant then gave a two-minute presentation using the prepared material. Table 1 shows the environment of the experiments. Tables 2 and 3 show the contents of the questionnaires conducted after the experiments.


Table 1. Environment of the experiments

              First                       Second
Place         Kodomo Mirai Kan (Kyoto city)
Date          December 3                  December 17
Participants  3rd: 3, 4th: 5, 5th: 1      3rd: 2, 4th: 1, 5th: 1, 6th: 2

Table 2. Questionnaire contents of creating materials

Question                                                            Format
Q1–1 Could you create the material easily?                          5-level
Q1–2 Could you summarize the content of the presentation well?      5-level
Q1–3 Do you think you will use this system in next presentation?    5-level
Q1–4 Write the reason of Q1–3                                       Free description

4.3 Results

In the experiments, we provided the participants with an interface for material preparation and explained to them on a screen how to use the system. The questionnaire results for creating materials are shown in Table 4. We also provided them with an interface for material confirmation so that they could check the content of the presentation while operating their programming project. The questionnaire results for confirming materials are shown in Table 5. After the experiments, the created material showed evidence that the material creation functions (the sprite display, paint tool, save button, and page transfer) were used. In addition, as shown in Table 6, many of the descriptions in the material concerned the behavior of objects.

4.4 Discussion

Based on the results of the questionnaire, the completed projects, the created material, and the recorded video of the presentations, the proposed system was evaluated from the following two points.

– whether the participants benefited from the way the proposed system encourages writing about sprite behavior
– whether the proposed system made it easier for elementary school students to give a presentation

Table 3. Questionnaire contents of confirming materials

Question                                                       Format
Q2–1 Could you explain about your project well?                5-level
Q2–2 Is this material useful when you did the presentation?    5-level
Q2–3 Write the reason of Q2–2                                  Free description

Table 4. Questionnaire results of creating materials

Question                                                           No. of responses
Q1–1 Could you create the material easily?
  1. Very easy                                                     3
  2. Easy                                                          2
  3. Normal                                                        5
  4. Difficult                                                     5
  5. Very difficult                                                0
Q1–2 Could you summarize the content of the presentation well?
  1. Very well                                                     3
  2. Well                                                          7
  3. Normal                                                        5
  4. Bad                                                           0
  5. So bad                                                        0
Q1–3 Do you think you will use this system in next presentation?
  1. Very useful                                                   6
  2. Useful                                                        7
  3. Fair                                                          0
  4. Not so useful                                                 1
  5. Not useful at all                                             1

Table 5. Questionnaire results of confirming materials

Question                                                           No. of responses
Q2–1 Could you explain about your project well?
  1. Very well                                                     4
  2. Well                                                          7
  3. Normal                                                        4
  4. Bad                                                           0
  5. So bad                                                        0
Q2–2 Was the material useful when you did the presentation?
  1. Very useful                                                   6
  2. Useful                                                        4
  3. Normal                                                        3
  4. Not so useful                                                 0
  5. Not useful at all                                             2


Table 6. Contents on the proposed system

The content of the description using the proposed system    Number
The behavior of two sprites                                  6
The behavior of one sprite                                   4
The appearance of the sprite                                 2

The Interface of Material Creation. Because the description of the sprite was displayed in the drawing space, the system was able to prompt the participants to write about the behavior of the sprites. The students who wrote about sprite behavior answered in Q1–4 of the questionnaire “It is easy to summarize” or “It will be easier for presentation.” This suggests that the functions that prompt users to write about sprite behavior help them organize the content of the presentation. Many participants answered “It was difficult to create” in Q1–1. One reason is that the operability of the web application was low. However, the fact that students who answered “It was successful” for Q1–2 also answered “It was difficult” for Q1–1 implies that all the participants were unfamiliar with the interface.

The Presentation Implementation Using This System. In Q2–1 of the questionnaire, many participants answered that “it was successful.” During the presentations, we observed that students spoke spontaneously. In the free-description answers to Q2–3, some students wrote “I could say what I want to say” or “I could speak smoothly.” We consider that being able to view the material during the presentation supplements the presentation. The questionnaire results for Q2–1 and Q2–2 support the same conclusion. On the other hand, two students answered “it was not very useful” in Q2–2. One of them did not use the system because he asked to change the project for the presentation after the sprite images from the completed project had been transferred to the server. The other student did not use the system because the application did not work. We would like to address these points in the future by making it possible to transfer sprite images directly from the client terminal to the server and by improving the stability of the web application.
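To make the proportions behind this discussion easier to see, the response counts in Tables 4 and 5 can be tallied with a few lines of Python. This is only a reading aid, not part of the authors' analysis: the counts are copied from the tables, while the function name and the grouping into "top two" and "bottom two" answer levels are choices made for this sketch.

```python
# Response counts copied from Tables 4 and 5, ordered from answer level 1 to level 5.
RESPONSES = {
    "Q1-1": [3, 2, 5, 5, 0],
    "Q1-2": [3, 7, 5, 0, 0],
    "Q1-3": [6, 7, 0, 1, 1],
    "Q2-1": [4, 7, 4, 0, 0],
    "Q2-2": [6, 4, 3, 0, 2],
}


def summarize(counts):
    """Return (total, share in levels 1-2, share in levels 4-5) for one question."""
    total = sum(counts)
    top_two = sum(counts[:2]) / total
    bottom_two = sum(counts[3:]) / total
    return total, top_two, bottom_two


if __name__ == "__main__":
    for question, counts in RESPONSES.items():
        total, top_two, bottom_two = summarize(counts)
        print(f"{question}: n={total}, levels 1-2: {top_two:.0%}, levels 4-5: {bottom_two:.0%}")
```

Each question has 15 responses, which matches the 9 and 6 participants listed in Table 1. For Q1–1, the answers in levels 1 and 2 and those in levels 4 and 5 each account for 5 of the 15 responses.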

5 Conclusion

In this paper, we proposed a presentation support system for programming workshops. Applying the system in workshops confirmed that browsing the material during the presentation can prompt elementary school students to speak spontaneously. The evaluation experiments also revealed that some students could not operate the system well when preparing the material and making a presentation. Future work should investigate whether this is because the participants were not familiar with the system or because the workload of operating the interface is high.

Acknowledgements. This work was supported by JSPS KAKENHI Grant Number 16H02925.

References

1. deck.js Modern HTML Presentations. http://imakewebthings.com/deck.js/
2. Kurihara, K., Igarashi, T., Ito, K.: A pen-based presentation tool with a unified interface for preparing and presenting and its application to education field. Comput. Softw. 23(4), 14–25 (2006). (in Japanese)
3. Ministry of Education, Culture, Sports, Science and Technology: The Vision for ICT in Education: Toward the Creation of a Learning System and Schools Suitable for the 21st Century (2011). http://www.mext.go.jp/component/a_menu/education/micro_detail/__icsFiles/afieldfile/2017/06/26/1305484_14_1.pdf
4. Morimoto, T., Takada, H.: Promoting creative thinking with a sharing and reflecting system for creative activity in the classroom. SIG Technical reports of IPSJ (GN) 2013(2), 1–7, January 2013. https://ci.nii.ac.jp/naid/110009509272/en/
5. Papert, S.: Mindstorms: Children, Computers, and Powerful Ideas. Basic Books, Inc., New York (1980)
6. Resnick, M.: Lifelong Kindergarten: Cultivating Creativity Through Projects, Passion, Peers, and Play. MIT Press, Cambridge (2017)
7. Resnick, M., et al.: Scratch: programming for all. Commun. ACM 52(11), 60–67 (2009)

Gamifying the Teaching and Learning Process in an Advanced Computer Programming Course

Mamoun I. Nawahdah

Birzeit University, Birzeit, Palestine
[email protected]

Abstract. These days, the conventional ways of teaching programming are not attractive to students. For instance, classical lectures and tutorial classes are not sufficient and provide only a one-way learning environment. Most students nowadays prefer a learning environment that is more engaging, fun, competitive, and collaborative and that offers instant feedback. These elements can be achieved by using gamification to increase students’ interest in computer programming courses. More specifically, we used three gamification techniques in one advanced programming course: the pair-programming teaching technique to maintain collaboration between students, the Kahoot! system to provide interactive quizzes and instant feedback, and the Robocode platform to teach Object-Oriented programming concepts in a fun and competitive fashion. We believe that gamifying the teaching and learning process has great potential to assist teachers and engage students in a new and challenging way. This paper presents an empirical study carried out in one advanced computer programming course where these gamification techniques were applied. A subjective system evaluation revealed that the students appreciated the techniques used. The results also revealed that the students’ interest in computer programming was enhanced.

Keywords: Gamification · Collaboration · Teaching/learning methodologies · Computing education · Object-Oriented programming

1 Introduction

Programming courses have a reputation for low averages, with failure rates varying between 30% and 50% worldwide [1]. In another study, carried out at the University of the West Indies, the average failure rate was 20% in the period from 2004 to 2009 [2]. In addition, students who fail to grasp the fundamental concepts of programming in the first introductory courses are often unable to recover and catch up, and end up dropping out of Computer Science programs [2, 3]. The same has been observed among Computer Science (CS) and Computer System Engineering (CSE) students at Birzeit University (BZU), where the failure rates during the two semesters preceding our experiment were 29% in the first semester of 2016–2017 and 42% in the second semester of the same year.

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 89–95, 2018. https://doi.org/10.1007/978-3-319-98743-9_7


Literature shows that the application of gamification as a teaching technique has promising results for improving computer science education, if applied correctly [4–6]. Gamification is the use of game-based mechanics, aesthetics, and game thinking to engage people, motivate action, promote learning, and solve problems [7, 8]. On the one hand, game-based mechanics include levels, badges, point systems, scores, and time constraints. Game thinking, on the other hand, is the idea of taking an everyday experience (e.g. learning, teaching) and converting it into an activity that has elements of competition, cooperation, exploration, and storytelling. In this research, we used three gamification techniques to enhance the students’ learning experience in one advanced programming course during the academic year 2016–2017: pair-programming for student collaboration, Kahoot! for interactive quizzes and instant feedback, and Robocode to teach Object-Oriented programming concepts in a fun and competitive way. The students agreed that the gamification techniques enhanced their learning experience and helped them learn more Object-Oriented programming concepts.

2 Background

In universities, students are usually pushed to do their work and assignments individually, and collaboration is most often treated as attempted cheating. Literature shows that the application of pair-programming as a teaching technique has promising results for improving the teaching of computer programming courses [9]. In pair-programming, two students sit next to each other at one computer, looking at the same screen. They use one keyboard and one mouse to operate the computer and type their code, and they collaborate to solve a programming problem. The students work together, taking turns to type, and continuously discuss, improve, revise, and debug their code [10].

When used as a teaching technique, pair-programming has several advantages. Students’ performance has been shown to improve when using pair-programming [11], even when students were required to program individually for their final exam [12]. Working in pairs promotes peer tutoring. Research shows that students may be more receptive to information that comes from their peers than to information that comes from an instructor [13]. In addition, students produced better-quality programs than when working on their own [14]. Exchanging ideas allows students to learn from each other’s experiences and to brainstorm before working, and therefore to come up with the most efficient techniques for solving a given problem [15]. Another advantage noticed when surveying students who studied in labs that applied pair-programming is course enjoyment. Students reported that they enjoy working in pairs more than working on their own [11]. This might be due to the social aspect of pair-programming, or to the fact that they are producing good code and solving problems with a little less effort. Similarly, pair-programming reduces frustration when programming, since students are less likely to stay stuck on a problem for long [9].
There are many benefits to introducing quizzes into a lecture: the teacher can assess the students’ knowledge, students get to think about what they have learned, students get feedback on their understanding, and the quiz breaks up lengthy teacher lecturing [16]. For this purpose, we used the Kahoot! system as a classroom quiz game. Kahoot! is a fun, interactive online quiz game with multiple-choice answers. Every student needs his or her own device to participate. On the one hand, quiz games can be valuable for formative assessment and can help students review before a test. On the other hand, teachers get a real-time view of the quiz results. Kahoot! shows the total number of questions that have been answered correctly and incorrectly, and it displays real-time progress bars for each player. At a glance, the teacher can see how many questions a player has answered correctly, answered incorrectly, and has left to answer.

Robocode is one of many applications classed as programming games, which are characterized by an arena in which automatons compete for victory without user input. The student must program an agent effectively in order to be successful. Robocode is written entirely in Java, and students must create a class that inherits from a provided Robot class, overriding methods to provide a strategy and to handle a number of defined events, such as colliding with another tank or successfully hitting another tank. Literature shows many benefits of using computer games in computer programming education. For instance, game-based assignments have been shown to be more enjoyable for students than traditional ones [17]. A field study conducted by Long on the effectiveness of Robocode revealed that about 80% of the participants improved their programming skills after playing Robocode [18]. The study also found that Robocode was enjoyable not only for novice students but also for experienced students.
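Robocode itself is a Java framework, so the following is only a language-neutral sketch, written in Python, of the subclass-and-override pattern described above. The base class, the handler names, and the dispatcher are hypothetical stand-ins and are not the Robocode API. The sketch shows a base class with empty event handlers, a student-written subclass that overrides them, and a minimal arena loop that delivers events.

```python
class Robot:
    """Hypothetical framework base class; its handlers do nothing by default."""

    def __init__(self):
        self.actions = []  # record of what the robot decided to do

    def on_hit_robot(self, other_name):
        """Called when this robot collides with another robot."""

    def on_bullet_hit(self, victim_name):
        """Called when one of this robot's bullets hits another robot."""


class CautiousRobot(Robot):
    """Student-written strategy: override only the events that matter."""

    def on_hit_robot(self, other_name):
        self.actions.append(f"back away from {other_name}")

    def on_bullet_hit(self, victim_name):
        self.actions.append(f"fire again at {victim_name}")


def dispatch(robot, events):
    """Minimal stand-in for the arena: deliver each event to its handler."""
    handlers = {
        "hit_robot": robot.on_hit_robot,
        "bullet_hit": robot.on_bullet_hit,
    }
    for kind, name in events:
        handlers[kind](name)
```

Because the arena only calls the handlers declared on the base class, the same event sequence can be replayed against different student subclasses, which is what makes a competition between strategies possible.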

3 Research Methodology

In an attempt to improve the students’ confidence, their pass rate, and their enjoyment, we applied three gamification methods in the teaching of an advanced programming course. This course is offered to sophomore-level students, covers the basics of Object-Oriented programming, and is taught in the Java language. The course includes two one-hour lectures and one three-hour lab per week, and spans a 15-week semester.

The students were asked to work in pairs during the lab only, as seen in Fig. 1. They were presented with the programming problems in their lab workbook and asked to solve them while working on one computer. Students were instructed to switch roles constantly, which usually happened in between exercises. Students were also encouraged to discuss the problem before starting to solve it, and to avoid asking the instructor or the TA for help unless they both failed to reach a solution.

Every other week, the students were asked to participate in an interactive quiz through Kahoot!. Figure 2 shows a screenshot from one of the quizzes. These quizzes were delivered at the beginning of the lab sessions using the lab’s computers and students’ mobile phones. Each quiz contained 10 questions and lasted for 15 min on average.

During the last month of the semester, the students were introduced to the Robocode application so that they could practice Object-Oriented concepts. Two worksheets were carefully prepared and given to the students over two weeks. The first worksheet aimed to introduce the Robocode platform and to let the students create a dummy robot and fight against other robots. Figure 3 shows a screenshot from one Robocode fight created by the students. The second worksheet aimed to let the students practice Object-Oriented concepts by creating and controlling a robot according to a predetermined strategy. At the end of the semester, a fight contest between the robots developed by the students was held. This contest was supported by a local computer programming company, and cash prizes were given to the winners.

The data collected throughout the semester took the form of a questionnaire and observations made by the instructor and TA during the labs.

Fig. 1. Two students practicing pair-programming in the lab.

Fig. 2. Sample Kahoot! Quiz.


Fig. 3. Sample Robocode fight.

4 Results

The questionnaire was designed to measure how much the students enjoyed working in pairs and how useful they felt the new teaching methods were. The statements relating to the new techniques were:

• (Q1) I learned new programming concepts and methods from my partner.
• (Q2) Working with a partner helped me understand some of the concepts that were not clear during the lecture.
• (Q3) Working with a partner made the programming process more enjoyable.
• (Q4) Using Kahoot! is a fun way to perform quizzes.
• (Q5) Kahoot! provided me with instant feedback regarding my performance in the quizzes.
• (Q6) Using game techniques is a good way to learn programming.
• (Q7) Programming my own robot encouraged me to learn OOP.

Most of the answers to these statements were favourable towards the new techniques, as shown in Fig. 4. Although most of the students were of the same academic level as their partners, 92.3% of them thought that they learnt new concepts from their partner. In addition, 84.6% of all students thought that pair-programming helped them better understand concepts that were unclear or confusing to them during the lectures. Pair-programming also made programming more fun for 92.4% of the students. Regarding Kahoot!, the majority of students (92.3%) found Kahoot! fun to use, and many of them (69.7%) agreed that Kahoot! provides the required instant feedback. Using Robocode to practice OOP by programming robots to fight against other students’ robots was evaluated positively as well: 76.9% of students saw it as a fun and encouraging way to practice OOP.


Fig. 4. Students’ subjective evaluation.

5 Conclusion

The work described in this paper was concerned with the implementation of gamification as a teaching technique in an advanced programming course during the academic year 2016–2017. As the results revealed, gamification methods can be useful learning and teaching tools for Object-Oriented programming courses. However, it is important to bear in mind that play is one of the earliest and most effective ways in which learners acquire whatever knowledge they require.

References

1. Bennedsen, J., Caspersen, M.: Failure rates in introductory programming. ACM SIGCSE Bull. 39(2), 32–36 (2007)
2. Depradine, C.: Using gaming to improve advanced programming skills. Carib. Teach. Sch. 1(2), 93–113 (2012)
3. Wood, K., Parsons, D., Gasson, J., Haden, P.: It is never too early: pair programming in CS1. In: Proceedings of the Fifteenth Australasian Computing Education Conference, vol. 136, pp. 13–21 (2013)
4. Barata, G., Gama, S., Jorge, J., Gonçalves, D.: Improving participation and learning with gamification. In: Proceedings of the First International Conference on Gameful Design, Research, and Applications, pp. 10–17. ACM (2013)
5. Morrison, B., DiSalvo, B.: Khan academy gamifies computer science. In: Proceedings of the 45th ACM Technical Symposium on Computer Science Education, pp. 39–44. ACM (2014)
6. Preist, C., Jones, R.: The use of games as extrinsic motivation in education. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 3735–3738. ACM (2015)
7. Huotari, K., Hamari, J.: Defining gamification: a service marketing perspective. In: Proceeding of the 16th International Academic MindTrek Conference, pp. 17–22. ACM (2012)
8. Deterding, S., Dixon, D., Khaled, R., Nacke, L.: From game design elements to gamefulness: defining “gamification”. In: Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments (MindTrek 2011), pp. 9–15. ACM (2011)
9. Teague, M.: Pedagogy of Introductory Computer Programming: A People-First Approach. Queensland University of Technology, Brisbane (2011)
10. Begel, A., Nagappan, N.: Pair programming: what’s in it for me? In: Proceedings of the Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 120–128. ACM (2008)
11. Salleh, N., Mendes, E., Grundy, J.: Empirical studies of pair programming for CS/SE teaching in higher education: a systematic literature review. Softw. Eng. IEEE Trans. 37(4), 509–525 (2011)
12. McDowell, C., Werner, L., Bullock, H., Fernald, J.: The effects of pair-programming on performance in an introductory programming course. SIGCSE Bull. 34(1), 38–42 (2002)
13. Khan, S., Ray, L., Smith, A., Kongmunvattana, A.: A pair programming trial in the CS1 lab. In: Proceedings of the Annual International Conference on Computer Science Education: Innovation and Technology (CSEIT) (2010)
14. He, X., Chen, Y.: Analyzing the efficiency of pair programming in education (2015)
15. Porter, L., Guzdial, M., McDowell, C., Simon, B.: Success in introductory programming: what works? ACM 56(8), 34–36 (2013)
16. Roediger III, H.L., Putnam, A.L., Smith, M.A.: Chapter one - Ten benefits of testing and their applications to educational practice. In: Mestre, J.P., Ross, B.H. (eds.) Psychology of Learning and Motivation, vol. 55, pp. 1–36. Academic Press, Cambridge (2011)
17. Venkatesh, V.: Creation of favorable user perceptions: exploring the role of intrinsic motivation. MIS Q. 23(2), 239–261 (1999)
18. Long, J.: Just for fun: using programming games in software programming training and education - a field study of IBM Robocode community. JITE 6, 279–290 (2007)

Designing a System of Generating Sound Environment for Promoting Verbal Communication in Classroom

Riri Sekine, Yasutaka Asai, and Hironori Egi(✉)

Graduate School of Informatics and Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo 182-8585, Japan
[email protected], {y.asai,hiro.egi}@uec.ac.jp

Abstract. In this study, we developed a system that generates a sound environment that encourages verbal communication among students in a classroom. A lively classroom is essential for student communication. When a classroom is quiet, students may be reluctant to ask questions. Here, the goal is to lower the utterance threshold; thus, we designed a system that plays conversational noise when quietness is detected in a classroom. The playback device comprises a directional microphone, a speaker, and a single-board computer. We introduced the system to an actual classroom and found that changes in sound pressure level were dependent on the location in the classroom and the phases of the lecture. The sound pressure level was investigated and compared relative to subjective student evaluations.

Keywords: Sound environment · Communication supporting system · Conversational noise · Promoting utterance · Classroom

1 Introduction

To activate an environment in which students discover and solve problems autonomously, i.e., active learning, educational institutions must adopt methods other than lectures to deliver knowledge. Bonwell et al. [1] defined active learning as anything that involves students in doing things and thinking about the things they are doing. During lectures in face-to-face environments, students can interact with other students directly. According to Prince [2], active student engagement contributes to successful active learning. However, there are barriers that prevent students from speaking actively, such as silent conditions in the classroom; thus, despite being in a situation where they are encouraged to speak freely, students may be reluctant to talk to each other due to the sound environment of the classroom. In this study, the sound environment is defined as a set of acoustic stimuli that vary depending on the situation.

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 96–103, 2018. https://doi.org/10.1007/978-3-319-98743-9_8


To address this problem, we intend to lower the utterance threshold and promote verbal communication among students by playing conversational noise when the classroom is quiet. We have designed a system that detects quietness in a classroom. Note that the conversational noise was recorded in classrooms. The system plays conversational noise to maintain an active sound environment in the classroom.

2 Related Work

Tsujimura et al. [3] employed electroencephalography to measure the brain activity of subjects performing mental arithmetic or Chinese character memorization tasks. They performed an experiment in a sound environment that involved news broadcasts and white noise to investigate the influence of noise on brain function. It was found that meaningful external noise, i.e., the verbal information in news broadcasts, had a negative influence on task results.

Toplyn et al. [4] conducted a study in which subjects performed creative tasks, such as picture arrangement and block design, under white noise conditions at 60, 80, and 100 dB. Prior to conducting the experiment, the subjects were divided into high and low scoring groups based on a creativity test without noise. It was found that subjects with high creativity scores were more resistant to noise stress. In the current study, we consider that a moderate noise level may encourage creativity.

Mehta et al. [5] evaluated the effects of three sound pressure levels (50, 70, and 85 dB; low, medium, and high levels) using noise recorded in a real environment. The noise was edited to switch among three types of noise, i.e., multi-person conversation, road traffic, and construction noise. The experiment was conducted under conditions similar to a roadside restaurant during the daytime. Five experiments were performed to measure subject creativity. The tasks included finding common meaning from multiple words and answering the Remote Associates Test (RAT) [6]. Based on the number of ideas generated, the results indicated that a medium-level sound (70 dB) enhanced creativity.

3 Sound Environment to Promote Verbal Communication

We considered the following factors: the effective areas into which the sound environment is divided, the sound pressure level, and the type of noise.

3.1 Effective Areas Divided of Sound Environment

The sound environment of a classroom is heterogeneous. Some areas in the environment will be livelier depending on the amount of student utterances. Thus, it is necessary to determine whether student utterances occur primarily in a specific area of the classroom. Note that the appropriate size of such an effective area in the sound environment is not clear. In this study, the first hypothesis is that the sound environment differs depending on the given area in the classroom.

3.2 Sound Pressure Level

Previous studies [4,5] suggested that a moderate sound level (70 or 80 dB) enhances creativity. Thus, our second hypothesis is that a moderate level also promotes verbal communication. To evaluate this hypothesis, an experiment was conducted to compare the sound pressure level to subjective student evaluations of liveliness.

3.3 Types of Noise

In this study, we used conversational classroom noise recorded during exercises. Conversational noise is considered meaningful external noise if its content can be understood. On the other hand, if several students speak simultaneously, the conversational noise is considered meaningless external noise. A previous study [3] found that meaningful external noise has a negative influence on task results. However, a different study [5] that used meaningless external noise, i.e., multi-person conversational noise, suggested that such noise enhanced creativity. Compared with white or pink noise, multi-person conversational noise is considered a more natural sound environment for encouraging students to speak during exercises.

4 System Design

4.1 System Component

The proposed system senses the sound environment of the classroom and maintains an appropriate volume to prompt student utterances. The system consists of recording devices comprising a directional microphone (Headset MMHSUSB13BKN, Sanwa Supply) and a single-board computer and playback devices comprising a speaker and a single-board computer. The recording devices and playback devices are controlled by a server. Note that the single-board computer used for the recording and playback devices is a Raspberry Pi 3B powered by a mobile battery (3000 mAh).

4.2 Procedure

An outline of the proposed system is shown in Fig. 1. Here, classroom sounds are measured constantly using the directional microphones. The recording devices calculate the sound pressure level at each location in the classroom. The calculated sound pressure level is sent to the control server, which determines whether the average sound pressure level per unit time is less than a lower limit value or greater than a higher limit value. The playback devices start or stop playing conversational noise when the devices receive a command from the control server. The control server sends a play command if the playback devices are not currently playing conversational noise and the average sound pressure level per unit time is less than the lower limit value. The control server sends a stop command if the playback devices are currently playing and the average sound pressure level per unit time is greater than the higher limit value. In this manner, the proposed system prompts student utterances during classroom exercises.

4.3 Type of Sound

The volume of the input captured by the microphones is not necessarily due only to student conversations. During exercises, the volume level can include work sounds, e.g., operational sounds of equipment and noise that occurs due to the movement of materials. Note that work sounds indicate that the students are engaged in learning activities. Therefore, work sounds and student utterances are not differentiated.

Fig. 1. Outline of the proposed system
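The play and stop rule described in Sect. 4.2 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the class name `NoiseController`, the two limit values, and the window length used as the "unit time" are hypothetical, since the paper does not report them, and a single stream of readings stands in for the per-location measurements sent by the recording devices.

```python
from collections import deque
from statistics import mean


class NoiseController:
    """Hypothetical two-threshold controller for conversational-noise playback."""

    def __init__(self, lower_db, upper_db, window):
        if lower_db >= upper_db:
            raise ValueError("lower_db must be below upper_db")
        self.lower_db = lower_db              # assumed lower limit value
        self.upper_db = upper_db              # assumed higher limit value
        self.levels = deque(maxlen=window)    # readings forming one unit of time
        self.playing = False

    def update(self, level_db):
        """Add one sound pressure reading; return 'play', 'stop', or None."""
        self.levels.append(level_db)
        if len(self.levels) < self.levels.maxlen:
            return None  # not enough readings for one unit of time yet
        average = mean(self.levels)
        if not self.playing and average < self.lower_db:
            self.playing = True
            return "play"
        if self.playing and average > self.upper_db:
            self.playing = False
            return "stop"
        return None


if __name__ == "__main__":
    controller = NoiseController(lower_db=45.0, upper_db=55.0, window=3)
    for reading in [40, 41, 42, 50, 60, 70, 70]:
        print(reading, controller.update(reading))
```

Using two separate limits rather than one means that playback does not toggle rapidly when the average level hovers near a single threshold.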

4.4 Sound Pressure Level

The sound volume acquired by the recording device is converted to sound pressure level L as follows.

$$L = 20 \log_{10} x + 11 \tag{1}$$

Here, x is the average value of one chunk of the acquired sound volume, and the second term is a fixed value determined by a preliminary measurement with the recording device. Here, L is calculated and compared to the sound pressure level measured using a commercial sound level meter (SL4023SD, Mother Tool). The recording device is shown in Fig. 2.
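As a worked illustration of Eq. (1), the following Python sketch converts one chunk of samples to a sound pressure level. It assumes that the "average value" x is the mean absolute amplitude of the chunk, which the paper does not specify; the function and constant names are hypothetical.

```python
import math

# Fixed second term of Eq. (1), obtained in the paper by a preliminary measurement.
CALIBRATION_OFFSET_DB = 11.0


def sound_pressure_level(chunk):
    """Return L = 20 * log10(x) + 11 for one chunk of audio samples.

    Assumption: x is the mean absolute amplitude of the chunk.
    """
    if not chunk:
        raise ValueError("chunk must contain at least one sample")
    x = sum(abs(sample) for sample in chunk) / len(chunk)
    if x <= 0:
        raise ValueError("a silent chunk has no finite level")
    return 20.0 * math.log10(x) + CALIBRATION_OFFSET_DB


if __name__ == "__main__":
    # A chunk with mean absolute amplitude 100 gives 20 * 2 + 11 = 51 dB.
    print(sound_pressure_level([100, -100, 100, -100]))
```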

5 Experiment

5.1 Experiment 1: Positional Differences in Sound Environment

An experiment was conducted to determine if the sound environment differs depending on a given position in the classroom, i.e., our first hypothesis. Here, the target environment was a 90-min programming exercise for beginners at a science and engineering university. In this experiment, the sound pressure level in a computer room was calculated using 12 recording devices. The recording devices were numbered and positioned as shown in Fig. 3.

Fig. 2. Recording device installed in the classroom

Fig. 3. Arrangement of recording devices

Note that sound pressure level fluctuations at each position during the exercise were visualized.

5.2 Experiment 2: Subjective Evaluation of Sound Environment

A second experiment was conducted to compare subjective student evaluations of liveliness to the sound pressure level in the classroom, i.e., to evaluate our second hypothesis. Here, the target was the same type of exercise as in experiment 1. The subjects evaluated the sound environment in the classroom once every five minutes throughout the exercise. The subjects ranked the current liveliness of the classroom on a five-point scale (1: lively; 2: rather lively; 3: neutral; 4: rather quiet; 5: quiet). During the experiment, the sound pressure level around the subjects was calculated using the proposed system’s recording devices. To mimic a typical classroom situation, the subjects performed the tasks freely, except for the subjective evaluation.

6 Results and Discussion

6.1 Experiment 1

The sound pressure levels (dB) measured at each position during the exercise are shown as a heat map in Fig. 4. In Fig. 4, the vertical axis is the recording device number (Fig. 3). Note that the teacher’s explanation, i.e., the blank part of Fig. 4, was not considered. As shown in Fig. 4, the sound pressure level fluctuated relative to the position in the classroom and time. Since the sound environment in a classroom is not uniform, it is necessary to control when conversational noise is played. It was also found that the sound pressure level might fluctuate across multiple recording devices synchronously.

Fig. 4. Experiment 1: sound pressure level transition during exercise

6.2 Experiment 2

The relationship between the subjective evaluations and sound pressure level (dB) in the classroom for each subject (A to M) was examined. Here, the sound pressure level was the average value over 10 seconds beginning when the subject performed a subjective evaluation. The correlation coefficients between subjective evaluation and sound pressure level during the exercise for each subject and the p values are shown in Table 1. Note that the teacher’s explanation time is not included in Table 1.

Table 1. Experiment 2: relationship between sound pressure level and subjective evaluation

Day   Subject   Correlation coefficient   p value
1st   A         0.24                      0.57
1st   B         0.25                      0.56
2nd   C         0.50                      0.31
2nd   D         0.16                      0.76
2nd   E         0.49                      0.33
2nd   F         0.54                      0.27
3rd   G         0.73                      0.026
3rd   H         0.069                     0.86
3rd   I         0.60                      0.088
3rd   J         0.77                      0.014
4th   K         0.61                      0.11
4th   L         0.33                      0.42
4th   M         0.56                      0.15

As can be seen, the correlation coefficients between the subjective evaluations and sound pressure level, and the p values, differed for each subject. There are cases where the students were quiet or active during the teacher’s explanation, and it is possible that the subjective evaluations differed in such cases. In addition, the sound pressure level varied over a narrow range (40 to 52 dB), and the criteria for judging sound pressure levels differed for each subject.
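The per-subject analysis described above can be sketched in Python as follows. This is a minimal sketch, not the authors' analysis code: it assumes readings are available as (time in seconds, level in dB) pairs, uses the Pearson correlation coefficient (the paper does not name the coefficient used), and all function names and sample values are hypothetical. The p values are not computed here; a library routine such as `scipy.stats.pearsonr` could supply them.

```python
import math


def mean_level_after(samples, start_s, duration_s=10.0):
    """Average the levels recorded in [start_s, start_s + duration_s).

    samples: iterable of (time_in_seconds, level_in_db) pairs.
    """
    window = [db for t, db in samples if start_s <= t < start_s + duration_s]
    if not window:
        raise ValueError("no readings in the requested window")
    return sum(window) / len(window)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical data for one subject: ratings every 300 s and nearby readings.
    readings = [(0, 41.0), (5, 43.0), (300, 47.0), (305, 49.0), (600, 44.0)]
    ratings = [4, 2, 3]
    levels = [mean_level_after(readings, t) for t in (0, 300, 600)]
    print(levels, pearson_r(ratings, levels))
```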

7 Conclusion and Future Work

In this study, our goal was to promote discussions between students by developing a system that generates a sound environment that encourages verbal communication. We designed a system that plays conversational noise when it detects quietness in a classroom. We introduced the system to an actual classroom and found that changes in sound pressure level vary depending on the position in the classroom and the current phase of the lecture. We found that the relationship between the subjective student evaluations and sound pressure level differed for each subject. In the future, we intend to increase the amount of data used in the two experiments to further investigate whether communication among students can be promoted by playing conversational noise during exercises.


References

1. Bonwell, C.C., Eison, J.A.: Active learning: creating excitement in the classroom. ASHE-ERIC Higher Education Reports (1991)
2. Prince, M.: Does active learning work? A review of the research. Br. J. Educ. Technol. 93(3), 223–231 (2004)
3. Tsujimura, S., Akita, T.: Psychophysiological experiments on extent of disturbance of noises under conditions of different types of brain works. In: Siano, D. (ed.) Noise Control, Reduction and Cancellation Solutions in Engineering, chap. 7. InTech, Rijeka (2012)
4. Toplyn, G., Maguire, W.: The differential effect of noise on creative task performance. Creat. Res. J. 4(4), 337–347 (1991)
5. Mehta, R., Zhu, R., Cheema, A.: Is noise always bad? Exploring the effects of ambient noise on creative cognition. J. Consum. Res. 39(4), 784–799 (2012)
6. Mednick, S.A.: The associative basis of the creative process. Psychol. Rev. 69(3), 220–232 (1962)

“Discuss and Behave Collaboratively!” – Full-Body Interactive Learning Support System Within a Museum to Elicit Collaboration with Children

Mikihiro Tokuoka1(✉), Hiroshi Mizoguchi1, Ryohei Egusa2, Shigenori Inagaki3, Fusako Kusunoki4, and Masanori Sugimoto5

1 Tokyo University of Science, 2641, Yamazaki, Noda, Chiba, Japan
[email protected], [email protected]
2 Meiji Gakuin University, 1-2-37, Shirokanedai, Minato-ku, Tokyo, Japan
[email protected]
3 Kobe University, 3-11, Tsurukabuto, Nada, Kobe, Hyogo, Japan
[email protected]
4 Tama Art University, 2-1723, Yarimizu, Hachioji, Tokyo, Japan
[email protected]
5 Hokkaido University, Kita 15, Nishi 8, Kita-ku, Sapporo, Hokkaido, Japan
[email protected]

Abstract. For children, museums are an important place to acquire scientific knowledge through experience and conversation. However, the main learning method in museums is passive: observing exhibits and reading explanations on text panels. Few opportunities exist to discuss the experience and engage in conversation. Therefore, it is difficult for young children to learn sufficiently and efficiently. We developed a collaborative immersive learning support system for a museum that enables children to learn through body movements and conversation. Children can learn by thinking hard when moving with multiple people. We developed content that can be manipulated by the body movements of multiple people. For example, people can cooperate to observe a fossil projected on the screen surrounded by other exhibits and answer quizzes. We expect that this system can help children efficiently gain knowledge of fossils and enhance cooperation. In this paper, we describe the results of an experimental evaluation conducted on a prototype at the Museum of Nature and Human Activities in Hyogo, Japan.

Keywords: Kinect sensor · Learning support system · Body movement

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 104–111, 2018. https://doi.org/10.1007/978-3-319-98743-9_9

1 Introduction

Museums are an important place for children to acquire scientific knowledge [1]. One of the most important reasons is that children can apply their own level of understanding without worrying about evaluation. In addition, they can tell whether their understanding is correct through experience and conversation with multiple people, which improves their motivation [2]. However, the primary learning method in museums is passive: observing exhibits and reading explanations. There are no opportunities for experience or conversation. Therefore, it is difficult for young children to learn enough and engage in acquiring knowledge. In recent years, the difficulty of learning about radiolarians, which are highly important for paleontology, especially in museums, has been considered a problem [3]. These problems should be solved in order to improve the quality of science education in museums. In addition, radiolarians are more difficult to learn about because they are observed with a microscope, which can require independent learning. This is a problem for all fossil observations.

Various methods have been proposed to address these issues. Group activities using mobile terminals are regarded as important in the field of computer-supported cooperative work (CSCW) because they can strengthen interactions with museum exhibits and explanations [4]. For example, Papadimitriou et al. proposed providing two children with a personal digital assistant (PDA) equipped with a radiofrequency identifier (RFID) reader so that they can learn while searching exhibits and explanations [5]. Grinter et al. proposed a voice guidance system so that children can acquire information in pairs [6]. However, although these studies allow information to be obtained cooperatively from exhibitions and explanations, actual observation with a microscope is performed by one person. Moreover, the proposed means of learning support do not address children’s need for real experience. Consequently, the fundamental problem of cooperative observation of fossils has not been solved.

Children should receive opportunities to cooperate and encounter near-real experiences. Sociality, cooperation, and knowledge clearly cannot be obtained without actual experience gained from interaction with others to achieve some goal. In particular, play is a good opportunity for children to gain such experience.
Studies have revealed that children develop a deep understanding when cooperating and playing [7]. When children use body movements while cooperating, the learning environment becomes more natural [8], and they can retain more of the knowledge being taught [9]. We developed a system based on these ideas in order to solve the above problems. We developed a collaborative immersive learning support system that allows multiple people to learn through body movements and conversation while collaborating. Multiple people can cooperate to observe radiolarians, which leaves a stronger impression and allows them to acquire knowledge efficiently. For example, the system can be used to actively learn about radiolarians by having multiple learners move their bodies and talk to each other. The motion information of multiple people is simultaneously acquired via a sensor, and the contents are manipulated based on the information. Multiple screens are spread across the entire field of view, and learners can touch the virtual environment, which should leave a strong impression. In this paper, we describe a prototype developed to evaluate the usefulness of the current system as a first step towards realizing cooperative immersive learning support. Evaluation experiments were carried out at the Museum of Nature and Human Activities in Hyogo, Japan. The social presence was evaluated with a questionnaire.


2 Cooperative Immersive Learning Support System

2.1 System

We are developing a collaborative immersive learning support system that allows multiple people to collaborate through conversation and body movements so that they learn more efficiently and with a stronger impression. Multiple fossils can be observed. This system supports learning about radiolarians [3]. Radiolarians are zooplankton; because their forms have changed over time, their geological era can be easily determined. Radiolarians are a very important learning material because they are index fossils. As shown in Fig. 1, radiolarians in each era have common features [3]. However, when they are observed with a microscope in a museum, multiple people cannot collaborate efficiently, despite the importance of the material. Furthermore, there is no system that allows multiple people to observe with the microscope while simultaneously improving cooperativeness. For these reasons, we focused on supporting the learning of radiolarians.

Our proposed system allows observation by multiple people; children can cooperate to find the characteristics of the radiolarians of each era, which supports efficient learning. Observation with multiple people realizes the following points: gaining new awareness from others’ awareness, applying knowledge to different problems, and notifying others of one’s own awareness. The system uses various sensors to measure the positions, attitudes, and body movements of people. Learners operate the system based on this information. The screen displays the object of learning and changes in conjunction with the learners’ body movements. This system gives learners a natural learning environment. In addition, real observations of cooperative behavior among multiple people are incorporated to provide learners with a more realistic experience than just viewing exhibits or videos.

Fig. 1. Radiolarian.

2.2 System Configuration

To realize a cooperative immersive learning support system, we are currently developing a system for learning about radiolarians where two people cooperate through conversation and body movements. Figure 2 shows this system, which consists of a Kinect sensor, control PC, projector projecting on the ground, and projector projecting forward. Microsoft’s Kinect sensor is a range image sensor. Although it is inexpensive, the sensor can record sophisticated measurements regarding the user’s location. The Kinect sensor can measure the location of human body parts such as hands and legs, and it can identify the learner’s pose or status with this function and location information. The system consists of three parts: collaborative observation, collaborative consideration, and collaborative solution.

Fig. 2. Setup of the system.

In collaborative observation, two learners cooperate to observe radiolarians and look for common features, as shown in Fig. 1. The learner on the right can select radiolarians to observe by pushing their hands toward the Kinect sensor. They can click on a radiolarian by pushing their palm toward the screen. The learner selects the radiolarians that they want to observe. Next, the radiolarians can be zoomed in or out by the two learners approaching or moving away from the screen. As shown in Fig. 3(a), when the two people approach at the same time, the fossils of the radiolarians are enlarged. When they move away at the same time, they are reduced. In this manner, the two learners communicate and decide on the radiolarians that they want to observe, and the learner on the right takes action. As the two learners move forward and backward, they can observe the radiolarians together. Normally, radiolarians are observed with a microscope by a single person, but this approach allows two people to observe while communicating. Because two people are observing, we expect each to find different features. Because there is a time limit, cooperation in a limited amount of time is necessary. In collaborative consideration, learners work on radiolarian quizzes based on the information gathered during collaborative observation. As shown in Fig. 3(b), radiolarians of a certain era are displayed on the screen. Learners consider which era this radiolarian belongs to. Based on the characteristics found by each learner during collaborative observation, they share opinions and think about which era the radiolarians belong to. In collaborative solution, learners answer quizzes while moving their bodies. When they answer quizzes, they choose the answer by standing. The quiz answer screen is projected on the ground. Because the standing position blinks (see Fig. 3(b)), the


answer can be selected by moving the whole body. If the two learners do not stand on the same correct answer, they cannot complete the quiz. Therefore, the two learners must share their opinions to answer quizzes. In this manner, we expect the two learners to exchange views and increase cooperativeness to efficiently learn about radiolarians.

Fig. 3. System flow.

3 Experiment

3.1 Evaluation Method

An experiment was performed to evaluate participants’ learning about radiolarians in terms of social presence in the gaming experience. The participants were 19 students from Kobe University Elementary School (13 fifth-graders and 6 sixth-graders; 10–12 years old; 11 boys and 8 girls). The experiment was performed at the Museum of Nature and Human Activities in Hyogo, Japan.


The following procedure was used. First, museum curators conducted a 30-min workshop for the participants. During the workshop, participants touched sedimentary rocks and charts and observed them with a magnifying glass. Next, the participants used the system to learn about radiolarians. Participants were divided into pairs, and each pair observed the group of radiolarians displayed on the screen. They were asked to find features found in every era and to guess the era in which each radiolarian lived. The quiz consisted of five questions in total.

A game experience questionnaire (GEQ) based on the social presence module [10] was prepared to investigate the social presence in each participant’s experience of the system. The social presence module of the GEQ is used to investigate the social gaming experience from three viewpoints: psychological involvement (empathy), psychological involvement (negative feelings), and behavioral involvement. In this evaluation experiment, we adjusted the social presence module of the GEQ to match our objective. In total, 16 items were evaluated.

There were six items related to psychological involvement (empathy): “I understood the feelings of my friends well,” “I often looked at my friends during the game,” “It was fun to work on games with my friends,” “When my friends seemed happy, I enjoyed myself too,” “I think that my friends were having fun when I was having fun,” and “I thought that my friend was amazing during the game.” There were four items related to psychological involvement (negative feelings): “The behavior of my friends influenced my feelings,” “I think that I influenced my friends’ feelings by working together,” “I got angry with my friend,” and “I was glad when my friend made a mistake.” There were six items related to behavioral involvement: “During the game, I acted according to my friend’s behavior,” “I think that my friend was acting according to my behavior during the game,” “I felt connected with my friend during the game,” “I think that my friends often looked at me during the game,” “I think that my behavior during the game influenced my friend’s behavior,” and “My friends’ behavior during the game affected my own behavior.” Each item was answered on a five-level Likert scale.

3.2 Evaluation Results

Table 1 presents the results of the questionnaire. We classified the responses as positive (“strongly agree” and “agree”), neutral (no strong opinion), or negative (“disagree” and “completely disagree”). We then analyzed the number of positive replies and neutral and negative replies by using the directly established calculation: 1 × 2 population rate inequality. The results for each viewpoint are shown below. For psychological involvement (empathy), all items elicited more positive responses than neutral/negative responses. In addition, there was a significant difference between the number of positive responses and number of neutral/negative responses. This indicates that empathy was induced by multiple people engaging in social game play with the developed game. For psychological involvement (negative feelings), more positive responses were elicited than neutral/negative responses for “The behavior of my friends influenced my feelings” and “I think that I influenced my friends’ feelings by working together.” This indicates that temporary emotional fluctuations (mood) influenced the participants in collaborative game play with the developed game. In addition, there was a significant


difference between the number of positive responses and number of neutral/negative responses. There was no significant difference between the number of positive responses and number of neutral/negative responses for the other items. These results indicate that no aggressive bad feelings towards the opponent occurred. For behavioral involvement, all items obtained more positive responses than neutral/negative responses. In addition, there was a significant difference between the number of positive responses and number of neutral/negative responses. This indicates positive behavioral involvement from social game play with the developed game.

Table 1. Questionnaire for system evaluation.

| Items | SA | A | N | D | SD |
|---|---|---|---|---|---|
| I understood the feelings of my friends well** | 8 | 9 | 1 | 0 | 0 |
| I often looked at my friends during the game** | 10 | 6 | 2 | 0 | 0 |
| It was fun to work on games with my friends** | 12 | 6 | 0 | 0 | 0 |
| When my friends seemed happy, I enjoyed myself too** | 8 | 8 | 2 | 0 | 0 |
| I think that my friends were having fun when I was having fun* | 9 | 7 | 2 | 0 | 0 |
| I thought that my friend was amazing during the game** | 7 | 9 | 2 | 0 | 0 |
| The behavior of my friends influenced my feelings** | 6 | 7 | 4 | 1 | 0 |
| I think that I influenced my friends’ feelings by working together** | 7 | 6 | 5 | 0 | 0 |
| I got angry with my friend (n.s.) | 0 | 0 | 1 | 5 | 12 |
| I was glad when my friend made a mistake (n.s.) | 0 | 0 | 0 | 2 | 16 |
| During the game, I acted according to my friend’s behavior** | 7 | 6 | 5 | 0 | 0 |
| I think that my friend was acting according to my behavior during the game (n.s.) | 6 | 2 | 10 | 0 | 0 |
| I felt connected with my friend during the game** | 6 | 7 | 3 | 2 | 0 |
| I think that my friends often looked at me during the game (n.s.) | 7 | 4 | 5 | 1 | 1 |
| I think that my behavior during the game influenced my friend’s behavior* | 7 | 9 | 4 | 1 | 0 |
| My friends’ behavior during the game affected my own behavior* | 6 | 6 | 1 | 2 | 0 |

N = 18. **p < 0.01, *p < 0.05, n.s.: not significant. SA: Strongly agree, A: Agree, N: No strong opinion, D: Disagree, SD: Strongly disagree.
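The comparison of positive replies against neutral/negative replies can be illustrated with a small sketch. It assumes that the direct calculation corresponds to a one-sided exact binomial test with a null proportion of 0.5; the paper does not spell out the procedure, so the function name `binomial_p_one_sided` and the choice of a one-sided test are illustrative assumptions and are not claimed to reproduce the significance marks in Table 1.

```python
from math import comb


def binomial_p_one_sided(positive: int, total: int) -> float:
    """Return P(X >= positive) for X ~ Binomial(total, 0.5).

    Assumption: positive replies are tested against an even split
    between positive and neutral/negative replies.
    """
    tail = sum(comb(total, k) for k in range(positive, total + 1))
    return tail / 2 ** total


# Counts from the first row of Table 1: SA = 8, A = 9, N = 1, D = 0, SD = 0.
positive = 8 + 9
others = 1 + 0 + 0
p_value = binomial_p_one_sided(positive, positive + others)
print(f"{positive} positive vs {others} neutral/negative: p = {p_value:.5f}")
```

Under this assumption, a row with nearly all replies positive yields a very small p-value, while a row split roughly evenly does not.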

4 Conclusion

As a first step towards implementing a cooperative immersion learning support system for children, we propose a system for children to learn about radiolarians that uses a Kinect sensor and encourages cooperation between two learners through conversation and body movements. We evaluated the proposed system through an experiment and questionnaire to determine whether children were able to learn cooperatively. The


results of our evaluation experiments clearly showed that we were able to elicit collaborative play. However, the relationship between proactive cooperation with the proposed system and promotion of knowledge understanding is still unknown. Future research will require exploring in detail the correlation between system experience and knowledge understanding and how to improve cooperation.

Acknowledgments. This work was supported in part by Grants-in-Aid for Scientific Research (A), Grant Number JP16H01814. The evaluation was supported by the Museum of Nature and Human Activities, Hyogo, Japan.

References

1. Falk, J.H., Dierking, L.D.: Museum Experience Revisited, 2nd edn. Left Coast Press, Walnut Creek (2012)
2. Haneyman, B.: The Future of Learning: An Emerging Role for Science Museums and Informal Learning Institutions, Museum Communication, pp. 28–33 (2007)
3. O’Dogherty, L., Carter, E.S., Dumitrica, P., Gorican, S., De Wever, P.: An illustrated and revised catalogue of Mesozoic radiolarian genera: objectives, concepts, and guide for users. Geodiversitas 31, 191–212 (2009)
4. Luff, P., Heath, C.: Mobility in collaboration. In: Proceedings of the ACM Conference on Computer Supported Collaborative Work (CSCW 1998), Seattle, WA, USA (1998)
5. Papadimitriou, I., Komis, V., Tselios, N., Avouris, N.M.: Designing PDA mediated educational activities for a museum visit. In: Proceedings of Cognition and Exploratory Learning in Digital Age (CELDA 2006), Barcelona, Spain (2006)
6. Grinter, R.E., Aoki, P.M., Hurst, A., Szymanski, M.H., Thornton, J.D., Woodruff, A.: Revisiting the visit: understanding how technology can shape the museum visit. In: Proceedings of the ACM Conference on Computer Supported Collaborative Work (CSCW 2002), New Orleans, LA, USA, pp. 146–155 (2002)
7. Dau, E., Jones, E.: Child’s Play: Revisiting Play in Early Childhood Settings. Brookes Publishing, Maple Press, Baltimore, Noida (1999)
8. Grandhi, S.A., Joue, G., Mittelberg, I.: Understanding naturalness and intuitiveness in gesture production: insights for touchless gestural interfaces. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2011), Vancouver, Canada, pp. 821–824 (2011)
9. Edge, D., Cheng, K.Y., Whitney, M.: SpatialEase: learning language through body motion. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2013), Paris, France, pp. 469–472 (2013)
10. IJsselsteijn, W.A., de Kort, Y.A.W., Poels, K.: The Game Experience Questionnaire. Technische Universiteit Eindhoven, Eindhoven (2013)

Entertainment System

Can Social Comments Contribute to Estimate Impression of Music Video Clips?

Shunki Tsuchiya1, Naoki Ono1, Satoshi Nakamura1, and Takehiro Yamamoto2

1 Meiji University, 4-21-1 Nakano, Nakano-ku, Tokyo, Japan
2 Kyoto University, Yoshida Hommachi, Sakyo-ku Kyoto-shi, Kyoto, Japan

Abstract. The main objective of this paper is to estimate the impressions of music video clips using social comments to achieve impression-based music video clip searches or recommendation systems. To accomplish the objective, we generated a dataset that consisted of music video clips with evaluation scores on individual media and impression types. We then evaluated the precision with which each media and impression type were estimated by analyzing social comments. We also considered the possibility and limitations of using social comments to estimate impressions of content. As a result, we revealed that it is better to use proper parts-of-speech in social comments depending on each media/impression type.

Keywords: Estimating impression · Music video clip · Social comments

1 Introduction

Due to the spread of consumer-generated media (CGM) websites such as YouTube and Nico Nico Douga, and the advancement of DTM software such as VOCALOID [13], the number of music video clips on the Web, each composed of music and a video, has dramatically increased. A standard method of searching for these music video clips is to input information such as artists’ names, song titles, and the tags provided. This search method makes it possible to find the target music video clip directly. However, as this method requires users to know information on music video clips in advance, it is sometimes not easy to find the target clips. To solve this problem, researchers in the field of music searches have been actively studying ambiguous searches based on the user’s subjective impressions, such as cheerful or sorrowful. If searches based on impressions become possible, users will be able to search from a new viewpoint. In addition, we can expect users to be able to find new music video clips. To realize impression-based music video clip searches, we have to evaluate and provide subjective impressions of individual music video clips in advance. However, as previously explained, since the number of music video clips has been increasing explosively, it is too difficult for us to evaluate the impressions of all music video clips by hand. Thus, we need to mechanically estimate the impressions of music video clips.

© Springer Nature Switzerland AG 2018 H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 115–128, 2018. https://doi.org/10.1007/978-3-319-98743-9_10


Nevertheless, it is not easy to mechanically estimate the impressions that viewers would have of music video clips because a music video clip consists of only music and video. To achieve this, we decided to use comments written on music video clips on some websites. For example, on Nico Nico Douga in Japan and BiliBili Douga in China, users can freely post comments while viewing a music video clip in order to show appreciation for authors, to communicate with others, to express their feelings, to add explanations and lyrics, and so on. We regard these comments as the viewers’ subjective impressions of the music video clips, and make use of them for mechanical estimation. Although we conducted impression estimation of music video clips using comments in our past work [4], we only looked at adjectives in comments, and we did not consider other parts-of-speech. However, we thought that along with adjectives, other parts-of-speech can also be an essential factor in estimating impressions. Thus, we examine what parts-of-speech in comments should be considered for impression estimation. Also, the past research [4] used the whole of a music video clip for the estimation. However, the most exciting part of the structure of music is known to be the chorus part [12]. Therefore, we assumed that the chorus part decides the impression that the viewers would receive, and decided to estimate the impressions of the chorus part of the music video clip only.

A music video clip normally consists of music and a video. As a result, people may focus on different media types (i.e., music, video, or combined) of the music video clip when searching for it based on the impression. For example, one may search for music video clips with happy songs, while others may search for those with cool video pictures. In addition, different people may post comments on different media types of a music video clip. For example, one may post comments on music video clips to express “the songs are happy”, while others may post comments to express “cool video picture.” Thus, we focused on social comments on Nico Nico Douga and examined the possibility of estimating the impressions of music video clips using comments. We also considered three media types: music only, video picture only, and combined.

In this paper, we generated the impression evaluation dataset, which contains evaluations of eight different types of impressions for each of the three media types (music only, video picture only, and both) for the chorus part of 500 music video clips. In addition, we collected social comments on the chorus part of those music video clips and generated 12 types of bag-of-words, each based on particular parts-of-speech used in the comments. Then, we tested these bag-of-words for estimating the impressions of music video clips. In addition, we examined the accuracy with which impressions were estimated by support vector machines (SVMs) using these bag-of-words. The main contributions of this paper are as follows.

• We generated the impression dataset of the chorus part of 500 music video clips in three media types (music only, video picture only, and combined).
• We revealed that it is better to use proper parts-of-speech in social comments depending on each media/impression type.


2 Related Work

There have been various studies on estimating impressions of the contents of music video clips. Some researchers initially estimated impressions of songs [9, 10]. These researchers improved the accuracy of estimates by using not only acoustic features but also subjective features like lyrics. Impressions of videos have also been estimated [11]. This research achieved estimates with high levels of accuracy using not only video features but also subjective features such as viewers’ expressions. There have also been many studies on estimating impressions of music video clips, which are our target [7, 8]. These researchers focused on the fact that music video clips combine music and videos, and estimated impressions by combining these characteristics. As a result, although it is possible to estimate impressions with high levels of accuracy, the features of music and images are mechanically extracted features in which no human emotions are reflected. Therefore, we considered that better estimates would be possible using subjective features like those in the research explained above [9–11]. Some research has focused on comments provided to music video clips as one of the subjective characteristics of these clips [5, 6]. These researchers estimated impressions using comments posted on YouTube. However, since comments unrelated to the movies, such as conversations between users, are also posted, many of them cannot be used to estimate impressions. In contrast, Nico Nico Douga, which is the most popular CGM website in Japan, has a function to attach comments to the video in real time. These comments can be considered to express impressions that users directly felt in real time. In fact, there is research that estimated the impressions of music video clips using these comments [4]. That research treated adjectives in the comments and the length of the comments as comment features. In this paper, we focus on the parts-of-speech in the comments and analyze the accuracy with which impressions are estimated.

There are also various approaches to the impression class. First, there is research on clustering the impressions of songs [1]. This research clustered the impressions of music into eight groups. Russell also proposed a valence-arousal space as a model for estimating impressions of music [2]. Valence is a dimension expressing pleasure-discomfort, and arousal is a dimension expressing arousal-sedation; the idea is to express an impression in these two dimensions. In our study, we estimated and analyzed the impressions of the valence-arousal space, the impressions of the music information retrieval evaluation exchange (MIREX), and the impression of “cute,” which is frequently used in Nico Nico Douga.

3 Generating the Impression Evaluation Dataset

In this paper, we generate the impression evaluation dataset of music video clips. The dataset covers the chorus part of music video clips. This dataset also divides one music video clip into three media types (music only, video picture only, and music video clip (combined)), and three or more subjects evaluated eight impressions for each.


We collected the 500 target music video clips from March 26, 2015, to June 18, 2016. The clips to be evaluated were chosen from videos posted on Nico Nico Douga that were tagged “VOCALOID” and had a large number of views. We extracted 30 s of each music video clip, starting 5 s before the start of the chorus part estimated by refrain detection (RefraiD) [12]. We started the extract 5 s before the detected chorus because the change from the pre-chorus to the chorus would also be important. In most of the targeted music video clips, the chorus part was shorter than 25 s. In addition, we watched and checked all 500 music video clips and found no case where the chorus part was detected incorrectly.

The eight impressions were composed of five impressions used in MIREX [3], which is a music information search workshop, two impressions forming the valence-arousal space proposed by Russell et al. [2], and one impression called “cute” used in the research of Yamamoto et al. [4]. Table 1 summarizes the eight impressions used in the dataset. The “impression names” in the table are labels that have been given to the impressions for convenience. The “adjectives representing impressions” are the expressions of the impression classes shown to subjects when collecting evaluation values during dataset construction.

Table 1. 8 Impressions in dataset

| Impression names | Adjectives representing impressions |
|---|---|
| C1 (exciting) | Exciting, bustling, proudly, & dignified |
| C2 (cheerful) | Cheerful, happy, hilarious, & comfortable |
| C3 (painful) | Painful, gloomy, bittersweet, & sorrowful |
| C4 (fierce) | Fierce, aggressive, emotional, & active |
| C5 (humorous) | Humorous, funny, strange, & capricious |
| C6 (cute) | Cute, lovely, awesome, tiny, & |
| Valence | Bright feelings & fun / Dark feelings, sad |
| Arousal | Fierce, aggressive, & bullish / Gentle, passive, & bearish |

For the evaluation, we presented one of the media types to subjects for 30 s. After watching it, they rated each impression on a five-point Likert scale: from one (strongly disagree) to five (strongly agree) for C1 to C6, from −2 (dark feelings and sad) to +2 (bright feelings and fun) for valence, and from −2 (gentle, passive, and bearish) to +2 (fierce, aggressive, and bullish) for arousal. When they finished answering, the next content was presented. The contents were presented in random order regardless of media type. Subjects performed the evaluation through a Web interface following the above procedure. To make it easier to compare C1 to C6 with valence and arousal, the evaluation values of one to five for C1 to C6 were converted to −2 to +2 by subtracting three. After that, we calculated the average of three subjects for the impression evaluation value and used it as the evaluation value for each media and impression type in this paper. We published this dataset at http://nkmr.io/mood/.
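The score conversion and averaging described above can be sketched as follows. This is only an illustration: the function names `to_centered` and `impression_value` and the example ratings are hypothetical and are not taken from the published dataset.

```python
from statistics import mean


def to_centered(score: int) -> int:
    """Map a 1..5 Likert rating (C1 to C6) onto the -2..+2 range by subtracting 3."""
    if not 1 <= score <= 5:
        raise ValueError("Likert rating must be between 1 and 5")
    return score - 3


def impression_value(ratings: list[int]) -> float:
    """Average the centered ratings given by the subjects for one clip, media type, and impression."""
    return mean([to_centered(r) for r in ratings])


# Hypothetical ratings from three subjects for one clip and one impression.
example_ratings = [5, 4, 4]
print(impression_value(example_ratings))
```

Valence and arousal are already rated on the −2 to +2 range, so under this reading only the C1 to C6 ratings need the shift before averaging.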


4 Evaluation Experiment

We conducted an evaluation experiment using the impression evaluation dataset to investigate whether evaluations made by people for the impressions of music video clips could be mechanically estimated using social comments. Using SVMs, we tested whether clips whose impression evaluation exceeded a certain value in the impression evaluation dataset could be mechanically identified. Two sets of music video clips (high and low evaluation groups) were constructed based on the impression evaluation value for each media/impression type. In addition, we divided each dataset into learning and test data. We evaluated how well the high evaluation group could be classified from social comments by training and testing SVMs with cross-validation. First, we describe the methods of collecting social comments and generating the bag-of-words used as input to the SVMs, and then we explain the basic evaluation that considers the amount of data. Next, we show how accurately each method of generating bag-of-words could estimate each media/impression type. Based on the results, we discuss the appropriate method of bag-of-words generation to estimate impressions for each media/impression type.

4.1 Generation of Bag-of-Words for Music Video Clips

We gathered the comments given to the music video clips in the impression evaluation dataset to examine the accuracy with which each media/impression type of a music video clip was estimated from social comments. Specifically, we collected all comments on the relevant music video clips using the Nico Nico Douga application programming interface (API) on July 23, 2015, and gathered 860,455 comments. We then extracted the comments posted to the chorus part based on the start and end times of each music video clip. This processing yielded 132,036 comments (264.1 on average per music video clip).

We next generated a bag-of-words for each music video clip from the social comments. We first morphologically analyzed the comments on the chorus part of each extracted music video clip using MeCab [14] and divided them into words. After that, the number of occurrences of each word was taken as the bag-of-words for the music video clip. We prepared 12 methods of bag-of-words generation that differ in the parts-of-speech used. The first method used all parts-of-speech. The other methods were built from four parts-of-speech: adjectives were considered to show impressions, nouns and verbs were thought to carry features presented by the music video clips, and adverbs were considered to express the degree of impression, such as “more” or “very.” We prepared methods that used each of these four parts-of-speech alone, methods that combined two of them, and a method that used all four. Table 2 summarizes all of these method names and the parts-of-speech we used.
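The part-of-speech filtering behind these methods can be sketched as follows. The sketch assumes the comments have already been split into (word, part-of-speech) pairs by a morphological analyzer; the tag strings, the `METHODS` mapping, and the example tokens are hypothetical and do not reproduce MeCab's actual output format.

```python
from collections import Counter

# Hypothetical mapping from a few method names to the parts-of-speech they keep.
METHODS = {
    "All2": {"noun", "verb", "adjective", "adverb"},
    "Adj": {"adjective"},
    "Noun-adj": {"noun", "adjective"},
}


def bag_of_words(tagged_tokens, allowed_pos=None):
    """Count word occurrences, keeping only tokens whose tag is in allowed_pos.

    Passing allowed_pos=None keeps every token, which corresponds to
    using all parts-of-speech.
    """
    return Counter(
        word
        for word, pos in tagged_tokens
        if allowed_pos is None or pos in allowed_pos
    )


# Hypothetical tagged tokens from the chorus-part comments of one clip.
tagged = [
    ("cute", "adjective"), ("song", "noun"), ("cute", "adjective"),
    ("very", "adverb"), ("sing", "verb"), ("!", "symbol"),
]
for name, allowed in METHODS.items():
    print(name, dict(bag_of_words(tagged, allowed)))
print("All", dict(bag_of_words(tagged)))
```

Keeping the filter as a set of allowed tags makes each additional method a one-line entry in the mapping.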


Table 2. Methods of bag-of-words generation

| Method names | Parts-of-speech used |
|---|---|
| All method | All parts-of-speech |
| All2 method | Nouns, Verbs, Adjectives, Adverbs |
| Noun method | Nouns |
| Verb method | Verbs |
| Adj method | Adjectives |
| Adv method | Adverbs |
| Noun-verb method | Nouns, Verbs |
| Noun-adj method | Nouns, Adjectives |
| Noun-adv method | Nouns, Adverbs |
| Verb-adj method | Verbs, Adjectives |
| Verb-adv method | Verbs, Adverbs |
| Adj-adv method | Adjectives, Adverbs |

4.2 Basic Evaluation of Impression Classification

As described above, two sets of music video clips (high and low evaluation groups) were constructed based on the impression evaluation value, and we determined whether the machine could identify the music video clips of the high evaluation group for each media/impression type. More specifically, music video clips having an evaluation value greater than or equal to one were set as the high evaluation group, and those having an evaluation value of minus one or less were set as the low evaluation group. We next divided each set of music video clips into five groups, performed five-fold cross-validation using four of them as training data and the other as test data, and calculated the precision for the high evaluation group.

We first checked the amount of data available for machine learning. Tables 3 and 4 summarize the number of music video clips for each media/impression type in the constructed high and low evaluation groups. “Movie” means music video clips, “Audio” means music only, and “Visual” means video only. Also, “V” means Valence and “A” means Arousal in the tables below. Machine learning was performed based on these sets of music video clips by using each bag-of-words. However, a problem with imbalanced data occurred, probably because the number of music video clips was biased depending on the media/impression type (the numbers for Audio-C3 and Visual-C1 were small). We therefore under-sampled each media/impression type so that the numbers of music video clips were the same, and then ran the experiment and evaluated the results.

Table 3. No. of music video clips in high evaluation group

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A |
|---|---|---|---|---|---|---|---|---|
| Movie | 76 | 105 | 87 | 54 | 83 | 104 | 101 | 150 |
| Audio | 133 | 127 | 46 | 69 | 49 | 73 | 124 | 178 |
| Visual | 21 | 50 | 142 | 49 | 81 | 78 | 57 | 111 |
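The group construction, under-sampling, and five-fold split can be sketched as follows. This is a minimal illustration under stated assumptions: it assumes that under-sampling equalizes the sizes of the high and low groups within one media/impression type, and it uses a simple round-robin split rather than whatever fold assignment was actually used. The function names and example scores are hypothetical.

```python
import random


def split_groups(scores, high=1.0, low=-1.0):
    """Return (high_ids, low_ids) for averaged scores in the -2..+2 range."""
    high_ids = [clip for clip, s in scores.items() if s >= high]
    low_ids = [clip for clip, s in scores.items() if s <= low]
    return high_ids, low_ids


def under_sample(high_ids, low_ids, seed=0):
    """Randomly shrink the larger group so both groups have the same size."""
    rng = random.Random(seed)
    n = min(len(high_ids), len(low_ids))
    return rng.sample(high_ids, n), rng.sample(low_ids, n)


def k_folds(items, k=5):
    """Split items into k nearly equal folds (round-robin assignment)."""
    return [items[i::k] for i in range(k)]


# Hypothetical averaged scores for one media/impression type.
values = [1.33, 1.0, 2.0, 1.67, 1.0, 0.33, -0.33, -1.0, -1.33, -2.0, -1.67, 0.0]
scores = {f"clip{i}": v for i, v in enumerate(values)}

high_ids, low_ids = split_groups(scores)
high_bal, low_bal = under_sample(high_ids, low_ids)
folds = k_folds(high_bal + low_bal)
for i, test_fold in enumerate(folds):
    train = [c for j, fold in enumerate(folds) if j != i for c in fold]
    print(f"fold {i}: {len(train)} training clips, {len(test_fold)} test clips")
```

Clips with averaged scores strictly between −1 and +1 fall into neither group, which matches the thresholds stated in the text.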


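Table 4. No. of music video clips in low evaluation group

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A |
|---|---|---|---|---|---|---|---|---|
| Movie | 105 | 169 | 191 | 209 | 178 | 215 | 62 | 94 |
| Audio | 65 | 92 | 232 | 195 | 180 | 209 | 61 | 43 |
| Visual | 252 | 272 | 165 | 247 | 207 | 234 | 96 | 155 |

4.3 Results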

Tables 5–16 summarize the average precision for the high evaluation group obtained in the experiment with SVMs for each media/impression type when the bag-of-words was generated with each of the prepared methods. In the tables, values of 0.8 or more are shown in pink and values of 0.6 or less in blue.

Table 5. Precision of All method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.720 | 0.830 | 0.713 | 0.765 | 0.718 | 0.758 | 0.783 | 0.777 | 0.758 |
| Audio | 0.742 | 0.671 | 0.612 | 0.661 | 0.600 | 0.712 | 0.704 | 0.744 | 0.681 |
| Visual | 0.611 | 0.680 | 0.752 | 0.714 | 0.603 | 0.797 | 0.660 | 0.743 | 0.695 |
| Average | 0.691 | 0.727 | 0.692 | 0.713 | 0.640 | 0.756 | 0.712 | 0.755 | 0.711 |
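The precision reported in these tables can be read as the fraction of clips predicted to be in the high evaluation group that truly belong to it. The following sketch assumes this standard definition; the paper does not give the formula, and the function name `precision_high` and the example labels are hypothetical.

```python
def precision_high(y_true, y_pred, positive="high"):
    """Precision for the positive class: true positives / all positive predictions.

    Returns 0.0 when nothing is predicted as positive (a convention chosen
    for this sketch, not taken from the paper).
    """
    predicted_positive = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_positive:
        return 0.0
    return sum(t == positive for t in predicted_positive) / len(predicted_positive)


# Hypothetical labels for the test fold of one media/impression type.
y_true = ["high", "high", "low", "low", "high"]
y_pred = ["high", "low", "high", "low", "high"]
print(precision_high(y_true, y_pred))
```

In a five-fold setting, one natural reading is that this value is computed per fold and then averaged, which would correspond to the "average precision" wording used for the tables.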

Table 6. Precision of All2 method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.645 | 0.814 | 0.705 | 0.765 | 0.728 | 0.792 | 0.694 | 0.822 | 0.745 |
| Audio | 0.738 | 0.658 | 0.566 | 0.750 | 0.725 | 0.787 | 0.736 | 0.778 | 0.717 |
| Visual | 0.880 | 0.786 | 0.390 | 0.725 | 0.564 | 0.776 | 0.814 | 0.870 | 0.725 |
| Average | 0.754 | 0.753 | 0.554 | 0.747 | 0.672 | 0.785 | 0.748 | 0.823 | 0.730 |


Table 7. Precision of Noun method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.575 | 0.720 | 0.644 | 0.653 | 0.704 | 0.680 | 0.646 | 0.652 | 0.659 |
| Audio | 0.698 | 0.606 | 0.528 | 0.621 | 0.721 | 0.661 | 0.708 | 0.650 | 0.649 |
| Visual | 0.700 | 0.640 | 0.608 | 0.600 | 0.620 | 0.688 | 0.552 | 0.641 | 0.631 |
| Average | 0.658 | 0.655 | 0.593 | 0.625 | 0.682 | 0.676 | 0.635 | 0.648 | 0.647 |

Table 8. Precision of Verb method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.667 | 0.627 | 0.440 | 0.544 | 0.642 | 0.714 | 0.575 | 0.574 | 0.597 |
| Audio | 0.615 | 0.622 | 0.133 | 0.658 | 0.587 | 0.500 | 0.600 | 0.551 | 0.533 |
| Visual | 0.588 | 0.549 | 0.606 | 0.517 | 0.584 | 0.573 | 0.508 | 0.654 | 0.572 |
| Average | 0.623 | 0.599 | 0.393 | 0.573 | 0.604 | 0.596 | 0.561 | 0.593 | 0.568 |

Table 9. Precision of Adj method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.733 | 0.869 | 0.710 | 0.750 | 0.667 | 0.838 | 0.650 | 0.842 | 0.757 |
| Audio | 0.667 | 0.635 | 0.595 | 0.667 | 0.581 | 0.775 | 0.706 | 0.733 | 0.669 |
| Visual | 0.714 | 0.736 | 0.733 | 0.759 | 0.536 | 0.829 | 0.603 | 0.850 | 0.720 |
| Average | 0.705 | 0.747 | 0.679 | 0.725 | 0.595 | 0.814 | 0.653 | 0.808 | 0.716 |

Table 10. Precision of Adv method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.618 | 0.586 | 0.522 | 0.576 | 0.520 | 0.481 | 0.556 | 0.603 | 0.557 |
| Audio | 0.679 | 0.600 | 0.580 | 0.537 | 0.545 | 0.481 | 0.642 | 0.538 | 0.575 |
| Visual | 0.879 | 0.759 | 0.211 | 0.632 | 0.519 | 0.451 | 0.777 | 0.805 | 0.629 |
| Average | 0.725 | 0.648 | 0.438 | 0.582 | 0.528 | 0.471 | 0.658 | 0.649 | 0.587 |


Table 11. Precision of Noun-verb method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.687 | 0.699 | 0.648 | 0.620 | 0.681 | 0.714 | 0.661 | 0.636 | 0.668 |
| Audio | 0.683 | 0.580 | 0.489 | 0.642 | 0.689 | 0.672 | 0.729 | 0.658 | 0.642 |
| Visual | 0.881 | 0.760 | 0.308 | 0.614 | 0.595 | 0.639 | 0.805 | 0.859 | 0.682 |
| Average | 0.750 | 0.680 | 0.482 | 0.625 | 0.655 | 0.675 | 0.732 | 0.718 | 0.665 |

Table 12. Precision of Noun-adj method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.662 | 0.854 | 0.690 | 0.780 | 0.750 | 0.778 | 0.694 | 0.800 | 0.751 |
| Audio | 0.754 | 0.644 | 0.612 | 0.750 | 0.707 | 0.772 | 0.740 | 0.806 | 0.723 |
| Visual | 0.888 | 0.792 | 0.409 | 0.706 | 0.657 | 0.768 | 0.821 | 0.874 | 0.739 |
| Average | 0.768 | 0.763 | 0.570 | 0.745 | 0.705 | 0.773 | 0.752 | 0.827 | 0.738 |

Table 13. Precision of Noun-adv method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.592 | 0.714 | 0.644 | 0.654 | 0.722 | 0.673 | 0.656 | 0.649 | 0.663 |
| Audio | 0.672 | 0.589 | 0.538 | 0.621 | 0.711 | 0.661 | 0.694 | 0.632 | 0.639 |
| Visual | 0.879 | 0.763 | 0.372 | 0.636 | 0.622 | 0.683 | 0.805 | 0.852 | 0.701 |
| Average | 0.714 | 0.689 | 0.518 | 0.637 | 0.685 | 0.672 | 0.718 | 0.711 | 0.668 |

Table 14. Precision of Verb-adj method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.781 | 0.811 | 0.711 | 0.684 | 0.667 | 0.856 | 0.652 | 0.784 | 0.743 |
| Audio | 0.692 | 0.627 | 0.520 | 0.714 | 0.682 | 0.740 | 0.673 | 0.707 | 0.669 |
| Visual | 0.921 | 0.734 | 0.400 | 0.734 | 0.511 | 0.764 | 0.779 | 0.871 | 0.714 |
| Average | 0.798 | 0.724 | 0.544 | 0.711 | 0.62 | 0.787 | 0.701 | 0.787 | 0.709 |


Table 15. Precision of Verb-adv method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.667 | 0.568 | 0.535 | 0.531 | 0.657 | 0.630 | 0.600 | 0.660 | 0.606 |
| Audio | 0.677 | 0.560 | 0.458 | 0.566 | 0.587 | 0.513 | 0.589 | 0.581 | 0.566 |
| Visual | 0.882 | 0.729 | 0.250 | 0.622 | 0.488 | 0.529 | 0.724 | 0.814 | 0.629 |
| Average | 0.742 | 0.619 | 0.414 | 0.573 | 0.577 | 0.557 | 0.638 | 0.685 | 0.601 |

Table 16. Precision of Adj-adv method

| | C1 | C2 | C3 | C4 | C5 | C6 | V | A | Average |
|---|---|---|---|---|---|---|---|---|---|
| Movie | 0.700 | 0.837 | 0.679 | 0.690 | 0.681 | 0.848 | 0.695 | 0.844 | 0.746 |
| Audio | 0.733 | 0.646 | 0.581 | 0.634 | 0.683 | 0.743 | 0.667 | 0.718 | 0.675 |
| Visual | 0.911 | 0.765 | 0.477 | 0.653 | 0.622 | 0.757 | 0.840 | 0.884 | 0.738 |
| Average | 0.781 | 0.749 | 0.579 | 0.659 | 0.662 | 0.783 | 0.734 | 0.815 | 0.720 |

First, comparing the All and All2 methods, we found that the All2 method produced more values above 0.8 across the media/impression types than the All method, and its overall average value was also higher. However, the value of the C3 (painful) impression was low in all media types. Next, by comparing the methods using only one part-of-speech, we can see that the Noun, Verb, and Adv methods were not as accurate in estimation as the Adj method. Although the Adv method had a few high values that slightly exceeded 0.8, low values below 0.6 were often found. In contrast, the Adj method had many values that exceeded 0.8, and the precision of C6 (cute) and Arousal was particularly high. For the methods using two parts-of-speech, the number of high values increased by combining parts-of-speech; in particular, the methods that included adjectives achieved many high values. Furthermore, among all the methods, a high value exceeding 0.8 for Audio was found only for the Arousal of the Noun-adj method. However, C3 achieved no high values for any of the methods, and we found many low values below 0.6. Visual-C1, Movie-C2, and Visual-Arousal attained relatively high values regardless of which method was used. We can also see that there is some bias in the media/impression types with high values.


5 Discussion

We found that the accuracy of estimation using social comments differed depending on the method and the media/impression type. The All2 method achieved much higher values than the All method, and its overall average was also higher. This might be because the All method used all types of written expressions, including symbols such as parentheses and emoticons, which are unlikely to represent impressions. However, the accuracy of C3 was higher for the All method than for the All2 method in all media types. The Visual-C3 value of the All2 method was especially low, at 0.39. From this, we considered that the parts-of-speech excluded by the All2 method contributed to the accuracy of estimating C3.

Among the methods using only one part-of-speech, high values appeared with the Adj method; however, the other three methods did not achieve high values, and their overall average values were also very low. From this, we considered that nouns, verbs, and adverbs were either not used much to express impressions, or that the words used for impressions had no distinctive features. We therefore considered that users most often expressed impressions using adjectives and that the adjectives used had distinctive features.

Next, values exceeding 0.8 became more frequent across the media/impression types for the methods that combined two parts-of-speech than for the methods using only one. Therefore, we considered the methods that used two parts-of-speech to be useful. In particular, the results for Visual-valence were high with the Noun-verb and Noun-adv methods but low with the Noun, Verb, and Adv methods. This shows that combining parts-of-speech improved the accuracy of estimation.

Since the results differed depending on the combination of parts-of-speech used in bag-of-words generation, the accuracy of estimation may improve further if parts-of-speech not used in this research are combined with the others. The value for C3 was low in all combinatory methods, presumably because the parts-of-speech used in this experiment made it difficult for features to emerge; we expect that other parts-of-speech could improve this. However, nouns, verbs, adjectives, and adverbs are the major components of sentences, so C3 may be inherently difficult to estimate from social comments. Moreover, we obtained high values for the C6 (cute) and Arousal impression types, so these impressions were considered easy to estimate from social comments. The main reason for the higher values was considered to be that the words used in the high evaluation group had distinctive features. For example, users often expressed the C6 (cute) impression with the word “cute.” Hence, we thought that C6 could be learned well because of such features. There were also media/impression types that had relatively high values with all methods, such as Visual-C1 (proudly), Movie-C2 (cheerful), and Visual-arousal. We expect that these media/impression types will be easy to estimate from social comments. Therefore, by analyzing the features of the comments for these types (such as the number of comments and the words used), we aim to improve the accuracy of estimation for other media/impression types.
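The part-of-speech methods compared in this discussion can be illustrated with a minimal sketch. It assumes that comments have already been tokenized and tagged by a morphological analyzer, so each token is a hypothetical (word, pos) pair with the lowercase tags noun, verb, adj, and adv; the function names are illustrative and are not taken from the paper. The All and All2 methods are left out because their exact token sets are not restated here.

```python
from collections import Counter
from itertools import combinations

# Assumed tag names; a real morphological analyzer would use its own tag set.
POS_TAGS = ("noun", "verb", "adj", "adv")


def method_pos_sets():
    """Map method names (e.g. "Adj", "Noun-adj") to the POS tags they keep."""
    methods = {pos.capitalize(): {pos} for pos in POS_TAGS}
    for first, second in combinations(POS_TAGS, 2):
        methods[f"{first.capitalize()}-{second}"] = {first, second}
    return methods


def bag_of_words(tagged_tokens, allowed_pos):
    """Count only the words whose POS tag is in allowed_pos."""
    return Counter(word for word, pos in tagged_tokens if pos in allowed_pos)
```

Each resulting counter would then serve as the feature vector of one clip for the classifier of a given media/impression type, so that the ten single and paired methods can be compared on equal terms.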


Table 17. Method that yielded highest values in each media/impression type

        C1        C2        C3        C4        C5        C6        V         A
Movie   Verb-adj  Adj       All       Noun-adj  Noun-adj  Verb-adj  All       Adj-adv
Audio   Noun-adj  All       Noun-adj  Noun-adj  All2      All2      Noun-adj  Noun-adj
Visual  Verb-adj  Noun-adj  All       Adj       Noun-adj  Adj       Adj-adv   Adj-adv

Table 18. Highest value for each media/impression type

         C1     C2     C3     C4     C5     C6     V      A      Average
Movie    0.781  0.869  0.713  0.780  0.750  0.856  0.783  0.844  0.797
Audio    0.754  0.671  0.612  0.750  0.725  0.787  0.740  0.806  0.731
Visual   0.921  0.792  0.752  0.759  0.657  0.829  0.840  0.884  0.804
Average  0.819  0.777  0.692  0.763  0.711  0.824  0.788  0.845  0.777

Tables 17 and 18 list, for each media/impression type, the method that yielded the highest value and that value, respectively. Table 17 indicates that methods that included adjectives had the highest values for all the media/impression types. From this, we considered that people use adjectives when expressing impressions and that features are likely to appear in the adjectives. We also found that adjectives are an important part-of-speech when estimating impressions of music video clips from social comments.

Table 18 indicates that values that exceed 0.75 appear in 20/24 media/impression types (three media × eight impressions). Since the evaluation value of the dataset used in this paper was the average of the evaluations by three people, there is some blur in the evaluation value. Therefore, we considered an accuracy exceeding 0.75 to be relatively effective. In particular, values that exceeded 0.8 mean that classification was as accurate as 80%, so we considered those to be effective. There was a clear difference between the averages of the Audio and Visual types. From this, we inferred that users tended to comment on the video, and we considered that estimates from the comments were useful for the impressions of the video. We only used VOCALOID songs in this paper. Therefore, when a character such as Hatsune Miku appears in a video, comments may be made about the character regardless of the music. We plan to analyze this carefully.

Based on the results above, we considered that the impressions of music video clips could be estimated from social comments by using, for each media/impression type, the method best suited to it. Moreover, if the impressions of all media types can be estimated, we expect that highly accurate estimates of the impressions of music video clips will become possible by combining them with research on combining the impressions of music and video.
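As a consistency check on Table 18, the per-media averages can be recomputed from the eight per-impression values. The sketch below copies the values from Table 18; the variable and function names are illustrative only.

```python
# Highest values per media type from Table 18, in column order C1..C6, V, A.
TABLE_18 = {
    "Movie":  [0.781, 0.869, 0.713, 0.780, 0.750, 0.856, 0.783, 0.844],
    "Audio":  [0.754, 0.671, 0.612, 0.750, 0.725, 0.787, 0.740, 0.806],
    "Visual": [0.921, 0.792, 0.752, 0.759, 0.657, 0.829, 0.840, 0.884],
}


def media_averages(table):
    """Return the mean of each row, rounded to three decimals as in the table."""
    return {
        media: round(sum(values) / len(values), 3)
        for media, values in table.items()
    }


averages = media_averages(TABLE_18)
```

The rounded means agree with the Average column of Table 18 (0.797 for Movie, 0.731 for Audio, and 0.804 for Visual), which is the Audio/Visual gap referred to in the discussion.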


6 Conclusion

In this paper, we generated an impression evaluation dataset that consisted of 500 music video clips, three media, and eight impressions, and used it to analyze the possibility of estimating impressions for each media/impression type from social comments. We created bag-of-words representations for music video clips, estimated impressions using SVMs for each media/impression type, and discussed the usefulness of the results. When generating the bag-of-words, we mainly used four parts-of-speech and their combinations, compared the methods, and identified effective methods for each media/impression type. As a result, we found that the accuracy of estimation differed among the methods and that methods that included adjectives yielded the highest values in all media/impression types. From the results in this research, we considered that impressions of music video clips can be estimated by using the most effective method for each media/impression type. Therefore, social comments can contribute to estimating the impressions of music video clips.

However, the highest values for Audio-C2 (hilarious), Audio-C3 (painful), and Visual-C5 (humorous) were not high: 0.671, 0.612, and 0.657, respectively. We aim to improve accuracy in this regard by estimating impressions not only from social comments but also in combination with other features such as sound and video. We considered that this would benefit not only the media/impression types that had low values but all of them. We evaluated the accuracy of estimation using classification accuracy in this research; however, we considered that more accurate impression-based searches will become possible by estimating the evaluation value itself. Therefore, we intend to explore specific methods of estimating the evaluation value in the future.

In addition, we considered that there was blurring in the evaluation values of the impression evaluation dataset used in this research because there were only three evaluators. Furthermore, since we did not investigate the influence of the number of comments or what kind they were, we plan to investigate their impact with reference to Yamamoto and Nakamura [4] in the future.

Acknowledgments. This work was supported in part by JST ACCEL Grant Number JPMJAC1602, Japan.

References

1. Hevner, K.: Experimental studies of the elements of expression in music. Am. J. Psychol. 48(2), 246–268 (1936)
2. Russell, J.A.: A circumplex model of affect. J. Pers. Soc. Psychol. 39(6), 1161–1178 (1980)
3. Hu, X., Downie, J.S., Laurier, C., Bay, M., Ehmann, A.F.: The 2007 MIREX audio mood classification task: lessons learned. In: 9th International Conference on Music Information Retrieval, ISMIR 2008, Philadelphia, pp. 14–18 (2008)
4. Yamamoto, T., Nakamura, S.: Leveraging viewer comments for mood classification of music video clip. In: 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013, Dublin, pp. 797–800 (2013)
5. Eickhoff, C., Li, W., de Vries, A.P.: Exploiting user comments for audio-visual content indexing and retrieval. In: Serdyukov, P., et al. (eds.) ECIR 2013. LNCS, vol. 7814, pp. 38–49. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36973-5_4
6. Filippova, K., Hall, K.: Improved video categorization from text metadata and user comments. In: 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011, Beijing, pp. 835–842 (2011)
7. Acar, E., Hopfgartner, F., Albayrak, S.: Understanding affective content of music videos through learned representations. In: Gurrin, C., Hopfgartner, F., Hurst, W., Johansen, H., Lee, H., O’Connor, N. (eds.) MMM 2014. LNCS, vol. 8325, pp. 303–314. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-04114-8_26
8. Ashkan, Y., Evangelos, S., Nikolaos, F., Touradj, E.: Multimedia content analysis for emotional characterization of music video clips. EURASIP J. Image Video Process. 1–10 (2013)
9. Hu, X., Downie, J., Ehmann, A.: Lyric text mining in music mood classification. In: 10th International Society for Music Information Retrieval, ISMIR 2009, Kobe, pp. 411–416 (2009)
10. Laurier, C., Grivolla, J., Herrera, P.: Multimodal music mood classification using audio and lyrics. In: 7th International Conference on Machine Learning and Applications, ICMLA 2008, San Diego, pp. 688–693 (2008)
11. Sicheng, Z., Hongxun, Y., Xiaoshuai, S., Pengfei, X., Xianming, L., Rongrong, J.: Video indexing and recommendation based on affective analysis of viewers. In: 19th ACM International Conference on Multimedia, MM 2011, Scottsdale, pp. 1473–1476 (2011)
12. Goto, M.: A chorus section detection method for musical audio signals and its application to a music listening station. IEEE Trans. Audio Speech Lang. Process. 14(5), 1783–1794 (2006)
13. Kenmochi, H., Oshita, H.: VOCALOID – commercial singing synthesizer based on sample concatenation. In: 8th Annual Conference of the International Speech Communication Association, Interspeech 2007, Antwerp, pp. 4009–4010 (2007)
14. Kudo, T., Yamamoto, K., Matsumoto, Y.: Applying conditional random fields to Japanese morphological analysis. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, EMNLP 2004, Barcelona, pp. 230–237 (2004)

Detection of Football Spoilers on Twitter

Yuji Shiratori, Yoshiki Maki, Satoshi Nakamura, and Takanori Komatsu

Meiji University, 4-21-1 Nakano, Nakano-Ku, Tokyo, Japan
[email protected]

Abstract. Sports spoilers on SNS services such as Twitter and Facebook spoil viewers’ enjoyment when watching recorded matches. To avoid spoilers, people sometimes stay away from SNSs. However, people often use SNSs to habitually check messages posted by their friends and to build and maintain their relationships. Therefore, we need an automatic method for detecting spoilers on SNSs. In this paper, we generated a Japanese spoiler dataset from Twitter and investigated the characteristics of the spoilers as a foothold for constructing an automatic spoiler detection system. Consequently, we clarified the relationship between spoilers and the statuses of football matches. In addition, we compared three methods for detecting spoilers and showed the usefulness of the SVM with Status of Match method.

Keywords: Blocking spoilers · SNS · Twitter · Machine learning · Sports · Football

© Springer Nature Switzerland AG 2018 H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 129–141, 2018. https://doi.org/10.1007/978-3-319-98743-9_11

1 Introduction

Many people like to watch sports games in real time and enjoy the excitement and surprise. However, it is often difficult for them to watch in real time because they are busy with work, studies, etc. In such situations, it is common to schedule a recording in advance and watch it when they have some free time. However, if a viewer learns the result of the match before watching it, the feelings of excitement and surprise would probably be lost. Since such viewers would like to avoid “spoiler information” such as scores and the winners/losers of the match, they actively choose to isolate themselves from their community to block information on the match until they watch it. However, since SNS services such as Twitter and Facebook allow people to habitually check messages posted by their friends and to build and maintain their relationships, isolation is not a good approach for keeping those relationships healthy. To avoid damaging personal relationships, we need an automatic method for detecting spoilers on SNSs.

Here, if users want to avoid learning the status of football games, a simple approach of finding and blocking all football posts may seem reasonable. However, this approach blocks all football posts, including posts that viewers do not want to block. Further, we think viewers should be able to enjoy conversation about the target games as long as it does not contain any spoiler information.

In recent years, research has been conducted on blocking such spoiler information. For example, Nakamura et al. proposed a method for filtering information on web pages that corresponds to a user’s interest on the basis of their e-mail and a TV program guide [1]. Previous research focused on methods for blocking spoilers through interaction between users and systems, but it has not clarified what characteristics spoilers have or how to detect them with high accuracy. Therefore, we investigated the characteristics of spoilers by generating a spoiler dataset of posts about football matches on Twitter and examined methods for detecting spoilers with high accuracy.

The contributions of this work are: (1) the generation of a football spoiler dataset of Twitter posts; (2) experimental evidence of the usefulness of the SVM with Status of Match method, obtained by comparing the accuracy of spoiler detection across three methods.

The rest of this paper is organized as follows. Section 2 discusses related work. Sections 3 and 4 describe the generation of the spoiler dataset and its analysis. Section 5 verifies the effectiveness of three word-based methods. Finally, Sects. 6 and 7 present the discussion and our conclusions.

2 Related Work

2.1 Influence of Spoilers

Regarding the influence of spoilers, Leavitt et al. focused on novels and investigated how users’ enjoyment differed depending on whether spoiler information was presented [2]. Based on the experiment, they claimed that spoiler information does not lower the enjoyment of the content. However, the act of reading a novel and the act of watching sports are essentially different. Moreover, their results only suggested that spoilers, by presenting a summary, help readers understand the content and the personal relationships among the characters, making novels easier to read. In addition, Rosenbaum et al. confirmed that those who are not familiar with a novel feel that the story is more interesting with spoilers, and those who are familiar feel that it is more interesting without [3]. Several researchers have revealed the bad influence of spoilers. Therefore, there is a need to automatically filter out spoilers from SNSs. As a foothold toward realizing this, we generated a spoiler dataset and examined methods for detecting spoilers with high accuracy.

2.2 Blocking Spoilers

Research on blocking information similar to spoilers has been widely conducted on review texts on the Internet. Ikeda et al. were concerned with the inclusion of spoilers in review texts for story content and eliminated spoilers using machine learning [4]. Pang et al. used a support vector machine (SVM) to identify which sentences in review texts do not include the outline of a story [5]. In this research on review texts, all outlines are judged as spoilers, but in sports, some posts (such as comments expressing one’s happiness or sadness) do not directly reveal the outline, i.e., the content of the match. Therefore, these spoilers are slightly different in nature from the spoilers discussed in this paper. As research on the problem of spoilers on SNSs such as Twitter and Facebook, Boyd-Graber et al. evaluated machine learning approaches to finding spoilers in social media posts [6]. They targeted movie reviews and used classifiers on multiple sources to determine which posts should be blocked.


Sports spoilers, in contrast, are often related to game results, so their content differs from that of the spoilers dealt with in these studies. We analyze the characteristics of spoiler sentences about football games and examine methods for detecting spoilers with high accuracy. Jeon et al. proposed a method of detecting spoilers using machine learning, focusing on “named entities”, “frequently used verbs”, “future tense”, etc. in comments on Twitter [7]. Through experiments using comments on television programs, they found that their method could detect tweets containing spoilers with higher precision than methods that use keyword matching or latent Dirichlet allocation (LDA), confirming its utility. They also carried out an experiment on sports spoilers. However, they conducted it for only one match and labeled the tweets themselves, so the evaluation was not strict. Moreover, their method is not applicable to tweets in Japanese because Japanese does not have a future tense. We had human labelers construct a spoiler dataset for tweets on nine football matches, and we examined methods for detecting spoilers that can be applied to Japanese.

3 Generating Spoiler Dataset

In this section, as a foothold for constructing an automatic spoiler detection system, we describe the dataset we built to analyze what kind of information a spoiler holds. Currently, viewers can encounter spoilers in various forms of media such as news websites, weblogs, and search websites. SNSs like Twitter in particular have increased the chance of encountering spoilers. Many people use Twitter casually because they can learn what their friends are doing just by accessing it and can easily communicate with others; thus, there is a high possibility of seeing spoiler information. Therefore, we collected posts on Twitter related to football matches and analyzed the characteristics of spoilers by constructing a spoiler dataset. From now on, a Twitter post is called a “tweet”.

To generate a dataset, we first collected tweets on football matches. Here, we focused on matches played by the Japan national football team, on which a particularly large number of tweets were posted by many fans [8]. The information on the matches is shown in Table 1.

Table 1. Matches for generating dataset

Match                                                Score        Day
2015 Women’s World Cup “Japan vs. England”           JPN 2–1 ENG  07/01/15
2015 Women’s World Cup “Japan vs. United States”     JPN 2–5 USA  07/05/15
2015 EAFF East Asian Cup “Japan vs. South Korea”     JPN 1–1 KOR  08/05/15
2015 Women’s EAFF East Asian Cup “Japan vs. China”   JPN 2–0 CHN  08/08/15
2015 EAFF East Asian Cup “Japan vs. China”           JPN 1–1 CHN  08/09/15
World Cup Qualifiers “Japan vs. Cambodia”            JPN 3–0 KHM  09/03/15
World Cup Qualifiers “Japan vs. Afghanistan”         JPN 6–0 AFG  09/08/15
Friendlies “Japan vs. Iran”                          JPN 1–1 IRI  10/13/15
World Cup Qualifiers “Japan vs. Singapore”           JPN 3–0 SIN  11/12/15


Currently, when tweeting about real-time content, tags called “hashtags” are sometimes used for search/classification. For example, hashtags such as “#daihyo” and “#JPN” are used for matches of the Japanese national football team. If a hashtag for the target sport were attached to every tweet about a target match, it would be sufficient to block all tweets including that hashtag. However, there are many tweets without a hashtag that are actually related to a match. To block them, it is necessary to analyze the contents of the tweets. In addition, content-based analysis removes the need to keep up with changes in the set of hashtags.

However, to collect all tweets related to a football match, we would need to collect all the tweets posted at that time and then select those related to the match. This leads to a problem of accuracy in selecting tweets, and it is also not possible to collect private tweets. In addition, the streaming APIs provided by Twitter cannot collect all tweets. Therefore, although tweets may differ depending on whether they have hashtags, in this paper we assume that there is no significant difference in content between tweets with and without hashtags. We decided to give priority to collecting tweets efficiently and collected tweets with hashtags. Here, some hashtags such as “#daihyo” or “#JPN”, which are commonly used for matches of the Japanese national football team, were selected before a match, and tweets including the hashtags were collected using the Search API provided by Twitter. Tweets were collected from the start of a match to 2 h after.

Among the collected data, there were also many tweets that were not appropriate for classification and analysis. Therefore, we removed inappropriate tweets and formatted the remaining tweets using the following procedure.

1. Since many tweets from opponent countries are also posted on matches such as the World Cup games, the collected tweets were in multiple languages. Considering that the dataset constructor is a native speaker of Japanese, we removed the tweets in languages other than Japanese. To do so, the language code was acquired when tweets were collected, and a tweet was judged to be Japanese if its language code was “ja”.
2. A tweet beginning with “RT” is a “retweet”, which reposts another user’s tweet without modification in order to share it with one’s own followers. Since a retweet duplicates the original tweet, retweets were removed using regular expressions.
3. Hashtags were removed from the collected tweets. Here, the span from “#” up to the character before the next blank or new line was matched by a regular expression. We also deleted blank tweets by regular expression (because some tweets contained hashtags only): when there were only spaces or new lines between the beginning and the end of a tweet, that tweet was removed.


4. Because there were many spam tweets unrelated to matches among tweets including URLs, tweets including “http://t.co/” or “https://t.co/” were detected and removed by regular expression.

After the procedure above, we developed a web system in order to have the tweets in the dataset labeled as spoiler or non-spoiler. Five college students helped us label the tweets. The students were aged 19 to 22, were interested in watching football matches, and regularly used Twitter. Figure 1 shows the web evaluation system (the tweets shown in Fig. 1 were translated by us). If a labeler feels that a tweet is a spoiler, he/she clicks it. Labelers may also perceive spoilers from signals other than independent tweets, for example when tweets such as “Kagawa, get it!” and “Nice!!!” are posted at the same time, when “G”, “O”, “A” and “L” are posted at the same time, or when the number of tweets on a match suddenly increases. However, to keep the web system and the classification criteria from becoming complicated, tweets were labeled individually. In addition, since it takes a huge amount of time to classify all tweets, the number of tweets presented was limited to 1,000 per match.

If one match is presented at a time, the context and content of the match may be clearly conveyed by a single tweet. For example, suppose that a match in which Kagawa (a football player on the Japan national team) scored is being labeled. Labelers who already know the match’s result may judge even tweets that do not clearly include spoilers, such as “KAGAWA”, as spoilers, because they assume that the details of the match can be understood from them. This is undesirable because viewers who look forward to watching a match do not actually know its content (although they may know preliminary information about it). Therefore, we decided to present tweets from three matches at a time in random order. In other words, the 9,000 tweets were divided into 3 groups of 3,000 tweets.

In addition, viewers can roughly know the elapsed time from the start of a match without watching it. If a tweet such as “Defense is meaningless, let’s go attack” is posted at the start of the match, many labelers may think that it simply expresses enthusiasm for the match. However, many labelers may take the same comment as showing that the team was losing if it was posted in the second half of the match. We thought that the elapsed time is therefore necessary for judging spoilers. Accordingly, tweets were presented randomly rather than in chronological order, and the approximate elapsed time from the start of the match at which the tweet was posted was displayed with each tweet. For example, if “60” was displayed with a tweet, the tweet was posted from 51 min to 70 min after the start of the match. The time was made approximate because a viewer cannot know the exact time at which the match was actually recorded. In the web system, tweets were displayed 50 per page. There were 60 pages per group and five labelers per group.
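The four cleaning steps described in this section (language filter, retweet removal, hashtag stripping with blank-tweet removal, and URL-based spam removal) can be sketched as follows. This is a minimal illustration, assuming each collected tweet is a dictionary with hypothetical "text" and "lang" keys; the regular expressions are one reading of the description, not the authors' actual patterns.

```python
import re

RETWEET_RE = re.compile(r"^RT\b")          # step 2: tweets beginning with "RT"
HASHTAG_RE = re.compile(r"#\S+")           # step 3: "#" up to the next blank or new line
BLANK_RE = re.compile(r"^\s*$")            # step 3: only spaces or new lines remain
URL_RE = re.compile(r"https?://t\.co/")    # step 4: shortened links, treated as spam


def clean_tweets(tweets):
    """Apply the four cleaning steps and return the remaining tweet texts."""
    cleaned = []
    for tweet in tweets:
        if tweet.get("lang") != "ja":      # step 1: keep Japanese tweets only
            continue
        text = tweet["text"]
        if RETWEET_RE.match(text):
            continue
        text = HASHTAG_RE.sub("", text).strip()
        if BLANK_RE.match(text):
            continue
        if URL_RE.search(text):
            continue
        cleaned.append(text)
    return cleaned
```

In this sketch a tweet that consisted only of hashtags becomes empty after step 3 and is dropped, which mirrors the reason given above for the blank-tweet check.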

Fig. 1. Screenshot of evaluation system: (a) Entire, (b) Scale-up

4 Analysis of Spoiler Dataset

In this section, we analyze the contents of the dataset explained in the previous section.

Table 2. Examples of dataset

Tweet                               Elapsed time  Label
“Ooooh! Kagawa scores a goal!!!”    20            Spoiler
“Already allowed two goals (´Д`)”   0             Spoiler
“Now kick off”                      0             Non-spoiler
“Hmm. A missed pass is no good”     20            Non-spoiler

Table 3. Concordance rate of judging spoilers

Number of matching people  Number of tweets  Percentage of tweets  Percentage among tweets judged a spoiler by at least one labeler
5                          351               3.90                  14.03
4                          680               7.56                  27.18
3                          620               6.89                  24.78
2                          217               2.41                  8.67
1                          634               7.04                  25.34
0                          6498              72.20                 –
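The figures in Table 3 can be reproduced with a short sketch. It assumes the labeling rule that a tweet counts as a spoiler when at least three of the five labelers marked it, which is the reading consistent with the 1,651 spoiler tweets reported in this section; the function names are hypothetical.

```python
# Number of tweets by how many of the five labelers marked them as spoilers (Table 3).
AGREEMENT_COUNTS = {5: 351, 4: 680, 3: 620, 2: 217, 1: 634, 0: 6498}


def is_spoiler(votes, min_votes=3):
    """Majority vote: True when at least min_votes labelers clicked the tweet."""
    return sum(votes) >= min_votes


def summarize(counts, min_votes=3):
    """Return totals and the two percentage columns of Table 3."""
    total = sum(counts.values())
    flagged = sum(n for k, n in counts.items() if k >= 1)
    spoilers = sum(n for k, n in counts.items() if k >= min_votes)
    share_all = {k: round(100 * n / total, 2) for k, n in counts.items()}
    share_flagged = {
        k: round(100 * n / flagged, 2) for k, n in counts.items() if k >= 1
    }
    return total, spoilers, share_all, share_flagged


total, spoilers, share_all, share_flagged = summarize(AGREEMENT_COUNTS)
```

Here the last column of Table 3 is computed over the 2,502 tweets that at least one labeler marked, which is why its entries sum to 100 over the rows for one to five matching people.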

Tables 2 and 3 show some examples from the dataset and the concordance rates of the spoiler judgments, respectively. The tweets in Table 2 were translated by us. The tweets that all five labelers judged to be spoilers mostly talked about the final result of the match. This is because the final result of a match is thought to be a spoiler for everyone. Many of the tweets that three or four labelers judged to be spoilers were about how the match went. This indicates that a certain number of viewers considered only the final result of a match as important and did not regard other important moments as spoilers. Also, most of the tweets that one or two people judged to be spoilers indirectly expressed how the match went or were about scenes other than decisive moments. This is probably because the degree of perusing tweets, familiarity with the target sport, and sensitivity to spoilers vary from viewer to viewer. There were various tweets that no one judged as spoilers; they were related to moments that had little to do with the content of a match, simple cheering messages, or moments of low importance. On the basis of these results, we regarded tweets judged as spoilers by more than half of the labelers (i.e., three or more of the five) as spoiler tweets. In other words, there were 1,651 spoiler tweets among the 9,000 tweets.

Many spoiler tweets contained specific patterns of description and words for each status of the matches. Therefore, ten terms used frequently in spoiler tweets, compared with non-spoiler tweets (TF-IDF [9]), are shown in Table 4. The terms in Table 4 are translated. Tweets were divided into the time zones in which Japan was winning, losing, and tied. MeCab was used for word segmentation. In addition, consecutive single-character nouns were treated as one word. Single-character words that had a basic form other than a noun and were not defined in the dictionary, particles, auxiliary verbs, and meaningless words were eliminated. Also, since repetitive expressions were noise, the dataset was formatted with reference to Brody et al.’s method [10]. Furthermore, since proper nouns vary greatly from match to match, numbers were mechanically generalized as [num], player names as [player], team names as [team], and coach names as [coach] by pattern matching.

Looking at each time zone in Table 4, the terms used frequently in the losing time zone were different from those in the winning and tying time zones except for the terms “[player]”, “[team]”, and “Break the deadlock”. Also, the terms in the winning time zone were different from those in the tying time zone except for “[player]”, “[team]”, “Goal”, and “Match”. Some frequently used terms directly expressed the status of the match, such as “Win” in the winning time zone and “Tie” in the tying time zone. Furthermore, terms about scoring goals, such as “[num]th points”, “[num] points”, and “Point”, were used frequently in the winning time zone; terms about the score, such as “[num]-[num]”, in the tying time zone; and terms about conceding goals, such as “Allowing goals”, in the losing time zone. This suggests that the content of spoilers differed depending on the time zone.

Table 4. Terms used frequently in spoiler tweets

Winning time zone        Losing time zone                 Tying time zone
Terms           TF-IDF   Terms                   TF-IDF   Terms        TF-IDF
[Player]        0.742    [Player]                0.531    [Player]     0.627
[Team]          0.422    [Team]                  0.475    [Team]       0.552
Goal            0.261    Break the deadlock      0.238    Goal         0.226
[Num]th points  0.173    Allowing goals          0.238    Tie          0.201
[Num] points    0.131    Parry                   0.224    Match        0.151
[Num]           0.117    Second                  0.112    [Num]-[Num]  0.136
Win             0.106    Score                   0.112    End          0.125
Match           0.099    Too                     0.112    National     0.110
Point           0.095    be -ed (passive voice)  0.112    First        0.105
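The term ranking behind Table 4 can be sketched as follows. The exact TF-IDF variant is not specified in the text, so this sketch makes its own assumptions: the spoiler tweets and the non-spoiler tweets of one time zone are each pooled into a single document, a smoothed inverse document frequency is used, and the weights are L2-normalized. The function name and the toy tokens in the usage note are hypothetical.

```python
import math
from collections import Counter


def tfidf_top_terms(spoiler_tokens, non_spoiler_tokens, k=10):
    """Rank terms of the pooled spoiler document by smoothed, L2-normalized TF-IDF."""
    docs = [Counter(spoiler_tokens), Counter(non_spoiler_tokens)]
    n_docs = len(docs)
    weights = {}
    for term, tf in docs[0].items():
        df = sum(1 for doc in docs if term in doc)
        # Smoothed IDF (an assumption) so that shared terms keep a non-zero weight.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weights[term] = tf * idf
    norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return [(term, weight / norm) for term, weight in ranked[:k]]
```

For example, with the toy spoiler tokens ["[player]", "goal", "goal", "win"] and non-spoiler tokens ["[player]", "cheer"], the term "goal" ranks first because it is both frequent in the spoiler document and absent from the non-spoiler one, while the generalized token "[player]" is down-weighted because it occurs in both.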

5 Experiment: Spoiler Detection 5.1

Experiment Procedure

In this section, we examined methods for detecting spoilers with high accuracy on the basis of the dataset. According to the previous section, since sports spoilers have prominent characteristics in terms of words, we compared three word-based methods: pattern matching, SVM (frequently terms were used as features), and SVM with Status of Match (frequently terms were used as features). In addition, we selected a SVM model based on the results of research by Jeon et al. [7]. • Pattern Matching: Terms used frequently in spoilers such as keywords and tweets containing terms matching the keywords were judged as spoilers. Terms were divided into rules by following the previous section (consecutive tweets with one character for a single word are combined, etc.), and terms with a TF-IDF value of

Detection of Football Spoilers on Twitter


0.100 or higher were taken as keywords. The threshold was set to 0.100 because it gave the highest F-measure when the threshold was varied from 0.000 to 0.300 in steps of 0.050.

• SVM: We generated an SVM model using the tweets of matches other than the match to be detected, and used the model to classify the tweets (1000 test data) of that match. When preparing the model, we balanced the data by under-sampling because non-spoiler tweets outnumbered spoiler tweets (Table 5 shows the number of training data for the SVM method). Vectors for the SVM were generated from a BoW (Bag-of-Words) [11] representation of each tweet. Words were divided by the rules from the previous section, and a linear kernel with a learning rate of 0.01 was selected by grid search as the parameters for model generation. To put every dimension (word) on the same scale, normalization was performed.

• SVM with Status of Match: As the previous section showed, frequently used terms differ by time zone. To take the status of the match into account, we generated a winning model for the winning time zone, a losing model for the losing time zone, and a tying model for the tying time zone (Table 6 shows the number of training data for the method of SVM with Status of Match). We then classified test tweets with the winning model if Japan was winning at the time of the tweet, with the losing model if Japan was losing at the time of the tweet, and with the tying model if Japan was tied at the time of the tweet (Table 7 shows the number of test data for the method of SVM with Status of Match). Word segmentation, under-sampling, the SVM parameter (learning rate), and the kernel were the same as in the SVM method. In addition, since this method detects spoilers for each time zone, there were matches with extremely little or no test data.
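The two data-handling steps that distinguish the SVM-based methods, under-sampling and routing each tweet to a model by match status, can be sketched in plain Python. This is a minimal sketch under our own assumptions: the function names, the `(time, scorer)` goal representation, and the fixed random seed are hypothetical, and the SVM training itself is omitted.

```python
import random


def undersample(spoilers, non_spoilers, seed=0):
    """Randomly drop tweets from the larger class until both classes are equal in size."""
    rng = random.Random(seed)
    n = min(len(spoilers), len(non_spoilers))
    return rng.sample(spoilers, n), rng.sample(non_spoilers, n)


def match_status(tweet_time, goals):
    """Return 'winning', 'losing', or 'tying' from Japan's point of view.

    goals is a list of (time, scorer) pairs with scorer in {'japan', 'opponent'};
    only goals scored at or before tweet_time are counted.
    """
    japan = sum(1 for t, who in goals if t <= tweet_time and who == "japan")
    other = sum(1 for t, who in goals if t <= tweet_time and who == "opponent")
    if japan > other:
        return "winning"
    if japan < other:
        return "losing"
    return "tying"


def route(tweets, goals):
    """Group (time, text) tweets by the status model that would classify them."""
    buckets = {"winning": [], "losing": [], "tying": []}
    for time, text in tweets:
        buckets[match_status(time, goals)].append(text)
    return buckets


goals = [(10, "japan"), (30, "opponent"), (40, "opponent")]
print(route([(5, "kick-off!"), (15, "what a goal"), (45, "oh no")], goals))
```

Because the status depends on the score at the time of the tweet, a deployed system would need the score slightly before it displays the tweet, which is the delay discussed in the text.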
When the number of spoiler tweets in the test data was 20 or less, exceptional tweets would greatly influence the result; therefore, these cases were excluded from the result (including the case of 0, for which measures such as precision cannot be calculated). This method presupposes some delay in displaying tweets during the match, because the status of the match at the time of each tweet must be detected. In addition, if it is hard to decide which team (or player) from a domestic league a viewer is cheering for, there is a time zone in which the winning model and the losing model must be used at the same time.

The three methods above were compared in terms of precision, recall, and F-measure. For all three, the experiment was conducted for nine matches (the number of matches in the dataset), and the average over the nine matches was taken as the result. Here, for each method, precision means “the ratio of tweets that were detected correctly to detected tweets”, recall means “the ratio of tweets that were detected correctly to spoiler tweets”, and F-measure is given by Eq. (1).

\[ \text{F-measure} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{1} \]
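The three measures and the per-match averaging can be written down directly. The sketch below is illustrative only: the function names are ours, and returning 0.0 when a ratio is undefined is a convention of this sketch, whereas the experiment excludes such cases from the result.

```python
def precision_recall_f(predicted, actual):
    """predicted, actual: equal-length lists of booleans (True = spoiler)."""
    true_positives = sum(1 for p, a in zip(predicted, actual) if p and a)
    detected = sum(predicted)   # tweets the method flagged as spoilers
    spoilers = sum(actual)      # tweets labeled as spoilers
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / spoilers if spoilers else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def average_over_matches(per_match):
    """Average (precision, recall, F-measure) triples, one triple per match."""
    n = len(per_match)
    return tuple(sum(m[i] for m in per_match) / n for i in range(3))


print(precision_recall_f([True, True, False, False], [True, False, True, False]))
```

Note that averaging the F-measure per match, as `average_over_matches` does, generally gives a different value from applying Eq. (1) to the averaged precision and recall.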


Y. Shiratori et al.

Table 5. The number of training data for the SVM-method

Match                                                     Training data
2015 Women’s World Cup “Japan vs. England”                3248
2015 Women’s World Cup “Japan vs. United States”          3240
2015 EAFF East Asian Cup “Japan vs. South Korea”          3704
2015 Women’s EAFF East Asian Cup “Japan vs. China”        3544
2015 EAFF East Asian Cup “Japan vs. China”                3710
World Cup Qualifiers “Japan vs. Cambodia”                 3424
World Cup Qualifiers “Japan vs. Afghanistan”              3350
Friendlies “Japan vs. Iran”                               3624
World Cup Qualifiers “Japan vs. Singapore”                3596

Table 6. The number of training data for the method of SVM with Status of Match

Match                                                     Winning   Losing   Tying
2015 Women’s World Cup “Japan vs. England”                1704      674      870
2015 Women’s World Cup “Japan vs. United States”          2106      954      180
2015 EAFF East Asian Cup “Japan vs. South Korea”          2106      764      834
2015 Women’s EAFF East Asian Cup “Japan vs. China”        1794      880      870
2015 EAFF East Asian Cup “Japan vs. China”                2106      770      834
World Cup Qualifiers “Japan vs. Cambodia”                 1624      930      870
World Cup Qualifiers “Japan vs. Afghanistan”              1528      952      870
Friendlies “Japan vs. Iran”                               2106      756      762
World Cup Qualifiers “Japan vs. Singapore”                1774      952      870

Table 7. The number of test data for the method of SVM with Status of Match

Match                                                     Winning   Losing   Tying
2015 Women’s World Cup “Japan vs. England”                328       672      0
2015 Women’s World Cup “Japan vs. United States”          0         12       988
2015 EAFF East Asian Cup “Japan vs. South Korea”          0         897      103
2015 Women’s EAFF East Asian Cup “Japan vs. China”        345       655      0
2015 EAFF East Asian Cup “Japan vs. China”                0         838      162
World Cup Qualifiers “Japan vs. Cambodia”                 842       158      0
World Cup Qualifiers “Japan vs. Afghanistan”              932       68       0
Friendlies “Japan vs. Iran”                               0         797      203
World Cup Qualifiers “Japan vs. Singapore”                855       145      0

5.2 Experimental Results

Table 8 shows the precision, recall, and F-measure of each method, averaged over the matches. SVM with Status of Match had the highest F-measure and the highest precision. Pattern matching had the highest recall; between the two SVM-based methods, SVM had the higher recall.

Table 8. Accuracy in detecting spoilers for each method

Method                      Precision   Recall   F-measure
Pattern matching            0.270       0.668    0.372
SVM                         0.617       0.601    0.598
SVM with Status of Match    0.698       0.565    0.611
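As a quick reading aid for Table 8, the F-measure can be recomputed from the averaged precision and recall using Eq. (1). The recomputed values (roughly 0.38, 0.61, and 0.62) are slightly higher than the reported ones, which is consistent with the F-measure having been computed per match and then averaged; this interpretation is our assumption and is not stated explicitly in the text.

```python
def f_measure(precision, recall):
    """Eq. (1) applied to a single precision/recall pair."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from Table 8.
table8 = {
    "Pattern matching": (0.270, 0.668, 0.372),
    "SVM": (0.617, 0.601, 0.598),
    "SVM with Status of Match": (0.698, 0.565, 0.611),
}

for method, (p, r, reported) in table8.items():
    print(f"{method}: from averages = {f_measure(p, r):.3f}, reported = {reported:.3f}")
```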

6 Discussion

The F-measure of SVM with Status of Match was the highest; thus, this method was superior to the others. In particular, its precision was better than that of the others. The precision of SVM and SVM with Status of Match was superior to that of pattern matching because pattern matching detected a spoiler from the player name alone. For example, pattern matching judged the tweet “Kagawa’s missed pass is scary because of the heavy turf” to be a spoiler because of the word “Kagawa”, whereas SVM took into account not only the players’ names but also words such as “goal” that appeared along with the names. The precision of SVM with Status of Match seems to have been superior to that of SVM because tweets mistakenly learned by SVM were no longer learned for every time zone by SVM with Status of Match. In fact, tweets such as “It’s been a while since I felt refreshed last” and “I saw a sweeping victory for the first time in a very long time” in the winning time zone were detected correctly, so we consider that tweets such as “Attacking midfielder Kagawa maybe after a long time” and “I saw a national team match for the first time in a very long time!!” in the tying time zone at the start of the match were fitted as non-spoilers in SVM but were not learned by SVM with Status of Match.

In comparison, SVM was superior to SVM with Status of Match for recall. This is presumably because the training data for SVM with Status of Match were divided into three, so each model simply had less training data than SVM. Therefore, the recall, and with it the F-measure, of SVM with Status of Match may improve if the number of matches in the training data is increased. Regarding recall, pattern matching was the best. We consider that, with the TF-IDF threshold set to 0.100, the keywords covered many spoiler words across the matches. However, the precision was low as a result.
As a result, the F-measure was not that high for any of the methods. This may be because we labeled tweets as spoiler or non-spoiler directly. Unimportant tweets such as “Nagatomo got a cramp!” were judged to be spoilers because we did not set up a standard for labeling. We need to focus on crucial spoilers first and set up a standard for labeling. This may also be because tweets such as “I want to see Honda


score a goal” and “We will win” were judged as spoilers. Tweets of hope and enthusiasm need to be judged as non-spoilers, but it is difficult to judge from the grammar because the Japanese language does not have a future tense; therefore, it is necessary to use a method different from morphological analysis. Examining methods for detecting tweets about the future remains future work.

In addition, the small number of training data for the two methods using SVM is also considered to be a reason the F-measure is not that high. In particular, as shown in Table 6, the number of training data for SVM with Status of Match in the losing model was small. The accuracy of detecting spoilers with this method for each model is shown in Table 9. The F-measure for the winning model was 0.664, and that for the losing model was 0.447. This suggests that the number of training data may have had an influence. Moreover, we plan to examine further separation of the SVM with Status of Match models because the accuracy may be improved by separating the timing of the goal from the model.

Figure 2 shows the precision-recall curve for SVM with Status of Match in the winning model. It appears that precision was about 0.3 when recall was kept high. We need to consider other methods that first achieve a high-recall model and then realize higher precision, because viewers may be shocked even if the spoiler detection system fails to block just one spoiler tweet.

Table 9. Accuracy of detecting spoilers with SVM with Status of Match for each model

Model            Precision   Recall   F-measure
Winning model    0.716       0.646    0.664
Tying model      0.656       0.528    0.585
Losing model     0.773       0.315    0.447

Fig. 2. Precision-Recall curve


7 Conclusion

We investigated the characteristics of spoilers by generating a spoiler dataset for football matches. Analysis of the dataset revealed that the content of spoilers varies depending on the status of a match. Furthermore, we compared the accuracy of spoiler detection by pattern matching, SVM, and SVM with Status of Match, and showed that SVM with Status of Match was superior to the other methods in terms of F-measure. The method can be applied to other languages because the feature values are frequencies of used terms and the statuses of matches are language-neutral. In the future, we will improve the accuracy of spoiler detection by increasing the amount of training data and devising better data preprocessing, with the aim of constructing an automatic spoiler detection system, such as a Twitter client, that realizes smoother collaborative communication. Furthermore, we plan to conduct experiments for other sports genres.

Acknowledgments. This work was supported in part by JST ACCEL Grant Number JPMJAC1602, Japan.

References

1. Nakamura, S., Tanaka, K.: Temporal filtering system for reducing the risk of spoiling a user’s enjoyment. In: Proceedings of the 12th International Conference on Intelligent User Interfaces, pp. 345–348. ACM, Honolulu (2007)
2. Leavitt, J.D., Christenfeld, N.J.S.: Story spoilers don’t spoil stories. Psychol. Sci. 22, 1152–1154 (2011)
3. Rosenbaum, J.E., Johnson, B.K.: Who’s afraid of spoilers? Need for cognition, need for affect, and narrative selection and enjoyment. Psychol. Pop. Med. Cult. 5, 273–289 (2016)
4. Ikeda, K., Hijikata, Y., Nishida, S.: Proposal of deleting plots from the reviews to the items with stories. In: Proceedings of SNSMW 2010, vol. 6193, pp. 346–352. CDROM (2010)
5. Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of ACL 2004, pp. 271–278 (2004)
6. Boyd-Graber, J., Glasgow, K., Zajac, J.S.: Spoiler alert: machine learning approaches to detect social media posts with revelatory information. In: Proceedings of the 76th Annual Meeting of the American Society for Information Science and Technology, vol. 50, pp. 1–9. ASIST (2013)
7. Jeon, S., Kim, S., Yu, H.: Spoiler detection in TV program tweets. Inf. Sci. 329, 220–235 (2016)
8. Top 10 Most Watched Sports In The World. http://top-10-list.org/2010/10/04/10-mostwatched-world-sports/. Accessed 27 Jan 2017
9. Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval: The Concepts and Technology Behind Search, 2nd edn. Addison-Wesley Professional, Boston (2011)
10. Brody, S., Diakopoulos, N.: Cooooooooooooooollllllllllllll!!!!!!!!!!!!!!: using word lengthening to detect sentiment in microblogs. In: Proceedings of Conference on Empirical Methods in Natural Language Processing, pp. 562–570. Association for Computational Linguistics, Stroudsburg (2011)
11. Manning, C.D., Schütze, H.: Foundations of Statistical Natural Language Processing. The MIT Press, Cambridge (1999)

Social Studies

Analysis of Facilitators’ Behaviors in Multi-party Conversations for Constructing a Digital Facilitator System

Tsukasa Shiota, Takashi Yamamura, and Kazutaka Shimada(B)

Department of Artificial Intelligence, Kyushu Institute of Technology, 680-4 Kawazu Iizuka, Fukuoka 820-8502, Japan
{t shiota,t yamamura,shimada}@pluto.ai.kyutech.ac.jp

Abstract. In this paper, we analyze characteristics of facilitators in multi-party conversations. The goal of our study is to construct a digital facilitator system that supports consensus-building and management of conversation for high-quality discussions. To realize a good digital facilitator, we need knowledge of facilitators’ behaviors and patterns. As a first step toward this goal, we focus on a macro viewpoint of facilitators’ behaviors in conversation corpora. First, we generate a decision-tree model that classifies each participant as a facilitator or a non-facilitator from conversation corpora. The classification accuracies of the decision trees were 0.642 and 0.737 for the two corpora, respectively. The main purpose of generating the decision trees is to extract patterns from imaginable characteristics, namely the features for the classifier. Next, we therefore discuss behaviors of facilitators by analyzing the decision trees manually. In the analysis, we focus on two types of corpora: in one, each participant has a role, such as a project manager; in the other, participants have no roles in the conversation. We investigate the influence of this difference in setting through the analysis. From the manual analysis, we obtained some behaviors of facilitators that are common to the two corpora and some that differ.

Keywords: Facilitator’s behaviors · Multi-party conversation · Digital facilitator · Macro viewpoint

1 Introduction

In collaborative work, people need to discuss several topics for decision-making in a meeting, namely a multi-party conversation. Supporting consensus-building in multi-party conversations is a very important task for intelligent systems. Participants in a discussion often struggle to identify the most suitable solution for a decision on a meeting agenda because there are generally many alternatives and criteria related to making the decision. In addition, participants often fail to make a satisfying decision, which leads to the failure of the discussion. Therefore, a supporting system for consensus-building plays a very important role in discussion. To conduct smooth, active, and productive discussions, we need a facilitator who controls the discussion appropriately. However, it is impractical to assign a good facilitator to each group in the discussion environment because of a lack of human resources.

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 145–158, 2018. https://doi.org/10.1007/978-3-319-98743-9_12

The goal of our study is to construct a system that supports consensus-building and management of conversation for high-quality discussions, as a digital facilitator, namely a collaborative agent for participants of discussions. Figure 1 shows the overview of our system. We are developing a prototype system for supporting discussions [8]. The system estimates the state of a discussion, and then generates sentences and charts that describe the current state of the discussion. This is a part of our digital facilitator system. However, the timing of the generation depends on participants’ clicks on the system, namely a passive control of the system. Therefore, we need facilitators’ knowledge, behaviors, and patterns to realize a good digital facilitator, namely an active control from the system.

[Figure: input/output devices (microphone, multi-camera images) connected to the digital facilitator system. Roles of our digital facilitator: 1. Management of the conversation; 2. Feedback for consensus-building.]

Fig. 1. Overview of our digital facilitator system.

In this paper, we analyze characteristics of facilitators in multi-party conversations from a macro viewpoint. We focus on features extracted from utterances in conversations: repetition, topic information, dialogue act information, and so on. We generate a model that classifies each participant as a facilitator or a non-facilitator from conversation corpora. We use a decision tree as the model so that the result can be analyzed manually. We discuss characteristics of facilitators and non-facilitators on the basis of the features in the decision tree. We also focus on the difference between conversation corpora and analyze two types of corpora. In one corpus, each participant has a role, such as a project manager or an industrial designer. In the other corpus, participants have no roles in the conversation. We investigate the influence of this difference in setting on the behaviors of facilitators.


The contributions of this paper are as follows:

– We generate a model from imaginable characteristics of facilitators for classifying whether each participant is a facilitator or not. There has not been much literature on the research topic of finding a facilitator in a conversation.
– We manually analyze the generated model from a macro viewpoint. It is not always clear which characteristics of facilitators are effective for good facilitation and decision-making. We clarify good facilitators’ behaviors by using the model.
– We discuss the influence of the discussion setting by analyzing two different corpora.

We believe that the acquired knowledge is broadly applicable to collaboration systems and agents.

2 Related Work

DeSanctis and Gallupe [4] have discussed a foundation for group discussion support systems that combine communication, computer, and decision technologies to support problem formulation and solution in group meetings. Klein [9] has proposed a large-scale collective intelligence system with argumentation maps. The purpose of these studies is to support the discussion of participants. In contrast, our purpose is to construct a digital facilitator system that controls the discussion environment. To this end, we focus in this paper on the analysis of characteristics of facilitators in conversations. Matsuyama et al. [11] have proposed a procedural facilitation process framework to harmonize a four-participant conversational situation. Their purpose is to develop facilitation robots that control engagement density, such as harmonized and un-harmonized states. In contrast, our purpose is to develop a digital facilitator that directly manages a discussion for a decision-making process. Therefore, as a first step, we analyze behaviors of real facilitators and non-facilitators in conversations from a macro viewpoint. Ito et al. [6] have developed an open on-line workshop system called COLLAGREE that has facilitator support functions for an internet-based town meeting. They incorporated an incentive mechanism for large-scale collective discussions into the system [7]. The system is effective; however, its purpose is to gather opinions from many participants on the web and extract discussion points from them. In this paper, we handle a small multi-party conversation task to construct a digital facilitator for decision-making. Hence, we focus on two multi-party conversation corpora and analyze behaviors of facilitators from them. We handle features such as utterances, topic tags, and dialogue act tags for the analysis. There are many studies that focus on the roles of participants in conversations.
Sapru and Bourlard [14] have proposed a social role recognition model using conditional random fields. Li et al. [10] have proposed a method to estimate a key speaker in a meeting. Hung et al. [5] have also proposed a method to estimate a dominant person in a conversation. Okada et al. [12] have analyzed the individual communication skills of participants in a group. Zhang et al. [17] have discussed the functional roles of the participants in group discussion and reported the results of an analysis of the relationship between communication skill impression and functional roles. However, these studies did not discuss facilitators’ speech and behavior directly. Omoto et al. [13] have reported an analysis of the facilitating behaviors of a good facilitator from measured non-verbal and para-linguistic data. They defined four actions for facilitation (convergence, divergence, conflict, and concretization) and then analyzed the conversations on the basis of these factors. In contrast, we focus on linguistic features such as dialogue act tags. In addition, we discuss the influence of the difference in setting, that is, whether the facilitators are stipulated in the discussion environment or not.

3 Target Data

In this paper, we compare two conversation corpora.

3.1 AMI Corpus

The AMI corpus [3] is a well-known meeting corpus. It consists of scenario and non-scenario meetings. In this paper, we handle the scenario meetings (135 conversations). In the scenario task, participants pretended to be members of a virtual company that designs remote controls. Each participant played one role: project manager, industrial designer, user-interface designer, or marketing expert. Here the project manager is the meeting leader. In this paper, we regard the project manager as the facilitator in each conversation. The AMI corpus contains numerous annotations, such as topic tags and dialogue acts. There are 24 topic tags, organized with a depth of up to three levels, with labels such as “opening” and “evaluation of prototype(s).” The dialogue acts denote speaker intentions such as “inform” and “backchannel”; there are 15 of them. In this paper, we utilize these topic and dialogue act tags for the classification model.

3.2 Kyutech Corpus

The Kyutech corpus is a freely available Japanese conversational corpus by [16]. Each conversation is a decision-making task with four participants. The participants pretended to be managers of a virtual shopping mall in a virtual city, and then chose a new restaurant from three candidates as a replacement for a closed restaurant. The corpus consists of nine conversations. In the Kyutech corpus, no roles were assigned to participants. However, the participants answered a questionnaire about their satisfaction with the decision after the discussion. The questionnaire contained the question “Who did control the discussion?” In this paper, we regard the participant named by the majority of the answers as the facilitator in each conversation. The Kyutech corpus also contains topic tags and dialogue act tags. Yamamura et al. [16] created 28 topic tags for the corpus, such as “Menu” for the food menus of the candidates, and annotated the topics of each utterance. They also annotated dialogue act tags based on ISO24617-2 [2] for the corpus [15]. There are 22 dialogue act tags, such as “Suggest” and “SelfCorrection.” In this paper, we also utilize these topic and dialogue act tags for the classification model.
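The majority-vote labeling of facilitators can be sketched as follows. This is a minimal illustration, assuming each questionnaire answer names one participant; the function name and the participant labels are hypothetical. Ties are kept, following the treatment of the conversation with two equally voted participants in the experiment.

```python
from collections import Counter


def facilitators_by_vote(votes):
    """Return every participant who received the maximum number of votes.

    votes: list of participant labels, one per questionnaire answer.
    A tie yields more than one facilitator.
    """
    counts = Counter(votes)
    top = max(counts.values())
    return sorted(p for p, c in counts.items() if c == top)


print(facilitators_by_vote(["A", "A", "B", "C"]))
print(facilitators_by_vote(["A", "A", "B", "B"]))
```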

4 Classification

In this section, we describe a classification model based on machine learning and the features for facilitator detection, namely deciding whether a participant is a facilitator or not. Then, we evaluate the model on the AMI corpus and the Kyutech corpus. Although many strong classification models, such as SVMs and deep neural networks, have been proposed, we use a decision tree model, CART [1], for the classification because the main purpose of this study is to analyze the result manually and clarify the patterns of a good facilitator. A tree model is usually suitable for manual analysis. The analysis appears in the next section.

4.1 Features

In this subsection, we explain the features for the classification model. The features are divided into seven categories.

– Ratios of repetition of self-utterance and utterance by other participants (self repetition, other repetition)
Participants often repeat previous utterances for a variety of purposes. One reason is to emphasize their own opinions: by repeating his/her own utterance, a speaker expresses a firm intention. Besides, repeating an utterance of someone else plays a role in eliciting opinions from participants. Repeating an utterance is also effective for controlling the current topic. Therefore, the ratio of repetition is an important feature for the detection of facilitators. For the repetition detection, we vectorize each utterance by its nouns, verbs, adjectives, and adverbs. Then, we compute the cosine similarity between a target utterance and each of the next 10 utterances. If the similarity exceeds a threshold, we regard the utterances as repetition utterances. We set the threshold to 0.6 in this paper. Figure 2 shows an example of the process. The calculation is as follows:

\[ SR(p_i) = \frac{\mathit{SelRepNum}(p_i)}{\sum_{p_i \in S} \mathit{SelRepNum}(p_i)} \tag{1} \]

\[ OR(p_i) = \frac{\mathit{OtherRepNum}(p_i)}{\sum_{p_i \in S} \mathit{OtherRepNum}(p_i)} \tag{2} \]


Fig. 2. An example of calculation of the self and other repetition.

where $p_i$ is a speaker and $S$ is a set of speakers (A, B, C, D). $\mathit{SelRepNum}$ and $\mathit{OtherRepNum}$ denote the number of utterances repeated by the speaker him/herself and the number of utterances repeated by other participants, respectively. We use $SR$ and $OR$ as features.

– Coverage of the topics in the conversation (cover topic)
Facilitators tend to mention many topics in a conversation to manage the discussion. In the two corpora, topic tags were annotated to each utterance. Therefore, we introduce a coverage factor of topics as a feature.

\[ CT(p_i) = \frac{\mathit{TopicNum}(p_i)}{\mathit{TopicNum}_{all}} \tag{3} \]

where $\mathit{TopicNum}(p_i)$ is the number of topics that a speaker utters in a conversation. $\mathit{TopicNum}_{all}$ denotes the total number of topics that appear in a conversation.

– Ratio of the “Meeting” tag (meeting ratio)
A facilitator is responsible for controlling the discussion. As a result, he/she tends to utter topics about the proceedings and decision. The Kyutech corpus has the topic tag “Meeting” that relates to this concept. Examples are utterances such as “I’d like to confirm what the most important thing about these criteria is.” and “Let’s move on the next candidate.” We compute the ratio of the “Meeting” tag¹ for each participant as follows:

\[ MR(p_i) = \frac{\mathit{MeetingUttrNum}(p_i)}{\sum_{p_i \in S} \mathit{MeetingUttrNum}(p_i)} \tag{4} \]

where $\mathit{MeetingUttrNum}$ denotes the number of utterances with the “Meeting” tag.

– Ratio of specific dialogue acts (DA ratio²)
Dialogue acts denote speaker intentions. One role of facilitators is collecting participants’ opinions. As a result, facilitators tend to utter words that relate to acts about information exchange, such as “elicit-assess” in the AMI corpus and “question” and “suggest” in the Kyutech corpus. On the other hand, non-facilitators answer the questions from the facilitator, with acts such as “inform” in the AMI corpus and “answer” in the Kyutech corpus. Therefore, we compute ratios of some dialogue acts.

\[ DA(p_i, da_j) = \frac{\mathit{DAUttrNum}(p_i, da_j)}{\sum_{p_i \in S} \mathit{DAUttrNum}(p_i, da_j)} \tag{5} \]

¹ This tag does not exist in the AMI corpus. Therefore, we use this feature only for the Kyutech corpus.
² Note that “DA” stands for an actual dialogue act tag. For example, the feature is “IN ratio” when the DA tag is “inform (IN).”

where $\mathit{DAUttrNum}(p_i, da_j)$ denotes the number of utterances with a dialogue act $da_j$ for a speaker $p_i$. In this paper, we select dialogue acts manually for this process. We use seven DA tags for the AMI corpus: backchannel (BC), inform (IN), elicit-inform (EI), assess (AS), elicit-assess (EA), elicit-offer-or-suggestion (EO), and suggest (SU). We also use five DA tags for the Kyutech corpus: question (QU), answer (AN), inform (IN), suggest (SU), and positiveFeedback (PF).

– Ratio of utterances as the whole and quarters (utter all, utter q1, utter q2, utter q3, utter q4)
We assume that facilitators tend to speak more than non-facilitators. In addition, facilitators tend to speak at the beginning and the end of the discussion to control the discussion. Therefore, we compute ratios of utterances for each participant.

\[ UTTR(p_i, seg_j) = \frac{\mathit{UttrNum}(p_i, seg_j)}{\sum_{p_i \in S} \mathit{UttrNum}(p_i, seg_j)} \tag{6} \]

where $seg_j \in$ (the whole, first quarter, second quarter, third quarter, fourth quarter). $\mathit{UttrNum}(p_i, seg_j)$ denotes the number of utterances that a speaker $p_i$ utters in a segment $seg_j$.

– Average number of characters and average time of each participant (ave char, ave time)
We assume that facilitators tend to speak longer than non-facilitators. As a result, the number of words in a facilitator’s utterances in a conversation becomes larger than that of non-facilitators. Therefore, we introduce two types of features: the average number of characters in utterances and the average time of utterances.

– Ratio of utterances after a silence (break silence)
Discussion sometimes becomes deadlocked, and the deadlock generates an awkward silence. One role of facilitators is to activate the discussion after the silence, that is, to break the deadlock. In this paper, we regard a non-utterance section of 10 s or more as “silence.” We count the number of utterances after the silence, $\mathit{BkSilUttrNum}$, and then compute the ratio for each $p_i$.

\[ BS(p_i) = \frac{\mathit{BkSilUttrNum}(p_i)}{\sum_{p_i \in S} \mathit{BkSilUttrNum}(p_i)} \tag{7} \]
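The repetition features and the normalization shared by Eqs. (1), (2), (4), (5), (6), and (7) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: utterances are assumed to be already reduced to their content words, an utterance is counted at most once per repetition type, and a repetition by another participant is credited to the speaker of the original utterance. All function names are hypothetical.

```python
import math
from collections import Counter

WINDOW = 10       # number of following utterances that are inspected
THRESHOLD = 0.6   # cosine similarity above which an utterance counts as a repetition


def cosine(a, b):
    """Cosine similarity between two bags of content words."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def repetition_counts(utterances):
    """utterances: list of (speaker, content_words) in temporal order.

    Returns (self_rep, other_rep): per-speaker counts of utterances that were
    repeated by the same speaker and by another speaker, respectively.
    """
    self_rep, other_rep = Counter(), Counter()
    for i, (speaker, words) in enumerate(utterances):
        following = utterances[i + 1:i + 1 + WINDOW]
        repeaters = {s for s, w in following if cosine(words, w) > THRESHOLD}
        if speaker in repeaters:
            self_rep[speaker] += 1
        if repeaters - {speaker}:
            other_rep[speaker] += 1
    return self_rep, other_rep


def ratio(counts, speakers):
    """Normalize per-speaker counts so that they sum to 1 over the conversation."""
    total = sum(counts[s] for s in speakers)
    return {s: (counts[s] / total if total else 0.0) for s in speakers}


utts = [
    ("A", ["budget", "important"]),
    ("B", ["budget", "important"]),
    ("A", ["budget", "important"]),
    ("C", ["lunch"]),
]
self_rep, other_rep = repetition_counts(utts)
print(ratio(self_rep, "ABCD"), ratio(other_rep, "ABCD"))
```

The same `ratio` helper applies to any of the count-based features: passing per-speaker counts of “Meeting” utterances, of a given dialogue act, or of silence-breaking utterances yields the corresponding normalized value.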

Table 1. Classification result for the AMI corpus.

                   Precision   Recall   F
Facilitator        0.654       0.630    0.642
Non-facilitator    0.878       0.889    0.883

Table 2. Classification result for the Kyutech corpus.

                   Precision   Recall   F
Facilitator        0.79        0.70     0.74
Non-facilitator    0.89        0.92     0.91

4.2 Experimental Results

We evaluated our model with two corpora. The task is to classify a participant in each conversation into the facilitator or not. Each conversation in both corpora consists of four participants. For the AMI corpus, we regard the project manager as the facilitator. In other words, the task is to detect the project manager from four participants. For the Kyutech corpus, the facilitator of each conversation was determined by a majority vote from a questionnaire of the participants. By the voting, eight facilitators were determined for eight conversations. For one conversation, two participants got the same number of votes. We regard the two participants as the facilitators in the conversation. Therefore, the task is to detect 10 facilitators3 from 36 persons (9 conversations × 4 participants). We evaluated our model with 10-fold cross-validation for 135 conversations for the AMI corpus. The number of target data was 540 persons (135 conversations × 4 participants). Table 1 shows the experimental result of the AMI corpus. We evaluated our model with conversation-level leave-one-out cross-validation for the Kyutech corpus, due to the size of corpus. In other words, we generated a model from eight conversations, and evaluated the model with one conversation. Table 2 shows the experimental result of the Kyutech corpus. The classification accuracies by the decision trees were 0.642 and 0.74 on the F-value, respectively. Although we evaluated the contribution of features in our model by ablation experiment, the best accuracy was generated by the model with all features, namely Tables 1 and 2.

3 These 10 facilitators were based on a subjective judgment by the participants in the discussions. Hence, we checked the judgment ourselves. Three test subjects examined the Kyutech corpus and voted for the facilitator in each conversation. For eight conversations, the results agreed with the questionnaire of the Kyutech corpus. The remaining conversation was the one with two facilitators according to the questionnaire; for this conversation, our judgment partially agreed with the questionnaire. Therefore, we used the original judgment from the Kyutech corpus as the ground truth.


Fig. 3. An example of a decision tree of the AMI corpus.

It is arguable whether the F-values, 0.642 and 0.74, are sufficient for a classification model. However, the main purpose of generating the decision trees is to provide a trigger for manual analysis that clarifies facilitators’ behavior patterns. Hence, the classification itself is not always our purpose. On the other hand, improving the accuracy leads to improving the manual analysis. Features on cross-talking and interruption are probably important characteristics of facilitators. Therefore, we need to consider these features in future work.
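As a quick check of the reported values, the sketch below recomputes the F-value from precision and recall, assuming that the F-value is the balanced harmonic mean (F1). Under that assumption, the facilitator rows of Tables 1 and 2 are reproduced after rounding. The function name `f_measure` is chosen here for illustration.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Facilitator rows of Tables 1 and 2.
ami_facilitator = f_measure(0.654, 0.630)
kyutech_facilitator = f_measure(0.79, 0.70)
print(round(ami_facilitator, 3), round(kyutech_facilitator, 2))
```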

5 Analysis

In this section, we discuss the decision trees generated in Sect. 4. The main purpose of this paper is to derive facilitators’ behaviors from a macro viewpoint using real discussion corpora. In the context of this paper, this means clarifying facilitators’ behaviors and patterns from the structures of the decision trees.

5.1 Analysis from Decision Trees

Figure 3 shows an example of a decision tree generated from the AMI corpus. Figure 4 shows an example of a decision tree generated from the Kyutech corpus. Each node contains (1) the feature and threshold, (2) the number of speakers in the node, and (3) the distribution of roles, namely non-facilitator (left in the square bracket) and facilitator (right in the square bracket). For example, Fig. 3 denotes

    if utter_all > 0.2858 then
      if EO_ratio > 0.3794 then
        if cover_topic > 0.6771 then “FACILITATOR” (63/67)

This is the interpretation of the bottom-right node. It says, “Facilitators tend to speak many utterances in the conversation and utterances related to acts that listen to suggestions from other participants. In addition, the utterances tend to encompass many topics in the conversation.” The interpretation of the bottom-right node in Fig. 4 is also

    if QU_ratio > 0.3095 then
      if ave_time > 1.957 then “FACILITATOR” (8/8)

This interpretation says, “Facilitators tend to ask other participants questions, and the speech tends to become long.” These behaviors are important and useful for implementing the digital facilitator in our system.

Fig. 4. An example of a decision tree of the Kyutech corpus.

5.2 Analysis from Features

We analyzed the decision trees more deeply in terms of features. Figures 5 and 6 show the distributions of features used in the decision trees from the AMI corpus and the Kyutech corpus, respectively. In the figures, “top level”, “2nd level”, and “3rd level” denote the depth in the decision trees. For example, the top level node in Fig. 4 is the “QU_ratio” feature.

Here we also focus on another point in this analysis. As mentioned above, the settings for facilitators of the AMI corpus and the Kyutech corpus were different. Each participant in the AMI corpus has a role, and we regard the project manager as the facilitator in the discussion. In other words, the facilitators in the AMI corpus were explicitly defined in advance, and each participant understood which person needed to control the discussion. On the other hand, the participants in the Kyutech corpus have no roles. In other words, the participants did not know in advance which person should control the discussion. Therefore, a person who controlled the discussion emerged almost spontaneously and dynamically. This depends on the communication skills of each participant and hierarchical relations such as superior-inferior. We also discuss the influence arising from this difference.

First, we discuss the common tendencies between the two corpora. For the Kyutech corpus, all the top level nodes were the “QU_ratio” feature (the leftmost in Fig. 6). QU is a tag about information request. For the AMI corpus, “EI_ratio (elicit-inform)” and “EO_ratio (elicit-offer-or-suggestion)” frequently appeared in the 2nd level. EI is used by a speaker to request that someone else give some information. EO denotes that the speaker expresses a desire for someone to make an offer or suggestion. From these results, we conclude that facilitators tend to listen to suggestions and opinions from other participants, regardless of the setting of the discussion, that is, whether or not the role of the facilitator is given in advance.

Fig. 5. Distribution of features in the decision tree for the AMI corpus.

Next, we discuss the tendencies of the AMI corpus. From Fig. 5, the features that frequently appeared in the decision trees were the ratio of utterances (utter_all, utter_q1, and utter_q4), the coverage of topics in the conversation (cover_topic), and the ratio of utterances after a silence (break_silence). In the situation where a participant has the role of the facilitator (Project Manager), the results suggest that the facilitators manage and control not only the whole discussion but also the beginning (utter_q1) and the end (utter_q4) of the discussion. In addition, the facilitators tend to speak utterances with many topics and break the deadlock in the discussions.

Finally, we discuss the tendencies of the Kyutech corpus. From Fig. 6, the frequent features other than the common one, “QU_ratio”, were “self_repetition” and “ave_time”. In a situation where no participant has a role, a participant who wants to control the discussion, namely a latent facilitator, needs to amply express his/her suggestions and opinions. Therefore, we conclude that the latent facilitators tend to repeat his/her utterances and speak longer than other participants.

In this analysis, we focused on one difference between the two corpora: whether the role of a facilitator was given in advance. However, there are other aspects of the analysis. For example, the results might be caused by cultural and linguistic differences (the AMI corpus is English and the Kyutech corpus is Japanese). Analyzing the decision trees deeply from these aspects is interesting future work. In addition, the behaviors of facilitators depend on the facilitation skill that each person possesses. Okada et al. [12] and Zhang et al. [17] reported an estimation method for communication skills and the influence in group discussions. Applying the knowledge obtained from these studies to our analysis and our digital facilitator system is important for gaining new and deep insight from our decision trees.
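The two operations used in this section, reading a rule off a root-to-leaf path and counting which feature appears at which depth, can be sketched in Python. The nested-dictionary tree below mirrors only the Fig. 4 path quoted in Sect. 5.1; the other leaves are placeholders, and the dictionary layout and the names `classify` and `features_by_depth` are assumptions made for illustration, not the output format of any CART implementation.

```python
from collections import Counter

# Hand-written tree mirroring the quoted Fig. 4 path:
#   QU_ratio > 0.3095 and ave_time > 1.957 -> FACILITATOR.
# The NON-FACILITATOR leaves are placeholders (assumption).
tree = {
    "feature": "QU_ratio", "threshold": 0.3095,
    "le": {"label": "NON-FACILITATOR"},
    "gt": {
        "feature": "ave_time", "threshold": 1.957,
        "le": {"label": "NON-FACILITATOR"},
        "gt": {"label": "FACILITATOR"},
    },
}


def classify(node, features):
    """Follow the thresholds from the root until a leaf label is reached."""
    while "label" not in node:
        branch = "gt" if features[node["feature"]] > node["threshold"] else "le"
        node = node[branch]
    return node["label"]


def features_by_depth(node, depth=0, counts=None):
    """Count (depth, feature) pairs; depth 0 corresponds to the top level."""
    counts = Counter() if counts is None else counts
    if "label" not in node:
        counts[(depth, node["feature"])] += 1
        features_by_depth(node["le"], depth + 1, counts)
        features_by_depth(node["gt"], depth + 1, counts)
    return counts


print(classify(tree, {"QU_ratio": 0.35, "ave_time": 2.4}))
print(features_by_depth(tree))
```

Summing such counters over all trees generated in the cross-validation folds would give depth-wise feature distributions of the kind plotted in Figs. 5 and 6.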

(Bar chart for Fig. 6: the vertical axis is the frequency, the horizontal axis lists the features, and the legend distinguishes the top level, 2nd level, and 3rd level.)

Fig. 6. Distribution of features in the decision tree for the Kyutech corpus.

6 Conclusions

The goal of our study is to construct a conversation management system, namely a digital facilitator system. For this purpose, we analyzed the characteristics of facilitators in two multi-party conversation corpora, the AMI corpus and the Kyutech corpus, from a macro viewpoint. The results are useful for developing not only our digital facilitator but also many other collaboration systems and agents.

First, we generated a model that classifies each participant in a conversation as a facilitator or not. We applied several features to a machine learning method, CART. The feature set consisted of a repetition ratio of utterances, ratios of topic tags and dialogue act tags defined in each corpus, and so on. The F-values of the decision trees for the AMI corpus and the Kyutech corpus were 0.642 and 0.74, respectively. Although the classification itself is not always our purpose, a decision tree with high accuracy contributes to precise analysis by humans. Therefore, improving the decision tree model, e.g., by adding cross-talking and interruption features to the feature set, is part of our future work.

It is not always clear which characteristics of facilitators are effective for good facilitation and decision-making. Therefore, we next analyzed the behaviors of facilitators by examining the decision trees manually. In other words, we manually derived frequent feature combinations from the decision trees as behavior patterns of facilitators. We gained some insights from the trees, e.g., “Facilitators tend to speak many utterances in the conversation and utterances related to acts that listen to suggestions from other participants.” We also discussed the influence arising from the difference in the settings of the two corpora. We obtained some common and different tendencies of facilitation for decision-making tasks from the analysis. One common tendency was “facilitators tend to listen to suggestions and opinions from other participants.” One tendency of the AMI corpus was “the facilitators manage and control not only the whole discussion but also the beginning and the end of the discussion.” One tendency of the Kyutech corpus was “the facilitators tend to repeat his/her utterances and speak longer than other participants.” We concluded that these behaviors were caused by the difference in the setting of each corpus, that is, whether the role of a facilitator was given in advance. However, the behaviors might be caused by other factors, such as cultural differences and the facilitation skills of each person. Therefore, a detailed investigation of these points is important future work toward constructing a digital facilitator system.

In this paper, we used utterances and annotated tags in each corpus. However, non-verbal features are also effective for detecting and analyzing facilitators in conversations. We need to incorporate voice information, gaze, and postures into our model to build a multi-modal interpretation model. Moreover, surface linguistic features were underused in our current method. We need to analyze specific surface expressions that facilitators tend to use for controlling the discussion, such as “What do you think about ... ?” and “How about you?”

We analyzed the behaviors of facilitators over the whole discussion, namely from a macro viewpoint. This is valuable as a basic model of our digital facilitator. For example, although the current prototype system [8] can generate sentences and charts that explain the current state of a discussion, the timing of the action depends on the user’s click on the system. By using the knowledge from the analysis, our prototype system can control the explanation generation, e.g., generation after a silence in a discussion (knowledge from the break_silence feature in Sect. 5.2). In addition, we also need a micro viewpoint to develop a more useful digital facilitator: when should the digital facilitator suggest an opinion in a discussion? Therefore, analyzing the utterances and actions of facilitators from the micro viewpoint is one of the most important future tasks.

Acknowledgment. This work was supported by JSPS KAKENHI Grant Number 17H01840.

References

1. Breiman, L., Friedman, J., Stone, C.J., Olshen, R.A.: Classification and Regression Trees. CRC Press, Boca Raton (1984)
2. Bunt, H., et al.: ISO 24617–2: a semantically-based standard for dialogue annotation. In: Proceedings of the 8th International Conference on Language Resources and Evaluation, pp. 430–437 (2012)
3. Carletta, J.: Unleashing the killer corpus: experiences in creating the multi-everything AMI meeting corpus. Lang. Resour. Eval. J. 41(2), 181–190 (2007)
4. DeSanctis, G., Gallupe, R.B.: A foundation for the study of group decision support systems. Manag. Sci. 33(5), 589–609 (1987)
5. Hung, H., Gatica-Perez, D., Huang, Y., Friedland, G.: Estimating the dominant person in multi-party conversations using speaker diarization strategies. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing, pp. 2197–2200 (2008)
6. Ito, T., Imi, Y., Ito, T., Hideshima, E.: COLLAGREE: a facilitator-mediated large-scale consensus support system. In: Proceedings of the 2nd Collective Intelligence Conference (2014)
7. Ito, T., Imi, Y., Sato, M., Ito, T., Hideshima, E.: Incentive mechanism for managing large-scale internet-based discussions on COLLAGREE. In: Proceedings of the 3rd Collective Intelligence Conference (2015)
8. Kirikihira, R., Shimada, K.: Discussion map with an assistant function for decision-making: a tool supporting consensus-building. In: Proceedings of the 10th International Conference on Collaboration Technologies (2018)
9. Klein, M.: Achieving collective intelligence via large-scale on-line argumentation. CCI Working Paper 2007–001, MIT Sloan School of Management 4647–07 (2007)
10. Li, W., Li, Y., He, Q.: Estimating key speaker in meeting speech based on multiple features optimization. Int. J. Sig. Process. Image Process. Pattern Recogn. 8(4), 31–40 (2015)
11. Matsuyama, Y., Akiba, I., Fujie, S., Kobayashi, T.: Four-participant group conversation: a facilitation robot controlling engagement density as the fourth participant. Comput. Speech Lang. 33(1), 1–24 (2015)
12. Okada, S., et al.: Estimating communication skills using dialogue acts and nonverbal features in multiple discussion datasets. In: Proceedings of the 18th ACM International Conference on Multimodal Interaction, pp. 169–176 (2016)
13. Omoto, Y., Toda, Y., Ueda, K., Nishida, T.: Analyses of the facilitating behavior by using participant’s agreement and nonverbal behavior. J. Inf. Process. Soc. Japan 52(12), 3659–3670 (2011). (in Japanese)
14. Sapru, A., Bourlard, H.: Automatic social role recognition in professional meetings using conditional random fields. In: Proceedings of Interspeech (2013)
15. Yamamura, T., Hino, M., Shimada, K.: Dialogue act annotation and identification in a Japanese multi-party conversation corpus. In: Proceedings of the Fourth Asia Pacific Corpus Linguistics Conference (2018)
16. Yamamura, T., Shimada, K., Kawahara, S.: The Kyutech corpus and topic segmentation using a combined method. In: Proceedings of the 12th Workshop on Asian Language Resources, pp. 95–104 (2016)
17. Zhang, Q., et al.: Toward a supporting system of communication skill: the influence of functional roles of participants in group discussion. In: Proceedings of the 19th International Conference on Human-Computer Interaction (HCII 2017) (2017)

Consideration of a Method to Support Face-to-Face Communication Using Printed Stickers Featuring a Picture of a Character Expressing a Mood

Yuri Nishimura and Minoru Kobayashi

Meiji University, 4-21-1 Nakano, Nakano-ku, Tokyo, Japan
[email protected], [email protected]

Abstract. Sharing our moods is important in communication with the people sharing our space. However, it is difficult to convey our moods to other people constantly, and sometimes we are misunderstood by other people. In this study, we considered a method to visualize our moods to convey them to the people around us. We conducted experiments to evaluate a method that shares moods using printed stickers featuring a picture of a character expressing a mood. In this paper, we introduce the experiment and the results, and we consider the effect of showing moods using stickers.

Keywords: Mood · Face-to-face communication · Sticker

1 Introduction

In communication with people, if you cannot let your partner understand your mood or situation, satisfaction with the communication may decrease. For example, we sometimes find it difficult to show that we are enjoying a conversation; our face might make us appear to be in a bad mood when we are concentrating. These misunderstandings may cause loss of communication opportunities and obstruct smooth communication. Using biological or other types of sensors, we may be able to build a device that judges and displays our mood. However, people might tend to reject such devices that expose their inner state. When designing such devices, we have to leave some room for people to control their output. Therefore, we considered that information media satisfying the following two conditions would enable better communication.

– It can constantly show our moods.
– We can control its output if we want.

The purpose of this study is to realize a device for sharing moods that satisfies these two conditions. Many online services provide mood sharing functionalities by using facial emoticons. These functions help users to express their mood, i.e., state of feeling, to other users. The “mood sharing” referred to in this paper means such a process in which users let others understand their feelings. We conducted a preliminary experiment to investigate the influence on communication when there was a means of sharing mood. In the experiment, the participants presented their moods by wearing a sticker printed with an illustration of a character expressing their mood. In this paper, we show the method and the results of the experiment, and we discuss the effect of visualizing moods using stickers. In addition, we show the necessary requirements for the design of information media for sharing mood.

© Springer Nature Switzerland AG 2018 H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 159–170, 2018. https://doi.org/10.1007/978-3-319-98743-9_13

2 Related Work

2.1 Interactive System for Promoting Face-to-Face Communication

Okamoto et al. [1] developed a system called Silhouettell to provide awareness support for real world communication. It shows the user’s shadow along with their profile information on a large screen to allow users to know who is in the meeting space, and provides topics to promote communication, based on profile information. Sumi et al. [2] proposed a system called AgentSalon to encourage knowledge exchange and effective conversation in face-to-face communication. In AgentSalon, personal software agents of each user on the display screen perform automated conversation. As the agents have the profile information about the corresponding users, and speak as these users might speak if they were there, the users’ opinions are exchanged indirectly. Unlike these studies, our hypothesis in this study was that not only sharing profiles and providing topics but also sharing moods promotes communication among people sharing a space. Therefore, we discuss ways to convey moods that people want to share.

2.2 Mobile Application for Sharing Mood and Emotions

Church et al. [3] developed MobiMood, a social mobile application that enables groups of friends to share their moods with each other. They found that sharing mood promotes communication among users. Huang et al. [4] developed a mobile social application called Emotion Map that can record emotions with time, place and activity information which can be shared with friends. Emotion Map helps improve people’s awareness and regulation of their emotions and promotes communication. In this paper, we focused on the effects of sharing moods in face-to-face communication.

2.3 Augmentation of Physical Expression by Giving Visual Information

Zhao et al. [5] developed Halo, which is a ring of LEDs (Light Emitting Diodes) that frames the wearer’s face to research the potential of body-centered lighting technology. Halo can change the impression of the expression of the wearer by applying light of various colors and angles to the face. Sakurai et al. [6] researched the effect on judgement of human emotion when projecting comic book images onto the walls surrounding a person. Moreover, Sakurai et al. [7] developed a chair called the Psynteraction Chair that can visualize the degree of concentration of sitting persons by applying this method. In this study, we used a method to visualize mood using printed stickers featuring an illustration of a character expressing a mood.

3 Experiment

We conducted an experiment to find the effect of visualizing moods using stickers. The participants chose their moods and put on an appropriate sticker when they entered the laboratory where they performed their daily work. The experiment period was five days.

3.1 Participants

Thirteen college students and graduate students belonging to the same laboratory as the authors volunteered to participate in the experiment. The participants included six males and seven females, and ranged in age between 21 and 23 years.

3.2 Stickers

The authors made stickers to express moods (Fig. 1). The stickers express eight mood states: excited, cheerful, irritated, tense, relaxed, calm, bored, and sad. The images were taken from Pick-A-Mood [8], a character-based pictorial scale for reporting and expressing moods proposed by Desmet et al. It includes three characters, a male, a female, and a robot (Fig. 2), and we used all three characters in our experiment. The robot character is included for participants who are not comfortable choosing male or female characters to express themselves. The participants did not know which mood state each illustration corresponded to. The stickers were circular, with a diameter of 60 mm.

3.3 Procedure

During the experiment, the participants wore a sticker when entering the laboratory. Figure 3 shows the procedure for choosing the sticker. First, the participants selected an illustration that expressed the mood to present. We did not force the participants to match the sticker to their actual feeling; thus they could choose a sticker that represented a mood other than their actual one if they wanted. Next, the participants placed the sticker somewhere on their body. Finally, the participants recorded the selected sticker and their name on a tablet. After performing all the steps, the participants went about their normal routine in the laboratory while wearing the sticker, and they removed the sticker when they left the laboratory. The participants could change the sticker they were wearing if they wanted to change the mood they conveyed. When changing the sticker, the participants entered the selected sticker and their name on the tablet again.

Fig. 1. Stickers used in the experiment

Fig. 2. Examples of male, female, and robot characters


In addition, we asked the participants who came to the laboratory on that day to complete an online form providing information about what they had felt and what they had noticed while wearing the sticker. Filling in the online form was optional.

Fig. 3. Procedure for selecting a sticker

3.4 Questionnaire

After the experiment, we conducted a questionnaire for all the participants. The questionnaire included questions about the variations of the stickers, the experience of wearing a sticker, and communication while wearing the stickers.

3.5 Interview

After the questionnaire, we interviewed all the participants. In the interview, the participants gave a detailed explanation about the comments entered in the online form and the answers to the questionnaire.

4 Results

4.1 How Many Times Each Sticker Was Used

Table 1 shows the number of participants and the stickers used on each day. On the fourth day, two participants changed their sticker during the day. Figure 4 gives an overview of the experiment by showing the number of times each sticker was selected. The sticker expressing “relaxed” was used the most, being chosen 10 times. The stickers expressing the other moods were also used at least once.

Table 1. Numbers of participants and the stickers used on each day

Day   Participants   Number of stickers used
1     4              4
2     10             10
3     1              1
4     8              10
5     6              6

Fig. 4. How many times each sticker was used

4.2 Evaluation of Variations of the Stickers

The results of question 1 “Was the number of cartoon characters in the stickers too many, too few, or about right?” were as follows: one participant answered “Too many,” three answered “Too few,” and nine answered “About right.” The participant who answered “Too many” provided the comment “I always used a sticker featuring a robot.” We asked the participants who answered “Too few” what kind of cartoon characters they would have preferred. They provided answers such as animals, or gender undetermined characters such as a robot. One participant answered that she needed more kinds of male or female characters. The results of question 2 “Was the number of moods conveyed in the stickers too many, too few, or about right?” were as follows: four participants answered “Too many,” four answered “Too few,” and five answered “About right.” Of those who answered “Too many” their reasons included: “Difficult to understand the difference of moods of some images.” Of those who answered “Too few” their reasons included: “Different levels of mood should be included,” and “Different kinds of mood should be included.” The results of question 3 “Did you ever feel that there was no suitable sticker to express your own mood?” were as follows: eight participants answered “Yes,” and five answered “No.” We asked the participants who answered “Yes” about what kind of moods should be included, and their answers included different levels of moods, and other conditions such as “hungry,” “sleepy,” “busy,” “desperate,” and “tired.”

4.3 Evaluation of the Experience of Wearing a Sticker

In question 4, we asked five questions about how the participants felt while wearing a sticker. Figure 5 shows the results of question 4. In question 4–5 “You were aware of the sticker while wearing it,” three participants answered “Agree a little,” and one answered “Strongly Agree.” We asked them about the influence of the sticker on their behavior and mind. They provided comments such as “I tried to cheer up with wearing a positive sticker when I had hard work,” “I tried to raise my spirits when I looked at myself wearing a negative sticker in the glass,” and “I thought that no one would try to talk to me if I was wearing a sticker that expressed I was thinking.”

Fig. 5. Results from Question 4

4.4 Communications During Wearing a Sticker

The results of question 5 “Did you talk about the stickers that you were wearing during communication with other participants?” were as follows: nine participants answered “Yes,” and four answered “No.”

The results of question 6 “Did you feel that your mood was conveyed to other participants by wearing the sticker?” were as follows: four participants answered “Yes,” and nine answered “No.” We asked the participants who answered “Yes” when they felt that. Two participants answered “When talking about the stickers,” and the other two participants answered “When other participants reacted to the sticker that I was wearing.”

The results of question 7 “Did you feel that you could understand well the mood of other participants wearing a sticker?” were as follows: seven participants answered “Yes,” and six answered “No.”

The results of question 8 “Were you helped by the stickers that showed the moods of other participants in communication with them?” were as follows: three participants answered “Yes,” and ten answered “No.” The participants who answered “Yes” provided comments such as “The persons who were wearing positive stickers were easy to talk to because I could understand that they felt good,” and “I decided to be kind to the persons wearing the stickers expressing anger.” For questions 6 to 8, many participants answered “No” as they did not have a chance to utilize the sticker in the short period of time of the experiment.

The results of question 9 “Did you select a sticker different from the one closest to your mood?” were as follows: five participants answered “Yes,” and eight answered “No.” Of those who answered “Yes” their reasons included: “To indicate my desire not to be spoken to,” “To try to raise my mood,” “Because I wanted to find out the reactions from other people when I wore a depressed sticker,” “Because I thought it was not good to select the same sticker as the previous day,” and “Because I did not know which sticker was the closest to my mood.”

The results of question 10 “Did you think that other participants were wearing a sticker that showed a different mood from their actual mood?” were as follows: one participant answered “Yes,” and twelve answered “No.” The participant who answered “Yes” provided the comment “I felt that one of the participants was wearing a completely different thing intentionally, because he/she apparently was not angry.”

In questions 11 and 12, we asked the participants to use free description to report the effects of the stickers on communication. The results of this free description for question 11 “Did you feel that the stickers had positive effects on communication with other participants? If so, please add any comments.” were as follows:

– “We got excited on finding that we were wearing the same sticker.”
– “I got a chance to talk, for example, discussing the reason for selecting a sticker.”
– “I got a chance to talk with someone wearing the same sticker.”
– “When I was wearing a female sticker that expressed happiness, I felt that I was able to give a good impression because I was told ‘You look like you are having fun today, too!’.”
– “I thought that the sticker provided a good topic to talk about, because I got a chance to talk about the stickers after putting the sticker on.”
– “It was a new sensation to find out what other people were feeling, because there was not much opportunity before.”

The results of the free description for question 12 “Did you feel that the stickers had negative effects on communication with other participants? If so, please add any comments.” were as follows:

– “I felt that it was hard to speak to a person who was wearing an expressionless sticker.”
– “I was told I had a grim look when I was wearing a sticker with an intimidating (?)¹ expression.”
– “When the other person was wearing a sticker with an angry expression, I refrained from communication.”

1 We left the question mark written by the participants as it was in their answer.


5 Discussion: The Effects of Wearing the Sticker

In this chapter, we discuss the effect of showing a mood to people by wearing a sticker featuring a character expressing a mood.

5.1 The Effect of Constantly Showing a Mood

The participants always wore a sticker to show their mood in the laboratory. As a result, from the answers to questions 5 and 11, we found that the participants talked about the sticker that they were wearing and that the sticker triggered their conversation. Moreover, from the answers to questions 7 and 8, we found that the sticker was helpful for deciding when to talk to and how to treat the other participants. Therefore, showing a mood constantly may activate communication and reduce the psychological barrier to starting communication.

5.2 Wearing a Sticker Different from Their Mood

In the experiment, the participants selected stickers themselves from the stickers that represented various moods. They could select a sticker representing their own mood, or could select others as they wished. In the answers to question 9, reported in Sect. 4.4, we found that some participants actually selected a sticker that was not the closest to their mood. According to their answers, they selected other stickers because they wanted to appear as if they had the mood represented in the sticker, or because they tried to control their own mood by wearing the sticker. Therefore, information media that express mood may help us express our will or intention indirectly and help us control our moods.

5.3 The Effect of Limiting the Mood Selection to Several Options

In the experiment, the participants had to select one from a limited number of stickers, so they picked one that was closest to the mood they wanted to use. As a result, some participants happened to wear the same sticker as others. From the answers to question 11, we found that communication between the participants wearing the same sticker was encouraged. This occurred because the selection was limited. Thus, showing a mood using a limited selection of stickers may provide chances to make people realize that they have similar feelings to others.

6 Discussion: Design Requirements

In this chapter, based on the results of the experiment, we discuss the necessary requirements for the design of information media to help people share their mood.

6.1 Expandability

In this experiment, we prepared stickers that featured three cartoon characters, each expressing eight mood states: excited, cheerful, irritated, tense, relaxed, calm, bored and sad. From the answers to question 2, we found that some participants thought that there were too many different kinds of stickers, while other participants thought that there were too few. Moreover, from the answers to question 3, we found that some participants wanted stickers that showed different levels of a mood and other conditions such as “hungry,” “sleepy,” “busy,” “desperate,” and “tired.” Therefore, we think that information media for sharing mood need a function that allows users to customize the stickers according to their own preferences.

6.2 Convenience

In this experiment, we placed the stickers near the laboratory entrance, because the participants put on the stickers when entering the laboratory. However, some participants commented that they hesitated to change their stickers, because the place where the stickers were set out could be seen by many people in the laboratory. Therefore, we think that information media for sharing mood should let users easily change the mood shown whenever they want to.

6.3 Visibility and Understandability of Stickers

In this experiment, we focused on the method of wearing a sticker to show a mood. However, some participants commented that sometimes they could not see the stickers clearly depending on the spatial relationship between the participants, and depending on where the sticker was attached to their body. Moreover, another participant commented that she was not aware of which mood the sticker indicated, even when she noticed the existence of the sticker. Therefore, we think that the information media for sharing mood needs the ability to give a clearer impression of the mood. Some other methods, such as using LEDs to express moods or displaying an animation instead of a static illustration, may give a better impression of the wearer’s mood.

7 Future Work

7.1 Long-Term Experiment

The experiment lasted only five days, which is a short period. We might obtain different results on the use of the stickers with a different experiment period or a different community of participants. Moreover, the stickers may have increased conversations simply because the experience of wearing a sticker was unusual. Therefore, in the future, we will conduct a longer-term experiment and study the effects of showing moods in detail.

7.2 Construction of Media that Meets Design Requirements

In Sect. 6, we pointed out three factors, “expandability”, “convenience”, and “visibility and understandability of stickers”, as necessary requirements for the design of information media useful for sharing mood. In future work, we will try to realize information media that satisfy these design requirements. Then, through user experiments, we will study the effects of sharing mood by using the information media.

7.3 Application to Telecommunications

In remote communication, it is even more difficult to convey our moods because the information that can be communicated is limited. In particular, when a remote conference is held between a “main venue” where a large number of people are gathered and a “remote venue” where there is only one person, the two sides are in an “asymmetric” state: the information available to them and their positions differ significantly. It can be difficult for the participant in the remote venue to grasp the state of the main venue, or to communicate his or her intention to the participants in the main venue. Although we focused on face-to-face communication in this research, we also expect to apply the knowledge obtained from it to the design of telecommunication media.

8 Conclusion

In this study, we aim to realize information media that can share mood in communication with people who share a space. In the experiment, the participants wore a sticker featuring a cartoon character showing their mood, and we researched the effects on communication of wearing a sticker to show mood. As a result, we found that there were cases where people felt that it was easier to talk to those wearing a sticker showing a good mood, and the stickers led to conversation and provided topics of conversation. Moreover, there is a possibility that we can control the impression given to other people by wearing a sticker. In addition, wearing the same sticker may let people realize that they have similar feelings to others. In the future, we will conduct long-term experiments to research the effects of showing a mood on communication in more detail. In addition, we will work on realizing mood sharing media that meet the design requirements, and will consider how to apply the knowledge to telecommunications.

Acknowledgements. This work was supported by JSPS KAKENHI Grant Number JP18K11410.


References

1. Okamoto, M., Nakanishi, H., Nishimura, T., Ishida, T.: Silhouettell: awareness support for real-world encounter. In: Ishida, T. (ed.) Community Computing and Support Systems. LNCS, vol. 1519, pp. 316–329. Springer, Heidelberg (1998). https://doi.org/10.1007/3-540-49247-X_21
2. Sumi, Y., Mase, K.: AgentSalon: facilitating face-to-face knowledge exchange through conversations among personal agents. In: Proceedings of Agents 2001, pp. 393–400 (2001)
3. Church, K., Hoggan, E., Oliver, N.: A study of mobile mood awareness and communication through MobiMood. In: Proceedings of the 6th Nordic Conference on Human-Computer Interaction (NordiCHI 2010), pp. 128–137 (2010)
4. Huang, Y., Tang, Y., Wang, Y.: Emotion map: a location-based mobile social system for improving emotion awareness and regulation. In: Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW 2015), pp. 130–142 (2015)
5. Zhao, N., Paradiso, J.A.: HALO: wearable lighting. In: Adjunct Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2015 ACM International Symposium on Wearable Computers, pp. 601–606 (2015)
6. Sakurai, S., Narumi, T., Tanikawa, T., Hirose, M.: Augmented emotion by superimposing depiction in comics. In: Proceedings of the 8th International Conference on Advances in Computer Entertainment Technology (ACE 2011), Lisbon, Portugal (2011). Article no. 66
7. Sakurai, S., Yoshida, S., Narumi, T., Tanikawa, T., Hirose, M.: Psynteraction chair: a proposal of a system for induction of interpersonal behavior by using comic book images as ambient information. In: The 18th International Conference on Virtual Systems and Multimedia (VSMM 2012) (2012)
8. Desmet, P.M.A., Vastenburg, M.H., Romero, N.: Mood measurement with Pick-A-Mood: review of current methods and design of a pictorial self-report scale. J. Des. Res. 14(3), 241–279 (2016)

UI and UX

Migaco: Supporting Young Children’s Tooth Brushing with Machine Learning

Satoshi Ichimura

Information Design Course, School of Social Information Studies, Otsuma Women’s University, Sanbancho 12, Chiyoda-ku, Tokyo 102-8357, Japan
[email protected]

Abstract. We have developed “migaco”, a system supporting young children’s tooth brushing. It is composed of a general toothbrush equipped with a magnet and a smartphone equipped with a geomagnetic sensor (an electronic compass). The output data from the geomagnetic sensor changes corresponding to the movement of the magnet, so that it is possible to know whether the toothbrush has moved or not. We have also introduced machine learning technology to migaco in order to identify the brushing point, and realized users’ requests like “I want to know if my child is brushing the entire teeth properly” or “I want to know which teeth my child did not brush.”

Keywords: Gamification · Machine learning · Tooth brushing · Smartphone

1 Introduction

Tooth brushing in early childhood is known to greatly affect subsequent dental health, including tooth decay, tooth alignment, and bite. However, if parents force their child to brush, there is a danger that the child will remember tooth brushing as an unpleasant activity. Therefore, it is necessary to help young children enjoy making tooth brushing a habit, and to improve communication between children and parents in order to achieve this.

In recent years, “gamification” has attracted attention. Gamification is the concept of applying game mechanics and game design techniques to fields other than games. It aims at improving user experience, user engagement and users’ motivation by utilizing game elements and mechanisms that attract many people. Game elements include mechanisms that visualize the results the user has achieved through ranking, scoring or awarding badges. These make it easier to understand how hard the user has tried and how much progress he/she has made. Furthermore, merging SNS mechanisms into gamification can provide the pleasure of collaborating or competing with friends who have the same purpose, which could further increase users’ motivation [1].

© Springer Nature Switzerland AG 2018. H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 173–184, 2018. https://doi.org/10.1007/978-3-319-98743-9_14


Based on the above background, we have developed “migaco”, a system supporting young children’s tooth brushing. It helps young children make tooth brushing a habit, and improves communication between children and parents. “Migaco” means “Let’s brush teeth” in Japanese. The system is composed of a toothbrush equipped with a magnet and the geomagnetic sensor mounted on a smartphone. Because the toothbrush carries only a small magnet, it is not bulky, it is hard to break even if a child treats it roughly, it is inexpensive, and it needs no battery.

When the magnet on the toothbrush moves, the value of the smartphone’s geomagnetic sensor changes. A typical smartphone reports the orientation measured by the geomagnetic sensor (the electronic compass) as one of the output values (alpha rotation) of the gyro sensor. Thus, when the orientation changes, it can be determined that the toothbrush has moved. migaco counts the number of brushing strokes and displays an animation in which an animal character moves in accordance with the movement of the toothbrush.

Furthermore, in order to detect brushing points, we introduced machine learning (deep learning) technology to analyze the complex variations in the value of the smartphone’s geomagnetic sensor. The evaluations showed that the brushing point can be identified with high accuracy, and we obtained favorable comments from users such as “It is good because I can see if my kid is brushing teeth properly.” and “It was fun while my kid was enjoying migaco.”

2 Background

2.1 Tooth Brushing Support and Related Work

It is known to be difficult to establish good daily habits in early childhood. Early childhood is the time to acquire basic habits of daily life such as tooth brushing, hand washing and gargling. Among them, tooth brushing is one of the most important practices. According to the “Survey on Children’s Toothpaste” [7], about 98% of parents think that their children’s tooth brushing is important. However, about 90% of parents feel that their children do not brush properly. Rather than forcing children to brush their teeth, it is necessary to help them enjoy making tooth brushing a habit.

Sunstar’s GUM PLAY [8] is a related product that supports tooth brushing. It consists of an attachment fitted to a toothbrush and a smartphone application, and it captures the movement of the toothbrush with a 3-axis acceleration sensor mounted on the attachment. The acquired data are transmitted to the smartphone by Bluetooth wireless communication. The smartphone application analyzes the quality of the tooth brushing and records it. However, the attachment, which contains an acceleration sensor module, a battery, and a Bluetooth wireless module, is heavy, easy to break if a child treats it roughly, and expensive, and its battery must be replaced.

2.2 Gamification and Related Work

Here we describe examples of gamification in which game elements are applied to solve problems that arise in daily life.

Nike+ Running [4] is a smartphone application that calculates mileage and calorie consumption using the GPS in the smartphone. The user’s level is judged according to the running distance and is visualized by color. The color level starts from yellow, goes through orange, green and blue, and ends at black. When the running status is published to Facebook and a “Like” arrives from a friend, a cheering voice is played during the run.

Foursquare [2] is a smartphone application that automatically records shops and facilities visited by users, again using the GPS in the smartphone. Users can earn points by “checking in” to places, and earn badges such as Mayor of a place when they check in more than a predetermined number of times. Users can compete with their friends.

Ingress [3] is also a smartphone application, which turns walking into a fun game. A user can acquire bases (portals) located all over the world by visiting their locations, and claim the area as his/her territory. Portals are often located near famous historical sites or works of art, so that a user can enjoy regional sightseeing at the same time.

Studyplus [5] is a learning management SNS integrated with gaming mechanisms to promote continued learning. When the user inputs the progress of the day into the site, the progress is visualized as a graph. In addition, users can compete with a number of anonymous friends who have the same goal.

Microsoft [6] introduced gamification called the “language quality game” into the development process of the multilingual versions of Windows. During localization, correcting language errors requires a tremendous amount of work, and the gradual decline in the motivation of the debugging staff was a problem. In the language quality game, Microsoft employees around the world were asked to find suspicious words. Every time they found a suspicious word, they got a point. A ranking based on those points was announced to maintain motivation. It was reported that more than 7000 suspicious words were discovered.

We have been conducting research on how to utilize the effect of gamification for several years, and we developed a vacuum cleaner with gamification functions that can make cleaning work fun [9]. The vacuum cleaner has an acceleration sensor, detects its own motion and calculates a score, which is high when the speed of the movement is appropriate. A different sound is generated depending on the acquired score, so a user can tell how good the movement of the vacuum cleaner was by listening to the sound. A typical user might be a person who thinks cleaning is troublesome, or who cannot maintain his/her motivation for cleaning, but the proposed cleaner with game elements could provide a more enjoyable experience than usual.

3 Proposal

In this paper, we propose migaco, a system supporting young children’s tooth brushing. It is composed of a toothbrush equipped with a magnet and the geomagnetic sensor mounted on a smartphone. Since the toothbrush carries only a magnet, it is not bulky, it is hard to break even if the child treats it roughly, it is inexpensive, and no battery replacement is needed.

This is how it works. Typically, a smartphone is fixed in front of the user’s face. When the magnet attached to the toothbrush moves, the magnetic field around the smartphone is disturbed, so that the smartphone can detect the movement of the toothbrush. Generally speaking, a smartphone is equipped with a gyro sensor, so it can detect rotation around its X, Y, and Z axes (Fig. 1). A typical smartphone returns an orientation (0° to 360°) as the alpha rotation around the Z axis. Thus, when the alpha rotation of the gyro sensor changes in response to the magnet, it can be determined that the toothbrush has moved.

Fig. 1. Alpha rotation of the gyro sensor changes in response to the magnet
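The movement detection described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper does not say how migaco turns heading changes into a brushing count, so the swing-based rule, the function names `angle_diff` and `count_strokes`, and the 5° threshold are all hypothetical. The one detail that follows directly from the 0° to 360° range is that differences must be wrapped, since a heading can jump from 359° to 1°.

```python
def angle_diff(prev_deg, curr_deg):
    """Signed change from prev to curr in degrees, wrapped into [-180, 180)."""
    return (curr_deg - prev_deg + 180.0) % 360.0 - 180.0


def count_strokes(alphas, min_swing_deg=5.0):
    """Count swings of the heading that are at least min_swing_deg wide.

    A swing is a run of consecutive heading changes in the same direction.
    The threshold is an assumed value, not one taken from the paper.
    """
    strokes = 0
    swing = 0.0  # accumulated change in the current direction
    for prev, curr in zip(alphas, alphas[1:]):
        delta = angle_diff(prev, curr)
        if delta == 0.0:
            continue
        if swing != 0.0 and (delta > 0) != (swing > 0):
            # Direction reversed: close the previous swing.
            if abs(swing) >= min_swing_deg:
                strokes += 1
            swing = 0.0
        swing += delta
    # Close the last swing, which has no reversal after it.
    if abs(swing) >= min_swing_deg:
        strokes += 1
    return strokes
```

For example, the readings `[358, 2, 8, 3, 357, 4]` contain three swings (up, down, up) across the 0° boundary, while small jitter such as `[10, 10.5, 10, 10.4]` is ignored.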

With the above mechanism, the system counts the number of brushing strokes and displays an animation in which an animal character moves along with the toothbrush movement. Displaying the brushing count and the animal animation may entertain and encourage the child, and promote conversation between parents and child. The system was developed as a Web application using HTML5 & JavaScript.

Some early users of the system gave us comments such as “I feel that it is interesting to see the brushing count increases” and “I think it is a practical system.” On the other hand, some of those who participated in the experiment made requests such as “I want to know if my child is brushing the entire teeth properly” and “I want to know which teeth my child did not brush.” Therefore, we began to consider adding a mechanism to identify the brushing point.

However, it is not easy to identify the brushing point, because migaco can only get the rotation values around the Z axis from the geomagnetic sensor, which is one-dimensional data. Compared with three-dimensional acceleration data (which GUM PLAY uses, for example), the amount of information available to identify the brushing point is considerably smaller. Therefore, we decided to consider incorporating machine learning (deep learning) technology so as to analyze the complex time-series data from the geomagnetic sensor.

4 Brushing Point Detection

We set our goal to distinguish between three positions of right, center and left on the front side of the teeth.

Fig. 2. Sample data returned from the geomagnetic sensor every 50 ms

The geomagnetic sensor mounted on the smartphone returns the rotation values around the Z axis (0° to 360°). Figure 2 shows sample data returned continuously from the smartphone every 50 ms. They were collected while the right and the center of the front side of the teeth were each brushed for 2 s. As shown in the figure, it appeared to be difficult to distinguish between right and center with a simple method such as setting a static threshold value. We therefore planned the following three experiments:

1. to find the conditions under which the value of the geomagnetic sensor can be obtained effectively
2. to decide whether to introduce machine learning
3. to find a suitable machine learning (deep learning) method.

Each experiment is described in the following sections.

4.1 Experiment 1

Although the magnetic field around the smartphone is disturbed when the magnet attached to the toothbrush moves, its influence on the geomagnetic sensor is weak and complex. It is greatly affected by the positional relationship between the smartphone and the magnet, so we decided to conduct an experiment to find the conditions under which the value of the geomagnetic sensor can be obtained effectively.

A smartphone was placed in a smartphone holder that kept the surface of the smartphone horizontal. The distance between the smartphone and the magnet was approximately 10 cm. We found that the value of the geomagnetic sensor varied most when the height of the magnet and the height of the smartphone were close to each other. This indicates that favorable data can be collected when the magnet approaches along the extended plane of the smartphone surface. In addition, we found that the value of the geomagnetic sensor varied most when the magnet approached the bottom of the smartphone. From this we infer that the geomagnetic sensor is mounted at the bottom of the smartphone.

Subjects were asked to brush the right, center and left of the front side of their teeth using a migaco prototype, and 1,700 geomagnetic sensor readings taken every 50 ms were obtained. The distance between the bottom of the smartphone and the mouth was approximately 10 cm to 15 cm. Training data and test data were extracted without overlap from the 1,700 readings and used in the following experiments.

4.2 Experiment 2

In order to decide whether to introduce machine learning into migaco, we first constructed a simple MLP (Multi-Layer Perceptron) neural network [11,12] and conducted a recognition experiment.

The system consists of a PC and a smartphone. A web server and the machine learning program run on the PC (Windows). A web application running on the smartphone transmits the data detected by the geomagnetic sensor to the web server on the PC at predetermined time intervals. Upon receiving the rotation values around the Z axis, the web server activates the neural network-based machine learning program, which processes the data. The web application on the smartphone is written in HTML5 & JavaScript, and the machine learning program on the PC is written in Python [13]. Chainer [10] was used as the machine learning framework.

In our machine learning, a certain number of sensor values acquired by the geomagnetic sensor every 50 ms are entered into the input layer of the MLP all at once. Let L be the number of geomagnetic sensor values entered into the input layer of the MLP at the same time. L is called the dataset length. For example, if L is 10, 500 ms of geomagnetic sensor data are entered into the input layer of the MLP all at once. When L is 10, the MLP has 10 neurons in its input layer, and it is necessary to wait for 500 ms before entering the geomagnetic sensor data into the MLP. In this experiment, we tried to find the optimum value of L.

In the experiment, we also compared two methods of entering time-series data into the neural network.

Method 1 (adjacent datasets do not overlap). For example, when L is 10, the first dataset is t0 to t9 and the second dataset is t10 to t19.
Method 2 (adjacent datasets overlap heavily). For example, when L is 10, the first dataset is t0 to t9 and the second dataset is t1 to t10. See Fig. 4.

The 1,700 geomagnetic sensor readings collected in experiment 1 were used in this experiment. The training data and the test data were taken from them at a ratio of 8:2. The results are shown in Fig. 3. Method 2 achieved a significantly higher recognition rate than method 1, so we decided to adopt method 2. It also turned out that the optimum dataset length for migaco is 34, at which migaco obtains acceptable recognition rates. The recognition rate did not tend to improve when L was larger than 34. When L is 34, the MLP has 34 neurons in its input layer, and it is necessary to wait for 1.7 s (34 × 50 ms) before entering the geomagnetic sensor data into the MLP. A delay of 1.7 s may be a problem in some other applications, but it is not a problem for our toothbrushing diagnostic application. Therefore, we decided to introduce machine learning to migaco.

4.3 Experiment 3

Experiment 2 showed that machine learning can be introduced to migaco, so in experiment 3 we tried to find a suitable machine learning (deep learning) method. In addition to the MLP constructed in experiment 2, a CNN (Convolutional Neural Network) and an RNN (Recurrent Neural Network) were evaluated.

CNN is a method that is often used for image processing, but recently it has also been used for handling time-series data. In this experiment, a two-dimensional graph image (horizontal axis: time, vertical axis: value from the geomagnetic sensor) was created for each dataset, and the graph images of all the datasets were entered into the CNN to train the neural network (Fig. 4). In experiment 3, 34 sensor values acquired by the geomagnetic sensor every 50 ms are used to create each two-dimensional graph image, and the resulting graph images are entered into the input layer of the CNN. We incorporated into the CNN configuration a dropout module, which is known to be effective in preventing overfitting even when the neural network is deepened, and a Batch Normalization module, which improves learning accuracy [11].

RNN is a method that is often used for handling sequential data such as natural language and time-series data, so replacing the MLP with an RNN can be expected to improve the recognition rate.
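The graph-image input described above for the CNN can be illustrated with a small rasterizer. This is a sketch under assumptions: the paper does not give the image size, the value range used for the vertical axis, or the pixel encoding, so the 34-row height, the fixed 0° to 360° range, the single-pixel plot and the name `window_to_image` are hypothetical choices made only to show the idea of turning one dataset into a time-versus-value picture.

```python
def window_to_image(window, height=34, lo=0.0, hi=360.0):
    """Rasterize one dataset as a binary image with `height` rows.

    Column x corresponds to the x-th sample (time axis); the row marks the
    sensor value, with larger values nearer the top (row 0), as in a graph.
    """
    width = len(window)
    image = [[0] * width for _ in range(height)]
    for x, value in enumerate(window):
        frac = (value - lo) / (hi - lo)            # 0.0 at lo, 1.0 at hi
        level = min(height - 1, max(0, int(frac * height)))
        image[height - 1 - level][x] = 1           # flip so that up = larger
    return image
```

With 34 samples per dataset this yields a 34 × 34 image per dataset, which could then be stacked into a batch for whichever CNN implementation is used.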


Fig. 3. Result of experiment 2 (MLP)

Fig. 4. Datasets created with the method 2
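The two ways of cutting the sensor stream into datasets that were compared in experiment 2 amount to sliding a window of length L with a step of L (method 1) or a step of 1 (method 2). A minimal sketch follows; the function and constant names are illustrative and not taken from the migaco source code.

```python
SAMPLE_INTERVAL_MS = 50  # the sensor is read every 50 ms


def make_datasets(samples, length, overlap=True):
    """Cut a sample stream into datasets of `length` consecutive values.

    overlap=False: method 1, adjacent datasets share no samples (step = length).
    overlap=True:  method 2, adjacent datasets are shifted by one sample.
    """
    step = 1 if overlap else length
    last_start = len(samples) - length
    return [samples[i:i + length] for i in range(0, last_start + 1, step)]


def input_delay_ms(length):
    """Time needed to collect one full dataset before it can be classified."""
    return length * SAMPLE_INTERVAL_MS
```

With L = 10 and samples t0 to t19, method 1 gives two datasets (t0 to t9, t10 to t19) while method 2 gives eleven (t0 to t9, t1 to t10, and so on). `input_delay_ms(34)` evaluates to 1700, matching the 1.7 s delay for L = 34.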

In experiment 3, we incorporated a long short-term memory (LSTM) module, which is known to be effective for long time-series data, into the RNN configuration [11]. 34 sensor values acquired by the geomagnetic sensor every 50 ms are used to create a dataset, and the datasets were entered one by one into the input layer of the RNN. In this experiment, the 1,700 geomagnetic sensor readings collected in experiment 1 were used, and the training data and the test data were taken from them at a ratio of 8:2.

Experimental results are shown in Fig. 5, where the graph of the MLP is an excerpt from the graph of experiment 2 (method 2). The recognition rates of the RNN and the CNN were higher than that of the MLP, and furthermore, the recognition rate of the CNN was higher than that of the RNN. It is generally said that RNNs are suitable for handling time-series data, but unexpectedly the CNN was much better than the RNN. RNNs were designed for learning long-term time-series data such as natural language, whereas toothbrushing is basically a repetition of the same short action. We think that, for this reason, the merit of the RNN did not appear much in the experiment. Therefore, the CNN was selected as the machine learning method to be used for migaco.

It also turned out that the optimum dataset length for the CNN is 34. As in experiment 2, the recognition rate did not tend to improve when the dataset length L was larger than 34. We conducted recognition experiments using the CNN five times, and the average recognition rate was 98.3%.

Fig. 5. Result of experiment 3 (comparison between MLP, CNN and RNN)

5 Implementation

The experiments described above clarified whether to introduce machine learning into migaco, how to input the time-series data, and which machine learning method is suitable for migaco. We then implemented the system based on those experimental results.

As mentioned above, the web application on the smartphone is written in HTML5 & JavaScript, and the machine learning program on the Windows PC is written in Python (see Fig. 6). The web application on the smartphone sends geomagnetic sensor data to the web server running on the PC every 1.7 s (whenever 34 geomagnetic sensor values have accumulated). The web server then passes the received time-series data to the CNN program on the same PC, and the CNN program identifies the brushing point. The identified brushing point is sent back to the web application on the smartphone.


Fig. 6. System architecture
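The round trip in this architecture (34 values in, one brushing point out) can be sketched with Python's standard `http.server` module. Everything specific here is an assumption made for illustration: the paper does not describe the message format, so the JSON key `alpha`, the reply key `brushing_point`, the port, and all function and class names are hypothetical, and `placeholder_classifier` merely stands in for the trained CNN.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

WINDOW_LENGTH = 34  # one dataset = 34 samples = 1.7 s of sensor data


def placeholder_classifier(window):
    """Stand-in for the trained CNN; a real model would be called here."""
    return "center"


def handle_window(body, classifier=placeholder_classifier):
    """Validate one request body and return (HTTP status, reply bytes)."""
    try:
        window = json.loads(body)["alpha"]
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with key 'alpha'"}
        return 400, json.dumps(error).encode()
    if not isinstance(window, list) or len(window) != WINDOW_LENGTH:
        error = {"error": "expected %d samples" % WINDOW_LENGTH}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"brushing_point": classifier(window)}).encode()


class BrushingHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the client.
        length = int(self.headers.get("Content-Length", 0))
        status, reply = handle_window(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)


def serve(port=8000):
    """Run the server until interrupted (not started automatically)."""
    HTTPServer(("", port), BrushingHandler).serve_forever()
```

Keeping the validation and classification in `handle_window`, separate from the HTTP handler, makes it possible to exercise the logic without a network connection and to swap the placeholder for a real model later.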

The application screen on the smartphone is shown in Fig. 7. An animation of shining stars shows which area is being brushed.

5.1 Experiment 4

The system was actually used by 5 young children (two 6-year-olds, one 5-year-old and two 4-year-olds), and we then interviewed the children’s parents (5 parents). The parents gave us favorable comments such as “It is good because I can see if my kid is brushing teeth properly.” and “It was fun while my kid was enjoying migaco.” In addition, we asked the parents to evaluate some features of the system on five-point Likert scales. The results were 4.6 points for “Do you think there is practicality?” and 4.4 points for “Do you think that there is an effect of parent-child communication promotion?”

5.2 Experiment 5

As an additional experiment, we tried to distinguish between four positions on the front side of the teeth: upper right, lower right, upper left and lower left. During the experiment, the distance between the lower part of the smartphone and the magnet was approximately 10 cm to 15 cm. 2,000 geomagnetic sensor readings taken every 50 ms were obtained. Training data and test data were extracted from the 2,000 readings without overlap and used in the experiment. The experiment was conducted on two subjects and the average recognition rate was 95.5%.

We initially predicted that it would be difficult to distinguish between upper and lower teeth on the same side, but an acceptable recognition rate was obtained. When observing the subjects’ tooth brushing, we noticed, for example, that the variation in the movement of the end of the toothbrush appeared larger when brushing the lower teeth than when brushing the upper teeth. It is possible that each person has particular habits when brushing teeth and that the machine learning model learned them.

Fig. 7. Application screen on the smartphone

6 Summary

We proposed migaco, a system supporting young children’s tooth brushing. In response to requests from users of the prototype system, we implemented a function to identify the brushing point. For this purpose, machine learning technology was introduced to migaco, and as a result, an acceptable recognition rate was obtained. Toothbrushing varies greatly among people, and the positional relationship between the smartphone and the magnet differs depending on the environment. Therefore, it would be necessary to train the neural network for each individual, and it would be important to implement an easy-to-use user interface for training.

Acknowledgement. This work was supported by JSPS KAKENHI Number 16K00506.


References

1. Yuhas, D.: Three Critical Elements Sustain Motivation, Scientific American (2014). http://www.scientificamerican.com/article/three-critical-elements-sustain-motivation/
2. Foursquare (2017). https://foursquare.com/
3. Ingress (2017). https://www.ingress.com/
4. Nike+ Running (2017). http://www.nike.com/us/en_us/c/nike-plus
5. Studyplus, Studyplus SNS, in Japanese (2017). http://studyplus.jp/
6. Language-Quality-Game, Microsoft (2012). https://social.technet.microsoft.com/wiki/contents/articles/9299.language-quality-game.aspx
7. https://prtimes.jp/main/html/rd/p/000000003.000019240.html (2016)
8. GUM Play, Sunstar (2018). https://www.gumplay.jp/
9. Ichimura, S.: Introducing gamification to cleaning and housekeeping work. In: Yoshino, T., Yuizono, T., Zurita, G., Vassileva, J. (eds.) CollabTech 2017. LNCS, vol. 10397, pp. 182–190. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63088-5_16
10. Chainer: A flexible framework for neural networks (2018). https://chainer.org
11. Chainer Documents (2018). https://docs.chainer.org/en/stable/
12. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press Book, Cambridge (2018). https://www.deeplearningbook.me/
13. https://www.python.org/ (2018)

Improving Visibility and Reducing Resistance of Writers to Fusion of Handwritten and Type Characters

Mikako Sasaki(✉), Junki Saito, and Satoshi Nakamura

Meiji University, 4-21-1, Nakano-ku, Tokyo, Japan
[email protected]

Abstract. Most Japanese people feel happy to receive a handwritten message, but they often have resistance to writing a message by hand. One of the reasons for this is that they are shy about showing their handwriting to others. In this study, we consider a technique that fuses handwriting with typeface in order to reduce the resistance to handwriting and improve the impression of the message. Experimental results demonstrate the visibility and readability of the considered fusion technique and show that the resistance to sending handwritten messages fused with typeface can be decreased.

Keywords: Handwriting · Type characters · Fusion character

© Springer Nature Switzerland AG 2018
H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 185–199, 2018. https://doi.org/10.1007/978-3-319-98743-9_15

1 Introduction

Computers and smartphones make it easy to input typeface characters by means of a keyboard or flick operation. This has led to an inevitable decrease in handwritten notes and letters. However, according to a survey on handwriting by Zebra Corporation [1], 90% of Japanese people have very positive feelings about handwritten messages. In addition, according to a public opinion poll by the Japanese Agency for Cultural Affairs [2], almost half of the respondents tend to feel that they should handwrite greeting cards and letters, presumably because handwritten text contains the unique characteristics of the writer and the trouble of writing something by hand emphasizes the writer's sincerity.

On the other hand, most Japanese people actually feel resistance to handwriting a message themselves. Zebra Corporation [1] found that more than 80% of Japanese people consider themselves bad at handwriting. In general, they tend to feel it is troublesome to write by hand, and they are often ashamed of showing their handwritten text to others. In fact, according to the Zebra survey, more than half of Japanese people have a negative impression of their own handwriting.

As stated above, anyone can use computer fonts on a computer or smartphone to create a beautiful text message. Because a huge number of computer fonts is available today, people can choose a font that suits the situation. Universal design (UD) font is based on the concept of “a design that as many people as possible can use” [3]. Computer fonts have several advantages, including visibility (i.e., people can recognize the characters at a glance), readability (i.e., people can easily read the characters), and legibility (i.e., people are less prone to reading error and illusion) [4]. UD font is made in consideration of these three functions.

Handwriting has its own advantages in that people feel a personal touch and appreciate the trouble that was taken while writing, but the main problem is that people are resistant to showing their handwriting to others. With computer fonts, people can easily create beautiful messages, but the personal touch and the trouble taken are lost.

The objective of our work is to reduce the resistance and embarrassment that writers feel about their own handwriting and to promote feelings of warmth and joy in readers when they receive a message composed of a fusion of handwriting and typeface. To this end, we performed an experiment to evaluate the impression of a sentence on a message card written in characters that are a fusion of handwriting and UD font, generated by the method proposed by Saito et al. [5]. Specifically, we examined how readers and writers evaluated the visibility and readability, as well as the writer’s resistance to sending the message card to others, when the fusion characters are used on a message card.

The contributions of this paper are as follows.

• We demonstrate that the resistance of the writer to fusion characters is less than that to handwriting.
• We demonstrate that the fusion characters retain the characteristics and warmth that handwriting has.

2 Related Work

Many studies related to handwriting have been conducted. In terms of research on helping writers produce beautiful handwriting, Zitnick [6] proposed a technique that uses curvature to calculate how closely the strokes being written coincide with earlier ones and beautifies an individual user's handwriting by averaging the strokes whose degree of coincidence is high. Zhu and Jin [7] proposed a technique in which handwriting is brought closer to an exemplar character by applying a template of the exemplar to the handwriting, and in which a printed style of Chinese characters is generated from the sculptural handwriting style. A study by Kurihara et al. [8] is often cited as an important work on predicting handwriting based on ambient multimodal recognition. Specifically, they developed a “Speech Pen” that supports users taking handwritten lecture notes by combining speech recognition with online handwriting recognition.

As for research on changing the impression of handwriting by altering and embellishing it, Kambara et al. [9] proposed “Onomatopen”, which can draw textured lines or shapes based on onomatopoeia such as “zig-zag” and “tick-tock” provided by the user when writing by hand.


The studies above focused primarily on the handwriting itself, along with its embellishment. In the present work, we go one step further by demonstrating that the characteristic warmth and trouble of writing something by hand can also be felt in the fusion characters of the handwriting and UD font.

Research that focuses on the characteristics of typeface is also slightly different from ours. Lin et al. [10] conducted research on generating a font of several thousand or more Chinese characters and symbols; specifically, they composed Chinese characters from components extracted from the users’ handwriting. With this technique, it is possible to generate a Chinese font by handwriting 400 Chinese characters. Bernard et al. [11] examined the preferences of the older population when reading passages containing two serif and sans serif fonts at 12- and 14-point sizes. They found that 14-point fonts were read faster than 12-point fonts, were more legible, and were preferred by the participants. In a similar study, Cai et al. [12] measured the minimum visible font size for the most commonly used Chinese characters in the Ming, Kai, and Li styles and found Ming to be the most legible, followed by Kai and then Li. They also showed that both the character style and the number of strokes have a significant impact on legibility. Liu et al. [13] investigated the effects of font size, stroke width, and character complexity on the legibility of Chinese characters and found that font size and character complexity have a significant effect on legibility, whereas stroke width has little.

The studies above examined the font itself and the changes in reader impression caused by transforming the shape of the font. However, it is not clear what happens to the visibility and readability of a fusion character of handwriting and typeface, or what its characteristics are. For this reason, we investigate the fusion character and clarify its visibility characteristics and the resistance of the writer.

3 Building the Data Set

We performed an experiment to clarify how handwriting, typeface, and the fusion character used in simple sentences are evaluated by the writer and the reader. Specifically, we built a data set consisting of message cards in each of the three styles to determine the impressions.

First, we used Saito et al.’s method [5] to generate the fusion character. This method represents a type character as a numerical formula parameterized by “t”, obtained by Fourier series expansion, so that the character’s core and thickness can be changed. It then takes the weighted average of the type character’s numerical formula and the handwriting’s numerical formula. Figure 1 shows characters generated with this method at fusion ratios between handwriting and type character of 0.0 (handwriting), 0.5 (fusion character), and 1.0 (type character).

Prior to the experiment, we selected the phrase “Thank you”, which is commonly written in a variety of gratitude and farewell situations. The main reason we selected this phrase is that it is so frequently used in messages. Also, since it is a short phrase, the stroke order and stroke count do not vary much from person to person.
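The weighted-average step can be illustrated with a minimal sketch. It assumes that a character outline is a single closed curve sampled at equally spaced values of t and uses a plain discrete Fourier series. The function names and the circular test shapes are hypothetical and are not taken from Saito et al.'s implementation [5], which additionally handles the character's core and thickness.

```python
import cmath
import math


def fourier_coefficients(points, n_terms):
    """Return Fourier coefficients c_k, k = -n_terms..n_terms, of a closed curve.

    points: (x, y) samples taken at equally spaced values of the parameter t.
    Each sample is treated as the complex number x + iy.
    """
    n = len(points)
    samples = [complex(x, y) for x, y in points]
    return {
        k: sum(z * cmath.exp(-2j * math.pi * k * i / n)
               for i, z in enumerate(samples)) / n
        for k in range(-n_terms, n_terms + 1)
    }


def blend(handwriting, typeface, ratio):
    """Weighted average of two coefficient sets.

    ratio = 0.0 gives the handwriting, 1.0 the type character, 0.5 the fusion.
    """
    return {k: (1.0 - ratio) * handwriting[k] + ratio * typeface[k]
            for k in handwriting}


def evaluate(coefficients, t):
    """Evaluate the curve at parameter t in [0, 1) and return (x, y)."""
    z = sum(c * cmath.exp(2j * math.pi * k * t) for k, c in coefficients.items())
    return z.real, z.imag


def circle(radius, n=16):
    """Hypothetical stand-in for a character outline: a sampled circle."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]


hand = fourier_coefficients(circle(1.0), n_terms=3)
font = fourier_coefficients(circle(3.0), n_terms=3)
fusion = blend(hand, font, ratio=0.5)
x, y = evaluate(fusion, 0.0)  # point on the blended outline at t = 0
```

Because the Fourier series is linear in its coefficients, averaging the coefficients moves every point of the outline proportionally between the two source shapes, which is why a single ratio is enough to move from handwriting to type character.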


Fig. 1. Japanese character “あ” and English character “a” of the fusion ratio between handwriting and type character at intervals of 0.5 between 0.0 and 1.0.

Two fonts were selected for fusion with the handwriting: “BIZ UDP Mincho” and “BIZ UDP Gothic”, both of which are produced by Morisawa [14] (Fig. 2). We selected these fonts because Mincho and Gothic are among the most commonly used fonts. Each font was mathematized in advance using Saito et al.’s method [5].

Fig. 2. UD fonts used in the experiment

When building the data set, we asked 15 participants (four males, 11 females) to imagine that they were writing a message card to a close friend and to write the phrase “Thank you” in Japanese on a tablet computer. Their handwriting was mathematized using Nakamura et al.’s method [16]. The input device was a Surface Book (Microsoft Corporation), and the participants were permitted to rewrite characters stroke by stroke until they were satisfied with their own handwriting. Finally, we generated the fusion characters from the phrase (five characters) that each participant handwrote and from the two fonts. Then, on the message card [15] (see Fig. 3), we rendered the characters at fusion ratios between handwriting and type character of 0.0 (handwriting), 0.5 (fusion character), and 1.0 (type character) and saved each as an image. Figure 4 shows the message card written in the fusion character.

Fig. 3. Template of the message card that we used in the experiment.

Fig. 4. Message card written in the fusion character.


4 Experiment for the Writer

In this experiment, we examined whether the writers felt resistance when sending their close friends a message card written with the fusion character that combines their own handwriting and typeface. Two items were examined:

1. Which message card the writers wanted to send to their friends: the one written with their own handwriting, or the one created with the fusion character.
2. What impression the writers had for the message card written with their own handwriting, in typeface, or with the fusion character.

4.1 Experimental Procedure

The same 15 people who helped build the data set in the previous section took part as research participants. For each of the two fonts, we presented three message cards in random order: one written in the participant's own handwriting, one in typeface, and one with the fusion character. The participants examined the cards and ranked them according to which they would most want to send to their close friends (see Fig. 5). The experiment consisted of two trials in total: (1 type of sentence) × (2 types of fonts). We also had participants look at an image of Japanese characters (see Fig. 6) in a variety of fonts for three seconds between trials to ensure that the ranking was not affected by a comparison with the previous or next message card. The system used in this experiment (and the others) was constructed using PHP, JavaScript, and MySQL.

Fig. 5. Experimental system

Fig. 6. Image of Japanese characters in various fonts.


Next, we showed the participants their own handwriting, the fusion character, and the typeface one by one in random order, using one phrase and two types of font, and they evaluated their impression of each on a 7-point semantic differential scale (−3 to +3). We presented the participants with five pairs of adjectives selected from previous research [17–22]: “hard-to-read—easy-to-read”, “bad—beautiful”, and “indistinct—distinct” for visibility and readability, and “hesitant—confident” and “resistant—acceptable” for complexity of handwriting. We also had the participants look at an image of Japanese characters written in a variety of fonts for three seconds between each trial. The number of trials in this experiment was five: (2 types of the font) + (2 types of the fusion character) + (1 type of their own handwriting) (Fig. 7).
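As a rough illustration of how such a randomized trial list could be assembled, the sketch below enumerates the stimuli for the impression evaluation and shuffles them. The function name, the stimulus labels, and the seeded shuffle are assumptions for illustration; the experimental system itself was built with PHP, JavaScript, and MySQL, and its scheduling logic is not described.

```python
import random


def impression_trials(fonts, n_writers=1, seed=None):
    """Build a shuffled list of (character_type, font, writer) stimuli.

    Typeface cards: one per font. Fusion cards: one per font and writer.
    Handwriting cards: one per writer (no font involved).
    """
    trials = [("typeface", font, None) for font in fonts]
    trials += [("fusion", font, w) for font in fonts for w in range(n_writers)]
    trials += [("handwriting", None, w) for w in range(n_writers)]
    random.Random(seed).shuffle(trials)  # random presentation order
    return trials


writer_trials = impression_trials(["Mincho", "Gothic"], n_writers=1, seed=0)
print(len(writer_trials))  # 2 + 2 + 1 = 5 trials per writer
```

With `n_writers=15`, the same enumeration yields 2 + 2 × 15 + 15 = 47 stimuli, which is the trial count given for the reader experiment in Sect. 5.1.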

Fig. 7. Experimental system

4.2 Results

Table 1 shows the mean of the degree to which participants wanted to send the three types of message card (3 points = first place, 1 point = second place, and 0 points = third place in terms of the ranking) and the number of people who ranked each type of character highest.

Table 1. Mean of degree to which participants wanted to send a message card and the number of people that ranked a character type as highest.

          Mean                                        Number of people that ranked a character type as highest
Font      Handwriting   Fusion character   Typeface   Handwriting   Fusion character   Typeface
Mincho    0.93          2.73               0.33       2             13                 0
Gothic    0.47          2.47               1.13       2             11                 3
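The scoring rule behind Table 1 can be sketched as follows. The function names and the three example rankings are hypothetical and are not the study data; only the point values (3, 1, 0) come from the text.

```python
POINTS = {1: 3, 2: 1, 3: 0}  # first, second, third place


def mean_scores(rankings):
    """rankings: one dict per participant mapping card type -> rank (1, 2, or 3)."""
    totals = {}
    for ranking in rankings:
        for card, rank in ranking.items():
            totals[card] = totals.get(card, 0) + POINTS[rank]
    return {card: total / len(rankings) for card, total in totals.items()}


def first_place_counts(rankings):
    """Number of participants who ranked each card type first."""
    counts = {}
    for ranking in rankings:
        for card, rank in ranking.items():
            counts[card] = counts.get(card, 0) + (1 if rank == 1 else 0)
    return counts


# Hypothetical rankings from three participants (not the study data).
example = [
    {"handwriting": 2, "fusion": 1, "typeface": 3},
    {"handwriting": 1, "fusion": 2, "typeface": 3},
    {"handwriting": 3, "fusion": 1, "typeface": 2},
]
scores = mean_scores(example)
firsts = first_place_counts(example)
```

Because every complete ranking distributes 3 + 1 + 0 = 4 points, the three means in a row should sum to 4 up to rounding.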


Fig. 8. Mean of the impression evaluation for Mincho

As shown in the table, the mean of the degree to which participants wanted to send the message is highest for the fusion character for both the Mincho and Gothic fonts. The second-highest value is for the handwriting with Mincho and for the typeface with Gothic. In addition, 13 people ranked the fusion character in Mincho as the highest, and 11 people did so for the fusion character in Gothic. These results show that the fusion character is the most supported. Interestingly, none of the writers ranked the typeface message card in Mincho as the highest.

Figures 8 and 9 show the mean of the impression evaluation for Mincho and Gothic, respectively, by type of character. As shown in the figures, in terms of visibility and readability (that is, “hard-to-read—easy-to-read”, “bad—beautiful”, and “indistinct—distinct”), the mean of the evaluation for the fusion character is higher than that for the handwritten items in both Mincho and Gothic. The same holds for the complexity items (“hesitant—confident” and “resistant—acceptable”) in both fonts.

We performed a one-way analysis of variance to determine whether the type of character was a factor in the adjective pairs and found statistically significant differences in all items. The change of impression due to the difference of character type is particularly large for “bad—beautiful” in Mincho (F[2, 42] = 64.14, P < 0.01) and “bad—beautiful” in Gothic (F[2, 42] = 19.26, P < 0.01). This demonstrates that there is a significant difference in the beauty of characters among handwriting, the fusion character, and typeface.

4.3 Discussion

The results of this experiment showed that the writers were more inclined to send message cards to their close friends written in the fusion character than in their own handwriting, presumably because the resistance and shyness of the writers regarding their own handwriting were reduced.


Fig. 9. Mean of the impression evaluation for Gothic

In terms of the adjective pairs, the evaluations relating to visibility and readability are lower for the fusion character in Gothic than for the fusion character in Mincho. This is probably because the Gothic character is thicker than the Mincho character, and the gap between lines is narrower in Gothic than in Mincho. In the method we used to generate the fusion character (Saito et al.’s method [5]), the thickness of the fusion character depends on the thickness of the character in the font, so the evaluation of visibility and readability of the fusion character is lower for Gothic than for Mincho.

On the other hand, for the “resistant—acceptable” item, the mean of the evaluation for the fusion character is higher than that for the handwriting for both fonts. The shape of the character changes in the fusion character relative to the handwriting, but we assume that the writers did not feel more resistance to the fusion character than to the typeface because they were still able to find an element of their own handwriting in it.

In this experiment, we clarified that the writers preferred a message card to their close friends written with the fusion character over one written with their own handwriting or in typeface. We also found that the writers’ resistance to their own handwriting could be reduced because the fusion character improved the visibility and the readability compared to handwriting.
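The one-way analysis of variance reported in Sect. 4.2 can be sketched as follows. This is a textbook between-groups F statistic written in plain Python, not the authors' analysis script; the function name and the toy ratings are hypothetical. With three character types and 15 ratings per type, the degrees of freedom come out as (2, 42), which matches the F[2, 42] values reported for this experiment.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of rating lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy 7-point ratings (-3..+3) for one adjective pair; not the study data.
handwriting = [-1, 0, 1]
fusion = [0, 1, 2]
typeface = [1, 2, 3]
f_value, df_b, df_w = one_way_anova([handwriting, fusion, typeface])
```

The resulting F value would then be compared against the F distribution with the two degrees of freedom to obtain the P value.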

5 Experiment for the Reader

In this experiment, we investigated whether readers preferred the message card written in handwriting, the fusion character, or typeface. Two items were examined:

1. Which message card the readers wanted to receive from a close friend: the handwritten one, the fusion character one, or the typeface one.
2. What impression the readers had for sentences written with the writer’s own handwriting, the fusion character, or typeface.


We used the message cards created when we initially built the data set. Fifteen individuals (13 men, 2 women) participated in this experiment. They were different from the individuals who participated in the first experiment.

5.1 Experimental Procedure

We had the research participants rank which message card they would prefer to receive from a close friend: one written with the friend’s own handwriting, the fusion character, or typeface. We also had them look at an image of Japanese characters in a variety of different fonts for three seconds between each trial so that the ranking would not be affected by a comparison with the previous or next message card. The number of trials was 30: (1 kind of sentence) × (2 types of fonts) × (15 writers).

Next, we randomly presented the participants with cards written in the writer’s own handwriting, the fusion character, and typeface one by one and had them evaluate their impression on a 7-point semantic differential scale (−3 to +3) for each sentence. The pairs of adjectives used for visibility and readability were “hard-to-read—easy-to-read”, “bad—beautiful”, and “indistinct—distinct”, and those used for characteristics of handwriting were “impersonal—personal”, “simple—elaborate”, and “generic—unique”. As before, the participants looked at an image of Japanese characters in a variety of fonts for three seconds between each trial so that the evaluation would not be affected by a comparison with the previous or next message card. The number of trials in this experiment was 47: (2 types of the font) + (2 types of the fusion character) × (15 writers) + (15 writers’ own handwriting).

5.2 Results

Table 2 lists the mean of the degree to which participants wanted to receive the three types of message card (3 points = first place, 1 point = second place, and 0 points = third place in terms of the ranking).

Table 2. Mean of degree to which participants wanted to receive a message card.

Font      Handwriting   Fusion character   Typeface
Mincho    0.92          2.18               0.90
Gothic    0.88          2.22               0.91

As shown in the table, the mean of the degree to which participants wanted to receive the message is highest for the fusion character for both Mincho and Gothic. Moreover, there is little difference between the handwriting and typeface for both Mincho and Gothic. Figures 10 and 11 show the mean of the impression evaluation for Mincho and Gothic for all writers by type of character.


Fig. 10. Mean of the impression evaluation for Mincho

Fig. 11. Mean of the impression evaluation for Gothic

As shown in the figures, the mean of the evaluation for the fusion character is higher than for the handwriting on the visibility and readability items (“hard-to-read—easy-to-read”, “bad—beautiful”, and “indistinct—distinct”) for both Mincho and Gothic. In addition, for “impersonal—personal”, the mean of the evaluation for Mincho is the highest for the fusion character followed by the handwriting and then typeface, and the mean of the evaluation for Gothic is the highest for the fusion character followed by typeface and then the handwriting. For “simple—elaborate”, the mean of the evaluation for Mincho is the highest for the handwriting followed by the fusion character and then typeface, and the mean of the evaluation for Gothic is the highest for the fusion character followed by the handwriting and then typeface. For “generic—unique”, the mean of the evaluation for both fonts is the highest for the handwriting followed by the fusion character and then typeface.

We also performed a two-way analysis of variance with the type of character (handwriting, the fusion character, and typeface) and the 15 writers as factors. Results showed a statistically significant difference for “hard-to-read—easy-to-read” in Mincho (F[2, 630] = 748.33, P < 0.01), “indistinct—distinct” in Gothic (F[2, 630] = 106.86, P < 0.01), “simple—elaborate” in Mincho (F[2, 630] = 92.07, P < 0.01), “simple—elaborate” in Gothic (F[2, 630] = 68.47, P < 0.01), “generic—unique” in Mincho (F[2, 630] = 870.09, P < 0.01), and “generic—unique” in Gothic (F[2, 630] = 627.02, P < 0.01).

5.3 Discussion

The results above demonstrate that readers want to receive message cards from close friends written with the fusion character more than with handwriting or typeface. This would be because these cards led to greater feelings of joy in terms of visibility and readability. Moreover, as shown in the evaluation by adjective pairs in Figs. 10 and 11, the mean of the evaluation of the fusion character for both Mincho and Gothic is higher than that of the handwriting for “impersonal—personal”. This would be because the readers felt closer to the fusion character since it had a beautiful shape while retaining the elements of personal handwriting.

In addition, on the two items “simple—elaborate” and “generic—unique”, the handwriting was evaluated higher than the fusion character in both Mincho and Gothic. This result indicates that the fusion character is evaluated lower than handwriting regarding the uniqueness of one’s handwriting. Nevertheless, we assume that it is important that the characters on a letter are not only unique but also beautiful and easy for readers to read. Therefore, we think that a fusion character that is excellent in these points is better than handwriting.

Further, from the results of the two-way analysis of variance, we know there is a statistically significant difference for the type of character in terms of “simple—elaborate” and “generic—unique” for both Mincho and Gothic. Therefore, it seems that characteristics of handwriting are still clearly perceivable in the fusion characters with any combination of handwriting and type of font, regardless of the writer of the handwriting.

In addition, we compared the result of the first experiment with the result of the second experiment. We found that the readers always evaluated the message card written with the fusion character the most highly, regardless of the type of letter (the fusion character, handwriting, or typeface) the writer wanted to use for the message card. Therefore, regardless of the selection by the writers, it is clear that the readers wanted to receive the card written with the fusion character the most.
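The degrees of freedom in the F values of Sect. 5.2 can be checked with a short calculation. The sketch assumes a two-way design with equal replication in which each of the 15 readers contributes one rating per cell (character type × writer). This reading is an assumption that happens to reproduce F[2, 630]; the function name is hypothetical.

```python
def two_way_anova_df(levels_a, levels_b, replicates):
    """Degrees of freedom for a two-way ANOVA with equal replication per cell."""
    cells = levels_a * levels_b
    return {
        "factor_a": levels_a - 1,
        "factor_b": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        "error": cells * (replicates - 1),
    }


# 3 character types, 15 writers, 15 readers rating each cell (assumed).
df = two_way_anova_df(levels_a=3, levels_b=15, replicates=15)
print(df["factor_a"], df["error"])  # 2 630
```

Under this assumption the character-type effect is tested with 2 and 630 degrees of freedom, and the writer effect would be tested with 14 and 630.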

6 Conclusion and Future Work

In this paper, we focused on the use of a fusion character that combines handwriting with a UD font. We separated writers from readers and conducted experiments to determine how visibility and readability might differ between handwriting and typeface and to evaluate the resistance of people in terms of sending a message card using the fusion character.


First, we examined which message card (the one with their own handwriting, with the fusion character, or with typeface) the writers wanted to send to their close friends and what kind of impression the writers had for each type of card. The results showed that the writers were most likely to select the card with the fusion character as the one they wanted to send the most. Also, the results of the impression evaluation experiment showed that resistance and shyness related to one’s own handwriting could be reduced because the visibility and readability of the fusion character were improved compared to those of handwriting alone.

Next, we examined which message card (handwritten, with the fusion character, or with typeface) a reader wanted to receive and what kind of impression the readers had for each of the characters. The results showed that the readers were most likely to select the fusion character as the character of the message card they wanted to receive the most. Also, the results of the impression evaluation experiment showed that the evaluation for the fusion character is high in terms of visibility and readability (characteristics it shares with UD fonts), and is also high in terms of the characteristics of handwriting (warmth, care taken).

Only two fonts were used in these experiments, whereas countless fonts are available today. Therefore, in the future, we will investigate whether the fusion character loses its shape and whether it reflects the characteristics of the handwriting regardless of the font with which it is fused. Moreover, we used 0.5 as the fusion ratio in this study, but we intend to carry out surveys for other ratios as well.

Also, this experiment was limited to fusion characters in Japanese. In the future, we will consider the generation of fusion characters in other languages. For example, Chinese characters and Japanese Kanji have the same form, so we think fusion characters for Chinese would also show the characteristics we revealed in this study. However, it is still unknown whether fusion characters of the alphabet and other scripts show these characteristics, so this needs to be verified.

As future work, we will develop a method that generates the fusion character of handwriting and typeface for various fonts, and use it to create an app for writing a message card using the fusion character on smartphones, a system that can use the fusion character to display the lyrics in music videos, and a system that can use a person's own fusion characters for cartoon captions, text illustrations, and movie subtitles. The next step will be to develop a method that promotes understanding using fusion characters when reading a difficult text such as a technical book.

Acknowledgments. This work was supported in part by JST, JST ACCEL Grant Number JPMJAC1602, Japan, and Meiji University Priority Research A.


References

1. Zebra Corporation: Attitude Survey on the Handwriting. http://www.zebra.co.jp/press/news/2014/0918.html. Accessed 24 Mar 2018
2. The Agency for Cultural Affairs: The Public Opinion Poll on Japanese. http://www.bunka.go.jp/tokei_hakusho_shuppan/tokeichosa/kokugo_yoronchosa/pdf/h24_chosa_kekka.pdf. Accessed 24 Mar 2018
3. Font Garage: What’s UD Font. http://font.designers-garage.jp/ud/. Accessed 24 Mar 2018
4. Iwata Corporation: Iwata UD Font. http://www.iwatafont.co.jp/ud/index.html. Accessed 24 Mar 2018
5. Saito, J., Nakamura, S., Suzuki, M.: A method to increase reader’s empathy by merging their handwritten characters and text in speech balloon in digital comics. In: The Japanese Society for Artificial Intelligence, JSAI, Nagoya (2017). (in Japanese)
6. Zitnick, C.L.: Handwriting beautification using token means. In: ACM Special Interest Group on Computer Graphics and Interactive Techniques, vol. 32. SIGGRAPH, Anaheim (2013)
7. Zhu, X., Jin, L.: Calligraphic beautification of handwritten Chinese characters: a patternized approach to handwriting transfiguration. Semant. Sch. (2008)
8. Kurihara, K., Goto, M., Ogata, J., Igarashi, T.: Speech pen: predictive handwriting based on ambient multimodal recognition. In: ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 851–860. CHI, Montreal (2006)
9. Kambara, K., Tsukada, K.: Onomatopen: painting using onomatopoeia. In: 9th International Conference on Entertainment Computing, pp. 43–54. ICEC, Seoul (2010)
10. Lin, J.-W., Hong, C.-Y., Chang, R.-I., Wang, Y.-C., Lin, S.-Y., Ho, J.-M.: Complete font generation of Chinese characters in personal handwriting style. In: 34th Computing and Communications Conference. IPCCC, Nanjing (2015)
11. Bernard, M., Liao, C.H., Mills, M.: The effects of font type and size on the legibility and reading time of online text by older adults. In: ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 175–176. CHI, Seattle (2001)
12. Cai, D., Chi, C.-F., You, M.: The legibility threshold of Chinese characters in three-type styles. Int. J. Ind. Ergon. 27(1), 9–17 (2001)
13. Liu, N., Yu, R., Zhang, Y.: Effects of font size, stroke width, and character complexity on the legibility of Chinese characters. Hum. Fact. Ergon. Manuf. Serv. Ind. 26(3), 381–392 (2016)
14. Morisawa: MORISAWA BIZ+. http://bizplus.morisawa.co.jp/. Accessed 24 Mar 2018
15. Brother at your side. https://online.brother.co.jp/ot/dl/Contents/greeting/birthday/birthday_013/. Accessed 24 Mar 2018
16. Nakamura, S., Suzuki, M.M., Komatsu, T.: Average handwritten hiragana-characters are beautiful. Inf. Process. Soc. Jpn 57(12), 2599–2609 (2016). (in Japanese)
17. Mukai, S.: Analysis of common cognition of impression among Japanese fonts and tea beverage packaging. In: 5th Kansei Engineering and Emotion Research, pp. 1509–1519. KEER, Linköping (2014)
18. Mukai, S., Hibino, H., Koyama, S.: Differences in ratings of impressions between Japanese calligraphic styles and a Japanese font. Int. J. Affect. Eng. 16(2), 53–56 (2017)
19. Henderson, P.W., Giese, J.L., Cote, J.A.: Impression management using typeface design. J. Mark. 68(4), 60–72 (2004)
20. Miyoshi, M., Shimoshio, Y., Koga, H., Uchimura, K.: On evaluation of similarity between visual impressions of handwritten character using Kansei information. Inst. Image Inf. Telev. Eng. Proc. 24(51), 1–8 (2000). (in Japanese)
21. Inoue, M., Kobayashi, T.: The research domain and scale construction of adjective-pairs in a semantic differential method in Japan. Jpn. Assoc. Educ. Psychol. 33(3), 253–260 (1985). (in Japanese)
22. Dalton, P., Maute, C., Oshida, A., Hikichi, S., Izumi, Y.: The use of semantic differential scaling to define the multidimensional representation of odors. J. Sens. Stud. 23(4), 485–497 (2008)

PopObject: A Robotic Screen for Embodying Video-Mediated Object Presentations

Kana Kushida(✉) and Hideyuki Nakanishi

Department of Adaptive Machine Systems, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
{kana.kushida,nakanishi}@ams.eng.osaka-u.ac.jp

Abstract. Some studies have been conducted on 2.5D display surfaces, which display a two-dimensional video on a curved or deformable display surface. In this study, we developed a telepresence system that makes a specific part of a remote video protrude by means of a 2.5D display surface. The system has a stretchable projection screen and a push-out mechanism. The screen is pushed out from behind and expresses protrusion. This protrusion aims to express the depth information of the remote video. We expected that it would enhance the remote conversation partner’s presence. We assumed a conversation in which an object is shown and explained in videoconferencing and conducted experiments to confirm the effect of the developed system as a telepresence system. The protrusion was synchronized with the movement of the objects on the projected video. The results of the experiment suggested that the protrusion on the screen surface provided by this system strengthens the presence of the object and the remote person.

Keywords: Telepresence system · Social telepresence · Video-mediated communication · Elastic display

© Springer Nature Switzerland AG 2018 H. Egi et al. (Eds.): CollabTech 2018, LNCS 11000, pp. 200–212, 2018. https://doi.org/10.1007/978-3-319-98743-9_16

1 Introduction

Telepresence is a technology that allows a person to feel as if they were present at a place other than their true location [1–5]. One popular application of telepresence is the video conferencing system. However, most such systems are designed for flat surfaces, on which the perception of depth is lost. This decreases the presence of a remote person [6]. Some studies have tried to give a two-dimensional video a spatial three-dimensional effect by using a curved or deformable display surface. For example, “Livemask” is a surrogate system with a face-shaped screen [7], and “Interactive Spatial Copy Wall” is a system that represents the three-dimensional shape of a remote person with hundreds of movable pipes [8]. Furthermore, shape change is increasingly used in physical user interfaces, both as input and output [9–12].

We focused on the idea of representing a remote person with a flexible and deformable surface. We applied an elastic display to a telepresence system as an output interface that represents depth information about the video. Elastic displays have been proposed as a new-generation input interface [13–16]. Such a display is realized with a stretch cloth and offers new ways to interact with multi-dimensional data by using the deformation of the surface [17]. By applying an elastic display, we can build a deformable screen that has a smooth surface and can express various shapes. Additionally, we attempted to verify the effectiveness of this simplified 3D surface for a telepresence system. We therefore conducted experiments to confirm the effect of the developed system as a telepresence system.

Fig. 1. Snapshot of our deformable screen

2 System

A snapshot of our deformable screen is shown in Fig. 1, and the configuration of the developed system is shown in Fig. 2. The system uses its deformable surface to make an object in the projected video appear to protrude. The projection screen is made of a stretch cloth and deforms flexibly when pushed. An extruder is set behind the screen. When the object in the projected video moves forward, the extruder also moves forward and pushes the screen toward the viewer. The screen deforms along the shape of the extruder, which adds a perception of depth and a spatial three-dimensional effect to the video. The shape of the extruder imitates that of the object. Figure 3 shows examples of objects and their corresponding extruders. Linear positioning tables set behind the screen move the extruder, and the location of the extruder is synchronized with that of the object in the projected video.
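The synchronization between the object in the video and the extruder can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a depth sensor reports the distance of the object from the camera in millimetres, and every name, the gain, and the travel limits (`TableLimits`, `extruder_push_mm`, `extruder_lateral_mm`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TableLimits:
    """Travel range of one linear-table axis in millimetres (hypothetical values)."""
    min_mm: float
    max_mm: float

    def clamp(self, value_mm: float) -> float:
        # Keep the commanded position inside the mechanical travel range.
        return max(self.min_mm, min(self.max_mm, value_mm))


def extruder_push_mm(object_depth_mm: float, rest_depth_mm: float,
                     limits: TableLimits, gain: float = 1.0) -> float:
    """Map the forward movement of the object to a push-out distance.

    Assumption: depth is the camera-to-object distance, so the object has
    moved forward when its depth is smaller than the resting depth.
    """
    forward_mm = rest_depth_mm - object_depth_mm
    return limits.clamp(gain * forward_mm)


def extruder_lateral_mm(object_x_px: float, frame_width_px: float,
                        screen_width_mm: float, limits: TableLimits) -> float:
    """Map the object's horizontal pixel position to a lateral table position."""
    fraction = object_x_px / frame_width_px
    return limits.clamp(fraction * screen_width_mm)
```

Clamping to the travel range is a design choice of the sketch: an object held closer to the camera than the screen can express leaves the extruder at its maximum push-out instead of producing an out-of-range command.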

Fig. 2. Mechanism of our deformable screen. Labeled parts: linear motion table device; extruder; projector, which projects the remote video on the flexible screen; flexible screen, which is deformed by the extruder and expresses a three-dimensional effect.

Fig. 3. The object (left) and the extruder corresponding to the object (right): (a) ball, (b) stuffed animal

3 Experiment 1

Experiment 1 was conducted first, as a preliminary experiment, to evaluate the quality of the developed system.

3.1 Conditions

In experiment 1, we compared the following two conditions:

• Flat screen condition: The video of the remote explainer was projected on the flat screen (Fig. 4(a)).
• Deformable screen condition: The video of the remote explainer was projected on the screen, and the screen was deformed in synchronization with the video (Fig. 4(b)).

Fig. 4. Conditions of the experiment 1: (a) flat screen condition, (b) deformable screen condition

3.2 Setup

In this paper, we focus on the effect of the deformable screen on social telepresence, so we developed a one-way system. Figure 5 shows the setup of the experiment. A vertical display monitor situated along one side of a desk provides a nearly identical image of the remote side.

3.3 Task

As the task, we set up a situation in which the remote person gives an explanation while showing a ball or a stuffed animal. In the ball task, the experimenter held out the ball and had a simple conversation with the subject. In the stuffed animal task, the experimenter held out the stuffed animal. In all conditions, the presentation began and ended with a greeting. To keep the experiment controlled, we kept the conversation time the same.

3.4 Questionnaire

We administered a questionnaire after the experiment. It contained several statements and asked how well each statement matched the participant's impression. The five statements are shown in Fig. 6. The subjects answered the questionnaire after they had experienced both conditions. Q1, Q2, and Q3 check the quality of the presentation. Q4 corresponds to the presence of the object, and Q5 corresponds to the presence of the remote person. All statements were rated on a 7-point Likert scale, where 1 = strongly disagree, 4 = neutral, and 7 = strongly agree.

Fig. 5. Setup of the experiment 1: (a) local side, (b) remote side. Labels: subject; explainer; object; web camera and Kinect.

3.5 Results and Discussion

We compared the two conditions in a within-subjects experiment. Eight subjects (seven males and one female) participated in the ball task, and eight subjects (six males and two females) participated in the stuffed animal task. The participants were undergraduate students whose ages ranged from 18 to 24 years. They

Fig. 6. Results of the experiment 1. Panels: Ball; Stuffed animal. Horizontal axis: rating from 1 to 7. Legend: flat screen condition; deformable screen condition. Items in each panel: Q1. The video was sufficiently clear. Q2. The audio was sufficiently clear. Q3. The presentation was intelligible. Q4. I felt as if I were viewing the stuffed animal in the same room. Q5. I felt as if I were viewing the explainer in the same room. Significance markers: * in the Ball panel, ** in the Stuffed animal panel.

** p
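A within-subjects comparison of 7-point ratings, such as the one summarized in Fig. 6, can be sketched as a paired test. The statistical test used by the authors is not stated in this part of the text, so the exact sign-flip permutation test below is an assumption chosen only for illustration, and the function name `paired_sign_flip_p` is hypothetical.

```python
from itertools import product


def paired_sign_flip_p(flat, deformable):
    """Two-sided exact sign-flip permutation test for paired ratings.

    Assumption (not from the paper): under the null hypothesis, each
    participant's rating difference is equally likely to be positive or
    negative, so every pattern of sign flips is equally probable.
    """
    if len(flat) != len(deformable):
        raise ValueError("each participant needs one rating per condition")
    diffs = [d - f for f, d in zip(flat, deformable)]
    observed = abs(sum(diffs))
    at_least_as_extreme = 0
    total = 0
    # Enumerate all 2**n sign patterns; feasible for small groups.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / total
```

With eight participants per task, the enumeration covers 2^8 = 256 sign patterns, so the smallest possible two-sided p-value is 2/256.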
