Methods and Applications for Modeling and Simulation of Complex Systems

This volume constitutes the proceedings of the 18th Asia Simulation Conference, AsiaSim 2018, held in Kyoto, Japan, in October 2018. The 45 revised full papers presented in this volume were carefully reviewed and selected from 90 submissions. The papers are organized in topical sections on modeling and simulation technology; soft computing and machine learning; high performance computing and cloud computing; simulation technology for industry; simulation technology for intelligent society; simulation of instrumentation and control application; computational mathematics and computational science; flow simulation; and visualization and computer vision to support simulation.


Liang Li Kyoko Hasegawa Satoshi Tanaka (Eds.)

Communications in Computer and Information Science


Methods and Applications for Modeling and Simulation of Complex Systems 18th Asia Simulation Conference, AsiaSim 2018 Kyoto, Japan, October 27–29, 2018 Proceedings


Communications in Computer and Information Science
Commenced Publication in 2007

Founding and Former Series Editors:
Phoebe Chen, Alfredo Cuzzocrea, Xiaoyong Du, Orhun Kara, Ting Liu, Dominik Ślęzak, and Xiaokang Yang

Editorial Board
Simone Diniz Junqueira Barbosa, Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Rio de Janeiro, Brazil
Joaquim Filipe, Polytechnic Institute of Setúbal, Setúbal, Portugal
Igor Kotenko, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, St. Petersburg, Russia
Krishna M. Sivalingam, Indian Institute of Technology Madras, Chennai, India
Takashi Washio, Osaka University, Osaka, Japan
Junsong Yuan, University at Buffalo, The State University of New York, Buffalo, USA
Lizhu Zhou, Tsinghua University, Beijing, China


More information about this series at



Editors Liang Li Ritsumeikan University Kusatsu, Shiga, Japan

Satoshi Tanaka Ritsumeikan University Kusatsu, Shiga, Japan

Kyoko Hasegawa Ritsumeikan University Kusatsu, Shiga, Japan

ISSN 1865-0929 ISSN 1865-0937 (electronic)
Communications in Computer and Information Science
ISBN 978-981-13-2852-7 ISBN 978-981-13-2853-4 (eBook)
Library of Congress Control Number: 2018957409

© Springer Nature Singapore Pte Ltd. 2018

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore


The Asia Simulation Conference (AsiaSim) is the international conference with the longest history in modeling and simulation in Asia. It is an annual conference organized by the Federation of Asia Simulation Societies (ASIASIM), whose current member societies are: CSF (China Simulation Federation), JSST (Japan Society for Simulation Technology), KSS (Korea Society for Simulation), SSAGsg (Society of Simulation and Gaming of Singapore), and MSS (Malaysian Simulation Society). The conference provides a forum for scientists and engineers from around the world to promote the advancement of modeling and simulation in academic and industrial communities. AsiaSim 2018 was held in Kyoto, Japan. We received about 100 full papers, with submissions from China, Japan, Korea, Singapore, Malaysia, Colombia, and Italy. After an intensive review process by the internationally assembled Program Committee, in which each paper was reviewed by multiple reviewers, we finally accepted 45 full papers. Due to the high quality of the submitted papers, the paper selection was very difficult and we were forced to reject many interesting papers. The accepted papers are now consolidated in this volume of the Communications in Computer and Information Science (CCIS) series of Springer, and are divided into relevant topics. The diversity of topics is a unique and important feature of the AsiaSim conference. Giving researchers of different fields opportunities to get together and exchange ideas has inspired many interesting research activities. We hope the publication of this volume will further promote this nice feature of the AsiaSim conference. We thank the members of the Program Committee for their valuable effort in reviewing the submitted papers. We also thank the Organizing Committee, which supported our editorial work in various aspects.
We also express our special thanks to the College of Information Science and Engineering and ICT Medical Healthcare Center of Ritsumeikan University, the co-sponsor of the conference. Finally, we thank all the authors and participants of AsiaSim 2018. October 2018

Satoshi Tanaka Kyoko Hasegawa Liang Li

AsiaSim 2018 Organization

General Chairs
Kazuo Furuta (University of Tokyo, JSST President, Japan)
Satoshi Tanaka (Ritsumeikan University, Japan)

Steering Chair
Kyoko Hasegawa (Ritsumeikan University, Japan)

Program Chairs
Liang Li (Ritsumeikan University, Japan)
Naohisa Sakamoto (Kobe University, Japan)

Technical Co-sponsors
China Simulation Federation (CSF)
Japanese Society for Simulation Technology (JSST)
Korea Society for Simulation (KSS)
Society of Simulation and Gaming of Singapore (SSAGsg)
Malaysian Simulation Society (MSS)
Society for Modeling and Simulation International (SCS)

Co-organizers
Japanese Society for Simulation Technology (JSST)
Federation of Asia Simulation Societies (ASIASIM)

Co-sponsors
College of Information Science and Engineering, Ritsumeikan University
ICT Medical Healthcare Center, Ritsumeikan University

International Program Committee (ASIASIM)

ASIASIM President
Gary Tan, SSAGsg President (School of Computing, National University of Singapore)



ASIASIM Board Members
Bo Hu Li (CSF board member)
Zhang Lin (CSF board member, Beihang University, China)
Xiao Song (CSF board member, Beihang University, China)
Satoshi Tanaka (JSST board member, Ritsumeikan University, Japan)
Kyoko Hasegawa (JSST board member, Ritsumeikan University, Japan)
Liang Li (JSST board member, Ritsumeikan University, Japan)
Yun-Bae Kim (KSS board member, Sungkyun Kwan University, South Korea)
Kang Sun Lee (KSS board member, Myongji University, South Korea)
Doo-Kwon Baik (KSS board member, Korea University, South Korea)
Gary Tan (SSAGsg board member, NUS, Singapore)
Teo Yong Meng (SSAGsg board member, NUS, Singapore)
Rubiyah Yusof (Universiti Teknologi Malaysia)
Yahaya Md.Sam (Universiti Teknologi Malaysia)
Axel Lehmann (Honorary member of ASIASIM, Universität der Bundeswehr München, Germany)

Program Committee
Satoshi Tanaka (Ritsumeikan University, Japan)
Liang Li (Ritsumeikan University, Japan)
Kyoko Hasegawa (Ritsumeikan University, Japan)
Hiroshi Tamura (Chuo University, Japan)
Norifumi Yamada (Fukui University, Japan)
Akinori Kimura (Ashikaga Institute of Technology, Japan)
Hiroaki Nakamura (National Institute for Fusion Science)
Naohisa Sakamoto (Kobe University, Japan)
Yoshiyuki Miyamoto (AIST, Japan)
Yifa Tang (Chinese Academy of Sciences, China)
Taku Itoh (Nihon University, Japan)
Shigeru Shimamoto (Waseda University, Japan)
Soo-Hyun Park (Kookmin University, South Korea)
Shin Muroya (Matsumoto University, Japan)
Kazuo Furuta (The University of Tokyo, Japan)
Masami Iwase (Tokyo Denki University, Japan)
Katsuhisa Ozaki (Shibaura Institute of Technology, Japan)
Shafishuhaza Sahlan (Universiti Teknologi Malaysia)
Fumiaki Araki (JAMSTEC, Japan)
Yosuke Onoue (Nihon University, Japan)
Susumu Nakata (Ritsumeikan University, Japan)


Zhongkui Wang (Ritsumeikan University, Japan)
Katsumi Konishi (Hosei University, Japan)
Gary Tan (National University of Singapore)
Muhammad Shalihin Othman (National University of Singapore)
Chengxin Wang (National University of Singapore)
Kouta Sekine (Toyo University, Japan)
Sicheng Liu (Beihang University, China)
Rui Xu (Dalian University of Technology, China)
Xinchen Ye (Dalian University of Technology, China)
Kazuaki Tanaka (Waseda University, Japan)
Fei Wang (Beihang University, China)
Taro Kanno (The University of Tokyo, Japan)
Tomonori Yamada (The University of Tokyo, Japan)
Kazuya Shibata (The University of Tokyo, Japan)
Herman Wahid (Universiti Teknologi Malaysia)
Nurul Adilla Mohd Subha (Universiti Teknologi Malaysia)
Zaharuddin Mohamed (Universiti Teknologi Malaysia)
Hiroaki Natsukawa (The University of Tokyo, Japan)
Malcolm Low (Singapore Institute of Technology, Singapore)



Modeling and Simulation Technology

A Novel Method to Build a Factor Space for Model Validation . . . . . . . . . . Ke Fang, Ming Yang, and Yuchen Zhou


Simulation Credibility Evaluation Based on Multi-source Data Fusion . . . . . . Yuchen Zhou, Ke Fang, Ping Ma, and Ming Yang


A Method of Parameter Calibration with Hybrid Uncertainty . . . . . . . . . . . . Liu Bo, Shang XiaoBing, Wang Songyan, and Chao Tao


Reinforcement Learning Testbed for Power-Consumption Optimization . . . . . Takao Moriyama, Giovanni De Magistris, Michiaki Tatsubori, Tu-Hoa Pham, Asim Munawar, and Ryuki Tachibana


A DEVS Visual Front-End Interface for Model Reusability and Maintainability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jiyong Yang, Moongi Seok, San Jeong, and Changbeom Choi

HLA-Based Federation Development Framework Supporting Model Reuse. . . . Hang Ji, Xiang Zhai, Xiao Song, Xiaoliang Liu, Yazhou Liang, and Zhengxuan Jia


Soft Computing and Machine Learning

Automatic Performance Simulation for Microservice Based Applications . . . . Yao Sun, Lun Meng, Peng Liu, Yan Zhang, and Haopeng Chan


Predictive Simulation of Public Transportation Using Deep Learning. . . . . . . Muhammad Shalihin Bin Othman and Gary Tan


An Ensemble Modeling for Thermal Error of CNC Machine Tools . . . . . . . . Xuemei Jiang, PanPan Zhu, Ping Lou, Xiaomei Zhang, and Quan Liu


Gait Classification and Identity Authentication Using CNN . . . . . . . . . . . . . Wei Yuan and Linxuan Zhang


Deep Dissimilarity Measure for Trajectory Analysis . . . . . . . . . . . . . . . . . . Reza Arfa, Rubiyah Yusof, and Parvaneh Shabanzadeh




High Performance Computing and Cloud Computing

Performance Comparison of Eulerian Kinetic Vlasov Code Between Xeon Phi KNL and Xeon Broadwell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Takayuki Umeda and Keiichiro Fukazawa

Heterogeneous Scalable Multi-languages Optimization via Simulation . . . . . . Gennaro Cordasco, Matteo D’Auria, Carmine Spagnuolo, and Vittorio Scarano

Smart Simulation Cloud (Simulation Cloud 2.0)—The Newly Development of Simulation Cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bohu Li, Guoqiang Shi, Tingyu Lin, Yingxi Zhang, Xudong Chai, Lin Zhang, Duzheng Qing, Liqin Guo, Chi Xing, Yingying Xiao, Zhengxuan Jia, Xiao Song, and Rong Dai

A Semantic Composition Framework for Simulation Model Service . . . . . . . Tian Bai, Lin Zhang, Fei Wang, Tingyu Lin, and Yingying Xiao




Simulation Technology for Industry

Dynamic Optimization of Two-Coil Power-Transfer System Using L-Section Matching Network for Magnetically Coupled Intrabody Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kenichi Ito

Demand and Supply Model for the Natural Gas Supply Chain in Colombia . . . Mauricio Becerra Fernández, Elsa Cristina González La Rotta, Federico Cosenz, and Isaac Dyner Rezonzew

Deep-Learning-Based Storage-Allocation Approach to Improve the AMHS Throughput Capacity in a Semiconductor Fabrication Facility . . . . . . . . . . . . Haejoong Kim and Dae-Eun Lim

Research on the Cooperative Behavior in Cloud Manufacturing . . . . . . . . . . Ping Lou, Cui Zhu, Xiaomei Zhang, Xuemei Jiang, and Zhengying Li

Particle in Cell Simulation to Study the Charging and Evolution of Wake Structure of LEO Spacecraft . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nizam Ahmad, Hideyuki Usui, and Yohei Miyake




Simulation Technology for Intelligent Society

Wise-Use of Sediment for River Restoration: Numerical Approach via HJBQVI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hidekazu Yoshioka, Yuta Yaegashi, Yumi Yoshioka, Kunihiko Hamagami, and Masayuki Fujihara



Calculation of Extreme Precipitation Threshold by Percentile Method Based on Box-Cox Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chi Zhang, Pu-wen Lei, and Koji Koyamada

OpenPTDS Dataset: Pedestrian Trajectories in Crowded Scenarios . . . . . . . . Xiao Song, Jinghan Sun, Jing Liu, Kai Chen, and Hongnan Xie



Description and Analysis of Cognitive Processes in Ground Control Using a Mutual Belief-Based Team Cognitive Model . . . . . . . . . . . . . . . . . Sakiko Ogawa, Taro Kanno, and Kazuo Furuta


A Credibility Assessment Method for Training Simulations from the View of Training Effectiveness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shenglin Lin, Wei Li, Shuai Niu, Ping Ma, and Ming Yang


Simulation of Instrumentation and Control Application

Digital Twin-Based Energy Modeling of Industrial Robots . . . . . . . . . . . . . . Ke Yan, Wenjun Xu, Bitao Yao, Zude Zhou, and Duc Truong Pham


Dyna-Q Algorithm for Path Planning of Quadrotor UAVs . . . . . . . . . . . . . . Xin Huo, Tianze Zhang, Yuzhu Wang, and Weizhen Liu


Boarding Stations Inferring Based on Bus GPS and IC Data . . . . . . . . . . . . Xiang Yu, Fengjing Shao, Rencheng Sun, and Yi Sui


Acoustic Properties of Resonators Using Deployable Cylinders. . . . . . . . . . . Sachiko Ishida and Ryo Matsuura


Iterative Unbiased Conversion Measurement Kalman Filter with Interactive Multi-model Algorithm for Target Tracking . . . . . . . . . . . . . . . . . . . . . . . . Da Li, Xiangyu Zou, Ping Lou, Ruifang Li, and Qin Wei


Computational Mathematics and Computational Science

On Convergence Speed of Parallel Variants of BiCGSTAB for Solving Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kuniyoshi Abe

Study on Chaotic Cipher with Robustness and Its Characteristics . . . . . . . . . Takashi Arai, Yuta Kase, and Hiroyuki Kamata

A Stochastic Impulse Control Model for Population Management of Fish-Eating Bird Phalacrocorax Carbo and Its Numerical Computation . . . Yuta Yaegashi, Hidekazu Yoshioka, Koichi Unami, and Masayuki Fujihara





A Dialect of Modern Fortran for Computer Simulations . . . . . . . . . . . . . . . . Shin’Ya Hosoyamada and Akira Kageyama


Flow Simulation

Performance Comparison of the Three Numerical Methods to Discretize the Local Inertial Equation for Stable Shallow Water Computation . . . . . . . . Tomohiro Tanaka, Hidekazu Yoshioka, Sokly Siev, Hideto Fujii, Ly Sarann, and Chihiro Yoshimura

Development of the DRowning hUman Model (DRUM) Toward Evaluation of Performance of Lifejackets in Tsunamis . . . . . . . . . . . . . . . . . Daiki Ajima, Tatsuto Araki, and Takashi Nakamura

Illumination Recovery for Realistic Fluid Re-simulation. . . . . . . . . . . . . . . . Hongyan Quan, Zilong Song, Xinquan Zhou, Shishan Xue, and Changbo Wang

Ocean Analysis by Tsunami Simulation of the Nankai Trough Massive Earthquake . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yuto Sakae, Ikuya Morimoto, Takuya Ozaki, Ryo Kurimoto, Liang Li, Kyoko Hasegawa, Satoshi Nakada, and Satoshi Tanaka

Improving Traffic Flow at a Highway Tollgate with ARENA: Focusing on the Seoul Tollgate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Seung-Min Noh, Ho-Seok Kang, and Seong-Yong Jang





Visualization and Computer Vision to Support Simulation

Pixel Convolutional Networks for Skeleton-Based Human Action Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhichao Chang, Jiangyun Wang, and Liang Han

Feature-Highlighting Transparent Visualization of Laser-Scanned Point Clouds Based on Curvature-Dependent Poisson Disk Sampling. . . . . . . . . . . Yukihiro Noda, Shu Yanai, Liang Li, Kyoko Hasegawa, Atsushi Okamoto, Hiroshi Yamaguchi, and Satoshi Tanaka

Image-Based 3D Shape Generation Used for 3D Printing . . . . . . . . . . . . . . . Zemin Li, Lin Zhang, Yaqiang Sun, Lei Ren, and Yuanjun Laili

A Memory Efficient Parallel Particle-Based Volume Rendering for Large-Scale Distributed Unstructured Volume Datasets in HPC Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yoshiaki Yamaoka, Kengo Hayashi, Naohisa Sakamoto, and Jorji Nonaka






A Transfer Entropy Based Visual Analytics System for Identifying Causality of Critical Hardware Failures Case Study: CPU Failures in the K Computer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kazuki Koiso, Naohisa Sakamoto, Jorji Nonaka, and Fumiyoshi Shoji



Modeling the Spread of Epidemic Diseases on ElasticStack-Based Simulation Output Analysis Environment . . . . . . . . . . . . . . . . . . . . . . . . . . Kangsun Lee and Sungwoo Hwangbo


Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Modeling and Simulation Technology

A Novel Method to Build a Factor Space for Model Validation

Ke Fang(&), Ming Yang, and Yuchen Zhou

Control and Simulation Center, Harbin Institute of Technology, Harbin 150001, China
{fangke,myang}, [email protected]

Abstract. Factor space is an indispensable part of model validation. In order to provide an advantageous method to build a factor space for model validation, this paper states the challenging problems of the factor space and proposes a mathematical model of it. Based on this model, the paper further provides the graphic illustration, the factor decomposition, the credibility aggregation and the model defect tracing of the factor space, which together constitute a novel method to build a factor space for model validation. Finally, the paper provides a case study of an electromagnetic rail gun model validation to explain the usage of the method.

Keywords: Model validation · Factor space · Credibility aggregation · Model defect tracing

1 Introduction

A simulation model is a complex object, which usually has multiple inputs/outputs and sophisticated behaviors. The credibility of a simulation model is influenced by many indicators (factors) related to the nature and outputs of the model. It is necessary to build a factor space [1, 2] to describe these factors and their relationship, and further aggregate the total credibility of the model through the factors. The traditional way of building a factor space is to use an AHP (Analytic Hierarchy Process) tree [3] and a weighted average function to aggregate the grand credibility from the leaf nodes. However, the relation between factors is not always linear, and the influence of a factor is not always transferred through the layers one by one. Model validation needs a better method to reveal the factors and their influence on the credibility of the model. Many generic methodologies develop the requirements, planning, architecture, process and recommended techniques to perform verification and validation for M&S (Modeling and Simulation) [4, 5]. Unfortunately, these methodologies all suggest a hierarchical tree to present and organize the credibility indicators of model validation. Even project exercises and VV&A software tools adopt the traditional hierarchy to perform simulation validation [6]. This cannot satisfy the practical requirements of model validation, and often leads to a credibility assessment result that lacks objectivity and confidence. The remainder of this paper is organized as follows. We state the three challenging problems of the factor space which are not resolved well by the hierarchical tree in
© Springer Nature Singapore Pte Ltd. 2018 L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 3–17, 2018.



Sect. 2. Section 3 proposes the network method for building a factor space of model validation. Section 4 presents a case study of the electromagnetic rail gun model validation to explain the usage of the network method. Concluding remarks are given in Sect. 5.

2 Challenging Problems

The basic function of the factor space in model validation is to determine the total credibility of the model. In order to achieve this goal, the factor space needs to reveal the credibility indicators and their relationship, and use them to aggregate the total credibility. There are some challenging problems in this function of the factor space, especially the factor decomposition, the credibility aggregation and the model defect tracing.

2.1 Factor Decomposition

Factor decomposition is the driving force of factor space building. Naturally, in model validation the root factor should be the total credibility of the model. By making decompositions downwards layer by layer, we can reveal all the factors which influence the model credibility at different levels. The relationship between factors depends on the decomposition that is made, and this relationship may exist between any two factors, no matter at which layer a factor is located. Obviously, the AHP hierarchical tree is a special case of the factor space, which can only present factor affiliation between adjacent layers. However, the credibility indicators work in a more complicated way than the analytic hierarchy. Take the factor space part shown in Fig. 1 as an example. A 6-DOF (Degree of Freedom) flying vehicle model is divided into four parts, which are "motion of mass center", "motion around mass center", "motion of elasticity", and "motion of shaking". The variants below them are the physical quantities of the motions. Apparently, the model credibility is determined by the four parts together, and is further influenced by the output variants below.

Fig. 1. Part of a 6-DOF flying vehicle model validation factor space (root: 6-DOF flying vehicle model; branches: motion of mass center, motion around mass center, motion of elasticity, motion of shaking; leaves: output variants such as x).

Actually, the indicators of the four motions are decomposed from the root by the model structure, and this is what the hierarchical tree does in AHP. However, when the



decomposition goes down further, it gets stuck, because the physical quantities are linked by the model resolving process, not by the structure. Obviously the outputs (both intermediate and final ones) are not always irrelevant to each other, and their relationship is not always linear, so it is incorrect to carry the structural decomposition further. Meanwhile, the outputs x, y, z may be so important that if they fail to meet some acceptability criteria, the model credibility will be unaccepted no matter how the other indicators perform. This is an ultra-connection from x, y, z to the root node, which a traditional hierarchy cannot deal with. So we need a better way to perform factor decomposition.
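The ultra-connection behavior can be sketched in a few lines (a minimal illustration with hypothetical output values and thresholds, not the authors' implementation):

```python
# Sketch of an "ultra" connection: if any key output fails its acceptability
# criterion, the model credibility becomes 0 regardless of the other indicators.
# The output values and thresholds below are hypothetical.

def credibility_with_ultra(base_credibility, key_outputs, thresholds):
    """Apply ultra-connections from key outputs directly to the root."""
    for name, value in key_outputs.items():
        if value < thresholds[name]:
            return 0.0  # one failed key output makes the model unaccepted
    return base_credibility

outputs = {"x": 0.92, "y": 0.95, "z": 0.40}
thresholds = {"x": 0.6, "y": 0.6, "z": 0.6}
result = credibility_with_ultra(0.85, outputs, thresholds)  # z fails -> 0.0
```

A plain weighted average could never express this gating effect, since a single low leaf value would only lower the root value proportionally to its weight.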

2.2 Credibility Aggregation

The traditional way of credibility aggregation by the AHP hierarchy is to use a weighted average function to accumulate the influence from the leaf nodes to the root. When the hierarchy and the weight matrix are set, the input of the credibility aggregation is only the values of the leaf nodes. The aggregation can be expressed by the function below:

$$ v_0 = \sum_{i=1}^{q} \Big( v_i \prod_{j=1}^{k_i} w_j \Big), $$



where v_0 is the root value (model credibility), v_i is the leaf value, w_j is the weight, which should sum to 1 among brother nodes, q is the number of leaf nodes, and k_i is the number of ancestor nodes of the leaf possessing v_i. Apparently, the weighted average function is linear, and demands that the brother nodes at different levels are irrelevant to each other. Actually this is almost impossible to satisfy in model validation. See the factor space part expanded from the “Motion of mass center” node of Fig. 1, shown in Fig. 2 below.

Fig. 2. Factor space part expanded from the “Motion of mass center” node of Fig. 1.
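The weighted-average aggregation above can be sketched as a small recursion (an illustrative toy with hypothetical weights and leaf credibilities; the branch names follow Fig. 1):

```python
# Toy AHP-style aggregation: a branch node's credibility is the weighted
# average of its children; weights must sum to 1 among brother nodes.
# Weights and leaf credibilities are hypothetical.

def aggregate(node):
    if "children" not in node:               # leaf: its assessed credibility
        return node["value"]
    weights = [w for w, _ in node["children"]]
    assert abs(sum(weights) - 1.0) < 1e-9    # brothers' weights sum to 1
    return sum(w * aggregate(child) for w, child in node["children"])

root = {"children": [
    (0.4, {"value": 0.9}),  # motion of mass center
    (0.3, {"value": 0.8}),  # motion around mass center
    (0.2, {"value": 0.7}),  # motion of elasticity
    (0.1, {"value": 0.6}),  # motion of shaking
]}
credibility = aggregate(root)  # 0.4*0.9 + 0.3*0.8 + 0.2*0.7 + 0.1*0.6 = 0.80
```

The sketch makes the linearity visible: every leaf contributes through a fixed product of weights, which is exactly the assumption the paper argues against.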



Take the node Wx (the vehicle acceleration in the x direction) as an example. Wx is obtained from the following formula containing Fqx1 (the aerodynamic force in the x direction) and m (the vehicle mass), and Fqx1 is further expanded as Cx (the aerodynamic coefficient in the x direction), q (the aerodynamic pressure) and SM (the cross-sectional area of the vehicle), as the formula below shows:

$$ W_x = \frac{F_{qx1}}{m} = \frac{C_x \, q \, S_M}{m}. $$


Because m changes during the simulation, Fqx1 is not linear in Wx. If we use the normal AHP hierarchy and the weighted average function to aggregate the partial credibility of Wx from Fqx1 and m, the result will be incorrect. Actually, the relationship of Fqx1 and m to Wx is a derivation, not a composition. So the credibility aggregation needs a better solution than the traditional way.
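The derivation can be sketched numerically (hypothetical values; this only illustrates why the relation is a ratio rather than a weighted sum):

```python
# W_x = F_qx1 / m = C_x * q * S_M / m: a derivation, not a linear composition.
# All numeric values below are hypothetical.

def w_x(c_x, q, s_m, m):
    f_qx1 = c_x * q * s_m   # aerodynamic force in the x direction
    return f_qx1 / m        # acceleration in the x direction

# With the force unchanged, halving the mass doubles the acceleration;
# no fixed weighted average of F_qx1 and m can reproduce this behavior:
a_full = w_x(0.3, 5000.0, 2.0, 1000.0)   # 3.0
a_half = w_x(0.3, 5000.0, 2.0, 500.0)    # 6.0
```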

2.3 Model Defect Tracing

Besides revealing the credibility indicators and their relationship, the factor space has another function: model defect tracing [7]. The traditional AHP hierarchy lacks precise defect-tracing ability because of the swamping effect: the calculation of combined weights may lessen the influence of certain indicators on the model credibility. When these indicators fail to obtain acceptable validation results, we cannot detect them from the final credibility alone. If we find a new way to build the factor space and express the relationship between indicators beyond a hierarchy, it becomes possible to use it for model defect tracing. Not only the leaf nodes but also the branch nodes should possess partial credibility, which allows path tracing to locate the indicators that cause the deficiency of model credibility.
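Path tracing over partial credibilities can be sketched as follows (hypothetical node names, values and threshold; it assumes every node, branch or leaf, stores a partial credibility):

```python
# Sketch of defect tracing: walk down from the root, following only children
# whose partial credibility is below the acceptability threshold, to locate
# the leaf indicators responsible for the credibility deficiency.

def trace_defects(node, threshold=0.7, path=()):
    """Yield paths from the root to low-credibility leaf indicators."""
    here = path + (node["name"],)
    if node["credibility"] >= threshold:
        return                          # this subtree is acceptable
    children = node.get("children", [])
    if not children:
        yield here                      # defective leaf indicator found
    for child in children:
        yield from trace_defects(child, threshold, here)

tree = {"name": "model", "credibility": 0.65, "children": [
    {"name": "mass-center", "credibility": 0.90},
    {"name": "elasticity", "credibility": 0.50, "children": [
        {"name": "mode-1", "credibility": 0.40},
        {"name": "mode-2", "credibility": 0.90},
    ]},
]}
defects = list(trace_defects(tree))  # [('model', 'elasticity', 'mode-1')]
```

Because branch nodes carry their own values, the search prunes acceptable subtrees and returns a concrete path to each defective indicator instead of a single swamped root number.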

3 The Network Method for Building a Factor Space

The hierarchical tree has a limitation on the connections inside it: it only allows relationships between factors in adjacent layers. In order to express more complicated connections between credibility indicators, we propose a network method to build the factor space in model validation.

3.1 The Mathematical Definition of the Factor Space Network

Define the factor space network as a directional graph composed of radially distributed nodes. The network can be expressed by the quadruple below:

$$ F = \{ \langle N, V \rangle, \langle L, A \rangle \}, $$


where F is a factor space, N and V are the node set and value set, L and A are the link set and attribute set mapped with each other. Set the link direction as from attributeholding node to attribute-receiving node. For example, a structural link of the

A Novel Method to Build a Factor Space for Model Validation


traditional hierarchy has the direction from child node to parent node. According to the requirements of model validation, further develop element definitions of the network as the following:  [N ~ where N  is the Definition 1: Define N ¼ fn1 ; n2 ; . . .; nk g as the node set. N ¼ N ~ is the uncertain node set. If ni 2 N,  use tðni Þ to indicate the node certain node set and N ~ type. tðni Þ 2 fregular; sufficient; inherited g. If ni 2 N, use cðni Þ as the transit condition, and cðni Þ 2 f0; 1g. Definition 2: Define V ¼ fv1 ; v2 ; . . .; vk g as the value set. If tðNÞ 2 fregular; sufficientg, V and N are mapped with each other. Use vi ¼ vðni Þ to indicate the value of ni which is mapped with vi . [L ~ where L  is the certain Definition 3: Define L ¼ fl1 ; l2 ; . . .; lk g as the link set. L ¼ L ~ is the uncertain link set. Use tðli Þ to indicate the link type, and link set and L tðli Þ 2 fregular; sufficient; ultra;equivalent; contraditory; inherited; tracedg. Use to l ¼ ðni ; nj Þ express a link from node ni to node nj . Definition 4: Define A ¼ fa1 ; a2 ; . . .; ak g as the attribute set, and A is one-one mapped with L. Use ai ¼ aðli Þ to indicate the attribute of link li which is mapped with ai . If tðli Þ 2 fequivalent; inherited; traced g, ai 2 £. ~ and cðnÞ ¼ 1, then n 2 N.  If cðnÞ ¼ 0, then n 2 ;, and define n Definition 5: If n 2 N as rubbish node, which needs to be deleted from the network.  tðnÞ ¼ sufficient and vðnÞ ¼ 1, then the sufficient link from n Definition 6: If n 2 N, breaks. If vðnÞ ¼ 0, then n needs to supplement additional brother node, which is defined as shadow node and values 0. Definition 7: If l 2 L, tðlÞ ¼ regular, and l ¼ ðno ; nd Þ, define aðlÞ 2 ½0; 1 as the weight  and tðLc Þ ¼ regular, then distributed from nd to no . If Nc ¼ fnjðn; nd Þ 2 Lc g, Lc 2 L, k P ak ðlk Þ ¼ 1, lk 2 Lc and k ¼ dðLc Þ. i¼1

Definition 8: If l ∈ L, l = (n_o, n_d), and t(l) = ultra, define a(l) ∈ [0, 1] as the acceptability threshold. If v(n_o) < a(l), then v(n_d) = 0; define n_o as the key node of n_d, and n_d as the super-conduct node of n_o.

Definition 9: If l ∈ L, l = (n_o, n_d), and t(l) = inherited, then the sub-network under n_d has to be replicated to n_o, and the new nodes are defined as inherited nodes, whose physical meaning will be given by n_o.

Definition 10: If l ∈ L, l = (n_o, n_d), and t(l) = contradictory, define a(l) ∈ [0, 1] as the contradiction percentage of n_o to n_d. Then

v(n_d) = { v(n_d), if v(n_d) ≤ 1 − a(l)·v(n_o); 1 − a(l)·v(n_o), if v(n_d) > 1 − a(l)·v(n_o) },

where v(n_o) is the source node value and v(n_d) is the destination node value.

Definition 11: If l ∈ L, l = (n_o, n_d), and t(l) = equivalent, define n_o and n_d as mirror nodes, and v(n_d) = v(n_o).
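Definitions 8 and 10 can be sketched as value updates (our reading of the threshold and the piecewise rule; values are hypothetical):

```python
# Sketch of two link semantics: an ultra link zeroes the destination when the
# key node misses its acceptability threshold (Definition 8); a contradictory
# link caps the destination value at 1 - a(l) * v(n_o) (Definition 10).

def apply_ultra(v_src, v_dst, threshold):
    """Ultra link: v(n_d) = 0 when v(n_o) < a(l), otherwise unchanged."""
    return 0.0 if v_src < threshold else v_dst

def apply_contradictory(v_src, v_dst, percentage):
    """Contradictory link: v(n_d) is capped at 1 - a(l) * v(n_o)."""
    cap = 1.0 - percentage * v_src
    return v_dst if v_dst <= cap else cap

u = apply_ultra(0.5, 0.9, threshold=0.6)            # key node fails -> 0.0
c = apply_contradictory(0.8, 0.9, percentage=0.5)   # cap = 0.6 -> 0.6
```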



Definition 12: If l ∈ L, l = (n_o, n_d), and t(l) = traced, define n_d as a traced node. v(n_d) is irrelevant to v(n_o).

Definition 13: If ●n = {x | x ∈ N ∧ (x, n) ∈ L}, define ●n as the pre-set of n. If n● = {x | x ∈ N ∧ (n, x) ∈ L}, define n● as the post-set of n.

Definition 14: If (n_i, n_j) ∈ L, define n_i as the child node of n_j, and n_j as the father node of n_i. If (n_i, n_p) ∈ L and (n_j, n_p) ∈ L, define n_i and n_j as brother nodes. If n● = ∅, define n as a root node. If n● ≠ ∅ and ●n ≠ ∅, define n as a branch node. If ●n = ∅, define n as a leaf node.

Definition 15: Set the power operator to satisfy (n●)^0 = n, (n●)^1 = n●, (n●)^2 = (n●)●, ... If n_j ∈ (n_i●)^{s_1}, n_j ∈ (n_i●)^{s_2}, ..., n_j ∈ (n_i●)^{s_k}, define S(n_i → n_j) = {s_1, s_2, ..., s_k} as the distance set from n_i to n_j. The non-negative integers s_1, ..., s_k are all the distances from n_i to n_j. If n_0 is a root node, abbreviate S(n_i → n_0) as S_{n_i}.

Definition 16: If N_L = {n_1, n_2, ..., n_k}, ∀n ∈ N_L makes ●n ∩ N_L = ∅ and n● ∩ N_L = ∅, and ∀n_i, n_j ∈ N_L, ∀s_k ∈ S_{n_i}, ∀s_m ∈ S_{n_j} makes Max(s_k) = Max(s_m), define N_L as a layer (order) of the factor space network, which is the Max(s_k)-th layer.

Definition 17: If ∃s, n_j ∈ (n_i●)^s, define n_i to n_j as reachable. If ∀s, n_j ∉ (n_i●)^s, define n_i to n_j as unreachable, and set s(n_i → n_j) = ∞.

Definition 18: If s(n_i → n_j) > 1 and s(n_i → n_j) ≠ ∞, define n_i as an offspring node of n_j and n_j as an ancestor node of n_i; n_i and n_j are lineal relative nodes. If s(n_i → n_j) = ∞ and n_i and n_j are not brother nodes, define n_i and n_j as collateral relative nodes.
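Definitions 13, 15 and 17 can be sketched over a small link set (hypothetical nodes; links run from child to parent, so the root has an empty post-set):

```python
# Pre-set (children), post-set (parents) and the distance set of Definition 15,
# computed by repeatedly following links; an empty distance set corresponds to
# unreachable, i.e. distance infinity (Definition 17).

def pre_set(n, links):
    return {a for a, b in links if b == n}

def post_set(n, links):
    return {b for a, b in links if a == n}

def distance_set(src, dst, links, max_depth=10):
    """All path lengths from src to dst, up to a bounded depth."""
    out, frontier = set(), {src}
    for s in range(1, max_depth + 1):
        frontier = {b for a, b in links if a in frontier}
        if dst in frontier:
            out.add(s)
    return out

links = {("wx", "mass-center"), ("wy", "mass-center"), ("mass-center", "root")}
children = pre_set("mass-center", links)      # {'wx', 'wy'}
dists = distance_set("wx", "root", links)     # {2}
unreach = distance_set("root", "wx", links)   # set(): unreachable
```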

3.2 The Rules of the Factor Space Network

In order to use the factor space network to perform model validation, we must define rules to regulate its structure and operation. The rules below should be followed when using the factor space network to validate simulation models.

Rule 1: If $n \in N$, then $^\bullet n \cup n^\bullet \ne \emptyset$. If $l \in L$ and $l = (n_1, n_2)$, then $n_1 \ne \emptyset$ and $n_2 \ne \emptyset$. (There is no isolated node or link in the network.)

Rule 2: If $N_0 = \{n \mid n^\bullet = \emptyset\}$, then $d(N_0) = 1$ and $N_0 \subseteq N$. (There is only one root node in the factor space.)

Rule 3: If $n \in \tilde{N}$, then $l = (n, n_i) \in \tilde{L}$. (The link which has an uncertain node as its source is an uncertain link.)

Rule 4: If $n \in N$ and $t(n) = \mathrm{sufficient}$, then $l = (n, n_i) \in L$ and $t(l) = \mathrm{sufficient}$. (The link which has a sufficient node as its source is a sufficient link.)

Rule 5: If $N_c = \{n_1, n_2, \ldots, n_k\}$, $n_p \in \tilde{N}$, $\forall n_c \in N_c$ makes $n_c$ to $n_p$ reachable, and $c(n_p) = 0$, then $\forall n_c \in N_c$ makes $c(n_c) = 0$. (The offspring nodes of a rubbish node are all rubbish nodes.)

Rule 6: If $n_r \in \tilde{N}$, $c(n_r) = 0$, $l_{rb} = (n_r, n_i) \in \tilde{L}$ and $l_{re} = (n_i, n_r) \in \tilde{L}$, then $c(l_{rb}) = c(l_{re}) = 0$. (The uncertain link which starts from or ends with a rubbish node is a rubbish link.)
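The two structural rules can be checked mechanically. The sketch below is our own illustration (function names and the tuple encoding of links are assumptions, not the paper's API); it verifies that no node is isolated (Rule 1) and that exactly one root exists (Rule 2).

```python
def check_rule_1(nodes, links):
    """Rule 1: every node appears in at least one link (no isolated node)."""
    touched = {n for link in links for n in link}
    return all(n in touched for n in nodes)

def check_rule_2(nodes, links):
    """Rule 2: exactly one node has no outgoing link (one root)."""
    sources = {src for src, _ in links}
    roots = [n for n in nodes if n not in sources]
    return len(roots) == 1

nodes = {"C", "x", "v"}
links = [("x", "C"), ("v", "x")]
ok = check_rule_1(nodes, links) and check_rule_2(nodes, links)   # True for this network
```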

A Novel Method to Build a Factor Space for Model Validation



3.3 The Graphic Illustration of the Factor Space Network

In order to express the factor space network visually, we define the necessary graph elements to provide a graphic illustration. The graph element set is mapped to all the definitions and follows the rules. Figure 3 shows an example of the graphic illustration of the factor space. The conventions are explained below:

(1) Use a single-lined figure (circle or rectangle) to represent a regular node, a double-lined figure to represent a sufficient node, and a double-lined round-cornered rectangle to represent an inherited node.
(2) Mark the node name or number inside the node figure, and mark the node value and the transit condition nearby outside.
(3) Use a directional line segment to represent a link, and use the line end to denote the link type: a solid arrow end represents a regular link, a circle end a sufficient link, a double-arrow end an ultra link, an equality-sign end an equivalent link, a slash-sign end a contradictory link, a hollow arrow end a traced link, and a slash-signed circle end an inherited link.
(4) Mark the attribute beside the link. To avoid intersections of links, fold the link and mark the source and target nodes at the folded link.
(5) Use solid-lined figures to represent certain elements, and dotted-lined figures to represent uncertain elements.

Fig. 3. Example of the graphic illustration of the factor space.


3.4 Credibility Aggregation

Dynamic elements (uncertain and sufficient node/link) affect the structure of the factor space. The credibility aggregation cannot be performed unless the network is static, which means the dynamic elements have to be analyzed first.



When the dynamic analysis is done, the credibility aggregation can be achieved by the following procedure:

(1) Go through the factor space downwards, and stop at the destinations of traced links. Perform similarity analysis by comparing the simulation outputs and the real-world outputs, and get the partial credibility on these nodes after result transformation [8]. Generally, the partial credibility can be obtained by:

$$C(n) = 1 - \frac{\|O(n) - O(n')\|}{\|O(n')\|}\bigg|_{I(n) = I(n')},$$

where $I(n)$ and $O(n)$ are the simulation input and output of the node, $I(n')$ and $O(n')$ are the corresponding real-world input and output, and $\|\cdot\|$ represents the norm of the variant. According to the technique used, $C(n)$ can be obtained by statistical analysis [9], time-domain analysis [10], or frequency-domain analysis [11] methods together with their result transformation formulas.

(2) If the destination of the traced link lacks real-world output, go down along the path to find a node whose real-world output is valid, and get the value of that node by similarity analysis. The partial credibility of the upper node which has no real-world output can then be obtained by error analysis via the computational process of the model:

$$C(n_2) = 1 - \frac{\|f((2 - C(n_1)) \cdot O(n_1)) - f(O(n_1))\|}{\|O(n_2)\|},$$

where $n_1$ is the node which has real-world output, with partial credibility $C(n_1)$ and simulation output $O(n_1)$; $n_2$ is the node which has no real-world output, with partial credibility $C(n_2)$ and simulation output $O(n_2)$; and $f$ is the computational function of the model from $n_1$ to $n_2$.

(3) Use an appropriate algorithm to aggregate the total credibility on the root node, by gathering the partial credibility on the destinations of the traced links along the decomposition paths of the factor space:

$$C = f(v_i, v_{i+1}, \ldots, v_{i+k}),$$

where $C$ is the total credibility, $v_i, \ldots, v_{i+k}$ are the partial credibilities on the destination nodes of the traced links, and $f$ is the aggregation function. $f$ can be the method of taking the minimum, the weighted average, etc., and can differ across the layers of the factor space.
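The partial-credibility formula and two admissible aggregation functions can be sketched as follows. This is a minimal illustration assuming the Euclidean norm and vector-valued outputs; the function names are ours.

```python
import math

def partial_credibility(sim_out, real_out):
    """C(n) = 1 - ||O(n) - O(n')|| / ||O(n')||, using the Euclidean norm."""
    diff = math.sqrt(sum((s - r) ** 2 for s, r in zip(sim_out, real_out)))
    norm = math.sqrt(sum(r ** 2 for r in real_out))
    return 1.0 - diff / norm

def aggregate_min(partials):
    """One admissible aggregation function f: take the minimum."""
    return min(partials)

def aggregate_weighted(partials, weights):
    """Another admissible f: the weighted average."""
    return sum(w * c for w, c in zip(weights, partials)) / sum(weights)

c1 = partial_credibility([1.0, 2.0], [1.0, 2.0])   # identical outputs -> 1.0
c2 = partial_credibility([0.9, 2.1], [1.0, 2.0])
total = aggregate_min([c1, c2])
```

As the paper notes, different layers of the factor space may use different choices of `f`.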

3.5 Defect Tracing

When the total credibility of the model is unsatisfactory, we can perform defect tracing via the factor space network by following the procedure below: (1) Use orthogonal design and Sobol’ method [7] to locate the defect factors other than the destination of traced links.



(2) Along the validation path in the factor space which contains the defect factors already located, perform further validation to get the partial credibility on the destinations of traced links, and determine whether each is a defect node by comparing with the acceptability criteria. (3) If there is a destination of iteration paths, validate its initial input, related constants, and the other variables which are irrelevant to the iteration variable [12]. (4) When the validation path reaches the leaf nodes, the defect tracing is over. (5) Collect all the defect nodes found in the tracing, and take the nodes with the lowest order as the origin that induces the deficiency of the model credibility.
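The tracing walk can be sketched as a simple graph traversal. This is a hedged illustration under our own assumptions: the decomposition paths are encoded as a `children` dictionary, partial credibilities are precomputed, and "lowest order" is read as the deepest defect nodes, which matches the case study where the deepest failing node is taken as the origin.

```python
def trace_defects(children, credibilities, threshold, root):
    """Walk the decomposition paths from the root; a node whose partial
    credibility falls below the threshold is a defect node. Return the
    defect nodes in the deepest layer reached (our reading of step 5)."""
    defects, stack = [], [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if credibilities.get(node, 1.0) < threshold:
            defects.append((depth, node))
        for child in children.get(node, []):
            stack.append((child, depth + 1))
    if not defects:
        return []
    deepest = max(d for d, _ in defects)
    return [n for d, n in defects if d == deepest]

# Toy version of the case-study chain: root <- displacement <- current <- voltage
children = {"C": ["x"], "x": ["i9"], "i9": ["Ur"]}
creds = {"x": 0.7049, "i9": 0.7393, "Ur": 0.6849}
origin = trace_defects(children, creds, 0.8, "C")   # -> ["Ur"]
```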

4 Case Study

The electromagnetic rail gun model is composed of the sub-models of the power supplies, the paralleled rails, the armature, the wave modulation inductor, the projectile, etc. The armature is located between the two rails; it conducts the current and delivers the Lorentz force. The projectile is located in front of the armature and is forced to move forward together with the armature. The model reveals the whole physical process of the real-world object.

4.1 The Factor Space Network

According to the model's resolving process, we use the network method to build the factor space for model validation. Figure 4 shows the resulting factor space network.

4.2 The Validation of the Projectile Displacement

According to the factor space network in Fig. 4, the destination of the traced links is the projectile displacement $x$. We use MGRA (Modified Grey Relational Analysis) to validate the node. The analysis formulas are shown below:

$$\begin{cases} \gamma_m(X_1, X_2) = \gamma(X_1, X_2) - RRMSE \\[4pt] \gamma(X_1, X_2) = \dfrac{1}{n}\sum\limits_{k=1}^{n} \gamma(x_1(k), x_2(k)) \\[4pt] \gamma(x_1(k), x_2(k)) = \dfrac{\min\limits_k |x_1(k) - x_2(k)| + \rho \max\limits_k |x_1(k) - x_2(k)|}{|x_1(k) - x_2(k)| + \rho \max\limits_k |x_1(k) - x_2(k)|}, \quad \rho = 0.5 \\[4pt] RRMSE = \dfrac{\sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(x_1(k) - x_2(k)\right)^2}}{\frac{1}{n}\sum_{k=1}^{n} |x_2(k)|} \end{cases}$$

where $X_1$ is the simulation data series, $X_2$ is the observed data series, $\min_k |x_1(k) - x_2(k)|$ is the minimum difference, $\max_k |x_1(k) - x_2(k)|$ is the maximum difference, and $RRMSE$ is the relative root mean square error of the simulation and observed data. The result transformation from the MGRA result to the simulation credibility can be defined as:



Fig. 4. The factor space network of the electromagnetic rail gun model.

$$C(\gamma_m) = \begin{cases} \dfrac{1 - C_{th}}{1 - \gamma_{th}}\left(\gamma_m - \gamma_{th}\right) + C_{th}, & \gamma_m \in [\gamma_{th}, 1] \\[6pt] \dfrac{C_{th}}{\gamma_{th}}\,\gamma_m, & \gamma_m \in [0, \gamma_{th}] \end{cases}$$

where $\gamma_m$ is the analysis result of MGRA, $\gamma_{th}$ is the acceptability threshold of the similarity analysis, and $C_{th}$ is the acceptability threshold of the simulation credibility. The



simulation data and observed data are shown in Table 1. The data curves of the projectile displacement $x$ are shown in Fig. 5. Setting $\gamma_{th} = 0.5$ and $C_{th} = 0.8$ and using the functions in Formulas (7, 8), we obtain the partial credibility of the projectile displacement $x$ as $C_x = 0.7049$. Because $C_{th} = 0.8$, the node is not accepted. Meanwhile, we can see that the root node is directly linked by the projectile displacement $x$ only, so the model's total credibility is also $0.7049$ and not accepted. We need to perform defect tracing to find which part of the model causes the credibility deficiency.

Table 1. The simulation data and observed data of the projectile displacement x.

Time (ms)   Simulation data (m)   Observed data (m)
0.1000      0.0001                0.0001
0.2000      0.0009                0.0009
0.3000      0.0048                0.0050
0.4000      0.0148                0.0155
0.5000      0.0335                0.0356
……          ……                    ……
1.1000      0.3404                0.3796
1.2000      0.4246                0.4746
……          ……                    ……
4.0000      4.6349                5.2051
4.1000      4.8064                5.3973
4.2000      4.9782                5.5899
4.3000      5.1503                5.7828
4.4000      5.3227                5.9760
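The MGRA formulas and the transformation to credibility can be sketched in Python. This is a hedged illustration: the function names are ours, $\rho$ is fixed at 0.5, we read the modified grade as the mean grey relational coefficient penalized by RRMSE, and the two series are assumed not to be identical (so the maximum difference is nonzero).

```python
import math

def mgra(x1, x2, rho=0.5):
    """Mean grey relational coefficient of x1 vs x2, penalized by RRMSE.
    Assumes the series differ somewhere (max difference > 0)."""
    diffs = [abs(a - b) for a, b in zip(x1, x2)]
    dmin, dmax = min(diffs), max(diffs)
    gamma = sum((dmin + rho * dmax) / (d + rho * dmax) for d in diffs) / len(diffs)
    rrmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)) / len(x1)) \
            / (sum(abs(b) for b in x2) / len(x2))
    return gamma - rrmse

def to_credibility(gm, gth=0.5, cth=0.8):
    """Piecewise-linear transformation from the MGRA result to credibility."""
    if gm >= gth:
        return (1 - cth) / (1 - gth) * (gm - gth) + cth
    return cth / gth * max(gm, 0.0)

c = to_credibility(mgra([1.0, 2.0, 3.0], [1.1, 2.0, 3.2]))
```

Note that the transformation is continuous at $\gamma_m = \gamma_{th}$, where both branches give $C_{th}$.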

Fig. 5. The curves of the projectile displacement x.




4.3 The Validation of the Individual Circuit Current

According to the defect tracing procedure of the factor space, we make further validation along the paths of traced links. Because of the limited observed data and the length of the paper, we pick out two representative nodes, the individual circuit current and the rail voltage, to explain the defect tracing work. The model contains 100 power capacitors grouped in packs of ten, which discharge in chronological order. Taking the No. 9 group of capacitors as an example, the individual circuit current data is shown in Table 2. The data curves of the No. 9 individual circuit current $i_9$ are shown in Fig. 6.

Table 2. The simulation data and observed data of the No. 9 individual circuit current i9.

Time (ms)   Simulation data (kA)   Observed data (kA)
0.8000      0.0000                 0.0000
0.9000      63.4997                63.3014
1.0000      118.0873               118.6114
1.1000      156.2808               159.1810
1.2000      172.6993               179.9446
……          ……                     ……
2.6000      71.6777                75.8113
2.7000      67.5994                71.4827
……          ……                     ……
4.0000      33.8028                35.8419
4.1000      32.1987                34.1577
4.2000      30.6890                32.5729
4.3000      29.2671                31.0806
4.4000      27.9271                29.6744

Fig. 6. The curves of the No. 9 individual circuit current i9 .



Using the functions in Formulas (7, 8), we obtain the partial credibility of the No. 9 individual circuit current $i_9$ as $C_{i_9} = 0.7393$. The node is not accepted, and we need further defect tracing downwards along the traced links.

4.4 The Validation of the Rail Voltage

The rail voltage data is shown in Table 3. The data curves of the rail voltage $U_r$ are shown in Fig. 7.

Table 3. The simulation data and observed data of the rail voltage Ur.

Time (ms)   Simulation data (V)   Observed data (V)
0           0                     0
0.1000      0.6055                0.6286
0.2000      6.6897                7.2459
0.3000      29.8628               33.4997
0.4000      72.0103               84.9831
0.5000      110.6497              134.7681
……          ……                    ……
1.1000      402.5833              493.5161
1.2000      410.8884              509.3004
……          ……                    ……
4.0000      −42.0534              −47.6174
4.1000      −43.5441              −49.2517
4.2000      −44.7558              −50.5756
4.3000      −45.7219              −51.6273
4.4000      −46.4721              −52.4406

Fig. 7. The curves of the rail voltage Ur .



Using the functions in Formulas (7, 8), we obtain the partial credibility of the rail voltage $U_r$ as $C_{U_r} = 0.6849$. The node is not accepted.

4.5 Result Analysis

According to the theory of the factor space network, the credibility of the electromagnetic rail gun model is $C = 0.7049$, which is not accepted. The defect tracing shows that the node which causes the credibility deficiency is under the node of the rail voltage $U_r$. By further validation we find that the constant of the inductance gradient $L'_r$ is problematic. The constant in the simulation is set as $L'_r(s) = 0.42 \times 10^{-6}$, but the observed value in the real world is $L'_r(r) = 0.40 \times 10^{-6}$. Although the difference is tiny, since the magnetic force $F$ is in proportion to the total circuit current $I$, and $I$ is at the level of mega-amperes, the error is magnified dramatically. Meanwhile, the traced links in the factor space network produce a chain reaction of the error and induce the credibility deficiency of the model. Moreover, we can see that the total circuit current $I$, the individual circuit current $i$, the projectile acceleration $a$, the projectile velocity $v$, and the projectile displacement $x$ have cross-iteration links in the factor space network. These iterations intensify the accumulation of the error. Therefore, once the value of $L'_r$ in the simulation is corrected, the electromagnetic rail gun model has the opportunity to gain an acceptable credibility.

5 Conclusion

The factor space plays an important role in model validation. It reveals all the indicators that influence the total credibility of the model, and makes the work of model defect tracing possible. However, the traditional way of factor space building only analyses the model by structural decomposition, and relies on a linear aggregation function to get the total credibility. This cannot satisfy the factor space requirements in the engineering of model validation. The network-based factor space overcomes the limitations of the traditional hierarchy and serves model validation better. The case studies prove that the method is practical and effective. Meanwhile, we should notice that although the factor space lists all the credibility indicators and their relationships, this does not mean the validation can be performed thoroughly without doubt. Some physical quantities are not observable or are very hard to measure in the real world, such as the aerodynamic force, the projectile mass, etc. A node possessing this kind of physical quantity is called a "black node". We cannot obtain its partial credibility by similarity analysis comparing the simulation data and observed data, except through knowledge-based expert evaluation. Therefore, if there are black nodes in the factor space, the validation result may be less objective. The definition of the network-based factor space is extendable: the types of each quadruple element can be extended to express new indicators and relationships. According to the practical requirements of validation engineering, the network method for building a factor space should become more adaptive and accurate.



Acknowledgments. The paper was supported by the National Natural Science Foundation of China (Grant Nos. 61374164 and 61627810).

References

1. Shi, Y., Ma, B.H.: Multitarget decision fusion method based on fuzzy factor spaces. J. Beijing Inst. Technol. 20(1), 85–89 (2000)
2. Cui, T.J., Ma, Y.D.: The method research on decision criterion discovery of system reliability. Syst. Eng. Theory Pract. 35(12), 3210–3216 (2015)
3. Saaty, T.L.: Modeling unstructured decision problems: a theory of analytical hierarchy. In: Proceedings of the First International Conference on Mathematical Modeling, pp. 59–77 (1977)
4. Sargent, R.G.: Verification and validation of simulation models. J. Simul. 7, 12–24 (2013)
5. IEEE Computer Society: IEEE recommended practice for verification, validation, and accreditation of a federation—an overlay to the high level architecture federation development and execution process. IEEE Standard 1516.4, The Institute of Electrical and Electronics Engineers, Inc., USA, pp. 1–66 (2007)
6. Balci, O., Adams, R.J., Myers, D.S.: A collaborative evaluation environment for credibility assessment of modeling and simulation applications. In: Proceedings of the 2002 Winter Simulation Conference, San Diego, CA, USA, pp. 214–220 (2002)
7. Zhang, Z., Fang, K., Wu, F., Yang, M.: Detection method for credibility defect of simulation based on Sobol' method and orthogonal design. In: Tan, G., Yeo, G.K., Turner, S.J., Teo, Y.M. (eds.) AsiaSim 2013. CCIS, vol. 402, pp. 421–428. Springer, Heidelberg (2013)
8. Zhou, Y.C.: Transformation methods and assistant tools from data consistency analysis result to simulation credibility. Master dissertation, Harbin Institute of Technology, China (2014)
9. Lourenco, P.B., Rots, J.G., Blaauwendraad, J.: Continuum model for masonry: parameter estimation and validation. J. Struct. Eng. 124(6), 642–652 (1998)
10. Dorobantu, A., Balas, G.J., Georgiou, T.T.: Validating aircraft models in the gap metric. J. Aircr. 51(6), 1665–1672 (2014)
11. Au, S.K.: Model validity and frequency band selection in operational modal analysis. Mech. Syst. Signal Process. 1–21 (2016)
12. Fang, K., Zhou, Y.C., Zhao, K.B.: Validation method for simulation models with iteration operation. Syst. Eng. Electron. 39(2), 445–450 (2017)

Simulation Credibility Evaluation Based on Multi-source Data Fusion

Yuchen Zhou, Ke Fang, Ping Ma(&), and Ming Yang

Harbin Institute of Technology, Harbin 150080, China
[email protected], {fangke,pingma,myang}

Abstract. Real-world system experiment data, similar system running data, and empirical data or domain knowledge of SMEs (subject matter experts) can serve as observed data in credibility evaluation. It is of great significance to study how to incorporate multi-source observed data to evaluate the validity of a model. Generally, data fusion methods are categorized into original data fusion, feature-level fusion, and decision-level fusion. In this paper, we first discuss the hierarchy of multi-source data fusion in credibility evaluation. Then, a Bayesian feature fusion method and an MADM-based (multiple attribute decision making) decision fusion approach are proposed for credibility evaluation. The proposed methods are available under different data scenarios. Furthermore, two case studies are provided to examine the effectiveness of the credibility evaluation methods with data fusion.

Keywords: Multi-source data fusion · Credibility evaluation · Bayesian feature fusion · Model validation

1 Introduction

In recent decades, numerical models have been extensively utilized to replace complicated real-world systems for design optimization, analysis, and performance evaluation. To guarantee the correct application of simulation models, credibility evaluation should be conducted with the simulation data generated from the numerical model and observed data from a reference system or subject matter expert (SME). Since engineering systems are increasingly complicated, cost restricts the number of experiments. Various data may serve as the observed data in credibility evaluation or model validation, including data obtained from the real-world system, data from high-resolution similar systems (a higher-fidelity form compared to the numerical model to be evaluated), and empirical data or domain knowledge of SMEs. The authenticity and quality of the multi-source observed data differ. Thus, it is necessary to investigate how to combine multi-source observed data for credibility evaluation.

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 18–31, 2018.

Min [1] proposed a knowledge-based method for the credibility evaluation of complex simulation systems, in which the experience of experts and other kinds of domain knowledge are used for validation tasks. For complicated physical systems, it is hard to perform full-scale experiments, but it may be possible to collect observed data at lower



levels (subsystem level or component level). Li and Mahadevan [2] discussed how to use multi-level observed data to evaluate the system-level validity of numerical models. Mullins [3] and Wang [4] investigated model validation methods under various data scenarios or multiple validation sites and put forward integration metrics to aggregate the analysis results. Although data fusion is widely used in positioning [5], target recognition [6], state estimation [7], information perception [8], etc., credibility evaluation with data fusion has not attracted enough attention. In this paper, we discuss the hierarchy of multi-source data fusion and two credibility evaluation methods with data fusion techniques. This paper is organized as follows. In Sect. 2, the hierarchy of multi-source data fusion for credibility evaluation is first discussed. Then, in Sect. 3, a feature-level fusion with Bayesian parameter updating is provided. In Sect. 4, a multiple attribute decision making (MADM) method based on information entropy is proposed, and a decision fusion process with the proposed MADM approach is discussed. In Sect. 5, case studies are designed to assess the effectiveness of the two credibility evaluation methods with data fusion. Finally, Sect. 6 presents conclusions about the proposed credibility evaluation methods.

2 The Hierarchy of Multiple Source Data Fusion for Credibility Evaluation

Data fusion methods are categorized into three levels: original data fusion, feature-level fusion, and decision-level fusion. Figure 1 shows a credibility evaluation framework with original observed data fusion, in which the observed data from different sources are concatenated before similarity analysis. The structure of original data fusion is simple: if the multi-source observed data are independent and identically distributed for the same output, it is easy to use original data fusion. However, if the multi-source observed data are heterogeneous, the fusion process should be carefully designed.


Fig. 1. Credibility evaluation with original observed data fusion.

Compared to the restrictions of original data fusion, feature fusion is a more promising means for credibility evaluation. Feature extraction is applied to each observed data set before the integration process, and then the similarity between the fused feature and the predicted data feature is analysed. Credibility evaluation with feature fusion of observed data can be divided into two types: symmetric feature fusion



(Fig. 2) and asymmetric feature fusion (Fig. 3). Symmetric feature fusion performs the same feature extraction on observed data from different sources; evidence theory or fuzzy theory may be utilized to realize the feature combination. On the contrary, the multi-source observed data in asymmetric feature fusion undergo different preprocessing. A classical method for asymmetric feature fusion is the Bayesian approach. Because the authenticity and validity of observed data from multiple sources differ, the observed data which are sufficient but less authentic may serve as the prior information, and the observed small samples can then be updated with the Bayesian method. In the next section, we will discuss feature fusion with the Bayesian approach.



Fig. 2. Credibility evaluation with symmetric feature fusion of observed data.


Fig. 3. Credibility evaluation with asymmetric feature fusion of observed data.

Decision fusion is also called result-level fusion. Compared to the complicated processing of feature fusion, decision fusion is flexible and feasible to implement. The similarity analysis is conducted before the results are fused. Furthermore, decision fusion allows employing different model validation methods to analyse the similarity between the simulated data and each source of observed data (Fig. 4).


Fig. 4. Credibility evaluation with decision fusion.



Fusing multi-source observed data together produces a more efficient representation of the output of the reference system, and it may be beneficial for obtaining an accurate and comprehensive evaluation result.

3 Bayesian Features Fusion for Credibility Evaluation

Two things should be seriously considered in credibility evaluation. On the one hand, multi-source observed data may be heterogeneous; hence, there are risks in using original data fusion. On the other hand, multi-source observed data usually come from systems with multiple resolutions. High-fidelity observed data are usually a small sample, which means it is difficult to employ classical statistical approaches directly. It is therefore of great significance to integrate the multi-source observed data from multi-resolution systems for credibility evaluation. In this section, we discuss a feature fusion method with Bayesian parameter updating. The prior statistical features are first extracted from the low-fidelity observed data; the statistical features specifically refer to the distribution parameters of the samples. Then, the prior information and the high-fidelity observed data are combined to update the statistical features with the Bayesian approach. Finally, a hypothesis test is employed to realize the similarity analysis.

3.1 Bayesian Parameter Updating

Provided $x = \{x_1, x_2, \ldots, x_n\}$ is a set of samples generated from an independent and identically distributed population $X = \{X_1, X_2, \ldots, X_n\}$, and $\theta$ is a parameter affecting the probability density function of $X$, the Bayesian estimation of $\theta$ is formulated as follows:

$$p(\theta \mid x) = h(x, \theta)/m(x) = \frac{p(x \mid \theta)\,p(\theta)}{\int_{\Theta} p(x \mid \theta)\,p(\theta)\,d\theta},$$




where $p(\theta \mid x)$ denotes the posterior probability density function (PDF) of $\theta$ given the samples $x$; $h(x, \theta)$ denotes the joint probability density function of $x$ and $\theta$; $m(x)$ denotes the marginal density function of $x$; $p(\theta)$ denotes the prior probability density function of $\theta$; and $\Theta$ includes all the possible values of $\theta$. The conjugate distribution is a commonly used method to get the posterior estimation of $\theta$: if the posterior distribution $p(\theta \mid x)$ and the prior distribution $p(\theta)$ belong to the same distribution family, they are called conjugate distributions and $p(\theta)$ is called the conjugate prior. Take the Bayesian estimation of the mean value $\theta$ of a normal distribution for instance: the conjugate prior of $\theta$ is a normal distribution. Provided $X \sim N(\theta, \sigma^2)$ and $p(\theta) \sim N(\mu_p, \sigma_p^2)$, where $\sigma^2$, $\mu_p$, $\sigma_p^2$ are known, the joint distribution function $h(x, \theta)$ is formulated as



" , #  n  2 

X 2 2 hðx; hÞ/ exp  h  lp xi  l 2rp  exp  2r2 ; i¼1

h  2 . 2 i 2r1 / exp  h  l1


 . .  where l1 ¼ nxr2p þ lp r2 nr2p r2 , r21 ¼ nr2p r2 nr2p þ r2 . Obviously, mðxÞ is independent to parameter h. Thus, h  2 . 2 i 2r1 : pðhjxÞ ¼ hðx; hÞ=mðxÞ / exp  h  l1


$N(\mu_1, \sigma_1^2)$ is the posterior distribution function of $\theta$.
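The conjugate update of the normal mean can be checked numerically. This is a sketch with our own function name, assuming the likelihood variance is known, as in the derivation above.

```python
def posterior_mean_update(samples, sigma2, mu_p, sigma2_p):
    """Posterior N(mu1, sigma1^2) for the mean of N(theta, sigma2),
    given prior N(mu_p, sigma2_p) and observed samples."""
    n = len(samples)
    xbar = sum(samples) / n
    mu1 = (n * xbar * sigma2_p + mu_p * sigma2) / (n * sigma2_p + sigma2)
    sigma2_1 = (sigma2_p * sigma2) / (n * sigma2_p + sigma2)
    return mu1, sigma2_1

mu1, s21 = posterior_mean_update([1.0, 1.0, 1.0, 1.0], sigma2=1.0, mu_p=0.0, sigma2_p=1.0)
# mu1 = 0.8 (shrunk toward the prior mean 0), s21 = 0.2 (tighter than the prior)
```

As expected, more samples pull the posterior mean toward the sample mean and shrink the posterior variance.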

3.2 Bayesian Features Fusion Process for Credibility Evaluation

Provided $x_p$ is the predicted data from the simulation model, $x_{o1}$ and $x_{o2}$ are two sets of observed samples from different sources: $x_{o1}$ is observed from a low-fidelity system, and $x_{o2}$ is measured from a high-fidelity real-world system. However, $x_{o2}$ is a small sample because of the high cost of experiments, while $x_{o1}$ is more sufficient compared to $x_{o2}$. Figure 5 shows a typical feature fusion process with the Bayesian method for credibility evaluation. The detailed process is as follows.



Fig. 5. A typical feature fusion process with Bayesian method



Step 1: Goodness-of-fit test. The Bayesian parameter updating is affected by the parameter distribution; thus, the distributions of the samples should first be confirmed. The Chi-square test, the Jarque-Bera test, and the Kolmogorov-Smirnov (K-S) test are commonly used goodness-of-fit test approaches.
Step 2: Outlier detection. There may be outliers in the evaluation data. Grubbs' test, Dixon's Q test, and Pauta's criterion are commonly used techniques to detect outliers.
Step 3: Estimate the prior distribution parameter with $x_{o1}$. Since $x_{o1}$ is output generated from the low-fidelity system, it is available for the prior estimation of the distribution parameter. The Bootstrap method may be used to support the estimation.
Step 4: Bayesian feature fusion. Update the distribution parameter with the prior information and $x_{o2}$.
Step 5: Hypothesis testing. Select an appropriate approach to check the statistical characteristic similarity between the posterior parameter and the simulation data. If the null hypothesis is accepted, the model generating the simulation data is credible; otherwise, the numerical model and the simulation data should be used with restrictions.
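Steps 1-5 can be sketched end-to-end under simplifying assumptions of our own: normality is taken as already confirmed (Step 1), the data are assumed outlier-free (Step 2), and the variances are treated as known, so Step 5 reduces to a two-sided z-style test at the 5% level. The function name and thresholds are illustrative, not the authors'.

```python
from statistics import mean, pvariance

def bayesian_fusion_check(xp, xo1, xo2):
    # Step 3: prior distribution parameters from the low-fidelity data xo1
    mu_p, s2_p = mean(xo1), pvariance(xo1)
    # Step 4: posterior update of the mean with the high-fidelity small sample xo2
    n, s2 = len(xo2), pvariance(xo2) or 1e-9   # guard against zero variance
    mu_pos = (n * mean(xo2) * s2_p + mu_p * s2) / (n * s2_p + s2)
    s2_pos = (s2_p * s2) / (n * s2_p + s2)
    # Step 5: H0: E(xp) = mu_pos, two-sided test at the 5% level
    z = (mean(xp) - mu_pos) / ((s2_pos + pvariance(xp) / len(xp)) ** 0.5)
    return abs(z) <= 1.96   # True -> the simulation result is credible

xp  = [9.9, 10.1, 10.0, 10.2, 9.8]            # predicted data
xo1 = [9.5, 10.4, 10.1, 9.8, 10.2, 9.9, 10.0, 10.1]   # low-fidelity, larger sample
xo2 = [10.0, 10.1, 9.9]                        # high-fidelity, small sample
credible = bayesian_fusion_check(xp, xo1, xo2)
```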

4 MADM-Based Decision Fusion for Credibility Evaluation

For multi-source observed data, different similarity analysis methods may be adopted to evaluate the credibility. Take similarity analysis of time series for instance: TIC (Theil's inequality coefficient) or GRA (grey relational analysis) may be utilized to analyse the overall trend similarity, while statistical methods may be employed to analyse the statistical characteristic similarity at some key time points. From another perspective, SMEs sometimes participate in the credibility evaluation and provide credibility values according to domain knowledge and related experience. Under these circumstances, decision fusion is an effective method to integrate multiple credibility evaluation results into a synthesized credibility. Table 1 shows the classification of credibility evaluation methods. The evaluation approaches are divided into subjective methods, statistical methods, time-domain analysis, frequency-domain analysis, and MADM [9] methods.

Table 1. Classification of credibility evaluation methods

Category                    Methods
Subjective method           Turing test, expert scoring
Statistical method          Hypothesis tests (U test, T test, F test, Chi-square test, etc.), nonparametric tests (K-S test, Mann-Whitney U test, Chi-square goodness-of-fit test, runs test, etc.), Bayesian methods (parameter updating, Bayesian factor, etc.), area method, u-pooling method, etc.
Time-domain analysis        TIC, GRA, MAE, MSE, RMSE, RRMSE, Pearson correlation coefficient, etc.
Frequency-domain analysis   Periodogram analysis, maximum entropy spectrum analysis, wavelet analysis, etc.
MADM method                 AHP, fuzzy theory, information entropy, ideal point method, cloud model, evidence theory




4.1 An MADM Method Based on Information Entropy

Various credibility evaluation results between multi-source observed data and simulated data are information in nature. According to information theory [10], information can decrease the uncertainty and increase the understanding of an event. The fusion of multiple validation results is the integration of information weights and information, and the information entropy may be regarded as the weight $h_i$ of the information. Provided $c = \{c_1, c_2, \ldots, c_m\}$ denotes a set of credibility evaluation results, a criteria set $q = \{q_1, q_2, \ldots, q_n\}$ is established to assess the importance of each credibility evaluation result. The decision matrix for the calculation of the weights of $c_i$ is defined as

$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{bmatrix},$$




where $r_{ij}$ is the value of criterion $q_j$ for $c_i$, $r_{ij} \in [0, 1]$.

Let $s_i = \sum_{j=1}^{n} r_{ij}$ and use the diagonal matrix $S = \mathrm{diag}(1/s_1, 1/s_2, \ldots, 1/s_m)$ to normalize the decision matrix:

$$R' = SR.$$


Since $s_i$ represents the global assessment of $c_i$ based on the attribute set, the information entropy of $c_i$ is not only determined by $R'$ but also by $s_i$. The Shannon information entropy of $c_i$ is defined as

$$h_i = -K \sum_{j=1}^{n} r'_{ij} \log r'_{ij},$$

where $K = \log(1/s_i)$. Then, we use the following formula to fuse the multiple decisions:


$$c_{fusion} = \begin{cases} \sum\limits_{i=1}^{N} h_i c_i \Big/ \sum\limits_{k=1}^{N} h_k, & \forall i,\ c_i \ge A_s \\[6pt] 0, & \exists i,\ c_i < A_s \end{cases} \qquad (7)$$
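The entropy weighting and decision fusion can be sketched as follows. This is a hedged illustration: we use the common normalization constant $K = 1/\log n$ over row-normalized criteria (a standard entropy-weighting variant, not necessarily the authors' exact $K$), treat any result below $A_s$ as a veto, and the names are ours.

```python
import math

def fuse(credibilities, decision_matrix, a_s=0.7):
    """Entropy-weighted fusion of credibility results; any c_i < a_s vetoes."""
    if any(c < a_s for c in credibilities):
        return 0.0
    n = len(decision_matrix[0])
    k = 1.0 / math.log(n)                     # standard normalization, K = 1/log n
    weights = []
    for row in decision_matrix:               # one row of criteria values per result
        s = sum(row)
        h = -k * sum((r / s) * math.log(r / s) for r in row if r > 0)
        weights.append(h)                     # entropy used directly as the weight h_i
    return sum(w * c for w, c in zip(weights, credibilities)) / sum(weights)

c = [0.85, 0.9]                               # credibility results from two sources
R = [[0.6, 0.8, 0.7], [0.9, 0.5, 0.8]]        # three importance criteria per result
fused = fuse(c, R)
```

Because the fused value is a convex combination, it always lies between the smallest and largest accepted result.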



4.2 MADM-Based Decision Fusion Process for Credibility Evaluation

The decision fusion for credibility evaluation is more flexible. Suppose $x_p$ is the predicted data and $x_{oi}$, $i = 1, 2, \ldots, N$, are the multi-source observed data, and let $A_{ms}$ be the acceptance criterion for the evaluation indicators. Figure 6 shows a typical decision fusion process with the MADM method for credibility evaluation. The detailed process is as follows.

Simulation Credibility Evaluation Based on Multi-source Data Fusion


Step 1: Data preprocessing. Different data preprocessing procedures may be adopted according to the similarity analysis method. For instance, if a time-domain similarity analysis method is employed, the observed and predicted time series should have the same time sequence.
Step 2: Select an appropriate method to analyze the similarity $c_i$ between the predicted data $x_p$ and the multi-source observed data $x_{oi}$. All similarity results should be transformed to the normalized interval [0, 1].
Step 3: Select several indicators and calculate the weights of the different similarity results with information entropy.
Step 4: Decision fusion. Calculate the fused credibility with Eq. (7).
Step 5: Acceptability analysis. If the integrated credibility $c_{fusion}$ is larger than the acceptability threshold $A_{ms}$, the simulation data is valid; otherwise, the model and simulation data should be used with restrictions.


Fig. 6. A typical decision fusion process with MADM

5 Case Study

5.1 Credibility Evaluation with Bayesian Feature Fusion

Table 2 shows the simulation data $x_s$ and two groups of observed data: $x_h$ (high fidelity observed data, 8 samples) and $x_l$ (low fidelity observed data, 20 samples).



According to the Jarque-Bera test, all three groups of data obey a normal distribution, and the data passed the outlier elimination. Table 3 gives the mean $\mu(x)$ and variance $\sigma^2(x)$ of the three sets of samples.

Table 2. Evaluation data

xs


12.124, 12.065, 12.096, 12.358, 11.951, 12.252, 12.050, 12.083, 12.052, 12.431, ……, 11.898, 11.917, 12.245, 12.145, 11.832, 12.015, 12.259, 11.874, 12.126, 11.762

12.052, 12.123, 11.911, 12.008, 12.163,

xh 12.103, 12.182, 11.715, 12.246, 11.957, 11.868, 12.162, 11.897, 11.623, 12.029, 11.749, 12.073, 12.433, 12.344, 11.857

11.827, 12.083, 12.089, 11.749,

11.965, 11.879, 11.763, 12.228

Table 3. Statistics of samples

                x_s      x_l      x_h
mu(x)           12.093   12.025   11.948
sigma^2(x)      0.039    0.043    0.040

Obviously, the variance differences among the three groups of data are small, so the mean similarity is given priority. Table 4 gives the t test results for arbitrary pairs of sample sets at a significance level of 0.05. According to the test, the mean of the simulation data is not equal to the mean of the high fidelity observed data, whereas the other two hypothesis tests accept the null hypothesis. Since the high fidelity samples are limited, the small sample size may influence the test result. Under this circumstance, it is hard to decide whether the simulation data and observed data are consistent. Thus, we use the proposed Bayesian feature fusion based credibility evaluation approach.

Table 4. Conventional hypothesis test results

Null hypothesis $\mu(x_s) = \mu(x_l)$: h (reject null hypothesis): 0; p-value: 0.153; confidence interval: [−0.163, 0.026]
Null hypothesis $\mu(x_s) = \mu(x_h)$: h (reject null hypothesis): 1; p-value: 0.040; confidence interval: [−0.284, −0.007]
Null hypothesis $\mu(x_l) = \mu(x_h)$: h (reject null hypothesis): 0; p-value: 0.363; confidence interval: [−0.094, 0.248]



(1) First, the bootstrap function in Matlab is used to generate 10000 groups of bootstrap samples $x_{ib}$ of the low fidelity observed data. The $x_{ib}$ $(i = 1, 2, \ldots, 10000)$ are row vectors, and all the bootstrap samples constitute a matrix of size $10000 \times 20$: $X_b = [x_{1b}^T, x_{2b}^T, \ldots, x_{10000b}^T]^T_{10000 \times 20}$.
(2) Then, the 10000 groups of bootstrap samples are used to estimate the prior mean of the observed data, $\mu_{priobs} = E(x_{ib})$.
(3) Third, the mean of the observed data is updated with the small high fidelity observed data set. The posterior mean $\mu_{posobs}$ of the observed data is estimated with Eq. 3.
(4) Finally, a one-sample t test is performed to test the statistical consistency between the simulation data and the posterior mean of the observed data. Table 5 shows the result of each step.

Table 5. Bayesian feature fusion results

$\mu_{priobs}$: 12.025
$\mu_{posobs}$: 11.986
$h_0$: $\mu(x_s) = \mu_{posobs}$; h (reject null hypothesis): 0; p-value: 0.579; confidence interval: [−0.275, 0.490]
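The four steps can be sketched as follows. The data arrays are illustrative stand-ins generated to match the statistics of Table 3 (not the paper's actual samples), and since Eq. (3) is not restated here, a standard normal-normal conjugate update is used as one plausible form of the posterior-mean estimate; `scipy.stats.ttest_1samp` supplies the one-sample t test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative stand-ins for the data sets of Table 2 (not the real samples).
x_l = rng.normal(12.025, 0.21, size=20)   # low fidelity observed data
x_h = rng.normal(11.948, 0.20, size=8)    # high fidelity observed data
x_s = rng.normal(12.093, 0.20, size=30)   # simulation data

# (1) 10000 bootstrap resamples (rows) of the 20 low fidelity samples.
boot = rng.choice(x_l, size=(10000, x_l.size), replace=True)

# (2) Prior mean of the observed data from the bootstrap distribution.
boot_means = boot.mean(axis=1)
mu_prior = boot_means.mean()
var_prior = boot_means.var(ddof=1)

# (3) Update with the small high fidelity set (conjugate normal update,
#     standing in for the paper's Eq. (3)).
var_h = x_h.var(ddof=1) / x_h.size
mu_post = (mu_prior / var_prior + x_h.mean() / var_h) \
          / (1.0 / var_prior + 1.0 / var_h)

# (4) One-sample t test of the simulation data against the posterior mean.
t_stat, p_value = stats.ttest_1samp(x_s, popmean=mu_post)
```

The posterior mean is pulled from the bootstrap prior toward the high fidelity sample mean, exactly the behavior Table 5 shows (12.025 to 11.986).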

The probability value (0.579) is more than ten times the significance level (0.05), which suggests that the mean of the simulation data is sufficiently consistent with the mean of the observed data. In this example, the Bayesian approach updates the estimate of the mean of the small high fidelity observed sample set with the relatively sufficient low fidelity observed data.

5.2 Credibility Evaluation with MADM-Based Decision Fusion

The electromagnetic railgun (EMRG) is a promising weapon system that launches projectiles with a high-power pulsed power supply system [11]. As an alternative to the physical system, the EMRG simulation model is playing an increasingly important role in EMRG research, and the validation of the discharge process of the pulsed power supply is a particularly important part of the credibility evaluation of the EMRG interior ballistics model. The power supply units are triggered in a certain time sequence. The discharge current of the pulsed power supply is expected to increase to the mega-ampere (MA) level and remain stable for several milliseconds [12]; the discharge current then decreases after all the power supply units have been triggered. Furthermore, the discharge current should satisfy the maximum current constraint. Hence, we use the proposed decision fusion approach for the credibility evaluation of the discharge current. $I_{su}$ $(u = 1, 2, \ldots, 50)$ are 50 predicted time series of discharge current from a lumped circuit simulation. $I_{pv}$ $(v = 1, 2, \ldots, 10)$ are 10 observed time series from prototype tests. $I_{hw}$ $(w = 1, 2, \ldots, 20)$ are 20 observed time series from a high fidelity distributed parameter circuit model. Figure 7 shows the curves of the three groups of data. Because each group of data includes many time series, we evaluate the credibility of the discharge current in two respects. On one side, the average trend similarity between simulation data and observed data is analyzed with a time-domain method (TIC [13]). On the other side, the maximum discharge current similarity is measured with a statistical method (area metric [14]). Then MADM is used to fuse the two analysis results.

Fig. 7. Change curves of discharge current: (a) $I_s$, (b) $I_p$, (c) $I_h$

(1) TIC analysis. First, the average trends of $I_s$ and $I_p$ are calculated; the comparison is shown in Fig. 8. Obviously, the mean trends are similar. Then, TIC is utilized to analyze the similarity of $I_s$ and $I_p$. According to the calculation, the TIC result is $\rho = 0.0043$, which means $I_s$ and $I_p$ are very well matched.
(2) Area metric analysis. Since $I_s$ and $I_h$ are sufficient, we use the area metric to calculate the statistical similarity of the maximum discharge current of each time series. The empirical cumulative distribution functions (CDFs) are shown in Fig. 9. According to the area metric, the area measure between the two CDFs is $a = 0.0684$.

Fig. 8. Mean of discharge current

Fig. 9. Maximum of discharge current



According to the area metric, the maximum discharge current difference between the simulation time series $I_s$ and the observed time series $I_h$ is small. For TIC or the area metric, a smaller result means a higher degree of similarity between the two groups of data, whereas a larger credibility value indicates that the model's output behavior is more likely to satisfy the accuracy requirement of the intended application. Thus, we transform the TIC and area metric results into credibility as follows [15]:

$$c(\rho) = \begin{cases} (1-\rho)(1-c_{th})/(1-\rho_{th}), & \rho \in (\rho_{th}, 1] \\ c_{th}(\rho_{th}-\rho)/(1-\rho_{th}) + c_{th}, & \rho \in [0, \rho_{th}] \end{cases}$$

$$c(a) = \begin{cases} (1-a)(1-c_{th})/(1-a_{th}), & a \in (a_{th}, 1] \\ c_{th}(a_{th}-a)/(1-a_{th}) + c_{th}, & a \in [0, a_{th}] \end{cases}$$


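The piecewise similarity-to-credibility transform can be sketched as a small function (the name is ours, and the formula is our reconstruction of the garbled original). With $a_{th} = 0.1$ it reproduces the reported area-metric credibility 0.8281 from $a = 0.0684$; notably, the reported TIC credibility 0.8851 for $\rho = 0.0043$ also follows if a threshold of 0.1 is used rather than the stated 0.15.

```python
def to_credibility(x, x_th, c_th=0.8):
    """Map a similarity metric x (TIC or area metric, smaller is better)
    to a credibility value via the piecewise transform above."""
    if x <= x_th:
        # acceptable similarity: credibility rises linearly above c_th
        return c_th * (x_th - x) / (1.0 - x_th) + c_th
    # unacceptable similarity: credibility drops below 1 - c_th
    return (1.0 - x) * (1.0 - c_th) / (1.0 - x_th)
```

The deliberate jump at the threshold penalizes any metric that fails the acceptability test, so a failing result can never be averaged back above the credible range.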
where $\rho$ and $a$ denote the TIC and area metric results, $\rho_{th}$ and $a_{th}$ are the acceptability thresholds of the similarity analysis, and $c_{th}$ is the acceptability threshold of credibility. Setting $\rho_{th} = 0.15$, $a_{th} = 0.1$, and $A_s = c_{th} = 0.8$, the credibility results between the multi-source observed data and the simulation data constitute the vector

$$c = [0.8851 \quad 0.8281]$$

(3) Decision fusion. In general, the validation data, the completeness of the data, and the validation metrics all influence the credibility result. If the evaluation data are complete and accurate, the output can be validated comprehensively; moreover, an appropriate method is beneficial for achieving an accurate evaluation result. Accuracy of data, completeness of data, and applicability of the validation metrics are therefore chosen as the criteria to evaluate each validation result. The decision matrix is evaluated as

$$R = \begin{bmatrix} 0.7 & 0.8 & 0.9 \\ 0.8 & 0.5 & 0.8 \end{bmatrix}$$

The row vectors of $R$ represent the assessment values of the TIC and area metric results. After standardization, the information entropy vector is $h = [1.9923 \quad 1.7889]^T$. The decision fusion result for the discharge current is $c = 0.8581$, which means the model is credible under the current data scenario. In practice, more similarity methods or sources of observed data may be utilized in credibility evaluation; the decision fusion approach provides a feasible way to integrate multiple evaluation results into one comprehensive result.

5.3


In this section, we provided two examples to examine the proposed data fusion based credibility evaluation methods. The case studies demonstrate that both Bayesian feature fusion and MADM-based decision fusion are effective for credibility evaluation with predicted data and multi-source observed data.



Bayesian parameter estimation and other Bayesian statistical methods offer a feasible approach to integrating and estimating the statistical characteristics of multi-fidelity observed data. For consistency analysis of different parameters, an appropriate conventional or Bayesian hypothesis test method should be carefully selected. Furthermore, outlier elimination, goodness-of-fit testing, and the bootstrap approach are applied in the Bayesian feature fusion, which suggests that credibility evaluation with feature level fusion should be implemented cautiously. The criteria and concerns of different similarity methods differ; thus, the participants in credibility evaluation tasks tend to select different methods to measure the consistency between multi-source observed data and simulation data. The MADM-based decision fusion approach provides a feasible way to integrate multiple evaluation results into a comprehensive evaluation result.

6 Conclusion

In this paper, the processes of original data fusion, feature level fusion, and decision level fusion for credibility evaluation are discussed. The Bayesian feature fusion method is appropriate for similarity analysis between simulation data and multi-fidelity observed data, while the MADM-based decision fusion approach is suitable for the integration of multiple assessment results. Under different multi-source observed data scenarios, an appropriate data fusion method should be selected to evaluate the credibility of a numerical model. Two case studies are performed to test the proposed data fusion based credibility evaluation methods. The results reveal that Bayesian feature fusion and MADM-based decision fusion provide effective ways for the similarity analysis of multi-source observed data and simulation data. There are still many problems to resolve in credibility evaluation with multi-source data fusion. For instance, the observed samples may not obey a normal distribution; under this circumstance, it is not appropriate to apply the conventional hypothesis tests straightforwardly in Bayesian feature fusion. In the future, more data fusion methods may be employed in the credibility evaluation of simulation models.

Acknowledgments. This work was supported by the National Natural Science Foundation of China (Grants No. 61374164 and 61627810).

References

1. Min, F.Y., Yang, M., Wang, Z.C.: Knowledge-based method for the validation of complex simulation models. Simul. Model. Pract. Theory 18(5), 500–515 (2010)
2. Li, C.Z., Mahadevan, S.: Role of calibration, validation, and relevance in multi-level uncertainty integration. Reliab. Eng. Syst. Saf. 148, 32–43 (2016)
3. Mullins, J., Ling, Y., Mahadevan, S., Sun, L., Strachan, A.: Separation of aleatory and epistemic uncertainty in probabilistic model validation. Reliab. Eng. Syst. Saf. 147, 49–59 (2016)



4. Wang, Z.Q., Fu, Y., Yang, R.Y.: Model validation of dynamic engineering models under uncertainty. In: Proceedings of the ASME 2016 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC/CIE (2016)
5. Li, X., Chen, W., Chan, C.Y., Li, B., Song, S.H.: Multi-sensor fusion methodology for enhanced land vehicle positioning. Inf. Fusion 46, 51–62 (2019)
6. Chen, Y.M., Hsueh, C.S., Wang, C.K., Wu, T.Y.: Sensor fusion, sensitivity analysis and calibration in shooter localization systems. J. Comput. Sci. 25, 327–338 (2018)
7. Wu, J., Su, Y.H., Cheng, Y.W., Shao, X.Y., Deng, C., Liu, C.: Multi-sensor information fusion for remaining useful life prediction of machining tools by adaptive network based fuzzy inference system. Appl. Soft Comput. 68, 13–23 (2018)
8. Novak, D., Riener, R.: A survey of sensor fusion methods in wearable robotics. Robot. Auton. Syst. 73, 155–170 (2015)
9. William, H., Xu, X., Prasanta, K.D.: Multi-criteria decision making approaches for supplier evaluation and selection: a literature review. Eur. J. Oper. Res. 202, 16–24 (2010)
10. Li, H., Bao, Y.Q., Ou, J.P.: Structural damage identification based on integration of information fusion and Shannon entropy. Mech. Syst. Signal Process. 22, 1427–1440 (2008)
11. Ma, P., Zhou, Y.C., Shang, X.B., Yang, M.: Firing accuracy evaluation of electromagnetic railgun based on multicriteria optimal Latin hypercube design. IEEE Trans. Plasma Sci. 45(7), 1503–1511 (2017)
12. McNab, I.R.: Pulsed power options for large EM launchers. In: 2014 17th International Symposium on Electromagnetic Launch Technology (2014)
13. Kheir, N.A., Holmes, W.M.: On validating simulation models of missile systems. Simulation 30(4), 117–128 (1978)
14. Roy, C.J., Oberkampf, W.L.: A comprehensive framework for verification, validation, and uncertainty quantification in scientific computing. Comput. Methods Appl. Mech. Eng. 200(25), 2131–2144 (2011)
15. Zhou, Y.C.: Transformation methods and assistant tools from data consistency analysis result to simulation credibility. Master dissertation, Harbin Institute of Technology, China (2014)

A Method of Parameter Calibration with Hybrid Uncertainty

Liu Bo1,2, Shang XiaoBing1, Wang Songyan1, and Chao Tao1

1 Harbin Institute of Technology, Harbin, China
[email protected]
2 China Shipbuilding Industry Corporation, Xi'an, China

Abstract. A method combining cumulative distribution functions with a modified Kolmogorov–Smirnov test is proposed to solve the parameter calibration problem under hybrid uncertainty in the model, with a genetic algorithm seeking the optimal result. The framework is built on comparing the cumulative distribution function of target observed values with that of sample values. First, an auxiliary variable method is used to decompose hybrid parameters into sub-parameters with only one kind of uncertainty, aleatory or epistemic, because only epistemic uncertainty can be calibrated. Then optimal matching values are found with a genetic algorithm according to an index of the difference between joint cumulative distribution functions. Finally, we demonstrate, on a Mars entry dynamics profile, that the proposed model calibration method is able to obtain approximations of the unknown true values of the epistemic parameters. The example illustrates the rationality and efficiency of the method.

Keywords: Hybrid uncertainty · Parameter calibration · Auxiliary variable

1 Introduction

In the engineering processes of aeronautics and astronautics, computer simulation has become an important means to save cost and improve work efficiency in engineering model design. The foundation of simulation is a mathematical model describing the real world, so finding a method that describes the model accurately plays an increasingly important role. However, in actual engineering development, the model is undetermined in the presence of various uncertainties arising from lack of knowledge, design and manufacturing defects, and the variability of the product's operating environment. This not only leads to an uncertain model, but may also degrade the output under the influence of the above uncertainties. The purpose of this paper is to find approximations of the unknown true values of parameters under the condition of sparse target data. In different environments, mathematical models and experimental data are fraught with uncertainty. There are usually five types of uncertainty sources, shown in Fig. 1. Among these sources, parameter uncertainty can be summarized into the following two categories.

© Springer Nature Singapore Pte Ltd. 2018 L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 32–44, 2018.



(1) Aleatory uncertainty: Aleatory uncertainty is also known as statistical uncertainty and represents unknowns that differ each time the same experiment is run. It refers to the variability inherent in the system and its operating environment, and it cannot be reduced by running more experiments or collecting more data.
(2) Epistemic uncertainty: Epistemic uncertainty is also known as systematic uncertainty and is due to things one could in principle know but does not in practice, whether because a measurement is not accurate, because the model neglects certain effects, or because particular data have been deliberately hidden. It refers to the uncertainty in subjective understanding or information about some parameter values caused by lack of knowledge, and it can be reduced or possibly eliminated with more information.

Sources of uncertainty: parameter uncertainty; input parametric variability; variability of experimental measurements; structural uncertainty; algorithmic uncertainty.

Fig. 1. System uncertainty source

After years of research, uncertainty quantification (UQ) methods have gradually been developed to address uncertainty in systems. UQ is the science of quantitative characterization and reduction of uncertainties in both computational and real-world applications; it tries to determine how likely certain outcomes are if some aspects of the system are not exactly known. Quantifying the uncertainty in a system therefore helps to reduce the occurrence of accidents in the design process, which is of great significance. There are two major types of problems in UQ, shown in Fig. 2: one is the forward propagation of uncertainty (where the various sources of uncertainty are propagated through the model to predict the overall uncertainty in the system response), and the other is the inverse assessment of model uncertainty and parameter uncertainty (where the model parameters are calibrated using test data). There has been a proliferation of research on the former problem, and the majority of uncertainty analysis techniques were developed for it. The latter problem, however, is drawing increasing attention in the engineering design community, since UQ of a model and the subsequent prediction of the true system response(s) are of great interest in designing robust systems.

The method of this paper addresses the inverse assessment of model uncertainty, where the model parameters are calibrated to reduce parameter epistemic uncertainty. Parameter calibration (PC) estimates the values of one or more unknown parameters in a mathematical model. In recent years, there has been a great deal of research on model PC methods under uncertainty. It is generally believed that aleatory uncertainty is usually described directly with probability methods, while epistemic uncertainty needs to be described according to its specific characteristics. Currently, the widely used methods include interval theory, evidence theory, fuzzy sets, possibility theory, and convex models.


Fig. 2. Two forms of solving the problem of UQ.

The object calibrated is the epistemic uncertainty parameter in the model. This paper first introduces the classification and representation (all probability-based) of the commonly used uncertain parameters and presents the method for decomposing the hybrid uncertainty to be calibrated. Then the calibration method for hybrid uncertainty parameters is studied. The empirical cumulative distribution function (ECDF) of generated samples is compared with the ECDF of the real data using a modified two-sample Kolmogorov–Smirnov (K-S) test, and the optimal matching value is then found using a genetic algorithm. The optimal matching value is taken as the approximation of the epistemic uncertainty parameters, to eliminate the epistemic uncertainty in the system caused by lack of knowledge.

2 Hybrid Uncertainty Parameter Decomposition

In practical engineering, system parameters usually exist in three forms: aleatory, epistemic, and hybrid forms that mix aleatory and epistemic uncertainty. However, only epistemic uncertainty parameters can be calibrated; aleatory uncertainty parameters cannot be reduced. Therefore, an auxiliary variable method should be used to decompose hybrid uncertain parameters into a series of sub-parameters, each with only one kind of uncertainty.

2.1 Characteristics of Uncertain Model Parameters

In practical engineering, there are three kinds of uncertainty in system parameters. The classical descriptions of the subjective and objective uncertainty in a stochastic model are formulated in the form of probability distributions and interval theory, respectively. By combining the two forms, the aleatory uncertainty in a hybrid uncertainty parameter is presented as the random form of the parameter, i.e., a random value from some probability distribution, while the epistemic uncertainty is a random form of the statistical parameters of that probability distribution. The uncertainty model satisfies Eq. 1:

$$Y = f(X, P, t) \qquad (1)$$


where $X = [x_1, x_2, \ldots, x_{nx}]^T \in R^{nx}$ is a vector of certain input variables and certain parameters, with $nx$ the number of variables, and $Y = [y_1, y_2, \ldots, y_{np}]^T \in R^{np}$ is the vector of model output observations, with $np$ the number of outputs. $P$, a vector of all uncertain parameters, has three forms, as given by

$$P = [P_a, P_e, P_h] \qquad (2)$$


(1) $P_a$ is a class of variables with only aleatory uncertainty. It is modeled as a random distribution whose distribution parameters are constant, as in the Gaussian case.
(2) $P_e$ is a class of variables with epistemic uncertainty. It is modeled as a fixed but unknown constant that lies within a given interval.
(3) In practice, both kinds of uncertainty sometimes coexist in the same model parameter, which can be regarded as a third uncertain variable type, named hybrid uncertainty. $P_h$ is a class of hybrid uncertainty mixing aleatory and epistemic uncertainty. It is modeled as a random variable with a fixed functional form but unknown parameters.

2.2 Auxiliary Variable Method

The auxiliary variable (AV) method is an effective method for parameter calibration under UQ. It can decompose a parameter with hybrid uncertainty into multiple sub-parameters, each with a single kind of uncertainty; the advantage is that it makes the subsequent PC problem convenient to solve. Assume that one parameter $P$ of the model (Eq. 1), with both aleatory and epistemic uncertainty, obeys a certain distribution $P \sim g(\theta_1, \theta_2, \ldots, \theta_n)$, where each sub-parameter $\theta_i$ may be either a certain value or satisfy a certain probability distribution. When the distribution form of $g$ is determined, the sub-parameters $\theta_i$ are just the AVs, and each sub-parameter has only one kind of uncertainty. If the distribution form of $g$ is implicitly undetermined, we can define $A_P$ to represent the aleatory uncertainty of parameter $P$ and $E_P$ to represent its epistemic uncertainty. The two sub-parameters $A_P, E_P$ can then be seen as the AVs, calculated by Eq. 3.



$$A_P(p) = F_P(p \mid X = E_P) = \int_{-\infty}^{p} f_P(p \mid E_P)\, dp \qquad (3)$$

where $F_P(p \mid X = E_P)$ is the cumulative distribution function (CDF) of the parameter $P$. Each value of the parameter $P$ corresponds to a random variable $A_P$; therefore, the aleatory uncertainty of the parameter can be represented equivalently by $A_P$. The parameter $P$ can be expressed by the inverse function of the CDF, as shown by Eq. 4:

$$P = h(A_P, E_P) = F_P^{-1}(A_P \mid E_P) \qquad (4)$$


where $E_P$ is an AV for the epistemic uncertainty of the parameter, which is equal to the epistemic uncertainty of $\theta_i$. Therefore, a parameter with hybrid uncertainty can be represented by the two auxiliary variables $(A_P, E_P)$. In order to unify the parameter variables in this paper, we use AVs to express parameters with a single kind of uncertainty in the same way. The model then has a series of sub-parameters $\theta$. Figure 3 shows the result of decomposing the parameters with the AV method.
Fig. 3. Parameter decomposition diagram
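A minimal sketch of recomposing a hybrid parameter from its auxiliary variables via Eq. (4): the aleatory AV $A_P$ is drawn uniformly on (0, 1) and pushed through the inverse CDF. The distributional form is an assumption made for illustration, here a Gaussian whose mean is the epistemic sub-parameter and whose fixed spread carries the aleatory part.

```python
import numpy as np
from scipy import stats

def sample_hybrid(a_p, e_p, sigma=0.1):
    """P = F_P^{-1}(A_P | E_P): inverse-CDF recomposition of Eq. (4).

    Assumed form: P ~ Normal(mean=e_p, sd=sigma), where the mean is the
    epistemic sub-parameter and the spread carries the aleatory part.
    """
    return stats.norm.ppf(a_p, loc=e_p, scale=sigma)

rng = np.random.default_rng(1)
a_p = rng.uniform(0.0, 1.0, size=1000)  # aleatory AV: A_P ~ U(0, 1)
e_p = 2.0                               # epistemic AV: fixed, unknown in practice
p = sample_hybrid(a_p, e_p)
```

Calibration then reduces to searching over the scalar `e_p` while `a_p` reproduces the irreducible scatter, which is exactly the separation the AV method is after.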

3 Parameter Calibration Matching Method

After qualitative analysis of the uncertain parameters, a certain method is used to establish their quantitative model. Whether judged from the nature of the parameters or through expert judgment, there is always some personal bias or lack of knowledge, so the epistemic uncertainty interval generally leaves a margin: the epistemic uncertainty is given as an interval distribution pattern instead of the constant true value it should be. In order to establish a more accurate simulation model, this paper puts forward a matching method that looks for an approximation of the true epistemic uncertainty parameter using the CDF.



The CDF matching approach uses the concept of the two-sample K-S test to compare the ECDF of the observations with the ECDF of the output values generated using the aleatory uncertainties for some realization of the epistemic uncertainty variables. The specific process can be summarized in 5 steps.

Step 1: Decompose all uncertain parameters of the model, especially the hybrid uncertain parameters. The AVs are also used to express parameters with a single kind of uncertainty in the same way, in order to unify the parameter variables. After determining the sub-parameters $\theta$ of the model, we should make clear which sub-parameters carry aleatory uncertainty that cannot be calibrated and which carry epistemic uncertainty that can be calibrated. With $P(\theta)$, the uncertainty model of Eq. 1 can thus be transformed into Eq. 5:

$$Y = f(X, P(\theta)) = f(X, \theta) \qquad (5)$$

Step 2: Retrieve a given number of realistic target observations from the database. In this paper, the real target observation values are replaced by simulated observation values, obtained by applying the aleatory uncertainty with the epistemic uncertainty parameters held at their true values. The corresponding ECDF is calculated after simply eliminating outliers. For $N$ independent identically distributed samples, the ECDF is given by formula 6:

$$F_N(Y(X, \theta)) = \frac{1}{N} \sum_{i=1}^{N} I_{Y \le y_i(\theta)} \qquad (6)$$

where $I$ is the indicator function: if $X_i \le x$, then $I = 1$; otherwise $I = 0$. It is shown as the red curve of Fig. 4.





Fig. 4. The difference ECDF between the real target observation value and samples (Color figure online)



Step 3: Based on the two-sample K-S test method, Latin hypercube sampling is used for the uncertain parameters of the model. First, get $N'$ random output samples of observations for one particular realization of $\theta$. Then an ECDF can be generated by Eq. (7); it is shown as the blue curve of Fig. 4. The random stream used to generate the $N'$ output samples for a particular $\theta$ realization is fixed, to reduce the noise in the objective function calculation.

$$F_i(Y(X, \theta)) = \frac{1}{N'} \sum_{i=1}^{N'} I_{Y \le y_i(\theta)}, \quad i = 1, \ldots, N' \qquad (7)$$

Step 4: A modified K-S test, used here to compare two ECDFs with $N$ (given observations) and $N'$ (randomly generated using aleatory uncertainty) samples, $D^i_{NN'}$, is given by Eq. (8).

Fig. 5. Flowchart describing CDF matching method


$$D^i_{NN'} = \sum \left( F_N - F_i \right)^2 \qquad (8)$$

which represents the squared error between the two ECDF curves. From Eq. (8) it can be seen that the smaller the index $D^i_{NN'}$, the smaller the difference between the two distribution function curves, and the closer the selected values of the corresponding sub-parameters are to the real values of the epistemic uncertainty parameters. Therefore, the minimum value of the index is used to select the optimal matching value $\theta$ for the epistemic uncertainty parameters.

Step 5: To find the minimum value of the index, the genetic algorithm (GA) is used for optimization. The minimum distance is used as the fitness function, and sample points violating the Latin hypercube design restriction contribute a large penalty value to the fitness function. We then obtain a set of optimized sub-parameters. To increase generality, the random stream needs to be dynamic: steps 3 and 4 are repeated $m$ times. The matching sets of optimized values from the $m$ cycles compose a new calibration interval for the uncertain parameters, which shows the dynamically calibrated value fluctuating around the true value of the epistemic parameters. Finally, the median of the new interval is taken as the best approximation of the real value of the parameter:

$$\theta_i \in [\theta_{lower}, \theta_{upper}] \qquad (9)$$


A flowchart describing the CDF matching method is shown in Fig. 5.
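The loop of Fig. 5 can be sketched end-to-end on a toy scalar model. A plain random search over the epistemic interval stands in for the genetic algorithm, and all model details (noise level, interval, sample sizes) are illustrative choices of ours; the ECDF distance is Eq. (8) evaluated on the pooled sample points.

```python
import numpy as np

def ecdf_distance(obs, sim):
    """D = sum of (F_N - F_i)^2 over the pooled sample points, Eq. (8)."""
    grid = np.sort(np.concatenate([obs, sim]))
    F_N = np.searchsorted(np.sort(obs), grid, side="right") / obs.size
    F_i = np.searchsorted(np.sort(sim), grid, side="right") / sim.size
    return float(((F_N - F_i) ** 2).sum())

rng = np.random.default_rng(2)

theta_true = 3.0                     # unknown epistemic parameter
noise_sd = 0.5                       # aleatory spread of the toy model
obs = theta_true + rng.normal(0.0, noise_sd, size=50)  # target observations

# Fixed random stream for the simulated outputs (step 3) to de-noise D.
noise = rng.normal(0.0, noise_sd, size=100)

# Random search over the epistemic interval, standing in for the GA (step 5).
candidates = rng.uniform(1.0, 5.0, size=200)
D = [ecdf_distance(obs, theta + noise) for theta in candidates]
theta_hat = candidates[int(np.argmin(D))]
```

Repeating the search with fresh random streams, as Step 5 prescribes, turns the scatter of `theta_hat` values into the calibrated interval of Eq. (9).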

4 Example Analysis

To further illustrate the feasibility and effectiveness of the parameter calibration matching method, an example is given: a 3-d output model of the state trajectories of a low-lift Mars entry vehicle under uncertainty in the initial conditions and other system parameters. The evolution of uncertainty due to the initial conditions, ballistic coefficient, lift-to-drag ratio, and atmospheric density is quantified based on statistical properties. Finally, it is demonstrated that the parameter calibration approach is able to match the approximation of the true values of the uncertain parameters in a Mars entry dynamics profile.

4.1 Mars Entry Dynamics and Uncertainty Modeling

Future Mars missions, such as sample return and human exploration, require high landing accuracy. However, large uncertainties in the Mars atmospheric density, the initial state variables at the entry interface, and the vehicle's aerodynamic parameters degrade the flight-state precision and result in large delivery deviations between the actual terminal state and the designated target. To develop next-generation planetary entry, descent, and landing (EDL) technologies, parameter calibration has become a significant technical challenge for accurate modeling.


L. Bo et al.

As the entry process lasts only a very short time and the Mars rotation angular rate affects the state change by less than a magnitude of 10⁻⁴, the MSL-like low-lift entry vehicle can be modeled as an unpowered point mass flying in a stationary Martian atmosphere. The entry vehicle is commanded to fly at a constant trim angle-of-attack, and controlled guidance of the vehicle is achieved only through modulating the bank angle. To reflect the impact of dynamical modeling uncertainty on the terminal state of the entry vehicle, the longitudinal dynamic equations are written as Eq. (10); longitude and latitude integrations are not required in this reduced formulation since they are replaced by the downrange.

    ḣ = v sin γ
    v̇ = −ερv²/(2B_c) − μ sin γ/(r₀ + h)²                                  (10)
    γ̇ = kερv cos u/(2B_c) − μ cos γ/(v(r₀ + h)²) + v cos γ/(r₀ + h)


where h is the flight altitude, v is the velocity, γ is the flight path angle, u is the bank angle, ρ is the atmospheric density, μ is the Mars gravitational parameter, r₀ is the Mars radius, B_c is the ballistic coefficient, k is the lift-to-drag ratio, and ε is the uncertain factor of the atmospheric density. For Mars atmospheric entry, the most significant uncertainties that affect the landing footprint dispersion come from the initial conditions and the aerodynamic parameters. In fact, the initial state at the entry interface cannot be accurately obtained by the Deep Space Network, and the aerodynamic parameters also cannot be accurately modeled through wind tunnel experiments. Therefore, the uncertainties in the Mars entry dynamics are modeled as follows:

    h(t₀) = h₀ + δ·Δh₀
    v(t₀) = v₀ + δ·Δv₀
    γ(t₀) = γ₀ + δ·Δγ₀
    B_c = (1 + δ·ΔB_c)·B̄_c
    k = (1 + δ·Δk)·k̄
    ε = (1 + δ·Δε)·ε̄


where the stochastic variable δ ∈ [−1, 1], and Δh₀, Δv₀, Δγ₀, ΔB_c, Δk, Δε represent the perturbations about the corresponding nominal values h₀, v₀, γ₀, B̄_c, k̄, ε̄, respectively. Here, these parameters are assumed to have uniform distributions about their nominal values; thus, Eq. (10) becomes a set of stochastic differential equations. The uncertain parameters are set according to Tables 1 and 2. The dynamics are driven by the MSL entry guidance. The planned entry time span is 300 s, and the simulation step is set to one second. The MATLAB (2014a, 64-bit) environment is adopted to implement the numerical simulation and analysis.
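The nominal dynamics of Eq. (10) can be integrated numerically. The sketch below uses scipy with an exponential Mars atmosphere; the planetary and atmospheric constants (μ, r₀, ρ₀, scale height) are assumptions not given in the text, while the nominal state and ballistic coefficient follow Tables 1 and 3 and the lift-to-drag ratio is taken as the midpoint of the Table 2 interval.

```python
import numpy as np
from scipy.integrate import solve_ivp

MU = 4.2828e13                   # Mars gravitational parameter, m^3/s^2 (assumed)
R0 = 3396.2e3                    # Mars radius, m (assumed)
RHO0, H_SCALE = 0.0158, 9354.0   # exponential atmosphere, kg/m^3 and m (assumed)

def entry_dynamics(t, x, bc=121.622, k=0.2483, eps=1.0, bank=0.0):
    """Longitudinal entry dynamics of Eq. (10); x = [h, v, gamma] (m, m/s, rad)."""
    h, v, gamma = x
    r = R0 + h
    q = 0.5 * eps * RHO0 * np.exp(-h / H_SCALE) * v**2   # dynamic pressure
    h_dot = v * np.sin(gamma)
    v_dot = -q / bc - MU * np.sin(gamma) / r**2
    gamma_dot = (k * q * np.cos(bank) / (bc * v)
                 - MU * np.cos(gamma) / (v * r**2)
                 + v * np.cos(gamma) / r)
    return [h_dot, v_dot, gamma_dot]

def low_altitude(t, x):
    # Stop the integration well above the surface, where the
    # exponential density model is no longer meaningful.
    return x[0] - 10e3
low_altitude.terminal = True

x0 = [135.6e3, 6750.0, np.radians(-15.2)]  # nominal entry state from Table 1
sol = solve_ivp(entry_dynamics, [0.0, 300.0], x0,
                events=low_altitude, max_step=1.0)
```

Sampling δ and re-integrating for each draw then produces the state-trajectory dispersions whose ECDFs are matched in the calibration step.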

4.2 Simulation Results

There are hybrid uncertainty parameters among the initial conditions and uncertain parameters given in Tables 1 and 2. Therefore, the first step is to decompose each complex parameter into sub-parameters with the AV method.

A Method of Parameter Calibration with Hybrid Uncertainty

Table 1. The uncertain initial conditions setup

| Parameter         | Symbol | Categories | Expectation | Variance | Distribution | Unit |
|-------------------|--------|------------|-------------|----------|--------------|------|
| Flight altitude   | h(t₀)  | 1          | 135.6       | 10       | Uniform      | km   |
| Flight velocity   | v(t₀)  | 1          | 6750        | 33       | Uniform      | m/s  |
| Flight path angle | γ(t₀)  | 1          | −15.2       | 0.15     | Uniform      | °    |

Table 2. The uncertain aerodynamic parameters setup

| Parameter             | Symbol | Categories | Expectation      | Variance            | Distribution | Unit  |
|-----------------------|--------|------------|------------------|---------------------|--------------|-------|
| Ballistic coefficient | B_c    | 2          | 121.622          | 11.2                | Uniform      | kg/m² |
| Lift-to-drag ratio    | k      | 3          | [0.2233, 0.2733] | 0.008 (each border) | Uniform      | –     |
| Density uncertainty   | ε      | 3          | [0.9, 1.1]       | 0.05 (each border)  | Uniform      | –     |

Table 3. Sub-parameters values

| Original parameter | Categories | Sub-parameter         | Expectation | Variance | Distribution |
|--------------------|------------|-----------------------|-------------|----------|--------------|
| P1 = h(t₀)         | 1          | θ1 = P1               | 135.6       | 10       | Uniform      |
| P2 = v(t₀)         | 1          | θ2 = P2               | 6750        | 33       | Uniform      |
| P3 = γ(t₀)         | 1          | θ3 = P3               | −15.2       | 0.15     | Uniform      |
| P4 = B_c           | 2          | θ4 = P4               | 121.622     | 11.2     | Uniform      |
| P5 = k             | 3          | θ5: lower bound of P5 | 0.2233      | 0.008    | Uniform      |
|                    |            | θ6: upper bound of P5 | 0.2733      | 0.008    | Uniform      |
| P6 = ε             | 3          | θ7: lower bound of P6 | 0.9         | 0.05     | Uniform      |
|                    |            | θ8: upper bound of P6 | 1.1         | 0.05     | Uniform      |

The first four variables (flight altitude, flight velocity, flight path angle, and ballistic coefficient) each have only one kind of uncertainty. They are decomposed into sub-parameters of the same form as the original variables, simply to keep the notation unified. The sub-parameters after decomposition are given in Table 3.



Among the sub-parameters, θ4 to θ8 can be calibrated from target data because they carry epistemic uncertainty, using the parameter calibration algorithm based on the cumulative distribution function described in Sect. 3. In the simulation, three sets of observations (30, 100, and 500) are generated using the true values of the epistemic uncertainty variables in the model. For the CDF matching approach, the K-S test is conducted by generating 50 epistemic-variable (θe) realizations using Latin hypercube sampling, to ensure that a minimum is found. For each θe realization, 508 output samples (500 via Latin hypercube sampling plus 8 border points in three dimensions) are generated using the aleatory uncertainty θa, and the ECDF of these generated samples is compared against the ECDF of the given observations. This process is repeated for all θe realizations to obtain the variation in the K-S statistic. The results of parameter calibration for the different observation sample sizes are given in Table 4, and a bar chart comparing the relative error between the true values of the epistemic parameters and the calibrated approximations for the three observation sets is given in Fig. 6.

Table 4. Calibrated epistemic variables for the three observation sets

| Sub-parameter | True value | Calibrated (30 obs.) | Calibrated (100 obs.) | Calibrated (500 obs.) |
|---------------|------------|----------------------|-----------------------|-----------------------|
| θ4            | 121.622    | 122.9497             | 122.4456              | 121.6385              |
| θ5            | 0.2233     | 0.2222               | 0.2238                | 0.2250                |
| θ6            | 0.2733     | 0.2741               | 0.2722                | 0.2730                |
| θ7            | 0.9        | 0.8862               | 0.8994                | 0.9010                |
| θ8            | 1.1        | 1.0843               | 1.1032                | 1.0996                |
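The Latin hypercube generation of the 50 θe realizations can be sketched in pure numpy. The calibration intervals below are derived from Tables 2 and 3 (expectation plus or minus the stated half-width) and are illustrative; a library implementation such as `scipy.stats.qmc.LatinHypercube` would serve equally well.

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """n samples in d dimensions: exactly one sample in each of the n
    equal-probability strata along every axis."""
    u = (np.arange(n)[:, None] + rng.random((n, d))) / n  # stratified in [0, 1)
    for j in range(d):
        rng.shuffle(u[:, j])                              # decouple the axes
    return u

rng = np.random.default_rng(0)
# Intervals for the epistemic sub-parameters theta_4..theta_8.
lower = np.array([110.422, 0.2153, 0.2653, 0.85, 1.05])
upper = np.array([132.822, 0.2313, 0.2813, 0.95, 1.15])
realizations = lower + latin_hypercube(50, 5, rng) * (upper - lower)
```

Each row of `realizations` is one candidate θe vector whose induced output ECDF is then compared against the observations.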

Fig. 6. The relative error between the true value and the calibrated value for 30, 100, and 500 observations (x-axis: epistemic uncertainty sub-parameters; y-axis: relative error)



As Fig. 6 shows, the error between the true values and the approximate values obtained with the calibration method generally becomes smaller as the number of observations increases. This is consistent with the principle that epistemic uncertainty can be reduced through knowledge: the more knowledge we obtain, the more accurate the parameters produced by the calibration method in this paper become, and the uncertainty can even be eliminated when there is enough knowledge. Occasionally, θ5 and θ6 violate this trend because biased observations produce too many outliers, but the relative error remains small.

5 Conclusion

Determining the value of an epistemic parameter, which is unknown or known only as an interval, is important in engineering practice under combined aleatory and epistemic uncertainties. This paper proposes a parameter calibration method for such uncertainty models. The method completes the calibration process by combining the CDF with a modified K-S test: it compares the distance between the joint CDF of the sample values and that of the observed values to complete the matching process, and seeks the optimum with a genetic algorithm. Finally, a mathematical example of Mars entry dynamics is used to verify the method. The example shows that the method applies successfully to hybrid-uncertainty systems and finds the true values of the parameters that must be known in engineering, and it confirms that the more observation information is available, the more accurate the recovered values are. The method can thus be used to determine epistemic parameters of a system that are unknown but need to be known.

Acknowledgement. This work was supported by the National Natural Science Foundation of China (Nos. 61790562, 61627810, 61403096).


Reinforcement Learning Testbed for Power-Consumption Optimization

Takao Moriyama(B), Giovanni De Magistris, Michiaki Tatsubori, Tu-Hoa Pham, Asim Munawar, and Ryuki Tachibana

IBM Research – Tokyo, Tokyo, Japan
{moriyama,giovadem,mich,pham,asim,ryuki}

Abstract. Common approaches to control a data-center cooling system rely on approximated system/environment models that are built upon the knowledge of mechanical cooling and electrical and thermal management. These models are difficult to design and often lead to suboptimal or unstable performance. In this paper, we show how deep reinforcement learning techniques can be used to control the cooling system of a simulated data center. In contrast to common control algorithms, those based on reinforcement learning techniques can optimize a system's performance automatically without the need for explicit model knowledge; instead, only a reward signal needs to be designed. We evaluated the proposed algorithm on the open-source simulation platform EnergyPlus. The experimental results indicate that we can achieve a 22% improvement compared to a model-based control algorithm built into EnergyPlus. To encourage the reproduction of our work as well as future research, we have also publicly released an open-source EnergyPlus wrapper interface ( directly compatible with existing reinforcement learning frameworks.

Keywords: Reinforcement learning · Power consumption · Data center


1 Introduction

Data centers worldwide consume around 3% of the total electricity consumed globally. This electricity consumption adds up to roughly 90 billion kilowatt-hours annually in the US; thus, it is one of the largest issues regarding data centers. A report of the US Department of Energy found that effective energy-saving policies reduce the energy consumption of data centers (620 billion kilowatt-hours will be saved between 2010 and 2020). The design of hardware and software components is a key factor in saving energy. In this study, we focused on improving the built-in control algorithm (hereafter, 'controller') of one of the major sources of energy consumption, i.e., the cooling system [10], by using recent advances in deep reinforcement learning (DRL). A cooling system is composed of multiple components controlled by adjusting the desired values of different control temperatures called setpoints. Common approaches to control them are based on approximate models of the system [12]. These models are sometimes inaccurate or too complex to calculate. Therefore, a data-driven approach has recently been considered as an interesting alternative. In such cases, the control policy is learned using the data acquired from the system without using any explicit system model. We discuss the advantages of using this approach compared to the classical model-based approach. We propose a controller for the cooling system using a state-of-the-art DRL technique. By designing a reward function considering power consumption and reference temperature ranges, our DRL controller outperforms the built-in controller by 22% in terms of power consumption while maintaining a similar temperature range. Compared to previous related studies [6,14], we provide a DRL open-source environment for EnergyPlus [1] following the OpenAI Gym interface. This allows designers to easily test their algorithms and control strategies as well as to use existing algorithms based on the interface. The paper is structured as follows. In Sect. 2, we formally define the problem. In Sect. 3, we explain the open-source environment we built and released to train DRL controllers for energy optimization. In Sect. 4, we describe our DRL controller and resolution method. In Sect. 5, we present simulation results on different control configurations (e.g., classical vs. DRL) and environments (e.g., weather settings). We conclude the paper in Sect. 6 with directions for future work.

© Springer Nature Singapore Pte Ltd. 2018. L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 45–59, 2018.


2 Control of Heating, Ventilation, and Air-Conditioning System

The heating, ventilation, and air-conditioning (HVAC) controller optimizes the data-center power consumption under operation constraints (e.g., maintaining temperatures in specified ranges for employees). The design of this controller is not simple due to the complexity of the HVAC system (e.g., fluid dynamics and thermodynamics) and its dependence on many external factors (e.g., external weather conditions). Figure 1 shows a model of a data center, which consists of two zones (server rooms). Each zone has a dedicated HVAC system, which is connected through a "supply air" duct and a "return air" duct to exchange heat. Each HVAC system is composed of several components connected sequentially as follows:
– outdoor air system (OA System): exchanges zone and outdoor air flow
– variable volume fan (VAV Fan): adjusts the air flow rate to meet the zone target temperature



Fig. 1. Control of each zone of data center

– direct evaporative cooler (DEC): lowers the air temperature using latent heat from the evaporation of water
– indirect evaporative cooler (IEC): lowers the air temperature without adding humidity to the air
– direct expansion cooling coil (DX CC): cools air by passing condensed refrigerant through a heat exchanger
– chilled water cooling coil (CW CC): cools air using chilled water.
The target temperatures for the DEC, IEC, DX CC, and CW CC are specified by a single common temperature, called the setpoint temperature, for each zone. Using model-based approaches, the setpoint manager calculates the setpoint temperatures based on a built-in model of the system dynamics, environmental information, system loads, etc. The air volume supplied to each zone is also adjusted by the VAV Fan. To apply DRL to the data-center model, we replace the model-based controller (setpoint manager and VAV Fan controller) with our DRL-based controller.


3 Reinforcement Learning Testbed for EnergyPlus

3.1 Simulation Wrapper

Figure 1 illustrates our simulation system setup. Our DRL controller is composed of two scripts: the DRL-based agent and the energy management system (EMS). We use the OpenAI Baselines open-source implementation of the trust-region policy optimization (TRPO) algorithm [2] as the agent; we explain the details of this algorithm in Sect. 4. We define the simulation wrapper (EnergyPlusEnv) shown in Fig. 1 to manage the interaction between the learning agent and the EnergyPlus simulation. We create this wrapper following the OpenAI Gym interface, which consists of the following two major APIs:
– Env.reset(): restart the simulation process
– Env.step(): proceed one simulation timestep.
At the start of training, a new instance of EnergyPlusEnv is created, which then creates a pair of named pipes: one for sending information from EnergyPlusEnv to EnergyPlus and the other for the opposite direction. When Env.reset() is called by the agent, EnergyPlusEnv spawns a new EnergyPlus process with the information about the names of the pipes, the building model file (input data file, IDF), and the weather file (EnergyPlus weather file, epw). EnergyPlusEnv then waits until the first observation is returned from the EnergyPlus simulation. Env.step() sends the action to EnergyPlus and waits until an observation is returned; EnergyPlusEnv then computes the reward and sends it along with the observation to the agent. The EMS script in Fig. 1 receives action information from EnergyPlusEnv and sets it to the corresponding variables in the EnergyPlus simulation process. It also gathers information from the EnergyPlus process and sends it to EnergyPlusEnv as observations. The communication protocol between EnergyPlusEnv and the simulation consists of sending a series of floating-point action values to the EnergyPlus simulation; similarly, we receive a series of floating-point values as the observation back from the EnergyPlus simulation.
Therefore, we do not need to encode any command codes; in both directions, we simply send (receive) the number of values followed by the values themselves.

3.2 Extending the Built-In Energy Management System

In this section, we explain how to hook into the process of EnergyPlus. EnergyPlus provides two frameworks to extend its functionality:
– Building Controls Virtual Test Bed (BCVTB): a software framework for co-simulation, which allows importing a functional mockup unit (FMU) from other systems, such as MATLAB/Simulink, into EnergyPlus and exporting EnergyPlus as an FMU to other simulation systems. It is intended for building large-scale, distributed simulation systems that include EnergyPlus as a component.



– EMS: a high-level control method available in EnergyPlus. The EMS uses a small scripting language called EnergyPlus Runtime Language (Erl) to access a variety of sensor data, make decisions based on these data, and select the control actions for actuators.
While the BCVTB is designed to connect two or more simulation systems, the EMS is designed to extend EnergyPlus capabilities internally and has no capability to connect to systems outside EnergyPlus. We used the EMS to extend EnergyPlus because of its simplicity, and added two built-in functions to Erl to allow EMS scripts to communicate with external entities:
– @ExtCtrlObs: send the observation vector to EnergyPlusEnv.
– @ExtCtrlAct: receive the action vector from EnergyPlusEnv.
We selected a two-zone data-center model ("2ZoneDataCenterHVAC_wEconomizer.idf") because it has a combination of different types of HVAC components (DEC, IEC, single-speed DX CC, and CW CC). @ExtCtrlObs and @ExtCtrlAct are used in the Erl script as follows:

 1: EnergyManagementSystem:Program,
 2:   ExtCtrlBasedSetpointManager,
 3:   SET tmp = @ExtCtrlObs 1 OutdoorTemp,
 4:   SET tmp = @ExtCtrlObs 2 WestZoneTemp,
 5:   SET tmp = @ExtCtrlObs 3 EastZoneTemp,
 6:   SET tmp = @ExtCtrlObs 4 Whole_Building_Power,
 7:   SET tmp = @ExtCtrlObs 5 IT_Equip_Power,
 8:   SET tmp = @ExtCtrlObs 6 Whole_HVAC_Power,
 9:   SET tmp = @ExtCtrlAct 0 6,
10:   SET WestZoneDECOutletNode_setpoint = @ExtCtrlAct 1,
11:   SET WestZoneIECOutletNode_setpoint = @ExtCtrlAct 1,
12:   SET WestZoneCCoilAirOutletNode_setpoint = @ExtCtrlAct 1,
13:   SET WestAirLoopOutletNode_setpoint = @ExtCtrlAct 1,
14:   SET EastZoneDECOutletNode_setpoint = @ExtCtrlAct 2,
15:   SET EastZoneIECOutletNode_setpoint = @ExtCtrlAct 2,
16:   SET EastZoneCCoilAirOutletNode_setpoint = @ExtCtrlAct 2,
17:   SET EastAirLoopOutletNode_setpoint = @ExtCtrlAct 2,
18:   SET WestZoneSupplyFan_FanAirMassFlowRate = @ExtCtrlAct 3,
19:   SET EastZoneSupplyFan_FanAirMassFlowRate = @ExtCtrlAct 4;

@ExtCtrlObs specifies the elements of the state vector in turn (lines 3–8). It is a function with two parameters: the first specifies the index in the state vector (starting from 1) for the value given by the second parameter, and the specified value is stored in an internal buffer. When @ExtCtrlAct is called with 0 as its first parameter (line 9), it sends the state vector, with the length given by its second parameter, to EnergyPlusEnv through one of the named pipes, and then waits until an action vector is sent back from EnergyPlusEnv through the other named pipe. The received action vector is stored in the internal buffer. A non-zero first parameter of @ExtCtrlAct is treated as an index into the internal buffer, and the corresponding element is returned (lines 10–19); the value returned from @ExtCtrlAct is assigned to the actuator on the left side of the assignment statement. As described in the code above, the action vector has four elements: the first and second specify the setpoint temperatures for the West and East Zones, respectively, and the other two set the air flow rates for the West and East Zones. By defining an "EnergyManagementSystem:Sensor" object for a system variable, it becomes accessible from EMS scripts:



EnergyManagementSystem:Sensor,
  WestZoneTemp,               !- Name
  WEST ZONE,                  !- Output:Variable or Output:Meter Index Key Name
  Zone Mean Air Temperature;  !- Output:Variable or Output:Meter Name

Similarly, by defining an "EnergyManagementSystem:Actuator" object for a control variable, it becomes controllable from EMS scripts:

EnergyManagementSystem:Actuator,
  WestZoneDECOutletNode_setpoint,  !- Name
  West Zone DEC Outlet Node,       !- Actuated Component Unique Name
  System Node Setpoint,            !- Actuated Component Type
  Temperature Setpoint;            !- Actuated Component Control Type

We added three new sensors, OutdoorTemp, WestZoneTemp, and EastZoneTemp, in addition to the existing sensors measuring electric demand power: IT_Equip_Power, Whole_Building_Power, and Whole_HVAC_Power. We also defined ten actuators: WestZoneDECOutletNode_setpoint, WestZoneIECOutletNode_setpoint, WestZoneCCoilAirOutletNode_setpoint, WestAirLoopOutletNode_setpoint, EastZoneDECOutletNode_setpoint, EastZoneIECOutletNode_setpoint, EastZoneCCoilAirOutletNode_setpoint, EastAirLoopOutletNode_setpoint, WestZoneSupplyFan_FanAirMassFlowRate, and EastZoneSupplyFan_FanAirMassFlowRate.

3.3 Replacing the Existing Controller with an Agent-Based Controller

To control the temperature setting, we need to do the following two things:
– disable the existing controller in EnergyPlus
– report the current state (observation) in EnergyPlus to the agent system, receive the next actions from the agent, and reflect these actions in the EnergyPlus simulation, as defined in the OpenAI Gym interface.
The temperature in the original model is controlled using the following "ZoneControl:Thermostat" object for each zone:

ZoneControl:Thermostat,
  West Zone Thermostat,             !- Name
  West Zone,                        !- Zone or ZoneList Name
  Zone Control Type Sched,          !- Control Type Schedule Name
  ThermostatSetpoint:DualSetpoint,  !- Control 1 Object Type
  Temperature Setpoints;            !- Control 1 Name

"ZoneControl:Thermostat" can operate with various control types on a predefined schedule. These controls include single heating setpoint, single cooling setpoint, single heating/cooling setpoint, and dual setpoint (heating and cooling) with deadband. In this case, only the dual setpoint is used, with a constant high-limit temperature (23.0 °C) and low-limit temperature (20.0 °C). The target nodes whose setpoint temperatures are controlled are defined by the "SetpointManager:SingleZone:Cooling" object as follows (comments are indicated with exclamation marks):

! SetpointManager:SingleZone:Cooling,
!   System Setpoint manager,  !- Name
!   Temperature,              !- Control Variable
!   -99.0,                    !- Minimum Supply Air Temperature {C}
!   99.0,                     !- Maximum Supply Air Temperature {C}
!   West Zone,                !- Control Zone Name
!   West Zone Air Node,       !- Zone Node Name
!   West Zone Inlets,         !- Zone Inlet Node Name
!   West Sys Setpoint nodes;  !- Setpoint Node or NodeList Name



"West Sys Setpoint nodes" and "East Sys Setpoint nodes" define lists of actual node names. To disable the existing controller, we disabled "SetpointManager:SingleZone:Cooling" in the simulation model by simply commenting it out, as shown above. To control the air flow rate, we simply override the "fan air mass flow rate" of the VAV Fan through actuators.

3.4 How to Hook into the Simulation Loop in EnergyPlus

The EMS provides the "calling point" mechanism, which allows EMS users to invoke a specified EMS procedure at a specified point; such points are called "calling points", and 14 of them are defined in the EMS. We used "AfterPredictorAfterHVACManagers", which activates the associated procedures after the predictor and the traditional HVAC managers are called. By registering the ExtCtrlBasedSetpointManager procedure discussed in Sect. 3.2 with "AfterPredictorAfterHVACManagers", we can control the setpoint temperatures of each zone.

EnergyManagementSystem:ProgramCallingManager,
  ExtCtrl-Based Setpoint Manager,   !- Name
  AfterPredictorAfterHVACManagers,  !- EnergyPlus Model Calling Point
  ExtCtrlBasedSetpointManager;      !- Program Name 1


3.5 Application to Other Simulation Models

In the previous sections, we used the two-zone data-center model "2ZoneDataCenterHVAC_wEconomizer.idf" to explain our changes with respect to the original file. If you want to use a different data-center model, the following steps are necessary:
1. Select the building model (IDF file) you want to control using a DRL-based agent.
2. Determine which setpoint (control) you want to replace by using the DRL agent, and inhibit it in the IDF file, as described in Sect. 3.3.
3. Determine the list of required sensors (observations) and actuators (actions), and add them by defining "EnergyManagementSystem:Sensor" and "EnergyManagementSystem:Actuator" objects in the IDF file (Sect. 3.2).
4. Write an Erl program in the IDF file to exchange actions and observations with the EnergyPlusEnv wrapper (Sect. 3.2).
5. Hook the Erl program into the EnergyPlus simulation loop by defining the "EnergyManagementSystem:ProgramCallingManager" object in the IDF file (Sect. 3.4).
You also need to develop a model-dependent part of the simulation wrapper for your simulation model. In our case, it is implemented in a Python file that provides the following methods:
– EnergyPlusModel.setup_spaces(): define the action and observation spaces



– EnergyPlusModel.compute_reward(): compute the reward value from the current states and actions
– episode(), EnergyPlusModel.plot_episode(): read the results from the CSV file generated by EnergyPlus and plot them on the screen, for visualization purposes.
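A model-dependent part for the two-zone model might look as follows. This is a sketch: the method names come from the text above, but the class name, the bounds, and the reward shaping (a normalized power term plus a penalty for leaving the 20.0–23.0 °C reference band) are illustrative assumptions rather than the authors' actual implementation.

```python
class TwoZoneDataCenterModel:
    TEMP_LOW, TEMP_HIGH = 20.0, 23.0   # reference zone-temperature band (degC)
    POWER_SCALE = 1.0e5                # assumed normalization constant (W)

    def setup_spaces(self):
        """Action: [west setpoint, east setpoint, west flow, east flow].
        The numeric bounds here are placeholders."""
        self.action_low = [10.0, 10.0, 1.75, 1.75]
        self.action_high = [40.0, 40.0, 7.0, 7.0]

    def compute_reward(self, obs):
        """obs = [outdoor_temp, west_temp, east_temp,
                  total_power, it_power, hvac_power] (degC, W)."""
        _, west, east, total_power, _, _ = obs
        penalty = 0.0
        for t in (west, east):           # distance outside the reference band
            if t < self.TEMP_LOW:
                penalty += self.TEMP_LOW - t
            elif t > self.TEMP_HIGH:
                penalty += t - self.TEMP_HIGH
        return -total_power / self.POWER_SCALE - penalty

m = TwoZoneDataCenterModel()
m.setup_spaces()
reward_in_band = m.compute_reward([25.0, 21.0, 22.0, 1.0e5, 6.0e4, 4.0e4])
reward_too_hot = m.compute_reward([25.0, 24.0, 22.0, 1.0e5, 6.0e4, 4.0e4])
```

Keeping the reward a simple sum of a power term and a comfort penalty makes it easy to trade the two objectives off by rescaling either term.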

4 Reference Reinforcement Learning Implementation

4.1 Model-Free Reinforcement Learning

We consider the data-center control problem as an infinite-horizon discounted Markov decision process (MDP). Such MDPs are characterized by:
– S, a set of states (e.g., all possible temperatures inside and outside a building),
– A, a set of possible actions to control the system (e.g., cooling commands),
– P : S × A × S → [0, 1], the state-action-state transition probability distribution (e.g., to go from a given temperature to another with a given command),
– R : S × A × S → R, a reward function corresponding to such transitions,
– ρ0 : S → [0, 1], the initial state probability distribution,
– γ ∈ [0, 1), a discount factor penalizing future reward expectations.
We are then interested in computing a stochastic policy π : S × A → [0, 1] that maximizes the discounted expected return η(π):

    η(π) = E_τ [ Σ_{i=0}^∞ γ^i R(s_i, a_i, s_{i+1}) ],    (1)

with τ = (s_0, a_0, s_1, a_1, ...) a sequence of states and actions, where the initial state s_0 is initialized following ρ0, each action a_i is sampled following π(·|s_i), the control policy given the current state s_i, leading to a new state s_{i+1} following the transition function P(·|s_i, a_i). Given Eq. (1), we seek to maximize a reward expectation over multiple steps rather than a direct one-step reward (which we could approach by, e.g., greedy search). Thus, reinforcement learning approaches have to deal with major challenges, including the fact that a reward at a given step is not necessarily associated with a single action but possibly a sequence thereof (the credit assignment problem) and the fact that the agent has to balance between exploiting known rewards and exploring new strategies that can lead to higher but also to lower rewards (the exploitation-exploration dilemma). This is particularly true in the case of data-center optimization, since heat distribution occurs over time (e.g., a given command may not impact sensor readings instantaneously) and can depend on non-controllable parameters (e.g., weather conditions). DRL approaches have recently provided considerable results on such problems, where the relationship between states and optimal actions is difficult to model formally, e.g., playing Atari video games directly from pixels using convolutional neural networks (CNNs) rather than handcrafted features [7], or learning complex robotic tasks in both simulated [3] and real environments [5]. In DRL, control policies πθ are typically represented by deep neural networks parameterized by a vector of parameters θ (e.g., neuron weights and biases). In our study, we used the TRPO algorithm [11], in which a multilayer perceptron (MLP) predicts as outputs the mean μ and standard deviation σ of a Gaussian distribution, taking directly as input a state vector s_i. Given a policy π, we denote the state-action value function as Q_π, the value function as V_π, and the advantage function as A_π:

    Q_π(s_i, a_i) = E_{s_{i+1}, a_{i+1}, ...} [ Σ_{l=0}^∞ γ^l R(s_{i+l}, a_{i+l}, s_{i+l+1}) ],    (2)

    V_π(s_i) = E_{a_i, s_{i+1}, ...} [ Σ_{l=0}^∞ γ^l R(s_{i+l}, a_{i+l}, s_{i+l+1}) ],    (3)

    A_π(s, a) = Q_π(s, a) − V_π(s).    (4)


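To make these discounted-return quantities concrete, here is a minimal Monte-Carlo sketch (our illustration, not the authors' code): the discounted sum of rewards along one sampled trajectory, and the advantage as the difference between a state-action return estimate and a state value estimate.

```python
# Minimal sketch (not the paper's code): Monte-Carlo view of the
# discounted return behind Q_pi and V_pi, and A_pi = Q_pi - V_pi.
def discounted_return(rewards, gamma=0.99):
    """Sum_l gamma^l * r_l for one sampled trajectory of rewards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

def advantage(q_estimate, v_estimate):
    """A_pi(s, a) = Q_pi(s, a) - V_pi(s)."""
    return q_estimate - v_estimate

rewards = [1.0, 0.0, 2.0]
print(discounted_return(rewards, gamma=0.5))  # 1*1 + 0*0.5 + 2*0.25 = 1.5
print(advantage(1.5, 1.0))                    # 0.5
```

In practice both expectations are estimated from many sampled trajectories; the single-trajectory sums above are the terms being averaged.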
The parameters θ of the neural network policy πθ are then iteratively refined by collecting state-action-reward trajectories in the environment and solving the following problem:

θ_{k+1} = argmax_θ E_{s∼ρ_{θk}, a∼π_{θk}} [ (πθ(a|s) / πθk(a|s)) Aπθk(s, a) ],  (5)

with additional constraints on parameter variation to facilitate convergence [11]. In particular, we use the OpenAI Baselines open-source implementation of the TRPO algorithm [2]. In our study, we trained DRL controllers in a purely model-free fashion, i.e., no specific knowledge of the data-center operation was pre-encoded in the neural networks or training process; instead, only abstract state and action vectors were used as inputs and outputs. Note that models can be used to accelerate training when available [9], and the quality of the resulting policy then ultimately depends on the quality of the models. Domain adaptation and transfer learning [4,13] constitute important steps towards the application of such model-based DRL techniques. Finally, while DRL controllers can be designed to solve specific problems in isolation, they can also be integrated as parts of larger systems, e.g., along with classical planning and scheduling modules [8], which is how we ultimately envision their common application in the future.

4.2 Reinforcement Learning for Power-Consumption Optimization

We now consider the optimization of data-center power consumption as a reinforcement learning problem. To do so, we constructed a dedicated DRL environment by detailing the definition and computation of the following three items for neural network training: state, action and reward.


T. Moriyama et al.

State. The state vector contains the following:

– Outdoor air temperature between −20 ◦C and 50 ◦C
– West Zone air temperature between −20 ◦C and 50 ◦C
– East Zone air temperature between −20 ◦C and 50 ◦C
– Total electric demand power between 0 W and 1 GW
– Non-HVAC electric demand power between 0 W and 1 GW
– HVAC electric demand power between 0 W and 1 GW.

These ranges are defined as the observation space when a DRL environment is created and passed to the agent.

Action. We consider the power-consumption optimization problem as a continuous control task, i.e., the commands we are interested in can be controlled in continuous domains (e.g., set a target temperature to between 20 ◦C and 30 ◦C) rather than discrete ones (e.g., flip a switch on or off, set a target temperature to exactly 20 ◦C, 21 ◦C, etc.). The action vector contains the following:

– West Zone setpoint temperature between 10 ◦C and 40 ◦C
– East Zone setpoint temperature between 10 ◦C and 40 ◦C
– West Zone supply fan air mass flow rate between 1.75 kg/s and 7.0 kg/s
– East Zone supply fan air mass flow rate between 1.75 kg/s and 7.0 kg/s.

Note that these values are normalized to the range [−1.0, 1.0] when they are passed from the DRL agent. Therefore, they must be mapped to actual temperatures by the simulation wrapper described in Sect. 3.1 before being sent to the EnergyPlus process. Given a state vector si computed following Sect. 4.2, an action vector ai is sampled from the policy network: ai ∼ πθ(si). The action ai is then applied by setting each temperature to the corresponding value and running the simulation for N steps. This allows the computation of an updated state vector si+1. We also define a reward signal ri for the purpose of DRL.

Reward. We maximize a reward function that consists of a temperature-violation penalty for each zone and the total electrical energy cost. The reward function for each simulation timestep is divided into two components depending on temperature, rT, and power consumption, rP:

rt = rT + λP rP,  (6)

where λP is the power-consumption weight and

rT = Σ_{i=1}^{z} exp(−λ1 (Tt^i − TC^i)^2) − λ2 Σ_{i=1}^{z} ([Tt^i − TU^i]_+ + [TL^i − Tt^i]_+),  (7)

rP = −Pt,  (8)

where z is the number of data-center zones, Tt^i is the temperature of zone i at time t, TU^i is the desired upper bound temperature for zone i, TC^i is the desired



average temperature for zone i, TL^i is the desired lower bound temperature for zone i, Pt is the total power consumption at time t, and λ1 and λ2 are the weights that shape the rT of Fig. 2. The rT consists of a Gaussian-shaped part and a trapezoid penalty part. The former gives a maximum reward of 1.0 at the temperature center TC^i, and the reward decreases quickly toward zero as the difference from the center temperature increases. This makes it difficult to recover to the temperature center once the temperature difference becomes large. Thus, we added the trapezoid penalty part, which degrades gradually toward −∞. The values of the reward parameters are chosen, as discussed in Sect. 5, depending on criteria such as desired temperatures for human workers and the priority given to energy minimization.
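The reward defined above can be transcribed directly; the sketch below is our illustration (function and variable names are ours), using the parameter values reported later in Sect. 5.

```python
import math

# Sketch of the per-timestep reward r_t = r_T + lambda_P * r_P described
# in the text, with parameter values from Sect. 5 (assumed here).
T_U, T_C, T_L = 24.0, 23.5, 23.0          # upper / center / lower temps (deg C)
LAM_P, LAM_1, LAM_2 = 1.0 / 100000, 0.5, 0.1

def reward(zone_temps, total_power_w):
    r_T = 0.0
    for T in zone_temps:
        r_T += math.exp(-LAM_1 * (T - T_C) ** 2)                 # Gaussian part
        r_T -= LAM_2 * (max(T - T_U, 0.0) + max(T_L - T, 0.0))   # trapezoid part
    r_P = -total_power_w
    return r_T + LAM_P * r_P

# Both zones exactly at the center temperature, 100 kW total power:
print(reward([23.5, 23.5], 100000.0))  # 2*1.0 - 1.0 = 1.0
```

At the center temperature each zone contributes its maximum of 1.0, and the power term subtracts λP·Pt, matching the shape described above.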



We discuss the effectiveness of our DRL controller through simulations in EnergyPlus.

5.1

Building Model. We used the simulation model “2ZoneDataCenterHVAC wEconomizer.idf” contained in the standard EnergyPlus 8.8.0 distribution. It is a data-center model with two thermal zones: West Zone, of dimension 232.26 m2, and East Zone, of dimension 259.08 m2. The HVAC components include an air economizer, DEC, IEC, Single Speed DX CC, CW CC, and VAV Fan with a No Reheat Air Terminal Unit. Zone temperatures are controlled by (i) a setpoint manager to control the setpoint temperature of the HVAC components, and (ii) a thermostat controller set to 23.0 ◦C as the cooling setpoint and 20.0 ◦C as the heating setpoint. We made the following modifications to the simulation model, in addition to the change in connection to the DRL-based agent described in Sect. 3:

– Extend the simulation period from 14 days to 365 days.
– Change the reference from “West Zone Inlets” to “East Zone Inlets” in the definition of “East System Setpoint manager”.

When we extended the simulation period, we needed to relax the convergence tolerance in the CalcIndirectResearchSpecialEvapCoolerAdvanced module; otherwise, we would experience severe errors in the EnergyPlus simulation. We changed the convergence tolerance (TempTol) from 0.01 to 0.02 and the maximum iteration number (MaxIte) from 500 to 1000.

Weather Data. The weather information of the data center was simulated using historical records for a year, which are bundled with EnergyPlus 8.8.0 as example weather data. There are five weather data files from various locations in the USA, collected and published by the World Meteorological Organization, as follows (with average temperatures):


– CA: San Francisco Int’l Airport, CA, USA (13.8 ◦C)
– CO: National Renewable Energy Laboratory at Golden, CO, USA (9.7 ◦C)
– FL: Tampa Int’l Airport, FL, USA (22.3 ◦C)
– IL: Chicago-O’Hare Int’l Airport, IL, USA (9.9 ◦C)
– VA: Washington Dulles Int’l Airport, VA, USA (12.6 ◦C).

Controllers. We compared the simulation results of three different controllers:

– Baseline: the model-based controller built into EnergyPlus.
– TRPO (CA): a TRPO-based controller trained on the CA weather data. We chose CA because it has the most moderate average temperature among the five weather data files. After convergence of the reward value was observed, we alternated between all five data files and evaluated how temperature and power were controlled.
– TRPO (CA-CO-FL): a TRPO-based controller trained on the CA, CO, and FL weather data, which we chose from the five weather data files because they have the most moderate, coldest, and hottest average temperatures. These weather data files were switched at every epoch of the simulation process, which is one year long. We then alternated between all five data files.

Hyperparameters for TRPO. The policy is an MLP containing two fully connected layers of size 32 with tanh non-linearities as the activation function. The hyperparameters for TRPO were set as follows: max KL = 0.01; timesteps per batch = 16k; cg iters = 10; cg damping = 0.1; gamma = 0.99; vf iters = 5; vf stepsize = 1e−3; lam = 0.98. These are the default parameters for TRPO except for timesteps per batch, which was increased from 1k to 16k for stability of the learning process.

Timesteps. The timestep parameter for the EnergyPlus simulation was set to four, as defined in the building model, which means 15 min per timestep. Note that there is another type of timestep used internally in the EnergyPlus simulation process, called “System timesteps”. Its length varies dynamically from 1 min/timestep up to the zone timestep (15 min/timestep in this case) to balance simulation precision and speed. EnergyPlus and the TRPO-based agent communicate with each other at the system timestep frequency.

Reward-Function Parameters.
We used the following parameters for the reward function (6): TU^i = 24.0 ◦C, TC^i = 23.5 ◦C, TL^i = 23.0 ◦C, λP = 1/100000, λ1 = 0.5, and λ2 = 0.1.
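For reference, the experimental settings listed in this section can be collected in one place; the sketch below is ours (variable names are ours; values come from the text), including the step-count arithmetic implied by the 15-minute zone timestep.

```python
# Configuration sketch (ours): TRPO hyperparameters and reward-function
# parameters as reported in the text.
trpo_hyperparams = dict(
    max_kl=0.01,
    timesteps_per_batch=16 * 1024,  # increased from 1k to 16k for stability
    cg_iters=10,
    cg_damping=0.1,
    gamma=0.99,
    vf_iters=5,
    vf_stepsize=1e-3,
    lam=0.98,
)
reward_params = dict(
    T_U=24.0, T_C=23.5, T_L=23.0,        # deg C
    lambda_P=1.0 / 100000, lambda_1=0.5, lambda_2=0.1,
)

# One zone timestep is 15 minutes, i.e. 4 timesteps per hour, so a
# one-year simulation epoch covers:
steps_per_year = 365 * 24 * 4
print(steps_per_year)  # 35040
```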




Fig. 2. Reward function for temperature rT

Fig. 3. Convergence process of temperature control for TRPO (CA-CO-FL)

Simulation Results

Figure 3 shows the convergence of the average reward of the TRPO algorithm throughout the learning process for training on the CA-CO-FL dataset. At the beginning, the reward value was low because of the temperature and power penalties. The average reward increased as the simulation proceeded, and after 200,000 timesteps of simulation the value was almost stabilized. We observed some fluttering even after 300,000 steps. This is because we were alternating between three different weather data files: simulation with the weather data of a hotter city required more electrical power for cooling, so the reward decreased even though temperature was well controlled. Table 1 shows the average power consumption with the Baseline algorithm, TRPO trained on the CA weather data, and TRPO trained on the CA-CO-FL weather data. On average, the TRPO-based controller achieved a 22% reduction of total electrical demand power.

Table 1. Average power consumption (kW) when TRPO was trained on CA or CA-CO-FL and then evaluated with the five weather data files (percentages are relative to Baseline)

Method            CA     CO     FL     IL     VA     Average (kW)
Baseline          117.7  122.7  140.4  124.2  125.9  126.2
TRPO (CA)         97.9   99.4   113.9  100.1  100.3  102.3 (−18.9%)
TRPO (CA-CO-FL)   98.2   94.2   108.3  94.6   96.1   98.3 (−22.1%)

Figure 4 is a comparison of different controllers regarding zone temperatures. On the left, the vertical thin lines show maximum and minimum temperatures in a simulation epoch (simulation for 365 days) of each zone. Boxes on the lines show average temperature ± standard deviation.




For Baseline, zone temperatures were all distributed within the range [22.9 ◦C, 24.8 ◦C], while the zone thermostat setpoints were set to 23.0 ◦C for cooling and 20.0 ◦C for heating. The average standard deviation of zone temperatures for Baseline was 0.40 ◦C. For TRPO (CA), 300 episodes were executed with the training data, then the results were collected for each of the five test data files. Note that the neural networks were updated while these results were collected. While the standard deviations for the CA and FL data were as small as those of Baseline (0.50 ◦C on average), those for CO, IL, and VA were quite large (1.91 ◦C on average). In all such cases, the minimum-maximum temperature range was also large. Just 88.2% of the temperature results fell into the range [22.0 ◦C, 25.0 ◦C], and the average standard deviation of these cases was 1.34 ◦C. For TRPO (CA-CO-FL), because we used more training data than for TRPO (CA), the results for the IL and VA data were as good as those for the training data. The average standard deviation was 0.28 ◦C, the minimum-maximum temperature ranges were as small as those for the training data, and 99.6% of the temperature results fell into the range [22.0 ◦C, 25.0 ◦C]. The figure on the right shows the distribution of West Zone temperatures for the VA data with TRPO (CA-CO-FL).







Fig. 4. Left: comparison of controllers regarding zone temperatures. Vertical lines show minimum and maximum temperatures of each zone. Boxes show average temperature ± standard deviation. Right: distribution of West Zone temperatures for VA dataset with TRPO (CA-CO-FL)



Recent advances in artificial intelligence aim to make “intelligent” systems capable of imitating or improving human decision-making based on human expertise. We showed how recent advances in reinforcement learning can be applied to real-world scenarios such as controlling the cooling system of a data center for power-consumption optimization. Using reinforcement learning techniques, the system is able to reduce power consumption by 22% compared with the built-in controller. We



believe that by releasing the code of our simulation environment, researchers can use our results as a baseline and further improve cooling-system performance by using advancements in reinforcement learning. In the near future, we plan to use our DRL controller to optimize the energy consumption of an actual IBM data center.

References

1. Crawley, D.B., et al.: EnergyPlus: creating a new-generation building energy simulation program. Energy Build. 33(4), 319–331 (2001). 10.1016/S0378-7788(00)00114-6. Special Issue: BUILDING SIMULATION’99
2. Dhariwal, P., et al.: OpenAI Baselines (2017). baselines
3. Heess, N., et al.: Emergence of locomotion behaviours in rich environments. arXiv (2017)
4. Inoue, T., Chaudhury, S., De Magistris, G., Dasgupta, S.: Transfer learning from synthetic to real images using variational autoencoders for precise position detection. In: IEEE International Conference on Image Processing (2018)
5. Inoue, T., De Magistris, G., Munawar, A., Yokoya, T., Tachibana, R.: Deep reinforcement learning for high precision assembly tasks. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2017)
6. Li, Y., Wen, Y., Guan, K., Tao, D.: Transforming cooling optimization for green data center via deep reinforcement learning. arXiv:1709.05077 (2017)
7. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518, 529 (2015)
8. Munawar, A., et al.: MaestROB: a robotics framework for integrated orchestration of low-level control and high-level reasoning. In: IEEE International Conference on Robotics and Automation. IEEE (2018)
9. Pham, T.H., De Magistris, G., Tachibana, R.: OptLayer - practical constrained optimization for deep reinforcement learning in the real world. In: IEEE International Conference on Robotics and Automation (2018)
10. Report: data centre energy efficiency benchmarking - E2 Singapore (2015). http:// %20Benchmarking%20Summary-%20Final%20Report%20(3).pdf
11. Schulman, J., Levine, S., Abbeel, P., Jordan, M., Moritz, P.: Trust region policy optimization. In: Proceedings of the International Conference on Machine Learning (2015)
12. Sun, J., Reddy, A.: Optimal control of building HVAC&R systems using complete simulation-based sequential quadratic programming (CSB-SQP). Build. Environ. 40(5), 657–669 (2005)
13. Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2017)
14. Wei, T., Wang, Y., Zhu, Q.: Deep reinforcement learning for building HVAC control. In: 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC), pp. 1–6, June 2017

A DEVS Visual Front-End Interface for Model Reusability and Maintainability

Jiyong Yang1, Moongi Seok2, San Jeong3, and Changbeom Choi3

1 Department of Information and Communication Engineering, Handong Global University, Pohang, Gyeongbuk 37554, Republic of Korea
[email protected]
2 School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, USA
[email protected]
3 Global Entrepreneurship and ICT, Handong Global University, Pohang, Gyeongbuk 37554, Republic of Korea
{214000684,cbchoi}

Abstract. Modeling is important in developing simulations. Because the modeling process, which establishes the relationships between models, determines the overall simulation flow and the behavior of the models, it requires a formulation of the overall structure and a clear expression of the relationships. In the case of DEVS, a discrete event simulation specification technique, modeling is performed by combining atomic models and coupled models. This process is performed in an abstract form and is expressed in a programming language. Therefore, it takes a lot of time and effort to obtain a clear specification and understanding in the DEVS modeling and interpretation processes, and it is not intuitive. In this paper, we introduce a DEVS-based visual front-end interface that can reduce the cost of the modeling process to address these difficulties. The DEVS visual front-end interface provides an environment for visually checking and modifying the modeling structure, and also provides skeleton code based on the modeling structure data. In addition, a program for maintaining consistency between the source code and the modeling information data is also presented in this study.

Keywords: DEVS formalism · Visual front-end interface · Model reusability and maintainability · Skeleton code generation · Code consistency maintenance · Model Synchronizer

1 Introduction

1.1


© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 60–71, 2018.

A simulation represents a physical or abstract system as a model and expresses the relationships between the models. In constructing such a simulation, it is important to define the models and to express the relationships between them. Therefore, it is necessary to confirm whether the modeling process is designed with the intended structure. However, this is not easy to check because the models are



expressed in a programming language. Since the construction of a simulation is based on a specific formalism, it is necessary to understand the formalism and its expression technique. Also, based on an understanding of the formalism, we need to confirm the expression, relationships, and overall structure of the models. One of the most common simulation formalisms is the DEVS formalism. In the case of the DEVS formalism, a discrete event simulation specification, models are generated as a combination of atomic models and coupled models, and simulate the target system by expressing their relationships. A more detailed introduction to the DEVS formalism is given in Sect. 1.3. The modeling process in the DEVS formalism, which is done conceptually and is then expressed in a programming language, needs to be checked to understand the entire structure. In this process, we must verify the relationships between models and the structural design. Verifying the modeling lines in the source code file becomes time-consuming as the size of the simulation increases. To solve these problems, various methods have been studied, one of which is the development of visualization tools. A simulation visualization tool enhances the understanding of the simulation and improves its accessibility. In addition, existing design models can be stored as base models and used in later modeling processes to enhance reusability. For this reason, several visualization tools have been studied as the necessity of simulation visualization has arisen. An existing simulation visualization tool based on the DEVS formalism is CD++, which runs on several different platforms [1]. CD++ is a toolkit for discrete event modeling and simulation based on the DEVS and Cell-DEVS formalisms. CD++ can describe the specification of and relationships in the DEVS model.
However, CD++ is only a tool for visualization and graphical specification of the model; it has no relationship with the actual simulation source code, and it does not guarantee the reusability of previously designed models. In other words, it only expresses the conceptual modeling process visually through the program. In this paper, we focus on the development of a visualization tool that overcomes the limitations of previous visualization tools. The purpose of this research is to develop a visual front-end interface that can redefine and reuse the model structure based on the structural characteristics of DEVS, with its hierarchical relationships. It also aims to provide a better user experience and verification of the source code by generating source code based on modeling information, beyond merely providing a visualization function. The expectation of this study is shown in Fig. 1. Simulation modeling, which requires a high cost each time with the existing method, can be made easier through the tools developed in this study, and cost can be reduced by reusing models. Also, the simulation source code is automatically generated based on the DEVS structural information, so the error rate of the modeling code can be significantly reduced. In addition, the goal is to develop a tool that keeps the modeling structure consistent between the generated code and the DEVS structure information.

1.2 Related Works

Research on visualization tools based on the DEVS formalism includes “CD++: a toolkit to develop DEVS models” [1] and “Tools for Graphical Specification and Visualization of DEVS Models” [2] by Wainer, articles introducing CD++. The study


J. Yang et al.

Fig. 1. Conventional modeling method and a method using a visualization tool and code generator

is based on the DEVS formalism for CD++, a toolkit for modeling and simulation, which simplifies model design, shortens simulation development time, and provides a safe and cost-effective simulation design. In designing a complex simulation, defining the models and establishing their relationships through CD++ can reduce this time-consuming process. It also allows easy modeling by ordinary users who are not familiar with simulation. Kim’s “DEVS_Suite: A Simulator Supporting Visual Experimentation Design and Behavior Monitoring” shows that the complexity of simulation model experiments can be reduced by visualizing the simulation environment [3]. This paper shows that it is possible to design simulation experiments by visualizing the data generated through the selected model. Both studies show that visualization of the simulation model helps the user to design and test the simulation, thereby reducing the cost and time of the simulation. Both studies, however, do not show any relationship with the actual simulation code. The purpose of this research is to make simulation design easier for the user by describing the relationship between the actual simulation source code and the DEVS-based simulation visualization.

1.3 DEVS Formalism

DEVS is an abbreviation of Discrete Event System Specification. It describes the dynamic change of a system in terms of states, where the state changes according to the occurrence of discrete events. DEVS is a formalism proposed by B. P. Zeigler, which expresses a whole system by dividing complex system elements into module units, creating models for them, and combining them [5]. The DEVS formalism has an atomic model that represents the minimum unit component of a system and a coupled model that consists of a combination of several models. Through the combination of the two models, a system can be represented hierarchically and modularly. The atomic model is the most basic module constituting the simulation in the DEVS formalism and describes the basic unit behavior of the system. The atomic model consists of three sets and four functions, and its mathematical expression is as follows.



AM = ⟨X, Y, S, δint, δext, λ, ta⟩

X : discrete event input set
Y : discrete event output set
S : set of discrete event states
δint : S → S : internal state transition function
δext : Q × X → S : external state transition function, where Q = {(s, e) | s ∈ S, 0 ≤ e ≤ ta(s)} is the total state set of AM
λ : S → Y : output function
ta : S → R+0,∞ : time advance function

The external state transition function expresses the state change when an input is received from outside under certain circumstances. The internal state transition function changes the state of the model itself without external input. When the internal state transition occurs, the output function is called. The coupled model is a model consisting of a combination of atomic models and coupled models, and it includes the links between the component models and the input and output ports. The coupled model is also composed of three sets and four functions, and its mathematical expression is as follows.

CM = ⟨X, Y, {Mi}, EIC, EOC, IC, SELECT⟩

X : discrete event input set
Y : discrete event output set
{Mi} : set of all discrete event component models
EIC : external input coupling relationship
EOC : external output coupling relationship
IC : internal coupling relationship
SELECT : 2^{Mi} − ∅ → Mi : selection function for models that generate events at the same time

EIC (External Input Coupling) is the relationship between the external input and the inputs of the internal models. EOC (External Output Coupling) is the relationship between the outputs of the internal models and the external output. IC (Internal Coupling) is the relationship between the outputs and inputs of the internal models. In this paper, we present the input and output sets of the model in the atomic model, and in the coupled model we represent the sets of inputs, outputs, and subcomponent models, together with the EIC, EOC, and IC relationships between them.
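As an illustration of the atomic-model tuple ⟨X, Y, S, δint, δext, λ, ta⟩, here is a minimal Python sketch (our example, not from the paper) of a processor that serves one job at a time:

```python
# Sketch (ours) of a DEVS atomic model: a processor holding a job for a
# fixed service time. State set S = {"idle", "busy"}.
INFINITY = float("inf")

class Processor:
    def __init__(self, service_time=2.0):
        self.service_time = service_time
        self.state = "idle"
        self.job = None

    def delta_ext(self, x):
        """External transition: a job arrives on the input port.
        (Jobs arriving while busy are simply dropped in this sketch.)"""
        if self.state == "idle":
            self.state, self.job = "busy", x

    def delta_int(self):
        """Internal transition: service finished, return to idle."""
        self.state, self.job = "idle", None

    def output(self):
        """Output function, called just before the internal transition."""
        return self.job

    def ta(self):
        """Time advance: lifetime of the current state."""
        return self.service_time if self.state == "busy" else INFINITY

p = Processor()
p.delta_ext("job-1")
print(p.ta())      # 2.0
print(p.output())  # job-1
p.delta_int()
print(p.state)     # idle
```

The passive "idle" state has an infinite time advance, so only an external event can move the model out of it, matching the role of δext described above.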

2 DEVS Visual Front-End Interface

This chapter introduces the development methodology and rules of the DEVS-based simulation visualization tool, and of a program that maintains consistency between the simulation source code and the modeling information structure. The DEVS visualization tool and the source code generation and consistency maintenance program developed in this study are used as tools for the simulation



configuration by the user, so several requirements were set in advance and development was based on them. The requirements for developing the tools are as follows (Table 1).

Table 1. Requirements for developing the visualizer, skeleton code generator, and code consistency maintainer

Requirement 1: Tools that allow the DEVS model structure to be checked visually
Requirement 2: Visualization of sub-models and coupling relations for each level of the hierarchy (other levels are checked in the treeview)
Requirement 3: Generation of skeleton code based on DEVS modeling information
Requirement 4: When the model structure is modified in the visualizer, the source code is updated according to the modified contents
Requirement 5: When the modeling code is modified in the source code, the model structure information files are updated according to the modified contents

The overall behavior of the programs developed from these requirements is shown in Fig. 2.

Fig. 2. Overall behavior of the DEVS visual front-end interface

The user first performs DEVS modeling in a predefined manner, which is specified in XML. The XML file is read by the DEVS visualizer, and then an additional modeling process is performed. In this process, the user can add designed models to the base model and import pre-designed models. This design can reduce the cost required for the simulation configuration by ensuring model reuse for the most time-consuming part of constructing a simulation, the modeling process. After the design, the skeleton code generator generates the simulation source code. This significantly reduces the error rate of the model I/O and relational code that can occur while writing the actual simulation source code. Then, when the model structure is modified in the simulation source code or the modeling information is modified in the DEVS visualizer, the code consistency maintainer synchronizes the source code and the modeling information.

A DEVS Visual Front-End Interface for Model Reusability and Maintainability



2.1 DEVS Hierarchy Information Structure

In order to perform visualization based on the modeling information constructed through the DEVS formalism, information about the DEVS models and their connections must be stored and managed. This study describes the DEVS model information and connection relations in XML. The XML is divided into an ‘OutmostModel’ tag that describes the actual system combination and a ‘ModelPool’ tag that describes the basic unit models that make up the simulation. A unit model under the ‘ModelPool’ tag can describe not only an atomic model of the DEVS formalism but also a coupled model. In this case, the coupled model plays the role of a unit model and serves as an independent unit that cannot be separated. Because the DEVS formalism has a hierarchical structure, a model may or may not be composed of several sub-models, which is expressed in XML as a ‘Submodel’ tag added to the ‘Model’ tag. The structure of the XML is as follows (Fig. 3).

Fig. 3. The structure of DEVS_Structure and DEVS_BaseModels
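A hypothetical instance of the layout just described — an ‘OutmostModel’ whose models reference their definitions in the ‘ModelPool’ — can be built and resolved with Python's standard ElementTree. The tag names follow the text; the model names, attributes, and exact nesting are our assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch of the described XML layout (tag names from the
# text; "EF"/"Proc" and attribute details are our assumptions).
xml_text = """
<DEVS_Structure>
  <OutmostModel type="coupled">
    <Model name="EF" type="atomic"><Type>EF</Type></Model>
    <Model name="Proc" type="atomic"><Type>Proc</Type></Model>
  </OutmostModel>
  <ModelPool>
    <Model name="EF" type="atomic"><Port>in</Port><Port>out</Port></Model>
    <Model name="Proc" type="atomic"><Port>in</Port><Port>out</Port></Model>
  </ModelPool>
</DEVS_Structure>
"""

root = ET.fromstring(xml_text)
# As described in the text: the OutmostModel holds only combination and
# connection info; the actual model structure is looked up in ModelPool.
pool = {m.get("name"): m for m in root.find("ModelPool")}
for m in root.find("OutmostModel"):
    ports = [p.text for p in pool[m.get("name")].findall("Port")]
    print(m.get("name"), m.get("type"), ports)
```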

The ‘OutmostModel’ tag serves as a coupled model for expressing the relationships of the simulation model at the uppermost level, so its type is forced to ‘coupled’. Each model has ‘Type’, ‘Port’, ‘SubModel’, and ‘Couplings’ tags as sub-elements. If the type represents an atomic model, only the ‘Port’ tag exists. In the ‘Model’ element, there are a sub-element ‘Type’ tag and an XML attribute ‘type’. The ‘type’ attribute represents the role of the atomic model or the coupled model. The sub ‘Type’ tag refers to the model structure of the model in the ‘ModelPool’. Therefore, each model that exists under the ‘OutmostModel’ tag represents only the combination of and connections between the model and its sub-models, and the actual structure of the model is specified in the ‘ModelPool’. When checking the information of a model, one can confirm the structure and the connection information between models in the ‘OutmostModel’ and confirm the actual information about the model by reading it from the ‘ModelPool’. The reason for expressing the DEVS structure by



‘OutmostModel’ and ‘ModelPool’ tags is to achieve efficiency in the structural expression and to increase reusability. Besides, we use the concept of a Base Model for reusability. Models that are frequently reused in the model structure are described in the base model so that the visualization tool can use them to edit the model structure easily. This study aims at designing a tool that can identify the model structure and even act as an editor to modify the relationships between the models. Therefore, after reading the actual DEVS model data, the tool can visualize the structure, and the relationships between the models can be modified through user input. The Base Model information is managed separately from the DEVS structure, but its basic XML structure is described in the same way as the DEVS structure.

2.2 DEVS Visualization

The visualization of the DEVS structure is accomplished by reading the two XML files described above: the DEVS structure and the Base Model. The visualizer draws on the panel based on the modeling information read through the ‘XML Parsing Module’. The objects that make up the panel are divided into instances and connections, and an instance internally holds port information. The visualization tool was implemented in C#, and the DEVS visualization module was developed using the Nevron Diagram library. Since the DEVS visualization tool visualizes a hierarchical formalism, it shows only the hierarchy of a certain level, and the hierarchical relationships are provided through treeviews. One of the provided treeviews shows the hierarchical relationships of the models and visualizes the connections between the submodels corresponding to each node, along with their ports and input/output ports. In the other treeview, the user can see the models in the base model, and when the user clicks a node, the corresponding model is added to the level that is visible on the screen. The modeling information can be modified by newly defining the relationships between the added model and the existing models. An instance that constitutes a DEVS relationship can itself be deleted or added, and its ports can be modified, added, or deleted. A connection representing a relation between models can likewise be added and deleted, and existing connection relationships can be changed through user input. The modified and deleted information is updated in the XML file, which is used to synchronize the source code and the XML to the same modeling structure through the code consistency maintenance program.

2.3 DEVS Skeleton Code Generation and Code Consistency Maintenance

The developed tool does not merely perform visualization: it also generates DEVS source code and maintains consistency by synchronizing the generated code with the XML that contains the DEVS structural information. The entire process of skeleton code generation and code consistency maintenance is shown in Fig. 4. The DEVS Model Synchronizer and Code Generator run separately from the visualizer. The Skeleton Code Generator is run once, when no source code exists yet, while the DEVS Model Synchronizer continues to be used whenever the modeling data and the source code need to be synchronized. The program that

A DEVS Visual Front-End Interface for Model Reusability and Maintainability


Fig. 4. The process of skeleton code generation and code synchronizer

maintains consistency between the simulation source code and the XML file is based on Python and uses the open-source ‘CodeGen’ library [4]. Skeleton code generation is based on the information in the DEVS structure XML file, and the actual code is generated as C++ according to a set of rules. First, the modeling data is read through the DEVS modeling parser, and then the skeleton code is generated; this process is shown in Fig. 4. Each model name is mapped to a class name and inherits an ‘atomic’ or ‘coupled’ class depending on the type of model. Within the source code, the part holding the DEVS structural information, called the Modeling Region, is delimited by annotations. The beginning of a region is marked by “// [Region] …” and its end by “// [EndRegion] …”. Each region is named in the “…” part: ‘Port’ for ports, ‘Coupling’ for couplings, and ‘Models’ for sub-models. The initialization part of the ports is represented by “ports_init”, and the code that actually registers the ports with the simulation engine is marked “ports_reg”. The port registration section is divided into “// input ports” and “// output ports” to distinguish inputs from outputs. The DEVS Model Synchronizer reads the modeling information from the Modeling Region in the source code through a parser and stores it in a structured form, as shown in ② of Fig. 4. DEVS modeling information is described in the simulation code according to this annotation protocol, and based on it the tool synchronizes the DEVS modeling structure between the XML and the simulation source code. Synchronization between XML and code covers two cases. In the first, the modification is made in the visualizer: the XML is up to date, and the source code needs to be modified. This case is shown in ① of Fig. 4. In the second, the source code is modified: the source code is up to date, and the XML needs to be modified.
The second case is shown in ② of Fig. 4. Maintaining code consistency involves reading the XML file and the source code, comparing the differences in the DEVS structure information, and synchronizing by inserting the needed information into the parts that must be modified. This synchronization, together with skeleton code generation, provides static verification, which reduces the error rate that can occur when expressing the modeling process as source code.
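To make the annotation protocol concrete, the following is a minimal sketch (in Python, like the synchronizer) of how Modeling Regions could be recovered from generated source. The C++ fragment and port names are invented for illustration; only the “// [Region]” / “// [EndRegion]” markers come from the protocol described above.

```python
# Hypothetical generated C++ fragment using the annotation protocol.
SOURCE = """\
// [Region] Port
int Input_GenCM1;
int Output_GenCM1;
// [EndRegion] Port
// [Region] Coupling
// Outmost.out -> GenCM.in
// [EndRegion] Coupling
"""

def parse_modeling_regions(source):
    """Map each Modeling Region name to the source lines it encloses."""
    regions, current, buf = {}, None, []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("// [Region]"):
            current, buf = stripped.split("]", 1)[1].strip(), []
        elif stripped.startswith("// [EndRegion]"):
            regions[current] = buf
            current = None
        elif current is not None:
            buf.append(stripped)
    return regions

regions = parse_modeling_regions(SOURCE)
print(sorted(regions))   # ['Coupling', 'Port']
print(regions["Port"])   # ['int Input_GenCM1;', 'int Output_GenCM1;']
```

A synchronizer can then diff such a region map against the XML structure and rewrite only the regions that changed, leaving hand-written code outside the regions untouched.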


J. Yang et al.

3 Case Study

The case study uses a simple example to confirm that the tool developed in this study satisfies and implements the five requirements derived above. The example is based on a simple GBP (Generate-Buffer-Process) model with a hierarchy at least three levels deep. The GBP-type DEVS structure used in the case study is shown in Fig. 5.

Fig. 5. Sample GBP model

Fig. 6. XML code based on sample GBP model



The following is part of the XML based on Fig. 5 (Fig. 6). As described above, the DEVS structure file is composed of the ‘OutmostModel’ tag and the ‘ModelPool’ tag. The result of reading this example XML through the DEVS visualization tool developed in this study is shown in Fig. 7, which satisfies requirement 1.
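Since the full schema of the DEVS structure file is not reproduced here, the following sketch assumes hypothetical tag and attribute names (only ‘OutmostModel’ and ‘ModelPool’ come from the text) and shows how such a file could be walked with Python’s ElementTree:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the spirit of the DEVS structure file; the exact
# schema is not given, so tag and attribute names here are assumptions.
XML = """
<DEVS>
  <OutmostModel name="Outmost">
    <Coupled name="GenCM">
      <Atomic name="GenAM1"/>
      <Atomic name="GenAM2"/>
    </Coupled>
  </OutmostModel>
  <ModelPool>
    <Atomic name="BaseGen"/>
  </ModelPool>
</DEVS>
"""

root = ET.fromstring(XML)

def model_names(elem):
    """Collect 'name' attributes in document order at and below elem."""
    return [e.get("name") for e in elem.iter() if e.get("name")]

outmost = root.find("OutmostModel")
pool = root.find("ModelPool")
print(model_names(outmost))  # ['Outmost', 'GenCM', 'GenAM1', 'GenAM2']
print(model_names(pool))     # ['BaseGen']
```

The same traversal order is what a visualizer needs to populate its Treeview: the ‘OutmostModel’ subtree gives the hierarchy, while the ‘ModelPool’ subtree lists the reusable Base Models.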

Fig. 7. Screenshot of calling ‘Outmost’ and ‘GenCM’

A user can check the model structure at each level through the Treeview located on the left of the visualization tool, which satisfies requirement 2. After reading the XML information through the skeleton code generator, the source code is generated according to the DEVS model information, which satisfies requirement 3; part of the generated source code and the file list are shown in Fig. 8. Since the example XML contains six atomic models and three coupled models, a total of 18 files are generated: half are header files and half are CPP files. If a user modifies the source code, or modifies the XML containing the modeling information with the visualizer, the code consistency maintenance program can be run to perform synchronization, which satisfies requirements 4 and 5. Figure 9 shows the changes in the code when the model structure is changed with the visualizer. It is hard to quantitatively compare the performance of the proposed tool against legacy tools. Therefore, a qualitative comparison between CD++ and the proposed tool is used to clarify which features each supports (Table 2).



Fig. 8. Generated files and example header file and CPP file

Fig. 9. The changes in the code when the model structure is changed with the visualizer

In the case of CD++, more detailed forms of modeling can be visualized. Users can also set the internal state of an atomic model and write simple descriptions of the model's behavior. Because of this level of modeling detail, users can execute simulations inside CD++, although only for a lightweight version of the simulation. The proposed tool instead emphasizes interoperability with the simulation code while providing basic visualization functions. This interoperability enables a more stable and easier simulation configuration than existing legacy tools can offer.



Table 2. Qualitative comparison between CD++ and the proposed tool

Feature                                                           CD++   Proposed tool
Model visualization                                               ◎      O
Simulation execution in model visualization                       △      X
Simulation skeleton source code generation                        X      O
Synchronize between modeling information and simulation source code  X   O

4 Conclusion

It is important to describe the design of the model and the relationships between models when constructing a simulation. Making a graphical specification through the visual front-end interface enables the user to understand the concepts and to design the structure clearly. The visual front-end interface developed in this study can quickly show the relationships between DEVS models, and those relationships or internal information can easily be modified. Also, skeleton code is generated from the DEVS model information, so users can conduct modeling more efficiently and safely. Besides, users can keep the modeling information up to date through the code consistency maintenance program, which is convenient for simulation development. The method of generating source code through the visual front-end interface is based on predefined rules, so if the user does not follow the rules, errors will occur and are difficult to correct. Therefore, future studies will use an open-source library such as Clang to parse the source code into a data structure, so that users do not need to know the predefined annotation rules.

Acknowledgements. This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF2017R1D1A3B03033437).

References
1. Wainer, G.: CD++: a toolkit to develop DEVS models. Softw. Pract. Exp. 32(13), 1261–1306 (2002)
2. Wainer, G., Liu, Q.: Tools for graphical specification and visualization of DEVS models. Simulation 85(3), 131–158 (2009)
3. Kim, S., Sarjoughian, H.S., Elamvazhuthi, V.: DEVS-suite: a simulator supporting visual experimentation design and behavior monitoring. In: Proceedings of the 2009 Spring Simulation Multiconference. Society for Computer Simulation International (2009)
4. Ronacher, A.: CodeGen
5. Zeigler, B.P., Oren, T.I.: Theory of modelling and simulation. IEEE Trans. Syst. Man Cybern. 9(1), 69 (1979)

HLA-Based Federation Development Framework Supporting Model Reuse

Hang Ji¹,³,⁴, Xiang Zhai¹,³,⁴, Xiao Song², Xiaoliang Liu¹,³,⁴, Yazhou Liang¹,³,⁴, and Zhengxuan Jia¹,³,⁴

¹ State Key Laboratory of Intelligent Manufacturing System Technology, Beijing Institute of Electronic System Engineering, Beijing 100854, China
[email protected]
² School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
³ Beijing Complex Product Advanced Manufacturing Engineering Research Center, Beijing Simulation Center, Beijing 100854, China
⁴ Science and Technology on Space System Simulation Laboratory, Beijing Simulation Center, Beijing 100854, China

Abstract. In this paper we design an HLA-based federation development framework supporting model reuse. The framework, including a model filter, wrapper and scheduler, uses HLA/RTI as the distributed underlying communication and management skeleton; it adopts a model database and matching algorithm to select suitable state-machine models for users, adopts FMI to unify model interfaces and implementations, and adopts RTI to associate all the FMUs into one federation. In addition, the framework provides a distributed, heterogeneous, scalable simulation frame and adequate simulation management.

Keywords: HLA · Functional Mock-up Interface (FMI) · Model reuse

1 Introduction

In recent years, with developing information technology, simulation objects have become more sophisticated [1–4]. Simulation applications often include multiple systems, tools and models, which raises increasing demands on a simulation framework's interoperability, expansibility, reusability and consistency of spatiotemporal distribution [5–8]. High Level Architecture (HLA) is a widely used distributed collaborative simulation standard that provides basic technical support for modeling and simulation of complex systems and handles heterogeneous, distributed, synergetic simulation scenes [9–11]. It guarantees the stability and synchronism of the data distribution process and became IEEE standard 1516 in the year 2000 [12, 13]. Functional Mock-up Interface (FMI) is another emerging modeling and simulation standard. FMI defines an interface to be implemented by an executable called an FMU (Functional Mock-up Unit). FMI makes interaction among multiple model instances possible by adopting an XML-based model description file and dynamic link

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 72–81, 2018.



libraries. It enables encapsulating dynamic models into simulators and running numerous simulators under a single federation. In fact, great benefits can derive from the combined exploitation of HLA and FMI, and many organizations are already devoting effort to this. A project team from the Austrian Institute of Technology expounded the feasibility of integrating HLA and FMI and proposed two algorithms that put an FMU into the HLA federation as the master member. The results are suitable for distributed hybrid simulations; however, the integrated programs must be rewritten for each new demand, so the framework is less universal [14, 15]. Garro and Falcone investigated how to combine HLA and FMI from two perspectives, HLA for FMI and FMI for HLA, which gave us great insight into designing the simulation framework [16]; however, their article focused more on the integration protocol than on model-driven approaches. In many simulation fields, the variety of simulation models is steadily increasing, and existing models must be reused in new simulation applications. As the scale and complexity of simulation objects and application problems grow, complex simulation models come to contain simpler ones. And as simulation technology standards are continuously upgraded, models built under the original technical standards must be reused under the new standards. Due to the lack of necessary information, the reusability of existing models in new applications is difficult to judge [17]. Each individual model usually adopts a modeling method such as block diagrams, state diagrams, Petri nets, Euler nets, etc. according to the characteristics of its domain; this heterogeneity makes integration and reuse of simulation models more difficult, a problem that even HLA technology struggles to solve effectively. In this context, this paper designs an HLA-based federation development framework supporting model reuse.
The framework uses HLA/RTI as the distributed underlying communication and management skeleton, uses a reuse config to express the user's demands on models, uses a model filter to choose appropriate models from the reusable model base, uses a wrapper to encapsulate tool-dependent or standalone models into FMUs, and uses a scheduler to insert the FMUs into RTI federates and the communication bus. Sect. 2 gives the overview of our framework; Sects. 3, 4, and 5 elaborate the three core components separately: the model filter, wrapper and scheduler. The last section concludes.

2 Overview of Framework Design

The federation development framework is based on the HLA/RTI standard. The Run-Time Infrastructure (RTI) is the implementation of HLA; it provides operation management and communication services for the simulation process. Designing and programming an RTI-based simulation framework can be rather flexible and takes full advantage of HLA's distributed, heterogeneous features. The overview is shown in Fig. 1; system modeling is excluded from the framework. First, the input files of the framework are the Federation Execution Data (fed) file and the reuse config: the fed file is an agreement on the interactions among federates, and the reuse config contains the description of a model's state machine diagram. The model filter selects the models from the reusable model base that are closest to the user's expectation on the


H. Ji et al.

Fig. 1. Overall architecture of the framework

basis of their consistency with the fed file and their compatibility with the reuse config. Detailed algorithms can be found in Sect. 3. Next, the wrapper transforms the target models into FMUs through built-in compilers. FMI provides a standardized solution to convert heterogeneous models into a set of uniform interfaces, which makes it rather convenient for original equipment manufacturers to integrate and test all the systems together with high efficiency as well as reliability. FMI is gaining favor with more and more collaborative simulation model developers and has been supported by dozens of simulation tools and platforms because of its encapsulation, reusability and lightness. The detailed running logic can be found in Sect. 4. Finally, the scheduler is responsible for loading, locating and managing all the FMU resources by invoking the FMI-API, while at the same time taking part in the simulation process through the RTI-API. Since in the FMI for Model Exchange (ME) modality the solver module is not part of the FMU, it is not practicable to integrate such an FMU in an HLA simulation; as a result, only FMUs generated according to the FMI for Co-Simulation (CS) modality can be considered for inclusion in an HLA simulation. The detailed structure can be found in Sect. 5. The aforementioned framework has the following advantages:
① Using RTI as an efficient information exchange bus, shielding the complexity of the network implementation from users, enabling data filtering by publishing/subscribing relationships, and decreasing network pressure;
② Having a distributed, heterogeneous, scalable simulation frame, with adequate simulation management provided by RTI;
③ Initializing a reusable model base to store multi-system, multi-tool and multi-disciplinary models, managing models by model matching degree, and using matching relations on interfaces and state machines to filter models;
④ Unitizing models by the FMI standard, providing uniform programming interfaces, and making the development of members easier;
⑤ Performing the processes of model filtering, model unitizing and federate generating automatically; the only things requiring the user's concern are the design of the fed file and the reuse config.

3 Design of the Model Filter

3.1 Formalization Method

The reusable simulation model is a special reuse-oriented simulation model that contains a description of its own application context and interface information; it can be used independently in other simulation applications to support reuse-based modeling, reusability determination, and integration with application scenario components in new simulation applications. According to Zeigler's system specification theory, a simulation model can be expressed as an IO system specification:

$S = \langle T, X, \Omega, Y, Q, \delta_{int}, \delta_{ext}, \lambda \rangle$

where $T$ is the time base; $X$ is the input set; $\omega : T \to X$ is an input segment, with $\omega \in \Omega$ and $\Omega \subseteq (X, T)$; $Q$ is the state set; $\delta_{int} : \Omega \times Q \to Q$ is the internal transfer function; $\delta_{ext} : \Omega \times Q \to Q$ is the external transfer function; $Y$ is the output value set; and $\lambda : Q \times X \to Y$ is the output function. $\Omega$ has the closure property:

$\forall \omega_1 \langle t_l, t_r \rangle, \omega_2 \langle t_l, t_r \rangle \in \Omega \ \wedge \ \omega_1.t_r = \omega_2.t_l \ \Rightarrow \ \omega_1 \cdot \omega_2 \in \Omega$

Behavioral logic equivalence refers to the ability of a reusable simulation model to faithfully reproduce the behavioral logic of the concepts required by a new simulation application under new application scenarios. The behavioral logic of the simulation model is abstractly described in the simulation model. Determining behavioral logic equivalence involves an existing model $S^M$ in the model library and the expected model $S^N$ in the new application scenario:

$E_M = S^M \to S^N \ \vee \ S^M \cong S^N$

First of all, the logical equivalence of input and output must be met, which can be achieved by interface constraints:

$E_{IO} = T^N \subseteq T^M \ \wedge \ X^N \subseteq X^M \ \wedge \ Y^N \subseteq Y^M$

Further, we need to ensure the equivalence of the mapping between input and output. $R^N$ includes all possible input-output pairs of the system, with $R^N \subseteq \Omega^N \times (Y^N, T^N)$ and $(\omega, \rho) \in R^N \Rightarrow dom(\omega) = dom(\rho)$. Each input-output pair constitutes a unit of behavior of the system, and $R^N$ is the set of all possible behaviors of $S^N$; the following condition further ensures behavioral logic equivalence:

$E_{mapping} = \Omega^N \subseteq \Omega^M \ \wedge \ R^N \subseteq R^M$

The construction of $R^M$ is as follows. For every state $q$ in $Q^M$ and every input segment $\omega \langle t_l, t_r \rangle$, define the state trace $STRAJ_{q,\omega} : \langle t_l, t_r \rangle \to Q^M$ and the output trace $OTRAJ_{q,\omega} : \langle t_l, t_r \rangle \to Y^M$ on $\langle t_l, t_r \rangle$, which need to meet the following conditions:

$\forall t \in \langle t_l, t_r \rangle \ \Rightarrow \ STRAJ_{q,\omega}(t) = \delta^M(q, \omega_{t]})$

$\forall t \in \langle t_l, t_r \rangle \ \Rightarrow \ OTRAJ_{q,\omega}(t) = \lambda(STRAJ_{q,\omega}(t), \omega(t))$

The last condition to be satisfied is state equivalence: $E_{state}$ means that all states of the model and the outputs of those states are equivalent.

$E_{state} = \Omega^N \subseteq \Omega^M \ \wedge \ E_\delta \ \wedge \ E_\lambda$

$E_\delta = \forall q \in Q^N \ \wedge \ \forall \omega \in \Omega^N \ \Rightarrow \ \delta^N(q, \omega) = \delta^M(q, \omega)$

$E_\lambda = \forall q \in Q^N \ \wedge \ \forall \omega \in \Omega^N \ \Rightarrow \ \lambda^N(\delta^N(q, \omega), \omega) = \lambda^M(\delta^M(q, \omega), \omega)$

Among the above three types of equivalence, the $E_{IO}$ grade is the lowest and the $E_{state}$ grade is the highest.

3.2 Definition of RUconfig

RUconfig is a conceptual depiction of the requirements model. The models in each model library also have XML describing their own intrinsic logic, and RUconfig is matched against these intrinsic logic description files. State Chart XML (SCXML) is a specification for finite state machines developed by the W3C. This article formats RUconfig based on the SCXML specification. As Fig. 2 shows, RUconfig contains the following basic elements:

(1) State: this element represents a basic state in a finite state machine, with optional attributes id and initial.
(2) Transition: this element represents a transition between states. State transitions are driven by events, and the transition condition is judged. The Target property gives the target state of the transition; the Cond property gives the activation condition for the transition.
(3) Data model elements, including Input Data and Output Data elements. These are container elements that can hold one or more Data elements; SCXML defines and uses basic data items through the Data element. A Data element declares a data item within a Data model element, with a mandatory id attribute and optional src and expr attributes. The id attribute of Data



Fig. 2. Standard format of RUconfig

is used to uniquely identify a data item, and its value is specified by the src or expr attribute.
(4) Assign: this element is used to modify the data value of a Data element. It has a required attribute location, which corresponds to the id attribute of a Data element and identifies a data item, and an optional expr attribute that specifies the new data value.

3.3 Filter Process

RUconfig is a conceptual depiction of the requirements model. The models in each model library also have XML describing their own intrinsic logic, and RUconfig is matched against each model's intrinsic logic description file. Proceed as follows:

(1) Use the interface description information to filter out the models that match all interfaces.
(2) Match with RUconfig: first judge whether state equivalence can be satisfied, and output the satisfying models m1.
(3) If step 2 yields no suitable model, perform mapping equivalence matching and output the satisfying models m2.
(4) If step 3 also yields no suitable model, directly output all interface-matching models M.
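As an illustration only, the four-step cascade could be sketched as follows; the model records and the equivalence predicates are drastically simplified stand-ins for the interface, state and mapping equivalence checks of Sect. 3.1, with invented field names:

```python
def interfaces_match(model, required):
    # Step 1: required inputs/outputs must be subsets of the model's (cf. E_IO).
    return (set(required["in"]) <= set(model["in"])
            and set(required["out"]) <= set(model["out"]))

def filter_models(base, required):
    """Cascade: state equivalence, then mapping equivalence, then interfaces."""
    M = [m for m in base if interfaces_match(m, required)]
    m1 = [m for m in M if m["states"] == required["states"]]   # stand-in for E_state
    if m1:
        return m1
    m2 = [m for m in M                                          # stand-in for E_mapping
          if set(required["behaviors"]) <= set(m["behaviors"])]
    return m2 if m2 else M                                      # fall back to E_IO matches

base = [
    {"name": "genA", "in": ["start"], "out": ["job"],
     "states": {"idle", "busy"}, "behaviors": ["start->job"]},
    {"name": "genB", "in": ["start", "stop"], "out": ["job"],
     "states": {"idle"}, "behaviors": ["start->job"]},
]
required = {"in": ["start"], "out": ["job"],
            "states": {"idle", "busy"}, "behaviors": ["start->job"]}
print([m["name"] for m in filter_models(base, required)])  # ['genA']
```

Both candidate models survive the interface check here, but only the first also satisfies the (simplified) state equivalence, so the cascade stops at step 2.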



4 Design of the Wrapper

The wrapper obtains models from the model filter and provides FMI-based components for federates. As Fig. 3 shows, the wrapper consists of a register center and model compilers.

Fig. 3. Processing structure of wrapper

Both the register center and the model compilers acquire information from the filter and finish their processing before the simulation starts. The register center takes model-related messages, including model names, model interfaces and brief descriptions; it records these messages and sends them to the interface management in the scheduler. Interface management screens and determines the input/output of each model and establishes the mapping table relating attributes to FMUs. Furthermore, the model description schema in accordance with FMI can be generated automatically in the register center; to fulfill this process, the model name, model description and variables are needed (the specific requirements can be found in the FMI 2.0 standard). Two kinds of compilers are built into the wrapper: tool-dependent and standalone. Tool-dependent models are those developed in professional simulation tools and exported with specific extensions; generally the simulation tools offer methods to transform these models into FMUs, and the tool-dependent compiler simply invokes those methods to generate the target FMU. For standalone models, or models developed in FMI-unsupported tools, the wrapper provides a C-based FMU generator. The generator has a built-in template class and include files written in C. It obtains the XML schema of models from the register center, wraps the business logic into the template and generates a dynamic link library using the Microsoft C compiler. Together these elements form a standard FMU for simulation. Some key segments of the FMU template class are listed in Fig. 4: model information includes the class name and model unique id; variable size defines the variables in



Fig. 4. Key elements in framework’s built-in FMU template class

the FMU, and the supported variable types include real, integer, bool, string and enumeration; variable define includes all model variables and their value references, where every value reference has an index corresponding to the model description schema; variable initial includes the appointment of the state and the initialization of value references; variable update is the calculation of value references during the simulation; derivative mapping is the definition of derivative relations among value references.
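As a language-neutral illustration of these template sections (the real template is written in C), the sketch below mirrors the structure with invented model and variable names; integer indices stand in for the value references of the description schema:

```python
# Illustrative mirror of the FMU template sections; names are hypothetical.
class FmuTemplateSketch:
    MODEL_NAME = "MP_GenAM1"          # model information: class name,
    GUID = "hypothetical-guid"        # model unique id

    def __init__(self):
        # variable define: each variable gets a value-reference index that
        # must match the indices in the model description schema
        self.refs = {0: 0.0, 1: 0.0}  # 0: input 'rate', 1: output 'count'

    def initialize(self):
        # variable initial: appoint the start state of the value references
        self.refs[0], self.refs[1] = 1.0, 0.0

    def do_step(self, dt):
        # variable update: recompute value references each communication step
        self.refs[1] += self.refs[0] * dt

fmu = FmuTemplateSketch()
fmu.initialize()
fmu.do_step(0.5)
fmu.do_step(0.5)
print(fmu.refs[1])  # 1.0
```

Keeping the indices of `refs` aligned with the description schema is exactly the consistency the register center enforces when it generates the schema and the template together.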

5 Design of the Scheduler

The scheduler uses the FMI-API to invoke functions from FMUs and uses the RTI-API to interact with the communication bus. As Fig. 5 shows, three layers are included in the scheduler: RTI Proxy, Interface Management and FMI Adapter.

Fig. 5. Processing structure of scheduler

The RTI proxy acquires interface handles from the RTI-API, extends its federation ambassador and overrides the simulation management methods. Normally, the federation management methods include creating, joining, resigning from, deleting, registering, saving and restoring an RTI federation; the declaration management methods include publishing/subscribing objects and interactions; the object management methods include registering, discovering and updating objects and interactions; the ownership management methods include confirming, canceling, acquiring and informing authorization states; the time



management methods include receiving time advance signals and setting the regulation/constrained time policy. The RTI proxy is the only way for federates to access the services provided by HLA/RTI. Interface management combines the business logic from FMUs with the methods from the RTI proxy: it injects model-related information into the simulation management functions and provides input/output attributes for the upper and lower layers. The implementation of interface management must refer to the fed file of the federation and conform to the interfaces offered by the FMI description schema. In addition, interface management maintains an "attributeID-modelID" table to establish the mapping between properties and FMI adapters. FMI adapters extract attributes from FMUs by using the FMI-API and exchange data with interface management during the simulation. A standard FMU consists of a description schema (XML), an API (C) and optional additional dependencies. As Figs. 6 and 7 indicate, by analyzing the XML description of an FMU, we can obtain the model information and the API invoking methods. The FMI standard offers 18 core interfaces and 6 functional interfaces in C to run FMU-based models. By encapsulating the above functions, the FMI adapters carry out the process of loading, setting, instantiating, setting up, simulating and freeing FMUs. When simulating FMUs, every time step triggers the import and export of model attributes.
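The adapter cycle can be illustrated with the following sketch. `MockFmu` is a stand-in with invented method names that merely mirror the FMI co-simulation lifecycle; the real calls are C functions defined by the FMI standard, so nothing here is the actual FMI-API:

```python
class MockFmu:
    """Stand-in for a loaded co-simulation FMU (hypothetical methods)."""
    def __init__(self):
        self.t, self.inp, self.out = 0.0, 0.0, 0.0
    def setup_experiment(self, start):   # mirrors the FMI setup call
        self.t = start
    def set_real(self, ref, value):      # mirrors setting an input by value reference
        self.inp = value
    def do_step(self, t, dt):            # mirrors advancing one communication step
        self.t, self.out = t + dt, self.inp * 2.0
    def get_real(self, ref):             # mirrors reading an output by value reference
        return self.out

def run_adapter(fmu, inputs, dt=1.0):
    """Each time step imports attributes, advances the FMU, exports attributes."""
    fmu.setup_experiment(0.0)
    outputs = []
    for step, value in enumerate(inputs):
        fmu.set_real(0, value)           # import from interface management
        fmu.do_step(step * dt, dt)
        outputs.append(fmu.get_real(1))  # export back to interface management
    return outputs

print(run_adapter(MockFmu(), [1.0, 2.0, 3.0]))  # [2.0, 4.0, 6.0]
```

In the framework, `run_adapter`'s loop body would be driven by RTI time-advance grants rather than a plain Python loop, so each grant corresponds to one import/step/export cycle.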

Fig. 6. Standard model description file in FMI

Fig. 7. Running process of FMI adapter

6 Conclusion

In this paper we designed an HLA-based federation development framework supporting model reuse. The framework uses HLA/RTI as the distributed underlying communication and management skeleton, which supports distributed, heterogeneous and scalable simulation scenes. The framework uses a reusable model base to store multi-system, multi-tool and multi-disciplinary models, and reasonable algorithms to calculate the model matching degree. The framework uses the model filter, wrapper and scheduler to encapsulate selected models into RTI members, and all of those processes are automatic, which makes the federation development framework rather convenient to use.



References
1. Ke, P., Stephen, J.T., Wentong, C., Zengxiang, L.: Multi-user gaming on the grid using a service oriented HLA RTI. In: 13th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, pp. 48–56 (2009)
2. Song, X., Chai, X., Zhang, L.: Modeling framework for product lifecycle information. Simul. Model. Pract. Theory 18(8), 1080–1091 (2010)
3. Chen, Y., Peng, C., Xiao, S.: An efficient approach to collaborative simulation of variable structure systems on multi-core machines. Clust. Comput. 19, 29–46 (2016)
4. Horuk, C., et al.: Automatic and portable cloud deployment for scientific simulations. In: 12th International Conference on High Performance Computing and Simulation, Piscataway, NJ, pp. 374–381 (2014)
5. Song, X., Zhang, L.: A DEVS based modelling and methodology-COSIM. Appl. Math. Inf. Sci. 6(2), 417–423 (2012)
6. Wu, Y., Song, X., Gong, G.: Real-time load balancing scheduling algorithm for periodic simulation models. Simul. Model. Pract. Theory 52(1), 123–134 (2015)
7. Song, X., Ji, H., Tang, W.J., et al.: A simulation-language-compiler-based modeling and simulation framework. Int. J. Ind. Eng.-Theory Appl. Pract. 24(2), 134–145 (2017)
8. Song, X., Han, D., Sun, J., Zhang, Z.: A data-driven neural network approach to simulate pedestrian movement. Phys. A-Stat. Mech. Appl. 509(11), 827–844 (2018)
9. Draft Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) - Object Model Template (OMT) Specification (2008)
10. Draft Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) - Framework and Rules (2008)
11. Draft Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) - Interface Specification (2008)
12. Möller, B.: An overview of the HLA evolved modular FOMs. In: Spring Simulation Interoperability Workshop (SIW), Orlando, FL, USA, no. 07S-SIW-108. SIW, Orlando (2007)
13. Möller, B., Löfstrand, B.: Use cases for the HLA evolved modular FOMs. In: European Simulation Interoperability Workshop (EUROSIW), Genoa, Italy, no. 07E-SIW-040. SIW, Orlando (2007)
14. Awais, M.U., Cvetkovic, M., Palensky, P.: Hybrid simulation using implicit solver coupling with HLA and FMI. Int. J. Model. Simul. Sci. Comput. (2017)
15. Awais, M.U., Palensky, P., Mueller, W., Widl, E.: Distributed hybrid simulation using the HLA and the functional mock-up interface. In: Conference of the IEEE Industrial Electronics Society, vol. 20, pp. 7564–7569 (2014)
16. Garro, A., Falcone, A.: On the integration of HLA and FMI for supporting interoperability and reusability in distributed simulation. In: Symposium on Theory of Modeling & Simulation: DEVS Integrative M&S Symposium, pp. 9–16 (2015)
17. Zeigler, B.P., Kim, T.G., Praehofer, H.: Theory of Modeling and Simulation: Integrating Discrete Event and Continuous Complex Dynamic Systems, 2nd edn. Academic Press, Cambridge (2000)

Soft Computing and Machine Learning

Automatic Performance Simulation for Microservice Based Applications

Yao Sun¹,², Lun Meng³, Peng Liu¹,², Yan Zhang¹, and Haopeng Chan¹

¹ School of Software Engineering, Jinling Institute of Technology, Nanjing 211169, China
[email protected]
² Nanjing Institute of Big Data, Nanjing 211169, China
³ College of Public Administration, Hohai University, Nanjing 210098, China

Abstract. As microservices can easily scale up and down to adapt to dynamic workloads, various Internet-based applications adopt the microservice architecture to provide online services. Existing works often model an application's performance from historical training data, but such static models cannot adapt to dynamic workloads and complex applications. To address this issue, this paper proposes an adaptive automatic simulation approach to evaluate application performance. We first model application performance with a queue-based model, which well represents the correlations between workloads and performance metrics. Then, we predict application response time by adjusting the parameters of the performance model with an adaptive fuzzy Kalman filter. Thus, we can predict application performance by simulating various dynamic workloads. Finally, we deployed a typical microservice-based application and simulated workloads in experiments to validate our approach. Experimental results show that our approach to performance simulation is much more accurate and effective than existing ones in predicting response time.

Keywords: Microservice · Application performance · Fuzzy logic · Kalman filter · Performance simulation

1 Introduction

Traditional software architectures cannot adapt to the rapid change of users' requirements in dynamic and complex networks. Microservice architectures aim at developing and operating maintainable and extensible applications. The architecture separates a complex application into independent distributed service components with specific functions and uses lightweight communication mechanisms. However, since microservices have many components and the dependencies among them are extraordinarily complex, predicting the performance of microservice-based applications is difficult. Therefore, predicting performance by simulating workloads has become the key to guaranteeing the performance of microservice-based applications. Microservice based applications are often composed of fine-grained isolated service components that use lightweight communication protocols to cooperate with each other. These autonomous components update independently to flexibly meet changing requirements. Furthermore, different components of an application often have various resource requirements. Existing works often model an application's performance with required domain knowledge. Moreover, since they set the static parameters of the application's performance model according to historical data, these approaches cannot adapt to dynamic workloads. To address the above issue, this paper proposes an adaptive automatic performance simulation approach for microservice based applications. We first model an application's performance with a queue-based model, which presents the correlations between workloads and performance metrics (e.g., response time, throughput, resource usage), so the model can self-adapt to various microservice based applications. Then, we adjust the parameters of the application's performance model with a fuzzy Kalman filter [17], so the model can self-adapt to changing workloads. The contributions of this paper are listed as follows:

• We model applications' performance with a Jackson queueing network model, which automatically presents the correlations between workloads and performance metrics without domain knowledge.
• We predict workloads with a fuzzy Kalman filter, which can conveniently adjust the parameters of applications' performance model to adapt to changing workloads.
• We have conducted extensive experiments in real scenarios to validate the precision in predicting response time.

The remainder of this paper is organized as follows. Section 2 reviews related works. Section 3 presents the response time prediction method, which combines a Jackson queueing network and a fuzzy Kalman filter. Section 4 validates the proposed approach with a series of experiments. Section 5 concludes this paper.

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 85–95, 2018.

2 Related Works

Existing works on provisioning resources often use machine learning, fuzzy logic and admission control to train models and calculate parameters. Lama et al. [8] predict spikes in applications' resource requirements and provision physical resources within quality-of-service (QoS) constraints. Cao et al. [9] estimate the supported workloads according to the allocated resources, and use an admission control method to guarantee QoS by denying overloading requests. Cherkasova et al. [10] model the correlations between workloads and throughput, estimate resource requirements, and accept requests with an admission mechanism. Robertsson et al. [11] use a linear model to predict resource utilization for guaranteeing QoS. Xu et al. [12] consider the performance overhead of VMs and trace the changes of applications' resources in a virtualization environment. Karlsson et al. [13] propose a performance isolation approach to analyze applications' resource requirements, and model applications' performance to adaptively provision resources. Lama et al. [14] analyze performance interference between virtual machines with machine learning to provision resources. Bodík et al. [15] train the parameters of performance models based on historical data instances to dynamically adjust resources. Thant et al. [16] conduct scientific workflow optimization on makespan minimization, virtual machine deployment cost minimization, and virtual machine failure minimization in the cloud infrastructure in a level-wise manner. These approaches train models and obtain parameters from a static training dataset, so they cannot adapt to dynamically fluctuating workloads. To address this issue, we propose a fuzzy Kalman filter-based approach to provision physical resources, which does not need historical monitoring data instances and converges rapidly.

3 Response Time Prediction

Response time is a key QoS (Quality of Service) metric of applications, which is often decided by dynamic workloads and allocated physical resources. We first use a Jackson queueing network to model the correlation between workloads and QoS (i.e., response time). Then, we estimate the model's parameters with a Kalman filter, and dynamically adjust the filter's parameters with fuzzy logic to improve prediction accuracy. Finally, we can use the constructed and optimized performance model to predict the response time of an application.

3.1 Jackson Queueing Network-Based Performance Model

To predict response time, we use a performance model to correlate workloads and response time. A Jackson queueing network can well present the characteristics of how a typical microservice-based application processes a request flow, as follows [1]:

• Application components are independent, and the nodes corresponding to components are likewise independent in a Jackson queueing network model.
• Application components communicate with a message bus, while the open-loop Jackson queueing network transfers data in an exponential distribution.
• A request event is processed by an application component (node), and then arrives at the next component (node) or leaves the network.

Fig. 1. Network queuing model of microservice based applications



Processing a request proceeds as follows: a user sends a request to an application, the request is encapsulated as an event, the event is processed by many components in sequence, and then a response is sent back to the user. When one component has many instances, the application uses a round-robin scheduling strategy to process the request. As shown in Fig. 1, we model an application composed of many components with a Jackson queueing network, where $f$ is a user's request flow, $j$ is an application component, and $i$ indexes the $i$th instance of the $j$th component. Each request flow is processed by $n$ components $j_1, j_2, \ldots, j_n$; each component $j$ has $m$ instances, and each instance $k$, denoted $j_k$ ($1 \le k \le m$), runs in a container. Various components have different resource requirements (e.g., CPU and I/O intensive), and we define the resource preference ($rp_j$) of component $j$ as the resource type with the highest resource utilization. We can model a Jackson queueing network to predict response time as follows. First, we present the resource utilization of $rp_j$:

$$u_j = \sum_i \left( u_{0j} + s_j \cdot c_{ji} \cdot T_{ji} \right), \quad 0 \le u_j \le 1, \qquad (1)$$

where $u_{0j}$ is the resource utilization of $rp_j$ with no workload; $s_j$ is the correlation coefficient between workloads and resource utilization; $c_{ji}$ is the concurrency number (i.e., the number of requests in a second) of instance $j_i$, which fits a Poisson distribution; and $T_{ji}$ is the processing time of instance $j_i$. Then, we calculate the response time of a request flow $f$:

$$B = d + \sum_j \frac{T_j}{1 - u_j}, \qquad (2)$$

where $T_j$ is the average processing time of component $j$ and $d$ is the transmitting time of request flow $f$. We can get $u_{0j}$ and $c_{ji}$ by monitoring applications and get $s_j$ from experience with historical data, but estimating the parameters $T_{ji}$ and $d$ is difficult.
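As a concrete reading of the response-time formula above ($B = d + \sum_j T_j/(1 - u_j)$), here is a minimal Python sketch; this is our own illustration, not code from the paper, and the component values in the usage comment are hypothetical:

```python
def response_time(d, T, u):
    """Response time of a request flow: transmitting time d plus, for each
    component j, its average processing time T[j] inflated by the utilization
    u[j] of its preferred resource (queueing delay grows as u[j] -> 1)."""
    if any(not (0.0 <= uj < 1.0) for uj in u):
        raise ValueError("each utilization must satisfy 0 <= u_j < 1")
    return d + sum(Tj / (1.0 - uj) for Tj, uj in zip(T, u))

# Example: d = 1.0, two components with T = [2.0, 3.0] and u = [0.5, 0.0]
# gives 1.0 + 2.0/0.5 + 3.0/1.0 = 8.0
```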

3.2 Kalman Filter Based Performance Prediction

To predict response time, we need to estimate $T_{ji}$ and $d$. The Kalman filter is an optimal linear state estimation approach, which recursively predicts the next value using the previous predicted value and the current monitoring data instance [2]. As the Kalman filter can rectify the constructed prediction model online with low computation overhead, it adapts well to dynamic workloads. Thus, we use the Kalman filter to predict $T_{ji}$ and $d$ in the performance model as follows:

$$X_{k+1} = A X_k + W_k, \qquad (3)$$

$$Z_k = H_k X_k + V_k, \qquad (4)$$

where $Z_k = (T_j, d)^T\ \forall j$ is an observation matrix with the observed components' processing times; $X_k$ is a prediction matrix with the predicted components' processing times; $H_k = (u_j, u_{0j}, c_{ji}, B)^T\ \forall i$ represents workloads, resource utilization and response time; $W_k$ is an excitation white-noise covariance matrix, which fits the Gaussian distribution $W_k \sim N(0, W)$; and $V_k$ is a measurement white-noise covariance matrix, which fits the Gaussian distribution $V_k \sim N(0, V)$ [3]. $W_k$ and $V_k$ should change online with dynamic workloads over time as follows:

$$W_k = TW, \qquad (5)$$

$$V_k = UV, \qquad (6)$$

where $T$ and $U$ are time-varying adjustment parameters. We can then obtain the prediction matrix by iterating the Kalman filter model as follows:

1. Update $X$ with $w_{k-1} = 0$:
$$\hat{X}_k^- = A \hat{X}_{k-1} \qquad (7)$$

2. Update the covariance matrix $P_k^-$:
$$P_k^- = A P_{k-1} A^T + W_k \qquad (8)$$

3. Calculate the Kalman gain:
$$K_k = P_k^- H_k^T \left( H_k P_k^- H_k^T + V_k \right)^{-1} \qquad (9)$$

4. Rectify $X$:
$$\hat{X}_k = \hat{X}_k^- + K_k \left( z_k - H_k \hat{X}_k^- \right) \qquad (10)$$

5. Rectify $P_k$:
$$P_k = (I - K_k H_k) P_k^- \qquad (11)$$

The Kalman filter rectifies the predicted value with the last calculated value in each iteration, so it does not require historical data instances.
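The five update steps above can be sketched with numpy as follows. This is a minimal illustration of ours, not the authors' implementation; it assumes the matrices A, H, W and V are given:

```python
import numpy as np

def kalman_step(A, H, W, V, x_prev, P_prev, z):
    """One iteration of the predict/rectify cycle (steps 1-5 above)."""
    # Step 1: predict the state from the previous estimate
    x_pred = A @ x_prev
    # Step 2: predict the error covariance
    P_pred = A @ P_prev @ A.T + W
    # Step 3: compute the Kalman gain
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + V)
    # Step 4: rectify the state with the measurement residual
    x = x_pred + K @ (z - H @ x_pred)
    # Step 5: rectify the covariance
    P = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred
    return x, P
```

Iterating `kalman_step` with each fresh measurement converges toward the observed processing times without keeping any history, which is the property the section relies on.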

3.3 Feedback-Based Model Adjustment

Kalman filter-based prediction has the following drawbacks:

1. We ought to define a precise state transition matrix, which decides the prediction precision. Since the workloads of each application component are irregular and nonlinear, defining a precise state transition matrix is difficult.
2. Collected data instances are all assigned the same weight, so the Kalman filter cannot well rectify the model as time goes on, because of the growing set of historical data instances.



3. A typical Kalman filter cannot process irregular time-varying data instances, so it requires feedback-based methods to adjust the noise and excitation matrices online.

To address the above issues, we adjust the parameters of the Kalman filter model online to adapt to changing workloads by decreasing the fluctuation of the residual error, which is white noise. We analyze the mean and variance of the residual error, and then adjust the noise matrices $W$ and $V$ in formulas (3) and (4) with fuzzy logic to rectify the Kalman filter. The residual error is the deviation between the predicted value and the monitored one:

$$r = \hat{Z}(k) - Z(k). \qquad (12)$$

We ought to modify the constructed model if the variance of the residual error violates the predefined threshold. The residual error variance $P(r)$ is calculated as follows:

$$P(r) = A \left( H_k P_k H_k^T + W \right) H_k^T + V. \qquad (13)$$

We define a fuzzy logic function combining a membership (subordinating degree) function and fuzzy rules to adjust $T$ and $U$ in formulas (5) and (6) according to the mean and variance of the residual error. As fuzzy logic control methods are effective for dealing with nonlinear systems in practice, we use a Takagi-Sugeno (TS) fuzzy logic model [16] to adjust the parameters of the Kalman filter. The fuzzy logic rules $R_i$ are described as follows:

$$R_i: \text{if } x_1 \text{ is } A_{i1} \text{ and } x_2 \text{ is } A_{i2} \text{ then } u_i = p_{i1} x_1 + p_{i2} x_2 + r_i, \quad 1 \le i \le n, \qquad (14)$$


where $R_i$ is the $i$th rule; $x_j$ is the input in the $j$th set; $u_i$ is the output of the $i$th rule; $p_{ij}$ is the weight of $x_j$; $r_i$ is the constant in $R_i$; and $n$ is the number of rules. Assuming that $x_j$ activates $R_i$ with output $u_i$, we can get the overall result as:

$$U = \sum_i^m u_i w_i, \qquad (15)$$

where $m$ is the number of rules and $w_i$ is the weight of $u_i$. We calculate the mean and variance of the residual errors (formula (12)) over a period with $n$ samples as:

$$avg = \frac{1}{n} \sum r_i, \qquad (16)$$

$$cov = \frac{1}{n} \sum r_i r_i^T. \qquad (17)$$

We compare the measured variance of the residual errors (formula (17)) with the predicted one (formula (13)). If the measured variance is significantly larger than the predicted one, and the measured mean is significantly larger than zero, the precision of the Kalman filter decreases and we adjust the noise matrices. We use the mean and variance of the residual errors as two inputs, and the adjustment parameters $T$ and $U$ of the noise matrices $W$ and $V$ as outputs, to construct a fuzzy logic model. As shown in Table 1, "zero" means that $T$ and $U$ do not change, "small" means that $T$ increases and $U$ decreases, "large" means that $T$ decreases and $U$ increases, and "medium" means that $T$ and $U$ both increase. We conducted a series of experiments in practice and collected many data instances, from which we derived optimal linear formulas, for example:

If the variance is small and the mean is zero:

$$T = P(r) \times 0.3 + 0.8, \quad U = P(r) \times 0.2 + 1.9.$$

If the variance is large and the mean is small:

$$T = P(r) \times 0.5 + 0.6, \quad U = P(r) \times 0.1 + 1.4.$$

According to the rules calculated from training data instances, we obtain a series of rules to adjust the parameters online and improve the precision of the Kalman filter. Note that our approach is also suitable for other applications and scenarios, for which corresponding domain-specific rules can be derived.

Table 1. Fuzzy logic rules

Variance of residual error | Mean of residual errors: Zero | Small | Large
Zero                       | Small                         | Zero  | Zero
Small                      | Small                         | Zero  | Large
Large                      | Large                         | Large | Medium
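The two example linear rules quoted above can be captured as a small lookup function. This is a sketch of ours: the function name and the default branch are assumptions, and a full controller would cover every entry of Table 1:

```python
def adjust_noise_factors(p_r, variance, mean):
    """Return the (T, U) scaling factors for the noise matrices W and V,
    given the residual-error variance P(r) and the fuzzy levels of the
    measured variance and mean ("zero" / "small" / "large")."""
    # Rule from the text: small variance, zero mean
    if variance == "small" and mean == "zero":
        return p_r * 0.3 + 0.8, p_r * 0.2 + 1.9
    # Rule from the text: large variance, small mean
    if variance == "large" and mean == "small":
        return p_r * 0.5 + 0.6, p_r * 0.1 + 1.4
    # Default branch (our assumption): leave W and V unchanged
    return 1.0, 1.0
```

After each monitoring period, the returned factors would rescale W and V (formulas (5) and (6)) before the next Kalman iteration.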

4 Experiment

4.1 Experimental Environment

We set up a cluster composed of eight hosts connected with a gigabit Ethernet network. Each host has an Intel Core i7 CPU at 3.4 GHz and 8 GB of memory; each host runs CentOS 7 and Docker 1.7 and hosts several containers for applications. As shown in Fig. 2, the experimental environment has a Master node, four Slave nodes (i.e., Slave 1-4), a database node running MySQL, a workload generator node running JMeter, and a load balancer node running Nginx. We deploy Web Serving, a typical application in CloudSuite, a benchmark suite for applications.



Fig. 2. Experimental environment (Master node, Slaves 1-4, Database, and Load Balancer)


4.2 Response Time Prediction

(1) Workloads with Trend

In this subsection, we simulate increasing workloads and collect response time periodically. We implement typical methods to predict response time according to workloads, and evaluate these existing methods and our approach by comparing measured response time with predicted response time. We introduce the existing methods as shown in Fig. 3.

Fig. 3. Response time prediction comparison (x-axis: workloads in concurrency number, 0-300; y-axis: response time in ms; series: Real Workloads, RL, FL, FC, Our Approach)



A. Reinforcement Learning (RL) [4]: This method uses reinforcement learning to train and adjust the performance model. The experimental results show that RL converges slowly under sudden workloads, e.g., in the periods of the 150-200th second and the 230-270th second. Furthermore, RL requires a large-scale training dataset. RL cannot adapt to dynamic workloads.

B. Fuzzy Logic (FL) [5]: This method designs a neural fuzzy controller to predict response time, which self-constructs its structure and adapts its parameters through online learning. A fuzzy controller requires suitable parameters that decide the model's accuracy, but manually setting these parameters requires knowledge, and the parameters should vary with workloads. The experimental results show that the predicted response time significantly deviates from the measured response time in the periods of the 50-100th second and the 150-300th second, because the parameters do not change with workloads.

C. Feedback Control (FC) [6]: This method uses control-theoretic techniques to model and design the feedback loops, which reduces the complexity of training rules and improves the stability of running systems. The experimental results show that the method cannot well predict response time in the initial period, because the feedback controller requires a long period to converge.

D. Our Approach: The experimental results show that the errors occur mostly in the initialization period, and the error rate is less than 5%, because our approach needs training data to train the performance model. In particular, the error rate is less than 0.5% in the period of the 200-250th second, where workloads suddenly increase. The experimental results demonstrate that our approach can well adapt to dynamic workloads and predict response time with high accuracy.

(2) Workloads with Periodicity

In this subsection, we simulate periodic workloads in a time-of-day pattern.
Then, we compare the RAMA approach [7], widely used for periodic workloads, with our approach in predicting response time. The experimental results in Fig. 4 show that our approach has lower prediction accuracy than RAMA in the initialization period; the accuracy of our approach then increases steadily and finally exceeds that of RAMA. The results show that RAMA cannot well predict response time: its error rate ranges from 20-80%, and exceeds 80% in periods of workload fluctuation. The error rate of our approach is about 5%, which is much lower than that of RAMA. Errors mostly occur in the initialization period, and we can reduce these errors by increasing the heartbeat rate.

Fig. 4. Response time prediction compared with RAMA (x-axis: workloads in concurrency number, 0-300; y-axis: response time in ms; series: Real Workloads, Our Approach)

5 Conclusions

Applications often use microservices as their service infrastructure to provide online services. To guarantee the QoS of applications, this paper proposes a feedback-based performance model and simulation approach to deal with dynamic workloads. We model applications' performance with a Jackson queueing network model that presents the correlations between workloads and performance metrics, and predict applications' response time by adjusting the parameters of the performance model online with a fuzzy Kalman filter. Finally, we have deployed typical applications and simulated typical workloads to validate the proposed framework. Experimental results show that our approach is much more accurate than existing ones in predicting response time.

Acknowledgment. This work was supported by the Ministry of Education of Humanities and Social Science Research (grant 17YJCZH156 and grant 15YJCZH117), the National Social Science Foundation of China (grant 16CXW027), and the Fundamental Research Fund for the Central Universities (grant 2014B00514).

References

1. Shanthikumar, J.G., Buzacott, J.A.: Open queueing network models of dynamic job shops. Int. J. Prod. Res. 19(3), 255–266 (1981)
2. Kalman, R.E.: A new approach to linear filtering and prediction problems. Trans. ASME-J. Basic Eng. 82, 35–45 (1960)
3. Sinopoli, B., Schenato, L., Franceschetti, M., et al.: Kalman filtering with intermittent observations. IEEE Trans. Autom. Control 49(9), 1453–1464 (2004)
4. Martinez, J.F., Ipek, E.: Dynamic multicore resource management: a machine learning approach. IEEE Micro 29(5), 8–17 (2009)
5. Lama, P., Zhou, X.: Autonomic provisioning with self-adaptive neural fuzzy control for end-to-end delay guarantee. In: IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 151–160 (2010)
6. Lu, C., Lu, Y., Abdelzaher, T.F., et al.: Feedback control architecture and design methodology for service delay guarantees in web servers. IEEE Trans. Parallel Distrib. Syst. 17(9), 1014–1027 (2006)
7. Lama, P., Guo, Y., Zhou, X.: Autonomic performance and power control for co-located Web applications on virtualized servers. In: IEEE/ACM 21st International Symposium on Quality of Service (IWQoS), pp. 1–10 (2013)
8. Lama, P., Zhou, X.: Efficient server provisioning with control for end-to-end response time guarantee on multitier clusters. IEEE Trans. Parallel Distrib. Syst. 23(1), 78–86 (2012)
9. Cao, J., Zhang, W., Tan, W.: Dynamic control of data streaming and processing in a virtualized environment. IEEE Trans. Autom. Sci. Eng. 9(2), 365–376 (2012)
10. Cherkasova, L., Phaal, P.: Session-based admission control: a mechanism for peak load management of commercial web sites. IEEE Trans. Comput. 51(6), 669–685 (2002)
11. Robertsson, A., Wittenmark, B., Kihl, M., et al.: Design and evaluation of load control in web server systems. In: Proceedings of the IEEE American Control Conference, vol. 3, pp. 1980–1985 (2004)
12. Xu, C.Z., Rao, J., Bu, X.: URL: a unified reinforcement learning approach for autonomic cloud management. J. Parallel Distrib. Comput. 72(2), 95–105 (2012)
13. Karlsson, M., Karamanolis, C., Zhu, X.: Triage: performance isolation and differentiation for storage systems. In: IEEE International Workshop on Quality of Service (IWQoS), pp. 67–74 (2004)
14. Lama, P., Zhou, X.: Autonomic provisioning with self-adaptive neural fuzzy control for percentile-based delay guarantee. ACM Trans. Auton. Adapt. Syst. 8(2), 9 (2013)
15. Bodík, P., Griffith, R., Sutton, C., et al.: Statistical machine learning makes automatic control practical for internet datacenters. In: Proceedings of the Conference on Hot Topics in Cloud Computing, pp. 12–21 (2009)
16. Takagi, T., Sugeno, M.: Fuzzy identification of systems and its applications to modeling and control. IEEE Trans. Syst. Man Cybern. 1, 116–132 (1985)
17. Daum, F.E.: Extended Kalman filters. In: Baillieul, J., Samad, T. (eds.) Encyclopedia of Systems and Control. Springer, London (2014)

Predictive Simulation of Public Transportation Using Deep Learning

Muhammad Shalihin Bin Othman and Gary Tan

School of Computing, National University of Singapore, Singapore, Singapore
[email protected], [email protected]

Abstract. Traffic congestion has been one of the most common issues faced today, as the rising population and growing economy call for higher demands in efficient transportation. Looking into the public transport system in Singapore, we investigate its efficiency through a simple simulation and introduce predictive travel times to simulate into the future so as to identify congestion issues well in advance. Public transport operators can then utilize the reports to apply strategic resolutions in order to mitigate or avoid those issues beforehand. A deep neural network regression model is proposed to predict congestion, which is then used to compute future travel times. Experiments showed that the proposed methods are able to inject a better sense of realism into future simulations.

Keywords: Simulation · Deep learning · Public transport


1 Introduction

Singapore has been facing increasing traffic congestion problems over the years [1]. Several measures have been taken by the Land Transport Authority (LTA) of Singapore, such as Electronic Road Pricing (ERP) [2], to reduce congestion on major roads, while actively encouraging the use of public transport so as to reduce the number of vehicles on the roads altogether. Singapore already has one of the most efficient public transportation systems in the world, and the government aims to further enhance both the bus and rail networks to provide commuters with easier access to public transport. By 2030, it is expected that 8 in 10 households will be within a 10-minute walk of an MRT station. Even so, there is still a significant proportion of commuters (41.14% in 2018) who opt for buses as their mode of transportation. Currently, the LTA has introduced various policies, such as dedicated bus lanes and mandatory give-way to buses, so that priority is given to buses when traveling along more congested road stretches. However, travel delays are still frequently experienced by commuters, especially during the morning rush hours and peak-hour travel in the evening. On many occasions, commuters have to wait a long time for the bus to arrive, and in many instances they are not able to board the bus due to overcrowding.

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 96–106, 2018.



This paper aims to mitigate these issues by proposing a deep learning technique to predict the probability of congestion along bus routes before computing travel times based on the likelihood of congestion. These travel times are then injected into a simple simulator in place of the commonly used technique of randomly picking values from a distribution. Through such simulation, we can explore bus utilization and plan effective scheduling more realistically in advance. With the availability of various data sets from the LTA Data Mall [3], several experiments were carried out to finalize the methods for predicting congestion. A deep neural network regression is proposed to predict the probability of congestion based on the actual travel times between any given two bus stops, given the day and time. The predicted travel times are then fed to a simulator to simulate several demand possibilities and produce reports of utilization as well as shortcomings that can be addressed and re-simulated to seek optimal scheduling plans. Figure 1 shows the overview of our proposed methods.

Fig. 1. Proposed architecture overview

Overall, the aims of this paper are to produce the following:

– A conceptual contribution of applying deep learning techniques to simulation in order to achieve a greater sense of realism.
– Technical contributions in the algorithms we applied to effectively predict congestion probabilities and compute future travel times.
– A utility contribution of the simulation model, which is equipped to inject predictive travel times achieved through deep learning.

Section 2 discusses related works that have impacted and led the thought process in this paper. Section 3 delineates each process taken to produce the final work, while Sect. 4 justifies the verifiability and validity of the simulation model, as well as the process and results of the experiments we carried out to evaluate the accuracy of the deep neural network regression model. Finally, we conclude along with future work in Sect. 5.


2 Related Work

Several research works today have been driven by the promising efficacy of machine learning and deep learning algorithms [4,5]. A previous work [6] that was done sought to improve travel duration by predicting congestion and its duration in advance so as to estimate the best route or alternatives for commuters taking private transport. Some challenges faced include the uncertainty of routes taken by private transport causing difficulties in accurate prediction through deep learning. The proposed MLP-LR model was able to achieve a good average accuracy of 63% in predicting congestions in Singapore island-wide, with only 5–10 min of variance for each congestion duration prediction. Modeling and simulation has also seen effective results on improving transportation systems [7,8]. Gupta et al. [9] presented a framework that integrates the optimization of control strategies with the generation of predictive travel time guidance within DynaMIT2.0 [10], a multi-modal multi-data source driven simulation-based short-term traffic prediction system. The novel integration resulted in the consistency between the actual outcomes and the expectation of users when responding to the predictive information and network control. Lu et al. [11], on the other hand, introduced SimMobility - a simulator platform for mobility, which is an agent based, fully econometric, activity-based demand model, integrated with a dynamic traffic assignment model. It also integrates various mobility-sensitive behavioral models within a multi-scale simulation platform that considers land-use, transportation and communication interactions. The paper presented the efficiency of the simulator platform in terms of computational time and prediction accuracy. In view of these works, we made use of the MLP-LR model and a neural network regression model that feeds travel time into a simple simulator to explore the viability of the ML-Simulation concept [12]. 
In this paper, we explore the concept further with a deep learning approach and an intuitive algorithm to compute travel times, and experiment with the results on a similar simulator.





Our proposed method involves retrieving and cleaning data from LTA's Data Mall, then computing the probability of congestion based on the history of actual travel times to train a deep neural network regression model. The model can then be used to predict these probabilities of congestion in the future, which we use to compute travel times. This allows for a more realistic simulation compared to sampling from a fixed distribution. The simulator would then be capable of analyzing transportation systems well in advance, such that transport operators could mitigate inefficiencies before they happen. The following subsections discuss each process in greater detail.

3.1


From the LTA Data Mall API, we can retrieve bus services that are operational, bus routes containing all bus stops for each bus service, as well as the bus arrival times for each bus at each bus stop. For this research, the following data were collected:

1. Concatenated Bus Stop Code - to represent the unique journey between 2 bus stops (e.g., for every bus service that travels from Point A to D, we record the bus stop codes for AB, BC and CD)
2. Day of week (wi ∈ {1, 2, 3, 4, 5, 6, 7}, where 1 represents Monday ... 7 represents Sunday)
3. Time of day (ti ∈ {0 ... 2359}, in 24-h time format)
4. Expected Travel Time (in minutes)
5. Actual Travel Time (in minutes)

Data were collected for 15 different bus services operating from Monday to Friday over two months, excluding public holidays. We exclude weekends and public holidays so as to focus on road conditions during regular working days in Singapore. At the end of all data retrieval activities, we have a total of 61,000 data points for training and testing.

3.2 Deep Neural Network Regression

The Deep Neural Network Regression (DNNR) model we use in this paper is unlike a regular linear regression model. The DNNR uses a deep neural network for model training to predict the probability of congestion (y) based on three features: concatenated bus stop code, day and time. The DNNR, consisting of 4 hidden layers with [50, 200, 100, 10] neurons respectively, follows the equation:

$$y = \varphi\left( \sum_{i=1}^{n} w_i x_i + b \right)$$



A softmax activation function was used between layers in order to "squash" the vectors and produce entries in the range of 0 to 1. The output of the model is then our probability of congestion, y, whose values between 0 and 1, inclusive, range from not congested to congested. As the neural network is trained, the loss is computed using the Mean Squared Error (MSE):

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$

where $y_i$ is the congestion probability predicted through the neural network computations, and $\hat{y}_i$ is the actual congestion probability we have for training. The weights are then optimized through back-propagation learning using the Adaptive Gradient Optimizer [13] with a 0.001 learning rate and an initial accumulator value of 0.01 over 1000 iterations to allow it to converge.
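A minimal numpy sketch of the described forward pass and loss follows. This is our own illustration: the weights are random stand-ins for trained ones, and the single sigmoid output head is an assumption, since the text only states that the output lies in [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(0)
# 3 input features (encoded cid, day, time), 4 hidden layers, 1 output
sizes = [3, 50, 200, 100, 10, 1]
weights = [rng.normal(0.0, 0.1, (m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def forward(x):
    """Forward pass: softmax 'squashes' each hidden layer to (0, 1)."""
    h = np.asarray(x, dtype=float)
    for Wl, bl in zip(weights[:-1], biases[:-1]):
        h = softmax(Wl @ h + bl)
    z = weights[-1] @ h + biases[-1]
    # Sigmoid head (assumption) producing a single congestion probability
    return float(1.0 / (1.0 + np.exp(-z[0])))

def mse(y, y_hat):
    """Mean Squared Error between predicted and actual probabilities."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean((y - y_hat) ** 2))
```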

Training and Prediction

For every stop each bus makes, the bus stop code for the starting stop (id1) and the bus stop code for the next stop (id2) are concatenated (cid = id1id2) and recorded. Then, we get the real-time bus arrival times from LTA’s API and subtract the departure time at the previous stop to get the actual travel time between the two stops. To compute the estimated travel time, we use the initial bus schedules instead of the real-time data. We would then have a list of the following data points:

1. Concatenated Bus Stop Code, cid
2. Day, d
3. Time, t
4. Actual Travel Time, att
5. Expected Travel Time, ett
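One such data point could be assembled as below (a sketch: the field names, the sample stop codes, and the helper itself are illustrative, not part of the LTA API):

```python
from datetime import datetime

def make_record(id1: str, id2: str, depart: datetime, arrive: datetime,
                scheduled_minutes: float) -> dict:
    """Assemble one (cid, d, t, att, ett) record as described above."""
    cid = id1 + id2  # concatenated bus stop code
    att = (arrive - depart).total_seconds() / 60.0  # actual travel time
    return {
        "cid": cid,
        "day": depart.isoweekday(),                # 1 = Monday ... 7 = Sunday
        "time": depart.hour * 100 + depart.minute, # 24-h HHMM format
        "att": round(att, 1),
        "ett": scheduled_minutes,  # expected travel time from the schedule
    }

rec = make_record("83139", "83049",
                  datetime(2018, 3, 5, 8, 30), datetime(2018, 3, 5, 8, 37),
                  scheduled_minutes=5.0)
# rec["day"] == 1 (a Monday), rec["time"] == 830, rec["att"] == 7.0
```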

Algorithm 1 shows how we iterate over a training dataset D to compute the probability of congestion and feed the inputs to train the DNNR model. The intuition behind Algorithm 1 is that the maximum travel time between two stops reflects the worst case of congestion, while the minimum travel time reflects no congestion. Hence, Line 3 first checks that the maximum is indeed higher than the minimum; otherwise no congestion ever happens on that stretch of road. The DNNR takes the concatenated unique identifier, the day of week and the time of day as inputs and produces the probability of congestion, between 0 and 1, as output. Once we have a trained DNNR model, we can use Algorithm 2 to predict the probability of congestion on the test dataset T and compute the predicted travel time, travel_time. Again, the intuition behind Algorithm 2 is that the predicted travel time is the sum of the worst travel time multiplied by the probability of congestion p and the best travel time multiplied by the probability of no congestion (1 − p). This finally gives us a list of predicted travel times that we can inject into the simulator for a realistic simulation into the future.

Predictive Simulation of Public Transportation Using Deep Learning


Algorithm 1. Training Algorithm
Require: Functions max(cid) and min(cid), returning the maximum and minimum travel time for a specific cid respectively.
1: for each d ∈ D do
2:   get att, cid, day, time from d
3:   if max(cid) > min(cid) AND att > min(cid) then
4:     p ← (att − min(cid))/(max(cid) − min(cid))
5:     train(cid, day, time, p)
6:   else
7:     train(cid, day, time, 0.0)
8:   end if
9: end for

Algorithm 2. Prediction Algorithm
Require: Functions max(cid) and min(cid), returning the maximum and minimum travel time for a specific cid respectively.
1: for each t ∈ T do
2:   get cid, day, time from t
3:   p ← predict(cid, day, time)
4:   travel_time ← p × max(cid) + (1 − p) × min(cid)
5: end for
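The two algorithms can be sketched in plain Python as follows (names are illustrative, and train() from Algorithm 1 is replaced here by simply collecting the training tuples):

```python
def train_probabilities(D, max_tt, min_tt):
    """Algorithm 1: derive a congestion probability p for each record.
    max_tt/min_tt map a cid to its max/min observed travel time."""
    samples = []
    for att, cid, day, time in D:
        lo, hi = min_tt[cid], max_tt[cid]
        if hi > lo and att > lo:
            p = (att - lo) / (hi - lo)
        else:
            p = 0.0  # no congestion ever observed on this stretch
        samples.append((cid, day, time, p))  # stands in for train(...)
    return samples

def predicted_travel_time(p, cid, max_tt, min_tt):
    """Algorithm 2: blend worst and best case by congestion probability p."""
    return p * max_tt[cid] + (1 - p) * min_tt[cid]

max_tt, min_tt = {"AB": 10.0}, {"AB": 4.0}
samples = train_probabilities([(7.0, "AB", 1, 830)], max_tt, min_tt)
# p = (7 - 4) / (10 - 4) = 0.5
tt = predicted_travel_time(samples[0][3], "AB", max_tt, min_tt)
# tt = 0.5 * 10 + 0.5 * 4 = 7.0
```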



The purpose of the simulator is to allow effective analysis of transportation systems by simulating into the future well in advance so as to uncover possible bottlenecks or congestion. With such reports, transport operators can quickly address issues or rerun simulations with their solutions in order to seek optimal scheduling plans in the most cost-efficient way possible. The simple simulator is designed to demonstrate the viability of injecting travel times from deep learning rather than from a specific distribution. It takes in the predicted travel time between each stop and simulates the bus traveling from its origin at scheduled times. Passengers picked up and alighting at each stop are randomly generated based on a Poisson distribution, until the bus reaches its destination. The visual concept of the simplified simulation model is shown in Fig. 2 and described below.

(a) Bus Deployment - buses will be deployed at scheduled times
(b) Bus Stations - from origin through n stops to destination, passengers board or alight the bus
(c) Bus Travel Time - predicted number of minutes the bus takes to arrive at the next stop
(d) Passenger arrivals based on a distribution
(e) Records Bus Travel Time - for report purposes
(f) Dispose of buses and passengers from the simulation
(g) Passengers who alight or decide to change mode of transport leave the bus stop



Fig. 2. Public transport simulator

(h) Records commuters’ wait time or any other required metrics - for reporting and analysis

The model can be adjusted to record more information that may be required for effective analysis. Some considerations may include, but are not limited to, the decision of passengers to change mode of transport after waiting for more than x minutes, the number of passengers unable to board a bus that arrived because its maximum capacity was reached, etc.
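A minimal sketch of such a simulation loop (the Poisson rate, seed, and boarding model are illustrative assumptions, not parameters from the paper):

```python
import math
import random

def simulate_trip(travel_times, lam=2.0, seed=42):
    """Toy version of the simulator loop: the bus visits each stop,
    boards a Poisson-distributed number of passengers, and advances
    the clock by the predicted travel time injected from the DNNR."""
    rng = random.Random(seed)

    def poisson(lam):
        # Knuth's method; sufficient for a sketch
        L, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= L:
                return k
            k += 1

    clock, onboard, log = 0.0, 0, []
    for stop, tt in enumerate(travel_times):
        boarded = poisson(lam)  # passenger arrivals at this stop
        onboard += boarded
        clock += tt             # predicted minutes to the next stop
        log.append((stop, boarded, round(clock, 1)))  # metrics for reporting
    return onboard, clock, log

onboard, clock, log = simulate_trip([7.0, 5.5, 6.0])
# clock == 18.5 (sum of the injected travel times), one log entry per stop
```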


4 Experiments and Results

Since we developed the simulator only to test the viability of injecting deep learning predictions into simulation, it behaves as a regular simulator would, provided the predicted travel times follow the distribution of actual travel times at least as closely as the expected travel times do. Hence, it is sufficient to verify and validate that the simulator is correct, then test the accuracy of the predicted travel times against the expected travel times and the actual travel times. The following subsections discuss the verification and validation of the simulator as well as the experiments done to validate the accuracy of the DNNR model.

4.1 Simulator Verification and Validation

In verifying that the simulator model we have designed is correct, we provide answers and justifications to the following questions:

1. Are events within the model represented correctly? Based on the planned schedules we retrieved from the LTA Data Mall as well as the prediction accuracy we present in Sect. 4.2, the events are correctly represented and as close to the real-life situation as possible.
2. Are mathematical formulae and relationships valid? Based on real-life situations, the flow of the buses and the decision-making process of passengers as well as their travel behavior are valid.



3. Are statistics and measures formulated and calculated correctly? Based on the data we have from LTA, we validated the actual travel time against various sources such as Google Maps to ensure the distributions we used are accurate.

Next, we validate that this simulator is indeed a meaningful and accurate representation of the real system. First, we make the following assumptions:

– No bus break-downs.
– The bus will stop at every bus stop regardless of whether there are passengers boarding or not.

Then, to prove that the model has high face validity, we justify the following based on real-life scenarios:

– Each bus service travels non-stop from origin to destination.
– Each bus will require more than 0 min to travel to the next stop.
– No passengers should be alighting at the origin.
– All passengers need to alight at the destination as the bus will terminate there.
– Except at the destination, there may be 0 or more passengers boarding the bus, including at the origin.

Hence, we have verified and validated that the model we have designed is reasonable on its face, as we ensured a high degree of realism is built into the model through reasonable assumptions regarding system structure and reliable data from the Land Transport Authority of Singapore.

4.2 Travel Time Prediction Accuracy

Based on the concatenated unique identifier, day and time, we first trained the DNNR model with 30,000 data points over 1000 iterations before making predictions on 1,000 unseen data points. We validate the accuracy of the DNNR model by comparing the predicted travel times against the actual and expected travel times. Figure 3 shows the accuracy performance of the DNNR model in predicting travel times with only 30,000 training data points. For visual clarity, only 30 random samples from the 1,000 predictions were plotted. As we can see, the predictions made by the DNNR model are much closer to the actual travel times than the expected travel times are. The DNNR model achieved 56% accuracy against the actual travel times while the expected travel times only achieved 31% accuracy. We then continued to train the model with another 30,000 unseen data points before making predictions on the same 1,000 test data points that were not used for training at all. Figure 4 shows the improvement in accuracy of the DNNR model against the actual travel times and the expected travel times. Again, we can see that the predicted travel times track the actual travel times much more closely than the expected travel times do. At this stage,



Fig. 3. Actual vs predicted vs expected travel times

Fig. 4. Actual vs predicted vs expected travel times

the accuracy of the expected travel times is only 38% while the accuracy of the predicted travel times improved to 66%. We can conclude that with further training, the DNNR model is able to make better predictions and close the gap overall. We have also shown that our proposed model can predict the exact travel times more accurately than the expected travel times estimated by LTA. Therefore, with this model, we can successfully introduce a better sense of realism when simulating into the future, thus producing more useful results for better planning.



5 Conclusion

In conclusion, we showed a novel integration of deep learning into simulation to improve the level of realism when simulating into the future. This paper has also demonstrated the viability of leveraging the latest artificial intelligence technology to simulate complex models more realistically.



With the focus on the public transport system in Singapore and through the experiments we presented in Sect. 4, we conclude that our proposed model can effectively predict travel times in order to produce realistic traffic simulations. The results we have provided further support that this work can greatly assist transport operators in planning their deployment schedules more effectively to meet commuters’ demands as well as to ensure a cost-effective solution.

5.1 Further Work

For future work, we may look further into optimizing the deep learning model by tuning its hyper-parameters or increasing its depth of training with deeper hidden layers and more data points. Furthermore, we can also investigate other factors that may affect the travel time of vehicles, such as the weather or any planned road maintenance along a bus route, so as to increase the level of realism in our simulations. The assumptions we made for passenger arrivals, bus capacity, etc. can be further improved if collaboration with transport operators is achieved. Since our proposed methods have demonstrated the viability of simulating future utilization of public transport systems through deep learning, further work on improving the overall realism and accuracy can be worthwhile.

References

1. Yuan, L.L.: A case study on urban transportation development and management in Singapore. In: Second International Expert Panel Meeting on Urban Infrastructure Development, Bangkok, Thailand, pp. 8–9 (1997)
2. Land Transport Authority (LTA), Singapore: Electronic road pricing (ERP), March 2017. https://www. on/electronic-road-pricing-erp.html
3. Land Transport Authority (LTA), Singapore: Data Mall, March 2017. content/mytransport/home/dataMall.html
4. Fouladgar, M., Parchami, M., Elmasri, R., Ghaderi, A.: Scalable deep traffic flow neural networks for urban traffic congestion prediction. CoRR abs/1703.01006 (2017)
5. Lv, Y., Duan, Y., Kang, W., Li, Z., Wang, F.Y.: Traffic flow prediction with big data: a deep learning approach. IEEE Trans. Intell. Transp. Syst. 16(2), 865–873 (2015)
6. Shalihin Bin Othman, M., Keoh, S.L., Tan, G.: Efficient journey planning and congestion prediction through deep learning. In: International Smart Cities Conference (ISC2) 2017, Wuxi, China, September 2017
7. Deist, T., Patti, A., Wang, Z., Krane, D., Sorenson, T., Craft, D.: Simulation assisted machine learning. ArXiv e-prints, February 2018
8. Bando, M., Hasebe, K., Nakayama, A., Shibata, A., Sugiyama, Y.: Dynamical model of traffic congestion and numerical simulation. Phys. Rev. E 51, 1035–1042 (1995)
9. Gupta, S., et al.: Real-time optimisation of network control strategies in DynaMIT2.0. In: TRB 95th Annual Meeting, Washington, USA, January 2016



10. Lu, Y., Seshadri, R., Pereira, F., Antoniou, C., O’Sullivan, A., Ben-Akiva, M.: DynaMIT2.0: architecture design and preliminary results on real-time data fusion for traffic prediction and crisis management. In: 2015 IEEE 18th International Conference on Intelligent Transportation Systems, Gran Canaria, Spain, September 2015
11. Lu, Y., et al.: SimMobility mid-term simulator: a state of the art integrated agent based demand and supply model. In: 94th Annual Meeting of the Transportation Research Board (TRB), Washington, D.C., USA (2015)
12. Shalihin Bin Othman, M., Tan, G.: Public transport utilization simulation with machine learning. In: Distributed Simulation and Real Time Applications (DS-RT) 2018, Madrid, Spain, October 2018 (to appear)
13. Duchi, J.C., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159 (2011)

An Ensemble Modeling for Thermal Error of CNC Machine Tools

Xuemei Jiang, PanPan Zhu, Ping Lou, Xiaomei Zhang, and Quan Liu

School of Information Engineering, Wuhan University of Technology, Wuhan 430070, Hubei, China
[email protected]

Abstract. Thermal error caused by the thermal deformation of computer numerical control (CNC) machine tools is one of the main factors affecting machining accuracy. With monitoring data of the temperature field, establishing a data-driven thermal error model is considered a convenient, effective and cost-efficient way to reduce the thermal error. As a matter of fact, it is very difficult to develop a thermal error model with perfect generalization adapting to different working conditions of machine tools. In this paper, a method of ensemble modeling (EM) based on a Convolution Neural Network (CNN) and a Back Propagation (BP) Neural Network for modeling thermal error is presented. This ensemble model takes full advantage of two different neural networks: the CNN’s self-extracted features address the collinearity problem in the temperature field, while the BP network maps heat sources to thermal error through a nonlinear function; the two are then combined into an EM. To demonstrate the effectiveness of the proposed model, an experiment platform was set up based on a heavy-duty CNC machine tool. The results show that the proposed model achieves better accuracy and stronger robustness in comparison with the BP network and the CNN network alone.

Keywords: Thermal error · Ensemble modeling · CNC machine · Convolution neural network · Back propagation network

1 Introduction

As the demand for high precision in machine tools becomes higher and higher, many scholars and engineers dedicate themselves to research on errors. Among machine tool errors, thermal error is one of the most significant sources contributing to the deformation or distortion of machine tools, making up 60–70% of total errors [1, 2]. It is caused by temperature variation in the machine structure due to internal and external heat sources [3]. Effective thermal error compensation depends on accurate prediction of the time-variant thermal error during machining. Many researchers have applied various modeling methodologies to the thermal error model in order to minimize the thermal error. [4, 5] applied the finite element method to get the numerical solution of the temperature field of the spindle. [6] proved that the spindle temperature field changes with the effect of internal heat sources and environment temperature. Besides analyzing the thermal characteristics of the spindle, the relationships between temperatures and thermal errors have been studied by building thermal error models, and some scholars took multiple variables into account, such as the spindle speed, the environment and so on [7]. The multivariable regression model is another kind of modeling method, which is frequently used for thermal error [8–10]. Different from the regression model, the spindle thermal error in multiple directions can be modeled with only one neural network, as it has multiple outputs [11–17]. However, a model put into widespread application for thermal error compensation must have high accuracy and robustness, and plenty of challenges arise when only using traditional prognostics approaches in an Industry 4.0 environment. In the Industry 4.0 environment, the emerging industrial big data has ‘5V’ characteristics (volume, velocity, variety, veracity, and value), which challenge traditional prognostics models [18]. The large amount of information from the terminal sensing nodes and the multi-source heterogeneity of big data have brought great difficulties to the feature selection process. A big-data-driven method based on deep learning was developed to deal with the industrial big data problem [19]. Compared with other mathematical models, deep learning has been widely used in various fields due to its advantages of distributed information storage, parallel processing, and self-learning ability. In recent years, different types of artificial neural networks have been developed for thermal error modeling, such as the BP network, RBF network, CNN network, and integrated recurrent neural network [20]. Reference [21] used a BP network model to compensate most of the thermal deformation, but it easily falls into a local optimal solution and its convergence is poor, so it is difficult to obtain the global optimal solution.

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 107–118, 2018.
Reference [22] introduced a genetic algorithm (GA) to optimize the BP network’s initial weights and thresholds to improve accuracy and robustness, but problems of premature convergence and slow convergence remain due to the GA. Reference [23] showed that, when individual artificial neural networks have a certain prediction accuracy and diversity, combining them can yield a model with higher accuracy and robustness than any individual network. Zhang [24] introduced a novel thermal error model, the grey neural network (GNN), composed of grey system theory and a neural network, which showed better accuracy and robustness than the traditional grey model or neural network. As mentioned above, the thermal error model is a typical nonlinear and multi-variable prediction system. In this paper, a novel model, EM, is proposed to predict the thermal error accurately, which combines the BP and CNN models. According to the structure of the combined prediction model, the performance of the new model is greatly improved in comparison with the BP model and the CNN in terms of accuracy and robustness. The rest of the paper is organized as follows: Sect. 2 describes the EM algorithm. Section 3 presents the experiments, including the experimental setup and analysis of results. Finally, in Sect. 4, the conclusion and summary of the paper are outlined.

2 Constructing Thermal Error Model of CNC Machine Tools

The thermal error model reflects the complex relationship between the machine tool’s temperature field and thermal error. It is difficult to develop a comprehensive thermal error model with good performance based on only one kind of modeling method. An ensemble model is proposed in this research in order to let different networks compensate each other and to incorporate their respective advantages. In Fig. 1, according to the types of the inputs, different models are combined to establish a new model, which is based on a BP neural network and a CNN. A connected layer then combines the outputs of the different individual neural networks into one result. This method allows adjustment of the network structure and reaps the beneficial advantages of all models.

Fig. 1. A schematic diagram of the EM based on the BP neural network and the CNN network. A single full-connection layer combines the different individual neural networks.


2.1 Neural Network

An artificial neural network (NN) is a method for mapping inputs to outputs, usually employed in pattern recognition or function-fitting problems. The concept of deep learning stems from the study of artificial neural networks: a multilayer perceptron with multiple hidden layers is a deep learning structure. Deep learning creates more abstract high-level representations, attribute categories or features by combining low-level features to discover distributed representations of data. Different archetypes of neural networks have their own characteristics. The advantages and disadvantages of the BP network and the CNN network are listed in Table 1.

Table 1. Advantages and disadvantages of BP network and CNN network

Neural network: BP
  Advantages: 1. Good nonlinear mapping ability. 2. Good robustness
  Disadvantages: 1. Gets into local extrema easily

Neural network: CNN
  Advantages: 1. Strong nonlinear mapping ability. 2. Powerful ability to extract features
  Disadvantages: 1. The training time is long. 2. Too many training samples lead to a larger network size. 3. Different tasks require separate training



2.2 The Ensemble Model Based on CNN and BP Network

BP Network

The topology of BP includes an input layer, hidden layers, and an output layer. Figure 2 shows its structure: Xi (i = 1, 2, ..., m) indicates the inputs, and YB indicates the outputs. Wij is the connection weight between the input layer and the hidden layer, and Wjk is the connection weight between the hidden layer and the output layer. bi and bj are the biases of the hidden layer and the output layer respectively.

Fig. 2. The BP neural network diagram

The state of the hidden layer is

$$Z = f\left(\sum x \cdot w_{ij} + b_i\right) \quad (1)$$
f(·) is the activation function; commonly used activation functions are sigmoid, tanh and purelin. The output value can be expressed as Eq. (2):

$$Y_B = \sum Z \cdot w_{jk} + b_j \quad (2)$$

The BP network is trained with a gradient descent technique. It tries to improve the performance of the neural network by reducing the total error, changing the weights along its gradient.

CNN Network

Figure 3 shows the topology of a simple CNN. The network consists of a set of layers, each of which contains one or more planes. It consists of a convolutional layer, a down-sampling layer, a fully connected layer, and a final output layer. As the input to the CNN model is at least two-dimensional, we arrange the input into a two-dimensional structure. In the first sub-CNN, the temperature of the machine tool enters the input layer. For the other sub-CNNs, the output feature map of the previous layer is the input of the next sub-CNN. The output feature map of the last sub-CNN is connected to the fully connected layer and the output layer.

Fig. 3. The CNN neural network diagram

In Fig. 3, the vector of one-dimensional raw data is rearranged into two-dimensional data as input. The general form of the convolution operation is expressed by Eq. (3):

$$x = f\left(x * w_{ij} + b\right) \quad (3)$$


where * stands for the operator of the two-dimensional discrete convolution, b is the bias vector, and w_ij and x denote the convolution kernel and the input feature map respectively. f(·) represents the activation function. After the processing of the convolutional layer, the general form of the down-sampling is expressed by Eq. (4):

$$x = f\left(\beta \cdot \mathrm{down}(x) + b\right) \quad (4)$$


where β is the multiplicative bias term, down(x) is the pooling function, b is the additive bias vector, and f(·) is the activation function. All the neurons in the fully connected layer are connected to all neurons in the feature maps of the upper layer, and its output is expressed by Eq. (5):

$$Y_C = f\left(wx + b\right) \quad (5)$$


where x is the input of the fully connected layer, Y_C represents the output of the fully connected layer, w and b denote the weight and the additive bias term, and f(·) represents the activation function. In training, using the gradient descent method, the mean-squared-error cost function is minimized over several iterations to complete network training.
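Equations (3)–(5) amount to a convolution (written here in the cross-correlation form used by most CNN libraries), a scaled mean-pooling, and an affine map. A plain-numpy sketch, assuming tanh as the activation f(·) and toy shapes:

```python
import numpy as np

def conv2d_valid(x, k, b=0.0):
    # Eq. (3): 2-D discrete convolution (valid padding) plus bias,
    # followed by the activation
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k) + b
    return np.tanh(out)

def mean_pool2(x, beta=1.0, b=0.0):
    # Eq. (4): 2x2 mean pooling with multiplicative bias beta, additive bias b
    h, w = x.shape[0] // 2, x.shape[1] // 2
    pooled = x[:2 * h, :2 * w].reshape(h, 2, w, 2).mean(axis=(1, 3))
    return np.tanh(beta * pooled + b)

def fully_connected(x, w, b):
    # Eq. (5): Y_C = f(w x + b) applied to the flattened feature map
    return np.tanh(w @ x.ravel() + b)

x = np.arange(16, dtype=float).reshape(4, 4) / 16.0  # toy 2-D input
fmap = conv2d_valid(x, np.ones((3, 3)) / 9.0)        # -> 2x2 feature map
pooled = mean_pool2(fmap)                            # -> 1x1
y_c = fully_connected(pooled, np.array([[0.5]]), np.zeros(1))
```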



Structure Design of the Ensemble Model

According to the types of the input, a novel method which combines the results of the CNN and the BP network to output the thermal error compensation value — a newly combined prediction model, EM — is proposed in this paper so as to achieve higher accuracy than the traditional BP network and CNN model. The CNN model extracts the characteristics of the temperature field, taken as INPUT_M; the temperature field is a kind of characterization information that causes the thermal error. In order to compensate for the information missing in the transmission process, the information causing thermal error, such as environment temperature, power, rotating speed, and feed rate, is taken as INPUT_S and fed to the BP neural network. The Ensemble Model thus makes full use of various types of information to improve robustness and accuracy. The schematic diagram is shown in Fig. 4.

Fig. 4. Ensemble model diagram

Ensemble model can combine the advantages of both models to obtain higher prediction accuracy. It combines the results of BP and CNN for the prediction of thermal error. The output of the ensemble model can be defined as Eq. (6):

$$\tilde{Y} = \sum_{i=1}^{N_B} W_{Bi}\,\tilde{Y}_{Bi} + \sum_{i=1}^{N_C} W_{Ci}\,\tilde{Y}_{Ci} + b \quad (6)$$
where Ỹ is the output of the ensemble model for the actual thermal error Y; Ỹ_Bi and Ỹ_Ci are the outputs of the BP and CNN networks respectively. In this paper, we use the fully connected layer structure of a convolutional neural network to construct the final connection layer. It tries to improve the performance of the neural network by reducing the total error by changing the weights along its



gradient. The ensemble model minimizes the loss function, the mean squared error (MSE) function in Eq. (7):

$$MSE = \frac{1}{N}\sum_{n=1}^{N}\left(Y - \tilde{Y}\right)^2 \quad (7)$$
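The combination in Eqs. (6)–(7) can be sketched as a weighted sum of the sub-model outputs plus a bias, scored by the MSE (the weights below are illustrative, not fitted values):

```python
import numpy as np

def ensemble_output(y_bp, y_cnn, w_bp, w_cnn, b=0.0):
    # Eq. (6): weighted sum of the BP and CNN sub-model outputs plus a bias
    return float(np.dot(w_bp, y_bp) + np.dot(w_cnn, y_cnn) + b)

def mse(y, y_hat):
    # Eq. (7): mean squared error between actual and ensemble outputs
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean((y - y_hat) ** 2))

# Toy numbers: one BP output, one CNN output (thermal errors in mm)
y_tilde = ensemble_output(y_bp=[0.010], y_cnn=[0.012], w_bp=[0.4], w_cnn=[0.6])
# 0.4 * 0.010 + 0.6 * 0.012 = 0.0112
loss = mse([0.011], [y_tilde])
```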

Fig. 5. Flow chart of the modeling method

3 Experiment Validation

3.1 Experiment Set-Up

The proposed methods were verified on a ZK5540A CNC machine tool through temperature and thermal error measurement experiments. The ZK5540 CNC gantry mobile multi-head drilling machine is a heavy-duty CNC planer drilling machine tool. To collect data from the machine tool, different kinds of sensors are placed on it, including laser displacement sensors (DIS sensors for short) and Fiber Bragg Grating (FBG) temperature sensors. Temperature sensors are fixed at key positions of the machine tool’s temperature field. The machine tool, the workshop foundation and the experiment environment are shown in Fig. 6. The numbers of temperature measurement points are given in Table 2.

Fig. 6. Experimental setup for temperature measurement

Table 2. Number of temperature measurement points

Spindle: 52   Beam: 8   Column: 32   Guide: 32   Environment: 4

The steps of the experiment: measurement during downtime, measurement during spindle rotation at different constant speeds (500, 1000, 1500 and 2000 rpm, each for a continuous 5 h), and measurement during spindle rotation. The collection interval is 5 s, which means 3500 samples are collected in every experiment. Due to the slow change of thermal error, all sampled data from each experiment were down-sampled to 360 samples for model validation. Many studies prove that the Z axis deforms more than the X and Y axes due to the temperature differences in the radial direction [25]. Hence, the Z-axial thermal error is our main research object. The environment temperature, the machining temperature, the spindle speed and the power of the CNC were selected as inputs. Environment temperature, spindle speed, feed rate and power of the CNC, as the heat sources, are put into the BP model as INPUT_S. The temperature field of the machine tool is fed to the CNN model as INPUT_M.

3.2 Model Validation and Analysis

Two groups of tests were conducted with different idling speeds. Test 1 was conducted at an ambient temperature of 9 °C and an idling speed of 2000 r/min, whereas Test 2 was at 14 °C and an idling speed of 500 r/min. During the entire process, the learning rate is set to 0.001, and the maximum number of training iterations is 2000. The structural topology of the model network is shown in Fig. 3. The prediction results of the three models for TEST1 and TEST2 are shown in Figs. 7 and 8. The mean of the residuals is |e_i|_mean, the maximum of the residuals is |e_i|_max, the mean squared error is MSE, and the prediction ability is η, which are:

$$|e_i|_{mean} = \frac{1}{N}\sum_{i=1}^{N} |y_i - \tilde{y}_i| \quad (8)$$

$$|e_i|_{max} = \max\left(|y_i - \tilde{y}_i|\right) \quad (9)$$

$$MSE = \frac{1}{N}\sum_{i=1}^{N} \left(y_i - \tilde{y}_i\right)^2 \quad (10)$$

$$\eta = 1 - \frac{\frac{1}{N}\sum_{i=1}^{N}|y_i - \tilde{y}_i|}{\frac{1}{N}\sum_{i=1}^{N}|y_i|} = 1 - \frac{\sum_{i=1}^{N}|y_i - \tilde{y}_i|}{\sum_{i=1}^{N}|y_i|} \quad (11)$$

Equations (8)–(11) are calculated for the BP model, the CNN model and the EM on TEST1 and TEST2 to measure the prediction accuracy and prediction robustness of each model.
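The four measures can be computed directly; a small numpy sketch (the sample values are toy numbers, not the paper's data):

```python
import numpy as np

def metrics(y, y_hat):
    """Eqs. (8)-(11): mean and max absolute residual, MSE, and the
    prediction ability eta = 1 - sum|y - y_hat| / sum|y|."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    r = np.abs(y - y_hat)
    return r.mean(), r.max(), np.mean(r ** 2), 1.0 - r.sum() / np.abs(y).sum()

e_mean, e_max, mse, eta = metrics([0.02, 0.04], [0.021, 0.038])
# residuals are 0.001 and 0.002, so eta = 1 - 0.003/0.06 = 0.95
```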

Fig. 7. The results of different models in TEST1: (a) prediction of BP; (b) prediction of CNN; (c) prediction of EM; (d) comparison of the different models in TEST1

From the comparison of the experimental results of the three models in Figs. 7(d) and 8(d), it can be seen that the curves of the predicted values of the BP and CNN models basically match the experimental measurements, but they cannot meet the requirements for high-precision workpieces. The prediction accuracy of the EM in both states is higher than that of the other two models, and the dispersion of the predictions is not much different. It can be seen from Table 3 that the maximum residual of the EM model is about 9 μm while that of the others is about 12 μm. The mean residual is reduced by



Fig. 8. The results of different models in TEST2: (a) prediction of BP; (b) prediction of CNN; (c) prediction of EM; (d) comparison of the different models in TEST2

69% and 73%. The η is maintained at about 98%, which is higher than the others. It can be seen that the EM model has better prediction accuracy and robustness than a single model. It can be used to predict the thermal error effectively.

Table 3. The results of TEST1 and TEST2 for the three models

Model  Test   |e_i|_max  |e_i|_mean  MSE         η
BP     TEST1  0.012      0.0040      4.8 × 10⁻⁶  96.82%
BP     TEST2  0.012      0.0038      6.4 × 10⁻⁶  94.73%
CNN    TEST1  0.012      0.0039      6.3 × 10⁻⁶  96.86%
CNN    TEST2  0.013      0.0054      5.5 × 10⁻⁶  94.72%
EM     TEST1  0.010      0.0004      4.1 × 10⁻⁶  98.76%
EM     TEST2  0.008      0.0021      4.2 × 10⁻⁶  97.23%

4 Conclusions

In order to accurately predict the CNC thermal error, an ensemble model for thermal error based on the CNN model and the BP neural network is proposed. We have done numerous thermal error experiments on a ZK5540A CNC machine in an idle state under different rotation speeds and ambient temperatures to validate the EM. Compared with the commonly used single models, the mean residual of the “EM method” is reduced by 3 μm (71%), and the max residual is improved by 4 μm (33%). The prediction ability of the EM is at least 97.23%. It can be seen that the EM network model has better prediction accuracy and robustness than the traditional BP and CNN neural network models. It is effective to use the EM network prediction model in a thermal error compensation system to reduce the thermal error and improve the machining accuracy.



The thermal error prediction effect of the “EM method” proposed in this article is only verified when the machine tool is in an idle state. For the actual cutting state, thermal error prediction still needs further research.

Acknowledgement. The authors would like to acknowledge funding support from the National Natural Science Foundation Committee of China under Grant No. 51475347 and the Major Project of Technological Innovation Special Fund of Hubei Province Grant No. 2016AAA016, as well as the contributions from all collaborators within the projects mentioned. We would also like to thank Wuhan University of Technology, People’s Republic of China for supporting this work.

References

1. Ramesh, R., Mannan, M.A., Poo, A.N.: Error compensation in machine tools—a review: Part II: thermal errors. Int. J. Mach. Tools Manuf. 40(9), 1257–1284 (2000)
2. Denkena, B., Schmidt, C., Krüger, M.: Experimental investigation and modeling of thermal and mechanical influences on shape deviations in machining structural parts. Int. J. Mach. Tools Manuf. 50(11), 1015–1021 (2010)
3. Postlethwaite, S.R., Allen, J.P., Ford, D.G.: Machine tool thermal error reduction—an appraisal. Proc. Inst. Mech. Eng. 213(213), 1–9 (1999)
4. Section I.: An adaptive finite element method for stationary incompressible thermal flow based on projection error estimation. Math. Prob. Eng. 2013(2), 1–14 (2013)
5. Kim, J., Zverv, I., Lee, K.: Thermal model of high-speed spindle units. Intell. Inf. Manag. 02(05), 306–315 (2010)
6. Li, Y., et al.: A review on spindle thermal error compensation in machine tools. Int. J. Mach. Tools Manuf. 95, 20–38 (2015)
7. Li, Y., Zhao, W., Wu, W., et al.: Thermal error modeling of the spindle based on multiple variables for the precision machine tool. Int. J. Adv. Manuf. Technol. 72(9–12), 1415–1427 (2014)
8. Pahk, H.J., Lee, S.W.: Thermal error measurement and real time compensation system for the CNC machine tools incorporating the spindle thermal error and the feed axis thermal error. In: Hayhurst, D.R. (ed.) Proceedings of the 33rd International Conference, pp. 249–254. Springer, Heidelberg (2000)
9. Ruijun, L., Wenhua, Y., Zhang, H.H., Qifan, Y.: The thermal error optimization models for CNC machine tools. Int. J. Adv. Manuf. Technol. 63(9–12), 1167–1176 (2012)
10. Baltagi, B.H.: Multiple regression analysis. In: Baltagi, B.H. (ed.) Econometrics. Springer, Heidelberg (2002)
11. Ren, X., Sun, Y., Zhou, T., Xu, W., Yue, Y.: Real-time thermal error compensation on machine tools using improved BP neural network. In: 2011 International Conference on Electric Information and Control Engineering, Wuhan, pp. 630–632 (2011)
12. Wang, J., Qin, B., Liu, Y., Yang, Y.: Thermal error prediction of numerical control machine based on improved particle swarm optimized back propagation neural network. In: 11th International Conference on Natural Computation (ICNC), Zhangjiajie, pp. 820–824 (2015)
13. Wang, P., Jin, Z.F., Zheng, Y.L.: Artificial neural network-based thermal error modelling in ball screw. In: IEEE Symposium on Electrical & Electronics Engineering (EEESYM), Kuala Lumpur, pp. 67–70 (2012)
14. Ren, B., Ren, X., Huang, S., Li, G.: The research on thermal error modeling and compensation on machine tools. In: International Conference on Control Engineering and Communication Technology, Liaoning, pp. 444–447 (2012)


X. Jiang et al.

15. Pani, A.K., Mohanta, H.K.: A hybrid soft sensing approach of a cement mill using principal component analysis and artificial neural networks. In: 3rd IEEE International Advance Computing Conference (IACC), Ghaziabad, pp. 713–718 (2013)
16. Zhao, C., Wang, Y.: Optimization of measuring points based on the grey system theory for spindle of CNC machine tool. In: International Conference on Mechatronics and Automation, Changchun, pp. 686–690 (2009)
17. Yao, X.H., Fu, J.Z., Chen, Z.C.: Bayesian networks modeling for thermal error of numerical control machine tools. J. Zhejiang Univ.-Sci. (Appl. Phys. Eng.) 9(11), 1524–1530 (2008)
18. L'Heureux, A., Grolinger, K., Elyamany, H.F., et al.: Machine learning with big data: challenges and approaches. IEEE Access 5(99), 7776–7797 (2017)
19. Yan, J., Meng, Y., Lu, L., Guo, C.: Big-data-driven based intelligent prognostics scheme in industry 4.0 environment. In: Prognostics and System Health Management Conference (PHM-Harbin), Harbin, pp. 1–5 (2017)
20. Hansen, L.K., Salamon, P.: Neural network ensembles. IEEE Trans. Pattern Anal. Mach. Intell. 12(10), 993–1001 (1990)
21. Huang, Y., Zhang, J., Li, X., et al.: Thermal error modeling by integrating GA and BP algorithms for the high-speed spindle. Int. J. Adv. Manuf. Technol. 71(9–12), 1669–1675 (2014)
22. Su, Y.-F., Yuan, W.X., Liu, D.P., et al.: A thermal errors compensation model for high-speed motorized spindle based on BP neural network. Modul. Mach. Tool Autom. Manuf. Tech. (2013)
23. Yang, H., Ni, J.: Dynamic neural network modeling for nonlinear, nonstationary machine tool thermally induced error. Int. J. Mach. Tools Manuf. 45(4), 455–465 (2005)
24. Zhang, Y., Yang, J., Jiang, H.: Machine tool thermal error modeling and prediction by grey neural network. Int. J. Adv. Manuf. Technol. 59(9–12), 1065–1072 (2012)
25. Han, J., Wang, H., Cheng, N.: A new thermal error modeling method for CNC machine tools. Int. J. Adv. Manuf. Technol. 62(1–4), 205–212 (2012)

Gait Classification and Identity Authentication Using CNN

Wei Yuan1 and Linxuan Zhang2(B)

1 University of Science and Technology Beijing, Beijing, China ([email protected])
2 Tsinghua University, Beijing, China ([email protected])

Abstract. Mobile security is one of the most crucial concerns in contemporary society. To further strengthen the security of mobile phones, this paper proposes a sequential method comprising periodogram based gait separation, convolutional neural network (CNN) based gait classification, and an authentication algorithm, together with its implementation. The original data are obtained from the mobile phone's built-in accelerometer. The periodogram based gait separation algorithm calculates the walking periodicity of the mobile phone user and separates individual gaits from the time series. Using the CNN based classification, whose overall accuracy reaches over 90% in the test, the separated gaits are subsequently categorized into 6 gait patterns. Furthermore, the CNN based identity authentication converts the certification issue into a binary classification issue: whether the mobile phone holder is the mobile phone user or not. The CNN based authentication method achieves an accuracy of over 87% when combined with the walking periodicity data of mobile phone users. Despite the high overall accuracy of the CNN based classification and identity authentication, the method still has potential deficiencies that require further research before public application and popularization. Keywords: Gait classification · Wearable devices · Gait labeling · Convolutional neural networks · Time series



1 Introduction

Human health and mobile security are two vital concerns of modern life. The inertial sensors widely installed in mobile phones are good sources for collecting data generated by our daily gait motion. However, it is challenging to accurately extract the motion, classify the gait and identify the device holder. Gait is a cyclic movement that reveals many individual details. Walking style, or gait, is known to differ between individuals and is hard for others to imitate. A good gait classification based on individual disparity is one of the essential bases for identifying a person, and it in turn relies on an accurate method of gait separation.

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 119–128, 2018.


W. Yuan and L. Zhang

A variety of gait separation and classification methods appear in the literature. Traditional gait separation uses a fixed threshold to segment a time series of acceleration data. Ailisto [1] uses a fixed threshold to separate each gait. The method applies only under the strict condition that the device holder always walks in a stable pattern. A fixed threshold can be imprecise because individual walking frequencies vary, so many gaits may be overlooked, and random noise that reaches the threshold may be counted as steps. Matching gait patterns with the separated time series may alleviate random accelerometer noise. Zdragkas [2] uses accelerometer data from the foot to analyze gait and identify gait events. To reduce the error caused by thresholding, he used a variable threshold, but it is still not precise enough, and data generated from the foot require additional assisting devices, which complicates daily use. Using mobile phones to generate daily data is both straightforward and effective. Lin [3] uses a decision tree based angular velocity algorithm to classify only 3 gait patterns on the Android platform. The method uses a fixed gait property in the decision tree, which can differ considerably between individuals, and 3 kinds of gait are not enough to satisfy daily needs. This paper takes six kinds of gait patterns into account. Zhao [4] adopts wearable device based gait recognition and a convolutional neural network (CNN) with angle-embedded gait images, which differs from the algorithm adopted in the present paper. There are also many experiments on DTW (Dynamic Time Warping). Świtoński [5] builds a model based on kinematic motion capture data, which imposes strict restrictions on the specific motion. In comparison, the algorithm proposed in this paper uses a better data source at a higher frequency.
This paper also uses the periodogram and a CNN to extract motion from a time series and identify individuals more efficiently. Movement data are used to certify the mobile phone holder's identity. Such data can be collected from mobile phones, since they record every move when placed near the thigh, that is, in a trouser pocket, as shown in Fig. 1.

Fig. 1. An example of mobile phone in the users’ pocket [3]

More specifically, the real-time acceleration of mobile phones can be measured by the accelerometer, a commonly installed sensor in mobile phones. Data



are displayed in three axes, as shown in Fig. 2. Such data collection can assist in precisely recording individual motion, classifying individual movement and helping to demonstrate daily physiological conditions.

Fig. 2. The demonstration of a smart phone accelerometer [7]

In this paper, all data are derived from GitHub [6], an open-source Internet community. The dataset labels gait data generated from the thigh into 6 types: walk-mode, jogging, sitting, standing, upstairs and downstairs. The data are from 34 participants, collected at a rate of 100 Hz from mobile phone built-in accelerometer modules. This paper is based on the hypothesis that individual gait can be compartmentalized and identified via mobile phone accelerometers when the phone is placed in a trouser pocket. It is also assumed that the individual gait cycle is generally stable in the long run, even though walking velocity may fluctuate over short periods.



In this section, the process chart shown in Fig. 3 demonstrates how the data collected from mobile phones are processed sequentially.

Fig. 3. The model of the step classification and authentication

The original data contain three-axis acceleration data presented as a time series; for further use, they are processed into the magnitude a_M, defined as

$a_M(t) = \sqrt{a_x^2(t) + a_y^2(t) + a_z^2(t)}$   (1)
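Eq. (1) amounts to a pointwise vector magnitude; a minimal numpy sketch (the function name is ours):

```python
import numpy as np

def magnitude(ax, ay, az):
    # Combine the three accelerometer axes into a_M per Eq. (1)
    return np.sqrt(np.asarray(ax) ** 2 + np.asarray(ay) ** 2 + np.asarray(az) ** 2)

# a 3-4-12 triple of axis readings has magnitude 13
# magnitude(3.0, 4.0, 12.0) → 13.0
```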



As mentioned above, personal gait cycle and movement are unique to each person. The periodicity of individual walking ranges from about 0.5 Hz to 1.5 Hz. An improved sliding window is used to separate a time series into individual gaits and obtain a precise image of every step pattern. Using the separated time series, an image-based way to classify the patterns with a CNN is proposed. Since gait length varies between individuals (for instance, one person may take a step in 1 s while another needs 1.2 s), it is difficult to identify a specific gait type directly. When transformed into an image, however, the periodicity variance can be disregarded, and the focus shifts to the change of a_M itself and how every step is formed in an a_M time series. The classification results make it possible to count steps accurately and, furthermore, to use the separated gaits to transform authentication into a binary classification problem: whether the device holder is the mobile phone user or not. To identify the users of mobile phones, an image-based approach evaluated with a CNN classifier model is proposed.

2.1 Gait Starting Position Detection

In a short period of time, it is assumed that the device holder performs one kind of movement, during which the time series of acceleration data is collected. First, the time series is separated into individual gait patterns. The sliding window is widely used in gait cycle detection algorithms and helps to segment the gait approximately [8]. Zhao [4] uses a sliding window for gait detection to generate AE-GDIs. Farah [9] uses a fixed length to determine the window size for feature extraction. In this paper, a periodogram [10] based sliding window and a median filter [11] are used to determine the starting position. The window is actually the length of the observation time. The sliding window functions as a gait separator on the time series: it determines the end of each gait by repeatedly moving from the end of one gait cycle to the beginning of the next. The sliding window helps to alleviate deviations among possibly disparate starting positions in each gait cycle, and its length is determined by the user's gait cycle. The input of the algorithm is the processed accelerometer data a_M. The periodogram is used to roughly determine the periodicity of the time series, which in turn determines the size of the sliding window. The present algorithm uses

$Z_t = \sum_{k=0}^{[n/2]} (a_k \cos \omega_k t + b_k \sin \omega_k t)$   (2)

$\omega_k = 2\pi k / n, \quad k = 0, 1, \dots, [n/2]$   (3)


to transform a time series of n observations into its Fourier representation, using the Fourier coefficients to fit the following regression model:

$Z_t = \sum_{k=0}^{[n/2]} (a_k \cos \omega_k t + b_k \sin \omega_k t) + e_t$   (4)
where the ω_k are the same Fourier frequencies. Despite the potentially perfect fit, the notion from the regression analysis should be applied through Parseval's relation:

$\sum_{t=1}^{n} Z_t^2 = \begin{cases} n a_0^2 + \frac{n}{2} \sum_{k=1}^{[(n-1)/2]} \left(a_k^2 + b_k^2\right), & \text{if } n \text{ is odd}, \\ n a_0^2 + \frac{n}{2} \sum_{k=1}^{(n-2)/2} \left(a_k^2 + b_k^2\right) + n a_{n/2}^2, & \text{if } n \text{ is even}. \end{cases}$   (5)

The quantity I(ω_k) is defined by

$I(\omega_k) = \begin{cases} n a_0^2, & k = 0, \\ \frac{n}{2} \left(a_k^2 + b_k^2\right), & k = 1, \dots, [(n-1)/2], \\ n a_{n/2}^2, & k = n/2 \text{ when } n \text{ is even}. \end{cases}$   (6)


Using I(ωk ) can help to find a proper frequency to observe the individuals’ gait cycle. And based on investigation [12], the frequency of the human walk is around 1 HZ, the person No. 026’s periodogram is presented in Fig. 4. Based on the periodogram, the proper frequency of the series of data could be detected. Periodicity T could be calculated, which is also the size of the sliding window.

Fig. 4. Finding the gait cycle for a person using the periodogram. The figure shows the gait frequency of participant No. 026. The dominant gait frequency is around 0.0083 (cycles per sample at 100 Hz), which indicates that the periodicity for this participant is 1.20 s.

Based on figures drawn from the real data, it is assumed that each gait cycle ends with a sequence of decreasing values of a_M and starts from the end of that decrease, followed by a peak. To find the separation point, the minimum a_M within a certain range of data around the end of the sliding window is taken as the separation place. However, simply using a threshold to separate each gait cycle can be inaccurate, as random noise may cause separation at a wrong place. The median filter, widely used to deal with random noise, can cope with such problems.



The median filter replaces each sample of the input time series with the median of its neighboring entries; the neighboring range is defined as a window. This efficiently reduces random noise. A larger window yields a smoother result but also a greater loss of information. To balance these strengths and weaknesses, several experiments were conducted; consequently, a window length of 5 data points is used, with the corresponding results illustrated in Fig. 5.
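A sketch of the 5-point median filtering step (scipy.signal.medfilt is the usual library equivalent; the helper below is ours):

```python
import numpy as np

def median_filter(x, window=5):
    """Replace each sample with the median of its `window` neighbours
    (edges handled by clipping), suppressing random spikes."""
    x = np.asarray(x, dtype=float)
    half = window // 2
    out = np.empty_like(x)
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out[i] = np.median(x[lo:hi])
    return out

# a single spike in an otherwise flat series is removed
noisy = np.array([1.0, 1.0, 100.0, 1.0, 1.0, 1.0])
# median_filter(noisy) → all ones
```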

Fig. 5. The blue line represents the processed data, and the red line represents the original. The median filter efficiently reduces the effects of random noise. (Color figure online)

In addition, the time series is separated into individual parts for each period cycle, as shown in Fig. 6.

2.2 CNN Classifier

After separating the time series data, a CNN is used for classification. A CNN is very similar to an ordinary neural network and originates from the study of biological neurons. It consists of neurons that have learnable weights and biases. Using a CNN makes an image-based classifier more accurate. In this paper, the CNN is used to classify 6 types of gait from an image-based view. Such classification provides a straightforward way of cataloguing gait, since it transforms the original data into a time-dependent curve. Gait time series with a length around T can be derived and calculated from the analysis processes stated previously. In addition, to achieve the proper image size and high image quality, each gait pattern is normalized and then cut to a size of 290 * 230 pixels (with the gait cycle included in the middle of the image). To increase the feasibility of the CNN classifier, the white-background, black-line image is converted into a black-background, white-line image, considering that white is 255 and black is 0 in gray-scale reading, as illustrated in Fig. 7. According to widespread related investigations, a CNN is generally built up from a series of neural layers, and every neural layer is made up of many



Fig. 6. The black line represents the time series and the blue dotted line represents the separation place for each gait. (Color figure online)

Fig. 7. This figure represents a gait cycle in walk-mode.
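The white/black inversion described in the text (white = 255, black = 0 in gray scale) amounts to a single array operation; a toy sketch on a tiny stand-in image:

```python
import numpy as np

# a tiny stand-in for a 290 * 230 gray-scale gait image:
# white background (255) with a black line (0)
img = np.array([[255, 255, 0],
                [255, 0, 255]], dtype=np.uint8)

inverted = 255 - img  # black background, white line
```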

neural units. Due to this specific structure, the neural units are crucial for identifying the previously processed pictures and inspecting their exact formation. Every neural layer has an input and an output. When the input of the CNN model is an image, the image can be analyzed as a matrix. For the relevant example, Fig. 5 is transformed into a 290 * 230 matrix whose data are read in gray scale; under this scenario, the image is read at a depth of one. The CNN functions as a feature extractor [13], learning the high-level hierarchy of input signals through the convolution operation, the most essential part of a CNN. Convolution ensures that the image is processed by the neural network in patches rather than single pixels, which enhances the understanding of the connections and continuity within the image. A convolutional layer functions as a patch filter: it goes through the image, obtains detailed information for each patch, and, after processing the collected information, the network may extract features such as edges.



Considering a potential issue in the previous steps, the convolution layer might lose some information during processing. To tackle this problem, a pooling layer is applied after the convolution stage. It is common to insert a pooling layer between successive convolution layers to help filter the useful information between them. Pooling also reduces the number of parameters and the computation in the network, hence helping to control overfitting, while retaining more of the information produced by the convolutional layer. The MAX pooling [14] operation, one of the most widely used pooling methods, reshapes the obtained matrix to further enhance categorization accuracy. A fully-connected layer, another critical layer in image classification, helps to construct the classifier. Functionally, fully-connected layers connect every neural unit in the input layer to the corresponding neural units in the output layer; they convert the two-dimensional features into a one-dimensional vector, a highly condensed representation of the features used by the classifier.

2.3 Building CNN

Keras [15], a high-level neural networks API, is used to build the CNN model. The sequential neural layers model is presented as follows in Fig. 8.

Fig. 8. This figure represents the basic CNN framework which makes up the classifier.

Primarily, one-hot arrays are used to identify the 6 gait types for the purpose of building the classifier. A one-hot array of length M categorizes serial data into M categories. For instance, in the present experiments there are six different gait patterns, and walk-mode is defined as the first kind of gait, so its one-hot array is [1, 0, 0, 0, 0, 0]. The proposed CNN classifier is formed of six sequentially connected neural layers consisting of two convolutional layers (Conv1, Conv2), two pooling layers and two fully connected layers (1024 and 6 units), as shown in Fig. 8. 'ReLU' in the convolutional layers and 'Softmax' in the final fully connected layer are the activation components. The max pooling operation spatially reshapes the feature maps after the images are analyzed by the convolutional layers. After pooling, the images are flattened and converted into gait-pattern features, which are classified in the fully connected layers. For training, the learning rate is set to 0.0001, and 17,000 gait figures covering all six gait types are used to train the CNN model.
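The described architecture (two convolution layers with ReLU, two max-pooling layers, fully connected layers of 1024 and 6 units, softmax output, learning rate 0.0001) can be sketched in Keras; the filter counts and kernel sizes below are assumptions, since the excerpt does not specify them:

```python
from tensorflow import keras
from tensorflow.keras import layers

# 290 * 230 gray-scale gait images, 6 gait classes (one-hot labels)
model = keras.Sequential([
    keras.Input(shape=(230, 290, 1)),
    layers.Conv2D(32, (5, 5), activation="relu"),  # Conv1 (filter count assumed)
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (5, 5), activation="relu"),  # Conv2 (filter count assumed)
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(1024, activation="relu"),
    layers.Dense(6, activation="softmax"),         # one unit per gait pattern
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
```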



During the experiments, the dataset is processed on a PC with an i7-7820HQ 2.90 GHz CPU and 16 GB of RAM. Fed with the processed data, the CNN model reaches an accuracy of 90% on 1,020 gait figures in the test process.

2.4 Identity Authentication

For a selected individual, a binary classification method based on the previously stored periodicity and the CNN can be used to authenticate personal identity, provided the gait patterns are classified precisely. Individual identity authentication is based on the walking pattern, the most common personal motion pattern. Since different people have disparate walking periods, the variance in walking periodicity can be an essential identifier. Adopting the CNN classifier stated above, walk-mode gaits can be extracted from the time series data. For a particular person, whether a walk-mode gait was generated by that person or not is treated as a binary decision. In this way, individual models can be generated to identify different mobile phone users. The CNN model for this binary classification is presented in Fig. 9.

Fig. 9. The figure represents the model of the binary classifier.

As the authentication uses the binary method, there are two CNN classifier outputs. Among the 34 participants in the dataset, 5 participants are randomly chosen, and a CNN classifier is built for each of them. The walk-mode figures of the participant are mixed with figures from other mobile phone users for the CNN classifier to distinguish. Regarding accuracy, using the CNN alone achieves 87% accuracy in identifying individuals, averaged over the 5 individual CNN classification tests, while integrating the CNN classifier with the individual walking periodicity calculated above can achieve higher overall accuracy.



In this paper, an effective method to classify gaits and authenticate identity is presented. To obtain more accurately separated gaits, the periodogram method is applied. Based on the separated patterns, a novel method for image-based classification and authentication using a CNN is proposed. The results generated by the CNN model show that the algorithm is both efficient and robust.



However, the data are generated at 100 Hz, which may consume substantial energy and cause extra heating of mobile phones. For future improvement, a less energy-consuming way to obtain accelerometer data and a lower detection frequency may be essential. The CNN models are currently built for each individual and are used here to implement the algorithm. For practicability, it is assumed that the data would be uploaded to a server in real time, where the classification and identification proceed. The current critical considerations are applying the algorithm to precise step counting and guaranteeing personal privacy and safety, not only on mobile phones but also in any other individual security application in the future.

References

1. Ailisto, H.J., Makela, S.M.: Identifying people from gait pattern with accelerometers. In: Biometric Technology for Human Identification II, pp. 7–14 (2005)
2. Zdragkas, G., Avaritsiotis, J.N.: Gait analysis and automatic gait event identification using accelerometers. In: IEEE International Conference on Bioinformatics and Bioengineering (BIBE 2008), 8–10 October 2008, Athens, Greece, pp. 1–6. DBLP (2008)
3. Lin, J., Chan, L., Yan, H.: A decision tree based pedometer and its implementation on the Android platform. In: International Conference on Computer Science and Information Technology, pp. 73–83 (2015)
4. Zhao, Y., Zhou, S.: Wearable device-based gait recognition using angle embedded gait dynamic images and a convolutional neural network. Sensors 17(3), 478 (2017)
5. Świtoński, A., Michalczuk, A., Josiński, H., et al.: Dynamic time warping in gait classification of motion capture data. In: World Academy of Science, Engineering and Technology (2012)
6. GitHub.
7. The three axis demonstration image. ds/2013/07/iphone.jpeg
8. Subramanian, R., Sarkar, S., Labrador, M., et al.: Orientation invariant gait matching algorithm based on the Kabsch alignment. In: IEEE International Conference on Identity, Security and Behavior Analysis. IEEE (2015)
9. Farah, J.D., Baddour, N., Lemaire, E.D.: Gait phase detection from thigh kinematics using machine learning techniques. In: IEEE International Symposium on Medical Measurements and Applications, pp. 263–268. IEEE (2017)
10. Chuang, A.: Time series analysis: univariate and multivariate methods. Technometrics 33(1), 108–109 (2006)
11. Mehrgardt, S.: Median filter. US Patent US5138567 (1992)
12. Kurz, M.J., Stergiou, N.: An artificial neural network that utilizes hip joint actuations to control bifurcations and chaos in a passive dynamic bipedal walking model. Biol. Cybern. 93(3), 213–221 (2005)
13. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: International Conference on Neural Information Processing Systems, pp. 1097–1105. Curran Associates Inc. (2012)
14. Murray, N., Perronnin, F.: Generalized max pooling. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2473–2480. IEEE Computer Society (2014)
15. Keras. Accessed 7 June 2018

Deep Dissimilarity Measure for Trajectory Analysis

Reza Arfa, Rubiyah Yusof(&), and Parvaneh Shabanzadeh

Centre for Artificial Intelligence and Robotics, Malaysia-Japan International Institute of Technology (MJIIT), Universiti Teknologi Malaysia, 54100 Kuala Lumpur, Malaysia
[email protected], [email protected]

Abstract. Quantifying dissimilarities between two trajectories is a challenging yet fundamental task in many trajectory analysis systems, and existing measures are computationally expensive to calculate. We propose a dissimilarity measure estimate for trajectory data using deep learning methodology. One advantage of the proposed method is that it can be executed on a GPU, which can significantly reduce the execution time when processing large amounts of data. The proposed network is trained using synthetic data, and a simulator to generate synthetic trajectories is proposed. We used a publicly available dataset to evaluate the proposed method on the task of trajectory clustering. Our experiments show that the performance of the proposed method is comparable with other well-known dissimilarity measures while being substantially faster to compute.

Keywords: Trajectory analysis · Dissimilarity measure · Deep learning · LSTM

1 Introduction

With the advancement of surveillance devices and object tracking algorithms, object trajectories can be obtained easily and accurately. A trajectory is typically a sequence of locations of a moving object that describes the underlying route of the object over time. Despite being simple, trajectory data is a powerful motion descriptor that can be used to infer many of an object's activities. Quantifying how (dis)similar two trajectories are is one of the fundamental research problems and a foundation of many trajectory analysis applications such as animal movement analysis [1], trajectory clustering [2], scene modelling [3] and trajectory retrieval [4]. Defining a function that measures the similarity between two trajectories is challenging mainly because trajectories are sequences of varying sizes. A great number of similarity measures have been proposed for trajectory data, such as Dynamic Time Warping (DTW) [5], longest common subsequences (LCSS) [6], and edit distance on real sequences (EDR) [7]. Most of these methods first find a best alignment between the two trajectories; to achieve this, dynamic programming is used to match each sample point of one trajectory to one or more sample points of the other. Even though these methods have been shown to perform well in different trajectory analysis tasks, they are computationally expensive to calculate.

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 129–139, 2018.


R. Arfa et al.

These methods are usually of O(L²) complexity, with L being the maximum number of observations in the trajectories to be compared. Therefore, it has been argued that similarity-based trajectory analysis systems do not scale to medium and large datasets [8]. In this paper, we propose a novel approach for quantifying the dissimilarity between two trajectories. Our method is based on deep learning methodology, where the network estimates the DTW, LCSS, and MoH (modified Hausdorff) distances. One advantage of the proposed method is that it can run on a GPU, which makes the calculation much faster than traditional methods. Moreover, since the proposed network shares weights across the different outputs, it simultaneously estimates several similarity measures without adding significant computational time. To verify the effectiveness of our method, we ran experiments on a real-world, publicly available dataset. The rest of the paper is organized as follows. In the next section, related work is presented. In Sect. 3 we discuss the details of the proposed method. The experimental results and discussion are presented in Sect. 4. Finally, in Sect. 5 we provide the conclusion and future research directions.
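The O(L²) dynamic-programming structure of DTW referred to here can be sketched as follows; for brevity the cost is the absolute difference between 1-D samples, whereas trajectory points would use the Euclidean distance:

```python
def dtw(a, b):
    """Classic DTW distance via an O(len(a) * len(b)) DP table."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible alignments
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

# time-shifted but identically shaped sequences align with zero cost
# dtw([1, 2, 3], [1, 2, 2, 3]) → 0.0
```

The nested loops make the quadratic cost explicit: every pairwise distance query fills an L × L table, which is exactly why large pairwise distance matrices become expensive.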

2 Related Work

Quantifying the similarity between two trajectories is a fundamental task with dozens of application domains. As mentioned earlier, the main difficulty is the potential length difference between trajectories. Many studies address this problem by introducing length-independent similarity measures. Dynamic Time Warping (DTW) [5] is among the early methods that address the time shift between two trajectories by finding an optimal alignment between them; dynamic programming is used to find the alignment with the minimum distance between matched points. Since DTW tries to match all existing points, the method is sensitive to outliers and noisy observations. Longest common subsequences (LCSS) [6] addresses this problem by allowing samples to remain unmatched. Similar to DTW and LCSS, Piciarelli and Foresti [9] proposed a distance that accounts for temporal drift. Unlike LCSS and DTW, their distance measure is capable of comparing incomplete trajectories, which is particularly important for online comparison, where trajectories are still developing and not fully observed. Many studies propose distance measures based on the edit distance. Chen et al. [10] proposed Edit Distance with Real Penalty (ERP), which allows for amplitude scaling and global spatial shift. Chen et al. [7] proposed Edit Distance on Real Sequences (EDR); similar to LCSS, EDR uses a predefined threshold for sample matching, and compared to LCSS it is invariant to both scale and global spatial shift. The Hausdorff distance is a shape-based measure used to quantify the distance between unequal-length signals. Its original form, however, is defined for comparing two unordered sets, whereas trajectories are sequences of observations in which the order is of great importance. Atev et al. [11] proposed a modified Hausdorff distance that takes the order between observations into account.
Their proposed distance also accounts for the effect of outliers in the objective function. Alt [12] proposed a Directed Hausdorff Distance (DHD), which intuitively measures the degree to which a trajectory resembles some part

Deep Dissimilarity Measure for Trajectory Analysis


of another trajectory. Laxhammar and Falkman [13] extended DHD for online trajectory comparison and anomaly detection. All of the mentioned distance measures use dynamic programming techniques at some point. This affects the overall computational cost: all of these methods have at least O(L²) computational complexity. Being computationally expensive is a particularly challenging problem when one is dealing with medium to large datasets. In most trajectory analysis systems, the pairwise distance must be calculated many times. For instance, in trajectory clustering or path modelling systems, an N × N (dis-)similarity matrix needs to be calculated, where N is the number of trajectories in the dataset. Similarly, in anomaly detection or trajectory retrieval systems, an observed trajectory needs to be compared against all previous trajectories. One solution to this problem is to execute these operations in parallel on a GPU. This solution, however, cannot easily be applied to the similarity measures discussed so far. With regard to performance, each dissimilarity measure has its own advantages and outperforms the others in certain scenarios. There have been some studies comparing these distances on different datasets [14, 15]. The accuracy of each similarity measure, however, mostly depends on the dataset, and the findings are sometimes contradictory across studies. To improve on the effectiveness of each individual dissimilarity, Buza et al. [16] proposed a hybrid of similarity measures. Combining these methods, however, requires more calculation, which increases the execution time. In this study, we propose using deep learning to estimate pairwise distances. We use Long Short-Term Memory (LSTM) networks to account for potentially different trajectory lengths. Given two trajectories, the goal is to find a nonlinear estimate of the true dissimilarity measures. To train the network, we use simulation to generate synthetic trajectories.
Pairwise dissimilarities are calculated and given to the network as target outputs.

3 Methodology

Consider two trajectories $T_x = \{r_{x,0}, \ldots, r_{x,|x|}\}$ and $T_y = \{r_{y,0}, \ldots, r_{y,|y|}\}$, where $r_{x,i}$ and $r_{y,j}$ denote the $i$th and $j$th observations and $|x|$ and $|y|$ denote the lengths of $T_x$ and $T_y$, respectively. The problem of dissimilarity measurement is to define a function $d(T_x, T_y) \in \mathbb{R}_{\geq 0}$ that satisfies $d(T_x, T_x) = 0$ and $d(T_y, T_x) = d(T_x, T_y)$. Intuitively speaking, this function returns a small number when two trajectories are alike. As two trajectories become more dissimilar, $d(T_x, T_y)$ produces larger values. Our approach is based on estimating available distance measures using a deep learning representation. In other words, instead of defining a mathematical function $d(T_x, T_y)$, a neural network learns a nonlinear model, $\hat{d}(T_x, T_y)$, that estimates the true dissimilarity measures. Figure 1 presents an overview of the proposed framework for the training and testing phases. Two raw trajectories are first preprocessed to ensure that their lengths are less than a predefined size, $\ell_{max}$. If the length of a trajectory is longer than $\ell_{max}$, some observations are uniformly removed until the trajectory length is less than $\ell_{max}$. In the training phase, the dissimilarity between the two trajectories is calculated and


R. Arfa et al.

used as the output. Before trajectories are passed into the network, they are normalized by the zero-padding technique [17]. This technique extends a trajectory to a fixed length by simply concatenating 0 to the end of the trajectory.
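This preprocessing (uniform removal above the maximum length, then zero-padding) can be sketched as follows; the helper name and the (0, 0) padding value are our own illustration, not the authors' code:

```python
def preprocess(traj, l_max=200):
    """Downsample a trajectory to at most l_max points, then zero-pad.

    traj: list of (x, y) observations.  If it is longer than l_max,
    observations are removed at uniform index spacing; shorter
    trajectories are extended by appending (0.0, 0.0) pairs so that
    every input to the network has exactly l_max rows.
    """
    if len(traj) > l_max:
        step = (len(traj) - 1) / (l_max - 1)
        traj = [traj[round(i * step)] for i in range(l_max)]
    return traj + [(0.0, 0.0)] * (l_max - len(traj))
```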

3.1 Trajectory Simulation

To generate random trajectories, we divide the scene into $M \times Q$ cells. When an object enters a grid cell, $C_{ij}$, that cell affects the flow of the object by manipulating the object's directional speed. Let the flow control for the $k$th object in the cell $C_{ij}$ be $U_{ij}^k$. This vector can be broken down into a direction component, $v_{ij}^k$, and a velocity component, $t_{ij}^k$. For simplicity, we quantize the direction component into four directions: north (N), south (S), west (W), and east (E). For each object entering $C_{ij}$, the flow control is randomly generated as follows:

$$U_{ij}^k \sim \mathcal{N}(t_{ij}^k \mid \mu_t, \sigma_t; C_{ij}, k)\; P(v_{ij}^k \mid V; C_{ij}, k) \qquad (1)$$


where $\mu_t$ and $\sigma_t$ are the average and standard deviation of the velocity. The second term on the right-hand side generates a random direction, $v_{ij}^k$. In the case of discrete directions, this term reduces to a categorical distribution, and the variable $v_{ij}^k$ can take one of the quantized directions with specified direction probabilities. In our case with four directions, the event probabilities in cell $C_{ij}$ form a four-dimensional probability simplex defined by $p_{ij}^E \geq 0$, $p_{ij}^W \geq 0$, $p_{ij}^N \geq 0$, $p_{ij}^S \geq 0$, where $p_{ij}^E + p_{ij}^W + p_{ij}^N + p_{ij}^S = 1$. Now consider $r_{k,t} = [\,x'_k(t)\ \ y'_k(t)\,]^T$ to be the observed position of the object at time $t$. Let $X_{k,t} = [\,x_k(t)\ \ y_k(t)\ \ v_{x_k}(t)\ \ v_{y_k}(t)\,]^T$ be the state vector representing the position and the speed of the $k$th moving object at time $t$. The state at time $t$ depends on the previous state and the flow control of the current grid cell. The flow control, $U_{ij}^k$, is randomly generated using Eq. 1 once the object enters the grid cell. This vector is

Fig. 1. Overview of the proposed framework: (a) training phase; (b) testing phase



kept the same while the $k$th object resides in that cell. The motion of the object at time $t$, given that it is in $C_{ij}$, can be defined as:

$$X_{k,t+1} = A X_{k,t} + B U_{ij}^k \qquad (2)$$
$$r_{k,t} = C X_{k,t} + R_t$$


where $R_t \in \mathbb{R}^{2 \times 1}$ is used to simulate measurement noise in the observations. $A$ is the state transition matrix. Similarly, $B$ can be interpreted as a matrix that combines the effect of the grid cell's flow control with the state of the object's motion. $C$ is the output matrix, which extracts the observed position of an object from its state. These are given as follows:

$$A = a \begin{bmatrix} 1/a & 0 & \Delta t & 0 \\ 0 & 1/a & 0 & \Delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad B = (1-a) \begin{bmatrix} \Delta t & 0 \\ 0 & \Delta t \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$

where $a \in (0, 1)$ indicates the effect of the flow control on the object's movement. Figure 2 illustrates an example trajectory of an object arriving at grid cell $C_{ij}$ from the cell to its right. The initial state in Eq. 2, $X_{k,0}$, is randomly generated from a source grid cell, $C_{ij}^{source}$. Source grid cells are predefined cells where objects first appear. The initial state is generated as follows:

$$X_{k,0} \sim \mathcal{N}(\,\cdot \mid \mu_{X_0}, \sigma_{X_0}; C_{ij}^{source}, k)$$


Algorithm 1 summarizes the procedure used to generate random trajectories.
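Under our reading of the (extraction-damaged) matrices above, the state update of Eq. 2 reduces to a blend between the object's current velocity and the cell's flow control, weighted by a. A sketch of one step of this motion model, with our own variable names:

```python
def step(X, U, a=0.5, dt=1.0):
    """One step of the motion model X_{k,t+1} = A X_{k,t} + B U.

    X: state [x, y, vx, vy]; U: flow control [ux, uy] of the current
    grid cell; a in (0, 1) weights the object's own velocity against
    the cell's flow control (measurement noise R_t omitted here).
    """
    x, y, vx, vy = X
    ux, uy = U
    return [x + a * dt * vx + (1 - a) * dt * ux,
            y + a * dt * vy + (1 - a) * dt * uy,
            a * vx + (1 - a) * ux,
            a * vy + (1 - a) * uy]
```

With a = 0.5, the new velocity is the average of the old velocity and the cell's flow control, so objects are gradually steered by each cell they pass through.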

Fig. 2. An example of a simulated trajectory arriving from the east at a cell $C_{ij}$, with the direction probability distribution shown on the right-hand side of the figure.
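The per-cell categorical draw of a direction, as illustrated by the probability distribution in Fig. 2, can be sketched as follows (a small helper of our own, not from the paper):

```python
import random

def sample_direction(p):
    """Draw one of the quantized directions N/S/E/W for a grid cell.

    p: dict mapping a direction label to its probability in the cell,
    e.g. {'N': 0.1, 'S': 0.2, 'E': 0.6, 'W': 0.1} (must sum to 1).
    """
    labels = list(p)
    return random.choices(labels, weights=[p[d] for d in labels], k=1)[0]
```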




3.2 Network Architecture

Our proposed network architecture is based on multilayer Long Short-Term Memory (LSTM) networks. LSTM is an improvement over Recurrent Neural Networks (RNNs). RNNs are neural networks whose hidden units form directed cyclic connections, allowing them to capture temporal dependencies. These networks have been shown to be very successful for sequential data, such as in natural language processing [18] and speech recognition [19]. At each time step $t$, the state is a function of the previous state, $h_{t-1}$, and the current input, $S_t$:

$$h_t = f_h(h_{t-1}, S_t)$$
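As a toy illustration of this recurrence (a scalar Elman-style cell with hand-picked weights, not the paper's LSTM architecture):

```python
import math

def rnn_step(h_prev, s_t, w_h=0.5, w_s=1.0, b=0.0):
    """One step of the recurrence h_t = f_h(h_{t-1}, S_t).

    Here f_h is a scalar tanh cell with a recurrent weight w_h, an
    input weight w_s, and a bias b; the hidden state carries
    information from earlier observations forward in time.
    """
    return math.tanh(w_h * h_prev + w_s * s_t + b)

def run(sequence, h0=0.0):
    """Fold a whole input sequence through the recurrence."""
    h = h0
    for s in sequence:
        h = rnn_step(h, s)
    return h
```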


In practice, RNNs do not perform well in learning long-term dependencies [20]. To overcome this problem, various modifications of the RNN have been proposed. One popular



modification of the RNN is the LSTM [21], which has been successfully applied to problems in many domains [22–24]. The architecture of the proposed LSTM network is illustrated in Fig. 3. The network accepts two normalized inputs, $T_x$ and $T_y$. We set the maximum length of the trajectories, $\ell_{max}$, to 200 and preprocess them as mentioned earlier. The size of each hidden layer is summarized in Table 1.

Fig. 3. Network architecture

Table 1. Size of the hidden layers for different parts of the network

Layer name                          | Number of hidden units
Hidden layer 1_y, Hidden layer 1_x  | 32
Hidden layer 2_y, Hidden layer 2_x  | 32
Dense layer 1                       | 64
Hidden layer 3                      | 32
Hidden layer 4                      | 32
Dense layer 2                       | 64
Hidden layer 5                      | 64
Dense layer 3                       | 64



4 Experimental Results and Discussion

We trained the network's weights in two stages. In the first stage, we used the synthetic data generation strategy discussed previously. We then fine-tuned the weights using part of a real-world dataset. For the first stage, we simulated a total of 1,000,000 pairs of trajectories using different scene templates. We then calculated DTW, LCSS, and MoH for each pair. All trajectories were then preprocessed with the strategy mentioned earlier. We kept 80% of the data for training and the remaining 20% for validation. We then trained the network using stochastic gradient descent (SGD), starting with a learning rate of 0.01. Each time the validation error reached a plateau, we decreased the learning rate by a factor of 10 and continued the training procedure. We initialized the network's weights following [25] and set the batch size to 512. TensorFlow was used to implement and train the network. To verify our approach, we evaluated the trained network on the task of trajectory classification on the Lankershim dataset. This dataset is part of the Next Generation Simulation (NGSIM) program provided by the U.S. Federal Highway Administration (FHWA). The dataset provides trajectories of moving vehicles on Lankershim Boulevard in the Universal City neighborhood of Los Angeles, CA on June 16, 2005. The data are divided into 8:30 am to 8:45 am and 8:45 am to 9:00 am subsets. We only used the trajectories that took place near an intersection and removed trajectories outside of this area (Fig. 4). After also removing trajectories with fewer than 10 observations, we ended up with a total of 2,212 trajectories.

Fig. 4. Lankershim dataset

We used the first subset of the dataset (8:30 am to 8:45 am) to fine-tune the network's weights, dividing it into 80% for training and 20% for validation. We used the SGD method with a learning rate of $10^{-5}$ and fine-tuned the network for 2 epochs. The second subset of the dataset (8:45 am to 9:00 am) was used to evaluate the clustering performance of the trained network. Since this dataset does not provide activity labels for the trajectories, we manually labeled the trajectories into 19 activities. Our clustering approach is based on clustering data using a similarity matrix [3]. The



similarity matrix stores all pairwise similarities between trajectories, where $A_{ij}$ is the similarity between $T_i$ and $T_j$. We then used standard clustering algorithms to cluster the trajectories into 19 activities based on their similarities. We used three different clustering algorithms for this purpose, namely k-means, spectral clustering, and agglomerative clustering. The purity function is used to quantitatively evaluate the performance of our trajectory clustering [26]. Let $N_k^j$ be the number of trajectories in cluster $k$ with label $j$, and let $N_k = \sum_{j=1}^{C} N_k^j$ be the total number of trajectories falling into cluster $k$. The purity of a clustering is defined as

$$r_{purity} = \sum_{k=1}^{K} \frac{N_k}{N} q_k$$

where $q_k \triangleq \max_j N_k^j / N_k$ is the purity of cluster $k$. We compared the proposed similarity measure against DTW, LCSS, and MoH. The results of trajectory clustering are summarized in Table 2. The best performance is achieved by averaging the estimated dissimilarities and using agglomerative clustering. Furthermore, the clustering accuracies of the estimated dissimilarities are comparable with those of the true dissimilarity functions.
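A direct implementation of this purity score (our own helper, assuming hashable labels):

```python
from collections import Counter

def purity(true_labels, cluster_labels):
    """Clustering purity: sum_k (N_k / N) * max_j (N_k^j / N_k).

    Each cluster contributes the count of its most frequent
    ground-truth label; the weighted sum simplifies to the total
    number of 'correctly grouped' items divided by N.
    """
    n = len(true_labels)
    clusters = {}
    for t, c in zip(true_labels, cluster_labels):
        clusters.setdefault(c, []).append(t)
    return sum(max(Counter(m).values()) for m in clusters.values()) / n
```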

Table 2. Clustering accuracy for different similarity measures and clustering techniques

Similarity measure  | Clustering               | Accuracy
DTW                 | k-means                  | 0.79
                    | Spectral clustering      | 0.81
                    | Agglomerative clustering | 0.81
LCSS                | k-means                  | 0.79
                    | Spectral clustering      | 0.82
                    | Agglomerative clustering | 0.83
MoH                 | k-means                  | 0.92
                    | Spectral clustering      | 0.97
                    | Agglomerative clustering | 0.97
EDTW                | k-means                  | 0.78
                    | Spectral clustering      | 0.81
                    | Agglomerative clustering | 0.80
ELCSS               | k-means                  | 0.76
                    | Spectral clustering      | 0.80
                    | Agglomerative clustering | 0.81
EMoH                | k-means                  | 0.92
                    | Spectral clustering      | 0.95
                    | Agglomerative clustering | 0.97
EDTW, ELCSS, EMoH   | k-means                  | 0.93
                    | Spectral clustering      | 0.97
                    | Agglomerative clustering | 0.98


One of the main advantages of the proposed system is that it is substantially faster to compute. Table 3 summarizes the average execution time required to calculate the pairwise dissimilarity between two trajectories. These values were obtained by averaging the time required for calculating the dissimilarity matrix for the 8:45 am to 9:00 am subset. As the results suggest, the proposed method is substantially faster than any individual dissimilarity function, while it estimates all three distances simultaneously.

Table 3. Average execution time for calculating pairwise dissimilarity

Similarity measure     | Average time [ms]
DTW                    | 2.28
LCSS                   | 2.27
MoH                    | 1.28
EDTW, ELCSS, and EMoH  | 0.05

5 Conclusion

This paper proposed a deep learning architecture to estimate the pairwise dissimilarity of two trajectories. The network accepts two raw trajectories of varying length and returns estimates of the DTW, LCSS, and MoH dissimilarities simultaneously. A trajectory simulation method was proposed to generate synthetic trajectories for training the network. Experimental results confirmed the effectiveness of the proposed algorithm on the task of trajectory clustering.

References

1. Teimouri, M., Indahl, U., Sickel, H., Tveite, H.: Deriving animal movement behaviors using movement parameters extracted from location data. ISPRS Int. J. Geo-Inf. 7(2), 78 (2018)
2. Atev, S., Miller, G., Papanikolopoulos, N.P.: Clustering of vehicle trajectories. IEEE Trans. Intell. Transp. Syst. 11(3), 647–657 (2010)
3. Morris, B.T., Trivedi, M.M.: Trajectory learning for activity understanding: unsupervised, multilevel, and long-term adaptive approach. IEEE Trans. Pattern Anal. Mach. Intell. 33(11), 2287–2301 (2011)
4. Weiming, H., Xi, L., Guodong, T., Maybank, S., Zhongfei, Z.: An incremental DPMM-based method for trajectory clustering, modeling, and retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 35(5), 1051–1065 (2013)
5. Keogh, E.J., Pazzani, M.J.: Scaling up dynamic time warping for datamining applications. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 285–289. ACM (2000)
6. Vlachos, M., Kollios, G., Gunopulos, D.: Discovering similar multidimensional trajectories. In: Proceedings of the 18th International Conference on Data Engineering, pp. 673–684. IEEE (2002)
7. Chen, L., Özsu, M.T., Oria, V.: Robust and fast similarity search for moving object trajectories. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, Baltimore, Maryland (2005)
8. Wang, X., Ma, K.T., Ng, G.W., Grimson, W.E.: Trajectory analysis and semantic region modeling using a nonparametric Bayesian model. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008, pp. 1–8 (2008)



9. Piciarelli, C., Foresti, G.L.: On-line trajectory clustering for anomalous events detection. Pattern Recognit. Lett. 27(15), 1835–1842 (2006)
10. Chen, L., Ng, R.: On the marriage of Lp-norms and edit distance. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases, Toronto, Canada, vol. 30 (2004)
11. Atev, S., Masoud, O., Papanikolopoulos, N.: Learning traffic patterns at intersections by spectral clustering of motion trajectories. In: 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 4851–4856. IEEE (2006)
12. Alt, H.: The computational geometry of comparing shapes. In: Albers, S., Alt, H., Näher, S. (eds.) Efficient Algorithms. LNCS, vol. 5760, pp. 235–248. Springer, Heidelberg (2009)
13. Laxhammar, R., Falkman, G.: Online learning and sequential anomaly detection in trajectories. IEEE Trans. Pattern Anal. Mach. Intell. 36(6), 1158–1173 (2014)
14. Morris, B., Trivedi, M.: Learning trajectory patterns by clustering: experimental studies and comparative evaluation. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 312–319 (2009)
15. Zhang, Z., Huang, K., Tan, T.: Comparison of similarity measures for trajectory clustering in outdoor surveillance scenes. In: 2006 18th International Conference on Pattern Recognition, ICPR 2006, vol. 3, pp. 1135–1138 (2006)
16. Buza, K., Nanopoulos, A., Schmidt-Thieme, L.: Fusion of similarity measures for time series classification. In: Corchado, E., Kurzyński, M., Woźniak, M. (eds.) HAIS 2011. LNCS (LNAI), vol. 6679, pp. 253–261. Springer, Heidelberg (2011)
17. Weiming, H., Xuejuan, X., Zhouyu, F., Xie, D., Tieniu, T., Maybank, S.: A system for learning statistical motion patterns. IEEE Trans. Pattern Anal. Mach. Intell. 28(9), 1450–1464 (2006)
18. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014)
19. Graves, A., Mohamed, A.-R., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6645–6649. IEEE (2013)
20. Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994)
21. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
22. Sundermeyer, M., Schlüter, R., Ney, H.: LSTM neural networks for language modeling. In: Thirteenth Annual Conference of the International Speech Communication Association (2012)
23. Byeon, W., Breuel, T.M., Raue, F., Liwicki, M.: Scene labeling with LSTM recurrent neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3547–3555 (2015)
24. Ordóñez, F.J., Roggen, D.: Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition. Sensors 16(1), 115 (2016)
25. Zimmermann, H.-G., Tietz, C., Grothmann, R.: Forecasting with recurrent neural networks: 12 tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade. LNCS, vol. 7700, pp. 687–707. Springer, Heidelberg (2012)
26. Amigó, E., Gonzalo, J., Artiles, J., Verdejo, F.: A comparison of extrinsic clustering evaluation metrics based on formal constraints. Inf. Retr. 12(4), 461–486 (2009)

High Performance Computing and Cloud Computing

Performance Comparison of Eulerian Kinetic Vlasov Code Between Xeon Phi KNL and Xeon Broadwell

Takayuki Umeda¹ and Keiichiro Fukazawa²

¹ Institute for Space-Earth Environmental Research, Nagoya University, Nagoya 464-8601, Japan
[email protected]
² Academic Center for Computing and Media Studies, Kyoto University, Kyoto 606-8501, Japan
[email protected]

Abstract. The present study deals with a kinetic Vlasov simulation code as a high-performance application, which solves the first-principles kinetic equations known as the Vlasov equation. A five-dimensional Vlasov code with two spatial dimensions and three velocity dimensions is parallelized with MPI-OpenMP hybrid parallelism. The performance of the parallel Vlasov code is measured on a single compute node with a Xeon Phi Knights Landing (KNL) processor and on a single compute node with two Xeon Broadwell processors. It is shown that the use of Multi-Channel Dynamic Random Access Memory (MCDRAM) in the "cache" mode gives higher performance than the "flat" mode when the size of a computing job is larger than the size of MCDRAM. On the other hand, the use of MCDRAM in the "flat" mode gives higher performance than the "cache" mode for small-size jobs, when the NUMA (Non-Uniform Memory Access) policy is controlled appropriately. It is also shown that there is no substantial difference in performance among the cluster modes. The performance of our Vlasov code is best with the "Quadrant" cluster mode and worst with the "SNC-4" cluster mode.

Keywords: Performance measurement · Xeon Phi processor · Xeon processor · Eulerian-grid-based method · Hybrid parallelism



Many-core scalar processors are one of the recent trends in CPUs for high-performance computing, running at a lower clock frequency ... The computational speed of our code on a single Xeon Phi KNL processor and that on dual Xeon Broadwell processors are almost the same. This means that the computational efficiency (relative to the theoretical peak performance) of our Vlasov code on the Xeon Phi KNL is almost one third of that on the Xeon Broadwell, since the theoretical peak performance of a single Xeon Broadwell is over 500 GFLOPS. The result shows that our code runs faster with the "Flat" mode when the size of the job is less than the size of MCDRAM ...

    procedure :: update => subfld__update
  end type subfld_t
  type(subfld_t) :: sub

We can call the type-bound procedure update in the instance sub in a straightforward way as

call mhd%sub%update(mhd%main)

This is certainly intuitive for simulation programmers, but the elegance of this type-bound procedure call is spoiled by the bulky appearance of the member access operator "%", which contrasts with the compact period "." in other languages such as C/C++ or Java. In the eFortran dialect proposed in this paper, we therefore use "." as the member access operator instead of "%". The preprocessor efpp for eFortran substitutes each period "." that appears in the context of a member access operator with the letter "%". Care should be taken, since the period appears in various contexts in standard Fortran. Among the various usages of the period in modern Fortran, a floating point number (e.g., 3.14) is easy to identify (because it is accompanied by digits). However, other period usages in Fortran are not that easy. For example, in our MHD simulations, periods frequently appear in unary operators such as

A Dialect of Modern Fortran for Computer Simulations


vorticity = .curl.velocity

or binary operators such as electric = magnetic.cross.velocity

where vorticity, velocity, electric, and magnetic are three-dimensional arrays (vector fields, type(vfield_t), defined in the above code fragment) in the MHD code. To distinguish them from the member access operator, user-defined operators in eFortran, like .curl. and .cross., must be accompanied by at least one space both before and after the period letters. In the above example,

vorticity = .curl. velocity
electric = magnetic .cross. velocity

Logical operators such as .and., .eqv., etc., should also be accompanied by spaces, like

c = d .and. e
f = g .or. h
i = j .eqv. k
l = m .neqv. n

As a result of this imposed rule, it is easy to identify a period used as a member access operator by making use of regular expressions. The above example is then safely written as

which is much easier to read than the standard syntax using "%". The preprocessor efpp passes periods in file paths (names), character strings, and comments through without substitution.
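The paper notes that efpp is implemented in Python as a simple text converter. The core of the member-access substitution could be sketched with a regular expression as follows; this is our simplified illustration only — the real efpp additionally skips strings, comments, and file paths, as described above:

```python
import re

# A '.' is treated as member access when it is glued to an identifier
# character on the left and a letter/underscore on the right.  Numbers
# like 3.14 (digit on the right) and spaced operators like ' .curl. '
# (space on the left/right side of the dot) are left untouched.
MEMBER_DOT = re.compile(r'(?<=[A-Za-z0-9_])\.(?=[A-Za-z_])')

def convert_member_access(line):
    """Rewrite eFortran member access '.' into standard Fortran '%'."""
    return MEMBER_DOT.sub('%', line)
```

This relies on the spacing rule imposed above: because user-defined and logical operators must carry surrounding spaces, any period glued to identifiers on both sides can only be a member access.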

2.2 Block Comment

Comment lines in the standard Fortran are, for example,

! This routine calculates ...
!
! History: ...
!
! Author: ...
!

There is no block comment in Fortran, in contrast to, say, the C language, in which a block is marked by a pair of "/*" and "*/". The block comment syntax is convenient when we have to repeatedly turn on and off a part of the source code, especially in the early stage of development. We can write block comments in eFortran. A block starts from a sequence of (at least three) "=" letters:

=============


S. Hosoyamada and A. Kageyama

and it ends at a line with exactly the same content, i.e., the same number of "=" letters and the same number of indent spaces. Different numbers of "=" letters or indent spaces indicate the starter line of a different comment block. In other words, block comments in eFortran can be nested, in contrast to the C language. For example, efpp converts the following (meaningless) text

abc def ghijklmn opq
==============
abc def ghijklmn opq
======
abc def ghijklmn opq
abc def ghijklmn opq
======
abc def ghijklmn opq
==============
abc def ghijklmn opq

into

abc def ghijklmn opq
! =======
! abc def ghijklmn opq
!! ======
!! abc def ghijklmn opq
!! abc def ghijklmn opq
!! ======
! abc def ghijklmn opq
! =======
abc def ghijklmn opq


2.3 Syntax Aliases

In modern Fortran, some lines can become very long. The following stock phrases seem too long, at least to the authors:

integer(SI), intent(inout) :: m
real(DR), intent(in), optional :: p
integer(SI), parameter :: N = 100

In eFortran, we write the above lines as

inte(SI) :: m
real(DR) :: p
inte(SI) :: N = 100

This is realized by a simple text-conversion. The following set of pre-defined, shorter character strings (shown in left) are converted into the legitimate, longer strings (right) by efpp.

inte(SI)        integer(SI)
inte(DI)        integer(DI)
char(len=       character(len=
                , intent(in)
                , intent(out)
                , intent(inout)
                , intent(in), optional
                , intent(out), optional
                , intent(inout), optional
                , parameter

In the present version of efpp, the aliases are just text substitutions, with no arguments.


2.4 Assignment Operators

In eFortran, we can use the addition/subtraction/multiplication assignment operators:

where, for example, i += 1 is converted into i = i + 1 by efpp. eFortran does not provide division assignment operator “/=”, because it stands for “not equals” in the standard Fortran. 2.5

Pre-defined Macros

The character string __MODULE__ is converted to the module or the main program name that contains the string. The string __ROUTINE__ is converted to the name of the containing subroutine or function. If it is an internal subprogram, it shows the parent routine name, too. See the routine named inter in the sample code in Sect. 3, for example. 2.6

String Aliases

In addition to the pre-defined aliases explained in Sect. 2.3, the preprocessor efpp accepts user-defined aliases. They are specified in a file named efpp alias.list. It is a dictionary data in the following format: "character_string" => "strings_replaced_by_efpp"

A convenient alias frequently used in our simulation programmings is the following pair in the dictionary file "**debug_print**" => "print *, ’debug: ’//"



by which we can insert tidy print-out lines for debugging purposes in such a way that

subroutine inter
  inte(SI) :: i
  .
  i = ...
  .
  **debug_print** 'i = ', i
  .
end subroutine inter

and we can easily suppress the debug output by redefining the dictionary entry, putting an exclamation mark:

"**debug_print**" => "!print *, 'debug: '//"
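The alias mechanism itself amounts to parsing the dictionary file and applying plain text substitution. A minimal sketch (our illustration, using the same "=>" format):

```python
import re

# one line of the alias dictionary: "key" => "value"
ALIAS_LINE = re.compile(r'^\s*"(.+?)"\s*=>\s*"(.*)"\s*$')

def parse_alias(line):
    """Parse one '"key" => "value"' line of efpp_alias.list."""
    m = ALIAS_LINE.match(line)
    return (m.group(1), m.group(2)) if m else None

def apply_aliases(text, aliases):
    """Replace every alias key by its value, as plain text substitution."""
    for key, value in aliases.items():
        text = text.replace(key, value)
    return text
```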


2.7 Imposing of Implicit None

One of the most dangerous errors in modern Fortran programming is forgetting to write the line "implicit none" in the main program or a module file. Following the idea of ELF90 [14], efpp emits an error message when the line "implicit none" is missing.

2.8 "Just Once" Block

In a computer simulation program, it is common to find several chunks of lines that are supposed to be executed only once during the simulation. The tasks assigned to those chunks are memory allocation, setting up the initial condition, and others. These chunks are usually handled by an if statement with a flag variable in the standard Fortran code. This kind of "just once" block is so essential in simulation programs that it is worth assigning it a specific syntax with a striking appearance. In eFortran, we write the "just once" block in the following form:

logical :: just_once = .true.
.
.
====
  ...chunks...
====
.
.

The block is sandwiched by sequences of n "=" letters (n ≥ 2). The efpp preprocessor converts the above lines to

A Dialect of Modern Fortran for Computer Simulations


logical :: for_the_first_time = .true.
.
.
if ( for_the_first_time ) then
  ...chunks...
for_the_first_time = .false. ; end if

Note that, in the first line, the initialization of the variable for_the_first_time in the declaration implies "save" in Fortran. Note also that a semicolon is used in the last line, which is explained next.

2.9 Conservation of Line Numbers

Suppose we have an eFortran source code sample.e. By applying the preprocessor efpp, we convert it to a standard Fortran code, say sample.f. It is this file, sample.f, that is compiled by a Fortran compiler. The compiler reports warnings or errors, if any, with line numbers. We can directly jump to the specified position in the original eFortran code sample.e, because the line numbers in sample.e and sample.f are the same.

2.10 Skip Block

Another block syntax introduced in eFortran is the following "skip" block, which appears frequently in simulation programs, like the "just once" block. The lines inside the "skip" block are executed every n calls of the block (n is a positive integer). A code fragment in eFortran is

inte(SI) :: ctr = 0
.
.
====
  ...some work invoked every 10 times...
====
.
.

The preprocessor efpp converts it to

inte(SI) :: ctr = 0
.
.
if ( mod(ctr,10)==0 ) then
  ...some work invoked every 10 times...
end if ; ctr = ctr + 1
.
.

Note again that a semicolon is used to keep the line numbers.




3 Sample Code Fragment

Here we show a sample code fragment in eFortran.

module test_m
  use constm
  implicit none
  .
contains
  subroutine testr(j,val)
    inte(SI) :: j
    real(DR) :: val
    ============================
    This is in a block comment.
    =======
    Comments can be nested.
    =======
    ============================
    inte(SI) :: ctr = 0
    inte(SI) :: NN = 1
    logical :: just_once = .true.
    .
    j += 1
    call mhd.sub.update(mhd.main)
    .
    print *, 'Hello from __ROUTINE__ in __MODULE__'
    ! ==> Hello from testr in test_m
    ====
    print *, 'This line is executed only once.'
    ====
    .
    ====
    print *, 'This line is executed every 10 times.'
    ====
    .
  contains
    subroutine inter
      print *, 'Hello from __ROUTINE__'
      ! ==> Hello from testr/inter
      **debug_print** 'i = ', i
    end subroutine inter
  end subroutine testr
end module test_m



4 Conclusion

Fortran has been one of the major programming languages in HPC. The latest features introduced in modern Fortran (Fortran 2003 and later) have enhanced its appeal to simulation researchers. The language's importance in



HPC will grow further when its frontend for the LLVM compiler infrastructure [17], such as flang [16], becomes popular. To promote the pleasure of coding in modern Fortran, we propose a dialect of it, eFortran, in this paper. eFortran has several features to improve the coding experience for simulation programs, including: the period as the member access operator; block comments; shorter syntax; addition/subtraction/multiplication assignment; pre-defined and user-defined macros; an automatic check for "implicit none"; the "just once" block; and the "skip" block. We have developed the preprocessor efpp, which converts eFortran code into standard Fortran. efpp does not change the line numbers of the source code, to help track messages from the Fortran compiler and jump directly to the corresponding source code in eFortran. Some imposed rules, for example, the spaces in user-defined operators described in Sect. 2.1, could be relaxed if we used a general-purpose macro language, such as m4 [13], to implement the preprocessor. However, the learning curve for that tool seems too steep for the authors when we observe its overly rich feature set [20]. Instead, we have implemented efpp in Python as a simple text converter. The source code is available at GitHub [8].

References

1. Adams, J.C.: The Fortran 2003 Handbook: The Complete Syntax, Features and Procedures. Springer, Heidelberg (2009)
2. The Fortran Company: F home page
3. Intel Corporation: Intel Fortran compiler
4. Foster, M.P.: Quantity correctness in Fortran programs. Comput. Sci. Eng. 19(4), 83–87 (2017)
5. Free Software Foundation, Inc.: GNU Fortran project home page. https://gcc.gnu.org/fortran/
6. Numerical Algorithms Group: NAG Fortran compiler
7. Hassan, A.A., Cardellini, V., Filippone, S.: A framework for unit testing with coarray Fortran. In: 25th High Performance Computing Symposium, HPC 2017, Part of the 2017 Spring Simulation Multi-Conference, SpringSim 2017, vol. 49, no. 3, pp. 47–58 (2017)
8. Kageyama, A.: EFPP
9. Kageyama, A., Miyagoshi, T., Sato, T.: Formation of current coils in geodynamo simulations. Nature 454(7208), 1106–1109 (2008)
10. Kennison, D.: The IFTPAN preprocessor. Record 3(6), 8–10 (1982)
11. Kernighan, B.W.: RATFOR - a preprocessor for a rational Fortran. Softw. Pract. Exp. 5(4), 395–406 (1975)
12. Kernighan, B.W., Plauger, P.J.: Software Tools. Addison-Wesley, Boston (1976)
13. Kernighan, B., Ritchie, D.: The M4 macro processor. Bell Lab. Tech. Rep. 54, 1–5 (1977)
14. Lahey Computer Systems, Inc.: ELF90


S. Hosoyamada and A. Kageyama

15. Miyagoshi, T., Kageyama, A., Sato, T.: Zonal flow formation in the Earth's core. Nature 463(6), 793–796 (2010)
16. Osmialowski, P.: How the flang frontend works. In: Proceedings of the Fourth Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC 2017) (2017)
17. LLVM Project: LLVM
18. Spinellis, D.: Notable design patterns for domain-specific languages. J. Syst. Softw. 56(1), 91–99 (2001)
19. Tsuji, T., Watanabe, K., Ikehata, A.: Structured FORTRAN preprocessors generating optimized output. Softw. Pract. Exp. 18(5), 427–442 (1988)
20. Turner, K.J.: Exploiting the m4 Macro Language. Technical report CSM-126, pp. 1–17 (1994)

Flow Simulation

Performance Comparison of the Three Numerical Methods to Discretize the Local Inertial Equation for Stable Shallow Water Computation

Tomohiro Tanaka1, Hidekazu Yoshioka2(✉), Sokly Siev3, Hideto Fujii4, Ly Sarann5, and Chihiro Yoshimura3

1 Kyoto University, Kyotodaigaku-katsura, Nishikyo-ku, Kyoto, Japan
[email protected]
2 Shimane University, Nishikawatsu-cho 1060, Matsue, Japan
[email protected]
3 Tokyo Institute of Technology, Ookayama, Meguro-ku, Tokyo, Japan
{siev.s.aa,yoshimura.c.aa}
4 Yamagata University, 1-23, Wakaba-machi, Tsuruoka, Yamagata, Japan
[email protected]
5 Institute of Technology of Cambodia, P.O. Box 86, Phnom Penh, Cambodia
[email protected]

Abstract. The local inertial equation (LIE) is a simple shallow water model for simulating surface water dynamics. Recently, the model has been widely applied to flood simulation worldwide. Keys in numerical implementation of the LIE are the staggered spatio-temporal discretization and the stable treatment of the friction slope terms. The latter is critical for stable and efficient computation. Currently, several discretization methods (semi-implicit, fully-implicit, and exponential methods) for the friction slope terms with comparable computational efficiency are available. However, their performance evaluation has been carried out only independently. We thus compare the performance of the three methods through their application to test and realistic cases. In this paper, firstly, theoretical linear stability analysis results are reviewed, indicating the highest stability of the fully-implicit method, which is also consistent, in a certain sense, under the coarse-time-step limit. Application of these methods to a 1-D test case with an advancing wet and dry interface implies that all the methods work well, with the fully-implicit method having the least error. Their application to 2-D flood simulation in Tonle Sap Lake and its floodplains in South-East Asia demonstrates that the exponential method gives slightly more oscillatory results than the others. Dependence of the simulated surface water dynamics on the spatial resolution is investigated as well, to give a criterion of the resolution for numerical simulation with satisfactory accuracy.

Keywords: Local inertial equation · Finite difference scheme · Friction slope term · Tonle Sap Lake

© Springer Nature Singapore Pte Ltd. 2018 L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 451–465, 2018.


T. Tanaka et al.

1 Introduction

Mathematical models based on fluid dynamics are indispensable tools for tracking surface water dynamics, such as the flooding and drying of huge lakes and flood expansion in urban areas. The shallow water equations and their simplified counterparts have been effectively utilized in flood simulation at regional and watershed scales [1]. The local inertial equation (LIE) is one of the most widely used mathematical models for modern flood simulation [2]. Mathematically, the LIE is the shallow water equations without the advection terms in the momentum equations, which can be physically justified for surface water dynamics in which the horizontal scale is much larger than the vertical scale [3]. Mathematical properties of the LIE, especially its possible wave structures, have been analyzed in Martins [4]. Due to its simplicity and easy implementability, numerical computation of the LIE has been carried out with simple but somewhat technical discretization methods aiming at high computational efficiency. Keys for successful numerical treatment of the LIE are the mass-conservative spatio-temporally staggered discretization in the spirit of finite differences and the stable discretization of the stiff friction slope terms; the latter is an important factor especially when handling surface water dynamics with wet and dry interfaces. This is because the friction slope terms, due to their functional form, may diverge when the water depth is very small, which is a frequently encountered situation in flood simulation. So far, the semi-implicit [2], fully-implicit [5], and exponential methods [6] have been proposed and analyzed individually. Their performance should be compared with each other for the establishment of a better flood simulator with higher accuracy, stability, and efficiency. This is the motivation of this paper.
The objective of this paper is thus set as a performance comparison of the three numerical methods (semi-implicit, fully-implicit, and exponential methods) for the discretization of the friction slope terms. They are implemented in a common mass-conservative staggered finite difference scheme. Both benchmark and realistic problems are considered in this paper. The benchmark problem is a 1-D flood propagation problem whose analytical solution is available [7]. As the realistic problem, the seasonal flooding and drying of the Tonle Sap Lake in South-East Asia is considered, where several measured water depth and discharge data are available. The computational results of the above-mentioned methods are compared with the data to see their performance. Dependence of the simulated surface water dynamics on the spatial resolution is also investigated, to give a criterion of the resolution for satisfactory flood simulation. The rest of this paper is organized as follows. Section 2 presents the 2-D LIE and its 1-D counterpart. Numerical methods for their discretization are also presented, together with their mathematical analysis results obtained so far. The methods are compared in Sects. 3 and 4: Sect. 3 examines their computational accuracy against the benchmark problem, and Sect. 4 applies them to flooding and drying simulation in and around the Tonle Sap Lake. Section 5 concludes this paper and presents future directions of our research.



2 Local Inertial Model and Its Discretization

In this section, the LIE, the mathematical model used in this paper, is introduced and its discretization methods are presented.

2.1 Mathematical Formulation

The domain of surface water flows is denoted as $\Omega$ and is identified with a bounded domain in the 2-D $x$-$y$ space. The time is denoted as $t \ge 0$. The water depth is denoted as $h = h(t,x,y) \ge 0$, and the line discharges in the $x$- and $y$-directions as $p = p(t,x,y)$ and $q = q(t,x,y)$, respectively. The horizontal depth-averaged velocities in the $x$- and $y$-directions are obtained as $u = h^{-1}p$ and $v = h^{-1}q$ when $h > 0$; otherwise, they are set as 0. The 2-D LIE is a system of hyperbolic conservation laws having source terms to describe horizontal mass and momentum transport phenomena occurring in the domain $\Omega$. The continuity equation is given as

$$\frac{\partial h}{\partial t} + \frac{\partial p}{\partial x} + \frac{\partial q}{\partial y} = r - e \qquad (1)$$


and the momentum equations in the $x$- and $y$-directions are given as

$$\frac{\partial p}{\partial t} + gh\left(\frac{\partial (h+z)}{\partial x} + \frac{n^2 p\,|p|}{h^{10/3}}\right) = 0 \qquad (2)$$

and

$$\frac{\partial q}{\partial t} + gh\left(\frac{\partial (h+z)}{\partial y} + \frac{n^2 q\,|q|}{h^{10/3}}\right) = 0, \qquad (3)$$

respectively. Here, $z = z(x,y)$ is the bed elevation, $r$ is the source term due to rainfall, $e$ is the sink term due to evaporation, and $g$ is the gravitational acceleration. The conventional Manning's formula with the roughness coefficient $n$ has been assumed in the friction slope terms, the last terms of (2) and (3). This paper considers the 1-D counterpart in straight channels as well, which is simply derived by omitting the terms in the $y$-direction: the continuity equation

$$\frac{\partial h}{\partial t} + \frac{\partial p}{\partial x} = r - e \qquad (4)$$

with the momentum Eq. (2).

2.2 Finite Difference Discretization

The 2-D LIE presented in the previous sub-section is discretized with a mass-conservative finite difference scheme. Firstly, the domain $\Omega$ is discretized into an orthogonal structured mesh having rectangular cells. The time increment for temporal integration is denoted as $\Delta t$. The spatial increments in the $x$- and $y$-directions are denoted as $\Delta x$ and $\Delta y$, respectively. The quantity $Q$ evaluated at the time $t = k\Delta t$ and the location $(x,y) = (i\Delta x, j\Delta y)$ is denoted as $Q^k_{i,j}$, where $i$, $j$, and $k$ are integers numbering the locations and the time steps. Sub- and/or super-scripts with half-integer values are defined in the same manner. Here the discretization of the scheme for the 2-D LIE is presented, since the 1-D discretization is formally derived by simply omitting the $y$-directional components. Based on the above-mentioned setting, the continuity equation is discretized as

$$\frac{h^{k+1}_{i,j}-h^{k}_{i,j}}{\Delta t} + \frac{p^{k+1/2}_{i+1/2,j}-p^{k+1/2}_{i-1/2,j}}{\Delta x} + \frac{q^{k+1/2}_{i,j+1/2}-q^{k+1/2}_{i,j-1/2}}{\Delta y} = r^{k}_{i,j} - e^{k}_{i,j}. \qquad (5)$$
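The discretized continuity equation maps directly onto array operations. The following NumPy sketch (our illustration with assumed array-shape conventions, not code from the paper) advances the cell depths from the staggered face discharges:

```python
import numpy as np

def continuity_update(h, p, q, r, e, dt, dx, dy):
    """Advance the water depth h by the staggered continuity equation (5).

    h    : (nx, ny) cell-centred depths at step k
    p    : (nx+1, ny) x-face line discharges at step k+1/2
    q    : (nx, ny+1) y-face line discharges at step k+1/2
    r, e : (nx, ny) rainfall source and evaporation sink
    Returns the depths at step k+1 (clipped at zero so dry cells stay dry).
    """
    div = (p[1:, :] - p[:-1, :]) / dx + (q[:, 1:] - q[:, :-1]) / dy
    return np.maximum(h + dt * (r - e - div), 0.0)
```

With closed boundaries (zero discharge on the outermost faces), the update conserves the total water volume up to the source and sink terms, which is the mass-conservation property emphasized above.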


The momentum Eqs. (2) and (3) are then discretized as

$$\frac{p^{k+1/2}_{i+1/2,j}-p^{k-1/2}_{i+1/2,j}}{\Delta t} + g\tilde{h}^k_{i+1/2,j}\left[\frac{\left(h^k_{i+1,j}+z_{i+1,j}\right)-\left(h^k_{i,j}+z_{i,j}\right)}{\Delta x} + S_{i+1/2,j,k}\right] = 0 \qquad (6)$$

and

$$\frac{q^{k+1/2}_{i,j+1/2}-q^{k-1/2}_{i,j+1/2}}{\Delta t} + g\tilde{h}^k_{i,j+1/2}\left[\frac{\left(h^k_{i,j+1}+z_{i,j+1}\right)-\left(h^k_{i,j}+z_{i,j}\right)}{\Delta y} + S_{i,j+1/2,k}\right] = 0. \qquad (7)$$

The water depths $\tilde{h}^k_{i+1/2,j}$ and $\tilde{h}^k_{i,j+1/2}$ appearing in (6) and (7) are evaluated as

$$\tilde{h}^k_{i+1/2,j} = \max\left\{h^k_{i,j}+z_{i,j},\; h^k_{i+1,j}+z_{i+1,j}\right\} - \max\left\{z_{i,j},\; z_{i+1,j}\right\} \qquad (8)$$

and

$$\tilde{h}^k_{i,j+1/2} = \max\left\{h^k_{i,j}+z_{i,j},\; h^k_{i,j+1}+z_{i,j+1}\right\} - \max\left\{z_{i,j},\; z_{i,j+1}\right\}, \qquad (9)$$



respectively, to represent the water depth in the flowing direction. In (6) and (7), $S_{i+1/2,j,k}$ and $S_{i,j+1/2,k}$ represent the discretized friction slope terms, which have different functional forms in the different numerical methods, as shown in the following sub-section. Appropriate initial and boundary conditions must be prepared for implementation of the finite difference scheme. A methodology to handle wet and dry interfaces is explained here. A cell is said to be dry if its water depth equals 0; otherwise the cell is said to be wet. At a wet and dry interface, the water depth in the dry cell is set to be 0 and the line discharge in the dry cell is manipulated as explained below. Assume that a wet and dry interface is detected at the $(i+1/2,j)$-th cell, where the $(i,j)$-th cell is wet and the $(i+1,j)$-th cell is dry. By (8), we have $\tilde{h}^k_{i+1/2,j} = h^k_{i,j}+z_{i,j}-z_{i+1,j}$ if the water surface elevation $h^k_{i,j}+z_{i,j}$ is higher than the elevation $z_{i+1,j}$. If $h^k_{i,j}+z_{i,j}$ approaches $z_{i+1,j}$, i.e. if $\tilde{h}^k_{i+1/2,j}$ approaches 0, then $p^{k+1/2}_{i+1/2,j}$ is set to be 0, so that the water does not flow into the dry cell. In our numerical computation, $p^{k+1/2}_{i+1/2,j}$ is set to be 0 if $\tilde{h}^k_{i+1/2,j}$ is smaller than the threshold value 0.01 m for stable computation. The same procedure applies to wet and dry interfaces detected at the other cells.
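The wet and dry treatment above can be condensed into two small helpers (an illustrative Python sketch with hypothetical names, operating on scalar per-face quantities):

```python
def face_depth(h_l, z_l, h_r, z_r):
    """Effective water depth (8) at the face between a left and a right cell."""
    return max(h_l + z_l, h_r + z_r) - max(z_l, z_r)

def limited_discharge(p_face, h_l, z_l, h_r, z_r, h_min=0.01):
    """Zero the face discharge when the effective depth falls below the
    0.01 m threshold, so no water is pushed into a dry cell."""
    return 0.0 if face_depth(h_l, z_l, h_r, z_r) < h_min else p_face
```

For example, a wet cell with water surface elevation 0.5 m next to a dry cell sitting on a 1 m step gives an effective depth of zero, so the discharge into the dry cell is blocked.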

2.3 Discretization of the Friction Slope Terms

The three methods, which are the fully-implicit method, the semi-implicit method, and the exponential method, are presented in this sub-section. For the sake of brevity, the following quantities are introduced:

$$A_{i+1/2,j,k} = -g\tilde{h}^k_{i+1/2,j}\,\frac{\left(h^k_{i+1,j}+z_{i+1,j}\right)-\left(h^k_{i,j}+z_{i,j}\right)}{\Delta x},\qquad B_{i+1/2,j,k} = \frac{gn^2}{\left(\tilde{h}^k_{i+1/2,j}\right)^{7/3}} > 0, \qquad (10)$$

$$C_{i+1/2,j,k} = -A_{i+1/2,j,k}\,\Delta t - p^{k-1/2}_{i+1/2,j}, \qquad (11)$$

$$A_{i,j+1/2,k} = -g\tilde{h}^k_{i,j+1/2}\,\frac{\left(h^k_{i,j+1}+z_{i,j+1}\right)-\left(h^k_{i,j}+z_{i,j}\right)}{\Delta y},\qquad B_{i,j+1/2,k} = \frac{gn^2}{\left(\tilde{h}^k_{i,j+1/2}\right)^{7/3}} > 0, \qquad (12)$$

$$C_{i,j+1/2,k} = -A_{i,j+1/2,k}\,\Delta t - q^{k-1/2}_{i,j+1/2}. \qquad (13)$$


The first method, the semi-implicit method, is the most widely-used method due to its simplicity. Originally, Bates et al. [2] implemented this method in the first numerical flood simulator based on the 2-D LIE. The method gives $S_{i+1/2,j,k}$ and $S_{i,j+1/2,k}$ as follows:

$$S_{i+1/2,j,k} = \frac{n^2\, p^{k+1/2}_{i+1/2,j}\left|p^{k-1/2}_{i+1/2,j}\right|}{\left(\tilde{h}^k_{i+1/2,j}\right)^{10/3}} \quad\text{and}\quad S_{i,j+1/2,k} = \frac{n^2\, q^{k+1/2}_{i,j+1/2}\left|q^{k-1/2}_{i,j+1/2}\right|}{\left(\tilde{h}^k_{i,j+1/2}\right)^{10/3}}.$$

The name “semi-implicit” comes from the numerical discretization of the numerators. The momentum Eqs. (6) and (7) are exactly solved for $p^{k+1/2}_{i+1/2,j}$ and $q^{k+1/2}_{i,j+1/2}$ as

$$p^{k+1/2}_{i+1/2,j} = \frac{p^{k-1/2}_{i+1/2,j} + A_{i+1/2,j,k}\,\Delta t}{1 + B_{i+1/2,j,k}\,\Delta t\left|p^{k-1/2}_{i+1/2,j}\right|} \qquad (14)$$

and

$$q^{k+1/2}_{i,j+1/2} = \frac{q^{k-1/2}_{i,j+1/2} + A_{i,j+1/2,k}\,\Delta t}{1 + B_{i,j+1/2,k}\,\Delta t\left|q^{k-1/2}_{i,j+1/2}\right|}, \qquad (15)$$

respectively. Notice that the denominators of (14) and (15) are always positive, and the updates are thus well-defined. The second method is the fully-implicit method, which is a recently-developed numerical method to enhance the computational stability of the semi-implicit method [5]. It gives $S_{i+1/2,j,k}$ and $S_{i,j+1/2,k}$ as follows:

$$S_{i+1/2,j,k} = \frac{n^2\, p^{k+1/2}_{i+1/2,j}\left|p^{k+1/2}_{i+1/2,j}\right|}{\left(\tilde{h}^k_{i+1/2,j}\right)^{10/3}} \quad\text{and}\quad S_{i,j+1/2,k} = \frac{n^2\, q^{k+1/2}_{i,j+1/2}\left|q^{k+1/2}_{i,j+1/2}\right|}{\left(\tilde{h}^k_{i,j+1/2}\right)^{10/3}}. \qquad (16)$$
The name “fully-implicit” comes directly from the fully-implicit treatment of the numerators in (16). As in the semi-implicit method, the momentum Eqs. (6) and (7) are exactly solved for $p^{k+1/2}_{i+1/2,j}$ and $q^{k+1/2}_{i,j+1/2}$ in the fully-implicit method. They are uniquely found as

$$p^{k+1/2}_{i+1/2,j} = \frac{C_{i+1/2,j,k}}{2B_{i+1/2,j,k}\,\Delta t\left|C_{i+1/2,j,k}\right|}\left(1 - \sqrt{1 + 4\left|C_{i+1/2,j,k}\right| B_{i+1/2,j,k}\,\Delta t}\right) \qquad (17)$$

and

$$q^{k+1/2}_{i,j+1/2} = \frac{C_{i,j+1/2,k}}{2B_{i,j+1/2,k}\,\Delta t\left|C_{i,j+1/2,k}\right|}\left(1 - \sqrt{1 + 4\left|C_{i,j+1/2,k}\right| B_{i,j+1/2,k}\,\Delta t}\right), \qquad (18)$$

respectively. Notice that the quantities in the square roots of (17) and (18) are positive. The last method is the exponential method [6], which is based on a different philosophy from the above-presented ones. The exponential method considers the Eq. (2) as a local ordinary differential equation (ODE) with a “linear” decay term as

$$\frac{\partial p}{\partial t} + a + bp = 0 \qquad (19)$$

with

$$a = gh\,\frac{\partial (h+z)}{\partial x} \quad\text{and}\quad b = \frac{gn^2\,|p|}{h^{7/3}}. \qquad (20)$$




The formal ODE (19) is firstly discretized for $t_{k-1/2} < t \le t_{k+1/2}$ as

$$\frac{\mathrm{d}p(x_{i+1/2}, y_j, t)}{\mathrm{d}t} - A_{i+1/2,j,k} + B_{i+1/2,j,k}\left|p^{k-1/2}_{i+1/2,j}\right| p(x_{i+1/2}, y_j, t) = 0. \qquad (21)$$

Then, (21) is truly an ODE that describes the temporal evolution of $p(x_{i+1/2}, y_j, t)$. The ODE (21) is integrated with respect to the time $t$ for $t_{k-1/2} < t \le t_{k+1/2}$ to derive the discretization of the momentum Eq. (2) as

$$p^{k+1/2}_{i+1/2,j} = \left(p^{k-1/2}_{i+1/2,j} - \frac{A_{i+1/2,j,k}}{B_{i+1/2,j,k}\left|p^{k-1/2}_{i+1/2,j}\right|}\right) e^{-B_{i+1/2,j,k}\left|p^{k-1/2}_{i+1/2,j}\right|\Delta t} + \frac{A_{i+1/2,j,k}}{B_{i+1/2,j,k}\left|p^{k-1/2}_{i+1/2,j}\right|} \qquad (22)$$

when $\left|p^{k-1/2}_{i+1/2,j}\right| > 0$, and otherwise as

$$p^{k+1/2}_{i+1/2,j} = A_{i+1/2,j,k}\,\Delta t. \qquad (23)$$

The discretization of the other momentum Eq. (3) is carried out in an essentially similar way as

$$q^{k+1/2}_{i,j+1/2} = \left(q^{k-1/2}_{i,j+1/2} - \frac{A_{i,j+1/2,k}}{B_{i,j+1/2,k}\left|q^{k-1/2}_{i,j+1/2}\right|}\right) e^{-B_{i,j+1/2,k}\left|q^{k-1/2}_{i,j+1/2}\right|\Delta t} + \frac{A_{i,j+1/2,k}}{B_{i,j+1/2,k}\left|q^{k-1/2}_{i,j+1/2}\right|} \qquad (24)$$

when $\left|q^{k-1/2}_{i,j+1/2}\right| > 0$, and otherwise as

$$q^{k+1/2}_{i,j+1/2} = A_{i,j+1/2,k}\,\Delta t. \qquad (25)$$

We can obtain $S_{i+1/2,j,k}$ and $S_{i,j+1/2,k}$ for the exponential method as well, but they are not presented here since they are somewhat lengthy and not of importance.
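Collecting the three treatments, the per-face discharge update can be sketched in Python as follows (our illustration with hypothetical names; A and B are the quantities of (10) and (12), and the fully-implicit branch uses C = -A Δt - p as in (11)):

```python
import math

def friction_updates(p_old, A, B, dt):
    """Discharge update at one face by the three friction treatments.

    p_old : line discharge at step k-1/2
    A     : driving term from the water surface gradient, cf. (10)
    B     : g n^2 / h~^(7/3) > 0, cf. (10)
    Returns (semi_implicit, fully_implicit, exponential).
    """
    # Semi-implicit, Eq. (14): only |p| in the friction term is lagged.
    semi = (p_old + A * dt) / (1.0 + B * dt * abs(p_old))

    # Fully-implicit, Eq. (17): exact root of p + B*dt*p|p| = p_old + A*dt.
    C = -A * dt - p_old
    if C == 0.0:
        full = 0.0
    else:
        full = (C * (1.0 - math.sqrt(1.0 + 4.0 * abs(C) * B * dt))
                / (2.0 * B * dt * abs(C)))

    # Exponential, Eqs. (22)-(23): exact integration of the linearized ODE (21).
    if p_old != 0.0:
        beta = B * abs(p_old)
        expo = (p_old - A / beta) * math.exp(-beta * dt) + A / beta
    else:
        expo = A * dt

    return semi, full, expo
```

As a quick check, the fully-implicit value satisfies its defining quadratic exactly, and for small values of BΔt|p| the three results nearly coincide, reflecting the comparable accuracy discussed below.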

2.4 Similarity and Difference Among the Three Methods

The three numerical methods for discretization of the friction slope terms were presented. A strong similarity among the three numerical methods is the fact that they do not require any iterative algorithms, like the Newton methods, for finding the updated quantities $p^{k+1/2}_{i+1/2,j}$ and $q^{k+1/2}_{i,j+1/2}$. This is a key to achieving computationally efficient numerical simulation of surface water dynamics with the LIE. Since they are essentially explicit and employ single-stage temporal integration, their computational complexity is theoretically the same. In addition, it has been found that they are more stable than the fully explicit method with

$$S_{i+1/2,j,k} = \frac{n^2\, p^{k-1/2}_{i+1/2,j}\left|p^{k-1/2}_{i+1/2,j}\right|}{\left(\tilde{h}^k_{i+1/2,j}\right)^{10/3}} \quad\text{and}\quad S_{i,j+1/2,k} = \frac{n^2\, q^{k-1/2}_{i,j+1/2}\left|q^{k-1/2}_{i,j+1/2}\right|}{\left(\tilde{h}^k_{i,j+1/2}\right)^{10/3}}. \qquad (26)$$

On the other hand, they have different stability properties, which may lead to substantially different computational results under certain conditions. Firstly, they have qualitatively different behaviour for coarse temporal resolution. Here, a discretization method is said to be consistent under the coarse limit if it complies with

$$\lim_{\Delta t \to +\infty}\left|p^{k+1/2}_{i+1/2,j}\right| = \frac{\left(\tilde{h}^k_{i+1/2,j}\right)^{5/3}}{n}\sqrt{\frac{\left|\left(h^k_{i+1,j}+z_{i+1,j}\right)-\left(h^k_{i,j}+z_{i,j}\right)\right|}{\Delta x}} \qquad (27)$$

and

$$\lim_{\Delta t \to +\infty}\left|q^{k+1/2}_{i,j+1/2}\right| = \frac{\left(\tilde{h}^k_{i,j+1/2}\right)^{5/3}}{n}\sqrt{\frac{\left|\left(h^k_{i,j+1}+z_{i,j+1}\right)-\left(h^k_{i,j}+z_{i,j}\right)\right|}{\Delta y}}. \qquad (28)$$

Namely, the discretization method is said to be consistent under the coarse limit if it exactly and locally reproduces the uniform flow conditions. Straightforward calculations show that only the fully-implicit method is consistent under the coarse limit, implying its theoretical advantage over the others. In fact, the semi-implicit and the exponential methods give the result

$$\lim_{\Delta t \to +\infty}\left|p^{k+1/2}_{i+1/2,j}\right| = \frac{\left|A_{i+1/2,j,k}\right|}{B_{i+1/2,j,k}\left|p^{k-1/2}_{i+1/2,j}\right|}, \qquad (29)$$

which is totally different from (27). On the computational stability, the three methods have narrower CFL stability conditions than the conventionally-used one [2, 7–9]:

$$\Delta t \le \min_{i,j}\frac{\Delta x}{\sqrt{g h_{i,j}}}, \qquad (30)$$

which is due to the existence of the friction slope terms. However, it should be noted that this is not a drawback of these methods; rather, the conventional stability condition (30), which previous research believed to be correct, is itself incorrect in the presence of friction. Actually, the CFL stability condition is valid for problems without friction, but these are out of the scope of this paper. On computational accuracy, the three numerical methods have been considered to be at most first-order in both space and time, considering the well-known Godunov's theorem [10]. However, this is only a theoretical prediction and not necessarily the performance in practice. Their accuracy is evaluated in the following sections.

3 Benchmark Computation

The three numerical methods are applied to numerical computation of the benchmark problem by Almeida et al. [7], which is a problem to simulate an advancing wet and dry interface in a straight channel having a horizontal bottom. The computational domain is set as the 1-D interval (0, 5000) (m), which is uniformly discretized into 200 cells with $\Delta x = 25$ (m). The time increment $\Delta t$ is set as 1 (s). The analytical solution to this benchmark problem is found in Almeida et al. [7]. Figure 1 plots the computed and analytical water surface profiles at different time steps. The computational results with the three methods are comparable and are difficult to distinguish in the figure. All the methods give a diffusive wet and dry interface with a faster propagation speed than the theoretical one, which would be due to their implicit natures in the temporal discretization that may add artificial diffusivity. On the other hand, such an effect would be necessary for stable numerical computation. To carry out a more detailed comparison of the numerical methods, Table 1 presents the l1 error between the analytical and numerical solutions for each method. Although there exist differences in the l1 error among the methods, these differences are much smaller than the magnitude of the error itself. In all the cases the largest absolute value of the error is detected at the wave front (see Fig. 1), and the accuracy of the numerical methods does not seem to differ much from each other.
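The benchmark setup can be sketched in a few lines of NumPy (an illustrative 1-D driver with the semi-implicit treatment and a fixed upstream depth; the roughness value is assumed, and this is not the exact configuration or code of [7]):

```python
import numpy as np

g, n = 9.81, 0.03                 # gravity; Manning roughness (assumed value)
nx, dx, dt = 200, 25.0, 1.0       # 200 cells of 25 m, 1 s steps, flat bed z = 0
h = np.zeros(nx)                  # depths at cell centres, initially dry
p = np.zeros(nx + 1)              # discharges at faces; p[0] = p[-1] = 0 (closed)

for step in range(600):
    h[0] = 2.0                                    # upstream cell acts as a reservoir
    hf = np.maximum(h[:-1], h[1:])                # face depth (8) on a flat bed
    A = -g * hf * (h[1:] - h[:-1]) / dx           # driving term, cf. (10)
    B = g * n**2 / np.maximum(hf, 1e-9)**(7.0 / 3.0)
    pin = (p[1:-1] + A * dt) / (1.0 + B * dt * np.abs(p[1:-1]))   # Eq. (14)
    p[1:-1] = np.where(hf > 0.01, pin, 0.0)       # block flow into dry cells
    h += dt * (p[:-1] - p[1:]) / dx               # 1-D continuity (5)

print("wetting front near x =", dx * np.argmin(h > 0.01), "m")
```

After 600 s the wetting front has advanced into the initially dry channel while staying well inside the domain, qualitatively reproducing the advancing interface of the benchmark.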

Fig. 1. Computed and analytical water surface profiles (analytical solution, semi-implicit method, exponential method, and fully-implicit method) at times 1200 s, 3600 s, and 6000 s.


Table 1. l1 error of the water depth for each numerical method.

Time (s)  Semi-implicit  Exponential  Fully-implicit
1200      0.076345       0.07593      0.07655
3600      0.108715       0.10826      0.10894
6000      0.128035       0.12756      0.12827

4 Application to a Large Shallow Lake

4.1 Study Area

The study area of the model application here is the Tonle Sap Lake in South-East Asia (Fig. 2). The explanation of the lake follows that of Tanaka and Yoshioka [11]. The Tonle Sap Lake is the largest freshwater lake in South-East Asia, which is under a bimodal climate having clear dry and rainy seasons each year. The lake receives river water from 11 tributary rivers, and its water pours into the Tonle Sap River and then the mainstream of the Mekong River in the dry season (from December to May). On the other hand, in the rainy season, the water flow in the Tonle Sap River is reversed due to the larger discharge from the Mekong River, causing significant backwater effects, which are indispensable for maintaining the ecosystem and fisheries in and around the lake.

Fig. 2. Map of the Tonle Sap Lake. The colored area is the central part of the computational domain. (Color figure online)



According to Kummu et al. [12], the annual backwater inflow from the Mekong River and the outflow to the river are 50.3% of the annual total inflow and 84.0% of the annual total outflow, respectively. This estimation result means that 33.7% of the total inflow into the lake pours into the lower Mekong River. Another outflow source from the lake is the evapotranspiration, which accounts for 13.0% of the annual total outflow on average. Rainfall supplies 12.4% of the water inflowing to the lake, and 34.1% of the inflow water is provided by tributary river flows. The sum of the rainfall and evapotranspiration is mostly balanced with the net outflow from the lake and the tributary river inflow.

4.2 Computational Conditions

The computational domain is uniformly discretized into cells, each of which is a square having a side length of 500 (m), unless otherwise specified. The bed elevation data for the Tonle Sap River were embedded along its flow direction to efficiently capture the waterway. The temporal increment is set according to the CFL condition with a safety factor of 0.7. Different spatial resolutions are examined in a later sub-section. The inflow boundary conditions from the tributaries are based on the established hydrological model GBHM [13] and available measured hydraulic data. The source term in the continuity Eq. (1) is based on available hydrological and meteorological data. The outflow boundary condition, which is reversed during rainy seasons, is specified based on the computational result of the 1-D commercial hydraulic software Mike 11, which has already been implemented in hydraulic simulation of the lower Mekong River [14]. Although the wind shear acting on the water surface of the lake may not be negligible, it is not considered in this paper. This is because the available data for estimating the direction and magnitude of the wind shear are very sparse compared with the other meteorological and hydrological data. In addition, we found, based on preliminary computational experiments, that the wind shear does not critically affect the water surface elevation and the outflow discharge at the inlet of the Tonle Sap Lake, both of which are focused on below. It should be noted that this does not mean that the wind is not of importance, since local flow conditions would be driven by the wind shear when the inflow and outflow discharges are small, such as during dry seasons.
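For instance, with the deepest cell governing the limit, the time step follows from the CFL condition (30) with the safety factor of 0.7 (the depth values below are purely illustrative, not data from the study):

```python
import numpy as np

g, dx, safety = 9.81, 500.0, 0.7       # gravity, cell size, safety factor
h_sample = np.array([0.5, 2.0, 8.5])   # hypothetical cell depths in metres
dt = safety * dx / np.sqrt(g * h_sample.max())
print(f"allowable time step ~ {dt:.1f} s")
```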

4.3 Computational Results

The computed and observed time series of the water surface elevation at the Kampong Luong station are shown in Fig. 3. The results of all the numerical methods are in good agreement with the observed ones. Among the three methods, the exponential and fully-implicit methods show slightly better agreement than the semi-implicit method in the dry season, during which the water depth is small and the friction term becomes more dominant than the other terms. Figure 4 compares the outlet discharge from the simulation domain for each numerical method. All the methods give results with fluctuations, possibly due to the lack of detailed 2-D river bed elevation data. Nevertheless, the overall results are similar among the methods. In summary, the overall performances of the methods for simulating large-scale lake flow dynamics do not depend significantly on the selection of the numerical method. The theoretical computational efficiency of the schemes is the same since they are essentially explicit. However, in our computation, the fully-implicit and exponential schemes require 1.2 times longer computational time than the semi-implicit one due to many “if” statements.

Fig. 3. Comparison of the computed and observed water surface elevation at the Kampong Luong station.

Fig. 4. Comparison of the computed and reference discharges at the Prek Kdam station. The reference discharge is calculated with the Mike 11 software.

Model performance at different spatial resolutions is verified for 1000 (m) resolution, 2000 (m) resolution, and the original resolution of 500 (m). Topographic data at the corresponding spatial resolutions were arranged by upscaling the data at 500 (m) resolution with the embedded river bed elevation of the Tonle Sap River. The computed water surface elevations at the Kampong Luong station and the outlet discharge are plotted in Figs. 5 and 6, respectively. As shown in Fig. 5, the simulated water surface elevation with the semi-implicit method is not significantly different among the different spatial resolutions, which has been checked to be true for the other methods as well. This indicates that the surface water gradient in this large lake is small. On the other hand, the outlet discharges are largely different from each other. Even though fluctuation is included in all the cases, the model at 2000 (m) resolution shows extreme values at both the peak of the backwater flow from the Mekong River in the rainy season and the drainage flow from the Tonle Sap River in the dry season. This is because, at 2000 (m) resolution, the river bed slope is evaluated as much larger than that at 500 (m) or at 1000 (m), and thus the momentum transport easily increases in both cases. Although not examined here, this problem could be addressed by locally refining the spatial resolution along the Tonle Sap River with a small increment of computation time. This strategy would be possible owing to the much smaller domain of the rivers than that of the whole lake.

Fig. 5. Comparison of the computed and observed water surface elevation at the Kampong Luong station. The semi-implicit method is used for computation.

Fig. 6. Comparison of the computed and reference discharges at the Prek Kdam station. The semi-implicit method is used for computation.



A further discussion on computational efficiency is provided. Compared with the 500 (m) resolution, the number of cells at 2000 (m) resolution is much smaller, namely 1/16 of that at 500 (m) resolution. Consequently, the allowable maximum time step at 2000 (m) resolution is almost four times larger than that at 500 (m) resolution. In total, the computation at 2000 (m) resolution is 64 times faster than that at 500 (m) resolution. Therefore, if we focus on capturing the overall behavior of the lake, running the model at 2000 (m) resolution would be the most efficient strategy for practical use.
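The cost ratio follows from simple scaling (a back-of-the-envelope check, not a measured benchmark):

```python
factor = 2000 / 500        # coarsening ratio per spatial direction
cell_ratio = factor ** 2   # 2-D grid: 16 times fewer cells
dt_ratio = factor          # CFL condition (30): ~4 times larger time step
speedup = cell_ratio * dt_ratio
print(int(cell_ratio), int(dt_ratio), int(speedup))   # -> 16 4 64
```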

5 Conclusions

Stability and accuracy of the three numerical methods for the discretization of the friction slope terms of the LIE were examined. The theoretical analysis results indicated that they have the same level of complexity. The results also demonstrated that the fully-implicit method is consistent under the coarse limit, while the others are not. The computational results for the benchmark test problem showed that the three numerical methods have comparable computational accuracy, with little difference from each other. Application of the three methods to numerical simulation of flooding and drying of the Tonle Sap Lake demonstrated that all the methods can handle the complex surface water dynamics. It was found that the exponential method gives the most oscillatory results, which may be due to its violation of the consistency under the coarse limit. The semi-implicit method has the least computational cost, while the fully-implicit method has the theoretical advantage of consistency for coarse model resolution. Dependence of the resulting surface water dynamics on the computational resolution was finally examined, demonstrating that the spatial resolution of 500 (m) is sufficiently fine and that even the resolution of 2000 (m) would be enough for tracking water surface elevation changes. We showed theoretical and practical advantages of the fully-implicit method over the semi-implicit and exponential methods. However, this situation may change if the exponential method is improved. Recently, we found that the exponential method may be improved if the ODE (19) is replaced by the quadratically nonlinear counterpart

$$\frac{\partial p}{\partial t} + A + B_0\, p|p| = 0 \quad\text{with}\quad B_0 = \frac{gn^2}{h^{7/3}}, \qquad (31)$$


which can also be solved analytically without iterative methods, although the resulting formula would be more complicated. The goal of our research is not simply simulating surface water dynamics, but also tracking the associated transport phenomena, such as the transport and resuspension of sediment particles, nutrients, and pathogens, which have been recognized as critical factors affecting the water quality of Tonle Sap Lake and its floodplains [15, 16]. Appropriate transport equations should be coupled with the 2-D LIE for simulating such phenomena. Discretization of the transport equations must be carried out so that the computational efficiency of the 2-D LIE is least affected. The above-mentioned topics are currently being undertaken by the authors.



Acknowledgments. This research was funded by SATREPS project “Establishment of Environmental Conservation Platform of Tonle Sap Lake” and JSPS research grant No. 17K15345.

References
1. Miller, C.T., et al.: Numerical simulation of water resources problems: models, methods, and trends. Adv. Water Resour. 51, 405–437 (2013)
2. Bates, P.D., Horritt, M.S., Fewtrell, T.J.: A simple inertial formulation of the shallow water equations for efficient two-dimensional flood inundation modelling. J. Hydrol. 387(1–2), 33–45 (2012)
3. Hunter, N.M., Bates, P.D., Horritt, M.S., Wilson, M.D.: Simple spatially-distributed models for predicting flood inundation: a review. Geomorphology 90(3), 208–225 (2007)
4. Martins, R., Leandro, J., Djordjević, S.: A well balanced Roe scheme for the local inertial equations with an unstructured mesh. Adv. Water Resour. 83, 351–363 (2015)
5. Tanaka, T., Yoshioka, H.: Numerical stability analysis of the local inertial equation with semi- and fully implicit friction term treatments: assessment of the maximum allowable time step. J. Adv. Simul. Sci. Eng. 4(2), 162–175 (2018)
6. Yoshioka, H., Tanaka, T.: On mathematics of the 2-D local inertial model for flood simulation. In: Proceedings of the 2nd International Symposium on Conservation and Management of Tropical Lakes, Siem Reap, Cambodia, 24–26 August 2017, pp. 30–36 (2017)
7. Almeida, G.A., Bates, P.: Applicability of the local inertial approximation of the shallow water equations to flood modeling. Water Resour. Res. 49(8), 4833–4844 (2013)
8. Martins, R., Leandro, J., Chen, A.S., Djordjević, S.: A comparison of three dual drainage models: shallow water vs local inertial vs diffusive wave. J. Hydroinform. 19(3), 331–348 (2017)
9. Yamazaki, D., Almeida, G.A., Bates, P.D.: Improving computational efficiency in global river models by implementing the local inertial flow equation and a vector-based river network map. Water Resour. Res. 49(11), 7221–7235 (2013)
10. Godunov, S.K.: A difference scheme for numerical solution of discontinuous solution of hydrodynamic equations. Math. Sbornik 47, 271–306 (1969)
11. Tanaka, T., Yoshioka, H.: Applicability of the 2-D local inertial equations to long-term hydrodynamic simulation of the Tonle Sap Lake. In: Proceedings of the 2nd International Symposium on Conservation and Management of Tropical Lakes, Siem Reap, Cambodia, 24–26 August 2017, pp. 23–29 (2017)
12. Kummu, M., et al.: Water balance analysis for the Tonle Sap Lake-floodplain system. Hydrol. Process. 28(4), 1722–1733 (2014)
13. Yang, D., Oki, T., Herath, S., Musiake, K.: A hillslope-based hydrological model using catchment area and width functions. Hydrol. Sci. J. 47, 49–65 (2002)
14. Fujii, H., et al.: Hydrological roles of the Cambodian floodplain of the Mekong river. Int. J. River Basin Manag. 1(3), 253–266 (2013)
15. Kummu, M., Penny, D., Sarkkula, J., Koponen, J.: Sediment: curse or blessing for Tonle Sap Lake? AMBIO 37(3), 158–163 (2008)
16. Vanny, L., Jiwen, G., Seingheng, H.: Phnom Penh's municipal drinking water supply: water quality assessment. Sustain. Water Resour. Manag. 1(1), 27–39 (2015)

Development of the DRowning hUman Model (DRUM) Toward Evaluation of Performance of Lifejackets in Tsunamis

Daiki Ajima(B), Tatsuto Araki, and Takashi Nakamura

Tokyo Institute of Technology, 4259 Nagatsuta-cho Midori-ku, Yokohama, Kanagawa 226-8503, Japan
[email protected]

Abstract. We developed a new numerical simulation model considering both unsteady water currents and the unsteady movement of a human body, toward the evaluation of lifejackets as a measure against drowning in tsunamis. A combination of a multiphase fluid solver (the CIP-CUP method) and a representation of the human body (the link segment model) enables the developed model to simulate interactions between the fluid and human bodies. To validate the performance of the developed model, we reproduced an experiment by Kurisu et al. (2018) in which a manikin was swept down and caught by water currents mimicking tsunamis. Consequently, the developed model represented both the water currents and the movement of the human body well.

Keywords: Biofluid mechanics · Drowning · Tsunami · Lifejacket · Human body dynamics · CIP-CUP scheme · Link segmentation model · Multiphase flow analysis


1 Introduction

To date, a great many people have been killed by tsunamis. In the Great East Japan Earthquake (2011), approximately 91% of more than 15,000 deaths were from drowning in tsunamis [1]. Under such circumstances, the effect of lifejackets is attracting attention as a secondary measure after evacuation to higher ground, because lifejackets have the potential to keep the wearer afloat at the water surface in tsunamis. However, ISO-12402 [2] and other current standards for lifejackets require examinations only in still water, even though people are exposed to violent and three-dimensional currents in the case of tsunamis. As one of the few precedents evaluating the performance of lifejackets in tsunami-like currents, Kurisu et al. [3] experimentally compared the movement of a manikin with and without a lifejacket in water currents. However, a quantitative evaluation of the performance of lifejackets still remains to be done, although it is necessary in order to design and establish lifejackets as a valid measure against tsunamis. To achieve such an evaluation, numerical analyses are more effective, because experimental analyses of water currents are not easy at a large scale. Conventional simulation models representing the underwater movement of human bodies have been developed mainly for the analysis and optimization of specific swimming motions in still water or steady currents [4,5]. By contrast, tsunamis have strongly unsteady currents, and people are forced into unsteady movement in them. Therefore, a numerical model which supports direct computation of the fluid, representation of the unsteady movement of a human body, and their interactions is essential for the above-mentioned evaluation, although such a model had not yet been realized. In view of this situation, we developed a new numerical simulation model named the DRowning hUman Model (DRUM), which represents the interactions of unsteady flow and human-body movement. In this paper, the validity of the developed model was evaluated by reproducing an experiment [3].

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 466–476, 2018.

2 Numerical Simulation Model

2.1 Computation of Fluid

The developed model computes the fluid and the human-body segments separately, while considering their mutual interactions. The fluid is computed with a multiphase flow solver: the Constrained Interpolation Profile - Combined Unified Procedure (CIP-CUP) method [6]. To represent multiphase flow, the developed model introduces the volume ratio of air φ on every computational mesh (Fig. 1). The interaction from segments to fluid will be discussed in Sect. 2.3. The water surface is traced by computing the advection of φ with

Dφ/Dt = 0.   (1)


Then, the flow velocity u = (ux, uy, uz), the pressure p, and the densities of air and water ρair, ρwater are solved with fundamental equations common to all phases:

Du/Dt = −(1/ρ)∇p + ν∇²u − g,   (2)

Dp/Dt = −C²ρ (∇ · u), and   (3)

Dρl/Dt = −ρl (∇ · u)  (l = air, water)   (4)

where ν is the kinematic viscosity, g is the gravitational acceleration, C is the speed of sound, and ρ is the density. The values of C and ρ are weighted averages over air, water, and the segments of the human-body model based on volume ratios. Each of these equations is divided into advection, viscosity, and pressure terms based on the fractional-step method [7]. Turbulence is represented by a Smagorinsky-type LES where the Smagorinsky coefficient is Cs = 0.15.
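As a minimal illustration of the surface tracking Dφ/Dt = 0, the sketch below advects a sharp air/water interface on a 1-D grid. Note that this uses first-order upwind differencing as a deliberately simple stand-in for the CIP advection, and the grid size and velocity are arbitrary assumed values.

```python
def advect_phi(phi, u, dx, dt):
    """Advance the air volume ratio phi one step of
    d(phi)/dt + u*d(phi)/dx = 0 with first-order upwind
    differences (u > 0 assumed for brevity; phi[0] is inflow)."""
    new = phi[:]
    for i in range(1, len(phi)):
        new[i] = phi[i] - u * dt / dx * (phi[i] - phi[i - 1])
    return new

# a sharp air/water interface transported to the right at CFL 0.5
phi = [1.0] * 5 + [0.0] * 5
for _ in range(10):
    phi = advect_phi(phi, u=1.0, dx=1.0, dt=0.5)
```

The upwind scheme is monotone at this CFL number, so the interface stays bounded in [0, 1] while it moves, at the cost of numerical smearing that higher-order CIP interpolation is designed to avoid.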




2.2 Computation of Human-Body Segments

A human body is represented as rigid-body segments connected at joints. This idea is based on the link segment model adopted in many studies, especially in sports engineering [4,8]. We obtained a human-body shape from an anthropometric database [9] (a 23-year-old male, 1.68 m height, 61.4 kg weight, data ID: F020) and represented it with cubic voxels of 5 × 10⁻³ m edges. Then, as shown in Fig. 2, it was divided into 10 segments (ID number: s = 1, 2, ..., 10) connected at 9 joints (ID number: j = 1, 2, ..., 9) to compose the human-body model.
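The bookkeeping behind such a link segment model can be sketched with simple containers. The three-segment chain below is an illustrative placeholder for the 10-segment, 9-joint body; all masses, inertias, and limits are made-up values, not the database entries used in DRUM.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """A rigid body: id, mass, principal moments of inertia, and state."""
    sid: int
    mass: float
    inertia: tuple                       # (Ixx, Iyy, Izz) in principal axes
    v: list = field(default_factory=lambda: [0.0, 0.0, 0.0])      # translational velocity
    omega: list = field(default_factory=lambda: [0.0, 0.0, 0.0])  # angular velocity

@dataclass
class Joint:
    """A connection between two segments with a movable extent [lo, hi] (rad)."""
    jid: int
    seg_a: int
    seg_b: int
    limit: tuple = (-1.0, 1.0)

# a toy 3-segment, 2-joint chain (e.g. trunk - upper leg - lower leg)
segments = [Segment(s, mass=5.0, inertia=(0.1, 0.1, 0.02)) for s in range(3)]
joints = [Joint(0, 0, 1), Joint(1, 1, 2)]

def joints_on(segment_id, joints):
    """All joints attached to a segment (the 'j on s' set in Eqs. (5)-(6))."""
    return [j for j in joints if segment_id in (j.seg_a, j.seg_b)]
```

The `joints_on` helper is the lookup needed to evaluate the per-segment sums over connected joints in the equations of motion below.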

Fig. 1. Representation of multiphase fluid with volume ratio of air φ.

Fig. 2. Segments (black letters) and joints (blue letters) of a simulative human body. (Color figure online)

As shown in Fig. 3, when the s-th segment is connected with other segment(s) at the j-th joint(s), its translation and its rotation around the center of mass are represented with the Newton and Euler equations:

Ms v̇s = Fs|fluid + Σ(j on s) Fs|j − Ms g and   (5)

Is ω̇s + ω̃s Is ωs = Ts|fluid + Σ(j on s) (r̃sj Fs|j + Ts|j)   (6)

where Eq. (5) is defined in the stationary axes and Eq. (6) is defined in the s-th segment's principal axes of inertia. Ms and Is are the s-th segment's mass and principal moment of inertia. vs and ωs are the translational and angular velocity vectors of the s-th segment. Fs|fluid and Ts|fluid are the force and torque which the s-th segment receives from the surrounding fluid; they are derived as the interaction from the fluid in the next section. Fs|j and Ts|j are the force and torque which the s-th segment receives from the j-th joint; they are derived from the requirements of the joints mentioned below. rsj is the displacement from the s-th segment's center of mass to the j-th joint. Σ(j on s) means the summation over all joints at which the s-th segment is connected with other segments. The operators ȧ and ãb respectively denote the time derivative and the vector product: ȧ = da/dt and ãb = a × b.


Fig. 3. Force and torque which a segment receives.


Fig. 4. A joint connecting a pair of segments.

Then, vs and ωs of the next time step are derived from temporal integration of Eqs. (5) and (6) with the Euler method. The values of Fs|j and Ts|j are determined from two conditions required at every joint: one is that a pair of segments always needs to stay connected with each other, and the other is that the joint angles between them need to stay within certain movable extents. As the former condition, when the s-th and s′-th segments are connected at the j-th joint (Fig. 4), the velocity of the j-th joint, vj, needs to coincide on both segments to keep the connection [8]:

vj = vs + ω̃s rsj = vs′ + ω̃s′ rs′j.   (7)

Then, by taking the temporal derivative of Eq. (7) and substituting Eqs. (5) and (6) into it, we get

v̇j = (Ms)⁻¹ (Fs|fluid + Fs|j) + ts (Ts|fluid + r̃sj Fs|j − ω̃s Is ωs)
   = (Ms′)⁻¹ (Fs′|fluid + Fs′|j) + ts′ (Ts′|fluid + r̃s′j Fs′|j − ω̃s′ Is′ ωs′)   (8)

where ts = −r̃sj AOs (Is)⁻¹ and AOs is the coordinate transformation matrix from the principal axes of the s-th segment to the stationary axes. As the latter condition, the angle of every joint is first defined as the rotation angle between orthogonal bases fixed on each segment. Then, the following restoration force is applied when a joint angle goes out of its given movable extent:

θ̈j = −c θ̇j − k (θj − θj|limit)   (9)

where θj − θj|limit is the amount by which θj goes out of its movable extent, and the coefficients c and k are fixed to keep critical damping as c = 1/Δt and k = c²/4. In this case, to estimate Fs|j and Ts|j, Eq. (9) needs to be solved simultaneously with Eq. (8). Incidentally, when θj is within its movable extent, the resistance against movement of the joint is negligible [10]. In this case, Ts|j is regarded as zero and Fs|j is derived from Eq. (8) alone.
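The restoration law with c = 1/Δt and k = c²/4 is critically damped, so an out-of-range joint angle relaxes back to its limit without oscillation. A minimal explicit-Euler sketch follows; all numbers are illustrative, and the sign convention assumes the force restores the angle toward the limit.

```python
def restore_angle(theta, theta_dot, theta_limit, dt, steps):
    """Integrate theta'' = -c*theta' - k*(theta - theta_limit)
    with c = 1/dt and k = c**2/4 (critical damping), explicit Euler."""
    c = 1.0 / dt
    k = c * c / 4.0
    for _ in range(steps):
        acc = -c * theta_dot - k * (theta - theta_limit)
        theta_dot += acc * dt
        theta += theta_dot * dt
    return theta

# a joint 0.3 rad beyond its movable extent relaxes back to the limit
theta_end = restore_angle(theta=1.3, theta_dot=0.0, theta_limit=1.0,
                          dt=0.01, steps=2000)
```

With these coefficients and this step size, the overshoot term cancels exactly each step and the excess angle decays geometrically, so the joint never swings past its limit.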




2.3 Interactions Between Fluid and Human Body

Interaction from the human-body segments to the fluid is considered by applying the segments' velocities as a boundary condition on the flow velocity of every computational mesh. First, the volume fraction of human-body segments φs is set for every mesh based on the locations of the segments. When a computational mesh contains the s-th segment, its velocity at the center of this mesh is derived as

us = vs + ω̃s rsP   (10)

where rsP is the displacement from the s-th segment's center of mass to the center of the mesh. Then, the flow velocity of the mesh is modified with us as

u** = (1 − Σ(s in mesh) φs) u* + Σ(s in mesh) φs us   (11)

where u** is the modified flow velocity, u* is the flow velocity derived in the computation of advection, and Σ(s in mesh) means the summation over all segments contained in the mesh. As the interaction from the fluid to the human-body segments, the force Fs|fluid and torque Ts|fluid from the fluid in Eqs. (5) and (6) are estimated for each segment. Those values are derived by integrating the pressure gradient ∇p and the frictional stress τ respectively as

Fs|fluid = Σ(meshes) (∇p + τ) φs ΔV and   (12)

Ts|fluid = Σ(meshes) (r̃sP (∇p + τ)) φs ΔV   (13)

where Σ(meshes) means the summation over all meshes which contain the s-th segment, τ = ρν∇²u, and ΔV = ΔxΔyΔz is the volume of each mesh.
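Equations (12) and (13) reduce to weighted sums over the meshes that a segment overlaps. A sketch with hypothetical per-mesh arrays follows (NumPy used only for brevity; the array names are assumptions):

```python
import numpy as np

def fluid_force_torque(grad_p, tau, phi_s, r_sp, dV):
    """Sum (grad_p + tau)*phi_s*dV over meshes (Eq. 12) and the
    corresponding torque with lever arm r_sp (Eq. 13).
    grad_p, tau, r_sp: (n_mesh, 3) arrays; phi_s: (n_mesh,) fractions."""
    stress = (grad_p + tau) * phi_s[:, None] * dV   # per-mesh contribution
    force = stress.sum(axis=0)
    torque = np.cross(r_sp, stress).sum(axis=0)     # r x F per mesh, summed
    return force, torque
```

For instance, a stress along +x acting at a lever arm along +y yields a torque along −z, as r × F dictates.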

3 Model Validation

3.1 Outline of the Referred Experiment

Here, we confirmed the validity of the developed model by reproducing an experiment [3], which observed a manikin in water currents mimicking tsunamis. Figure 5 shows sketches of the experimental flume: the Large Hydro-Geo Flume (LHGF) in the Port and Airport Research Institute, Yokosuka, Japan. In every experimental case, an isolated wave was generated by and traveled from a wave-generating apparatus installed at the upstream end of the flume. Meanwhile, a manikin provided by Simulaids® (Water Rescue Manikin, item number 1328) was laid face up on concrete blocks in the downstream area. The manikin had a height of 164.5 cm, a weight in water of 2.2 kgf (21.6 N), and almost the same body shape as the human-body model in the developed model. The movement of this manikin was filmed through windows. Besides, the temporal variations of the water level and of the flow velocities in the x and z directions were measured with water gauges (WG1, WG2, and WG3 in Fig. 5) and electromagnetic velocity meters (EV1 and EV2).



A side view of whole flume

A top view around the manikin

A side view around the manikin

Fig. 5. Experimental flume. Locations of water gauges (WG1, WG2, and WG3) and electromagnetic velocity meters (EV1 and EV2) are shown.


3.2 Simulative Conditions

The shape of the LHGF was reproduced as the computational domain, where a no-slip boundary condition is imposed on its floors. A non-uniform orthogonal mesh system was applied: the mesh size in the mainstream (x) direction (Δx) was uniformly 3 cm in the area where the human body moves, while the outer meshes grew with a constant ratio up to 3.7 m. The mesh size in the transverse (y) and vertical (z) directions (Δy and Δz) was uniformly 3 cm in the whole domain. As an initial condition, an incident sine wave with 0.44 m amplitude and 48.2 m wavelength was placed. The human-body model was located above the blocks with the same posture as the manikin and was held fixed until the wave reached it (t = 10 s). The density of every segment was determined so that the total weight in water matched that of the manikin and so as to represent the experimental situation in which both hands of the manikin floated in still water. The movable extents of every joint were based on anthropometric data [11], although some components were set narrower than the reference to fit the manikin. Standard values at 20 ℃ were adopted for the densities and kinematic viscosities of air and water. The temporal increment of every computational step, Δt, was determined so that the CFL number stayed at 0.5 or less and the variation of every joint angle stayed at 0.1 rad or less per Δt.
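The two step-size constraints above can be combined by taking the minimum admissible increment. A sketch follows; the limit values are from the text, while the velocity scales are hypothetical:

```python
def choose_dt(u_max, dx, omega_max, cfl_limit=0.5, dtheta_limit=0.1):
    """Pick dt so that u_max*dt/dx <= cfl_limit (CFL condition) and
    omega_max*dt <= dtheta_limit (joint-angle change per step)."""
    dt_cfl = cfl_limit * dx / u_max
    dt_joint = dtheta_limit / omega_max
    return min(dt_cfl, dt_joint)
```

Depending on whether the flow or the body rotation is faster at a given instant, either constraint may be the binding one, so the increment is re-evaluated every step.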

3.3 Simulative Results

Figure 6 shows the variation of the water level observed at WG2 and WG3 ((a) and (b)), and the variation of the flow velocity in the x and z directions observed at EV1 ((c) and (d))



Fig. 6. (a)(b) water level, (c)(d) x-directional velocity, and (e)(f) z-directional velocity observed at the water gauges and the electromagnetic velocity meters.

and EV2 ((e) and (f)). Here, gray solid lines show the average of the experimental cases and dashed lines show the range of the average ±3σ, where σ is the sample standard deviation. Besides, Figs. 7 and 8 show snapshots of the water surface and of the flow on some cross sections. After the wave passed and detached from the blocks, a vortex was generated behind them (t = 12 s). Then, the wave broke at approximately x = 1.5 m (t = 13 s), and the vortex spread and weakened (t ≥ 14 s). In Fig. 7, the simulated and experimental shapes of the water surface corresponded well, and thus the amplitude and period of the waves were similar (Figs. 6(a) and (b)). Meanwhile, in Fig. 6, the water level and velocity are roughly within the range of ±3σ. The differences from the average arise because localized and complex flows were generated, for reasons including the breaking of the wave. Figures 7 and 8 also show the simulated movement of the human-body model. The dashed squares in Fig. 7(a) represent the approximate scope observed in the experiments (Fig. 7(b)). Immediately after the wave reached the blocks, the whole human body was swept down and began falling from the blocks (t = 11–12 s). At this time, the whole body spun and fell face down (t = 12 s) because the flow crept under the body. Then, while the upper part of the body was pushed back to near the blocks (t = 13–14 s), the legs were raised to around the top of the blocks (z = −0.3 m) along an edge of the vortex (t = 14 s). Afterward, the legs were swept away by the detached flow while the upper part and the arms were also raised (t = 17 s in the simulation, t = 14 s in the experiment). Finally,



Fig. 7. Side-view snapshots of (a) simulative and (b) experimental flow and human body. Dotted squares in (a) are the scope of (b).



Fig. 8. Top-view snapshots of (a) simulative and (b) experimental flow and human body.



Fig. 9. Tracks of the whole body and the head. The experimental result approximately below z = −1.0 m was out of frame.

the upper part of the body was also swept away to approximately x = 2.0 m (t = 19 s in the simulation, t = 15 s in the experiment). In every scene of the above-mentioned movement, the developed model reproduced the postures and locations of the manikin well, although the simulated speed of the human-body model became slower after it reached the bottom. Next, we compared the simulated and experimental tracks of the head in Fig. 9. Here, the simulated track of the center of mass of the whole body is also shown; its experimental track, however, was not available. The experimental track approximately below z = −1.0 m is missing because the movement of the manikin was out of frame there. Both simulated tracks in Fig. 9 represent the rotational movement of the human body along the vortex, similarly to Fig. 7. Besides, the simulated track of the head corresponded precisely with the experimental one, although the plots showing the location at every second no longer matched after t = 13 s. The difference in the speed of the human body is thought to arise because some segments entered the area regarded as floor (t = 14–15 s in Fig. 7(a)) and were slowed down by the process applying the no-slip boundary conditions. Nevertheless, this is considered unimportant for the evaluation of lifejackets in tsunamis, because what matters is whether a human body can stay afloat, and the movement of human bodies in areas near the floor is irrelevant to that. To sum up the above-mentioned results, the developed model represented the water currents generally within the range of experimental error. As for the movement of the human body, its location and posture were reproduced precisely in every scene. Therefore, the developed model is considered adequate for evaluating the movement of human bodies and the performance of lifejackets in tsunamis.





4 Conclusions

We developed a new simulation model to evaluate the movement of a human body and the performance of lifejackets in tsunamis, based on a combination of the CIP-CUP method and the link segmentation model. By reproducing an experiment [3], it was confirmed that the developed model is able to represent the unsteady movement of a human body in unsteady water currents well. In this model, the buoyancy of a lifejacket can be given to the human body by attaching additional segments or simply by modifying the density of the trunk. Thus, the evaluation of lifejackets is realized by comparing their buoyancy and the forces received in simulated water currents at the actual scale of tsunamis. Toward such an evaluation, as our future tasks, it is necessary to reproduce the situation of a tsunami and a human body based on appropriate flood simulations and anthropometric data. Furthermore, the developed model is applicable not only to tsunamis but to various drowning situations, by prescribing the movement of the human body through the driving of its joints.

Acknowledgement. This study was partly supported by a Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Scientific Research (KAKENHI) (B) (16H03147).

References
1. National Police Agency of Japan: The great east Japan earthquake and the police. The Focus, vol. 281 (2012). (in Japanese)
2. International Organization for Standardization: Personal flotation devices - Part 9: Test methods. ISO 12402-9:2006 (2006)
3. Kurisu, A., Suga, H., Prochazka, Z., Suzuki, K., Oguri, K., Inoue, T.: Potential technique for improving the survival of victims of tsunamis. PLoS One 13(5), e0197498 (2018)
4. Nakashima, M., Satou, K., Miura, Y.: Development of swimming human simulation model considering rigid body dynamics and unsteady fluid force for whole body. J. Fluid Sci. Technol. 2(1), 56–67 (2007)
5. Mizuno, N., Yamakawa, M.: Numerical simulation of flow around human in underwater dolphin kick swimming. Trans. JSME 83(845), 16-00049 (2017). (in Japanese)
6. Yabe, T., Wang, P.Y.: Unified numerical procedure for compressible and incompressible fluid. J. Phys. Soc. Jpn. 60, 2105–2108 (1991)
7. Kim, J., Moin, P.: Application of a fractional-step method to incompressible Navier-Stokes equations. J. Comput. Phys. 59(2), 308–323 (1985)
8. Fujii, N., Ae, M., Miyashita, K.: Simulation system of human movement based on link segment model and its application to sport movements. Bull. Health Sport Sci.: Univ. Tsukuba 18, 117–126 (1995). (in Japanese)
9. Kouchi, M., Mochimaru, M.: AIST/HQL human dimension database. National Institute of Advanced Industrial Science and Technology H18PRO-503 (2006). (in Japanese)
10. Aoki, K., Yamazaki, N.: The role of joint resistance by passive tissues in human bipedal walking. Biomech. 14, 59–68 (1998). (in Japanese)
11. Nakamura, R., Saito, H., Nagasaki, H.: Fundamental Kinesiology, 6th edn. Ishiyaku Publishers Inc. (2016). (in Japanese)

Illumination Recovery for Realistic Fluid Re-simulation

Hongyan Quan1, Zilong Song2, Xinquan Zhou3, Shishan Xue1, and Changbo Wang1

1 The School of Computer Science and Software Engineering, East China Normal University, Shanghai, China
[email protected]
2 Harbin No. 1 High School, Harbin, China
3 The College of Business, City University of Hong Kong, Hong Kong, China

Abstract. Previous studies in fluid re-simulation have been devoted to reducing computational complexity, and little attention has been paid to realism. This paper presents a linear approach to estimate illumination from video examples for coherent photorealistic re-simulation. Compared with previous studies of light detection, it couples the reconstructed fluid geometry with the surface appearance and linearly estimates the illumination parameters, which avoids the much higher computational cost of tedious optimization. The parameters of the Blinn-Phong shading model (BSM) are recovered hierarchically. Based on fitting the ambient and diffuse components from the particles with lower intensities, the reflectance can be clustered from the observations of the high-intensity surface particles. We demonstrate the effectiveness of both steps by extensive quantitative and qualitative evaluation through relighting of the fluid surface from ground-truth fluid video, as well as from re-simulation. Photorealistic, coherently illuminated visual effects consistent with the fluid surface geometry are obtained.

Keywords: Illumination · Blinn-Phong model · Reflectance · Fluid re-simulation

1 Introduction

Illumination recovery is one of the major challenges and an urgently needed technique for producing naturalistic derivatives with a realistic and coherently illuminated appearance in applications of realistic scene generation, such as fluid re-simulation, virtual reality environment construction, computer games, film making, and military simulation. Illumination recovery, known as an inverse physics problem, aims at recovering the illumination in a scene from the observed appearance. Recovering reflectance and illumination from natural fluid video is a particularly hard task. A dynamic surface exhibits greatly varied reflectance behavior because of its complex surface orientation and the under-constrained relation between shape, reflectance, and illumination provided by a natural 2-D image, which makes the inverse optics task extremely challenging and ill-posed.

© Springer Nature Singapore Pte Ltd. 2018 L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 477–487, 2018.



Despite many difficulties, pioneering works have been devoted to this field. Land first provided the highly influential retinex theory [1]; on this basis, Horn proposed a shading and reflectance decomposition approach based on prior knowledge of sharp edges and smooth variation [2]. Furthermore, a variety of strategies for illumination recovery for intrinsic images were proposed successively, including intrinsic-characteristics-based methods [3, 4], local-texture-based cues [5], a ground-truth dataset with baseline evaluations [6], and a local-global sparse representation of reflectance [7]. Besides, light detection techniques based on BRDFs vary according to the common models, such as the Torrance–Sparrow model [8] and Phong models [9]. Some recent advances have separated the factored images into a product of reflectance and illumination [10–12]; photometric stereo [13] is the typical one. Meanwhile, Stephen et al. [14] proposed a Bayesian framework for joint reflectance and illumination estimation. Nayar et al. [15] presented sinusoidal patterns to recover scene illumination. For the existing methods, it is hard to acquire satisfactory results through fully automatic control. Furthermore, some works take the detected color as a clue to achieve separation; however, requiring more material properties increases the complexity of the study. To solve this problem, the work of Thiago et al. [16] provides a semi-automatic and non-iterative approach that uses surface normals to calculate the parameters of the components. Unfortunately, the regions with higher intensity need to be specified beforehand. Inspired by the classic framework [16], we provide the BSM estimation, a fully automatic method without any user interaction, which combines photometric information and the fluid surface shape to calculate the illumination parameters. We couple the recovered fluid geometry with its appearance to linearly recover the illumination of a natural scene.

Compared to the previous studies, our recovery algorithm provides a couple of advantages:
• A linear solver is provided for illumination recovery from a single fluid frame in a video. A key highlight of this work is that we deal with the problem of reflectance and illumination estimation with a linear strategy, avoiding complex optimization.
• A hierarchical strategy is studied for estimating the BSM. Based on linear fitting of the diffuse and ambient components, the specular reflection can be easily acquired.
• Geometry and photometric visual appearance are treated consistently. During illumination recovery, the surface geometry is coupled into the estimation.

The rest of this paper is organized as follows: In Sect. 2, a systemic framework of our approach is provided. Section 3 describes the details of the BSM estimation; implementations as well as qualitative analysis are provided in Sect. 4. Finally, we draw conclusions in Sect. 5 with some discussion of future work.

2 System Framework

Figure 1 shows the pipeline of our scheme, which takes a single fluid image as input and outputs coherently illuminated results. In this pipeline, it is necessary to construct normalized surface geometry for the BSM estimation, and normalized coordinates are employed to calculate the illumination model. When an example frame is input, the normalized geometry (see Fig. 1(b)) is calculated from [17]. Steps (c) to (e) of Fig. 1 are the main steps of the BSM estimation. We calculate the parameters of the Blinn-Phong shading model hierarchically. The ambient and diffuse components are linearly fitted from selected particles with lower intensities (see Fig. 1(c)); furthermore, the specular exponents are calculated (see Fig. 1(e)) after refining the surface geometry. At last, the realistic relighting effect in fluid re-simulation is shown in Fig. 1(f).

Fig. 1. The pipeline of our schemes

3 BSM Estimation

The BSM estimation is based on the simplified version of the Blinn–Phong shading model [16], which is described as:

I_p = k_a i_a + k_d (L_m · N) i_{m,d} + k_s (R_m · V)^a i_{m,s}   (1)

where I_p is the observed intensity of a particle. For the RGB color model, the intensity specified in this formula is a vector with three components. In our work, we use the ambient component C_a to stand for the product of the ambient coefficient k_a and the ambient i_a, the diffuse component C_d for the product of the diffuse albedo k_d and i_{m,d}, and the specular component C_s for the product of the specular reflection albedo k_s and i_{m,s}. For a certain fluid scene, C_a, C_d, and C_s are concrete values, and each of them has three components. Then, the Blinn–Phong shading model can be expressed as:

I_p = C_a + C_d (L_m · N) + C_s (R_m · V)^a   (2)

where a is the reflection exponent, L_m · N denotes the scalar product of the unit vector L_m toward the light source and the unit surface normal vector N, and R_m · V denotes the scalar product of the unit vector R_m in the reflected direction and the unit vector V toward the viewer. It should be stressed that the above equation holds for each pixel and each color channel. For a certain scene, C_a, C_d, C_s, and a are the four parameters to be estimated.
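Per channel, model (2) is a three-term evaluation. The sketch below evaluates it for one particle; all coefficients and dot products are arbitrary illustrative values, not recovered parameters.

```python
def blinn_phong(Ca, Cd, Cs, LdotN, RdotV, a):
    """Per-channel intensity of Eq. (2):
    I = Ca + Cd*(L.N) + Cs*(R.V)**a, applied to each RGB channel."""
    return [ca + cd * LdotN + cs * (RdotV ** a)
            for ca, cd, cs in zip(Ca, Cd, Cs)]

# a sample particle: per-channel components, one shared exponent
I = blinn_phong(Ca=[0.1, 0.1, 0.12], Cd=[0.4, 0.5, 0.6],
                Cs=[0.3, 0.3, 0.3], LdotN=0.8, RdotV=0.9, a=20)
```

With a large exponent the specular term contributes only near mirror alignment (R·V close to 1), which is why the estimation below treats the low-intensity particles as effectively specular-free.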




3.1 Ambient and Diffuse Estimation

We estimate the ambient and diffuse components from the lower-intensity particles selected with a pre-specified threshold T_L; only two terms of the Blinn–Phong shading model are considered, written as

I_p = C_a + C_d (L_m · N)   (3)

We use the pre-calculated normalized geometry h_p [17] to calculate the term (L_m · N). We denote the angle between the surface normal N and the vertical direction as θ. Then, the angle between L_m and N is approximated as 2θ. Further, L_m · N can be estimated from h_p:

L_m · N = cos 2θ = 2 cos²θ − 1 = 2h_p² − 1   (4)

Then, the illumination model can be expressed as

C_a + (2h_p² − 1) C_d − I_p = 0   (5)

C_a and C_d can be fitted linearly using the selected particles. In our experiment, the threshold T_L is pre-calculated as T_L = (I_max − I_min) × 2/9 + I_min for more satisfactory results. Here I_max and I_min are the maximum and minimum intensities over all particles in the fluid scene, calculated for each channel separately. Combining the estimated ambient and diffuse components with the normalized geometry calculated from [17], the particle height is refined. Given that k_R, k_G, and k_B are the factors of the three different channels, we substitute k_R h_p, k_G h_p, and k_B h_p for h_p in (5) respectively, and thereby k_R, k_G, and k_B are acquired. We average k_R, k_G, and k_B of a certain particle to get the height factor k, and then the particle height is refined as k h_p. Then, more accurate results for the ambient and diffuse components are achieved. Thus the estimation of ambient, diffuse, and geometry is carried out alternately twice.
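The linear fit of (5) is ordinary least squares with the single regressor (2h_p² − 1). The sketch below fits C_a and C_d on synthetic particles generated from assumed true values (0.2 and 0.5), used only to check that the fit recovers them:

```python
import numpy as np

def fit_ambient_diffuse(h_p, I_p):
    """Least-squares fit of I_p = Ca + Cd*(2*h_p**2 - 1) over the
    selected low-intensity particles (Eq. 5), one channel at a time."""
    A = np.column_stack([np.ones_like(h_p), 2.0 * h_p**2 - 1.0])
    (Ca, Cd), *_ = np.linalg.lstsq(A, I_p, rcond=None)
    return Ca, Cd

# synthetic particles generated from known Ca = 0.2, Cd = 0.5
h = np.linspace(0.1, 0.9, 50)
I = 0.2 + 0.5 * (2.0 * h**2 - 1.0)
Ca, Cd = fit_ambient_diffuse(h, I)
```

On noise-free data the fit is exact; on real observations the residual absorbs sensor noise and any specular leakage that survives the low-intensity selection.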

3.2 Specular Reflection Exponent Estimation

We select particles with higher intensities and rich specular information to calculate reflection exponent a and reflection component Cs . We take the threshold T h ¼ ðI max  I min Þ  8=9 þ I min to select higher-intensity particles. To any two higherintensity particles with 3-channel observed intensities I 1p and I 2p , from (2) we have Cs ðRm1  V 1 Þa ¼ I 1p  Ca þ Cd ð2h2p1  1Þ


C_s (R_m2 · V_2)^a = I_p2 − [C_a + C_d(2h_p2² − 1)]


Illumination Recovery for Realistic Fluid Re-simulation


To reduce the computational complexity, we use (N · H) to approximate the term (R · V) [16], where H is the bisector direction between the view and light directions. Further, R_m · V is calculated from the particle height h_p, and we have

C_s h_p1^a = I_p1 − [C_a + C_d(2h_p1² − 1)]


C_s h_p2^a = I_p2 − [C_a + C_d(2h_p2² − 1)]


Then, combining (8) with (9), we derive the reflection exponent a:

a = lg(f_1/f_2) / lg(h_p1/h_p2)


where

f_1 = I_p1 − [C_a + C_d(2h_p1² − 1)]


f_2 = I_p2 − [C_a + C_d(2h_p2² − 1)]


and h_p1 and h_p2 are the heights of the two selected particles. Note that (10) holds for each channel, so the reflection exponent a is calculated by averaging a_R, a_G and a_B from the different channels. Because the reflection exponents obtained from different particle pairs may vary, the reflection exponents are clustered within each channel. For any two reflection exponents a_i and a_j from different particle pairs, the cluster criterion is defined as

|a_i − a_j| < T_a


where T_a is a pre-specified threshold, set to 0.1 across all tests. After clustering, the reflection exponents are automatically divided into classes, and we average the exponents in the class with the largest number of members.
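The pairwise exponent formula and the clustering step might be sketched as follows. The greedy one-pass clustering is our simplification of the paper's (unspecified) clustering procedure, and all names are ours.

```python
import math

def reflection_exponent(f1, f2, hp1, hp2):
    """Exponent from one particle pair: a = lg(f1/f2) / lg(hp1/hp2),
    where f_i = I_pi - C_a - C_d*(2*h_pi**2 - 1) is the specular residual."""
    return math.log10(f1 / f2) / math.log10(hp1 / hp2)

def cluster_and_average(exponents, t_a=0.1):
    """Group sorted exponents whose neighbor gap is below t_a, then return
    the mean of the most populated class, as in the paper's selection rule."""
    clusters = []
    for a in sorted(exponents):
        if clusters and abs(a - clusters[-1][-1]) < t_a:
            clusters[-1].append(a)
        else:
            clusters.append([a])
    largest = max(clusters, key=len)
    return sum(largest) / len(largest)

# If the residuals truly follow f = C_s * h_p**a, the pair formula recovers a.
a_pair = reflection_exponent(0.3 * 0.9 ** 8, 0.3 * 0.5 ** 8, 0.9, 0.5)
```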

3.3 Specular Component Calculation

We construct the particle set S_H from the higher-intensity particles; the reflection component can then be easily acquired from

C_s = (I_p − C_a − C_d(L_m · N)) / (R_m · V)^a


This holds for each channel, and we take the average over the separate channels as the final result. Based on the above, we summarize our BSM estimation procedure in Algorithm 1.




3.4 Relighting Based on BSM

Fluid surface relighting uses (15) to register the actual particle height h_R to the range of the example frame:

h_r = (h_R − h_R^n)(h_E^m − h_E^n)/(h_R^m − h_R^n) + h_E^n


where h_r is the registered height, h_R^m and h_R^n are the maximum and minimum heights of the actual advection, and h_E^m and h_E^n are the maximum and minimum heights of the example frame. Further, the RGB components are calculated as:



C_s = (I_p − C_a − C_d(L_m · N)) / (R_m · V)^a

I_p^R = C_a + C_d(L_m · N) + C_s(R_m · V)^a


where I_p^R denotes the relighting intensity; this formula holds for all three channels.
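Under the paper's approximation that the particle height h_p stands in for the shading dot products (L_m · N = 2h² − 1, and (R_m · V)^a ≈ h^a as in (8)–(9)), the height registration and per-channel relighting can be sketched as below; all names are hypothetical.

```python
def register_height(h, h_r_min, h_r_max, h_e_min, h_e_max):
    """Map an actual advected height into the example frame's range:
    h_r = (h - h_R^n) * (h_E^m - h_E^n) / (h_R^m - h_R^n) + h_E^n."""
    return (h - h_r_min) * (h_e_max - h_e_min) / (h_r_max - h_r_min) + h_e_min

def relight(c_a, c_d, c_s, alpha, h_r):
    """Relighting intensity per channel from the recovered parameters:
    I_p^R = C_a + C_d*(2*h_r**2 - 1) + C_s*h_r**alpha."""
    return c_a + c_d * (2.0 * h_r * h_r - 1.0) + c_s * h_r ** alpha
```

The registration is a plain linear rescale, so the extreme heights of the advected frame map exactly onto the extremes of the example frame.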

4 Results and Evaluation

We demonstrate the effectiveness of both schemes through extensive quantitative and qualitative evaluation. All results are based on real captured data from the publicly available DynTex dataset [18]. Our hardware platform is a PC with an Intel(R) Pentium(R) 2.67 GHz CPU and 8 GB of memory. The relighting results are rendered with the OpenGL library.

4.1 Recovered Illumination Results

In the quantitative study, BSM is tested and verified on several challenging fluids; we apply the algorithms to different types of fluid, including tides, calm water and waterfalls. Comparative studies are also provided for further performance and qualitative analysis. Figure 2 demonstrates some of the illuminated results with the recovered parameters. The left column shows the 2D sample frame, the color bars corresponding to the recovered illuminations are listed in the second column of each group, and the top-view and 3D-view illuminated effects are demonstrated in the third and fourth columns of each group. From the results, it can be seen that a realistic effect is achieved by the BSM-based inverse-optics procedure.

Fig. 2. Our recovered illumination results applied in relighting




4.2 Realistic Relighting Results from the Recovered Illumination

The illuminations and coherently realistic relighting effects from our proposed methods are shown in Fig. 3. The illumination bars of each group in the second column are recovered from the example frame in the left column. The relighting results on reconstructed surfaces are demonstrated in the third (front-view) and fourth (3D-view) columns, while the last column in each group shows the illuminated re-simulation derivatives. It can be clearly seen that coherently illuminated results are achieved.

Fig. 3. The recovered illuminations and coherently realistic relighting effects


4.3 Performance of Illumination Estimation

To compare our approach with the state of the art [16], the average error of illumination (AEI) E_a and the relative error of illumination (REI) E_r are calculated. The statistical results are listed in Table 1.

E_a = (0.299 Σ_{i=0}^{N−1} |R_i − R_i^r| + 0.587 Σ_{i=0}^{N−1} |G_i − G_i^r| + 0.114 Σ_{i=0}^{N−1} |B_i − B_i^r|) / N

where N is the total number of particles; R_i, G_i and B_i are the observations from the 2D example frame, while R_i^r, G_i^r and B_i^r are those from the surface relighting, respectively.

E_r = (0.299 Σ_{i=0}^{N−1} |R_i − R_i^r| + 0.587 Σ_{i=0}^{N−1} |G_i − G_i^r| + 0.114 Σ_{i=0}^{N−1} |B_i − B_i^r|) / (0.299 Σ_{i=0}^{N−1} R_i + 0.587 Σ_{i=0}^{N−1} G_i + 0.114 Σ_{i=0}^{N−1} B_i)
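The two error metrics can be computed directly from the formulas above; the function name and the tiny single-particle example are ours.

```python
def aei_rei(observed, relit):
    """AEI and REI between observed (R, G, B) triples from the 2D example
    frame and the relit triples, using the 0.299/0.587/0.114 luma weights."""
    w = (0.299, 0.587, 0.114)
    n = len(observed)
    # Weighted sum of absolute per-channel differences over all particles.
    abs_err = sum(w[c] * sum(abs(o[c] - r[c]) for o, r in zip(observed, relit))
                  for c in range(3))
    # Weighted sum of the observed intensities (REI denominator).
    total = sum(w[c] * sum(o[c] for o in observed) for c in range(3))
    return abs_err / n, abs_err / total
```

For a single particle whose red channel is off by 10, only the 0.299-weighted red term contributes, so AEI = 2.99 and REI = 2.99 / 100 = 0.0299.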
From the statistical results, we can see that the relighting errors of our strategy are smaller. The work [16] uses a least-squares solver on log-linearized equations for the specular reflection, which introduces more error into the parameter recovery. In contrast, linear fitting is employed in our approach, in which fewer parameters need to be considered and more satisfactory results can be conveniently obtained. Besides this, no user-specified sketches or user intervention is required in the pre-processing.



Table 1. Comparative study of AEI and REI from different methods


Fluid sample | AEI E_a [16] | AEI E_a (ours) | REI E_r [16] | REI E_r (ours)
54ab110     |  33.036712   | 5.101024       | 0.262964     | 0.040602
647b710     |  29.229742   | 7.675622       | 0.242927     | 0.063792
647b810     |  95.001953   | 9.428635       | 0.808619     | 0.080253
647c310     |  18.077024   | 2.175607       | 0.138688     | 0.016691
649cf10     |  79.965372   | 3.949822       | 0.516341     | 0.031273
649ci10     |  29.271649   | 4.647984       | 0.235434     | 0.037384
649cj10     | 118.633564   | 5.584326       | 0.993911     | 0.046785
649dc10     |  33.653357   | 4.415192       | 0.244309     | 0.032052
649dg10     |  20.672678   | 5.078963       | 0.157422     | 0.038676
649dh10     |  25.569214   | 4.053065       | 0.208546     | 0.033057
649ea10     |  34.624263   | 4.234297       | 0.247398     | 0.030255

4.4 Spatial Error Distribution and Qualitative Analysis

To analyze the spatial distribution of the illumination errors, we compute the maximum and minimum errors of the different methods and normalize the per-particle errors with multi-level colors for comparison with the related work [16]. The visualized relighting errors are shown in Fig. 4. The first column shows the fluid image instances, the second column the visualized errors from [16], and the third those from our estimation. The right column is the error color bar, whose top corresponds to the maximum and bottom to the minimum.

Fig. 4. Visualized errors from different methods



Figure 4 shows that the errors from both methods are distributed unevenly in the spatial domain, meaning that the estimation errors vary across the scene. Errors in calm waters are relatively small, while in fluid crests and torrents they are larger owing to larger specular estimation error. This work has several limitations worth considering. The provided illumination estimation relies on the fluid surface geometry reconstructed by SFS [17]. If the fluid surface contains speckled objects, an accurate illumination model cannot be obtained in BSM, and a coherently illuminated effect cannot yet be achieved. One promising remedy is to smooth the dark speckles in pre-processing to recover their illumination and texture; on that basis, more accurate illumination estimation and realistic relighting results could be achieved.

5 Conclusion

The main contribution of this work is the study of illumination recovery strategies based on BSM, with which realistic relighting from surface geometry can be achieved. A realistic effect with photometric consistency is obtained from the recovered illumination parameters. From real captured videos, a coherently illuminated realistic fluid effect similar to the original 2D example can be realized. Our approach has the following advantages:
• A new approach is provided that consistently combines photometric and geometric information for recovering illumination parameters in the inverse-optics problem.
• Realistic fluid re-simulation can be conveniently achieved from video captured with a low-end, ordinary device.
• Hierarchical and linear strategies are presented for recovering the illumination, which avoids much computational cost.
There are limitations to overcome. In the future, we wish to extend this problem with deep learning strategies.

Acknowledgements. We thank the DynTex dataset for supplying rich fluid videos for our study, and special thanks to the reviewers for their valuable comments and suggestions.

Funding. This study was funded by NSFC Grants No. 61672237 and 61532002, and the National High-tech R&D Program of China (863 Program) under Grant 2015AA016404.

References

1. Land, E.H., McCann, J.J.: Lightness and retinex theory. J. Opt. Soc. Am. 61(1), 1–11 (1971)
2. Horn, B.K.P.: Determining lightness from an image. Comput. Graph. Image Process. 3(4), 277–299 (1974)
3. Barrow, H., Tenenbaum, J.: Recovering intrinsic scene characteristics from images. Comput. Vis. Syst. 2, 3–26 (1978)
4. Fischler, M.A.: Recovering intrinsic scene characteristics from images. Southwest Research Inst Report (1981)



5. Shen, L., Tan, P., Lin, S.: Intrinsic image decomposition with non-local texture cues. In: Computer Vision and Pattern Recognition, pp. 1–7. IEEE (2008)
6. Grosse, R., Johnson, M.K., Adelson, E.H., Freeman, W.T.: Ground truth dataset and baseline evaluations for intrinsic image algorithms. In: International Conference on Computer Vision, vol. 30, pp. 2335–2342. IEEE (2010)
7. Shen, L., Yeo, C.: Intrinsic images decomposition using a local and global sparse representation of reflectance. In: Computer Vision and Pattern Recognition, vol. 32, pp. 697–704. IEEE (2011)
8. Ogino, S., Migita, T., Shakunaga, T.: Simultaneous recovery of reflectance property, shape and light position based on Torrance-Sparrow model. Technical Report Image Engineering, vol. 107, pp. 557–564 (2008)
9. Nielsen, J.B., Frisvad, J.R., Conradsen, K.: Addressing grazing angle reflections in Phong models. In: SIGGRAPH Asia, vol. 1. ACM (2014)
10. Wu, T.P., Sun, J., Tang, C.K., Shum, H.Y.: Interactive normal reconstruction from a single image. ACM Trans. Graph. (TOG) 27(5), 1–9 (2008)
11. Schoeneman, C., Dorsey, J., Smits, B., Arvo, J., Greenberg, D.: Painting with light. In: Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, pp. 143–146 (1993)
12. Weyrich, T., Lawrence, J., Lensch, H.P.A., et al.: Principles of Appearance Acquisition and Representation. ACM SIGGRAPH, New York (2008)
13. Woodham, R.J.: Photometric method for determining surface orientation from multiple images. Opt. Eng. 19(1), 191139 (1980)
14. Lombardi, S., Nishino, K.: Reflectance and illumination recovery in the wild. IEEE Trans. Pattern Anal. Mach. Intell. 38(1), 129–141 (2016)
15. Nayar, S.K., Krishnan, G., Raskar, R.: Fast separation of direct and global components of a scene using high frequency illumination. ACM SIGGRAPH 25, 935–944 (2006)
16. Pereira, T., Vital Brazil, E., Macêdo, I., Costa Sousa, M., de Figueiredo, L.H., Velho, L.: Sketch-based warping of RGBN images. Graph. Models 73(4), 97–110 (2011)
17. Yu, M.Q., Quan, H.Y.: Fluid surface reconstruction based on specular reflection model. Comput. Animat. Virtual Worlds 24(5), 497–510 (2013)
18. Peteri, R., Fazekas, S., Huiskes, M.J.: DynTex: a comprehensive database of dynamic textures. Pattern Recognit. Lett. 31(12), 1627–1632 (2010)

Ocean Analysis by Tsunami Simulation of the Nankai Trough Massive Earthquake

Yuto Sakae1(B), Ikuya Morimoto1, Takuya Ozaki1, Ryo Kurimoto1, Liang Li2, Kyoko Hasegawa2, Satoshi Nakada3, and Satoshi Tanaka2


1 Graduate School of Information Science and Engineering, Ritsumeikan University, Kyoto, Japan
[email protected]
2 College of Information Science and Engineering, Ritsumeikan University, Kyoto, Japan
3 National Institute for Environmental Studies, Tsukuba, Japan

Abstract. Large-scale tsunamis have a major impact on the natural environment; it is therefore important to predict them. Recently, large-scale tsunami simulations have been performed on supercomputers for prediction, and by analyzing the resulting data, it is expected that damage can be minimized. The purpose of this paper is to propose visualization methods that support the analysis of the simulation results. Three visualization methods are proposed. The first is the simultaneous visualization of multiple feature quantities based on an opacity transfer function obtained by extending the HSV color space, aimed at observing the mutual relationship between the feature quantities obtained by simulation. The second is to create volume data in which time-series images are stacked along the time axis (XYT space-time) and to visualize the time-series data, aimed at a detailed analysis of the full-time behavior and of specific feature quantities. The third is the fused visualization of a cross-section plane of the tsunami and the fluid volume around it, aimed at the detailed visualization of the observation data in the sea; by fusing the surrounding data, the detailed time-dependent flow velocity and salinity in the sea can be clearly observed. This paper presents the results of applying these three methods to the flow velocity and salinity data obtained from the simulation of the Nankai Trough massive earthquake and analyzes these data.

Keywords: Nankai Trough massive earthquake · Tsunami simulation · HSVA color space · Fusion of the cross-section plane · XYT space-time



1 Introduction

In the near future, it is predicted that the Nankai Trough massive earthquake, causing a large tsunami, will occur. Large-scale tsunamis have great influence not
© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 488–500, 2018.

Ocean Analysis by Tsunami Simulation


only on people but also on marine life. As sea water flows into soil and rivers, salt damage and damage to ecosystems due to changes in river salinity are assumed to occur. Therefore, it is necessary to analyze and predict the behavior of a large-scale tsunami. Recently, large-scale numerical fluid simulations have been performed on supercomputers. The generated data include feature quantities such as flow rate and salinity, and tsunami analysis has been conducted by visualizing these features. By using SPBR (Stochastic Point-Based Rendering) [1] to support the visual analysis of large-volume data, our laboratory has made it possible to quickly and precisely analyze the internal structure of a target object through transparent visualization [1,2]. The purpose of this research is to propose a new visualization method for analysis support, considering the difficulty of analysis with conventional visualization. For visualizing a plurality of feature quantities, conventional methods [3] simultaneously used two degrees of freedom (hue and brightness, or hue and saturation). The visualization in conventional methods clearly demonstrates the change in hue; however, the feature quantities mapped to brightness and saturation are difficult to see, so achieving an intuitive understanding of the correlation between characteristic regions is difficult. This study increased the degrees of freedom by one relative to the conventional method in order to find the relationship between multiple feature quantities. Using three degrees of freedom (hue, saturation and brightness), we can observe the relationship between feature quantities. In conventional tsunami visualization, an animation is produced; however, the conventional method cannot visualize feature quantities inside the sea and does not focus on changes in a specific time domain.
For the 3D visualization of the sea, to analyze the vertical behavior of the water, this study generates a vertical cross-section plane from the sea level to the seabed at a specific location and fuses it with the surrounding volume data. This allows us to visualize changes in feature quantities in the sea over time. Furthermore, focusing on the time-development analysis of the sea flow, we stack 2D images of the sea along the time axis and achieve spatiotemporal visualization to observe the whole time space at a glance. This is a development of the plasma visualization method developed in our laboratory [4,5]. This study visualized the simulation data of the tsunami caused by the Nankai Trough massive earthquake [6] using the above three methods.


2 Experimental Data and Visualization Method

The analysis area of the tsunami simulation of the Nankai Trough massive earthquake used in this study is shown in Fig. 1. The analysis area includes Osaka Bay, Satsuma Nada and the Kii Channel. The data consist of an unstructured lattice model adopting triangular prisms; the number of triangular prisms is 4,648,536. The data include features such as flow rate, salinity and water temperature. This study analyzes 36 frames output every 10 min from the occurrence of the Nankai Trough massive earthquake. SPBR is used as the visualization method. SPBR is a translucent


Y. Sakae et al.

Fig. 1. Range for the simulation analysis

rendering technique that uses opaque particles as the basic drawing primitive. There is no need for a depth sort of the point group, and it is possible to draw even large-scale data at high speed. Visualization by stochastic point rendering can be divided into three steps: (1) point generation, (2) point projection, and (3) pixel luminance value determination. The first step creates points with uniform density in the volume space based on the data to be visualized. The second step projects and stores, for each pixel of the image plane, the point nearest along the line of sight; the generated points are projected in an arbitrary number of groups. In the third step, luminance values are determined by applying an ensemble average to the image group created in step 2. The determination of the luminance value by the ensemble average is expressed by the following Eq. (1):

B = (1/L_R) Σ_{i=0}^{L_R−1} B[i]    (1)


where B is the luminance value of the image to be drawn, B[i] is the luminance value of the i-th image generated in step 2, and L_R is the number of groups.
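Step 3 of SPBR, Eq. (1), is a plain per-pixel average over the L_R partial images; a minimal sketch with hypothetical names, using nested lists as grayscale images:

```python
def ensemble_average(images):
    """Average the L_R images produced by projecting each particle group,
    pixel by pixel, as in Eq. (1)."""
    l_r = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [[sum(img[y][x] for img in images) / l_r for x in range(width)]
            for y in range(height)]

# Two 1x2 images from two particle groups; each pixel is averaged.
avg = ensemble_average([[[0.0, 2.0]], [[2.0, 2.0]]])
```

Because the average is per pixel, increasing L_R smooths the stochastic point noise without any depth sorting.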



In this section, we explain the following three methods of sea-flow analysis:
(1) Simultaneous visualization of a plurality of feature quantities using a multidimensional transfer function in HSV color space
(2) Three-dimensional transparent fused visualization of a cross-section plane and fluid volume data
(3) Spatiotemporal visualization of tsunami simulation as time-series data.

Ocean Analysis by Tsunami Simulation



3.1 Simultaneous Visualization of a Plurality of Feature Quantities Using a Multidimensional Transfer Function in HSV Color Space

The transfer function used for visualizing a feature quantity uses a color mapping table to associate the scalar values of the volume data with hue and opacity. We create and apply a multidimensional transfer function that adds one degree of freedom to the conventional method. The color map with the created multidimensional transfer function applied is shown in Fig. 2.

Fig. 2. Color map of the proposed method (Color figure online)

The color map in Fig. 2 expresses the flow rate and salinity, which are the feature quantities of the experimental data. The vertical axis represents the magnitude of the flow velocity, and the horizontal axis represents the change in salinity. The slowest value is blue, and the fastest value is purple. When the salinity change increases, the saturation decreases; when the salinity change decreases, the brightness decreases. To determine the color, we first choose the hue from the magnitude of the flow velocity and then decrease the brightness or the saturation according to the salinity change. Using this multidimensional transfer function, we visualize a plurality of feature quantities.
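One plausible reading of this three-degree-of-freedom mapping can be sketched with Python's standard colorsys module; the exact hue endpoints and the normalization ranges are our assumptions, not taken from the paper.

```python
import colorsys

def transfer(v_norm, ds_norm):
    """Hypothetical 3-DOF transfer function: normalized flow velocity
    v_norm in [0, 1] picks the hue (blue -> purple); signed, normalized
    salinity change ds_norm in [-1, 1] lowers saturation when positive and
    lowers brightness when negative. At ds_norm = 0 both stay at 1, which
    keeps the full color range available (the point made about Fig. 6)."""
    hue = (240.0 + 40.0 * v_norm) / 360.0  # blue (240 deg) to purple (280 deg)
    sat = 1.0 - max(ds_norm, 0.0)          # salinity increase -> desaturate
    val = 1.0 + min(ds_norm, 0.0)          # salinity decrease -> darken
    return colorsys.hsv_to_rgb(hue, sat, val)
```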

3.2 Three-Dimensional Transparent Fused Visualization of Cross-Section Plane and Fluid Volume Data

A cross-section plane is suitable for the detailed analysis of a selected local region, but not for providing an overview of a wide area. Volume rendering, on the other hand, is not always suitable for the precise analysis of local features, but it is suitable for providing an overview of a wide area. Therefore, we perform a fusion of the cross-section plane and the sea-water volume. To realize the fusion of the plane, i.e., the surface, and the volume, we convert each of them


Y. Sakae et al.

into point datasets. Then, we merge the two datasets into a unified point dataset and apply SPBR to realize transparent fused visualization. To render the cross-section and its surrounding fluid volume with different shadings, it is necessary to prepare a transfer function that draws both with the same color distribution but different opacities. The transfer function is shown in Fig. 3.

Fig. 3. Transfer function

The color map expresses the magnitude of the feature quantity extracted from the simulation data, and α is the opacity; both correspond to the feature value on the horizontal axis. The opacity is specified separately for the cross-section and for the fluid volume. A darker area is rendered by increasing the opacity of the cross-section or by reducing the opacity of the surrounding fluid volume.

3.3 Spatiotemporal Visualization of Tsunami Simulation as Time-Series Data

To observe the dynamic behavior of the tsunami over the whole time region at a glance, we stacked 2D images of the sea water along the time axis and created a spatiotemporal 3D volume in XYT space (Fig. 4).

Fig. 4. Creating volume data by the time-series image group

The resolution of the created image data is 1024 × 1024. Each pixel of these image groups is handled as one voxel of the volume data, and the volume data are created by stacking the images along the time axis. The data generated by the simulation are output at one frame per 10 min, from 10 min to 360 min, for a total of 36 frames. To thicken the stack, we linearly interpolate

Ocean Analysis by Tsunami Simulation


with Eq. (2) between pixel values at the same coordinates of the n-th and (n+1)-th images to create interpolated images:

C = C_n (1 − θ) + C_{n+1} θ,  0 ≤ θ ≤ 1    (2)
By inserting the created interpolated images between the n-th and (n+1)-th images, thickness is given to the volume data along the time axis. In this research, by applying interpolation in 8 steps in the time direction, volume data of 1024 × 1024 × 281 are created.


4 Experimental Results

Figures 5 and 6 show the fusion visualization results for the feature quantities.

Fig. 5. Fusion visualization by conventional method

Figure 5 shows the results of the conventional method: (a) expresses the flow velocity as hue and salinity as brightness, and (b) expresses the flow velocity as hue and salinity as saturation. Figure 6 shows the result of the proposed method, using hue, brightness and saturation. These results visualize the flow rate and salinity 190 min after the occurrence of the tsunami caused by the Nankai Trough massive earthquake. In Fig. 5(a), if the brightness is normalized to the range 0 to 1, the brightness becomes 0.5 when the salinity change is 0. Since the brightness baseline is 0.5, the overall brightness is low, and it is difficult to see the salinity variation. Similarly, in Fig. 5(b), the normalized saturation baseline is 0.5, the saturation of the whole volume is low, and as a result it is difficult to see the salinity fluctuations. In Fig. 6, when the normalized salinity change is 0, the brightness and saturation can both be set to 1. As a result, the range of expressible colors widens, and it becomes easy to see the places where the salinity changes markedly. In the visualization of the sea, we visualized the change in flow velocity and salinity over the 6 hours following the occurrence of the Nankai Trough earthquake. This paper presents only the results at 150 min after the occurrence of the tsunami (Fig. 7).



Fig. 6. Fusion visualization by our method

Fig. 7. Visualization area of cross-section plane

The analysis areas are around the Naruto Strait (Fig. 7(a)), between Wakayama prefecture and Awaji Island (Fig. 7(b)), near the Akashi Strait (Fig. 7(c)), and Osaka Bay (Fig. 7(d)). Figure 7(a) to (c) are expected to show feature quantities that change in a complicated manner, while Fig. 7(d) is expected to show salinity changes under the influence of the nearby river. Figure 8 shows the visualization results for salinity. The dark area near the center of each image is the cross-section plane, and the surrounding pale area is the sea volume. From Fig. 8, we can see the salinity-change distribution from the sea surface to the sea bottom. In the waters of Fig. 8(a), there was no significant salinity change. However, in the waters of Fig. 8(b) to (d), the salinity increased from the sea surface layer to the sea bottom. Furthermore, we can see that some parts of the sea surface have low



salinity. It is thought that this is because fresh water from nearby rivers is mixed in near the three cross-sections. Next, the visualization result for the flow velocity is shown in Fig. 9.

Fig. 8. Visualization results of salinity changes in the sea

The dark area near the center of each image is the cross-section plane, and the surrounding pale area is the sea volume. From Fig. 9, we can see the flow-velocity distribution from the sea surface to the sea bottom in the waters of Fig. 9(a) to (d). In particular, Fig. 9(b) confirms that the change is severe from the sea surface down to the seabed. The volume data in XYT space were created based on the sea-surface-layer and sea-floor data around Awaji Island for the six hours after the Nankai Trough earthquake occurred. Figure 10 shows the volume data created from the flow velocity data. Comparing the sea-surface volume data with the sea-floor volume data, we can see that the sea-floor flow velocity is higher than the sea-surface velocity, because the seabed is close to the source of the earthquake. Next, to visually distinguish the fluctuations in the flow velocity, volume data were prepared in which the opacity of the low-flow-velocity portions was lowered so that the high-flow-velocity portions were emphasized. The created volume data are shown in Fig. 11. It is remarkable that the flow velocity of the seafloor is higher than



Fig. 9. Visualization result of the flow velocity in the sea

that of the sea surface layer. Furthermore, by observing the flow velocity of the sea floor along the y axis, the time-dependent flow velocity can be visualized (Fig. 12). From Fig. 12, it can be confirmed that several high-flow-velocity events occur, corresponding to the first and second tsunami waves, respectively. Figure 13 shows the volume data created from the salinity data. Compared with the sea surface layer, the sea floor shows a significantly larger high-salinity range. Volume data emphasizing the high salinity are shown in Fig. 14. In addition, to understand salinity fluctuations at the seafloor, where the flow-velocity fluctuations are large, changes in the salinity of the seabed are drawn along the y axis (Fig. 15). Over time, we can see that the salinity decreases in the area on the left side; however, in the sea area on the right side, minimal salinity fluctuations were observed.


Fig. 10. XYT volume data of flow velocity

Fig. 11. XYT volume data of the flow velocity using opacity changes




Fig. 12. Flow velocity data viewed from the XT plane

Fig. 13. XYT volume data of salinity


Fig. 14. XYT volume data of salinity using opacity changes

Fig. 15. Salinity data viewed from the XT plane






5 Conclusion

Other studies were insufficient for recognizing the relationship between feature quantities, visualizing the interior of the ocean, and presenting detailed temporal changes. This paper proposed the fused visualization of feature quantities, the visualization of the sea flow, and the creation of volume data in XYT space as ocean analysis methods. These methods made it possible to better understand the relationship between flow velocity and salinity using three degrees of freedom. Furthermore, we were able to visualize the tsunami data by three-dimensional fusion of a cross-section plane with the fluid volume around it, such that the detailed time changes in flow velocity and salinity in the sea became clearly visible. Finally, using the XYT space, which is a (2+1)-dimensional space, it was possible to visualize temporal changes with still imagery, focusing on changes in the tsunami simulation over the whole time period or on characteristic changes occurring in a specific area.

References

1. Tanaka, S., et al.: Particle-based transparent rendering of implicit surfaces and its application to fused visualization. In: Proceedings of EuroVis 2012 (2012)
2. Zhao, K., Sakamoto, N., Koyamada, K., Tanaka, S., Murotani, K., Koshizuka, S.: Interactive visualization of large-scale 3D scattered data from a tsunami simulation. Int. J. Ind. Eng. Theory Appl. Pract. 24(2), 207–219 (2017)
3. Matsuoka, D., Araki, F., Kida, S., Sasaki, H., Taguchi, B.: Visualization for ocean general circulation model via multi-dimensional transfer function and multivariate analysis. In: Proceedings of JSST 2012 (International Conference on Simulation Technology), pp. 90–94 (2012)
4. Kawamoto, N., et al.: Static visualization of dynamical plasma collision. In: Proceedings of the 11th Asian Symposium on Visualization (ASV 2011), Niigata, Japan, 5–9 June 2011
5. Uenoyama, Y., et al.: Comparative visualization of plasma-plume collisions focusing on coincidence of multiple datasets. In: 34th International Annual Conference on Simulation Technology, JSST 2015, Toyama, Japan, 12–14 October 2015
6. Nakada, S., Hirose, N., Senjyu, T., Fukudome, K., Tsuji, T., Okei, N.: Operational ocean prediction experiments for smart coastal fishing. Prog. Oceanogr. 121, 125–140 (2013)

Improving Traffic Flow at a Highway Tollgate with ARENA: Focusing on the Seoul Tollgate

Seung-Min Noh, Ho-Seok Kang, and Seong-Yong Jang(✉)

Seoul National University of Science and Technology, Seoul, Korea
{tenderly,syjang}

Abstract. In this study, the effects of traffic flow improvements were compared in terms of real traffic data using the level of congestion, throughput, and average duration time as criteria. The current situation and a situation with smart tolling were both considered in order to ensure the reliability of the results. The simulation consisted of six scenarios; the error rates were checked over 30 replications and were lower than 0.5%. The scenarios were categorized by vehicle speeds that could affect subsequent vehicles. The smart tolling model was better than the current model overall. In particular, the results of comparing the lifelike scenario 1 and scenario 6 with smart tolling are key: the level of congestion and the average duration time could be decreased to approximately 50%. These results indicate significant improvements in traffic flow while maintaining five lanes, unlike in the current model.

Keywords: Smart tolling · Simulation · Traffic flow

1 Introduction

According to a report released by the Ministry of Land, Infrastructure and Transport in January 2018, the number of vehicles registered in Korea was 22.53 million as of December 2017, an increase of 3.3% from 21.8 million in 2016. This means there is 1 vehicle per 2.3 people [1]. As the number of vehicles grows, highway traffic is also growing. Traffic volume is the main cause of highway congestion. Tollgates can also be a major cause, because vehicles need to stop at these gates unless they have a Hi-pass. There are further causes such as excessive lane changes and traffic accidents. The Korea Highway Corporation installed Hi-pass lanes on the highways in 2007, starting with a pilot project in 2000, to ease the congestion of highways and tollgates. Since then, the number of Hi-pass users has continued to increase, and 80% of daily highway traffic used a Hi-pass as of early July 2017 [2]. Despite this high utilization rate, users complained because Hi-pass vehicles had to drive at speeds of 30 km/h or less [3]. Hi-pass lanes and regular lanes are still operated together at tollgates in Korea. This causes conflicts and excessive lane changes owing to differences in speed between lanes, which can lead to serious safety problems and also contributes to traffic congestion. In Korea, it is difficult to install Hi-pass at all tollgates, so a realistic alternative is needed to prevent heavy traffic congestion caused by conflict between Hi-pass lanes and regular lanes.

© Springer Nature Singapore Pte Ltd. 2018. L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 501–510, 2018.



Lately, developments in information and communication technologies have led to changes in intelligent transportation systems and to discussions of smart tolling systems that remedy the shortcomings of Hi-pass. According to a report released by the Ministry of Land, Infrastructure and Transport in January 2017, the smart tolling system was introduced as an unmanned automatic payment system using existing Hi-pass or video recognition technology. The smart tolling system is illustrated in Fig. 1 [4].

Fig. 1. Smart tolling system (Source: The Ministry of Land, Infrastructure and Transport, January 2017)

In this study, the subjects were limited to vehicles entering Seoul through the Seoul tollgate, which is large and frequently congested. The effects of introducing a smart tolling system, in which vehicles maintain their existing driving speeds across multiple lanes, are compared with the current situation.

2 Theoretical Background

Recently, studies on the operation of highway tollgates have examined traffic congestion and safety problems arising from relative speeds and conflicts between Hi-pass lanes and regular lanes, and research on smart tolling is also underway. Variables such as Hi-pass traffic volume, the lengths of the entry and exit areas, and adjustments of Hi-pass lanes affect the traffic dispersion rate. When placing a Hi-pass lane, its position is a key variable; hence, selecting a location is the most important factor for balanced distribution across Hi-pass lanes [5]. A simulation model was developed to explore the best integrated operation methods when combining the existing TCS (Toll Collection System) and ETC (Electronic Toll Collection) systems, and the optimum number of Hi-pass lanes was calculated to reduce vehicle waiting time. According to a cost-benefit analysis, expanding the number of Hi-pass lanes increases benefits for operators, while for users an appropriate number of lanes is required owing to increasing costs or latency; if the number of lanes is underestimated or overestimated, the costs outweigh the benefits [6]. To improve safety for central Hi-pass lanes, the deceleration of vehicles owing to relative speed and conflicts was analyzed, and the proposed operation methods were connecting the central Hi-pass and main lanes, preventing conflicts, securing driving lanes, and stepwise entry [7]. A case study using a microscopic traffic simulation provided a method for determining the position of Hi-pass lanes considering mobility, stability, and convenience of operation; the result was more efficient when the Hi-pass lanes were located on the left side and closer to each other [8]. The Hi-pass system, which collects payment automatically during nonstop driving, was expected to ease traffic delays and congestion near tollgates. However, when many vehicles use the tollgate, delays and congestion are difficult to avoid, and there is a risk of accidents owing to conflicts between Hi-pass and regular-lane vehicles; one study therefore introduced a smart tolling system with multiple nonstop lanes [3]. When smart tolling is installed at tollgate locations, the effects on initial construction expenses and reduced costs are significant; as a result, smart tolling will help reduce transit time and travel costs [9].

3 Simulation

3.1 Traffic Flow Model

Fundamentals. This study examines the effects of applying a smart tolling system to current tollgates to improve traffic flow. The Korea Highway Corporation’s actual traffic data (Monday, 3/7/2016) were used for analysis, and a model of the entry section of the Seoul tollgate on the Gyeong-bu highway was developed using ARENA 14.0. According to the data, the section was divided into 20 booths in 5 lanes, as shown in Table 1.

Table 1. Seoul tollgate entry section

Number of lanes  Number of booths  Type of lane
1                1–4               Hi-pass
2                5–9               Regular
3                10–14             Regular
4                15–17             Hi-pass
5                18–20             Regular

The arrival distribution time was estimated from the traffic volume between 8 a.m. and 10 a.m. (the peak of rush hour) using the input analyzer included in ARENA 14.0. The results are listed in Table 2. For some booths, the distribution could not be estimated because they were operated differently depending on the tollgate situation.


Table 2. Arrival distribution time for each booth

Booth  Arrival distribution time (s)
1      0.5 + LOGN(7.3, 8.34)
2      0.5 + LOGN(2.71, 2.39)
3      0.5 + GAMM(2.75, 1.64)
4      0.5 + LOGN(4.79, 5.33)
5      –
6      –
7      5 + LOGN(17.5, 17)
8      9 + LOGN(16.5, 21.8)
9      4.5 + LOGN(19.5, 14.4)
10     –
11     4 + GAMM(10.2, 2.13)
12     7.5 + LOGN(16.6, 17.3)
13     5 + LOGN(30.4, 41.8)
14     –
15     0.5 + LOGN(6.72, 7.99)
16     0.5 + LOGN(8.51, 11.5)
17     0.5 + GAMM(7.13, 1.25)
18     8 + 167 * BETA(0.807, 4.95)
19     –
20     7 + LOGN(46.6, 110)
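As an illustration of how such expressions generate interarrival times, the sketch below samples two of them in Python. It assumes ARENA's usual parameter conventions (LOGN(mean, sd) gives the mean and standard deviation of the lognormal variate itself, and GAMM(beta, alpha) takes scale then shape); the function names are illustrative, not part of the paper's model.

```python
import numpy as np

rng = np.random.default_rng(42)

def logn(mean, sd, size=None):
    # ARENA-style LOGN(mean, sd): parameters describe the lognormal variate
    # itself, so convert to the underlying normal's mu/sigma before sampling.
    sigma2 = np.log(1.0 + (sd / mean) ** 2)
    mu = np.log(mean) - sigma2 / 2.0
    return rng.lognormal(mu, np.sqrt(sigma2), size)

def gamm(beta, alpha, size=None):
    # ARENA-style GAMM(beta, alpha): beta is the scale, alpha the shape.
    return rng.gamma(alpha, beta, size)

# Interarrival times (s) for booths 1 and 3 from Table 2
booth1 = 0.5 + logn(7.3, 8.34, size=10_000)   # 0.5 + LOGN(7.3, 8.34)
booth3 = 0.5 + gamm(2.75, 1.64, size=10_000)  # 0.5 + GAMM(2.75, 1.64)
print(booth1.mean(), booth3.mean())  # near 7.8 and 5.01 respectively
```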

Entry roads were divided as shown in Fig. 2. The number of zones and the lengths of the zones in each section are summarized in Table 3.

Table 3. Number of zones and length of zone by section

Section   Number of zones  Length of zone (m)  Length of section (m)
Tollgate  6                3                   18
1         20               4                   80
2         20               4                   80
3         13               4                   52
4         13               4                   52
5         8                4                   32
6         13               4                   52

The number of zones corresponds to the number of vehicles that can occupy a section, since each zone holds one vehicle.

Simulation Assumptions. Several simplifying assumptions were made so that the simulation model could be implemented and examined at a feasible level.



Fig. 2. Split-entry road section

First, each vehicle moves in the direction of the lowest occupancy rate among the routes it can move forward on. Second, the basic speed of all vehicles is set to the average speed of passing through the tollgate, which is the only speed identifiable from the data; however, if the passing interval between two vehicles was 0 s, it was converted to 1 s. Third, there is no classification between buses, passenger cars, and so on. Fourth, when moving diagonally to change lanes, the speed of the vehicle is reduced slightly, to the 90% level. Fifth, when passing through a tollgate, the speed of vehicles in regular lanes is adjusted to the 20% level, representing the stop-and-start speed for payment. Sixth, the length of each zone was set to 4 m and the length of a vehicle to 3 m. Seventh, when a vehicle starts to move to the next zone, the current zone is released. Eighth, the actual total length is 368 m, but in the model the length was set to 366 m so that it divides evenly into zones. Ninth, as this is basic research, the acceleration and deceleration of individual vehicles were not considered.

3.2 Simulation Modeling

Model 1 (Hi-pass and Regular Lanes). To simulate traffic flows such as the current situation, the model was implemented using networks and zones, as a tollgate section in which 20 booth lanes merge into 5 lanes, as shown in Fig. 3. The speeds of the Hi-pass vehicles were estimated from real data using the input analyzer; the results are shown in Fig. 4. The time to pass through the tollgate section was separated by lane type. For vehicles in Hi-pass lanes, the transit time was calculated by dividing the distance of the tollgate section (18 m) by the speed (m/s) of each vehicle.

Fig. 3. Hi-pass and regular lane model

Fig. 4. Distribution of speeds for Hi-pass vehicles

For vehicles in regular lanes, the payment time was calculated as the sum of the average response time (1.25 s) and the average service time (11.65 s), referring to the Korea Highway Corporation's service report [10], with the speed of passing through the tollgate adjusted to 20%. Table 4 lists the adjusted speeds and transit (payment) times by lane type.

Table 4. Adjusted speed and transit (payment) time by lane type

Type of lane  Adjustment of tollgate section speed (%)  Transit (payment) time (s)
Hi-pass       100                                       18 / (vehicle speed × 1000/3600)
Regular       20                                        12.9
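The two entries in Table 4 can be computed directly; a small sketch (the function names are illustrative, not from the paper):

```python
# Transit/payment times from Table 4. The 18 m section length, the 20% speed
# adjustment, and the 1.25 s + 11.65 s payment components follow the text.
def hipass_transit_s(speed_kmh):
    return 18.0 / (speed_kmh * 1000.0 / 3600.0)  # 18 m at full speed

def regular_payment_s():
    return 1.25 + 11.65  # average response + average service time

print(round(hipass_transit_s(66.1), 2))  # about 0.98 s at the mean speed
print(round(regular_payment_s(), 2))     # 12.9 s
```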

Model 2 (Smart Tolling). A basic model was implemented to simulate the traffic flow under smart tolling as defined by the Ministry of Land, Infrastructure and Transport, using networks and zones as in model 1. Model 1 merges 20 booth lanes into 5 lanes, whereas model 2 maintains 5 lanes without a tollgate. Model 2 covers the same section as model 1 so that the two can be compared, as shown in Fig. 5.



Fig. 5. Smart tolling model

The speed of each vehicle is the same as in model 1, and because there is no tollgate there is no transit (payment) time.

3.3 Scenario Design

The scenarios were divided by vehicle speed. Because the current speed uses distribution values, some vehicles drive very slowly at a low probability. This may result in a time loss for subsequent vehicles. Table 5 lists the scenarios.

Table 5. Scenarios

Scenario  Model    Vehicle speed (km/h)
1         Model 1  NORM(66.1, 13.3)
2         Model 1  60
3         Model 1  100
4         Model 2  NORM(66.1, 13.3)
5         Model 2  60
6         Model 2  100
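To see why the NORM(66.1, 13.3) scenarios occasionally produce very slow vehicles that delay followers, one can sample the speed distribution; a sketch in which the 30 km/h cutoff for "very slow" is our illustrative choice:

```python
import numpy as np

# Sample vehicle speeds for scenarios 1 and 4: NORM(66.1, 13.3) km/h.
rng = np.random.default_rng(1)
speeds = rng.normal(66.1, 13.3, size=100_000)
slow_pct = (speeds < 30.0).mean() * 100.0  # share of vehicles under 30 km/h
print(round(slow_pct, 2))  # a small fraction, roughly 0.3% of vehicles
```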

4 Simulation Results and Analysis

4.1 Evaluation Criteria

In this study, the levels of traffic flow improvement were compared by applying a basic smart tolling model to the Seoul tollgate during peak hours. To compare the results of the two models, three evaluation criteria were set. First, the average number of vehicles in the entire section represents the level of congestion; a high number means traffic is not flowing smoothly. Second, the throughput, representing the level of traffic handling, is calculated as the number of vehicles passing through the last section divided by the number of vehicles entering the tollgate.



Third, the average duration time was set as a secondary measure; a long time means the traffic flow is poor.

4.2 Validation of the Number of Replications

When comparing simulation results, the reliability of the experimental results must be ensured, which requires a sufficient number of replications; if the results have a large error, the experiment is not reliable. The error rates of the evaluation criteria over 30 replications, computed from the half-width of the 95% confidence interval provided by ARENA, are listed in Table 6.

Table 6. Error rates of evaluation criteria by scenario with 30 replications

Scenario  Level of congestion (%)  Throughput (%)  Duration (%)
1         0.43                     0.02            0.15
2         0.33                     0.02            0.14
3         0.34                     0.01            0.11
4         0.34                     0.02            0.09
5         0.27                     0.02            0
6         0.3                      0.01            0
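The error rate reported here is the relative half-width of the confidence interval for the replication mean; a sketch with synthetic replication values (the data and function name are illustrative):

```python
import numpy as np

# Relative 95% CI half-width over replications, the 'error rate' checked
# above. The t critical value for 29 degrees of freedom is hard-coded.
def error_rate_pct(samples, t_crit=2.045):
    x = np.asarray(samples, dtype=float)
    half_width = t_crit * x.std(ddof=1) / np.sqrt(x.size)
    return 100.0 * half_width / x.mean()

rng = np.random.default_rng(0)
reps = rng.normal(39.98, 0.3, size=30)  # synthetic congestion replications
print(error_rate_pct(reps))  # comfortably below the 0.5% threshold
```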

Because all error rates were lower than 0.5%, it was confirmed that these values could be used without problems.

4.3 Experiment Results

Table 7 lists the values of the evaluation criteria by scenario.

Table 7. Values of evaluation criteria by scenario

Scenario  Level of congestion (vehicles)  Throughput (%)  Duration (s)
1         39.9787                         99.62           27.3022
2         41.8726                         99.59           28.5477
3         26.7366                         99.74           18.2479
4         32.7607                         99.68           22.3330
5         33.9423                         99.68           23.1064
6         20.2515                         99.80           13.7774

Comparing the Levels of Congestion. The results show that when vehicle speed is low, the number of vehicles in the entire section is higher. In the most congested case, scenario 2, an average of 41.8726 vehicles remained in the section during the experimental period, compared with 20.2515 vehicles in scenario 6.



Comparing the Throughput. The throughputs of scenarios 3 and 6 were the highest within each model; the transit times were short because each vehicle drove at 100 km/h, and the throughputs were 99.74% and 99.80%, respectively. The throughputs of scenarios 2 and 5 were the lowest within each model; the speed was 60 km/h, and the throughputs were 99.59% and 99.68%, respectively. Consequently, the current traffic volume can be handled even if the number of lanes shrinks to five by applying model 2.

Comparing the Duration Time. The average duration times of scenarios 2 and 5 were the longest within each model, at 28.5477 s and 23.1064 s, respectively; scenarios 3 and 6 had the shortest times, at 18.2479 s and 13.7774 s, respectively.

Key Results. Most tollgates operate Hi-pass and regular lanes together; in reality, this resembles scenario 1. The smart tolling model resembles scenario 6, using the 100 km/h highway speed limit. Applying scenario 6 to the situation of scenario 1 yields the following key results. First, the level of congestion drops from 39.9787 vehicles to 20.2515 vehicles, about 50.7% of the original value. Second, the throughput increases slightly, from 99.62% to 99.8%. Third, the average duration time drops from 27.3022 s to 13.7774 s, about 50.5% of the original value. Consequently, the level of congestion and the average duration time are expected to improve significantly.

5 Conclusion

In this study, the effects of applying a basic model of the smart tolling system currently under active discussion were compared. To compare the results, a simulation was conducted on vehicles entering the Seoul tollgate, which is large and often congested. The experiments with the smart tolling model showed that the level of congestion and the average duration time decreased significantly, by about 50%. The improvement in traffic flow was sufficient even though the number of lanes remained at five; part of the decrease in duration comes from the eliminated payment time. Extending the number of lanes seems unnecessary because the throughputs are similar between models. The limitations of this study are as follows: the models were implemented at a fundamental level, though they tried to reflect reality through assumptions; settings such as acceleration and deceleration were not implemented; and the data need to be updated. Based on this model, a new model will be implemented with additional settings such as extended sections, vehicle classification, and acceleration and deceleration. With the change to smart tolling, this research will also be expanded to include an economic analysis of the unused tollgate land.

Acknowledgement. This study was supported by the Research Program funded by the Seoul National University of Science and Technology.



References

1. Ministry of Land, Infrastructure and Transport Homepage. NEWS/m_71/dtl.jsp?lcmspage=1&id=95080239. Accessed 23 Jan 2018
2. Korea Highway Corporation Homepage. Accessed 7 Mar 2018
3. Lee, U.J., Kim, S.T., Kim, C.K., Park, J.H., Park, G.H.: The smart tolling which next generation payment system with multiple lane and non-stop. J. Korean Soc. Road Eng. 16(1), 46–50 (2014)
4. Ministry of Land, Infrastructure and Transport Homepage. NEWS/m_71/dtl.jsp?id=95078761. Accessed 24 Feb 2018
5. Lee, J.S., Lee, K.Y., Lee, C.K., Yun, I.S., Yu, J.W.: Estimation of Hi-pass traffic dispersion rates to determine the optimal location of Hi-pass lanes at a toll plaza. J. Korea Inst. Intell. Transp. Syst. 12(4), 22–32 (2013)
6. Shin, H.S.: Operation decision model development and optimal operation of highway tollgate. Ph.D. dissertation, Gachon University (2005)
7. Yoo, B.S., Lee, S.B., Park, W.Y., Do, H.G.: Safety improvement of centrally installed Hi-pass lane of express highway. J. Korean Soc. Civ. Eng. 30(1), 1–10 (2010)
8. Yun, I.S., Han, E., Lee, C.K., Rho, J.H., Lee, S.J., Kim, S.B.: Mobility and safety evaluation methodology for the locations of Hi-PASS lanes using a microscopic traffic simulation tool. J. Korea Inst. Intell. Transp. Syst. 12(1), 98–108 (2013)
9. Yoon, H.S.: Analysis of construction cost savings of new expressway with smart tolling application. Master's thesis, Ajou University (2016)
10. Korea Highway Corporation: Master Plan for Hi-Pass and Hi-Pass Lane Selection Using a Simulation Tool (2005)

Visualization and Computer Vision to Support Simulation

Pixel Convolutional Networks for Skeleton-Based Human Action Recognition

Zhichao Chang, Jiangyun Wang, and Liang Han

School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
[email protected]

Abstract. Human action recognition is an important field in computer vision. Skeleton-based human models have attracted increasing attention in related research because of their strong robustness to external interference factors. In traditional research, features are usually hand-crafted, so effective features are difficult to extract from skeletons. In this paper a method called Pixel Convolutional Networks is proposed for human action recognition, which extracts skeleton features in a natural and intuitive way along two dimensions, space and time. It achieves good performance on the large-scale NTU-RGB+D dataset compared with mainstream methods of the past few years.

Keywords: Human action recognition · Skeleton-based models · Skeleton pixel pictures · Pixel convolutional networks

1 Introduction

Human action recognition has been a hot topic in computing in recent years. In the early stage, useful information was obtained from pictures or videos through image processing [1–4]. Since the advent of depth sensors, deeper skeleton information can be extracted directly through Kinect or other similar devices. Because of its strong robustness to changes in illumination, scale, and rotation, skeleton-based action recognition has recently attracted more and more attention [5]. Meanwhile, with the development of deep learning technology, neural networks that learn features automatically have been widely applied in action recognition. Although some deep learning models have shown excellent performance, there is still much room for improvement when dealing with datasets containing more diverse and complex skeleton actions. The purpose of this paper is to propose an effective method for modeling skeleton sequences and to apply it to human action recognition tasks. A study of recent papers shows that the dynamic skeleton data of the human body obtained by depth cameras usually include information in two dimensions, space and time. Time refers to the connections between different moments in a sequence. Early methods model only the temporal information and ignore the extraction of features describing the inherent relationships between different joints at the same moment, that is, the spatial information. Some recent approaches have focused on the spatial relationships of dynamic skeletal models, but

© Springer Nature Singapore Pte Ltd. 2018. L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 513–523, 2018.



most of them hand-design the way spatial features are extracted. Such methods are largely ad hoc and can hardly capture all the information in the skeleton, which leads to poor performance on specific problems. Manually designed features usually require more domain knowledge and careful parameter tuning, and their effectiveness may differ greatly across datasets. To improve the performance of deep learning on action recognition tasks, a more reasonable automatic feature extraction model is needed to exploit the advantages of neural networks. Recently a new neural network model, the graph convolutional network, has been developed; it is a generalization of the classical convolutional neural network and has been applied successfully in many areas [6–8]. However, owing to limitations of its internal mechanism, this method still has much room for improvement. In this paper we design a general convolutional network called the Pixel Convolutional Network (PCN) for skeleton-based human action recognition. The main contributions are as follows: a new method of feature extraction is proposed, and its validity is verified on a standard dataset.

2 Related Work

2.1 Skeleton-Based Action Recognition

Neural networks for this problem can be summarized by the structure shown in Fig. 1, which generalizes to three main modules: a spatial feature extraction module, a temporal feature extraction module, and a classification module. When the actions to be recognized are fed into the network as input data, the network outputs the recognized action categories. The recognition accuracy of different networks differs greatly, because authors add their own ideas to each module to form a unique design, and the structural arrangement of the whole network also varies. Among all these factors, the spatial feature extraction module is a particularly important indicator of the quality of a network.

Fig. 1. The usual structure of the neural network in skeleton based action recognition.




2.2 The Inherent Spatial Feature

The human skeleton has its own spatial structure; the specific relationships are shown in Fig. 2. The human body has a naturally symmetric left-right structure, and the upper and lower limbs have distinctive features. There are 25 representative joints in the human skeleton, and Kinect can accurately identify their trajectories. To improve recognition accuracy, spatial structure information is often introduced into action recognition methods, such as the connections between adjacent key points, between body parts, or along the hand-elbow-shoulder chain. To model this spatial information, existing methods often use RNNs and other sequential models to traverse the linked key points, which requires the model designer to define a traversal rule or manually define some body parts.

Fig. 2. The representative joints in human skeleton.


2.3 Graph Convolution Network

Neural networks on graphs are a hot area of machine learning research and have been successfully applied to network analysis and text categorization. Before introducing the Graph Convolution Network (GCN), we briefly review convolution operations on images. On an image, a convolution operation scans the input with a fixed-size kernel: around each scanned center pixel, a patch of the same size as the weight matrix is extracted, the feature vectors on these pixels are concatenated, and the inner product with the kernel's parameter vector gives the convolution output at that position. Here, the "neighborhood pixels" are defined as a neighborhood on the pixel grid. When the convolution operation is extended from the image structure to an arbitrary structure, we can likewise define the neighborhood of any node and a series of weight matrices; this is the basic idea of GCN. However, unlike on an image, the number of nodes in each node's neighborhood is not fixed when the adjacency matrix defines neighborhoods on a general graph. This makes it difficult to determine the parameter dimension of the convolution kernel and how to pair the weight matrix with the nodes of the neighborhood for the inner product. In the related GCN work, the authors proposed replacing the inner product with the following operation: compute the inner product of the same parameter vector with the feature vector of every node in the neighborhood, and average the results [9]. The kernel parameters can thus be a fixed-length vector, and the order of nodes in the neighborhood no longer matters. This design allows GCN to be used on graphs with arbitrary connectivity, and it has achieved good performance on tasks such as network analysis and semi-supervised learning. Although GCN has been abstracted and extended in various ways, it does not overcome the core drawback of this general approach to modeling the spatial structure of skeletons: too many rules are defined before the corresponding convolution is carried out.
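A minimal sketch of the neighborhood-averaging convolution described above, on a toy chain graph (the graph, features, and weights are illustrative, not from the cited work):

```python
import numpy as np

# Toy 4-node chain graph; self-loops are added so each node's own feature
# participates in the average.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_hat = A + np.eye(4)                      # adjacency with self-loops
D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # inverse degree matrix

X = np.arange(8, dtype=float).reshape(4, 2)  # node features: 4 nodes, 2 dims
W = np.ones((2, 3))                          # shared fixed-length kernel

H = D_inv @ A_hat @ X @ W  # average each neighborhood's features, then project
print(H.shape)  # (4, 3): one new 3-dim feature vector per node
```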

3 Pixel Convolutional Network

3.1 Pipeline Overview

Skeleton data are acquired mainly in two ways: directly from Kinect or other depth sensors, or from pose estimation algorithms applied to video. The data are usually a sequence of frames, each containing a set of joint coordinates. We construct a pixel picture with the joints as pixel nodes; the input to the network is the joint coordinate vectors on these pixel nodes. Multiple layers of pixel convolution operations are applied to the input data, generating higher-level feature maps on the picture, which are then classified into the corresponding action category by a standard SoftMax classifier. The whole model is trained in an end-to-end manner with backpropagation.

3.2 Convolutional Neural Network

Convolution is originally a concept in mathematics; we call $(f * g)(n)$ a convolution. Its continuous definition is

$(f * g)(n) = \int_{-\infty}^{\infty} f(s)\, g(n - s)\, ds$

and its discrete definition is

$(f * g)(n) = \sum_{s=-\infty}^{\infty} f(s)\, g(n - s)$

The convolution operation is a key step in convolutional neural networks. The traditional two-dimensional convolution is defined as

$C(s, t) = \sum_{m=0}^{M_r - 1} \sum_{n=0}^{M_c - 1} A(m, n)\, B(s - m,\, t - n)$




There are several key concepts in the convolution operation: the number of images, the size of input and output images, the number of output channels, the size and number of filters, the stride, and the padding. Assume the filter height is filter_height, the filter width is filter_width, and the input image size is (in_height, in_width). The output image size (out_height, out_width) is then

out_height = (in_height + 2 * padding - filter_height) / stride + 1
out_width = (in_width + 2 * padding - filter_width) / stride + 1
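These two formulas can be checked directly; a small sketch (the function name is illustrative, and the same stride and padding are assumed in both directions):

```python
# Output spatial size of a 2D convolution, per the formulas above.
def conv_output_size(in_height, in_width, filter_height, filter_width,
                     stride=1, padding=0):
    out_height = (in_height + 2 * padding - filter_height) // stride + 1
    out_width = (in_width + 2 * padding - filter_width) // stride + 1
    return out_height, out_width

print(conv_output_size(8, 7, 3, 3, stride=1, padding=1))  # (8, 7): size kept
print(conv_output_size(32, 32, 5, 5))                     # (28, 28)
```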


3.3 Pixel Convolutional

The most primitive purpose of a neural network is to simulate the cognitive process of the human brain. We believe that the links in the human skeleton cannot be too complicated, owing to physiological constraints. In addition, graph convolution network models consider only links between nearest joints, so strong connections between joints at middle and longer distances are ignored. The concept of pixel convolution is proposed in this paper: the spatial structure information of the human skeleton is mapped onto the most natural cognitive model, a two-dimensional picture. No redundant information is imposed on the original data; the skeleton structure of the human body is expressed in the most natural way. The X, Y, and Z coordinates of the joints corresponding to each number are filled into three two-dimensional pictures, as shown in Fig. 3. The joints of the skeleton become pixels in the two-dimensional picture, which is why the method is named pixel convolution. After this processing, traditional image convolution can be used to extract features from the human skeleton model.

Fig. 3. The pixel image in pixel convolution.
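The construction of the three coordinate pictures can be sketched as follows; the row-major placement of the 25 joints on the grid is our illustrative assumption, and the paper's exact layout (Fig. 3) may differ.

```python
import numpy as np

# Sketch of the pixel-picture construction: 25 Kinect joints placed on an
# 8 x 7 grid, one channel per coordinate axis (X, Y, Z).
def skeleton_to_pixel_image(joints_xyz, grid=(8, 7)):
    """joints_xyz: (25, 3) array of X, Y, Z coordinates for one frame."""
    img = np.zeros((3, *grid), dtype=np.float32)
    cols = grid[1]
    for j, (x, y, z) in enumerate(joints_xyz):
        r, c = divmod(j, cols)  # illustrative placement; unused cells stay 0
        img[:, r, c] = (x, y, z)
    return img

frame = np.random.default_rng(0).random((25, 3))  # one frame of joint data
img = skeleton_to_pixel_image(frame)
print(img.shape)  # (3, 8, 7)
```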

Compared with graph convolution, pixel convolution extracts the spatial relationships between joints in a single frame in a more intuitive and reasonable way. It avoids much of the work of designing feature matrices and reflects the structural relationships between joints at all distances, which preserves the original information of the skeleton joints and improves the learning effect of the neural network.

4 Experiment

In this section we evaluate the performance of PCN on skeleton-based action recognition, experimenting on NTU-RGB+D, the largest in-house captured action recognition dataset.

4.1 NTU-RGB+D Dataset

NTU-RGB+D is currently the largest dataset with 3D joint annotations for the human action recognition task [10]. The dataset is captured by three Microsoft Kinect V2 cameras and contains 56800 action sequences in 60 action classes, all performed by 40 volunteers aged between 10 and 35. Each sequence contains one or two skeletons, and each skeleton is represented by 25 joints with 3D locations (X, Y, Z) in the camera coordinate system, detected by the Kinect depth sensors. The dataset is very challenging because it contains many human-object interactions and viewpoint changes, and the number of frames varies greatly between skeleton sequences. It has two evaluation benchmarks. In the cross-subject (X-Sub) benchmark, half of the volunteers form the training set and the rest the test set. In the cross-view (X-View) benchmark, data from cameras 2 and 3 are used for training and data from camera 1 for testing. The X-View benchmark is chosen in this paper. The original data of the NTU-RGB+D dataset are stored in skeleton files, of which 37646 are used as the training set and 18932 as the test set.

4.2 Network Structure

This paper builds the PCN on the pixel-picture method proposed above; the network structure is shown in Fig. 4. The PCN0 module rewrites the 25 joints into the 8 * 7 matrix form of Fig. 3, so traditional two-dimensional convolution can be used to extract features from the joint data; the activation function is ReLU. The TCN0 module convolves over the time series in a pseudo two-dimensional form, meaning that one dimension of the convolution kernel is 1; this operation extracts only temporal characteristics, and the activation function is also ReLU. Dropout is added to prevent overfitting during training. After that, nine similar blocks are stacked, with the number of channels increased each time so that more hidden-layer features can be extracted. Finally, four pooling operations reduce the hidden-layer feature dimension, yielding a data structure matching the labels, from which the loss value is calculated and the parameters are updated by backpropagation.
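The "pseudo two-dimensional" temporal convolution can be sketched as follows: a (1, k) kernel slides only along the time axis, so spatial pixels never mix. The shapes, values, and function name are illustrative, not the paper's implementation.

```python
import numpy as np

def temporal_conv(x, kernel):
    """x: (pixels, frames) array; kernel: length-k 1D array (a (1, k) filter)."""
    p, t = x.shape
    k = kernel.size
    out = np.empty((p, t - k + 1))
    for i in range(t - k + 1):
        out[:, i] = x[:, i:i + k] @ kernel  # each pixel convolved independently
    return out

x = np.arange(12, dtype=float).reshape(3, 4)  # 3 pixels, 4 frames
out = temporal_conv(x, np.array([0.5, 0.5]))  # temporal moving average
print(out)
```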



Fig. 4. The structure of pixel convolution neural network


4.3 Result and Comparison

In the experiment, 60 epochs are used to train the network with a batch size of 8. The initial learning rate is set to 0.1; after 10 epochs the learning rate drops automatically, and after 30 epochs it drops to 0.01. Gradually reducing the learning rate as the number of epochs increases lets the network parameters be trained more favorably. The optimizer is plain SGD, which could be improved in later work. From Fig. 5 we can see that the mean training loss gradually decreases as the number of training epochs increases; the network fits the training set better and better, reaching its minimum after 60 epochs. Every 5 training epochs, the test set is used to evaluate the current network. The performance on the test set has some ups and downs but also reaches its minimum after 60 epochs, as seen clearly in Fig. 6. Top-1 and top-5 accuracy are commonly used to evaluate the quality of a network. From Figs. 7 and 8 we can see that after 60 epochs, the top-5 accuracy is close to 98% and the top-1 accuracy exceeds 84%.


Z. Chang et al.

Fig. 5. Mean training loss.

Fig. 6. Mean test loss.

After the test, the top-1 accuracy is 84.8% under the current parameters and the top-5 accuracy is 98.4%. Table 1 and Fig. 9 show the top-1 accuracy of notable skeleton-based methods from roughly the last three years [10–14]. From Table 1 we can see that the current accuracy of PCN reaches the level of the best results of 2017. It is believed that in future work, better recognition results can be achieved by improving the training batches and optimizing the parameters.


Fig. 7. Top 5 of accuracy

Fig. 8. Top 1 of accuracy

Table 1. Recognition accuracy of other papers.

Algorithm            X-View
Lie group [11]       52.8%
H-RNN [12]           64.0%
Deep-LSTM [10]       67.3%
PA-LSTM [10]         70.3%
ST-LSTM+TS [13]      77.7%
Temporal Conv [14]   83.1%
PCN                  84.8%




Fig. 9. Variation tendency of the recognition accuracy based on different algorithms

5 Conclusion

In this paper, we propose a novel model for skeleton-based action recognition called pixel convolutional networks (PCN). The model constructs a set of spatial-temporal pixel convolutions on the skeleton sequences. On the challenging large-scale datasets, the proposed pixel convolutional network outperforms the previous skeleton-based models. In addition, pixel convolutional networks can capture motion information in dynamic skeleton sequences which is complementary to the RGB modality. The combination of the skeleton-based model and a frame-based model will further improve the performance in action recognition. The flexibility of PCN also opens up many possible directions for future work.

References
1. Sigal, L., Black, M.J.: HumanEva: synchronized video and motion capture dataset for evaluation of articulated human motion. Int. J. Comput. Vis. 87(1–2), 4–27 (2006)
2. Vemulapalli, R., Arrate, F., Chellappa, R.: Human action recognition by representing 3D skeletons as points in a Lie group. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 588–595. IEEE Computer Society (2014)
3. Mahasseni, B., Todorovic, S.: Regularizing long short-term memory with 3D human-skeleton sequences for action recognition. In: Computer Vision and Pattern Recognition, pp. 3054–3062. IEEE (2016)
4. Zhang, K., Zuo, W., Gu, S., et al.: Learning deep CNN denoiser prior for image restoration, pp. 2808–2817 (2017)
5. Zhang, Z.: Microsoft Kinect sensor and its effect. IEEE Multimed. 19(2), 4–10 (2012)
6. Bruna, J., Zaremba, W., Szlam, A., Lecun, Y.: Spectral networks and locally connected networks on graphs. In: ICLR (2014)
7. Defferrard, M., Bresson, X., Vandergheynst, P.: Convolutional neural networks on graphs with fast localized spectral filtering. In: NIPS (2016)



8. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: ICLR 2017 (2017)
9. Niepert, M., Ahmed, M., Kutzkov, K.: Learning convolutional neural networks for graphs, pp. 2014–2023 (2016)
10. Shahroudy, A., Liu, J., Ng, T.T., et al.: NTU RGB+D: a large scale dataset for 3D human activity analysis, pp. 1010–1019 (2016)
11. Veeriah, V., Zhuang, N., Qi, G.J.: Differential recurrent neural networks for action recognition. In: IEEE International Conference on Computer Vision, pp. 4041–4049. IEEE (2015)
12. Du, Y., Wang, W., Wang, L.: Hierarchical recurrent neural network for skeleton based action recognition. In: CVPR, pp. 1110–1118 (2015)
13. Liu, J., Shahroudy, A., Xu, D., et al.: Spatio-temporal LSTM with trust gates for 3D human action recognition, pp. 816–833 (2016)
14. Kim, T.S., Reiter, A.: Interpretable 3D human action analysis with temporal convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops (2017)

Feature-Highlighting Transparent Visualization of Laser-Scanned Point Clouds Based on Curvature-Dependent Poisson Disk Sampling

Yukihiro Noda1(&), Shu Yanai1, Liang Li2, Kyoko Hasegawa2, Atsushi Okamoto3, Hiroshi Yamaguchi4, and Satoshi Tanaka2

1 Graduate School of Information Science and Engineering, Ritsumeikan University, Kyoto, Japan
[email protected]
2 College of Information Science and Engineering, Ritsumeikan University, Kyoto, Japan
3 History Research Institute, Otemae University, Nishinomiya, Japan
4 Nara National Research Institute for Cultural Properties, Nara, Japan

Abstract. In recent years, with the development of 3D laser-measurement technology, digital archiving has been carried out around the world as one of the efforts to leave cultural assets to posterity. Laser-scanned point clouds are large-scale and precisely record the complex 3D structures of cultural assets. Accordingly, such point clouds are used in the research field of visualization to support analysis and use of the assets. Representative examples of such visualization are feature-highlighting and transparent visualization. The quality of visualization highly depends on distributional uniformity, that is, uniformity of inter-point distances. However, laser-scanned point clouds usually have a biased point distribution, which prevents high-quality visualization. Previous studies showed that the quality of transparent visualization can be improved by making inter-point distances uniform with Poisson disk sampling. This study proposes curvature-dependent Poisson disk sampling. The proposed method adjusts the order and radius of the sampling disk according to the curvature calculated by principal component analysis. By applying the proposed method to laser-scanned point clouds, the edges of cultural assets can be emphasized and the visibility of shape is further improved in transparent visualization. Thereby, we realize feature-highlighting transparent visualization with high visibility of three-dimensional structure and edge shape.

Keywords: Laser-scanned point clouds · Feature-highlighting · Transparent visualization · Poisson disk sampling · Curvature-dependent · Principal component analysis

© Springer Nature Singapore Pte Ltd. 2018 L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 524–538, 2018.



1 Introduction

Recently, efforts have been made to leave cultural assets to posterity around the world [1, 2]. One of these efforts is digital archiving, which aims to measure, record, and preserve cultural assets related to history, culture, and art by digital information technology. Furthermore, building a database of cultural assets and publishing the digital data contributes to utilization in various fields and to the inheritance of culture [3]. Ritsumeikan University has proceeded with a "mixed reality type digital museum" in cooperation with other universities [4–6]. In the project, a database of cultural assets in Kyoto is built with advanced and diverse information technology. In addition to the digital archives related to Kyoto, digital archiving of various cultural assets throughout Japan has been carried out, for example, Zuiganji Cave in Matsushima city, Miyagi prefecture, and Wada Kofun in Ritto city, Shiga prefecture. Large-scale three-dimensional (3D) measurement technology related to digital archiving is used to create the 3D digital data. With the rapid development of 3D measurement technology in recent years, photometry, measurement by laser irradiation, and methods of measuring from the sky using an Unmanned Aerial Vehicle (UAV) have been developed. Historical buildings and ruins can be damaged by weathering and disasters, and deterioration over time is inevitable. Therefore, 3D measurement technology is very important for the preservation and study of cultural assets.

In our research, we use the point cloud data obtained by 3D measurement of cultural assets for visualization research [7, 8]. The laser-scanned point clouds are large-scale data with about 10 million or more points. In addition, the laser-scanned point clouds have 3D coordinate and color information, so cultural assets can be precisely recorded and visualized. Our visualization research mainly focuses on transparent visualization and feature-highlighting [9].
Transparent visualization can visualize the inner and outer structures of point clouds simultaneously, so it contributes to understanding the 3D structure of cultural assets. We realize transparent visualization by using stochastic point-based rendering (SPBR) [10], which enables high-speed and precise rendering of large-scale point clouds. The opacity of the rendering depends greatly on the number of points and the point density of a point cloud. However, in 3D measurement, the point density is biased depending on the weather at the time of measurement and the site conditions. Furthermore, measurement is performed from many directions, so point-density unevenness occurs when merging the data. As a result, the quality of the point cloud is impaired, and the effect of transparent visualization may not be fully demonstrated. The requirements for a high-quality point cloud can be summarized as three: (1) the point density is sufficiently high, (2) the point density is uniform, and (3) the distances between adjacent points are uniform. In the case of a laser-scanned point cloud, the point density is sufficiently high, but requirements (2) and (3) may be missing because the point density becomes biased by the measurement conditions and data merging. In conventional studies, one way to satisfy (2) and (3) is Poisson disk sampling (PDS) [11–13]. PDS can reduce the points of a point cloud so that the point density and inter-point distances become uniform. High-quality point clouds can be obtained by


Y. Noda et al.

eliminating the bias of laser-scanned point clouds without moving original points or adding new points. Thereby, the visibility of the internal structure in transparent visualization can be improved. However, because the overall color of a laser-scanned point cloud becomes pale when seen through, the edge shape of cultural assets may become difficult to recognize visually. Therefore, in this study, we propose a feature-highlighting transparent visualization method that applies curvature-dependent Poisson disk sampling to laser-scanned point clouds. The curvature is calculated by principal component analysis (PCA) [14–17], and the edge shape of a cultural asset corresponds to the high-curvature portion. Specifically, to emphasize the high-curvature portion, the proposed method keeps more high-curvature points during reduction by Poisson disk sampling. In general, conventional Poisson disk sampling removes points randomly, and the radius of the Poisson disk is a constant. The proposed method adaptively changes the order and the radius size depending on whether a point is a high-curvature point or a low-curvature point. Furthermore, by combining curvature-dependent Poisson disk sampling and SPBR, the point density and opacity can be made high for the high-curvature portion and low for the low-curvature portion. As a result, the color of the edge shape is dark and clearly drawn. Thus, we expect that the structure and the shape of cultural assets can be visualized accurately and easily.

2 Transparent Visualization

As an advantage of transparent visualization, it is possible to visualize the appearance and internal structure of an object simultaneously. Transparency can also be achieved by generating polygons from a point cloud; however, because the result is a large-scale polygon mesh, the amount of calculation increases greatly. In this study, to perform transparent visualization of laser-scanned point clouds, we use stochastic point-based rendering (SPBR). SPBR is a fast and precise transparent visualization method that does not require sorting for laser-scanned point clouds. The opacity is controlled by the number of points and the point density on the curved surface of a point cloud. However, since the point density of laser-scanned point clouds can be non-uniform depending on the measurement environment, it has a bad influence on transparent visualization. Poisson disk sampling (PDS) is used as a method to make the point density and the inter-point distances of a point cloud uniform. The PDS for a point cloud is the 3D extension of the original PDS that has been used for point drawing in the field of image processing. A previous study improved the quality of SPBR by using the high-quality point clouds obtained by applying PDS to laser-scanned point clouds.

2.1 Poisson Disk Sampling (PDS)

In the original PDS applied to pixels of a two-dimensional (2D) image, the Poisson disk is a circle. On the other hand, in the extended PDS for a 3D point cloud, the Poisson disk becomes a sphere. Constituent points after applying Poisson disk sampling are no closer to each other than a specified minimum distance, i.e., the given radius r of the Poisson disk. The algorithm of Poisson disk sampling for a point cloud with n points is as follows:



Procedure and Application of Poisson Disk Sampling
1. Assign number k (= 1, …, n) to each constituent point of an input 3D point cloud in random order.
2. Set radius r of the 3D sphere in Poisson disk sampling.
3. Store the 1st point, that is, the point with k = 1.
4. Increment k by 1. If k = n, terminate the algorithm; else go to the next step.
5. If other points already exist within radius r around the k-th point, remove the k-th point; else accept it. Then go back to step 4.

The right image of Fig. 1 shows the result of applying PDS to a simple point cloud. The left image of Fig. 1 shows randomly distributed points whose inter-point distances are non-uniform. Comparing the two images in Fig. 1, we can clearly observe the distributional uniformity in the right image. From these results, the point cloud generated by applying PDS satisfies the requirements of uniform point density and uniform inter-point distances. Therefore, by appropriately setting the radius r of the Poisson disk according to a laser-scanned point cloud, a high-quality point cloud can be obtained after applying PDS.
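The five steps above amount to dart throwing over a fixed point set. A brute-force sketch in pure Python (the toy point cloud and radius are illustrative, not data from the paper):

```python
import random

def poisson_disk_sampling(points, r, seed=0):
    """Brute-force dart throwing over an existing 3D point cloud: visit the
    points in random order (step 1) and keep a point only if no already-kept
    point lies within radius r (step 5)."""
    rng = random.Random(seed)
    order = list(points)
    rng.shuffle(order)
    kept, r2 = [], r * r
    for p in order:
        if all(sum((a - b) ** 2 for a, b in zip(p, q)) >= r2 for q in kept):
            kept.append(p)
    return kept

rng = random.Random(42)
cloud = [(rng.random(), rng.random(), 0.0) for _ in range(500)]  # toy cloud
thinned = poisson_disk_sampling(cloud, r=0.1)
print(len(cloud), "->", len(thinned))  # far fewer, mutually separated points
```

A production implementation would replace the quadratic distance check with a spatial grid or k-d tree, but the accept/reject logic is the same.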

Fig. 1. Both figures are point clouds with the same number of points. The left image is a point cloud generated with uniform random numbers, and the right image is the point cloud generated by applying PDS to a point cloud.


2.2 Stochastic Point-Based Rendering (SPBR)

SPBR is performed through three processes: (1) generating points, (2) projecting points to pixels of an image, and (3) determining pixel luminance values. Details of each process are as follows:

Procedure and Application of Stochastic Point-Based Rendering
1. Randomly divide a laser-scanned point cloud into multiple point subsets. The number of point subsets is defined as the repeat level LR. Also, it is assumed that each point subset is statistically independent with a uniform point density.
2. Generate an intermediate image by projecting each point subset. Perform hidden-point removal in pixel units, that is, project the nearest point to each



pixel, and the color of that point is taken as the pixel luminance value. Pixels onto which no point is projected take the background color.
3. Average the intermediate images created in step 2. By averaging the pixel luminance values of the images obtained by projecting the multiple (= LR) point subsets, a transparent image is finally generated.

The opacity α is calculated from the repeat level LR, the number of generated points n, the local area S of a point cloud on the image plane, and the cross-sectional area s of a point (Eq. 1). Because SPBR renders with an opacity α set by the user, the number of generated points n needs to be determined from α. Therefore, by transforming Eq. 1 into Eq. 2, n can be calculated from the known values α, s, S, and LR.

α = 1 − (1 − s/S)^(n/LR)    (1)

n = LR · ln(1 − α) / ln(1 − s/S)    (2)
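The opacity relation can be checked numerically. The helper below follows Eqs. 1 and 2 as reconstructed here; the values s = 1, S = 1000, and LR = 100 are illustrative, not values from the paper:

```python
import math

def opacity(n, s, S, L_R):
    """Eq. (1): opacity produced by n points of cross-section s over a local
    surface area S at repeat level L_R (as reconstructed above)."""
    return 1.0 - (1.0 - s / S) ** (n / L_R)

def points_for_opacity(alpha, s, S, L_R):
    """Eq. (2): number of points needed to reach opacity alpha."""
    return L_R * math.log(1.0 - alpha) / math.log(1.0 - s / S)

# Illustrative values (not from the paper): s = 1, S = 1000, L_R = 100.
n = points_for_opacity(0.5, 1.0, 1000.0, 100)
print(round(opacity(n, 1.0, 1000.0, 100), 6))  # → 0.5
```

The round trip confirms that the two formulas are inverses of each other: solving Eq. 2 for n and substituting it back into Eq. 1 recovers the requested opacity.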

As mentioned above, laser-scanned point clouds are large-scale and their point density is non-uniform. Thereby, the density of the points generated in step 1 of SPBR is also non-uniform, and the set opacity cannot be properly reflected in a transparent image. To solve this problem, in step 1 of SPBR, the high-quality point cloud obtained by applying PDS to a laser-scanned point cloud is divided randomly into multiple point subsets. This procedure improves the image quality of transparent visualization.

In this study, the target data for transparent visualization are two laser-scanned point clouds (Fig. 2). The left image shows Wada Kofun in Ritto city, Shiga prefecture. This target is one of nine tombs built between the middle of the 6th century and the 7th century, and there is a restored stone room inside. The number of points is 7,153,551. The right image shows Zuiganji Cave in Matsushima city, Miyagi prefecture. The cave is said to have been dug by people of the Tendai sect, and there are small rooms and Jizo statues inside. The target is a part of Zuiganji Cave including small inner rooms with historical value, and the number of points is 366,201,197. Transparent visualization results for each target are shown in Figs. 3 and 4. SPBR of Wada Kofun in Fig. 3 is performed with opacity α = 0.5 and repeat level LR = 100. In the left image of the original point cloud, because of the point-density irregularity, we can see that the opacity is also non-uniform. That is, the set opacity is not properly reflected, and as a result an unintended transparent image is generated. In the right image of the point cloud after applying PDS, we can observe uniformity of the opacity distribution. As a result, the visibility of the inner structure of the stone room improves, and high-quality rendering is realized. Next, both images of Zuiganji Cave in Fig. 4 use repeat level LR = 300; increasing the repeat level also helps make the transparent image high quality.
However, as with Wada Kofun, the visibility of the small inner rooms is low in the left image of the original point



cloud. As shown in the right image, by applying PDS to the point cloud, it becomes easy to observe the state of small inner rooms such as the arrangement of Jizo.

Fig. 2. Visualization results of the original laser-scanned point clouds. The left image is the raw point cloud of Wada Kofun, and the right image is the raw point cloud of Zuiganji Cave.

Fig. 3. Both figures are transparent visualization results of Wada Kofun by using SPBR. With opacity α = 0.5 and repeat level LR = 100, we can see that the opacity of the right image after applying PDS is uniform compared with the left image.

Fig. 4. Both figures are transparent visualization results of Zuiganji Cave by using SPBR. With repeat level LR = 300, in the right image after applying PDS, we can see that the deviations of the point density and the opacity are eliminated.



3 Feature-Highlighting Transparent Visualization

From the transparent visualization results in Figs. 3 and 4, by combining PDS with SPBR, the visibility of the inner structure became higher and the transparent effect was also improved. However, transparent visualization raises two problems: (1) more information is presented at once by visualizing inner and outer structures simultaneously, and (2) the overall color of the cultural asset becomes pale. The large amount of information and the non-vivid color can make it difficult to distinguish between the inside and outside of a cultural asset. To solve these problems, we extend the above-mentioned transparent visualization method to a feature-highlighting transparent visualization method. The proposed method solves the problems of transparent visualization by emphasizing the feature area and enhancing the visibility of shape. In this research, the feature area mainly refers to a high-curvature portion such as the corner of an object. As a method of extracting the feature area, we use principal component analysis (PCA), which is relatively resistant to noise. The proposed method changes the order and the radius of PDS depending on the curvature calculated with PCA, in order to leave more points in the high-curvature portion. Thereby, SPBR can render the high-curvature portion with higher opacity than the low-curvature portion.

3.1 Principal Component Analysis (PCA)

In the calculation of the feature value using PCA, the covariance matrix S is first calculated from P(i) and the position coordinates of the points inside the sphere of radius r centered on P(i). The eigenvalues (λ1 > λ2 > λ3) and the eigenvectors (v0, v1, v2) are found from the covariance matrix. Each eigenvector represents the direction of a principal component axis, and the corresponding eigenvalue represents the variance in that direction. Also, since S is a covariance matrix (real, symmetric, and positive semidefinite), its eigenvalues are always non-negative, and eigenvectors belonging to different eigenvalues are orthogonal to each other. Figure 5 shows examples of a high-curvature and a low-curvature point.

Fig. 5. Distribution of points in the local area. √λ3 indicates the bulging degree of the curved surface of a point cloud.



The contribution rate of λ3 is defined as the pseudo curvature. In particular, the pseudo curvature represents the change of curvature Ck calculated by Eq. 3. In this research, to clearly visualize the division between the high-curvature portion and the low-curvature portion, the threshold of Ck is determined by user input. In a point cloud, the points larger than the threshold are defined as high-curvature points, and the points lower than the threshold are defined as low-curvature points.

Ck = λ3 / (λ1 + λ2 + λ3)    (3)
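The pipeline from a local neighbourhood of points to the pseudo curvature Ck can be sketched in pure Python. The Jacobi eigenvalue solver and the toy point sets below are illustrative; a planar patch should give Ck ≈ 0, while two meeting planes (an edge) give a clearly positive Ck:

```python
import math

def covariance(points):
    """3x3 covariance matrix of a local neighbourhood of 3D points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
             for j in range(3)] for i in range(3)]

def eigenvalues_sym3(a, tol=1e-12):
    """Eigenvalues of a symmetric 3x3 matrix by classical Jacobi rotations."""
    a = [row[:] for row in a]
    for _ in range(100):
        p, q = max(((i, j) for i in range(3) for j in range(i + 1, 3)),
                   key=lambda ij: abs(a[ij[0]][ij[1]]))
        if abs(a[p][q]) < tol:
            break
        theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
        c, s = math.cos(theta), math.sin(theta)
        for k in range(3):                      # column rotation: A <- A G
            akp, akq = a[k][p], a[k][q]
            a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
        for k in range(3):                      # row rotation: A <- G^T A
            apk, aqk = a[p][k], a[q][k]
            a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)

def pseudo_curvature(points):
    """Ck = lambda3 / (lambda1 + lambda2 + lambda3), Eq. (3)."""
    l1, l2, l3 = eigenvalues_sym3(covariance(points))
    return l3 / (l1 + l2 + l3)

grid = [i / 4.0 for i in range(5)]
flat = [(x, y, 0.0) for x in grid for y in grid]            # planar patch
corner = flat + [(0.0, y, z) for y in grid for z in grid]   # two meeting planes
print(round(pseudo_curvature(flat), 4), round(pseudo_curvature(corner), 4))
# → 0.0 0.1667
```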


3.2 Curvature-Dependent Poisson Disk Sampling

As mentioned in the preceding section, conventional Poisson disk sampling removes points in random order, and the radius of the Poisson disk is a constant. Consequently, the probability of point removal is the same no matter whether a point is high-curvature or low-curvature. In feature-highlighting that aims to enhance the visibility of shape, the points of the high-curvature portion are important. Therefore, to realize feature-highlighting transparent visualization, we propose curvature-dependent PDS, a method that keeps more points in the high-curvature portion and makes the point density uniform within each portion. The proposed method utilizes the fact that which points are removed in PDS depends on the order and radius of the sampling disk. Thus, the proposed method consists of curvature-dependent order PDS and curvature-dependent radius PDS. Details of each are described below.

Curvature-Dependent Order PDS. In the field of 2D image processing, point-drawing methods based on PDS with grayscale ordering of pixels and radius-order scanning of pixels have been proposed [18, 19]. Those studies found that edge preservation and the clarity of lines in the point-drawing image are improved. In this research, we propose a uniform downsampling method for a 3D point cloud by high-curvature area order PDS. High-curvature area order PDS differs from conventional random order PDS only in step 1 of the procedure described in Sect. 2.1. Step 1 of the proposed method is as follows:

1. Divide an input point cloud into a high-curvature area and a low-curvature area by using the set threshold value of Ck. Then, assign number k (= 1, …, m) randomly to each point in the high-curvature area. After that, assign number k (= m + 1, …, n) randomly to each point in the low-curvature area.

Verification of the Retention of High-Curvature Points. We verify the retention of high-curvature points by high-curvature area order PDS.
In the experiment, we first apply random order PDS and high-curvature area order PDS to a laser-scanned point cloud, namely Bunny, which has 205,719 points. The radii of the Poisson disks are the same. Then we compare the point clouds after applying each PDS in terms of the number of high-curvature points and the percentage of high-curvature points to total points. The visualization results of Bunny are shown in Fig. 6. To clearly visualize the high-curvature portion, which is the set of points larger than the threshold Ck = 0.01, those points are colored red.
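The modified step 1 changes only the visiting order of the dart-throwing loop. A minimal sketch (the two-point example is contrived so that an edge point and a nearby flat-surface point conflict inside one disk radius):

```python
import random

def curvature_ordered_pds(points, curvatures, r, ck_threshold, seed=0):
    """High-curvature-area-order PDS (sketch): points whose pseudo curvature
    exceeds the threshold are numbered -- and therefore visited -- before all
    other points, so they win dart-throwing conflicts and tend to survive."""
    rng = random.Random(seed)
    high = [p for p, c in zip(points, curvatures) if c > ck_threshold]
    low = [p for p, c in zip(points, curvatures) if c <= ck_threshold]
    rng.shuffle(high)
    rng.shuffle(low)
    kept, r2 = [], r * r
    for p in high + low:                      # high-curvature area first
        if all(sum((a - b) ** 2 for a, b in zip(p, q)) >= r2 for q in kept):
            kept.append(p)
    return kept

# Two conflicting points closer than r: the edge point must be the survivor.
pts = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0)]
cks = [0.15, 0.001]                            # edge point, flat-surface point
print(curvature_ordered_pds(pts, cks, r=0.1, ck_threshold=0.01))
# → [(0.0, 0.0, 0.0)]
```

With random ordering the flat-surface point would survive about half the time; visiting the high-curvature area first makes the edge point's retention deterministic.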



Fig. 6. Bunny after applying each PDS. The left image shows the result of random order PDS, and the right image shows the result of high-curvature area order PDS. (Color figure online)

Fig. 7. Enlarged view of Fig. 6. (Color figure online)

Table 1. The number of points and the percentage of high-curvature points to total points in Bunny after applying each PDS.

PDS method                                      Random order   High-curvature area order
Total points                                    20,877         20,957
High-curvature points                           2,445          3,103
Percentage of high-curvature points to total    11.71%         14.81%

From Figs. 6 and 7, we can observe that the result of high-curvature area order PDS retains more points of the high-curvature portion in the nose, eye, and ear regions than random order PDS. In addition, it can be seen from Table 1 that the number of high-curvature points and their percentage are larger for high-curvature area order PDS, even though the total numbers of points in both results are almost the same. Therefore, it is possible to improve edge preservation and the clarity of lines in a 3D point cloud by using curvature-dependent order PDS.



Distance Verification Between Adjacent Points. We also verify whether the distances between adjacent points of the point clouds after applying each PDS are kept uniform. In the experiment, polygon meshes are created for the original Bunny and for the results of each PDS, and the aspect ratios of the polygons are compared [20]. The aspect ratio is the value obtained by dividing the longest side of a triangular polygon by its shortest side. The closer the aspect ratio is to 1, the closer the triangular polygon is to an equilateral triangle, and the more uniform the inter-point distances are considered to be. This verification uses the same point clouds as the verification of the retention of high-curvature points.

Table 2. The aspect ratio of triangle polygons created with adjacent three points in Bunny.

Bunny                     Original   Random order   High-curvature area order
Number of polygons        322,206    41,756         41,808
Aspect ratio: Average     2.098      1.457          1.457
Aspect ratio: Deviation   1.187      0.341          0.338
Aspect ratio: Max         9.820      2.594          2.594
Aspect ratio: Min         1.001      1.001          1.001

In Table 2, when looking at the average aspect ratio, both PDS results are closer to 1 as compared with the original. Also, the original has a wider and more dispersed aspect-ratio range, so we can reconfirm that the original is a non-uniform point cloud. On the other hand, the standard deviations of both PDS results are small. Thus, it is considered that the inter-point distances of the point clouds after applying each PDS are uniform. Therefore, we can see that the uniformity of inter-point distances achieved by high-curvature area order PDS is equal to that of random order PDS. The verification results show that, only by changing the order of PDS according to a feature value, whether a point is removed or retained can be controlled. Finally, curvature-dependent order PDS is shown to be effective for feature-highlighting transparent visualization.

Curvature-Dependent Radius PDS. The opacity α of transparent visualization using SPBR is proportional to the number of generated points n. Also, the reduction rate of points varies greatly depending on the radius of the Poisson disk. Utilizing these facts, we propose curvature-dependent radius PDS, a method to control the opacity of SPBR by adaptively changing the radius of the Poisson disk. To determine the radius, it is necessary to calculate the n satisfying the opacity α set by user input. Since the radius can be determined from n at repeat level LR = 1, the required n is calculated by Eq. 4, which is obtained by setting LR = 1 in Eq. 2. At this time, if PDS is applied to a sufficiently dense point cloud, it is considered that the distance between adjacent points is equal to the radius of the Poisson disk. In addition, when assuming that n points are evenly arranged in a virtual area S, the radius r of the Poisson disk can be calculated with Eq. 5. Figure 8 illustrates this idea. However, because the points of a point cloud are not actually evenly arranged, the number of points after applying PDS will never be exactly the same as the n calculated with Eq. 4.
Therefore, as shown in Eq. 5, the radius r needs a correction term Cr in order to move the number of points after applying PDS closer to n.

n = ln(1 − α) / ln(1 − s/S)    (4)

r = Cr · √(S / (π n))    (5)
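Equations 4 and 5 can be exercised with a small helper. Cr = 1 and the s, S values below are placeholders, not values from the paper; the point of the check is that a higher target opacity yields a smaller disk radius and hence a denser sampled area:

```python
import math

def disk_radius(alpha, s, S, C_r=1.0):
    """Eqs. (4)-(5): Poisson-disk radius giving opacity alpha at L_R = 1.
    C_r is the empirical correction term; C_r = 1 here is a placeholder."""
    n = math.log(1.0 - alpha) / math.log(1.0 - s / S)   # Eq. (4)
    return C_r * math.sqrt(S / (math.pi * n))           # Eq. (5)

r_high = disk_radius(0.8, s=1.0, S=1000.0)  # high-curvature area: opaque, dense
r_low = disk_radius(0.2, s=1.0, S=1000.0)   # low-curvature area: faint, sparse
print(r_high < r_low)  # → True
```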



Fig. 8. Concept of points of a point cloud evenly arranged in a virtual area. The radius of the PDS can be determined by calculating the number of generated points.

In Eqs. 4 and 5, by setting the opacity α higher for high-curvature points, the radius r for high-curvature points becomes smaller. As a result, the high-curvature area becomes dense, and its color can also be made dark and clear. Figure 9 illustrates this concept.

Fig. 9. Concept of applying curvature-dependent radius PDS to a point cloud. In high-curvature area, the radius of PDS is small and the point density is high. On the other hand, in low-curvature area, the radius of PDS is large and the point density is low.



4 Experimental Results

We apply curvature-dependent order PDS to the laser-scanned point clouds and perform transparent visualization using SPBR. This experiment also uses the laser-scanned point clouds of Wada Kofun and Zuiganji Cave as the target data. In addition, to keep as many high-curvature points as possible, we apply the proposed method, which is a combination of high-curvature area order PDS and curvature-dependent radius PDS. The results obtained by the proposed method are regarded as feature-highlighting transparent visualization and are compared with the transparent visualization results obtained by the conventional method.

Figure 10 shows the visualization results of Wada Kofun. The left image is the transparent visualization of the conventional method with α = 0.2 and LR = 100. The right image is the feature-highlighting transparent visualization of the proposed method with LR = 100. The threshold of curvature is set to Ck = 0.05; the opacity in the high-curvature area is α = 0.8 and the opacity in the low-curvature area is α = 0.2. Compared to the left image, the right image emphasizes the shape of the stone room inside Wada Kofun. In addition, we can observe the state of the wall made by heaping up stones in the right image. Figure 11 shows the visualization results of Zuiganji Cave. The upper image in Fig. 11 is the transparent visualization of the conventional method with α = 0.2 and LR = 100. The lower image in Fig. 11 is the feature-highlighting transparent visualization of the proposed method with LR = 100. The threshold of curvature is set to Ck = 0.03; the opacity in the high-curvature area is α = 0.8 and the opacity in the low-curvature area is α = 0.2. In the upper image, we can observe the state of the small inner rooms, such as the arrangement of the Jizo statues. However, the shape of the Jizo statues and the state of the wall in the room are not clear.
In contrast, they are emphasized in the lower image, and the small door buried in the wall and the shape of Jizo are visualized clearly.

Fig. 10. Comparison of visualizing Wada Kofun by using the conventional method (left) and the proposed method (right). Both images are transparently visualized by SPBR.



Fig. 11. Comparison of visualizing Zuiganji Cave by using the conventional method (top) and the proposed method (bottom). These images are the enlarged view of a small room inside Zuiganji Cave.

From these experimental results, it can be seen that feature-highlighting transparent visualization can improve the visibility of feature areas that become difficult to see under transparent visualization.

5 Conclusion

In this research, we aimed at further enhancing the shape visibility of laser-scanned point clouds in transparent visualization. To achieve this goal, we proposed a feature-highlighting transparent visualization method, which is the combination of curvature-dependent PDS and SPBR. The proposed method can control the generated points and the opacity of a point cloud by adaptively changing the order and radius of PDS for each curvature area. In the experiment, we applied the proposed method to the laser-scanned point clouds of actual cultural assets and showed the visualization results. From the experimental results, it can be seen that the edge shape of cultural assets can be emphasized successfully while retaining the conventional transparent effect. Therefore, it becomes easier to comprehend the 3D structure and shape of cultural assets by feature-highlighting transparent visualization.


Image-Based 3D Shape Generation Used for 3D Printing Zemin Li1,2, Lin Zhang1,2(&), Yaqiang Sun1,2, Lei Ren1,2, and Yuanjun Laili1,2 1

School of Automation Science and Electrical Engineering, Beihang University, 100191 Beijing, China [email protected], [email protected] 2 Engineering Research Center of Complex Product Advanced Manufacturing Systems, Ministry of Education, 100191 Beijing, China

Abstract. 3D shape design is one of the most vital procedures in additive manufacturing, especially in the environment of cloud manufacturing, and designing a 3D shape consumes a great deal of time and energy. Our objective is to develop a 3D shape generation technique for the 3D printing model shape design process. Generative adversarial networks (GAN) in the field of deep learning have the potential to generate 3D shape models based on latent vectors sampled from prior latent spaces. We use a Conditional GAN to map image information to 3D printing shapes that satisfy the printability requirements. We evaluate the capability of our model to generate authentic 3D printing shapes across several classes. The model could thus serve as an assistant 3D printing shape designer.

Keywords: 3D printing · Cloud manufacturing · Generative adversarial networks

1 Introduction

3D printing technology can satisfy the personalized manufacturing demands of individuals and small teams, and has the characteristics of low cost and high speed [1]. 3D shape design is one of the key procedures in the 3D printing process [2]. The design of 3D shapes consumes a large amount of time and energy and requires professional design skills, which leads to a high design threshold. The design process could be simplified considerably if 3D shapes could be reconstructed from images, reducing the difficulty of design as well [3]. At present, Structure from Motion (SFM) and Multi-View Stereo (MVS) are the two main methods of image-based 3D reconstruction, and they are complementary, as they do not make the same assumptions [4, 5]. Several distributed software packages implement this function, such as insight3D [6], 3DFLOW, 123D Catch [7], etc. However, these procedures are comparatively complicated, with strict environmental conditions required. As research on 3D models in the field of deep learning has become more and more popular, and abundant 3D shape datasets have been constructed, generating 3D shapes with the help of generative models in deep neural networks is an alternative method.

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 539–551, 2018.


Z. Li et al.

Recently, many inspiring attempts to learn voxel-based deep representations of 3D shapes have gained improved results in classification, retrieval, and generation tasks for 3D shapes [8–10]. Particularly for the task of 3D shape generation, voxel-based representation of 3D shapes considerably reduces the computing burden [11]. What we need to do is map the relationship from image to 3D model. Generative Adversarial Networks (GAN) [12], one of the most popular generative models, consist of an independent generator and discriminator. The generator, fed with a vector sampled from a latent space, learns the distribution of the data and generates new objects, while the discriminator takes charge of closing the distance between the generated data distribution and the real data distribution. Image-based 3D shape generation takes the image information as the latent vector, which becomes a constraint on the generative net. Conditional GAN, a variant of the native GAN, is suitable for this kind of task. In the original GAN, the discriminator is assigned to simply distinguish fake and real input, and its loss function is based on the Jensen-Shannon divergence, so the gradient vanishing problem occurs easily. The recent Wasserstein GAN [13] handles this problem by leveraging the continuous Wasserstein distance, which represents the distribution difference between real data and generated data. WGAN tries to drive the Wasserstein distance down so that the generated data becomes as similar as possible to the real data; meanwhile, weight clipping is applied to enforce a Lipschitz constraint on the discriminator. The extended work WGAN-GP [14] further improves the training by adding a gradient penalty term with respect to its input. In our case, we apply WGAN-GP as the skeleton of our proposed 3D shape generation method. We present a method to generate printable 3D models based on images from a single view of an object in the real world. The issue in this study is how to establish the relationship in the neural network between 2D images and 3D models, which do not belong to the same dimension. Besides, converting generated 3D shapes to 3D objects that satisfy the requirements of 3D printing is another challenge. The proposed method, with a Conditional Generative Adversarial Network adopted and images fed in as conditional vectors, possesses the ability to generate state-of-the-art 3D shapes. In the following sections, we introduce the details of our network, the procedures of our experiment, and the results of our evaluations.

2 Related Work

2.1 Image-Based 3D Shape Modeling

Image-based 3D modeling has an advantage in data acquisition, as people can collect images with portable devices such as mobile phones, digital cameras, etc. However, the algorithms that implement 3D modeling from images are subject to some restrictions: we have to collect images of objects with a relatively small baseline, and sometimes users are limited by environmental conditions and must model the object from just a handful of views, or even one view, which presents issues for these algorithms [15]. The limitation proceeds from a number of key technical assumptions. One typical assumption is that features can be matched across views, as hypothesized by the majority of methods based on SFM and SLAM [16–19]. So users need to collect
sufficient images from as many views as possible while keeping the baseline between adjacent views relatively small. If the viewpoints are separated by a large baseline, it is extremely problematic to establish feature correspondences due to local appearance changes or self-occlusions [20]. Moreover, a lack of textures on objects and specular reflections also make the feature matching problem very difficult [21]. Recently, [15, 22–26] leveraged deep neural nets to learn 3D shape from multiple images. The attempts to generate or reconstruct 3D objects from images can be divided into two styles. In the first style, users reconstruct the 3D shape directly from images of multiple views, or just one view, of the object, as typically implemented in 3D-R2N2 [15], a novel recurrent neural network. 3D-R2N2 performs well with images rendered on plain backgrounds. In the other style, researchers establish the relationship between 3D models and 2D images via a latent space. [8] presents a TL-network that maps images into a latent vector with a convolutional network and then utilizes 3D deconvolution nets to generate 3D shapes. [27] leverages generative models such as Generative Adversarial Nets and Variational Autoencoders [28] to generate 3D shapes from images, with the latent vector serving as a link between images and 3D models. However, 3D-GAN is only proficient at producing high-quality objects from single classes. [29] proposed 3D-VAE-IWGAN to fill this gap, with the ability to generate multiple classes of objects. Even so, it is arbitrary to trust that a latent space can establish a solid relationship between images and 3D shapes. Moreover, most of the above neural-network methods generate 3D shapes represented in a 3D voxel grid. Extending this line of work, we introduce Conditional GAN to 3D modeling from images, as the images can be treated as a condition powerful enough to constrain generation to the 3D shapes presented in the input images.

2.2 Learning with Adversarial Nets

GAN was proposed by Goodfellow in 2014 [12]; it incorporates a learned adversarial discriminator into the procedure of generative modeling. GAN has the potential to mimic any distribution: as latent vectors sampled from prior distributions are fed into the trained generator, the traditional GAN can generate objects similar to samples from the training dataset. Since then, a variety of GANs have been proposed successively, for example, DCGAN, adapting GAN with convolutional networks for image synthesis [30]; WGAN, with the Wasserstein distance as a superior critic in the discriminator [13]; Conditional GAN, with a constraint to guide the generation [31]; CycleGAN for image style conversion [32]; SRGAN for creating super-resolution images [33]; etc. Conditional GAN is widely applied in image style conversion [34], image synthesis from text [35], and image inpainting [36], as CGAN transforms the traditional GAN from an unsupervised model into a supervised model that generates objects under instructions. GAN, as a powerful generative model, has also been introduced into 3D model research. [27] showed that the latent vector learned by 3D-GAN can generate high-quality 3D objects and improve object recognition accuracy as a shape descriptor. [29] proposed improved 3D-GAN and 3D-IWGAN models that can be trained on distributions involving multiple distinct object classes. [37] proposed to reconstruct 3D models from a single depth view via CGAN. [38] presented 3D-ED-GAN to inpaint 3D models with semantic plausibility and contextual details.


3 Models and Method

3.1

Our method aims at generating 3D shapes for 3D printing from images via a generative adversarial network. We leverage the learning ability of neural networks to generate 3D shapes represented in a 3D voxel grid containing simple occupancy information; the input is a single image, whereas the output shape is a 32³ occupancy grid in our network. We then obtain the printable file of the 3D shape. The whole pipeline of our method is shown in Fig. 1.

Fig. 1. Pipeline of our method: Image Input → Voxel Model Generation → Voxel Denoising → Mesh Model Generation → Export Printable File

The learned generator outputs a 3D voxel shape. However, the object format for 3D printing is a polygon mesh, so it is necessary to convert the voxel model to a mesh model, resulting in a stereolithography-format file.

3.2 Network Architecture

We use the following notation: the image encoder is denoted as E, the generator as G, and the discriminator as D.

Fig. 2. Training a conditional GAN to map images to voxel models

We pass the single image into a pre-trained model serving as a feature extractor to obtain a 2048-dimensional latent vector z. We feed z into our volumetric convolutional network, which serves as the generator in the Conditional WGAN-GP. Note that in our case the image feature serves as the conditional input, without a noise input. The generator thus learns a mapping from images to 3D shapes under the supervision of 3D shapes.

The generated model y′ is further fed into a conditional discriminator. In particular, a single image paired with its corresponding 3D voxel model y is regarded as a "real" example, while the image paired with the corresponding 3D voxel shape output by the generator is regarded as a "fake" example. In this way, we aim to force the discriminator to learn the matching relationship between the image space and the 3D shape space. The generator and discriminator are learned in an adversarial way: the generator tends to generate examples as similar as possible to real examples in order to confuse the discriminator, while the discriminator, as a critic, learns to distinguish whether the input pair is real or fake, outputting respective scores. The score difference between real and fake examples represents the distance between the real data distribution and the generated data distribution. The discriminator output signal is used to update both the generator and the discriminator asynchronously. The training process is illustrated in Fig. 2.

Image Encoder
Pre-trained deep learning models such as VGG-Net [39], Res-Net [40], and Google-Net [41], trained on ImageNet [42], can serve as image feature extractors. These pre-trained models with convolutional architectures are usually used in the classification task where images must be classified into one of 1000 different categories. Each pre-trained model can be divided into two parts: the feature-extracting layers and the classification layers that follow. The information we need is the output of the feature-extracting layers. In our case, we chose ResNet-50 to map a single image to a feature vector. On the one hand, this model's total parameter count is smaller than VGG-19's; on the other hand, its output feature map is 1 × 1 × 2048, which can easily be reshaped into a feature vector passed to the following generator and discriminator.

Generator
We design a 3D convolutional neural network to generate 3D objects, inspired by 3D-GAN. Distinct from traditional Conditional GANs, we omit the noise vector input because we found that the generator simply learned to ignore the noise, and we desire the generator to produce deterministic outputs. The image feature vector is processed by a fully-connected layer in the generator, outputting an embedding vector of the same size. The generator can therefore make an adaptive adjustment of the external signal via the FC layer's parameter updates. We then reshape the embedding vector into 2 × 2 × 2 × 256 as the source feature map of the 3D deconvolutional network. As illustrated in Fig. 3, the network includes five fully convolutional layers with kernel size 4 × 4 × 4 and stride 2. The generator maps a 2048-dimensional vector, extracted from a single image via the pre-trained encoder, to a 32³ cube representing a 3D shape in voxel space.

Fig. 3. Generator architecture


Discriminator
The discriminator is arranged to classify whether the estimated 3D shapes are plausible or not by scoring the 3D voxel models. As stated above, the input signal of the conditional discriminator includes not only the 3D shape but also the image feature vector. The input 3D voxel cube has shape 32³, whereas the image feature is represented as a vector; the difficulty is how to combine these two signals of distinct dimensionality in the discriminator. We adopt the method from generative adversarial text-to-image synthesis [35]. We process the input 3D voxel data with three 3D convolutional layers of kernel size 4 × 4 × 4 and stride 2, with no spatial batch normalization in between, but each followed by a leaky ReLU. We then reduce the dimensionality of the image feature vector through a fully connected layer to a 200-dimensional embedding vector. When the spatial dimension of the 3D convolutional layers' output is 4 × 4 × 4, we replicate the embedding vector spatially, resulting in a matrix of size 4 × 4 × 4 × 200, and perform a depth concatenation. We then process the combined data via a 3D convolutional layer with the same configuration as above. Finally, we get a final score from the discriminator. The detailed architecture is illustrated in Fig. 4.

Fig. 4. Discriminator architecture
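The depth-concatenation step of the conditional discriminator can be sketched shape-wise as follows. This is a NumPy stand-in under stated assumptions: the random 64-channel conv feature map and the random projection matrix replace the real learned layers, and only the tensor shapes match the description in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for learned tensors: a 4x4x4 conv feature map with 64 channels
# (hypothetical channel count) and a 2048-d image feature from the encoder.
conv_features = rng.standard_normal((4, 4, 4, 64))
image_feature = rng.standard_normal(2048)

# Fully connected projection 2048 -> 200 (random weights as a placeholder).
W = rng.standard_normal((2048, 200)) * 0.01
embedding = image_feature @ W                      # shape (200,)

# Replicate the embedding spatially and concatenate along the channel axis.
tiled = np.broadcast_to(embedding, (4, 4, 4, 200))
combined = np.concatenate([conv_features, tiled], axis=-1)

print(combined.shape)  # (4, 4, 4, 264)
```

Because the same embedding is copied to every spatial location, every 4 × 4 × 4 cell of the combined tensor carries the full image condition alongside its local shape features.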



We aim to train a conditional WGAN-GP to map a picture to a 3D shape; therefore, besides the objective function L_gan for the conditional GAN, an object reconstruction loss L_recon is introduced.

Reconstruction Loss
For the generator, we use a modification of the binary cross-entropy loss function instead of the original version, inspired by [43]. The original binary cross-entropy weights false positives and false negatives equally. Nevertheless, most of the voxel space tends to be empty, and the network can drop into a local optimum by outputting mostly negatives. In this setting, we impose a higher penalty on false negatives than on false positives by assigning a hyper-parameter τ that weights the relative importance of false positives against false negatives, as shown in Eq. 1:

L_recon = −τ y log(y′) − (1 − τ)(1 − y) log(1 − y′)   (1)

where y is the target value in {0, 1} and y′ is the output value in (0, 1) for each voxel from the generator.

GAN Loss
In our conditional WGAN-GP setting, we modify the loss function slightly; detailed definitions can be found in the WGAN-GP paper:

L_gan^g = −E[D(y′ | z)]

L_gan^d = E[D(y′ | z)] − E[D(y | z)] + λ E[(‖∇_ŷ D(ŷ | z)‖₂ − 1)²]

where ŷ = εy + (1 − ε)y′ with ε ~ U[0, 1], and λ controls the trade-off between the original WGAN loss and the gradient penalty. As stated in the original paper, the advised value of λ is 10. Here z represents the input image feature vector, y′ is the generated value in (0, 1) for each voxel from the generator, and y is the target value in {0, 1}.

Two loss functions, L_recon and L_gan^g, synergistically optimize the generator in our case. Minimizing L_recon tends to learn the overall 3D shape, whilst minimizing L_gan^g tends to decrease the high-frequency noise of the predicted output. To jointly optimize the generator and keep the two losses on a comparable scale, we assign weight η₁ to L_gan^g and η₂ to L_recon. Overall, the loss function for the generator is:

L_g = η₁ L_gan^g + η₂ L_recon
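The weighted reconstruction loss (Eq. 1) can be checked numerically. A small NumPy sketch with the paper's τ = 0.85; the four-voxel toy targets and predictions are made up for illustration:

```python
import numpy as np

def recon_loss(y, y_pred, tau=0.85):
    """Weighted binary cross-entropy: with tau > 0.5, missing an
    occupied voxel (false negative) costs more than wrongly filling
    an empty one (false positive)."""
    y = np.asarray(y, float)
    y_pred = np.clip(np.asarray(y_pred, float), 1e-7, 1 - 1e-7)
    return np.mean(-tau * y * np.log(y_pred)
                   - (1 - tau) * (1 - y) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0])

# One confident false positive on the empty voxel...
fp = recon_loss(y_true, np.array([0.99, 0.90]))
# ...versus a comparable false negative on the occupied voxel.
fn = recon_loss(y_true, np.array([0.10, 0.01]))
print(fp < fn)  # True
```

The asymmetry counters the "all-empty" local optimum: a generator that outputs mostly zeros accumulates the heavily weighted false-negative term.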


During training, we set η₁ to 0.5 and η₂ to 100.

3.4


Our architecture is trained end-to-end in a supervised way. The image encoder provides the feature vector for both the generator and the discriminator. Nevertheless, we decided not to fine-tune the image encoder during training, as the parameters of the generator and discriminator are updated asynchronously. To be specific, if we updated the image encoder parameters paired alternately with the generator and discriminator in a feed-forward manner, the image encoder would fall into an unstable state and disturb the learning of the generator and discriminator. The image feature extraction ability of the pre-trained ResNet-50 is convincing, and we adapt the feature vector by means of a fully-connected layer in the generator and discriminator respectively. The discriminator should be more powerful than the generator, because the incentive for the generator to update comes from the discriminator; meanwhile, there is no need to worry about the gradient vanishing problem. Therefore, we update the discriminator five times for each generator update. We set the learning rate of both G and D to 0.001, use a batch size of 128, and use the ADAM optimizer with α = 0.0001, β₁ = 0.5, β₂ = 0.9. We summarize the training procedure in Algorithm 1:

Algorithm 1: 3D-ICGAN, our proposed algorithm. We use the default values λ = 10, n_critic = 5, α = 0.0001, β₁ = 0.5, β₂ = 0.9, τ = 0.85, η₁ = 0.5, η₂ = 100.
Require: the gradient penalty coefficient λ, the number of discriminator iterations per generator iteration n_critic, the batch size m, Adam hyper-parameters α, β₁, β₂, the reconstruction loss hyper-parameter τ, the generator-update loss hyper-parameters η₁, η₂.
Require: ω₀, initial discriminator parameters; θ₀, initial generator parameters.
1:  while θ has not converged do
2:    for t = 1, …, n_critic do
3:      for i = 1, …, m do
4:        Extract image feature vector z; sample a random number ε ~ U[0, 1]
5:        y′ ← G_θ(z)
6:        ŷ ← εy + (1 − ε)y′
7:        L_D^(i) ← D_ω(y′) − D_ω(y) + λ(‖∇_ŷ D(ŷ | z)‖₂ − 1)²
8:      end for
9:      ω ← Adam(∇_ω (1/m) Σ_{i=1}^{m} L_D^(i), ω, α, β₁, β₂)
10:   end for
11:   Extract a batch of image feature vectors {z^(i)}_{i=1}^{m}
12:   y′ ← G_θ(z)
13:   L_recon^(i) ← −τ y log(y′) − (1 − τ)(1 − y) log(1 − y′)
14:   L_G^(i) ← −D_ω(G_θ(z))
15:   θ ← Adam(∇_θ (1/m) Σ_{i=1}^{m} (η₁ L_G^(i) + η₂ L_recon^(i)), θ, α, β₁, β₂)
16: end while
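The alternating update schedule of Algorithm 1 (n_critic = 5 discriminator steps per generator step) can be sketched as a skeleton loop. The `update_discriminator` and `update_generator` callables are placeholders, not the real Adam updates:

```python
N_CRITIC = 5  # discriminator updates per generator update, as in Algorithm 1

def train(num_generator_steps, update_discriminator, update_generator):
    """Alternate n_critic discriminator steps with one generator step."""
    for _ in range(num_generator_steps):
        for _ in range(N_CRITIC):
            update_discriminator()
        update_generator()

# Count the calls to check the 5:1 ratio.
calls = {"D": 0, "G": 0}
train(3,
      update_discriminator=lambda: calls.__setitem__("D", calls["D"] + 1),
      update_generator=lambda: calls.__setitem__("G", calls["G"] + 1))
print(calls)  # {'D': 15, 'G': 3}
```

Keeping the critic ahead of the generator is what makes the Wasserstein estimate in lines 7–9 meaningful before each generator step.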

Data Generation
For our training task, we need a large amount of training data. We collect 3D shapes from the ShapeNet database [44], which covers 55 common object categories with about 51,300 unique 3D models. We render all the models from 15 viewpoints, at an elevation of 30° and 15 azimuth angles from 0° to 360° in increments of 24°. We rendered the images on a plain background. In future work, we will render the images on colorful backgrounds randomly selected from scene datasets such as the SUN database [45]. We converted each 3D shape into a 32³ voxel grid using binvox [46].
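The 15 rendering viewpoints described above (fixed 30° elevation, azimuths from 0° in 24° steps) can be listed directly:

```python
# 15 azimuth angles in 24-degree increments, all at 30 degrees elevation,
# matching the rendering setup used to build the training images.
ELEVATION = 30
azimuths = list(range(0, 360, 24))

viewpoints = [(ELEVATION, az) for az in azimuths]
print(len(viewpoints))   # 15
print(azimuths[:4])      # [0, 24, 48, 72]
```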


4 Experiment and Evaluation

In this section, we evaluate our 3D Conditional WGAN-GP and introduce the procedure for obtaining a printable 3D shape file from the voxel grid output of the deep generative model.

4.1 3D Shape Generation Evaluation

We jointly trained our model over 5 classes of 3D shapes, consisting of bed, chair, sofa, desk and table, each of which contains 1000 different models, except for bed. As the scale of the generator loss scored by the discriminator on generated voxel data is easily affected by an invariant constant offset of the discriminator, we only track the discriminator loss and plot the curve in Fig. 5.

Fig. 5. Loss curves: negative critic loss (left) and reconstruction loss (right)

As illustrated in Fig. 5, the negative critic loss of our model converges toward a minimum as the network trains, which shows that the generator minimizes the Wasserstein distance between the real data distribution and the generated data distribution and generates increasingly plausible 3D shapes. Besides, we recorded the reconstruction loss on a test set that has no intersection with the training set. The test loss is also illustrated in Fig. 5, showing that the average reconstruction loss converges toward a minimum.

4.2 3D Printing Evaluation

The 3D shapes generated by our 3D Conditional GAN are represented in a voxel grid, whereas printable 3D shapes are in polygon mesh format. We aim to obtain physical 3D shapes rather than a group of visual 3D voxels without coordinate descriptions, so it is necessary to convert the voxel representation to a polygon mesh represented by the coordinates of its vertices. Several uncontrolled outliers away from the main voxel structure appear at times in the output of our deep learning model, and these redundant voxels have a negative impact on the efficiency of 3D printing, since needless support structures for the outliers are produced in the slicing procedure of additive manufacturing. We present a method called connected-domain detection, which employs 26-connectivity in our 3D binary grid; we implemented this method in MATLAB. In this way, we eliminate the outlier voxels, treated as noise in our 3D voxel model (see Fig. 6).

Fig. 6. Input images and 3D printing objects (columns: input image, voxel model, denoised model, polygon mesh).
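The connected-domain detection step (the paper's MATLAB implementation is not shown) can be sketched in Python as a breadth-first search over the 26-neighborhood that keeps only the largest component; the toy grid below is illustrative:

```python
from collections import deque
import numpy as np

def largest_component(vox):
    """Keep only the largest 26-connected component of a binary voxel
    grid, discarding outlier voxels (a sketch of the denoising step)."""
    vox = np.asarray(vox, bool)
    labels = np.zeros(vox.shape, int)
    offsets = [(dx, dy, dz) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
               for dz in (-1, 0, 1) if (dx, dy, dz) != (0, 0, 0)]
    best, best_label, current = 0, 0, 0
    for seed in zip(*np.nonzero(vox)):
        if labels[seed]:
            continue
        current += 1
        labels[seed] = current
        size, queue = 0, deque([seed])
        while queue:
            x, y, z = queue.popleft()
            size += 1
            for dx, dy, dz in offsets:
                n = (x + dx, y + dy, z + dz)
                if all(0 <= n[i] < vox.shape[i] for i in range(3)) \
                        and vox[n] and not labels[n]:
                    labels[n] = current
                    queue.append(n)
        if size > best:
            best, best_label = size, current
    return vox & (labels == best_label)

# A 2x2x2 block plus one isolated outlier voxel: the outlier is removed.
grid = np.zeros((6, 6, 6), bool)
grid[0:2, 0:2, 0:2] = True   # main structure (8 voxels)
grid[5, 5, 5] = True         # noise voxel
clean = largest_component(grid)
print(int(clean.sum()))      # 8
print(bool(clean[5, 5, 5]))  # False
```

Removing the outliers before meshing avoids the needless support structures mentioned above.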

The generated 3D voxel models are denoised by extracting the largest connected component of the generated object. However, the denoised voxel model is still not a printable 3D object; we also need to convert the voxel data to a polygon mesh. The file format used in 3D printing is STL (an abbreviation of "stereolithography"), which describes only the surface geometry of a three-dimensional object. In this case, we apply the Lewiner marching cubes algorithm to find surfaces in the 3D volumetric data, so we are able to export an STL file for 3D printing. We printed our generated objects, and as illustrated in Fig. 6, our model possesses the ability to generate 3D printable objects given an input image.
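Once marching cubes has produced vertices and triangular faces (e.g., via `skimage.measure.marching_cubes`, which is assumed here and not imported), exporting an ASCII STL file amounts to writing one `facet` block per triangle. A minimal sketch with a single hand-made triangle instead of a full mesh:

```python
import numpy as np

def write_ascii_stl(path, verts, faces, name="model"):
    """Write triangles as an ASCII STL file: one 'facet' per face,
    with a normal computed from the triangle's edge cross product."""
    verts = np.asarray(verts, float)
    with open(path, "w") as f:
        f.write(f"solid {name}\n")
        for tri in np.asarray(faces):
            a, b, c = verts[tri[0]], verts[tri[1]], verts[tri[2]]
            n = np.cross(b - a, c - a)
            norm = np.linalg.norm(n)
            n = n / norm if norm > 0 else n
            f.write(f"  facet normal {n[0]:e} {n[1]:e} {n[2]:e}\n")
            f.write("    outer loop\n")
            for v in (a, b, c):
                f.write(f"      vertex {v[0]:e} {v[1]:e} {v[2]:e}\n")
            f.write("    endloop\n  endfacet\n")
        f.write(f"endsolid {name}\n")

verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
faces = np.array([[0, 1, 2]])
write_ascii_stl("triangle.stl", verts, faces)
print(open("triangle.stl").readline().strip())  # solid model
```

In practice a binary STL (or a library such as numpy-stl) would be preferred for the large meshes produced from 32³ grids, but the facet structure is the same.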

5 Conclusion

In this work, we propose a novel method to generate 3D printing models from images via a Generative Adversarial Network. By leveraging generative models, our method predicts 3D models represented in a voxel grid; we then post-process the output to obtain a 3D printable file and print the generated 3D models. We evaluate our method on 3D shape generation and 3D printing. The results show that our method is able to generate 3D printable models even when only images are provided, and in this way the procedure of 3D shape design can be simplified. In future work, we will train our deep generative network across multi-class objects and improve the stability of our method.

Acknowledgement. This work is partially supported by the National Natural Science Foundation of China (No. 61374199).

References 1. Mai, J., Zhang, L., Tao, F., Ren, L.: Customized production based on distributed 3D printing services in cloud manufacturing. Int. J. Adv. Manuf. Technol. 84, 71–83 (2016) 2. Rengier, F., et al.: 3D printing based on imaging data: review of medical applications. Int. J. Comput. Assist. Radiol. Surg. 5, 335–341 (2010) 3. Remondino, F., El-Hakim, S.: Image-based 3D modelling: a review. Photogram. Rec. 21, 269–291 (2006) 4. Westoby, M., Brasington, J., Glasser, N., Hambrey, M., Reynolds, J.: 'Structure-from-Motion' photogrammetry: a low-cost, effective tool for geoscience applications. Geomorphology 179, 300–314 (2012) 5. Seitz, S.M., Curless, B., Diebel, J., Scharstein, D., Szeliski, R.: A comparison and evaluation of multi-view stereo reconstruction algorithms. In: CVPR, pp. 519–528. IEEE (2006) 6. Mach, L.: Insight3D. Open Source image based 3D modelling software. Recuperado el (2012) 7. Chandler, J., Fryer, J.: Autodesk 123D catch: how accurate is it. Geomat World 2, 28–30 (2013) 8. Girdhar, R., Fouhey, D.F., Rodriguez, M., Gupta, A.: Learning a predictable and generative vector representation for objects. In: European Conference on Computer Vision, pp. 484–499. Springer (2016)


Z. Li et al.

9. Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 945–953 (2015) 10. Qi, C.R., Su, H., Nießner, M., Dai, A., Yan, M., Guibas, L.J.: Volumetric and multi-view CNNs for object classification on 3D data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5648–5656 (2016) 11. Kar, A., Tulsiani, S., Carreira, J., Malik, J.: Category-specific object reconstruction from a single image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1966–1974 (2015) 12. Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680 (2014) 13. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN. arXiv preprint arXiv:1701.07875 (2017) 14. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C.: Improved training of Wasserstein GANs. In: Advances in Neural Information Processing Systems, pp. 5767–5777 (2017) 15. Choy, C.B., Xu, D., Gwak, J., Chen, K., Savarese, S.: 3D-R2N2: a unified approach for single and multi-view 3D object reconstruction. In: European Conference on Computer Vision, pp. 628–644. Springer (2016) 16. Fitzgibbon, A., Zisserman, A.: Automatic 3D model acquisition and generation of new images from video sequences. In: 9th European Signal Processing Conference (EUSIPCO 1998), pp. 1–8. IEEE (1998) 17. Lhuillier, M., Quan, L.: A quasi-dense approach to surface reconstruction from uncalibrated images. IEEE Trans. Pattern Anal. Mach. Intell. 27, 418–433 (2005) 18. Häming, K., Peters, G.: The structure-from-motion reconstruction pipeline–a survey with focus on short image sequences. Kybernetika 46, 926–937 (2010) 19. Fuentes-Pacheco, J., Ruiz-Ascencio, J., Rendón-Mancha, J.M.: Visual simultaneous localization and mapping: a survey. Artif. Intell. Rev. 43, 55–81 (2015) 20. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60, 91–110 (2004) 21. Saponaro, P., Sorensen, S., Rhein, S., Mahoney, A.R., Kambhamettu, C.: Reconstruction of textureless regions using structure from motion and image-based interpolation. In: International Conference on Image Processing (ICIP), pp. 1847–1851. IEEE (2014) 22. Gadelha, M., Maji, S., Wang, R.: 3D shape induction from 2D views of multiple objects. In: International Conference on 3D Vision (3DV), pp. 402–411. IEEE (2017) 23. Rezende, D.J., Eslami, S.A., Mohamed, S., Battaglia, P., Jaderberg, M., Heess, N.: Unsupervised learning of 3D structure from images. In: Advances in Neural Information Processing Systems, pp. 4996–5004 (2012) 24. Tulsiani, S., Zhou, T., Efros, A.A., Malik, J.: Multi-view supervision for single-view reconstruction via differentiable ray consistency. In: CVPR, p. 3 (2017) 25. Tatarchenko, M., Dosovitskiy, A., Brox, T.: Multi-view 3D models from single images with a convolutional network. In: European Conference on Computer Vision, pp. 322–337. Springer (2016) 26. Lun, Z., Gadelha, M., Kalogerakis, E., Maji, S., Wang, R.: 3D shape reconstruction from sketches via multi-view convolutional networks. In: International Conference on 3D Vision (3DV), 2017, pp. 67–77. IEEE (2017) 27. Wu, J., Zhang, C., Xue, T., Freeman, B., Tenenbaum, J.: Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In: Advances in Neural Information Processing Systems, pp. 82–90 (2016) 28. Doersch, C.: Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908 (2016)

Image-Based 3D Shape Generation Used for 3D Printing


29. Smith, E., Meger, D.: Improved adversarial systems for 3D object generation and reconstruction. arXiv preprint arXiv:1707.09557 (2017) 30. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015) 31. Mirza, M., Osindero, S.: Conditional generative adversarial nets. arXiv preprint arXiv:1411. 1784 (2014) 32. Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycleconsistent adversarial networks. arXiv preprint (2017) 33. Ledig, C., et al.: Photo-realistic single image super-resolution using a generative adversarial network. In: CVPR, p. 4 (Year) 34. Isola, P., Zhu, J.-Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. arXiv preprint (2017) 35. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis. arXiv preprint arXiv:1605.05396 (2016) 36. Agrawal, M., Sawhney, K.: Exploring convolutional neural networks for automatic image colorization. Technical report 37. Yang, B., Wen, H., Wang, S., Clark, R., Markham, A., Trigoni, N.: 3D object reconstruction from a single depth view with adversarial learning. arXiv preprint arXiv:1708.07969 (2017) 38. Wang, W., Huang, Q., You, S., Yang, C., Neumann, U.: Shape inpainting using 3D generative adversarial network and recurrent convolutional networks. arXiv preprint arXiv: 1711.06375 (2017) 39. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) 40. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770– 778 (2016) 41. Szegedy, C., et al.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015) 42. 
Russakovsky, O., et al.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vision 115, 211–252 (2015) 43. Brock, A., Lim, T., Ritchie, J.M., Weston, N.: Generative and discriminative voxel modeling with convolutional neural networks. arXiv preprint arXiv:1608.04236 (2016) 44. Chang, A.X., et al.: Shapenet: an information-rich 3D model repository. arXiv preprint arXiv:1512.03012 (2015) 45. Xiao, J., Hays, J., Ehinger, K.A., Oliva, A., Torralba, A.: Sun database: large-scale scene recognition from abbey to zoo. In: IEEE conference on Computer vision and pattern recognition (CVPR), 2010, pp. 3485–3492. IEEE (2010) 46. Min, P.: Binvox, a 3D mesh voxelizer (2004).*min/binvox

A Memory Efficient Parallel Particle-Based Volume Rendering for Large-Scale Distributed Unstructured Volume Datasets in HPC Environments

Yoshiaki Yamaoka1, Kengo Hayashi1, Naohisa Sakamoto1, and Jorji Nonaka2

1 Kobe University, Kobe, Japan
[email protected]
2 RIKEN Center for Computational Science, Kobe, Japan

Abstract. In recent years, the size and complexity of the datasets generated by large-scale numerical simulations on modern HPC (High Performance Computing) systems have been continuously increasing. These datasets can possess different formats, types, and attributes. In this work, we focus on large-scale distributed unstructured volume datasets, which are still widely used in numerical simulations in a variety of scientific and engineering fields. Although volume rendering is one of the most popular techniques for analyzing and exploring a given volume dataset, in the case of unstructured volume data the time-consuming visibility sorting becomes problematic as the data size increases. Focusing on effective volume rendering of large-scale distributed unstructured volume datasets generated in HPC environments, we opted to use the well-known PBVR (Particle-based Volume Rendering) method. Although PBVR does not require any visibility sorting during the rendering process, the CPU-based approach has a notorious tradeoff between image quality and memory consumption. This is because the entire set of intermediate rendering primitives (particles) must be stored prior to the rendering process. To minimize this memory pressure, we propose a fully parallel PBVR approach that eliminates the need to store these intermediate rendering primitives, as required by existing approaches. In the proposed method, each set of rendering primitives is directly converted to a partial image by its process, and these partial images are then gathered and merged by the utilized parallel image composition library (234Compositor). We evaluated the memory cost and processing time using a real CFD simulation result, and we could verify the effectiveness of our proposed method compared to an already existing parallel PBVR method.

Keywords: Particle-based Volume Rendering · 234 image composition · In-situ visualization · Unstructured volume dataset

© Springer Nature Singapore Pte Ltd. 2018 L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 552–562, 2018.



1 Introduction and Motivations

In recent years, advances in parallel computing technologies have continuously increased the size and complexity of numerical simulations in modern HPC operational environments. As a result, the generated simulation results have also proportionately increased in size and complexity; most of them are stored as large sets of distributed files and can represent a variety of data formats, types, and attributes. In this work, we focus our attention on large-scale distributed unstructured volume datasets with complex geometries, such as those utilizing a mixture of different geometrical elements. It is worth noting that this kind of dataset is still commonly used in a variety of scientific and engineering fields. Direct volume rendering [1] is one of the most popular scientific visualization techniques for analyzing and exploring an entire volume dataset by appropriately setting the opacity attributes, via a user-defined transfer function, in order to highlight or hide the parts of the data to which a user is paying attention. Although this rendering technique can run efficiently on structured datasets, direct volume rendering of unstructured volume data is still a challenging task, especially for large datasets. Direct volume rendering requires visibility sorting along the direction of the viewing ray, which is straightforward in the case of structured volume data, such as voxel data, because of the ease of finding the neighboring grid data. However, in the case of unstructured volume datasets, the computation of the cell adjacency information for each viewing ray becomes time-consuming as the number of cell elements or the size of the rendering image increases. In addition, the problem becomes more acute in the case of complex geometry data with heterogeneous cell elements, such as the one utilized in our evaluations.
We used an unstructured volume dataset composed of tetrahedral, hexahedral, and prismatic cells, produced by a CFD (Computational Fluid Dynamics) simulation. It is worth noting that this kind of heterogeneous geometric cell is also used in CSM (Computational Structure Mechanics) simulations. Almost a decade ago, in 2007, PBVR (Particle-based Volume Rendering) [2] emerged as an alternative to traditional volume rendering by revisiting Sabella's particle emission model [3]. The PBVR method represents a given volume dataset as a set of opaque and self-illuminated particles used as intermediate geometric primitives. The particle generation takes the user-specified transfer function into consideration to determine the particle density distribution inside the volume. The most important point of the PBVR method is that, independent of the cell type, the input volume data is converted to the same geometric primitive (the particle). The generated particles are assumed to be totally opaque, and since there is no transparency information anymore, the visibility-sorting process during the rendering phase becomes unnecessary. The generated particles are then projected onto the image plane to generate the corresponding volume rendering image. Since the particles are generated in a stochastic fashion, ensemble averaging of images rendered using different sets of particles increases the image quality proportionately as the number of ensembles increases.



Several improvements and extensions to PBVR have been proposed so far, including some parallelization approaches. Among them, Kawamura et al. [4] proposed a parallelization of the particle generation stage along with a mechanism for gathering and transferring the generated particle data to be used in a remote visualization system. Although this approach can handle large-scale datasets generated by HPC systems, the particle generation stage needs to be re-executed whenever the transfer function is updated, due to its dependency on it. An adaptive particle size adjustment technique [5] was proposed to eliminate this transfer function dependency and enable interactive transfer function modification, without the need for any particle regeneration, during interactive visual exploration. There is also a time-varying LOD rendering technique [6] proposed for PBVR to enable the handling of large-scale time-varying datasets, including unstructured volume data. However, all these existing PBVR-based large-scale volume rendering approaches are not fully parallelized; that is, the particle rendering process remains non-parallelized, and the generated particles are gathered onto a single rendering process, which generates the final rendering image. It is worth noting that in all these parallel PBVR approaches, the number of generated particles can become large depending on the utilized transfer function as well as the required number of ensembles, which defines the resulting image quality. This single-rendering-process approach can put high pressure on memory consumption, since the entire set of generated particles must be stored before initiating the rendering process. In this paper, we propose a memory efficient parallel PBVR, which also parallelizes the rendering stage by generating sub-images at each rendering process, thus avoiding the particle data transfer.
These lightweight sub-images are then gathered and processed by a flexible parallel image composition library named 234Compositor [7]. We evaluated the effectiveness of this proposed approach using a large-scale distributed unstructured volume dataset, composed of prismatic cells and generated from a thermal fluid simulation in an HPC environment [8].

2 Method

PBVR is an object-space stochastic rendering approach [9]: the particle density within a given volume is first estimated by taking the user-defined transfer function into consideration. Independently of the data format of the original volumetric dataset, the generated particle datasets have a unique and common format to be processed by the subsequent rendering process, which generates the final image. In this section, we present an overview of the proposed parallel PBVR method, implemented using the KVS (Kyoto Visualization System) framework [10].

2.1 Processing Workflow


The PBVR workflow can be separated into the following three main processes: particle generation, particle projection, and ensemble averaging. In the existing parallel PBVR approaches, only the particle generation process is executed in parallel; that is, the particle generation is independently executed for each of the sub-volumes. In our proposed parallel PBVR approach, the particle projection process is also executed in parallel, as shown in Fig. 1. In this figure, each rank represents an MPI (Message Passing Interface) process, and the portion marked with a red rectangle represents the repetition loop required to improve the image quality, which is controlled by the pre-defined number of ensembles. Considering that large-scale numerical simulations on modern HPC systems can generate a much larger number of distributed files than the number of processes allocated for rendering, we implemented a data loading mechanism to absorb the mismatch between the number of files and the allocated number of data loading processes. Each data loading process is capable of reading a pre-defined number of files containing the sub-volume data; that is, the users can freely select a number of processes that evenly divides the number of distributed physical files. For instance, if the number of distributed files is 16, it is possible to use 16, 8, 4, or 2 MPI processes for reading the data in parallel. In the next sub-sections, the particle generation, projection, and sub-image gathering and merging processes are explained in more detail.
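As an illustration of this file-to-process mapping (our sketch, not the authors' implementation), a simple contiguous block assignment could look like the following, where the process count must evenly divide the file count:

```python
def assign_files(num_files, num_procs):
    """Return, for each MPI rank, the list of file indices it should read."""
    if num_files % num_procs != 0:
        raise ValueError("the process count must evenly divide the file count")
    per_proc = num_files // num_procs
    return [list(range(rank * per_proc, (rank + 1) * per_proc))
            for rank in range(num_procs)]

# Example from the text: 16 distributed files read by 8 MPI processes,
# so each process becomes responsible for 2 sub-volume files.
print(assign_files(16, 8)[0])  # rank 0 reads files [0, 1]
```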

Fig. 1. Overview of the proposed parallel PBVR method. (Color figure online)


2.2 Particle Generation

After the data loading, each MPI process generates the particle datasets for its loaded sub-volumes. The particle generation is governed by the following Eq. (1), where the particle density ρ represents the number of particles within a unit volume:

$$\rho = \frac{-\log(1 - \alpha)}{\pi r^2 \Delta t} \qquad (1)$$




In this equation, α is the opacity value given by the user-specified transfer function, r is the radius of the particles, and Δt is the length of the integration interval utilized in the traditional volume rendering brightness equation [11]. Since the particle generation occurs in a cell-by-cell fashion, intra-node parallelization using multi-thread processing can be applied; we utilized OpenMP multi-threading in our implementation. The number of generated particles for each sub-volume can be calculated by multiplying the volume of the grid cells by the particle density ρ. It is worth noting that, in the particle generation process, the number of particles can also be controlled by applying the particle size adjustment technique [5]. The KVS framework outputs particles possessing the following information:

• Coordinate Data (3 × 4-byte floating point values)
• Color Data (4 × 1-byte values)
• Normal Vector Data (3 × 4-byte floating point values)
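Eq. (1) and the resulting per-cell particle count are simple to sketch in Python. The fragment below is our illustration only; the variable names are ours and the actual KVS implementation is more involved:

```python
import math

def particle_density(alpha, r, dt):
    """Eq. (1): particles per unit volume for opacity alpha, particle
    radius r, and ray-integration step length dt."""
    return -math.log(1.0 - alpha) / (math.pi * r * r * dt)

def expected_particles(alpha, r, dt, cell_volume):
    """Expected number of particles generated inside one cell:
    density multiplied by the cell volume (Sect. 2.2)."""
    return particle_density(alpha, r, dt) * cell_volume

# A higher opacity in the transfer function yields a denser particle cloud.
low = expected_particles(alpha=0.1, r=0.01, dt=0.1, cell_volume=1.0)
high = expected_particles(alpha=0.9, r=0.01, dt=0.1, cell_volume=1.0)
```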

2.3 Particle Projection

In the existing parallel PBVR techniques [4–6], the entire set of generated particles is transferred a priori to the rendering process on the master MPI node, and, as shown in the previous sub-section, each particle requires 28 bytes of storage. To avoid high memory pressure when the number of particles becomes large, we also execute the rendering in parallel, avoiding the gathering of the particle data onto the master MPI node. In the rendering process, each particle is projected onto the image plane to generate the sub-image corresponding to the sub-volume read by the MPI process. During this projection phase, the depth information derived from the coordinate data is used to select the closest particles, thus eliminating the visibility sorting required by traditional volume rendering. In addition, since each particle has its own coordinate information, the particles can be processed in any order. Each rendered sub-image is composed of color and depth information, which are then used by the subsequent parallel image composition process explained in the next sub-section (Fig. 2).
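The depth-based particle selection amounts to a per-pixel z-buffer test. The minimal sketch below is our illustration, with the camera projection omitted: particles are assumed to already carry their projected pixel coordinates.

```python
def splat_particles(particles, width, height):
    """particles: iterable of (px, py, depth, rgb) with projected pixel
    coordinates. Only the closest particle per pixel is kept, so no
    visibility sorting is needed and any processing order is valid."""
    zbuf = [[float("inf")] * width for _ in range(height)]
    color = [[(0, 0, 0)] * width for _ in range(height)]
    for px, py, depth, rgb in particles:
        if depth < zbuf[py][px]:      # the closest particle wins
            zbuf[py][px] = depth
            color[py][px] = rgb
    return color, zbuf

# Two particles land on the same pixel; the closer (red) one survives,
# regardless of the order in which they are processed.
img, _ = splat_particles([(0, 0, 2.0, (0, 0, 255)),
                          (0, 0, 1.0, (255, 0, 0))], 2, 2)
print(img[0][0])  # (255, 0, 0)
```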

Fig. 2. Example of rendered sub-images for each of the sub-volumes.




2.4 Sub-image Gathering and Merging

The order-independency explained in the previous sub-section gives us the flexibility to execute the sub-image gathering in an asynchronous manner. However, in this initial implementation, we utilized the parallel image composition functionality provided by the 234Compositor library [7], which requires the presence of the entire set of sub-images to be processed. The sub-images are gathered and merged by taking the depth information of each pixel into consideration: a simple depth comparison is executed on a per-pixel basis, and the closest color information remains in the merged image. This merged image represents one ensemble image, and by repeating this process for each set of generated particles, multiple ensemble images are obtained. In the traditional PBVR approach, the final image is then generated by averaging these images on the master MPI node, as shown in Fig. 3. However, to reduce the memory cost of storing each ensemble image, we propose a progressive approach for the ensemble averaging process. As shown in Eq. (2), letting $P_{i,j} = (R_{i,j}, G_{i,j}, B_{i,j})$ be the pixel value at position (i, j) of a new ensemble image, and $P^k_{i,j} = (R^k_{i,j}, G^k_{i,j}, B^k_{i,j})$ the pixel value obtained after k ensemble averaging steps, the next pixel value $P^{k+1}_{i,j} = (R^{k+1}_{i,j}, G^{k+1}_{i,j}, B^{k+1}_{i,j})$ can be progressively calculated as follows:

$$
\begin{cases}
R^{k+1}_{i,j} = \dfrac{k}{k+1} R^k_{i,j} + \dfrac{1}{k+1} R_{i,j} \\[4pt]
G^{k+1}_{i,j} = \dfrac{k}{k+1} G^k_{i,j} + \dfrac{1}{k+1} G_{i,j} \\[4pt]
B^{k+1}_{i,j} = \dfrac{k}{k+1} B^k_{i,j} + \dfrac{1}{k+1} B_{i,j}
\end{cases} \qquad (2)
$$
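The progressive update can be verified numerically. The sketch below (our illustration, using single-pixel grayscale "images" for brevity) folds ensemble images into a running average one at a time and reproduces the plain mean without storing the individual images:

```python
def fold_in(avg, image, k):
    """Fold a new ensemble image into the running average over k images,
    returning the average over k + 1 images (Eq. (2), per channel)."""
    w_old, w_new = k / (k + 1.0), 1.0 / (k + 1.0)
    return [w_old * a + w_new * p for a, p in zip(avg, image)]

# Three single-pixel ensemble "images".
ensembles = [[30.0], [60.0], [90.0]]
avg = [0.0]
for k, img in enumerate(ensembles):
    avg = fold_in(avg, img, k)
# avg now equals (up to rounding) the plain mean of the three images: 60.0
```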


3 Experimental Results

In order to verify the memory cost and processing time of the proposed parallel PBVR method, compared to the previously proposed parallel PBVR approach [5], we utilized a distributed unstructured volume dataset obtained from a numerical simulation of the Magnus force acting on a rotating sphere placed in a uniform flow [8]. This CFD simulation calculates the rotation effect of a heated sphere in a uniform flow, the vertical axis of the surrounding flow field, and the lift. This unstructured volume dataset, separated into 256 distributed files, is composed of 18,899,767 prism elements with 15,321,546 nodes. The rendering performance was measured on a large-memory HPC system possessing 32 12-core Intel Xeon E7-8857 v2 3.0 GHz CPUs, 8 NVIDIA Quadro K6000 GPUs, and a total of 16 TB of main memory. Figure 4 shows visualization results obtained by using our proposed method with different numbers of ensembles.



Fig. 3. Sub-image gathering and merging, and the final ensemble averaging process.

Fig. 4. PBVR rendering results by using different numbers of ensembles: 1 (left), 10 (middle) and 100 (right).




3.1 Memory Cost

In the experiments, we focused our analysis of the memory cost on the most heavily loaded process. We allocated 32 MPI processes for the evaluations, set the number of ensembles to 100 in order to generate high quality rendering results, and utilized a traditional benchmarking image size of 512 × 512. The theoretical memory cost of the proposed method and of the target PBVR method [5] can be estimated from the number of particles and the resolution of the rendering image. As explained in Sub-sect. 2.2, each particle is represented by a total of 28 bytes: a coordinate position of 4 bytes × 3 components = 12 bytes; a normal vector of 4 bytes × 3 components = 12 bytes; and a scalar value of 4 bytes representing the color and transparency information. Therefore, the memory cost P required to store the particles can be expressed as P = 28 × N, where N is the number of particles. In addition, the data size of a pixel is a total of 7 bytes: a color value of 1 byte × 3 components = 3 bytes and a depth value of 4 bytes. As a result, the memory cost of the image data G can be calculated as G = 7 × W × H, where W and H are the width and height of the image, respectively. Table 1 shows the measured memory consumption of the target parallel PBVR method [5] utilized for the comparison. As shown in this table, although the memory consumption increases proportionally as the number of ensembles increases, we can verify that it does not depend on the number of utilized MPI processes. This is because, in this parallel PBVR approach, the entire set of particles is gathered a priori onto the master MPI node and stored in main memory before starting the projection-based rendering process. Therefore, the required memory cost corresponds to the entire set of generated particles and is thus independent of the number of MPI processes.
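The two estimates P = 28 × N and G = 7 × W × H can be computed directly, as in this small Python sketch of the formulas stated above:

```python
PARTICLE_BYTES = 28   # 12 B coordinates + 12 B normal + 4 B scalar value
PIXEL_BYTES = 7       # 3 B RGB color + 4 B depth

def particle_memory(num_particles):
    """P = 28 * N: bytes needed to store N particles."""
    return PARTICLE_BYTES * num_particles

def image_memory(width, height):
    """G = 7 * W * H: bytes needed for one color-plus-depth sub-image."""
    return PIXEL_BYTES * width * height

# The 512 x 512 benchmark image used in the experiments:
print(image_memory(512, 512))  # 1835008 bytes, i.e. 1.75 MB
```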
On the other hand, Table 2 shows the measured memory consumption of our proposed parallel PBVR method. As shown in this table, we can verify that the memory cost does not depend on the number of ensembles, and decreases as the number of MPI processes increases. This is because, in the proposed method, the particles are independently generated and rendered for each sub-volume on its MPI node without any data transfer. In a well load-balanced scenario for particle generation and rendering, the memory consumption is almost equivalent to the memory required to store the partial particles on the computational nodes. In this case, the number of partial particles is approximately the total number of generated particles divided by the number of MPI processes. In this experiment, the number of generated particles on each computational node was well load-balanced. For the aforementioned reasons, the memory cost of the previous method increases proportionally as the number of ensembles increases, whereas the cost of our proposed method does not. Therefore, with our method, it is possible to improve the image quality by increasing the number of ensembles without increasing the memory cost during the rendering process. Moreover, the memory cost required on each node can be further reduced by increasing the number of processes.



Table 1. Memory cost of the target parallel PBVR method [5] utilized as the benchmark.

# of MPI procs.   1 ensemble   10 ensembles   100 ensembles
 4                26.9 MB      253 MB         2.45 GB
 8                26.9 MB      253 MB         2.45 GB
16                26.9 MB      253 MB         2.45 GB
32                26.9 MB      253 MB         2.45 GB


Table 2. Memory cost of the proposed parallel PBVR method.

# of MPI procs.   1 ensemble   10 ensembles   100 ensembles
 4                13.1 MB      13.1 MB        13.1 MB
 8                 7.3 MB       7.3 MB         7.3 MB
16                 5.4 MB       5.4 MB         5.4 MB
32                 4.5 MB       4.5 MB         4.5 MB


3.2 Processing Time

We measured the processing time of our proposed method using different numbers of ensembles, numbers of MPI processes, and image resolutions. Figure 5 shows a graph of the differential processing times between the target parallel PBVR method and the proposed parallel PBVR method, taking the data transfer and particle projection processes into consideration. The differential time shown in the graph is calculated by subtracting the processing time required by the previous method from that required by the proposed method. From this figure, we can verify that all of the differential times were positive; that is, the proposed method was slower than the previous method. However, the particle projection times were comparable even when using different numbers of ensembles and MPI processes. With a single ensemble, the transfer of the sub-images in the image composition process was 0.16 to 0.33 s slower than the particle data transfer in the previous method. In addition, it is worth noting that the differential times grow as the number of ensembles increases. In the previous parallel PBVR method, the generated particles are transmitted only once, although the size of the transferred particle data increases as the number of ensembles increases. In contrast, our proposed method requires multiple data transmissions for the partial images, although the transmitted image data are small compared to the particle data of the previous method. In fact, the number of transmissions may be affecting the total rendering time, although the memory cost of the proposed method is significantly reduced compared to the previous method.



Fig. 5. Differential time between the proposed parallel PBVR and the existing parallel PBVR method [5] taking into consideration the data transfer and particle projection processes.

4 Conclusions

In this work, we proposed a memory efficient parallel PBVR method using the parallel image composition functionality of the 234Compositor library. This parallel PBVR method focuses on handling large-scale distributed unstructured volume datasets, where it can show its maximum potential by avoiding the time-consuming visibility sorting required by the traditional volume rendering approach. In the experiments, we confirmed that the memory cost required by the proposed method decreases as the number of MPI processes increases; that is, the proposed method has higher memory efficiency than the existing parallel PBVR methods, where the entire set of generated particle data is gathered onto the master MPI node before starting the rendering process. However, since there is a trade-off between the image quality and the number of ensembles, if the ensemble number becomes large, a higher performance penalty is possible due to the increase in the number of necessary rendering and image composition steps. As future work, we are planning to address this issue and, in addition, to integrate our method into simulation codes to enable in-situ PBVR directly on HPC systems.

Acknowledgements. Our research was carried out under the auspices of the Ministry of Education, Culture, Sports, Science and Technology's entrusted work "Climate Change Adaptation Technology Society Implementation Program (SI-CAT)" and a Grant-in-Aid for Scientific Research (JP17K00169). The data was provided by Mr. Masaya Muto of Kyoto University. This work is also partially supported by the "Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures" in Japan (Project ID: jh180060-NAH).



References

1. Levoy, M.: Display of surfaces from volume data. IEEE Comput. Graphics Appl. 8(3), 29–37 (1988)
2. Sakamoto, N., Nonaka, J., Koyamada, K., Tanaka, S.: Particle-based volume rendering. In: Asia-Pacific Symposium on Visualization (APVIS 2007), pp. 129–132 (2007)
3. Sabella, P.: A rendering algorithm for visualizing 3D scalar fields. ACM SIGGRAPH Comput. Graph. 22(4), 51–58 (1988)
4. Kawamura, T., Idomura, Y., Miyamura, H., Takemiya, H., Sakamoto, N., Koyamada, K.: Remote visualization system based on particle based volume rendering. In: SPIE/IS&T Electronic Imaging, International Society for Optics and Photonics, p. 93970S (2015)
5. Hayashi, K., Shimizu, T., Sakamoto, N., Nonaka, J.: Parallel particle based volume rendering using adaptive particle size adjustment technique. In: ACM SIGGRAPH Asia Symposium on Visualization (SA17), pp. 11:1–11:8 (2017)
6. Zao, K., Sakamoto, N., Koyamada, K.: Using interactive particle-based rendering to visualize a large-scale time-varying unstructured volume with mixed cell types. In: IEEE Pacific Visualization 2017 (VisNotes), pp. 185–189 (2017)
7. Nonaka, J., Ono, K., Fujita, M.: 234Compositor: a flexible parallel image compositing framework for massively parallel visualization environments. Futur. Gener. Comput. Syst. (2017)
8. Muto, M., Watanabe, H., Kurose, R., Tsubokura, M.: The effect of surface heating on the drag of a sphere at the critical Reynolds number. In: 4th International Conference on Jets, Wakes and Separated Flows (2013)
9. Sakamoto, N., Koyamada, K.: Stochastic approach for integrated rendering of volumes and semi-transparent surfaces. In: SC Companion: High Performance Computing, Networking Storage and Analysis (UltraVis2012), pp. 176–185 (2012)
10. Sakamoto, N., Koyamada, K.: KVS: a simple and effective framework for scientific visualization. J. Adv. Simul. Sci. Eng. 2(1), 76–95 (2015)
11. Johnson, C., Hansen, C.: Visualization Handbook. Academic Press, Inc. (2014)

A Transfer Entropy Based Visual Analytics System for Identifying Causality of Critical Hardware Failures
Case Study: CPU Failures in the K Computer

Kazuki Koiso1, Naohisa Sakamoto1, Jorji Nonaka2, and Fumiyoshi Shoji2

1 Graduate School of System Informatics, Kobe University, Kobe, Japan
[email protected]
2 Center for Computational Science, RIKEN, Kobe, Japan

Abstract. Large-scale scientific computing facilities usually operate expensive HPC (High Performance Computing) systems, whose computational and storage resources are shared among the authorized users. On such shared resource systems, a continuous and stable operation is fundamental for providing the necessary hardware resources for the different user needs, including large-scale numerical simulations, which are the main targets of such large-scale facilities. For instance, the K computer, installed at the R-CCS (RIKEN Center for Computational Science) in Kobe, Japan, enables users to continuously run large jobs with tens of thousands of nodes (a maximum of 36,864 computational nodes) for up to 24 h, and a huge job using the entire K computer system (82,944 computational nodes) for up to 8 h. Critical hardware failures can directly impact the affected job, and may also indirectly impact the scheduled subsequent jobs. To monitor the health condition of the K computer and its supporting facility, a large number of sensors have been providing a vast amount of measured data. Since it is almost impossible to analyze the entire data in real time, this information has been stored as log data files for post-hoc analysis. In this work, we propose a visual analytics system that uses these big log data files to identify the possible causes of critical hardware failures. We focused on the transfer entropy technique for quantifying the "causality" between a possible cause and a critical hardware failure. As a case study, we focused on critical CPU failures, which required subsequent substitution, and utilized the log files corresponding to the measured temperatures of the cooling system, such as air and water temperatures. We evaluated the usability of our proposed system by conducting practical evaluations with a group of experts who work directly on the K computer system operation. The positive and negative feedback obtained from this evaluation will be considered for future enhancements.

Keywords: Data exploration · Failure analysis · High performance computing system · Log data · Causality · Transfer entropy

© Springer Nature Singapore Pte Ltd. 2018 L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 563–574, 2018.

 Transfer entropy


K. Koiso et al.

1 Introduction and Motivations

Large-scale scientific computing facilities, such as the R-CCS (RIKEN Center for Computational Science), in Kobe, Japan, operate expensive HPC (High Performance Computing) systems to provide the necessary computational and storage resources to the authorized users. On such large-scale shared resource systems, a continuous and stable operation is fundamental for enabling the smooth running of large-scale numerical simulations, which are the main targets of such large-scale HPC facilities. For instance, the K computer, installed at the R-CCS, enables the users to continuously run large jobs with tens of thousands of computational nodes (a maximum of 36,864 nodes) for up to 24 h, and huge jobs using the entire K computer system (82,944 nodes) for up to 8 h. In addition, a large number of small and medium size jobs usually co-exist at the same time, thus any critical hardware failure that impacts the normal operation can have great consequences for the users. Considering the large number of hardware components in such a complex system, it is almost impossible to avoid hardware failures. However, it is highly useful to have a tool for analyzing the possible causes of these failures to support the decision making process and preventive strategies. Several studies on monitoring and analyzing the health condition of the hardware system for continuous and stable operation, as well as for improving the reliability of the system, have been reported so far. Shoji et al. [1] investigated the relationships between the numbers of critical hardware failures on the K computer and specific events. For instance, the monthly failure rate of CPUs has coincided with heavy computational usage such as the full node LINPACK measurements and Gordon Bell challenges.
In addition, the monthly failure rate of DIMMs (Dual In-line Memory Modules) decreased right after a modification of the air conditioning operation, which reduced the outlet air cooling temperature from 21 to 18 °C. It is worth noting that, to monitor the health condition of the K computer and its supporting facility, a large number of sensors have been providing a vast amount of measured data. Since it is almost impossible to analyze the entire data in real time, this information has been stored as log data files for post-hoc analysis. In this work, we propose a visual analytics system which uses these big log data files to identify the possible causes of critical hardware failures. On other HPC systems, Schulz et al. [2] proposed a visual analytics system which can analyze and identify correlation patterns in performance datasets, such as the CPU utilization rate and the waiting time obtained from a large-scale HPC system. They used multi-dimensional data visualization techniques, such as the scatter plot matrix and parallel coordinate plots. El-Sayed et al. [3] conducted a correlation analysis using field data, such as the HPC power quality, temperature, and cosmic radiation, provided by the LANL (Los Alamos National Laboratory) facility to investigate the influence of various factors affecting the reliability of the HPC system. From the obtained investigation results, they showed that the failure rate of CPUs has a positive correlation with the neutron flux caused by cosmic rays. In addition, Schroeder et al. [4] analyzed the failure data of more than twenty types of systems collected by the LANL facility to grasp the failure characteristics of HPC systems in

A Transfer Entropy Based Visual Analytics System


order to develop highly reliable HPC systems. From their analysis results, the time between failures can be modeled by a Weibull distribution with a decreasing hazard rate, and they showed that the failure occurrence rate depended on the system scale, and not on the hardware types. Gupta et al. [5] visualized the failure data collected from the Titan supercomputer with a heat map to analyze the spatial characteristics of the HPC system failure distribution. From these results, they confirmed that the distribution of the system failures has spatial localities, and that the probability of a failure occurring near the place where the last failure occurred was higher than at other places. However, although it is possible to analyze the correlations between the datasets obtained from the systems, the aforementioned studies have not quantitatively addressed any causal relationships between a possible cause and its effect. In order to correctly analyze the failure causes on such HPC systems, it is necessary to properly evaluate not only the correlation, but also the causality between the failures and the log event information obtained from these systems. Moreover, for the operational side, it is important to have a visual causality system, which can correctly interpret and validate the evaluation results from the causality testing. Efficient visualization techniques allied with intuitive and interactive operations for exploring the log event information will greatly assist the interactive visual exploration. Thus, in this paper, we propose a visual analytics system which can interactively explore the failure causes on the HPC system, based on information transfer, by using the HPC environmental log data and the system failure data. The proposed system will be explained in detail in the next section.
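The Weibull observation cited above can be checked numerically: for a shape parameter k < 1, the hazard rate h(t) = (k/λ)(t/λ)^(k−1) decreases with time, which is the "infant mortality" behavior reported for the time between HPC failures. A minimal sketch, with parameter values of our own choosing for illustration:

```python
# Hazard rate of a Weibull(k, lam) distribution: h(t) = (k/lam) * (t/lam)**(k-1).
# For shape k < 1 the hazard decreases with t, matching the cited analysis.

def weibull_hazard(t, k, lam):
    """Hazard rate of a Weibull distribution with shape k and scale lam at t > 0."""
    return (k / lam) * (t / lam) ** (k - 1)

# Illustrative parameters (not from the paper): shape 0.7, scale 100 hours.
rates = [weibull_hazard(t, k=0.7, lam=100.0) for t in (1.0, 10.0, 100.0)]
assert rates[0] > rates[1] > rates[2]  # hazard decreases over time
```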

2 Proposed Visual Analytics System

Our proposed visual analytics system is composed of the following four subsystems, as shown in Fig. 1: (a) Graph plot view subsystem, which visualizes the environmental log dataset and the failure data represented as time-varying datasets; (b) Spatiotemporal plot view, which visualizes the spatiotemporal distribution of the failures; (c) Heat-map plot view subsystem, which visualizes the frequency distribution of the failures; and (d) Causality graph plot view subsystem, which visualizes the causalities between the failures and the HPC environmental log datasets.

2.1 Graph Plot View Subsystem

In order to analyze the causal relationship between the HPC environmental log data information and the number of failures, we selected the CPU temperature from the HPC environmental log data, and the number of CPU failures. For the visual analysis, it is important to correctly understand the characteristics of both data by providing macro and micro views for enabling the overview and for comparing the transitions. In order to overview the number of failures and the transition of the HPC environmental log data, we plotted them with line graphs. As shown in Fig. 2(a), the number of failures was plotted as a line graph with the number of failures on the vertical axis and the period on the horizontal axis. Likewise, as shown in Fig. 2(b), the HPC



Fig. 1. Overview of our proposed system which consists of four main subsystems: (a) Graph plot view; (b) Spatiotemporal plot view; (c) Heat-map plot view; (d) Causal graph plot view.

environmental log data information is plotted as a line graph with the value of the environmental log data on the vertical axis and the period on the horizontal axis. Moreover, in order to compare these two graphs with the same granularity and visual clarity, the number of failures and the HPC environmental log data were both plotted on a monthly basis. Since the temporal distribution of the CPU failures was sparse, even considering a month as the time period, we utilized the total accumulated number of failures for the plotting. Therefore, the number of failures is expressed not at the rack level but at the entire system level. It is worth mentioning that the K computer system has 4 CPUs per system board, and 24 system boards per rack, and a total of 864 racks distributed in a 24 × 36 manner. However, there are 9 lines of storage racks, thus the total rack distribution becomes 24 × 45. In order to comply with the use of system level information (82,944 CPUs) for the monthly number of failures, the CPU temperature from the HPC environmental log data was expressed as the average of the already averaged daily temperatures at the rack level. By performing a drag operation as shown in Fig. 2(left), it is possible to interactively select an ROI (Region of Interest). The user selected area is then reflected on the paired data as shown in Fig. 2(right). Since the selected ROI is highlighted on both graphs, it becomes possible to intuitively visualize and compare the number of failures and the averaged CPU temperature in the selected period.
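The monthly aggregation described above (system-level failure counts, and averages of the daily rack-average temperatures) can be sketched as follows; the record layout is hypothetical, chosen only for illustration:

```python
from collections import defaultdict

# Hypothetical records: daily rack-average CPU temperatures and individual
# failure events, each tagged with an ISO date "YYYY-MM-DD".
daily_temps = [  # (date, rack_id, rack-average CPU temperature in deg C)
    ("2015-02-01", "A1", 30.5), ("2015-02-01", "B2", 31.0),
    ("2015-03-01", "A1", 29.0),
]
failures = [("2015-02-10", "S23"), ("2015-02-20", "A1"), ("2015-03-05", "B2")]

def month(date):           # "2015-02-10" -> "2015-02"
    return date[:7]

# Monthly failure count at the whole-system level (not per rack).
fail_per_month = defaultdict(int)
for date, _rack in failures:
    fail_per_month[month(date)] += 1

# Monthly CPU temperature: average of the daily rack-average values.
sums, counts = defaultdict(float), defaultdict(int)
for date, _rack, temp in daily_temps:
    sums[month(date)] += temp
    counts[month(date)] += 1
temp_per_month = {m: sums[m] / counts[m] for m in sums}

assert fail_per_month["2015-02"] == 2
assert temp_per_month["2015-02"] == 30.75
```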

2.2 Spatiotemporal Plot View

In order to correctly analyze the critical hardware failure causes, it is important to know when and where the failure occurred. Therefore, in order to present the spatiotemporal distribution of the hardware failures, the plotting of the location where the fault



Fig. 2. Intuitive data range selection by using the brushing procedure. The selected ROI on the graph plot view of the number of failures (left) is reflected on the CPU temperature (right).

occurred, in addition to the occurrence date and time, is highly valuable. We utilized a three-dimensional plotting technique to facilitate the interactive visual exploration of the spatiotemporal distribution of the hardware failures. For our targeted CPU failures, Fig. 3 shows the 3D spatiotemporal distribution of the CPU failures. The X axis corresponds to the rack ID letters from A to X, the Y axis to the rack ID numbers from 1 to 45, and the XY plane represents the spatial distribution of the racks. The Z axis corresponds to the period of time. For instance, in this figure, the failure plotted with a red dot occurred on the rack S23 on April 1, 2016. Moreover, since this figure is a 3D plot, the plotted points may overlap each other, and a point placed behind another cannot be seen. In this system, it is possible to freely change the viewpoint by an intuitive drag operation on the graph, which can be expected to effectively reveal temporal and spatial biases in the racks where failures occur.
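The mapping from a failure record to a point in this 3D plot can be sketched as follows; the plot origin date is our assumption, while the rack-ID format (letter A–X plus number 1–45) follows the paper:

```python
from datetime import date

# Map a failure record to 3D plot coordinates: rack letter (A-X) -> x,
# rack number (1-45) -> y, occurrence date -> z (days since the plot origin).
ORIGIN = date(2014, 4, 1)   # our assumed start of the plotted period

def failure_to_xyz(rack_id, when):
    x = ord(rack_id[0]) - ord("A")        # "A" -> 0, ..., "X" -> 23
    y = int(rack_id[1:]) - 1              # "1" -> 0, ..., "45" -> 44
    z = (when - ORIGIN).days              # days along the time axis
    return x, y, z

# The red-dot example from the text: rack S23 failing on April 1, 2016.
assert failure_to_xyz("S23", date(2016, 4, 1)) == (18, 22, 731)
```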

2.3 Heat Map Plot View

In addition to the spatiotemporal distribution of the failures, it is of great value to identify where the failures are actually concentrated. When analyzing the spatial distribution of the failures by merely looking at the spatial plane of the spatiotemporal plot described in Subsect. 2.2, it becomes impossible to extract the accurate spatial distribution. This is because when a failure overlaps another one at the same place, their plots completely overlap and the point behind is hidden. In order to solve this visual cluttering problem, and to enable an accurate visual analysis of the spatial distribution of the failures, we utilized the heat map plotting technique.



Fig. 3. Spatiotemporal distribution of the CPU failures over a specific period of time. (Color figure online)

Figure 4 shows the heat map plot view corresponding to the number of failures at the rack level. In this figure, the horizontal axis corresponds to the rack ID letters from A to X, the vertical axis to the rack ID numbers from 1 to 45, and each cell corresponds to one rack. It is important to mention that the axis numbers 3, 8, 13, 18, 23, 28, 33, 38, and 43 correspond to the storage racks, which contain no computational nodes. The color of a cell represents the number of failures of the corresponding rack: the total number of failures over all the handled periods is counted for each rack, the maximum value is colored red, and the minimum value is colored blue. As the color of a cell approaches red, it indicates that the number of failures increases. With this information in mind, it becomes possible to intuitively identify the failure concentration at the rack level, and to pay attention to the occurrence of a spatial bias in the failure distribution.
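The per-rack counting and blue-to-red coloring described above can be sketched as follows; the failure records and the linear RGB interpolation are our illustrative choices, not the paper's exact color map:

```python
from collections import Counter

# Count failures per rack and map the count linearly onto a blue-to-red
# color scale, as in the heat-map view. Failure records are hypothetical.
failures = ["S23", "S23", "A1", "B7", "S23", "A1"]
per_rack = Counter(failures)

lo, hi = min(per_rack.values()), max(per_rack.values())

def cell_color(count):
    """Linear blue (minimum count) to red (maximum count) interpolation in RGB."""
    t = (count - lo) / (hi - lo) if hi > lo else 0.0
    return (int(255 * t), 0, int(255 * (1 - t)))  # (R, G, B)

assert per_rack["S23"] == 3
assert cell_color(per_rack["S23"]) == (255, 0, 0)   # most failures: red
assert cell_color(per_rack["B7"]) == (0, 0, 255)    # fewest failures: blue
```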

2.4 Causality Graph Plot View

In the proposed visual analytics system, we used the Transfer Entropy (TE), which was introduced by Schreiber [6] in 2000, in order to calculate the causality between the HPC environmental log dataset and the failure data. The TE can calculate the causality by quantifying the flow of information between two stochastic variables as an information entropy. Let X be an environmental log data stream, and Y the failure data; then the causality T_{X→Y}, which describes the influence of X on Y, can be calculated by using the TE as follows:


T_{X→Y} = Σ P(Y_{t+1}, Y_t^{(k)}, X_t^{(l)}) log [ P(Y_{t+1} | Y_t^{(k)}, X_t^{(l)}) / P(Y_{t+1} | Y_t^{(k)}) ]    (1)




Fig. 4. Frequency distribution of the failures using heat-map plot view. (Color figure online)

where Y_t^{(k)} denotes the embedding point with a time stride t in the k-dimensional state space, given by (Y(t), Y(t−1), ..., Y(t−k+1)), and Y_{t+1} represents the value of Y at the time point t+1. The causality T_{Y→X}, which represents the influence of Y on X, can be calculated as follows:


T_{Y→X} = Σ P(X_{t+1}, X_t^{(k)}, Y_t^{(l)}) log [ P(X_{t+1} | X_t^{(k)}, Y_t^{(l)}) / P(X_{t+1} | X_t^{(k)}) ]    (2)
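These quantities can be estimated from two discretized time series by counting joint occurrences. The following sketch is ours (the paper's system is implemented in C++); it assumes embedding dimensions k = l = 1 and plain relative-frequency probability estimates, which the paper does not specify:

```python
from collections import Counter
from math import log

def transfer_entropy(src, dst):
    """Estimate T_{src->dst} with embedding k = l = 1 by counting joint
    occurrences in two equally long, discretized time series."""
    triples = Counter(zip(dst[1:], dst[:-1], src[:-1]))   # (Y_{t+1}, Y_t, X_t)
    pairs_yy = Counter(zip(dst[1:], dst[:-1]))            # (Y_{t+1}, Y_t)
    pairs_yx = Counter(zip(dst[:-1], src[:-1]))           # (Y_t, X_t)
    singles = Counter(dst[:-1])                           # Y_t
    n = len(dst) - 1
    te = 0.0
    for (y1, y, x), c in triples.items():
        p = c / n                                         # P(Y_{t+1}, Y_t, X_t)
        p_cond_joint = c / pairs_yx[(y, x)]               # P(Y_{t+1} | Y_t, X_t)
        p_cond = pairs_yy[(y1, y)] / singles[y]           # P(Y_{t+1} | Y_t)
        te += p * log(p_cond_joint / p_cond, 2)
    return te

# X drives Y with a one-step lag, so T_{X->Y} should exceed T_{Y->X}.
x = [0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
y = [0] + x[:-1]   # y is a delayed copy of x
assert transfer_entropy(x, y) > transfer_entropy(y, x)
```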


For the environmental log data and the failure data of interest, the mutual causalities between these variables can be calculated by using the aforementioned Eqs. 1 and 2. In our proposed system, the causality calculation results for each environmental log data (measured temperatures of Water, CPU, Air In, and Air Out) are represented as a graph structure in the causality graph plot view (Fig. 5). Since a plurality of environmental log data are targeted in the failure analysis, the environmental log data are represented as X_i (i = 1, 2, ..., n), where n represents the number of environmental log data. Y is represented by a red circle and X_i by green circles in our current system, in order to match the colors used in the failure number plot and the environmental log data plot described in Subsect. 2.1. In order to visually clarify the causal relationships, the causal relation from Y to X_i is represented by a red arc, and the causal relation from X_i to Y is represented by a green arc. The thickness of an arc is proportional to the magnitude of the causal index, indicating a stronger causal relationship as the arc becomes thicker. The thickness of the arcs is calculated and drawn by using the following procedure.



Step 1. Let X be X_1, X_2, ..., X_n and determine the maximum and minimum values among the 2n calculated transfer entropy values.
Step 2. The transfer entropy value (value) to be represented by an arc is normalized by Eq. 3, with the maximum value obtained in Step 1 as max and the minimum value as min, giving the result N.
Step 3. Calculate the thickness of the arc based on the N obtained in Step 2 and draw it.

N = (value − min) / (max − min)    (3)
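Steps 1–3 amount to a min–max normalization of the transfer entropy values followed by scaling to a drawing width. A small sketch with hypothetical TE values and an assumed pixel range of our choosing:

```python
# Normalize transfer entropy values to arc widths following Steps 1-3:
# N = (value - min) / (max - min), then scale N to a pixel width.
te_values = [0.10, 0.42, 0.05, 0.30, 0.55, 0.20]   # 2n values for n = 3 logs
lo, hi = min(te_values), max(te_values)

def arc_width(value, min_px=1.0, max_px=10.0):
    n = (value - lo) / (hi - lo)          # Eq. 3
    return min_px + n * (max_px - min_px)

assert arc_width(hi) == 10.0   # strongest causality: thickest arc
assert arc_width(lo) == 1.0    # weakest causality: thinnest arc
```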


Since the purpose of this case study is to analyze the causal relations between the number of failures and the HPC environmental data X_i, the X_i are placed around Y.

Fig. 5. An example of the causality graph plot view focusing on the CPU temperature. T_{CPU→Fails} and T_{Fails→CPU} represent the influences of the CPU temperature on the failures, and vice-versa. (Color figure online)


2.5 System Implementation

In this section, we explain how the subsystems described in the previous subsections work in conjunction. Figure 6 shows a snapshot of our proposed system with the visualization results of the dataset from April 14, 2014 to April 24, 2017. Figure 7 shows the visualization result when the intake air temperature is chosen in the causality graph plot view (Subsect. 2.4) for the period from September 1, 2014 to February 28, 2015 selected in the failure number plot (Subsect. 2.1). Figure 8 shows the data flow of this system. When this system is started, the CPU failure data and the environmental log data at one-month granularity are read, and each of the four visualization views described in the previous subsections is displayed on a



Fig. 6. Overview of the GUI of the proposed visual analytics system

Fig. 7. Visualization result of a causal analysis by selecting a certain period of time.

single screen. When the user selects a time period of interest in the failure number plot, the graph of that period is highlighted. Then, the selected period is handed over to the other four screens, and the environmental log data at one-day granularity within the selected period are read. In the failure rack spatiotemporal plot, the faults within the selected period are plotted with emphasis, and in the upper left of the screen, the total number of faults, the total number of faults within the selected period, the entire handled period, and the selected period are displayed. These can be switched on/off by keyboard operation on the fault rack spatiotemporal plot. In the failure rack heat map, the number of failures occurring within the selected period is counted for each rack and mapped. In the causality graph plot view, the read daily environmental log data are used to calculate the causal index between the number of failures and the environmental log data during the selected period. When the user selects an item of the environmental log data, the arc connecting the circle representing the selected item and



the circle representing the failure frequency is highlighted. In the upper left of the screen, the transfer entropy value from the number of failures to the selected environmental log data, and the transfer entropy value from the selected environmental log data to the number of failures, are displayed. These can be switched between display and non-display in the same way as in the fault rack spatiotemporal plot. Also, the type of the selected environmental log data is passed to the environmental log data plot, where the time transition of the selected environmental log data is plotted on a monthly basis, and the graph of the period passed from the failure number plot is highlighted. For the implementation of this system, we used the C++ language, Qt 5.5 for the user interface, and KVS (Kyoto Visualization System) [7] as the visualization library.
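The view linkage described above (a period selected in the failure number plot is propagated to the other views, which then reload and redraw) follows a simple listener pattern. The sketch below is our Python illustration of that pattern, not the paper's C++/Qt implementation:

```python
# Minimal sketch of linked views: the failure-number plot is the selection
# source; the other views react to the period it publishes.

class View:
    def __init__(self, name):
        self.name = name
        self.period = None
    def on_period_selected(self, start, end):
        self.period = (start, end)   # here a real view would reload daily data

class GraphPlotView:
    """The failure-number plot acting as the selection source."""
    def __init__(self):
        self.listeners = []
    def select_period(self, start, end):
        for v in self.listeners:
            v.on_period_selected(start, end)

source = GraphPlotView()
views = [View(n) for n in ("log", "spatiotemporal", "heatmap", "causality")]
source.listeners.extend(views)
source.select_period("2015-02-01", "2015-04-30")
assert all(v.period == ("2015-02-01", "2015-04-30") for v in views)
```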

Fig. 8. User operations and data flows of our proposed system.

3 Results and Discussions

In the experiments, we used HPC environmental log data measured between April 2014 and April 2017. Among the available log data, we selected four HPC environmental log data: CPU temperature, water temperature, intake air temperature, and exhaust air temperature. We also utilized a list of the critical CPU failures, which required system board substitution, that occurred in the period of April 2014 to April 2017. The transfer entropy was calculated at one-day granularity. We selected the period of February 1, 2015 to April 30, 2015, when the number of failures became large compared to the other periods, as shown



in Fig. 9. For comparison, we show in Fig. 10 a period of time where the occurrence of the CPU failures was relatively small, namely from March 1, 2016 to May 31, 2016. Considering Fig. 9, since the arc from the CPU to the Fails is thicker than the other arcs, it is suggested that the CPU temperature mainly contributed to the CPU failures in that period, where a relatively large number of failures occurred. It is worth noting that in this period the Gordon Bell Challenge was held and the CPUs were heavily used. Therefore, we can expect a high variation of the CPU temperature in this period compared to the normal operation periods of the K computer. In the case of Fig. 10, since the arc from the Water to the Fails is thicker than the other arcs, it is suggested that the water temperature mainly contributed to the CPU failures in this period, where a relatively small number of failures occurred. It is worth noting that the water is used for cooling the CPUs, as well as the air.

Fig. 9. Causal exploration result in the period of February to April 2015.

Fig. 10. Causal exploration result in the period of March to May 2016.

4 Conclusions

In order to analyze the hardware failure causes on HPC systems, we developed a visual analytics system which uses measured log data from an HPC environment and the failure occurrence data. The users can select a period of interest from the time transition of the number of failures, and the spatiotemporal distribution of the failures within the selected period is plotted. In addition, the causal index between the number of failures and the HPC environmental log data is calculated by using the transfer entropy method, and the results are visualized in the form of circles and arcs. Furthermore, from



the visualization results of the causal relationships, the users can select the desired HPC environmental information to visualize its time transition. The proposed visual analytics system made it possible to comprehensively and interactively analyze the spatiotemporal distribution of hardware failures, and the causal relationships between the measured HPC environmental log data and the number of failures. Although it is already possible to visualize these causal relationships, as future work, we will carry out further analysis to verify and validate the obtained causality results.

Acknowledgements. Some of the results were obtained by using the K computer operational environment at the RIKEN CCS (Center for Computational Science) in Kobe, Japan.

References

1. Shoji, F., et al.: Long term failure analysis of 10 petascale supercomputer. In: HPC in Asia Session at ISC (2015)
2. Schulz, C., Rodrigues, N., Damarla, K., Henicke, A., Weiskopf, D.: Visual exploration of mainframe workloads. In: SIGGRAPH Asia 2017 Symposium on Visualization, pp. 4:1–4:7 (2017)
3. El-Sayed, N., Schroeder, B.: Reading between the lines of failure logs: understanding how HPC systems fail. In: 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pp. 1–12 (2013)
4. Schroeder, B., Gibson, G.: A large-scale study of failures in high-performance computing systems. IEEE Trans. Dependable Secur. Comput. 7(4), 337–350 (2010)
5. Gupta, S., Tiwari, D., Jantzi, C., Rogers, J., Maxwell, D.: Understanding and exploiting spatial properties of system failures on extreme-scale HPC systems. In: 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pp. 37–44 (2015)
6. Schreiber, T.: Measuring information transfer. Phys. Rev. Lett. 85(2), 461 (2000)
7. Sakamoto, N., Koyamada, K.: KVS: a simple and effective framework for scientific visualization. J. Adv. Simul. Sci. Eng. 2(1), 76–95 (2015)

Modeling the Spread of Epidemic Diseases on ElasticStack-Based Simulation Output Analysis Environment

Kangsun Lee and Sungwoo Hwangbo

Myongji University, 34 Geobukgol-ro, Seodaemun-gu, Seoul 120-728, Korea
[email protected]

Abstract. In this paper, we propose a simulation output analysis environment using Elastic Stack technology in order to reduce the complexity of the simulation analysis process. The proposed simulation output analysis environment automatically transfers simulation outputs to a centralized analysis server from a set of simulation execution resources physically separated over a network, manages the collected simulation outputs in a fashion that allows further analysis tasks to be easily performed, and provides a connection to the analysis and visualization services of Kibana in Elastic Stack. The proposed analysis environment provides scalability, in that a set of computation resources can be added on demand. We demonstrate how the proposed simulation output analysis environment can perform simulation output analysis effectively with an example of spreading epidemic diseases, such as influenza.

Keywords: Simulation output analysis · Elastic Stack · Epidemic simulation

1 Introduction

As we have experienced serious epidemic diseases, such as the flu, measles, Ebola virus, foot-and-mouth disease, and avian influenza (AI), understanding the spreading mechanism of epidemics beyond specific regions and making preventive policies among countries become more and more important [1, 2]. Simulation is a technology that predicts the future by reproducing a given system or phenomenon under various situations, analyzes the obtained results, and supports decision making. Simulation technology is believed to be a viable solution for understanding and controlling problems that cannot be experimented with in the real world, such as epidemic spreading phenomena [3–5]. In order to simulate the epidemic spreading phenomenon realistically, accurate data on the population model and population migration are required, together with accurate simulation logic for the epidemic proliferation process. In recent years, a large amount of data on various fields of society and country has been released to the public, and utilized to replace the existing theoretical models in order to improve the accuracy of

This work was supported by the 2018 Research Fund of Myongji University.

© Springer Nature Singapore Pte Ltd. 2018
L. Li et al. (Eds.): AsiaSim 2018, CCIS 946, pp. 575–583, 2018.



simulation results. In addition, data provision services are being enriched over the web in the form of open APIs, expanding our horizons for adding reality to simulation-based decision making and future prediction, such as population migration and the spread of infectious epidemic diseases. However, in order to incorporate the public data provision services in simulations, we need an efficient environment that is capable of connecting distributed data provision, distributed simulation execution, and distributed analysis work in a seamless manner. In this paper, we propose a distributed simulation and analysis environment where epidemic simulation and analysis can be done over physically distributed computing resources. The proposed environment comprises three computing nodes: an input management node, a simulation node, and an analysis node. The input management node is responsible for acquiring public datasets and interacting with public data provision services. The simulation node is responsible for performing a series of simulations to reveal the spreading phenomenon of an epidemic disease. The analysis node collects simulation results from the simulation nodes dispersed over the network, analyzes them collectively, and visualizes the simulation results over the web. We employ Elastic Stack technology to implement the simulation and analysis suite for epidemic simulations. We demonstrate how the proposed simulation and analysis environment can effectively simulate the spreading phenomenon of the influenza disease. This paper is organized as follows. In Sect. 2, we present related studies on epidemic disease simulations and enabling technologies for distributed simulations and analysis. Sections 3 and 4 illustrate our simulations and summarize the experimental results. We conclude in Sect. 5 with a summary and future works to achieve.

2 Background

2.1 Related Research

Simulations have been utilized to understand the dynamics of infectious diseases and evaluate mitigation strategies. FluTE is an individual-based stochastic influenza epidemic simulation model capable of simulating the spread of influenza across major metropolitan areas or the continental United States [1]. FluTE creates synthetic populations based on typical American communities using the US-wide family size distribution from the 2000 Census. The synthetic population model is organized hierarchically, starting from the household cluster, neighborhoods, and the community, and is utilized to model the way in which the influenza spreads during day and night. EpidemicSim is an epidemic simulation system which includes realistic mobility information to simulate disease transmission [4]. EpidemicSim utilizes the ubiquity of mobile devices and online social networks, which has created an opportunity for real-life simulations of disease transmissions. The resulting information is employed to create disease social networking maps used to determine the importance of each individual in the network and the connections between the ground-zero source of the disease and the total infected population at the end of the simulation. These two epidemic simulations are good examples that illustrate how we can benefit from utilizing public data sets and public data provision services.



As simulations become popular for analyzing large-scale systems with massive inputs and outputs, more engineers are experiencing difficulties in interacting with and understanding the simulation datasets. The need for collecting, storing, and managing large simulation datasets has become imminent for various simulation applications, such as epidemic simulations with a large population across a wide area. ARLS (After action Reviewer for Large-scale Simulation data) is a Hadoop-based output analysis tool for large-scale simulation datasets [6]. ARLS clusters distributed storages using Hadoop and analyzes the large-scale datasets using MapReduce in order to improve data processing time significantly compared to traditional output analysis tools. In order to accommodate the aforementioned requirements, there is a need for a technique which can effectively manage input services, simulation services, and analysis services in a distributed fashion and configure available resources on demand, according to the computing complexity in each phase of simulation and analysis. Elastic Stack [7] is our solution to resolve the aforementioned requirements. The following section outlines the important features of Elastic Stack.

2.2 Enabling Technology: Elastic Stack

Elastic Stack is a group of open source products, i.e., Elasticsearch, Logstash, Beats, Kibana, and ECE, designed to help users take data from any type of source and in any format, and search, analyze, and visualize that data in real time [7]. The following are brief explanations of each of the products in the Elastic Stack (Fig. 1).

Fig. 1. Elastic Stack products

• Logstash is a dynamic data collection pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to a stash designated by the user.
• Elasticsearch is a distributed, JSON-based search and analytics engine designed for horizontal scalability, maximum reliability, and easy management.


K. Lee and S. Hwangbo

• Beats is a platform for lightweight shippers that send data from edge machines to Logstash and Elasticsearch.
• Kibana gives shape to user data and is the extensible user interface for configuring and managing all aspects of the Elastic Stack.

Elastic Stack products have been successfully applied to various use cases, such as application and enterprise search, business analytics, and operational log analytics, in various industries, including manufacturing, education, and health care [8]. For example, USAA remediates security incidents by analyzing 3–4 billion security events a day, running Python scripts, building custom applications to mine the data, and utilizing Watcher, the Elasticsearch alerting and notification extension, to perform threat intelligence and security analytics [9]. Our simulation and analysis environment requires three components, each of which takes responsibility (1) for acquiring public datasets, (2) for collecting simulation outputs, and (3) for analyzing and visualizing the simulation results over the web. Elastic Stack products are well suited for our purpose, since Beats, Logstash, Elasticsearch, and Kibana can provide the necessary functionality for each of the distinct phases of simulation and analysis. In the following section, we present the architecture of our simulation and analysis environment and explain how Elastic Stack technology has been employed.

3 An Elastic Stack-Based Simulation and Analysis Environment

Figure 2 shows our environment for conducting epidemic simulations and visualizing the results.

Fig. 2. Elastic Stack-based simulation and analysis environment

Modeling the Spread of Epidemic Diseases


Three components have been developed with Elastic Stack technology.
• Simulation Output Transmitter (SOT) sits on a user site and transmits simulation outputs to the remote Simulation Controller. SOT has been implemented using Beats and Logstash, and acts as a dynamic data collection pipeline that users can plug into their simulators to transmit their data to the remote Simulation Controller.
• Simulation Controller (SC) stores simulation outputs gathered from multiple sources with the help of SOT. SC has been implemented using Elasticsearch, a distributed, RESTful search and analytics engine capable of performing and combining many types of searches, including sorting by relevance, text tokenization and stemming, and connection to conventional databases.
• Simulation Output Visualizer (SOV) interacts with SC and visualizes simulation outputs. SOV has been implemented using Kibana, which searches, views, and interacts with data stored in Elasticsearch indices. With the help of Kibana, SOV can perform advanced data analysis and visualize the results in a variety of charts, tables, and maps.

Since the three components can be executed independently on different machines dispersed over the network, it is possible to add computing resources on demand as more are needed to perform simulation and analysis.
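To illustrate the SOT-to-SC hand-off concretely, the snippet below is a minimal, stdlib-only sketch of how simulation output records can be packed into the newline-delimited JSON body that Elasticsearch's _bulk API expects. It is not our Beats/Logstash implementation, and the index name sim-outputs and the record fields are illustrative assumptions.

```python
import json

def to_bulk_body(records, index="sim-outputs"):
    """Pack simulation output records into an Elasticsearch _bulk payload:
    one action line followed by one document line, newline-delimited."""
    lines = []
    for rec in records:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(rec))                           # document line
    return "\n".join(lines) + "\n"  # _bulk bodies must end with a newline

# Example: per-borough infection counts from one simulated day
records = [
    {"date": "2017-06-01", "borough": "GangNam", "infected": 42},
    {"date": "2017-06-01", "borough": "SeoCho", "infected": 17},
]
body = to_bulk_body(records)
```

The resulting `body` would then be POSTed to the analysis server's `_bulk` endpoint; in our environment, Logstash performs this shipping step automatically.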

4 An Epidemic Spreading Simulation: Influenza Example

FluTE [1] is our reference model for implementing an epidemic influenza simulation. Section 4.1 presents the major components of the influenza simulation model: the population model, the space model, and the disease model. Section 4.2 illustrates the spreading process of our influenza simulation.

4.1 Influenza Model

Our simulation model comprises a population model, a space model, and a disease model, as illustrated in Fig. 3.
• Population model: Each individual in the population model is modeled as {household id, age, residence place, commuter district, health status}, using the Seoul-area household size distribution drawn from the Korea-wide family size distribution in the 2005 Census [10]. Household id is the unique identifier of the household to which the individual belongs. Age is divided into four groups: preschool, schooling, active working, and retirement. The residence place is defined as the house inhabited by the individual and is assigned to one of the 25 boroughs in Seoul. The commuter district is created by randomly assigning a borough.
• Space model: The space model allocates commuting places (424 districts in Seoul) to individuals according to their age. For example, if a person is a preschool child, a home or nursery school is designated as his or her district. In the case of students, school districts are assigned as their commuting districts. In the case of the active working population, the working district is assigned as the commuting area. In the case of the retirement population, commuting places are allocated to the districts where nursing homes or retirement community centers are located.
• Disease model: We considered a simple SEIR [2] simulation of infectious disease spread in the population model, in which no births, deaths, or introductions of new individuals occur. Each individual is assigned one of the following disease states: Susceptible (S), Exposed (E), Infectious (I), or Recovered (R). Susceptible individuals may contract the disease at a given rate when in contact with an infectious individual, and enter the exposed state when they become infected but are not yet infectious themselves. Exposed individuals then become infectious.

Fig. 3. Class diagram for influenza simulation model (classes: PersonAgent, Place, Home, Workplace, Borough, and TaxiData)

4.2 Spreading Process

The spreading process of the epidemic influenza is simulated by dividing each day into a daytime infection period and a nighttime infection period. At night, some of the family members of an infected person become infected. During the daytime, we examine the commuter place of the infected person and randomly select people in the same commuter district to be infected with a random transmission rate. In order to simulate the mobility of the infected person more realistically, we utilize real-time taxi calls around the commuter district in the Seoul metropolitan area [11]. If the number of taxi calls is large, we conclude that the mobility of the commuter area is high and therefore increase the infection transmission rate. The infection period of an infected person is assigned as a random number between 3 and 7 days. After the infection period, the state of the infected person changes to the recovered state.
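The day/night cycle above can be compressed into the following sketch. This is an illustrative simplification, not the JADE implementation: the field names (`district`, `household`) are assumptions, and the taxi-call influence is reduced to a per-district multiplicative factor on the base transmission rate.

```python
import random

def step_day(people, day, base_rate, taxi_factor):
    """One simulated day of the spreading process: state transitions
    (E -> I after latency, I -> R after the infection period), then
    daytime infection in commuter districts scaled by taxi-call mobility,
    then nighttime infection within households."""
    for p in people:
        if p["state"] == "E" and day >= p["infectious_day"]:
            p["state"] = "I"
        elif p["state"] == "I" and day >= p["recover_day"]:
            p["state"] = "R"
    infectious = [p for p in people if p["state"] == "I"]
    for src in infectious:
        for p in people:
            if p["state"] != "S":
                continue
            # daytime: random contact in the same commuter district;
            # many taxi calls imply higher mobility, hence a higher rate
            day_hit = (p["district"] == src["district"] and
                       random.random() < base_rate * taxi_factor[src["district"]])
            # nighttime: contact among members of the same household
            night_hit = (p["household"] == src["household"] and
                         random.random() < base_rate)
            if day_hit or night_hit:
                p["state"] = "E"              # exposed, not yet infectious
                p["infectious_day"] = day + 1  # becomes infectious the next day
                # infection period drawn uniformly between 3 and 7 days
                p["recover_day"] = day + 1 + random.randint(3, 7)
```

Here `taxi_factor` maps a commuter district to a mobility multiplier derived from taxi-call counts; in the actual system, these steps run inside JADE PersonAgents.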



5 Simulation Results

We have implemented the epidemic influenza model of Fig. 3 with JADE (Java Agent DEvelopment Framework) [12]. JADE is an open source platform for peer-to-peer agent-based applications. As shown in Table 1, a total of four computers are configured to create the proposed simulation and analysis environment: three computers are configured with Logstash to perform simulations and send outputs to the analysis server. The analysis server is equipped with the Elasticsearch engine to collect simulation outputs from the three simulation instances dispersed over the network, and also calls Kibana services to visualize simulation results. Approximately 170,000 PersonAgents are created according to the population model presented in Sect. 4.1. The spreading process of the infectious influenza has been simulated separately during daytime and nighttime, and real-time taxi calls have been queried in order to account for the mobility of infected persons over the given period of time.

Table 1. System specification.

                        CPU                RAM   OS              Added services
Simulation instance #1  i3-4160T 3.10 GHz  4 GB  Ubuntu 14.04.5  JADE, Logstash
Simulation instance #2  i3-4160T 3.10 GHz  4 GB  Ubuntu 14.04.5  JADE, Logstash
Simulation instance #3  i3-4160T 3.10 GHz  4 GB  Windows 7       JADE, Logstash
Analysis server         i5-3470T 3.20 GHz  4 GB  Windows 10      Elasticsearch, Kibana

Figure 4 visualizes the spreading phenomenon of the influenza on the map of metropolitan Seoul. We traced the number of infected patients in each borough with a simulation time step of one day, and colored each borough according to the intensity of influenza patients. Figure 4 exhibits four representative maps, each illustrating one of the S, E, I, R phases of the infectious influenza over one month. After GangNam borough had a small number of susceptible influenza patients (Fig. 4-①), the influenza started to spread within GangNam borough, as shown in Fig. 4-②. Figure 4-③ shows that the influenza has spread to the neighboring boroughs. Figure 4-④ shows that most of the influenza patients have started to recover. Figure 5 shows the number of infected people per day during 26 simulated days. In Fig. 5, we can also see the SEIR phases of the infectious influenza: the outbreak occurs after 13 days, and the transmission rate goes down until the infected people are recovered and no more new patients occur.
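The per-borough, per-day counts behind Figs. 4 and 5 correspond to a terms-plus-date aggregation over the indexed output records. A stdlib-only sketch of that computation (the record field names are our assumption; in practice Kibana issues the equivalent aggregation query to Elasticsearch) is:

```python
from collections import Counter

def infected_per_borough_per_day(records):
    """Sum infected counts per (date, borough) pair -- the computation
    behind a Kibana terms/date-histogram visualization."""
    counts = Counter()
    for rec in records:
        counts[(rec["date"], rec["borough"])] += rec["infected"]
    return counts

# Example records as shipped by the simulation instances
records = [
    {"date": "2017-06-13", "borough": "GangNam", "infected": 5},
    {"date": "2017-06-13", "borough": "GangNam", "infected": 3},
    {"date": "2017-06-13", "borough": "SeoCho", "infected": 1},
]
counts = infected_per_borough_per_day(records)
```

The resulting counts drive both the choropleth coloring of Fig. 4 and the daily time series of Fig. 5.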



Fig. 4. Spreading of influenza: map visualization (Color figure online)

Fig. 5. Transmission of influenza over time

6 Conclusion

In this paper, we have constructed a simulation and analysis environment based on Elastic Stack technology, which enables simulation, data management, and analysis to be performed on different computing resources distributed over the network. We illustrate the effectiveness of the environment through an infectious influenza simulation. The influenza simulation utilizes the Elastic Stack-based simulation and analysis environment and successfully reproduces the spreading phenomenon of the infectious influenza. One of the advantages of our simulation and analysis environment is that we can scale it up easily by adding more Logstash plugins and Elasticsearch engines as we simulate larger areas with larger populations. We would like to utilize the Elasticsearch-Hadoop solution in order to incorporate big data into the simulation framework. By connecting the massive data storage and deep processing power of Hadoop with the real-time search and analytics of Elasticsearch, large-scale simulations with massive amounts of unstructured input and output datasets can be performed effectively.



References

1. Chao, D.L., Halloran, M.E., Obenchain, V.J., Longini Jr., I.M.: FluTE, a publicly available stochastic influenza epidemic simulation model. PLoS Comput. Biol. 6(1), e1000656 (2010)
2. Stehle, J., Voirin, N., Barrat, A., et al.: Simulation of an SEIR infectious disease model on the dynamic contact network of conference attendees. BMC Med. 9(87), 1–15 (2011)
3. Taylor, S.J., Khan, A., Morse, K.L., Tolk, A., Yilmaz, L., Zander, J.: Grand challenges on the theory of modeling and simulation. In: Proceedings of the Symposium on Theory of Modeling & Simulation - DEVS Integrative M&S (2013)
4. Kipman, S., Ilhan Akbas, M., Turgut, D.: EpidemicSim: epidemic simulation system with realistic mobility. In: 37th Annual IEEE Conference on Local Computer Networks Workshops. IEEE (2012)
5. Huang, C.-Y.: An agent-based epidemic simulation of social behaviors affecting HIV transmission among Taiwanese homosexuals. Comput. Math. Meth. Med., Article ID 867264 (2015)
6. Lee, K., Jung, K., Park, J., Kwon, D.: A MapReduce-based output analysis tool for large-scale simulations. Adv. Eng. Softw. 95(C), 28–37 (2016)
7. Elastic Stack's official web page.
8. Elastic Stack Applications.
9. Modern threat intelligence & security analytics. elasticon/tour/2018/amsterdam/elastic-kpn-modern-threat-intelligence-security-analytics
10. Korea Population Census (2005).
11. Taxi Calls in Seoul metropolitan area, June 2017.
12. JAVA Agent DEvelopment Framework.

Author Index

Jang, Seong-Yong 501 Jeong, San 60 Ji, Hang 72 Jia, Zhengxuan 72, 168 Jiang, Xuemei 107, 241

Abe, Kuniyoshi 401 Ahmad, Nizam 255 Ajima, Daiki 466 Arai, Takashi 414 Araki, Tatsuto 466 Arfa, Reza 129 Bai, Tian 186 Becerra Fernández, Mauricio Bo, Liu 32

Kageyama, Akira 439 Kamata, Hiroyuki 414 Kang, Ho-Seok 501 Kanno, Taro 306 Kase, Yuta 414 Kim, Haejoong 232 Koiso, Kazuki 563 Koyamada, Koji 286 Kurimoto, Ryo 488


Chai, Xudong 168 Chan, Haopeng 85 Chang, Zhichao 513 Chen, Kai 296 Choi, Changbeom 60 Cordasco, Gennaro 151 Cosenz, Federico 220 D’Auria, Matteo 151 Dai, Rong 168 De Magistris, Giovanni 45 Dyner Rezonzew, Isaac 220 Fang, Ke 3, 18 Fujihara, Masayuki 271, 425 Fujii, Hideto 451 Fukazawa, Keiichiro 143 Furuta, Kazuo 306 González La Rotta, Elsa Cristina Guo, Liqin 168 Hamagami, Kunihiko 271 Han, Liang 513 Hasegawa, Kyoko 488, 524 Hayashi, Kengo 552 Hosoyamada, Shin’Ya 439 Huo, Xin 349 Hwangbo, Sungwoo 575 Ishida, Sachiko 372 Ito, Kenichi 207


Laili, Yuanjun 539 Lee, Kangsun 575 Lei, Pu-wen 286 Li, Bohu 168 Li, Da 382 Li, Liang 488, 524 Li, Ruifang 382 Li, Wei 316 Li, Zemin 539 Li, Zhengying 241 Liang, Yazhou 72 Lim, Dae-Eun 232 Lin, Shenglin 316 Lin, Tingyu 168, 186 Liu, Jing 296 Liu, Peng 85 Liu, Quan 107 Liu, Weizhen 349 Liu, Xiaoliang 72 Lou, Ping 107, 241, 382 Ma, Ping 18, 316 Matsuura, Ryo 372 Meng, Lun 85 Miyake, Yohei 255 Morimoto, Ikuya 488 Moriyama, Takao 45 Munawar, Asim 45



Nakada, Satoshi 488 Nakamura, Takashi 466 Niu, Shuai 316 Noda, Yukihiro 524 Noh, Seung-Min 501 Nonaka, Jorji 552, 563 Ogawa, Sakiko 306 Okamoto, Atsushi 524 Ozaki, Takuya 488 Pham, Duc Truong 333 Pham, Tu-Hoa 45 Qing, Duzheng 168 Quan, Hongyan 477 Ren, Lei


Sakae, Yuto 488 Sakamoto, Naohisa 552, 563 Sarann, Ly 451 Scarano, Vittorio 151 Seok, Moongi 60 Shabanzadeh, Parvaneh 129 Shalihin Bin Othman, Muhammad 96 Shao, Fengjing 361 Shi, Guoqiang 168 Shoji, Fumiyoshi 563 Siev, Sokly 451 Song, Xiao 72, 168, 296 Song, Zilong 477 Songyan, Wang 32 Spagnuolo, Carmine 151 Sui, Yi 361 Sun, Jinghan 296 Sun, Rencheng 361 Sun, Yao 85 Sun, Yaqiang 539 Tachibana, Ryuki 45 Tan, Gary 96 Tanaka, Satoshi 488, 524 Tanaka, Tomohiro 451 Tao, Chao 32 Tatsubori, Michiaki 45

Umeda, Takayuki 143 Unami, Koichi 425 Usui, Hideyuki 255 Wang, Changbo 477 Wang, Fei 186 Wang, Jiangyun 513 Wang, Yuzhu 349 Wei, Qin 382 Xiao, Yingying 168, 186 XiaoBing, Shang 32 Xie, Hongnan 296 Xing, Chi 168 Xu, Wenjun 333 Xue, Shishan 477 Yaegashi, Yuta 271, 425 Yamaguchi, Hiroshi 524 Yamaoka, Yoshiaki 552 Yan, Ke 333 Yanai, Shu 524 Yang, Jiyong 60 Yang, Ming 3, 18, 316 Yao, Bitao 333 Yoshimura, Chihiro 451 Yoshioka, Hidekazu 271, 425, 451 Yoshioka, Yumi 271 Yu, Xiang 361 Yuan, Wei 119 Yusof, Rubiyah 129 Zhai, Xiang 72 Zhang, Chi 286 Zhang, Lin 168, 186, 539 Zhang, Linxuan 119 Zhang, Tianze 349 Zhang, Xiaomei 107, 241 Zhang, Yan 85 Zhang, Yingxi 168 Zhou, Xinquan 477 Zhou, Yuchen 3, 18 Zhou, Zude 333 Zhu, Cui 241 Zhu, PanPan 107 Zou, Xiangyu 382
