Information and Communication Technology for Intelligent Systems

The book gathers papers addressing state-of-the-art research in all areas of Information and Communication Technologies and their applications in intelligent computing, cloud storage, data mining and software analysis. It presents the outcomes of the third International Conference on Information and Communication Technology for Intelligent Systems, which was held on April 6–7, 2018, in Ahmedabad, India. Divided into two volumes, the book discusses the fundamentals of various data analytics and algorithms, making it a valuable resource for researchers’ future studies.




Smart Innovation, Systems and Technologies 106

Suresh Chandra Satapathy · Amit Joshi    Editors

Information and Communication Technology for Intelligent Systems Proceedings of ICTIS 2018, Volume 1

Smart Innovation, Systems and Technologies Volume 106

Series editors
Robert James Howlett, Bournemouth University and KES International, Shoreham-by-sea, UK
e-mail: [email protected]
Lakhmi C. Jain, Faculty of Engineering and Information Technology, Centre for Artificial Intelligence, University of Technology, Sydney, NSW, Australia; Faculty of Science, Technology and Mathematics, University of Canberra, Canberra, ACT, Australia; KES International, Shoreham-by-Sea, UK
e-mail: [email protected]; [email protected]

The Smart Innovation, Systems and Technologies book series encompasses the topics of knowledge, intelligence, innovation and sustainability. The aim of the series is to make available a platform for the publication of books on all aspects of single and multi-disciplinary research on these themes in order to make the latest results available in a readily-accessible form. Volumes on interdisciplinary research combining two or more of these areas are particularly sought. The series covers systems and paradigms that employ knowledge and intelligence in a broad sense. Its scope is systems having embedded knowledge and intelligence, which may be applied to the solution of world problems in industry, the environment and the community. It also focusses on the knowledge-transfer methodologies and innovation strategies employed to make this happen effectively. The combination of intelligent systems tools and a broad range of applications introduces a need for a synergy of disciplines from science, technology, business and the humanities. The series will include conference proceedings, edited collections, monographs, handbooks, reference books, and other relevant types of book in areas of science and technology where smart systems and technologies can offer innovative solutions. High quality content is an essential feature for all book proposals accepted for the series. It is expected that editors of all accepted volumes will ensure that contributions are subjected to an appropriate level of reviewing process and adhere to KES quality principles.

More information about this series at http://www.springer.com/series/8767


Editors Suresh Chandra Satapathy School of Computer Engineering Kalinga Institute of Industrial Technology Bhubaneswar, India

Amit Joshi Sabar Institute of Technology Gujarat Technological University Ahmedabad, Gujarat, India

ISSN 2190-3018
ISSN 2190-3026 (electronic)
Smart Innovation, Systems and Technologies
ISBN 978-981-13-1741-5
ISBN 978-981-13-1742-2 (eBook)
https://doi.org/10.1007/978-981-13-1742-2
Library of Congress Control Number: 2018949057

© Springer Nature Singapore Pte Ltd. 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

This SIST volume contains the papers presented at ICTIS 2018: Third International Conference on Information and Communication Technology for Intelligent Systems. The conference was held during April 6–7, 2018, in Ahmedabad, India, and was organized by Global Knowledge Research Foundation, Raksha Shakti University, and Computer Engineering Division Board, the Institution of Engineers (India), with support from Gujarat Innovation Society and Gujarat Council of Science and Technology. It targeted state-of-the-art as well as emerging topics pertaining to ICT and effective strategies for its implementation in engineering and intelligent applications. The objective of this international conference is to provide opportunities for researchers, academicians, industry persons, and students to interact and exchange ideas, experience, and expertise in the current trends and strategies for information and communication technologies. Besides this, participants were introduced to the vast avenues and the current and emerging technological developments in the field of ICT, and its applications were thoroughly explored and discussed. The conference aimed to attract a large number of high-quality submissions; stimulate cutting-edge research discussions among academic pioneering researchers, scientists, industrial engineers, and students from all around the world; provide a forum for researchers to propose new technologies, share their experiences, and discuss future solutions for design infrastructure for ICT; provide a common platform for academic pioneering researchers, scientists, engineers, and students to share their views and achievements; enrich technocrats and academicians by presenting their innovative and constructive ideas; and focus on innovative issues at the international level by bringing together experts from different countries.
Research submissions in various advanced technology areas were received, and after a rigorous peer review process with the help of the program committee members and external reviewers, 72 papers were accepted with an acceptance rate of 0.23. The conference featured many distinguished personalities, including Narottam Sahoo, Advisor and Member Secretary, GUJCOST, DST, Government of Gujarat; Prof. Milan Tuba, Vice-Rector, Singidunum University, Serbia; Shri Aninda Bose, Senior Publishing Editor, Springer Nature; Dr. Nilanjan Dey, Techno India College of Engineering, Kolkata, India; Dr. Shyam Akashe, Professor, ITM University, Gwalior; Dr. Parikshit Mahalle, Professor, Sinhgad Group of Institutions, Pune; Mr. Bharat Patel, Chairman, CEDB, the Institution of Engineers (India); and Dr. Priyanka Sharma, Raksha Shakti University, Ahmedabad. Separate invited talks were organized in industrial and academic tracks on both days. We are indebted to all organizing partners for their immense support in making this conference possible on such a grand scale. A total of 14 sessions were organized as part of ICTIS 2018, including 11 technical, 1 plenary, 1 keynote, and 1 inaugural session. A total of 63 papers were presented in six technical sessions with high discussion insights. The total number of accepted submissions was 72, with a focal point on ICT and intelligent systems. Our sincere thanks to all sponsors and to the press, print, and electronic media for their excellent coverage of this conference.

Bhubaneswar, India    Suresh Chandra Satapathy
Ahmedabad, India      Amit Joshi
April 2018

Contents

Novel Peak-to-Average Power Reduction Technique in Combination with Adaptive Digital Pre-distortion
V. Kiran, K. L. Sudha and Vinila Nagaraj

Semantic Segmentation Using Deep Learning for Brain Tumor MRI via Fully Convolution Neural Networks
Sanjay Kumar, Ashish Negi and J. N. Singh

An Efficient Cryptographic Mechanism to Defend Collaborative Attack Against DSR Protocol in Mobile Ad hoc Networks
E. Suresh Babu, S. Naganjaneyulu, P. V. Srivasa Rao and G. K. V. Narasimha Reddy

Materialized Queries with Incremental Updates
Sonali Chakraborty and Jyotika Doshi

Big Data as Catalyst for Urban Service Delivery
Praful Gharpure

Traffic Signal Automation Through IoT by Sensing and Detecting Traffic Intensity Through IR Sensors
Sameer Parekh, Nilam Dhami, Sandip Patel and Jaimin Undavia

Performance Evaluation of Various Data Mining Algorithms on Road Traffic Accident Dataset
Sadiq Hussain, L. J. Muhammad, F. S. Ishaq, Atomsa Yakubu and I. A. Mohammed

Casper: Modification of Bitcoin Using Proof of Stake
Nakul Sheth, Priteshkumar Prajapati, Ayesha Shaikh and Parth Shah

Provable Data Possession Using Identity-Based Encryption
Smit Kadvani, Aditya Patel, Mansi Tilala, Priteshkumar Prajapati and Parth Shah

Classification of Blood Cancer and Form Associated Gene Networks Using Gene Expression Profiles
Tejal Upadhyay and Samir Patel

Stock Market Decision-Making Model Based on Spline Approximation Using Minimax Criterion
I. Yu. Vygodchikova, V. N. Gusyatnikov and G. Yu. Chernyshova

Developing a Multi-modal Transport System by Linkage of Local Public Transport with Commuter Trains Using Software as a Service (SaaS) Architecture
Godson Michael D’silva, Lukose Roy, Anoop Kunjumon and Azharuddin Khan

Accelerate the Execution of Graph Processing Using GPU
Shweta Nitin Aher and Sandip M. Walunj

Ischemic Heart Disease Deduction Using Doppler Effect Spectrogram
Ananthi Sheshasaayee and V. Meenakshi

Transmission Expansion Planning for 133 Bus Tamil Nadu Test System Using Artificial Immune System Algorithm
S. Prakash and Joseph Henry

Survey and Evolution Study Focusing Comparative Analysis and Future Research Direction in the Field of Recommendation System Specific to Collaborative Filtering Approach
Axita Patel, Amit Thakkar, Nirav Bhatt and Purvi Prajapati

Flower Pollination Optimization and RoI for Node Deployment in Wireless Sensor Networks
Kapil Keswani and Anand Bhaskar

Exploring Causes of Crane Accidents from Incident Reports Using Decision Tree
Krantiraditya Dhalmahapatra, Kritika Singh, Yash Jain and J. Maiti

A Novel Controlled Rectifier to Achieve Maximum Modulation Using AC-AC Matrix Converter with Improved Modulation
K. Bhaskar and Parvathi Vijayan

Service Quality Parameters for Social Media-Based Government-to-Citizen Services
Sukhwinder Singh, Anuj Kumar Gupta and Lovneesh Chanana

Slot-Loaded Multiband Miniaturized Rectangular Microstrip Antenna for Mobile Communications
Sajeed S. Mulla and Shraddha S. Deshpande

Prediction of a Movie’s Success Using Data Mining Techniques
Shikha Mundra, Arjun Dhingra, Avnip Kapur and Dhwanika Joshi

Brain Tumor Segmentation with Skull Stripping and Modified Fuzzy C-Means
Aniket Bilenia, Daksh Sharma, Himanshu Raj, Rahul Raman and Mahua Bhattacharya

Extended Security Model over Data Communication in Online Social Networks
P. Mareswara Rao and K. Rajashekara Rao

Emotional Strategy in the Classroom Based on the Application of New Technologies: An Initial Contribution
Hector F. A. Gomez, Susana A. T. Arias, T. Edwin Fabricio Lozada, C. Carlos Eduardo Martínez, Freddy Robalino, David Castillo and P. Luz M. Aguirre

Using Clustering for Package Cohesion Measurement in Aspect-Oriented Systems
Puneet Jai Kaur and Sakshi Kaushal

Fungal Disease Detection in Maize Leaves Using Haar Wavelet Features
Anupama S. Deshapande, Shantala G. Giraddi, K. G. Karibasappa and Shrinivas D. Desai

Features Extraction and Dataset Preparation for Grading of Ethiopian Coffee Beans Using Image Analysis Techniques
Karpaga Selvi Subramanian, S. Vairachilai and Tsadkan Gebremichael

An Overview of Internet of Things: Architecture, Protocols and Challenges
Pramod Aswale, Aditi Shukla, Pritam Bharati, Shubham Bharambe and Shekhar Palve

Assay: Hybrid Approach for Sentiment Analysis
D. V. Nagarjuna Devi, Thatiparti Venkata Rajini Kanth, Kakollu Mounika and Nambhatla Sowjanya Swathi

Animal/Object Identification Using Deep Learning on Raspberry Pi
Param Popat, Prasham Sheth and Swati Jain

Efficient Energy Harvesting Using Thermoelectric Module
Pallavi Korde and Vijaya Kamble

Refining Healthcare Monitoring System Using Wireless Sensor Networks Based on Key Design Parameters
Uttara Gogate and Jagdish Bakal

OLabs of Digital India, Its Adaptation for Schools in Côte d’Ivoire, West Africa
Hal Ahassanne Demba, Prema Nedungadi and Raghu Raman

Energy Harvesting Based on Magnetic Induction
A. A. Gaikwad and S. B. Kulkarni

Design of Asset Tracking System Using Speech Recognition
Ankita Pendse, Arun Parakh and H. K. Verma

Cybercrime: To Detect Suspected User’s Chat Using Text Mining
Khan Sameera and Pinki Vishwakarma

Techniques to Extract Topical Experts in Twitter: A Survey
Kuljeet Kaur and Divya Bansal

Comparison of BOD5 Removal in Water Hyacinth and Duckweed by Genetic Programming
Ramkumar Mahalakshmi, Chandrasekaran Sivapragasam and Sankararajan Vanitha

Mathematical Modeling of Gradually Varied Flow with Genetic Programming: A Lab-Scale Application
Chandrasekaran Sivapragasam, Poomalai Saravanan, Kaliappan Ganeshmoorthy, Atchutha Muhil, Sundharamoorthy Dilip and Sundarasrinivasan Saivishnu

A Hybrid Intrusion Detection System for Hierarchical Filtration of Anomalies
Pragma Kar, Soumya Banerjee, Kartick Chandra Mondal, Gautam Mahapatra and Samiran Chattopadhyay

A Novel Method for Image Encryption
Ravi Saharan and Sadanand Yadav

PPCS-MMDML: Integrated Privacy-Based Approach for Big Data Heterogeneous Image Set Classification
D. Franklin Vinod and V. Vasudevan

Polynomial Time Subgraph Isomorphism Algorithm for Large and Different Kinds of Graphs
Rachna Somkunwar and Vinod M. Vaze

Study of Different Document Representation Models for Finding Phrase-Based Similarity
Preeti Kathiria and Harshal Arolkar

Predicting Consumer’s Complaint Behavior in Telecom Service: An Empirical Study of India, Sri Lanka, and Bangladesh
Amandeep Singh and P. Vigneswara Ilavarasan

Computational Intelligence in Embedded System Design: A Review
Jonti Talukdar, Bhavana Mehta and Sachin Gajjar

Knowledge-Based Approach for Word Sense Disambiguation Using Genetic Algorithm for Gujarati
Zankhana B. Vaishnav and Priti S. Sajja

Compressive Sensing Approach to Satellite Hyperspectral Image Compression
K. S. Gunasheela and H. S. Prasantha

Development of Low-Cost Embedded Vision System with a Case Study on 1D Barcode Detection
Vaishali Mishra, Harsh K. Kapadia, Tanish H. Zaveri and Bhanu Prasad Pinnamaneni

Path Planning of Mobile Robot Using PSO Algorithm
S. Pattanayak, S. Agarwal, B. B. Choudhury and S. C. Sahoo

An Application of Maximum Probabilistic-Based Rough Set on ID3
Utpal Pal and Sharmistha Bhattacharya (Halder)

A Novel Algorithm for Video Super-Resolution
Rohita Jagdale and Sanjeevani Shah

Counting the Number of People in Crowd as a Part of Automatic Crowd Monitoring: A Combined Approach
Yashna Bharti, Ravi Saharan and Ashutosh Saxena

Improving Image Quality for Detection of Illegally Parked Vehicle in No Parking Area
Rikita Nagar and Hiteishi Diwanji

Analysis of Image Inconsistency Based on Discrete Cosine Transform (DCT)
Vivek Mahale, Mouad M. H. Ali, Pravin L. Yannawar and Ashok Gaikwad

Implementation of Word Sense Disambiguation on Hadoop Using Map-Reduce
Anuja Nair, Kaushik Kyada and Neel Zadafiya

Low-Power ACSU Design for Trellis Coded Modulation (TCM) Decoder
N. N. Thune and S. L. Haridas

Enhancing Security of Android-Based Smart Devices: Preventive Approach
Nisha Shah and Nilesh Modi

Survey of Techniques Used for Tolerance of Flooding Attacks in DTN
Maitri Shah and Pimal Khanpara

An In-Depth Survey of Techniques Employed in Construction of Emotional Lexicon
Pallavi V. Kulkarni, Meghana B. Nagori and Vivek P. Kshirsagar

DWT-Based Blind Video Watermarking Using Image Scrambling Technique
C. N. Sujatha and P. Sathyanarayana

Fractals: A Novel Method in the Miniaturization of a Patch Antenna with Bandwidth Improvement
Geeta Kalkhambkar, Rajashri Khanai and Pradeep Chindhi

Adaptive Live Task Migration in Cloud Environment for Significant Disaster Prevention and Cost Reduction
Namra Bhadreshkumar Shah, Tirth Chetankumar Thakkar, Shrey Manish Raval and Harshal Trivedi

Lip Tracking Using Deformable Models and Geometric Approaches
Sumita Nainan and Vaishali Kulkarni

Highly Secure DWT Steganography Scheme for Encrypted Data Hiding
Vijay Kumar Sharma, Pratistha Mathur and Devesh Kumar Srivastava

A Novel Approach to the ROI Extraction in Palmprint Classification
Swati R. Zambre and Abhilasha Mishra

A Novel Video Genre Classification Algorithm by Keyframe Relevance
Jina Varghese and K. N. Ramachandran Nair

Reduction of Hardware Complexity of Digital Circuits by Threshold Logic Gates Using RTDs
Muhammad Khalid, Shubhankar Majumdar and Mohammad Jawaid Siddiqui

Analysis of High-Power Bidirectional Multilevel Converters for High-Speed WAP-7D Locomotives
J. Suganthi Vinodhini and R. Samuel Rajesh Babu

A Study on Different Types of Base Isolation System over Fixed Based
M. Tamim Tanwer, Tanveer Ahmed Kazi and Mayank Desai

Suppression of Speckle Noise in Ultrasound Images Using Bilateral Filter
Ananya Gupta, Vikrant Bhateja, Avantika Srivastava and Aditi Gupta

Author Index

About the Editors

Suresh Chandra Satapathy is currently working as Professor, School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India. He obtained his Ph.D. in computer science and engineering from Jawaharlal Nehru Technological University (JNTU), Hyderabad, India, and M.Tech. in CSE from NIT Rourkela, Odisha, India. He has 27 years of teaching experience. His research interests are data mining, machine intelligence, and swarm intelligence. He has acted as program chair of many international conferences and edited six volumes of proceedings from Springer LNCS and AISC series. He is currently guiding eight Ph.D. scholars. He is also a senior member of IEEE.

Amit Joshi is a young entrepreneur and researcher who holds an M.Tech. in computer science and engineering and is currently pursuing research in the areas of cloud computing and cryptography. He has 6 years of academic and industrial experience at prestigious organizations in Udaipur and Ahmedabad. Currently, he is working as Assistant Professor in the Department of Information Technology, Sabar Institute of Technology, Gujarat, India. He is an active member of ACM, CSI, AMIE, IACSIT Singapore, IDES, ACEEE, NPA, and many other professional societies. He also holds the post of Honorary Secretary of the CSI’s Udaipur Chapter and Secretary of the ACM’s Udaipur Chapter. He has presented and published more than 30 papers in national and international journals/conferences of IEEE and ACM. He has edited three books on advances in open-source mobile technologies, ICT for integrated rural development, and ICT for competitive strategies. He has also organized more than 15 national and international conferences and workshops, including the international conference ICTCS 2014 at Udaipur through ACM’s ICPS. In recognition of his contributions, he received the Appreciation Award from the Institution of Engineers, Kolkata, India, in 2014 and an award from the SIG-WNs Computer Society of India in 2012.

Novel Peak-to-Average Power Reduction Technique in Combination with Adaptive Digital Pre-distortion

V. Kiran, K. L. Sudha and Vinila Nagaraj

Abstract Orthogonal Frequency-Division Multiplexing (OFDM) is considered one of the most viable options for conveying information at high data rates in wireless communication. OFDM is a multi-carrier communication technique that uses orthogonal subcarriers and offers high bandwidth efficiency and robustness to highly frequency-selective fading channels. In spite of several advantages, OFDM has the major drawback of a high Peak-to-Average Power Ratio (PAPR), which may lead to power inefficiency in the RF transmission section, high in-band and out-of-band radiation, inter-carrier interference, and degradation in Bit Error Rate (BER) performance. Hence, a minimum PAPR is strongly preferred. This paper proposes a novel technique for reducing peak-to-average power, improving the bit error rate performance, and saving bandwidth by combining a novel PAPR reduction technique with adaptive DPD. The proposed method gives better PAPR reduction as well as better BER performance compared to other methods.

Keywords OFDM · PAPR · DPD · BER · CCDF

V. Kiran (✉) ⋅ V. Nagaraj
Department ECE, RVCE, Bangalore 59, India
e-mail: [email protected]

K. L. Sudha
Department ECE, DSCE, Bangalore 78, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_1

1 Introduction

Orthogonal Frequency-Division Multiplexing (OFDM) is a viable option to convey information at high bit rates in mobile and wireless communication. OFDM has many advantages, such as robust transmission over frequency-selective fading channels, resilience to inter-symbol interference (ISI) and narrowband effects, efficient usage of available bandwidth, and simpler channel equalization. An OFDM signal can be generated using IFFT signal processing, which is equivalent to the summation of many multicarrier signals [1]. This summation leads to fluctuation in the envelope. The fluctuating envelope is further distorted by nonlinear power amplifiers, because its large amplitude swings drive the power amplifier into the saturation region [2]. Hence, nonlinear amplifiers cannot operate at a high efficiency level. Many techniques have been proposed for peak-to-average power reduction in OFDM signals (Fig. 1). The present 4G exploits single-carrier FDMA, where an additional DFT is required before the IFFT. This leads to frequency offset and drifts. Every proposed technique is usually evaluated through the Complementary Cumulative Distribution Function (CCDF), which is a probabilistic approach. The clipping technique shows the lowest PAPR but the worst BER performance. Here, a nonlinear scaling method is proposed in which the signal is scaled down to near the threshold level of the amplifier's linear region. The scaled-down signal can be scaled up again at the receiver because the scaling information is shared between transmitter and receiver, but only for signals with high PAPR. OFDM performance in LTE cannot be achieved by a PAPR reduction technique alone. Digital Pre-Distortion (DPD) in conjunction with a PAPR reduction technique can improve the power efficiency, BER performance, and cost-effectiveness.

Fig. 1 Classification of PAPR reduction techniques [11]

2 OFDM Signal

OFDM is a multi-carrier modulation, as defined in Eq. 1. An OFDM baseband signal generated with N-point IFFT signal processing is equivalent to the summation of N sinusoidal components.

$$x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\, e^{j\frac{2\pi}{N}kn} = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\left[\cos\left(\frac{2\pi}{N}kn\right) + j\sin\left(\frac{2\pi}{N}kn\right)\right] \tag{1}$$

where n = 0, 1, 2, …, N − 1 and X(k) is the complex-valued input data symbol. The OFDM signal shows variations in its envelope, which cause distortion in a nonlinear amplifier. The probability of a high signal peak depends on the input data sequence X = {X[0], X[1], …, X[N − 1]}, which can be analyzed with a statistical approach. Therefore, the PAPR is defined as the ratio of the maximum power to the average power of the signal.

$$\mathrm{PAPR} = \frac{\max_{n} |x[n]|^{2}}{E\left\{|x[n]|^{2}\right\}} \tag{2}$$
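As a minimal sketch of Eqs. (1) and (2), the following Python code builds the time-domain signal with a direct N-point inverse DFT and computes its PAPR. The function names `idft`, `papr`, and `papr_db`, the choice of N = 64, and the random QPSK symbols are illustrative assumptions and are not taken from the paper.

```python
import cmath
import math
import random


def idft(X):
    """Direct N-point inverse DFT following Eq. (1)."""
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]


def papr(x):
    """PAPR following Eq. (2): peak power divided by mean power (linear ratio)."""
    power = [abs(v) ** 2 for v in x]
    return max(power) / (sum(power) / len(power))


def papr_db(x):
    """PAPR expressed in decibels."""
    return 10.0 * math.log10(papr(x))


# Illustrative use (assumed setup): one OFDM symbol with 64 random QPSK subcarriers.
rng = random.Random(0)
qpsk = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2) for _ in range(64)]
print(f"PAPR of one random QPSK OFDM symbol: {papr_db(idft(qpsk)):.2f} dB")
```

Two limiting cases help to check such a routine: a single active subcarrier gives a constant envelope (PAPR of 1), while identical symbols on all N subcarriers add coherently at n = 0 and give the worst-case PAPR of N.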

Because of the high PAPR, the signal must be backed off before being amplified by nonlinear power amplifiers, which gives rise to low SNR. Therefore, a signal with high PAPR will have low SNR. CCDF is a performance evaluation metric for PAPR. It gives the probability that the PAPR of the OFDM signal exceeds a threshold value. The operation of the nonlinear power amplifier can be classified into a linear and a nonlinear region. In the nonlinear region, the power amplifier exhibits AM/AM and AM/PM distortion; the AM/AM and AM/PM characteristics are widely accepted as figures of merit for a nonlinear system. The time-domain equation of the baseband OFDM signal can be expressed as

$$x_{\mathrm{OFDM},b}(t) = \sum_{n=0}^{N-1} x[nT]\,\Pi\left(\frac{t - nT - T/2}{T}\right) = \sum_{n=0}^{N-1} \left(x_{\mathrm{Re}}[nT] + j\,x_{\mathrm{Im}}[nT]\right)\Pi\left(\frac{t - nT - T/2}{T}\right) \tag{3}$$

where

$$\Pi\left(\frac{t}{T}\right) = \begin{cases} 1, & -T/2 \le t \le T/2 \\ 0, & \text{otherwise} \end{cases}$$
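The CCDF metric described above can be estimated empirically by generating many OFDM symbols and counting how often their PAPR exceeds a threshold. The sketch below is one way to do this; the symbol count, N = 32, QPSK mapping, and the function names are assumptions chosen for illustration only.

```python
import cmath
import math
import random


def ofdm_symbol_papr_db(N, rng):
    """PAPR in dB of one OFDM symbol built from N random QPSK subcarriers."""
    X = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(N)]
    x = [
        sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]
    power = [abs(v) ** 2 for v in x]
    return 10.0 * math.log10(max(power) / (sum(power) / N))


def ccdf(papr_values_db, threshold_db):
    """Empirical CCDF: fraction of symbols whose PAPR exceeds the threshold."""
    exceed = sum(1 for p in papr_values_db if p > threshold_db)
    return exceed / len(papr_values_db)


# Illustrative run (assumed parameters): 200 symbols, 32 subcarriers each.
rng = random.Random(1)
paprs = [ofdm_symbol_papr_db(32, rng) for _ in range(200)]
for thr in (4.0, 6.0, 8.0, 10.0):
    print(f"P(PAPR > {thr:4.1f} dB) ~ {ccdf(paprs, thr):.3f}")
```

A PAPR reduction scheme is then judged by how far it shifts this curve toward lower thresholds at a given probability.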

Let Vth be the threshold envelope voltage corresponding to the input power near the boundary between the linear and nonlinear regions of the nonlinear power amplifier. In the nonlinear scaling method, a peak envelope far above Vth is scaled down more than a peak envelope just above Vth, so that the scaled envelopes are distributed just below Vth. Nonlinear scaling shows better PAPR reduction together with improved BER and SNR. For high-frequency signals, a PAPR reduction method alone is not sufficient to improve power efficiency. A combination of the novel PAPR reduction method and DPD improves power efficiency and cost-effectiveness and maximizes the linearization effectiveness of the power amplifier.
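The text does not give the exact scaling function, so the following is only a sketch of the idea under a hypothetical mapping: an envelope m above Vth is replaced by m' = Vth (1 − α (1 − Vth/m)), which always falls just below Vth and shrinks larger peaks by a larger factor. The mapping, the parameter `alpha`, and the function names `scale_down` and `scale_up` are assumptions for illustration, not the authors' method. The boolean flags play the role of the scaling information shared between transmitter and receiver.

```python
import cmath


def scale_down(x, vth, alpha=0.2):
    """Compress samples whose envelope exceeds vth; return samples and side-info flags.

    Hypothetical mapping: m -> vth * (1 - alpha * (1 - vth / m)) for m > vth.
    The result lies between vth * (1 - alpha) and vth, and the gain m'/m
    decreases as m grows, so higher peaks are scaled down more.
    """
    scaled, flags = [], []
    for v in x:
        m = abs(v)
        if m > vth:
            m_new = vth * (1.0 - alpha * (1.0 - vth / m))
            scaled.append(v * (m_new / m))  # only the magnitude changes, phase is kept
            flags.append(True)
        else:
            scaled.append(v)
            flags.append(False)
    return scaled, flags


def scale_up(scaled, flags, vth, alpha=0.2):
    """Receiver side: invert the mapping, but only for the flagged samples."""
    restored = []
    for v, was_scaled in zip(scaled, flags):
        if was_scaled:
            m_new = abs(v)
            m = vth / (1.0 - (1.0 - m_new / vth) / alpha)
            restored.append(v * (m / m_new))
        else:
            restored.append(v)
    return restored


# Illustrative use with a few hand-picked envelopes around vth = 1.0.
vth = 1.0
x = [cmath.rect(r, 0.3 * i) for i, r in enumerate((0.2, 0.9, 1.1, 1.8, 3.0))]
y, flags = scale_down(x, vth)
x_rec = scale_up(y, flags, vth)
print([round(abs(v), 3) for v in y])
```

Because the hypothetical mapping is monotonic for m > Vth, it can be undone exactly at the receiver as long as the flags are available, which is why the side information matters.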

3 Digital Pre-Distortion

To minimize the power amplifier’s in-band and out-of-band distortion, many linearization techniques have been proposed, including negative feedback, feed-forward, pre-distortion, and post-distortion. Currently, digital pre-distortion has become the most efficient method for linearizing power amplifiers due to its stability, easy implementation, and cost-effectiveness. It also adapts to changes in the nonlinear characteristics of the power amplifier. DPD generates inverse coefficients to cancel the AM/AM and AM/PM distortions introduced by the nonlinear power amplifier. The distortion and the pre-distortion are complementary, so that together they yield a linear response. To find the inverse coefficients for the power amplifier adaptively, the LMS and RLS algorithms are used [4].

4 Adaptive Algorithm

The block diagram in Fig. 2 shows an adaptive algorithm that updates the pre-distortion coefficients by computing the difference between input and output. Iterations are performed until this difference is approximately zero [5].

Fig. 2 Adaptive DPD block diagram

Novel Peak-to-Average Power Reduction Technique in Combination …

4.1 Novel PAPR Reduction Technique + ADPD

To achieve power efficiency and improved BER, an algorithm that combines the novel PAPR reduction technique with digital pre-distortion is employed, as shown in Fig. 3. The entire system consists of the novel PAPR reduction technique and ADPD. The novel technique reduces the PAPR to a predefined range and transmits the scaled signal using frequency modulation, along with the locations of the scaled positions, without using extra bandwidth. The output of the PAPR reduction stage is fed into the DPD for digital signal processing. The pre-distorter generates inverse coefficients for its input stimulus, i.e., it acts as an inverse PA nonlinearity equalizer. To satisfy the linearity requirement, the following procedure is followed for simulation and analysis:

(a) Pre-distortion parameters are initialized.
(b) The reduced-PAPR signal g(n) is obtained.
(c) The pre-distorted signal D(n) is fed into the PA to obtain the output y(n).
(d) The pre-distortion coefficients are calculated using the LMS algorithm.
(e) The iteration is continued until the error becomes zero.
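Steps (a) to (e) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation: the power amplifier is a toy memoryless model with a single cubic term, the pre-distorter is a three-term odd-order polynomial trained with an indirect learning loop, the input g(n) is a random stand-in for the reduced-PAPR signal, and the loop stops after one pass over the data rather than at exactly zero error. All names, coefficients and the step size `mu` are hypothetical.

```python
import numpy as np

def basis(v):
    """Odd-order memoryless polynomial terms v, v|v|^2, v|v|^4."""
    a2 = np.abs(v) ** 2
    return np.array([v, v * a2, v * a2 ** 2])

def power_amplifier(x, a3=-0.15 + 0.05j):
    """Toy PA (assumed): unit gain plus a cubic term giving AM/AM and AM/PM."""
    return x + a3 * x * np.abs(x) ** 2

def adapt_predistorter(g, mu=0.5):
    w = np.array([1.0 + 0j, 0j, 0j])      # (a) initialise as a pass-through
    for g_n in g:                         # (b) g(n): reduced-PAPR sample
        d_n = w @ basis(g_n)              # (c) pre-distorted sample D(n) ...
        y_n = power_amplifier(d_n)        #     ... drives the PA, giving y(n)
        phi = basis(y_n)
        e_n = d_n - w @ phi               # post-distorter error (indirect learning)
        w = w + mu * e_n * np.conj(phi)   # (d) LMS coefficient update
    return w                              # (e) here: stop after one pass

rng = np.random.default_rng(1)
n = 20000
g = rng.uniform(0.0, 0.8, n) * np.exp(2j * np.pi * rng.uniform(size=n))
w = adapt_predistorter(g)
err_before = np.mean(np.abs(power_amplifier(g) - g) ** 2)
err_after = np.mean(np.abs(power_amplifier(w @ basis(g)) - g) ** 2)
print(f"MSE without DPD: {err_before:.2e}, with DPD: {err_after:.2e}")
```

The post-distorter is trained to map the PA output back to the PA input, and the same coefficients are then used in front of the amplifier, which is the usual reading of an indirect learning architecture for a memoryless nonlinearity.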

The plot in Fig. 4 shows the importance of modeling nonlinearities using an adaptive indirect learning architecture, here DPD with the LMS algorithm. From Fig. 5, it is observed that combining the nonlinear scaling method for PAPR reduction with adaptive digital pre-distortion reduces the PAPR to zero (Figs. 6, 7, and 8). The constellation of a perfect signal is uniform and perfectly symmetric about the origin. When the constellation is not "square", i.e., when the Q-axis height does not equal the I-axis width, it indicates I-Q imbalance. Quadrature error appears as a "tilt" of the constellation. The constellation performance of the novel PAPR reduction technique with DPD is shown in Fig. 9. Both the phase rotation and the amplitude diffusion are improved considerably by the novel PAPR reduction technique with DPD (Figs. 10 and 11).

Fig. 3 Novel PAPR reduction technique + ADPD


Fig. 4 PSD performance

Fig. 5 Complementary cumulative distribution function

Two significant distortion effects in PAs, AM-AM and AM-PM, cause spectral regrowth in the transmitted signal and bit errors in the received signal. For nonlinear systems, the AM-AM characteristic is widely accepted as a figure of merit. In pre-distortion, the nonlinear power amplifier is stimulated by baseband samples, and the AM-AM and AM-PM effects are estimated.


Fig. 6 BER and SNR

Fig. 7 Original signal constellation diagram

The estimated distortions of the power amplifier are cancelled by pre-distorting the input stimulus with the inverse coefficients, i.e., the pre-distorter acts as an inverse PA nonlinearity equalizer.


Fig. 8 Distorted signal constellation diagram

Fig. 9 Constellation diagram


Fig. 10 AM-AM performance with novel PAPR reduction technique + DPD

Fig. 11 AM-PM performance with novel PAPR reduction technique + DPD


5 Conclusion

In wireless communication systems, transmitting signals with high power efficiency and compensating nonlinear distortion are the two major challenges, for which PAPR reduction and DPD, respectively, are the most favorable solutions. The proposed novel PAPR reduction method, in conjunction with adaptive DPD, not only compensates the nonlinear distortions arising from the power amplifier but also increases power efficiency through improved BER performance. Simulation results show that the proposed method achieves improvements of 25 dB and 4 dB in linearity and power efficiency, respectively.

References

1. Gregorio, F.H.: Analysis and compensation of nonlinear power amplifier effects in multi-antenna OFDM systems. Thesis Dissertation (2007)
2. Jones, A.E., Wilkinson, T.A., Barton, S.: Block coding scheme for reduction of peak to mean envelope power ratio of multicarrier transmission schemes. Electron. Lett. 30(25), 2098–2099 (1994)
3. Ding, L., Zhou, G.T., Morgan, D.R., et al.: A robust digital baseband predistorter constructed using memory polynomials. IEEE Trans. Commun. 52(1), 159–164 (2004)
4. Proakis, J.G., Manolakis, D.G.: Digital Signal Processing, Principles, Algorithm, and Applications, 4th edn. ISBN: 978-7-121-04042-9 (2007)
5. Ai, B., Zhong, Z.D., Zhu, G., et al.: A novel scheme for power amplifier pre-distortion based on indirect learning architecture. Wirel. Personal Commun. 46(4), 523–530, 639 (2008)
6. Braithwaite, R.N.: Adaptive digital pre-distortion of nonlinear power amplifiers using reduced order memory correction. In: IEEE MTT-S International Microwave Symposium, Atlanta, GA, 15–20 June 2008, session WMA-8
7. Braithwaite, R.N.: Crest factor reduction (CFR) of wideband wireless multi-access signals. In: IEEE MTT-S International Microwave Symposium, Boston, MA, 7–12 June 2009
8. Braithwaite, R.N.: Reducing estimator biases due to equalization errors in adaptive digital pre-distortion systems for RF power amplifiers. In: IEEE MTT-S International Microwave Symposium Digest, pp. 1–3. Montreal, QC, Canada, 17–22 June 2012
9. Braithwaite, R.N.: Measurement and correction of residual nonlinearities in a digitally predistorted power amplifier. In: Proceedings 75th ARFTG Microwave Measurement Conference, pp. 14–17. Anaheim, CA, 28 May 2010
10. Braithwaite, R.N.: Digital predistortion of a power amplifier for signals comprising widely spaced carriers. In: Proceedings 78th ARFTG Microwave Measurement Conference, Tempe, AZ, pp. 1–4, 1–2 Dec 2011
11. Kiran, V., Sudha, K.L., Vinila, N.: Comparison and novel approach of peak to average power reduction technique in OFDM. In: 2nd International Conference on Networks Information and Communications (ICNIC-2015), at SVCE Bangalore-157 (2015)
12. Kiran, V.: ACPR reduction for better power efficiency using adaptive DPD. In: 6th IEEE International Conference on Communication and Signal Processing-ICCSP 16, organised by Adhiparasakthi Engineering College, Melmaruvathur, Tamilnadu, India, 603319 (2016)
13. Kiran, V., Jose, S.: Adjacent partitioning PTS with Turbo coding for PAPR reduction in OFDM. IEEE International Conference on Advanced Computing & Communication Systems (ICACCS 2017), Jan 2017

Semantic Segmentation Using Deep Learning for Brain Tumor MRI via Fully Convolution Neural Networks

Sanjay Kumar, Ashish Negi and J. N. Singh

S. Kumar, Uttarakhand Technical University Dehradun, Sudhowala, India, e-mail: [email protected]
A. Negi, G.B. Pant Engineering College, Pauri Garhwal, Uttarakhand, India, e-mail: [email protected]
J. N. Singh (✉), Galgotias University, Greater Noida, India, e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_2

Abstract Early brain tumor detection and diagnosis are critical in clinical practice. Segmentation of the tumor region of interest therefore needs to be accurate, efficient, and robust. Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks trained end-to-end, pixels-to-pixels, exceed the state of the art in semantic segmentation. The key idea is to build fully convolutional networks that take input of arbitrary size and produce correspondingly sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. We define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. The FCN achieves state-of-the-art segmentation, with a 36% relative improvement to 66.6% mean IU on 2015 NYUD and SIFT Flow, while inference takes less than one fifth of a second for a typical image. We designed a three-dimensional fully convolutional neural network for brain tumor segmentation. During training, we optimized our network against a loss function based on the Dice score


S. Kumar et al.

and we also used the Dice score to assess the quality of the predictions produced by the model. In order to accommodate the massive memory requirements of three-dimensional convolutions, we cropped the images we fed into our network, and we used a U-Net architecture that allowed us to achieve good results even with a relatively narrow and shallow neural network. Finally, we used post-processing in order to smooth out the segmentations produced by our model.





Keywords Image segmentation · Classification · MRI (magnetic resonance imaging) scan · Fully convolution neural networks



1 Introduction

A brain tumor is an uncontrolled growth of a solid mass formed by undesired cells found in different parts of the brain. Brain tumors can be divided into malignant and benign tumors. Malignant tumors comprise primary tumors and metastatic tumors. Gliomas are the most frequent brain tumors in adults; they originate in brain cells and infiltrate the surrounding tissue. Patients with low-grade gliomas can expect a life extension of several years, while patients with high-grade gliomas can expect at most 2 years. Meanwhile, the number of patients diagnosed with brain cancer is growing fast year by year, with an estimated 23,000 new cases in the United States alone in 2015.

Convolutional networks are driving advances in recognition [1]. Earlier approaches have used convnets for semantic segmentation, in which every pixel is labeled with the class of its enclosing object or region, but with shortcomings that this work addresses [2]. This research concerns end-to-end learning and inference for pixel-wise prediction. FCNs (fully convolutional networks) can efficiently learn to make dense predictions for per-pixel tasks such as semantic segmentation. We show that an FCN trained end-to-end on segmentation exceeds the state of the art without further machinery. To our knowledge, this is the first work to train FCNs end-to-end (1) for pixel-wise prediction and (2) from supervised pre-training. Fully convolutional versions of existing networks predict dense outputs from arbitrary-sized inputs. Both learning and inference are performed on the whole image at a time by dense feed-forward computation and backpropagation. In-network upsampling layers enable pixel-wise prediction and learning in nets with subsampled pooling. This method is efficient, both asymptotically and absolutely, and precludes the need for the complications of other approaches. The approach does not make use of pre- and post-processing complications, including superpixels,

Semantic Segmentation Using Deep Learning for Brain Tumor MRI …


Fig. 1 Single axial slice of an MR image of a high-grade glioma patient [4]

or post-hoc refinement by random fields or local classifiers. The model transfers recent success in classification to dense prediction by reinterpreting classification nets as fully convolutional and fine-tuning from their learned representations [3] (Fig. 1). Hand-crafted features such as Gabor filters, Histograms of Oriented Gradients (HoG), or wavelets perform poorly, especially when the boundaries between tumors and healthy tissue are fuzzy. As a result, designing task-adapted and robust features is necessary.

1.1 Brain Tumors

With a prevalence of less than 1‰ in the western population, brain tumors are not very common, yet they are among the most fatal cancers. Recent studies estimated the UK incidence rate for primary tumors of the brain or nervous system to be approximately 26 per 200,000 adults, with around one third of the tumors being malignant and the rest either benign or borderline malignant. The word tumor is of Latin origin and means swelling [5]. Today it is usually associated with a neoplasm, which is caused by uncontrolled cell proliferation. Brain tumors can be classified according to their origin or their degree of aggressiveness. Primary brain tumors arise in the brain, whereas metastatic brain tumors usually originate from other parts of the body [6].

2 Related Work

In machine learning, a fully convolutional neural network belongs to a class of deep, feed-forward artificial neural networks that have been applied successfully to analyzing visual imagery. FCNs (fully convolutional networks) use a variation of multilayer perceptrons designed to require minimal preprocessing [7]. Expanded convnet outputs have been used to produce two-dimensional maps of detection scores for the four corners of postal address blocks [4]. Related examples include sliding-window detection, semantic segmentation by Pinheiro et al., and image restoration by Eigen et al. [4].


Fully convolutional training is rare, but it is used effectively by Tompson et al. to learn an end-to-end part detector and spatial model for pose estimation, although they do not exposit on or analyze this method.

2.1 Common Approaches in FCNN (Fully Convolutional Neural Networks)

• Small models restricting capacity and receptive fields, and patch-wise training.
• Post-processing by superpixel projection, random field regularization, filtering, or local classification.
• Input shifting and output interlacing for dense output.
• Multi-scale pyramid processing.
• Saturating tanh nonlinearities.
• Ensembles, whereas the present method does without this machinery.

We do, however, study patch-wise training and shift-and-stitch dense output from the perspective of FCNs. We also discuss in-network upsampling, of which the fully connected prediction by Eigen et al. is a special case. Other works likewise adapt deep classification nets to segmentation, but do so in hybrid proposal-classifier models [8]. These approaches fine-tune a CNN (convolutional neural network) system by sampling bounding boxes and/or region proposals for detection, semantic segmentation, and instance segmentation. Neither method is learned end-to-end. They achieve state-of-the-art segmentation results on their respective benchmarks, so we directly compare our standalone FCN to their semantic segmentation results. We fuse features across layers to define a linear local-to-global representation that we tune end-to-end. In contemporary work, Hariharan et al. also use multiple layers in their hybrid model for semantic segmentation [8].
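The idea of in-network upsampling combined with fusing features across layers can be sketched as below. This is a simplified illustration, not the architecture used in the paper: nearest-neighbour upsampling stands in for a learned upsampling layer, the score maps are toy arrays, and the function names are hypothetical.

```python
import numpy as np

def upsample2x(scores):
    """Nearest-neighbour 2x upsampling of a (classes, H, W) score map."""
    return scores.repeat(2, axis=1).repeat(2, axis=2)

def fuse_skip(coarse_scores, fine_scores):
    """Add upsampled deep/coarse scores to shallow/fine scores at twice the resolution."""
    return upsample2x(coarse_scores) + fine_scores

coarse = np.arange(8, dtype=float).reshape(2, 2, 2)   # 2 classes on a 2x2 grid
fine = np.zeros((2, 4, 4))                            # 2 classes on a 4x4 grid
fused = fuse_skip(coarse, fine)
labels = fused.argmax(axis=0)                         # per-pixel class decision
print(fused.shape, labels.shape)
```

Because every operation acts on whole score maps, the same code works for any input height and width, which is the property that lets a fully convolutional model predict a label for every pixel.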

2.2 Deep Learning in Brain Tumor MRI Medical Imaging

The first prominent study to apply deep neural networks to medical image processing was [3], which used an FCN architecture to perform pixel-wise classification of electron microscopy neuron images into membrane and non-membrane pixels. Owing to the early success of [7] and others, interest in applying FCN architectures to brain tumor MRI images has burgeoned in recent years [9]. Brain tumor MRI image analysis and segmentation problems present several unique challenges. First, patient data in medical


problems tends to be highly heterogeneous [9], in that the same pathology can present in very different ways across patients. Further complicating the challenge of medical image segmentation is the relatively small size of the available datasets, and the fact that the available data are often incomplete or inconsistent.

2.3 Image Segmentation in FCN (Fully Convolutional Neural Networks)

There are two major approaches to semantic segmentation: pixel-wise segmentation, where a small patch of a medical image is used to classify the center pixel, and fully convolutional architectures, as first proposed by [5], where the network input is the full medical image and the output is a semantic segmentation volume. The authors of [1] explored the latter using VGG-inspired [6] architectures and showed that fully convolutional networks have accuracy comparable to pixel-wise approaches at a considerably lower computational cost. Numerous FCN-based methods have been proposed for brain tumor segmentation from multimodal MRI, including those based on segmenting individual MRI slices [5], volumetric segmentation [2], and FCNNs combined with other statistical methods [2]. Almost all current architectures for brain tumor segmentation use a pixel-wise U-net approach as in [6, 10], which has been promising but still shows limited success. In addition, while [6] has applied fully convolutional networks to other medical problems, no study thus far has used a fully convolutional approach for the specific problem of brain tumor segmentation.

3 Experiments

3.1 Dataset Overview

The BraTS Dec 2017 dataset contains T1, T1 contrast enhanced, T2, and FLAIR (fluid-attenuated inversion recovery) images for a total of 500 patients (156 glioblastoma and 112 lower grade) that have been manually segmented into tumor core, enhancing tumor, and background regions [5]. Sample images are shown in Fig. 2. The BraTS dataset contains segmentations for necrotic core tumor, enhancing core tumor, non-enhancing core tumor, and edema regions. The full BraTS challenge is to obtain the best possible segmentation score for all four regions, but here we focus exclusively on segmenting the tumor region from the background.
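The data preparation implied by this setup can be sketched as follows: the four MRI sequences of a slice are stacked into one multi-channel input, and all tumor sub-region labels are collapsed into a single tumor-versus-background mask. The label convention (0 for background, non-zero integers for tumor sub-regions), the array shapes and the function names are assumptions made for illustration and are not taken from the paper or the dataset documentation.

```python
import numpy as np

def stack_modalities(t1, t1ce, t2, flair):
    """Stack the four MRI sequences of one slice into a (4, H, W) input array."""
    return np.stack([t1, t1ce, t2, flair], axis=0).astype(np.float32)

def tumor_vs_background(label_map, background_label=0):
    """Collapse all tumor sub-region labels into one binary tumor mask.

    Assumes (hypothetically) that background is labelled 0 and every tumor
    sub-region has a non-zero integer label.
    """
    return (np.asarray(label_map) != background_label).astype(np.uint8)

slices = [np.random.default_rng(i).random((2, 2)) for i in range(4)]
inputs = stack_modalities(*slices)
labels = np.array([[0, 1], [2, 4]])
mask = tumor_vs_background(labels)
print(inputs.shape, mask.tolist())
```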


Fig. 2 BraTS dataset images [5]: (a) T1, (b) T1 contrast enhanced, (c) T2, (d) FLAIR

4 Results

Because intensity is fairly uniform across the cortex, it is essential for the model to learn textural features with which to represent and segment the image. Given the training set and the relatively large number of parameters in the model, the model appears to be overcome by a bias towards background pixels before it is able to learn useful textural features (Figs. 3 and 4). Although the FCN model does not perform well overall, the Dice score loss performs considerably better than the cross-entropy loss when trained on the same model and dataset. This is in line with the results in [9], which show a significant improvement in accuracy for brain tumor segmentation when the network was trained with the Dice score instead of other, more conventional loss functions. These results therefore corroborate the findings of [9] and support the use of a Dice score loss for medical image segmentation (Table 1, Fig. 5).
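The contrast between the two losses can be illustrated with a small numerical sketch. The definitions below are the common textbook forms of the Dice score, a soft Dice loss, and binary cross-entropy; the paper does not give its exact formulas, so these, the smoothing constant `eps`, and the toy 10 x 10 example are assumptions.

```python
import numpy as np

def dice_score(pred_mask, true_mask, eps=1e-7):
    """Dice overlap between two binary masks (1 means identical)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

def soft_dice_loss(probabilities, true_mask, eps=1e-7):
    """One minus the Dice score computed on predicted probabilities."""
    p = np.asarray(probabilities, dtype=float)
    t = np.asarray(true_mask, dtype=float)
    return 1.0 - (2.0 * (p * t).sum() + eps) / (p.sum() + t.sum() + eps)

def binary_cross_entropy(probabilities, true_mask, eps=1e-7):
    """Mean per-pixel binary cross-entropy."""
    p = np.clip(np.asarray(probabilities, dtype=float), eps, 1 - eps)
    t = np.asarray(true_mask, dtype=float)
    return float(-(t * np.log(p) + (1 - t) * np.log(1 - p)).mean())

truth = np.zeros((10, 10))
truth[4:6, 4:6] = 1                          # 4 tumor pixels out of 100
all_background = np.full((10, 10), 0.01)     # model that predicts background everywhere
print("cross-entropy:", binary_cross_entropy(all_background, truth))
print("soft Dice loss:", soft_dice_loss(all_background, truth))
```

In this toy case a prediction that labels everything as background receives a small cross-entropy but a soft Dice loss close to 1, which is consistent with the bias towards background pixels described above.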

Fig. 3 FCN segmentation with cross-entropy loss (Dice = 0.26) and with Dice loss (Dice = 0.33). The FCN model did not perform very well. The model trained with the Dice score loss performed better, but the score is poor compared to the patch-wise architectures. This is most likely due to the FCN model biasing heavily towards the background class [8]


Fig. 4 a Original MRI image. b Ground truth tumor segmentation. c Segmentation via BCN model. d Segmentation via FCN model. Correctly identified voxels, unidentified voxels (red), misidentified voxels

Table 1 Ablation study and comparison with previous BraTS (Brain Tumor Segmentation) challenge studies

Study        Method                  Mean Dice score (%)
Urban [11]   Deep CNNs               89
Reza [11]    Random forest           94
Goetz [11]   Randomized trees        86

Approach     Description             Average Dice score (%)
BCN          Loss value              81.5
BCN          +Dropout                83.5
BCN          +Batch normalization    86.6
BCN          Batch normalization     83.4
FCN          –                       87.8
FCN          +Dropout                880.2
FCN          +Batch normalization    86.4

Fig. 5 Loss and evaluation set Dice score curves for the ablation study: (a) FCN loss curves (loss value versus epoch number), (b) FCN Dice score curves

Transfer learning is a promising approach for augmenting and initializing CNNs for brain MRI segmentation when using small datasets, but a great deal of work remains to be done. In particular, there are many challenges to overcome in pre-training and training for images of inconsistent resolution.


5 Conclusion

In this paper, we developed two novel architectures for brain tumor segmentation and evaluated their accuracy on the BraTS challenge 2017 dataset, and we also explored the use of transfer learning from the BraTS architectures to the Rembrandt dataset. Many of the results presented here, specifically the use of a patch-wise approach to glioma segmentation, are very promising. Future work should focus on developing a more complex FCN (fully convolutional neural network) architecture and on applying the Dice loss to both the BraTS dataset and transfer learning. Overall, we are optimistic that, with further work, it will be possible to use CNNs to segment brain tumors efficiently and accurately, and thereby bring many applications, such as surgical planning and modeling, closer to reach.

Acknowledgements We observe that the BCN (Byse Code Normal) slightly outperforms the FCN. This is most likely because the BCN relies less on the precise image features than the FCN does. In the FCN, the convolution layers compress the image into high-level features, and the deconvolutional layers then reconstruct the segmented image from these features [12]. While this is potentially a very powerful architecture, the implicit assumption is that the high-level features of similar images will also be similar [9]. However, the difference in quality and resolution between the Rembrandt images and the BraTS images means that this assumption may not hold very strongly, and we consequently see a drop in segmentation quality between the BraTS and Rembrandt datasets [13]. Finally, the results show that the segmentation quality is very inconsistent across the validation set [8]. We observe both very high and very low segmentation quality. The FCN model in particular has samples in every bin across the histogram. This shows that, while transfer learning may be promising for application to segmentation, it is not necessarily reliable and is still highly dependent on image quality and resolution [8].

References

1. Shen, L., Anderson, T.: Multimodal Brain MRI Tumor Segmentation via Convolution Neural Networks
2. Havaei, M., Davy, A., Warde-Farley, D., Biard, A., Courville, A., Bengio, Y., Pal, C., Jodoin, P.-M., Larochelle, H.: Brain tumor segmentation with deep neural networks. Med. Image Anal. 35, 18–31 (2017)
3. Sharma, K., Kaur, A., Gujral, S.: Brain tumor detection based on machine learning algorithms. Int. J. Comput. Appl. 103(1) (2014)
4. Xiao, Z., Huang, R., Ding, Y., Lan, T., Dong, R.F., Qin, Z., Zhang, X., Wang, W.: A deep learning-based segmentation method for brain tumor in MR images. In: 2016 IEEE 6th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), pp. 1–6. IEEE (2016)
5. Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., Garcia-Rodriguez, J.: A Review on Deep Learning Techniques Applied to Semantic Segmentation. arXiv:1704.06857 (2017)
6. Ciresan, D., Giusti, A., Gambardella, L., Schmidhuber, J.: Deep neural networks segment neuronal membranes in electron microscopy images. NIPS, pp. 1–9 (2012)
7. Usman, K., Rajpoot, K.: Brain tumor classification from multi-modality MRI using wavelets and machine learning. Pattern Anal. Appl. 1–11 (2017)
8. Kayalibay, B., Jensen, G., van der Smagt, P.: CNN-based Segmentation of Medical Imaging Data (2017)
9. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
10. Bauer, S., Nolte, L.-P., Reyes, M.: Fully automatic segmentation of brain tumor images using support vector machine classification in combination with hierarchical conditional random field regularization. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 354–361. Springer, Berlin, Heidelberg (2011)
11. Menze, B.H., et al.: The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imag. 34(10), 1993–2024 (2015)
12. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR (2015) (to appear)
13. Bahadure, N.B., Ray, A.K., Thethi, H.P.: Image analysis for MRI based brain tumor detection and feature extraction using biologically inspired BWT and SVM. Int. J. Biomed. Imag. (2017)
14. Bauer, S., Wiest, R., Nolte, L.-P., Reyes, M.: A survey of MRI-based medical image analysis for brain tumor studies. Phys. Med. Biol. 58(13), R97 (2013)
15. Dvorak, P., Menze, B.H.: Local structure prediction with convolutional neural networks for multimodal brain tumor segmentation. In: MCV@MICCAI, pp. 59–71 (2015)
16. Noh, H., Hong, S., Han, B.: Learning deconvolution network for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision, 11–18 Dec 2015, pp. 1520–1528 (2016)
17. Reese, T.G., Heid, O., Weisskoff, R.M., Wedeen, V.J.: Reduction of eddy-current-induced distortion in diffusion MRI using a twice-refocused spin echo. Magn. Reson. Med. 49(1), 177–182 (2003)
18. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
19. Menze, B.H., Jakab, A., Bauer, S., Kalpathy-Cramer, J., Farahani, K., Kirby, K., Burren, Y., et al.: The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imag. 34(10), 1993–2024 (2015)
20. Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional Networks for Biomedical Image Segmentation, pp. 1–8 (2015)
21. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556 (2014)
22. Rathi, V.P., Palani, S.: Brain tumor MRI image classification with feature selection and extraction using linear discriminant analysis. arXiv:1208.2128 (2012)
23. Kumar, S., Srivastava, S.: Image encryption using s-des based on Arnold cat map using logistic map. Int. J. Bus. Eng. Res. 8 (2014)
24. Scarpace, L., Flanders, A.E., Jain, R., Mikkelsen, T., Andrews, D.W.: Data from REMBRANDT (2015)
25. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. (NIPS) (2015)

An Efficient Cryptographic Mechanism to Defend Collaborative Attack Against DSR Protocol in Mobile Ad hoc Networks

E. Suresh Babu, S. Naganjaneyulu, P. V. Srivasa Rao and G. K. V. Narasimha Reddy

Abstract This paper presents a novel mechanism to detect and defend against collaborative attacks on the popular DSR protocol using the Elliptic Curve Digital Signature Algorithm (ECDSA). The proposed security mechanism is suited to sophisticated wireless ad hoc networks: it offers efficient computation and transmission and is very effective against collaborative attacks. Several secure routing protocols have already been proposed to defend against attacks. However, most of these security mechanisms detect or defend against only single or uncoordinated attacks.

Keywords DSR · MANET · Collaborative attack · ECDSA

E. Suresh Babu (✉), Department of Computer Science and Engineering, National Institute of Technology Warangal (NITW), Warangal, Telangana, India, e-mail: [email protected]
S. Naganjaneyulu, Department of Computer Science and Engineering, LBRCE, Mylavaram, Andhra Pradesh, India, e-mail: [email protected]
P. V. Srivasa Rao, Department of Computer Science and Engineering, VITB, Bhimavaram, Andhra Pradesh, India, e-mail: [email protected]
G. K. V. Narasimha Reddy, Department of Computer Science and Engineering, SRIT, Anantapur, Andhra Pradesh, India, e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_3

1 Introduction

Due to the proliferation of wireless devices, there is a need for next-generation wireless communication systems that allow rapid deployment of autonomous mobile nodes. These mobile nodes should allow the user to access information without any geographic boundaries, even when no infrastructure exists. Fortunately,


E. Suresh Babu et al.

the solution for these types of service can be realized with mobile ad hoc networks (MANETs). The MANET network dynamically forms with various mobile nodes for effective interaction among themselves. However, the limited characteristics of such decentralized networks pose several challenging issues such as routing, security, dynamic network topology that varies very frequently, no fixed infrastructure, etc. In the recent past, routing issue is one of the active research areas and numerous routing protocols [1] have been designed for these networks. One of the routing protocols is Dynamic Source routing (DSR) and Ad hoc on-demand distance vector (AODV) routing protocols, which fall under on-demand routing protocols. Nevertheless, each protocol had its own advantages and disadvantages, but these protocols permit all the mobile nodes to access the information that is passed through the network, which is more very vulnerable to attacks that can damage the performance of the network. Hence, one of the challenging issues of these networks is to design the robust security solution that can protect the network from various attacks. Already, several secure routing protocols were proposed to defend the attacks. But, most of the security mechanisms were used to detect or defend the single or uncoordinated attacks. This paper presents a novel mechanism to defend and detect the collaborative attack against DSR using Elliptic Curve Digital Signature Algorithm (ECDSA). Specifically, we had chosen the DSR protocol that performs poorly against collaborative attacks. Initially, we modelled the collaborative attack, which is one of the overwhelming attacks compared to single or uncoordinated attack. This attack works in coordination with one or more distinct malicious nodes by synchronizing the activities to decrease the performance of a network. This proposed security mechanism provides less computation overhead, transmission and very powerful against collaborative attack. 
Finally, we conducted a series of simulations using the NS-2 simulator to show how the solution behaves in a MANET environment. The rest of the paper is organized as follows. Section 2 presents related work and gives a brief overview of the DSR routing protocol. Section 3 models the collaborative attack against the DSR protocol. Section 4 presents the secure DSR protocol using the ECDSA mechanism, and Sect. 5 discusses the simulation work and results.

2 Related Work

This section reviews prior work on routing, routing attacks and countermeasures against these attacks. Royer and Toh [1] examined routing protocols and evaluated them on parameters such as time complexity, critical nodes, routing metric and multicast capability. Many possible applications and challenges of mobile ad hoc networks have also been identified. Bhalaji and Shanmugam [2] categorized the nodes in the network into three types based on their behaviour and studied the association between them for route selection. Even though the information sent over the

An Efficient Cryptographic Mechanism to Defend Collaborative …


network is encrypted using various cryptographic mechanisms, its security is not guaranteed. To address this, Sivakumar and Ramkumar [3] proposed an efficient secure route discovery protocol for DSR, with which inconsistencies in the RREQ can be identified during the route discovery process. In [4, 5], Suresh Babu et al. designed a scheme based on hybrid DNA-based cryptography (HDC) for the AODV routing protocol to secure communication among the nodes of the network against heterogeneous attacks. Chen-hong song et al. proposed an intrusion detection scheme to counteract DoS and black hole attacks against AODV. Konstantinou et al. [6] studied Hilbert and Weber polynomials for generating secure elliptic curves. Because curves cannot be constructed directly from Weber polynomials, the roots obtained from Weber polynomials are converted into the corresponding Hilbert polynomial roots, and an elliptic curve is generated from these roots.

3 Modelling Collaborative Attacks Against DSR Protocol

This section presents the collaborative attack against the DSR routing protocol. Dynamic Source Routing (DSR) [4, 7] is a popular on-demand routing protocol that maintains a route cache for storing routes between source and destination. The protocol operates in two phases: route discovery and route maintenance. Whenever a source node needs to send data to a destination, it first checks its route cache for a valid route. If an unexpired route is present, the source uses it to send the data. Otherwise, it invokes the route discovery process to find routes between source and destination.

The collaborative attack [8] is more damaging than a single or uncoordinated attack. In this attack [8], a malicious node works in coordination with one or more other malicious nodes, synchronizing their activities to degrade the performance of the network. In this section, we model a collaborative attack that combines the black hole attack and the rushing attack. The black hole attack is a severe attack in which a malicious node drops the data packets it receives from intermediate nodes. To execute it, the malicious (black hole) node advertises itself as having a valid shortest route to the destination node, although the route is bogus. Its intention is to intercept and consume the packets, using its place on the route as the first step of a man-in-the-middle attack that results in a denial of service. The rushing attack is among the fastest attacks to establish itself on a route. To mount it, the compromised malicious node forwards the route request (RREQ) towards the destination node much faster than the RREQs arriving from other neighbour nodes. The rushing node achieves this by bypassing the forwarding delay that the RREQ policies of the routing protocol normally impose.

In this paper, we combine the above two attacks so that the malicious nodes coordinate and synchronize their activities to degrade the performance of the network. Fig. 1 depicts the collaborative attack against the DSR protocol.


Fig. 1 Collaborative attack model

For instance, consider the source node (S), the destination node (D) and two intermediate nodes: R, which mounts the rushing attack [5], and B, which mounts the black hole attack. Whenever the source node wants to send data to the destination, it first checks its route cache for a valid route. If an unexpired route is present, it sends the data. Otherwise, it invokes the route discovery process. The route request (RREQ) packet is broadcast to the neighbour nodes, including R. When the intermediate node R receives the RREQ packet from the source or a neighbour node, it adds its own address and rushes the packet to its neighbour node B. When the black hole node B receives the RREQ packet, it claims to have the shortest path to the destination, which is false information given to the source node. Moreover, the black hole node also forwards the RREQ packet to the destination. The destination node replies with an RREP packet back to the source node along the route through the neighbour nodes B and R.
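The scenario above can be illustrated with a small Python sketch. It is only a toy model under stated assumptions, not the NS-2 setup used in the paper: the topology, the node A, the delay values and the function names are hypothetical, each node forwards only the first RREQ copy it hears, and the malicious nodes R and B are modelled simply as forwarding with zero delay. B's bogus shortest-route claim is not modelled separately.

```python
import heapq

LINK_LATENCY = 1  # hypothetical per-hop transmission time

def discover_route(links, delay, source, dest):
    """Flood an RREQ and return the route of the first copy to reach dest.

    links: node -> list of neighbours; delay: node -> forwarding delay.
    Each node processes only the first RREQ copy it receives.
    """
    queue = [(0, [source])]            # (arrival time, accumulated source route)
    seen = set()
    while queue:
        time, route = heapq.heappop(queue)
        node = route[-1]
        if node in seen:
            continue                   # a later duplicate of the RREQ is discarded
        seen.add(node)
        if node == dest:
            return route
        for nxt in links[node]:
            if nxt not in seen:
                arrival = time + delay[node] + LINK_LATENCY
                heapq.heappush(queue, (arrival, route + [nxt]))
    return None

def delivered(route, black_holes, packets):
    """Data packets reach dest only if no intermediate node is a black hole."""
    return 0 if any(n in black_holes for n in route[1:-1]) else packets

# Hypothetical topology: honest path S-A-D, malicious path S-R-B-D.
links = {"S": ["A", "R"], "A": ["S", "D"], "R": ["S", "B"],
         "B": ["R", "D"], "D": ["A", "B"]}
honest = {"S": 0, "A": 5, "R": 5, "B": 5, "D": 0}
attacked = dict(honest, R=0, B=0)      # R and B skip the forwarding delay

normal_route = discover_route(links, honest, "S", "D")
attack_route = discover_route(links, attacked, "S", "D")
print(normal_route, delivered(normal_route, {"B"}, 100))
print(attack_route, delivered(attack_route, {"B"}, 100))
```

When R and B behave like ordinary nodes, the shorter honest route through A wins the discovery. Once they rush the RREQ, the longer route through R and B arrives first and every data packet sent along it is dropped at B.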

4 Secure Routing Algorithms Using ECDSA Mechanism

This section presents the proposed solution to defend against and detect collaborative attacks against the DSR protocol using the Elliptic Curve Digital Signature Algorithm (ECDSA). In particular, the proposed security mechanism integrates three components into the on-demand DSR routing protocol, which performs poorly against collaborative attacks: authentication, secure routing with ECDSA to avoid the collaborative attack, and an intrusion detection system to detect it.

4.1 Elliptic Curve Digital Signature Algorithm (ECDSA)

This subsection presents the requirements of secure routing algorithms and of the key management service for achieving encryption, authentication and digital signatures. To provide the data confidentiality, integrity and authentication security services, we use the Elliptic Curve Digital Signature Algorithm (ECDSA), a variant of the Digital Signature Algorithm (DSA) applied to elliptic curve cryptography (ECC). ECDSA produces a digital signature for the information transmitted between source and destination. It is implemented in three phases: Key Pair Generation (KPG), Signature Generation (SG) and Signature Verification (SV). The first two phases are carried out by the source node, and the final phase is carried out by the destination node.
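The three phases can be sketched in a few lines of Python. The paper does not state which curve or hash function it uses, so this sketch assumes a textbook-sized toy curve, y^2 = x^3 + 2x + 2 over F_17 with base point G = (5, 1) of order 19, and SHA-256 reduced modulo the group order. These parameters are far too small to be secure and serve only to show the arithmetic; all function names are illustrative.

```python
import hashlib
import secrets

# Toy curve y^2 = x^3 + 2x + 2 over F_17; G has prime order 19 (assumed parameters).
P_MOD, A = 17, 2
G, N = (5, 1), 19

def add(p, q):
    """Add two curve points; None stands for the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                   # p + (-p)
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, p):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, p)
        p = add(p, p)
        k >>= 1
    return result

def digest(msg):
    """Hash the message and reduce it modulo the group order."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N

def keygen():
    """Key Pair Generation: private d in [1, N-1], public Q = d*G."""
    d = secrets.randbelow(N - 1) + 1
    return d, mul(d, G)

def sign(d, msg):
    """Signature Generation: retry until both r and s are non-zero."""
    while True:
        k = secrets.randbelow(N - 1) + 1              # fresh per-signature nonce
        r = mul(k, G)[0] % N
        s = pow(k, -1, N) * (digest(msg) + r * d) % N
        if r and s:
            return r, s

def verify(q, msg, sig):
    """Signature Verification: recompute the nonce point and compare with r."""
    r, s = sig
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    pt = add(mul(digest(msg) * w % N, G), mul(r * w % N, q))
    return pt is not None and pt[0] % N == r

private_key, public_key = keygen()                    # done by the source node
signature = sign(private_key, b"RREQ from S to D")    # done by the source node
print(verify(public_key, b"RREQ from S to D", signature))  # destination: True
```

The three-argument `pow` with a negative exponent requires Python 3.8 or later. With a group of only 19 elements a forged or altered message can occasionally verify by chance, which is one reason real deployments use standardized curves with orders of about 256 bits.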

4.2 Secure Route Discovery Process of DSR Protocol Using ECDSA

This subsection discusses the justification and choice of secure routing algorithms that establish a reliable and secure route between each pair of nodes. In particular, we discuss the secure route discovery process of the DSR protocol, which integrates a security association between the source and destination nodes. Suppose the source node wants to send data to the destination node. It first initiates the secure route discovery process by broadcasting the Security Associated Route Request (SARREQ) packet to its neighbour nodes for authentication. When an intermediate node receives the SARREQ packet from an authenticated neighbour node (procedure explained below in Sect. 4.2.3), it adds its own address and forwards the packet to its authenticated neighbour nodes. This process continues until the SARREQ packet reaches the destination node. Once the destination node receives the SARREQ packet, it responds with the Security Associated Route Reply (SARREP) packet back to the source node. The secure route discovery is carried out under the following assumptions: first, all the nodes of the network register with a certified trusted authority; next, the registered nodes are authenticated through the ECC mutual authentication algorithm; and finally, route discovery is initiated after a signature is generated using ECDSA. The following subsections discuss the secure route discovery procedure.

4.2.1 Registering with Trusted Authority

Every node in the network must register. Registration starts by choosing a random value for the private key, say $PR \in Z_p^*$, and calculating the public key by multiplying the private key with the generator point $G$. For the trusted authority (TA), this gives the public key $PU_{TA}$ shown in Eq. (1):

$$PU_{TA} = PR_{TA} \cdot G \tag{1}$$

Suppose a source node (A) wants to register in the network. It sends its identity, say $ID_A$, to the trusted authority (TA). The trusted authority multiplies the identity of the source node by its own private key $PR_{TA}$ to generate $ID'_A$, which is transmitted back to the source node, as shown in Eq. (2):

$$ID'_A = ID_A \cdot PR_{TA} \tag{2}$$

Next, the source node (A) checks the identity $ID'_A$ by multiplying it with the generator point $G$. It also multiplies its own identity $ID_A$ with the public key of the trusted authority. If both points are equal, it replies with a positive acknowledgement, as shown in Eq. (3):

$$ID'_A \cdot G = ID_A \cdot PU_{TA} \tag{3}$$

Similarly, all the intermediate nodes and the destination node must register in the network by following the same procedure.
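The reason the check in Eq. (3) succeeds for an honest trusted authority can be written out in one line. This is a short derivation from Eqs. (1) and (2), assuming that the identities are treated as integer scalars, so that scalar multiplication of the generator point is associative:

```latex
\begin{aligned}
ID'_A \cdot G
  &= (ID_A \cdot PR_{TA}) \cdot G   && \text{by Eq. (2)} \\
  &= ID_A \cdot (PR_{TA} \cdot G)   && \text{associativity of scalar multiplication} \\
  &= ID_A \cdot PU_{TA}             && \text{by Eq. (1)}
\end{aligned}
```

A value $ID'_A$ produced without knowledge of $PR_{TA}$ would not satisfy this equality except by chance, which is what lets the node confirm that the reply came from the trusted authority.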

4.2.2 Node Authentication Using EC Cryptography

Next, all the registered nodes must be authenticated to secure the network. For this, every registered node must generate its public and private keys using the ECC algorithm, as discussed in Sect. 4.2.1. For instance, consider authenticating a source node A and a destination node B that are one hop apart. First, the source node calculates the sum of the identity of the trusted authority ($X$) and the identity of the source ($Y$), which gives the secure identity information $B'$. Next, the source node sends the security associated information $A_{sum}(B', N, T)$ to the destination node B. Here, $B'$ is the secure identity information, $N \in Z_p^*$ is a random number and $T$ is the time stamp, where

$$X = ID'_A \cdot PR_A \cdot PU_B, \qquad Y = ID_A \cdot N \cdot PU_{TA}, \qquad B' = X + Y \tag{4}$$

When the destination node receives the security associated information $A_{sum}(B', N, T)$, it first verifies the time stamp. If the time stamp is valid, it checks the validity of the secure identity information $B'$ by calculating $D'$. The destination node also calculates the point $F$ by multiplying the identity $ID'_A$ with the random number $N$ and the generator point $G$. If the points $D'$ and $F$ are equal, the destination authenticates node A; otherwise, it rejects it. The complete procedure is shown in Eq. (5):

$$B = ID'_A \cdot PR_B \cdot PU_A, \qquad D' = B' - B, \qquad F = ID'_A \cdot N \cdot G \tag{5}$$

Similarly, all the registered nodes in the network must follow the same mutual authentication procedure to achieve an authenticated network.
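The equality of $D'$ and $F$ for a legitimate node follows from Eqs. (1), (2), (4) and (5). The derivation below is a sketch that assumes every node's public key is its private key times the generator point, that is $PU_A = PR_A \cdot G$ and $PU_B = PR_B \cdot G$, as described in Sect. 4.2.1:

```latex
\begin{aligned}
X  &= ID'_A \cdot PR_A \cdot PU_B = ID'_A \cdot PR_A \cdot PR_B \cdot G \\
B  &= ID'_A \cdot PR_B \cdot PU_A = ID'_A \cdot PR_B \cdot PR_A \cdot G = X \\
D' &= B' - B = (X + Y) - X = Y \\
Y  &= ID_A \cdot N \cdot PU_{TA} = (ID_A \cdot PR_{TA}) \cdot N \cdot G = ID'_A \cdot N \cdot G = F
\end{aligned}
```

The term $X$ is a Diffie-Hellman style shared value that only A and B can compute, so subtracting $B$ removes it at the destination, and the remainder $Y$ matches $F$ only if $ID'_A$ was issued by the trusted authority for $ID_A$.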

4.2.3 Secure Route Discovery in DSR Using ECDSA

With the key management service in place, the next aspect is a secure routing algorithm for the DSR protocol using the ECDSA mechanism. This mechanism integrates a security association between the source and destination nodes. To achieve the secure route discovery process of the DSR protocol, the following assumptions are made.

• Initially, all the mobile nodes in the network register themselves, as discussed in Sect. 4.2.1.
• All the communicating nodes generate their public and private keys through EC cryptography before the route discovery process, as discussed in Sect. 4.2.2.

$$RREQ = S \,\|\, D \,\|\, h_{i \text{ to } k} \,\|\, E(PR_s\{H(RREQ)\}) \,\|\, \{N_{i \text{ to } k}\}, \quad \text{where } i = 1 \text{ to } k \tag{6}$$

• The node that initiates the route discovery process signs the RREQ packet using the ECDSA mechanism and broadcasts the packet into the network, as presented in Eq. (6).
• The intermediate nodes relay the signed RREQ packet to the destination.
• On receiving the RREQ packet, the destination node verifies its authenticity and integrity through the ECDSA signature verification algorithm and replies with an RREP packet back to the source, as shown in Eq. (7).

$$RREP = S \,\|\, D \,\|\, h_{k \text{ to } i} \,\|\, D(PU_s\{H(RREP)\}) \,\|\, \{N_{k \text{ to } i}\} \tag{7}$$

In Eqs. (6) and (7), S is the source, D is the destination, h is the hop count, $PR_s$ is the private key of the source, $PU_s$ is the public key of the source, H is the hash function that needs to be computed, N denotes the intermediate nodes, k denotes the destination node and the symbol ∥ denotes the concatenation of messages.


5 Simulation Results and Performance Analysis

This section presents the performance evaluation and simulation results [9] used to analyse the practicability of our theoretical work. We integrated the collaborative attack, the security mechanism and the intrusion detection mechanism into the DSR protocol. The proposed method was implemented in the network simulator NS-2 running on Ubuntu 13.04. The simulation results were used to compare the performance of the proposed routing protocol on metrics such as throughput, end-to-end delay, overhead and packet delivery ratio in the presence of the collaborative attack and the security mechanism.

Figure 2 shows the throughput of the DSR protocol in the presence of the collaborative attack for 10, 20, 30, 40 and 50 nodes with varying pause time. Specifically, the figure shows how the throughput is affected as the average percentage of collaborative attackers increases. When 1-collaborative node (one black hole node, two wormhole nodes and one rushing node) is present, the throughput decreases by 20% compared to the non-attack DSR protocol. It is also observed that the network performance degrades when more collaborative nodes are present. Figure 2 also depicts the packet delivery ratio of the DSR protocol in the presence of the collaborative attack. When 1-collaborative node (one black hole node, two wormhole nodes and one rushing node) is present, the packet delivery ratio decreases by 5% compared to the non-attack DSR protocol, and it again degrades when more collaborative nodes are present. Figure 3 shows the average overhead of the DSR routing protocol for 10, 20, 30, 40 and 50 nodes with varying pause time.
It is observed that the non-attack DSR protocol has minimal overhead, as the nodes carry only the original RREQ and RREP control information between source and destination. Figure 4 shows the average end-to-end delay of the DSR routing protocol for 10, 20, 30, 40 and 50 nodes with varying pause time. It is observed that the average end-to-end delay of the DSR protocol under the collaborative attack is lower than that of the normal DSR protocol.

Fig. 2 Throughput, PDF versus average % of attack


Fig. 3 Routing overhead

Fig. 4 End-to-end delay

6 Conclusion

This paper presented a novel mechanism to defend against and detect the collaborative attack against the popular DSR protocol using the Elliptic Curve Digital Signature Algorithm (ECDSA). In particular, the proposed security mechanism integrates authentication, secure routing with ECDSA to avoid the collaborative attack, and an intrusion detection system to detect it into the on-demand DSR routing protocol, which performs poorly against collaborative attacks. Moreover, the proposed security mechanism is suitable for sophisticated wireless ad hoc networks, as it is efficient in computation and transmission and effective against the collaborative attack.

References

1. Royer, E.M., Toh, C.K.: A review of current routing protocols for ad hoc mobile wireless networks. Pers. Commun. IEEE (1999). https://doi.org/10.1109/98.760423
2. Bhalaji, N., Shanmugam, A.: Association between nodes to combat blackhole attack in DSR based Manet, pp. 0–4 (2009)
3. Sivakumar, K.A., Ramkumar, M.: An efficient secure route discovery protocol for DSR. Commun. Soc. 458–463 (2007)
4. Suresh Babu, E., Nagaraju, C., Krishna Prasad, M.H.M.: Light-weighted DNA based hybrid cryptographic mechanism against chosen cipher text attacks. Int. J. Inf. Process. Index. arXiv, Indian Citation Index-2015. ISSN-0973–821
5. Ashok, K.S., Suresh Babu, E., Nagaraju, C., Peda Gopi, A.: An empirical critique of on-demand routing protocols against rushing attack in MANET. Int. J. Electr. Comput. Eng. 5(5) (2015)
6. Konstantinou, E., Stamatiou, Y.C., Zaroliagis, C.: Efficient generation of secure elliptic curves. Int. J. Inf. Secur. 6(1), 47–63 (2007). https://doi.org/10.1007/s10207-006-0009-3
7. Johnson, D.B., Maltz, D.A., Broch, J.: DSR: the dynamic source routing protocol for multi-hop wireless ad hoc networks. Adhoc Netw. 139–172 (2006)
8. Suresh Babu, E., Nagaraju, C., Prasad, M.H.M.: A secure routing protocol against heterogeneous attacks in wireless ad hoc networks. In: Proceedings of the Sixth International Conference on Computer and Communication Technology 2015, pp. 339–344. ACM (2015)
9. Lou, W.: A simulation study of security performance using multipath routing in ad hoc networks, pp. 2142–2146 (2003)

Materialized Queries with Incremental Updates

Sonali Chakraborty and Jyotika Doshi

Abstract Enterprise decision-making relies on results extracted through Online Analytical Processing (OLAP) queries, so the performance of result retrieval from the data warehouse is a critical factor. Frequent OLAP queries have to access warehouse data repeatedly to generate the same results. To avoid re-executing the same OLAP query against the data warehouse, our approach materializes queries and stores them, along with their results and other metadata, in a separate database named MQDB. When the query is fired again, results are fetched from MQDB if no incremental updates are needed. If incremental updates are required, only the incremental records in the data warehouse are analyzed to retrieve updated results. The final results combine the results in MQDB with the incremental results retrieved from the data warehouse. Traversing only the incremental records in the data warehouse yields faster query result retrieval. This paper evaluates the query execution time of materialized queries with and without incremental updates from the data warehouse.

Keywords Data warehouse · Materialized query · Incremental updates

1 Introduction

Data from an OLTP (Online Transaction Processing) system is transferred to the data warehouse through the process of ETL (Extraction, Transformation, and Loading). The enterprise data warehouse is refreshed periodically for generating updated results. The refresh rate of the data warehouse depends on the size of the enterprise and the number of transactions occurring per unit time.

S. Chakraborty (✉), Gujarat University, Ahmedabad, Gujarat, India
J. Doshi, GLS University, Ahmedabad, Gujarat, India
© Springer Nature Singapore Pte Ltd. 2019. S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_4


S. Chakraborty and J. Doshi

Results of OLAP queries fired for analysis and decision-making are extracted from the warehouse data. The main problem is that when the same query is fired again, traversing all the records in the data warehouse is time-consuming because of its large size. We suggest using an MQDB (Materialized Query Database) that stores executed queries along with their results and metadata such as time stamp, frequency, and threshold. When an OLAP query is fired, MQDB is first searched for an equivalent query. If one is found and no incremental updates are required, the results are retrieved from MQDB. This avoids accessing the warehouse every time the same query is fired and hence gives better performance. If incremental updates are required, we suggest considering only the incremental records in the data warehouse for generating updated results.

2 Literature Review

The literature [1–4] emphasizes the role of the data warehouse in making strategic decisions. The quality of the data warehouse is crucial for organizations. Hence, metrics should be defined based on the organization's needs related to external quality attributes, and methods, models, techniques, and tools should be used for designing and maintaining a high-quality data warehouse [1]. Bara et al. [2] emphasize improving the quality of business intelligence systems that use OLAP and a data warehouse, or reports based on SQL queries. A business intelligence project may be compromised if the system takes a long time to run or the data is unavailable for a long period. They presented a classification of data warehouse types based on function, namely central data warehouse, data mart, virtual data warehouse, and views [3]. The primary concern for ad hoc queries on a large data warehouse is that query performance degrades because the queries must traverse large volumes of data after performing multiple joins. The probability that the same query is fired by multiple users is high. The authors introduced a strategy to minimize response time by storing queries and their corresponding results in cache memory. Vanichayobon [4] focuses on the impact of index structures on the performance of queries, including ad hoc queries, and identifies the factors that need to be considered when selecting a proper indexing technique for data warehouse applications. Srinivasa [5] discusses issues in the management of read-only data over long time spans, which is critical for data warehouses. A good schema is required for integrating data from diverse sources and for removing errors during reconciliation. O’Neil and Quass [6] review indexing technology used to speed up queries in data warehouse environments. They introduced two approaches, namely Bit-Sliced indexing and Projection indexing.


3 Proposed Approach

When an OLAP query is fired, MQDB is first searched for an equivalent query. If no equivalent query is found, the query is considered new: results are generated using warehouse data, and the executed query is stored in MQDB along with its results and other metadata. The next time the same or an equivalent query is fired, results are fetched from MQDB if no incremental updates are required. Otherwise, updated results are generated using only the incremental records from the data warehouse and are appended to the past results stored in MQDB. The suggested approach includes the following phases:

1. Initialization phase, executed during application load, for generating identifiers of existing tables, fields, and functions.
2. Storing the executed query in MQDB if it is fired for the first time.
3. Checking for an equivalent query in MQDB when an OLAP query is fired. If no equivalent query is found in MQDB, phase 2 is performed, treating the query as new. If an equivalent query is found in MQDB, processing continues as follows:
(a) The time stamp of the equivalent query is compared against the last data warehouse refresh date.
(b) If the time stamp is later than the data warehouse refresh date, no incremental updates are required. Results are fetched from MQDB, and the metadata for the query is updated.
(c) Otherwise, incremental updates are performed using the incremental records from the data warehouse.
(d) Past results are fetched from MQDB, and the incremental result is appended to the past result. The result file and other information stored in MQDB for that query are updated (Fig. 1).

To illustrate the proposed approach, we consider the working of an NGO. Data is collected from http://censusindia.gov.in/. The organization's data warehouse stores data about the education levels of people (both males and females) belonging to different age groups all over India, as described in Table 1. Education level is classified as illiterate, literate, below primary, primary, middle, secondary, higher secondary, nontechnical diploma, technical diploma, and graduate.
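The decision logic of phases 2 and 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: MQDB is stood in for by a dictionary, the class `StoredQuery`, the function `answer` and the two callbacks are hypothetical names, and results are plain lists. Only the two dates 22 December 2017 and 1 January 2018 come from the paper's example; the others are made up.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class StoredQuery:
    result: list        # rows saved from earlier executions
    timestamp: date     # date up to which the stored result is current
    frequency: int = 1  # how many times the query has been fired

def answer(signature, mqdb, last_refresh, today, run_full, run_incremental):
    """Return results for a query, touching the warehouse only when needed."""
    entry = mqdb.get(signature)
    if entry is None:
        # Phase 2: new query, executed on the full warehouse and stored.
        entry = mqdb[signature] = StoredQuery(run_full(), today)
        return entry.result
    entry.frequency += 1                       # metadata update
    if entry.timestamp <= last_refresh:
        # Steps (c)-(d): fetch only the newer records and append them.
        entry.result = entry.result + run_incremental(entry.timestamp)
        entry.timestamp = today
    return entry.result                        # step (b) when no update ran

log = []                                       # records which warehouse access ran

def run_full():
    log.append("full")
    return ["r1", "r2"]

def run_incremental(since):
    log.append("incremental")
    return ["r3"]

mqdb = {}
first = answer("q2", mqdb, date(2017, 12, 1), date(2017, 12, 22), run_full, run_incremental)
second = answer("q2", mqdb, date(2017, 12, 1), date(2017, 12, 23), run_full, run_incremental)
third = answer("q2", mqdb, date(2018, 1, 1), date(2018, 1, 2), run_full, run_incremental)
print(first, second, third, log)
```

The second call is served entirely from the stored entry, so the log contains one full execution and one incremental execution for three calls.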

3.1 Initialization Phase [7]

This phase is executed during application load time. Identifiers are assigned to tables, fields of each table, and functions applied on them. Identifiers for tables and their corresponding fields vary with applications/domain. For the discussed example, consider the following tables and their attributes:


Fig. 1 Flowchart depicting the phases of the proposed approach

Table 1 Tables and their attributes in the considered example [7]

| Table name | Attributes |
|---|---|
| dw_states | st_Id, st_name, st_entry_date |
| dw_town | tn_Id, st_Id, tn_name, tn_entry_date |
| dw_age | age_Id, age_range, a_entry_date |
| dw_zones | record_id, st_Id, tn_Id, age_Id, m_illiterate, f_illiterate, m_literate, f_literate, m_belprimary, f_belprimary, m_primary, f_primary, m_middle, f_middle, m_secondary, f_secondary, m_hsecondary, f_hsecondary, m_diploma, f_diploma, m_graduate, f_graduate, m_unclassified, f_unclassified, z_entry_date |

(i) Generating table identifiers
Identifiers for the tables listed in Table 1 are assigned as (Table name, Table Identifier): (dw_states, 01), (dw_town, 02), (dw_age, 03), and (dw_zones, 04).

(ii) Generating field identifiers of each table
Field identifiers are generated as (Field name: Field Identifier). As an illustration, the fields of table dw_town are assigned the identifiers (tn_Id: 01), (st_Id: 02), (tn_name: 03), and (tn_entry_date: 04). Identifiers (starting from “01”) are assigned to the fields of the other tables in a similar manner.

(iii) Generating aggregate function identifiers
Function identifiers are generated as (Function name, Function Identifier): (Sum, 01), (Avg, 02), (Min, 03), (Max, 04), (Count, 05), (StdDev, 06), (Var, 07), (group by, 08), and (order by, 09).
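The identifier generation in (i) to (iii) amounts to numbering names in order. The following Python sketch shows one way to do it, assuming the attribute order of Table 1 and two-digit zero-padded identifiers; the helper `make_ids` and the dictionary names are illustrative only.

```python
def make_ids(names):
    """Number names from "01" in the order given."""
    return {name: f"{i:02d}" for i, name in enumerate(names, start=1)}

levels = ["illiterate", "literate", "belprimary", "primary", "middle",
          "secondary", "hsecondary", "diploma", "graduate", "unclassified"]
schema = {
    "dw_states": ["st_Id", "st_name", "st_entry_date"],
    "dw_town": ["tn_Id", "st_Id", "tn_name", "tn_entry_date"],
    "dw_age": ["age_Id", "age_range", "a_entry_date"],
    # m_/f_ columns alternate per education level, as in Table 1
    "dw_zones": ["record_id", "st_Id", "tn_Id", "age_Id"]
                + [f"{sex}_{lvl}" for lvl in levels for sex in ("m", "f")]
                + ["z_entry_date"],
}

table_ids = make_ids(schema)       # dicts iterate in insertion order
field_ids = {table: make_ids(fields) for table, fields in schema.items()}
function_ids = make_ids(["sum", "avg", "min", "max", "count",
                         "stddev", "var", "group by", "order by"])

print(table_ids["dw_zones"], field_ids["dw_zones"]["f_diploma"],
      function_ids["count"])       # 04 20 05
```

Because identifiers depend only on the order of tables and fields, they stay stable across application loads as long as the schema is enumerated in the same order.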

3.2 Storing Materialized Queries When Query Is Fired First Time

Consider the following OLAP queries fired on the discussed example.

Query 1 (without join): find the number of female students pursuing a technical diploma course under various age groups.

    SELECT f_diploma
    FROM dw_zones
    GROUP BY age_Id

Query 2 (requires join): find the number of female students pursuing a technical diploma from different towns of India.

    SELECT count(dw_zones.f_diploma), dw_age.age_range, dw_town.tn_name, dw_states.st_name
    FROM dw_zones, dw_town, dw_states, dw_age
    WHERE dw_states.st_Id = dw_zones.st_Id
      AND dw_states.st_Id = dw_town.st_Id
      AND dw_town.tn_Id = dw_zones.tn_Id
      AND dw_zones.age_Id = dw_age.age_Id
    GROUP BY dw_age.age_range, dw_town.tn_name, dw_states.st_name
    ORDER BY dw_states.st_name desc;

Assuming Query 1 and Query 2 are fired for the first time, identifiers are generated based on (i), (ii), and (iii) from Sect. 3.1. They are saved in the “Stored_query” table in MQDB, as shown in Table 2. Metadata information of the queries is stored in the “Materialized_query” table, as depicted in Table 3.

Table 2 “Stored_query” table in MQDB (query id “q1” and “q2” correspond to Query 1 and Query 2, respectively)

| Sq_id | Query_id | Table_id | Field_id | Function_id |
|---|---|---|---|---|
| sq1 | q1 | 04 | 20 | |
| sq2 | q1 | 04 | 04 | 08 |
| sq3 | q2 | 04 | 20 | 05 |
| sq4 | q2 | 03 | 02 | 08 |
| sq5 | q2 | 02 | 02 | 08 |
| sq6 | q2 | 01 | 02 | 08 |
| sq7 | q2 | 01 | 02 | 10 |

Table 3 “Materialized_query” table in MQDB

| query_id | query_date | query_frequency | query_threshold (frequency of execution per given period) | Size (No. of records) | Path of result table |
|---|---|---|---|---|---|
| q1 | 4/4/2017 | 5 | 15 | 14 | q1_result |
| q2 | 12/22/2017 | 10 | 20 | 6286 | q2_result |

3.3 Processing of Query When Fired Next Time

When an OLAP query is fired, MQDB is first searched for an equivalent query. Note that each query is equivalent to itself. Consider the following example of finding an equivalent query.

Eq_query2 (equivalent query of Query 2)

    SELECT dw_states.st_name, count(dw_zones.f_diploma), dw_age.age_range, dw_town.tn_name
    FROM dw_zones, dw_town, dw_states, dw_age
    WHERE dw_states.st_Id = dw_zones.st_Id
      AND dw_states.st_Id = dw_town.st_Id
      AND dw_town.tn_Id = dw_zones.tn_Id
      AND dw_zones.age_Id = dw_age.age_Id
    GROUP BY dw_age.age_range, dw_town.tn_name, dw_states.st_name
    ORDER BY dw_states.st_name desc;

Identifiers are generated for tables, fields, and functions as discussed in (i), (ii), and (iii) from Sect. 3.1. The identifiers of Eq_query2 are searched in the “Stored_query” table shown in Table 2 and are found to match those of Query 2. Hence, Eq_query2 is equivalent to Query 2. Consider some more OLAP queries along with their equivalent queries.

Query 3: find the average number of males and females pursuing secondary and higher secondary education under different age groups.

    SELECT avg(dw_zones.f_secondary), avg(dw_zones.m_secondary), avg(dw_zones.f_hsecondary), avg(dw_zones.m_hsecondary), dw_age.age_range
    FROM dw_zones, dw_age
    WHERE dw_zones.age_Id = dw_age.age_Id
    GROUP BY dw_age.age_range;

Eq_query3 (equivalent query of Query 3)

    SELECT avg(dw_zones.f_hsecondary), avg(dw_zones.m_hsecondary), dw_age.age_range, avg(dw_zones.f_secondary), avg(dw_zones.m_secondary)
    FROM dw_zones, dw_age
    WHERE dw_zones.age_Id = dw_age.age_Id
    GROUP BY dw_age.age_range;

Query 4: find the town name having the maximum number of graduate males and females.

    SELECT max(dw_zones.m_graduate), max(dw_zones.f_graduate), dw_town.tn_name
    FROM dw_zones, dw_town
    WHERE dw_zones.tn_Id = dw_town.tn_Id

Eq_query4 (equivalent query of Query 4)

    SELECT dw_town.tn_name, max(dw_zones.f_graduate), max(dw_zones.m_graduate)
    FROM dw_zones, dw_town
    WHERE dw_zones.tn_Id = dw_town.tn_Id
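One simple way to make the identifier comparison insensitive to the order of the selected columns is to compare sets of identifier triples. The sketch below is an assumption about how such a lookup could work, not the authors' implementation: it takes the (table, field, function) triples as already extracted from the SQL text, ignores the WHERE clause, and uses a dictionary as a hypothetical MQDB index. The field numbers for dw_zones assume the attribute order of Table 1 (m_graduate = 21, f_graduate = 22).

```python
def signature(triples):
    """Order-insensitive signature of (table_id, field_id, function_id) triples."""
    return frozenset(triples)

# Query 4: max(m_graduate), max(f_graduate), tn_name (no function on tn_name)
query4 = [("04", "21", "04"), ("04", "22", "04"), ("02", "03", None)]
# Eq_query4 lists the same items in a different order
eq_query4 = [("02", "03", None), ("04", "22", "04"), ("04", "21", "04")]
# A different query: avg instead of max on m_graduate
other = [("04", "21", "02"), ("02", "03", None)]

stored = {signature(query4): "q4"}          # hypothetical MQDB index
print(stored.get(signature(eq_query4)))     # q4
print(stored.get(signature(other)))         # None
```

A production version would also need to fold join and filter conditions into the signature, since two queries with the same selected columns but different WHERE clauses are not equivalent.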

3.3.1 Checking Time stamp of Equivalent Query

The “Materialized_query” table shown in Table 3 gives the time stamp of Query 2 as “12/22/2017”. Suppose the last warehouse refresh date is “01/01/2018”. Because the time stamp is earlier than the refresh date, the query result requires an incremental update.

3.3.2 Performing Incremental Updates

Incremental data from the warehouse is used for generating incremental results. Incremental data refers to the records loaded into the warehouse after the query time stamp.
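Selecting the incremental records reduces to filtering on the entry date column. The following Python sketch illustrates this with the standard-library `sqlite3` module as a stand-in for the MySQL database used in the paper. The four rows are made-up sample data, the table is cut down to four of the dw_zones columns, and dates are stored as ISO strings so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dw_zones ("
    "record_id INTEGER, age_Id INTEGER, f_diploma INTEGER, z_entry_date TEXT)"
)
# Hypothetical sample rows; only the column names come from Table 1.
conn.executemany("INSERT INTO dw_zones VALUES (?, ?, ?, ?)", [
    (1, 1, 120, "2017-11-30"),
    (2, 2, 95, "2017-12-10"),
    (3, 1, 40, "2017-12-28"),
    (4, 3, 15, "2018-01-01"),
])

def incremental_rows(conn, query_timestamp):
    """Return only the records loaded after the stored query time stamp."""
    cur = conn.execute(
        "SELECT record_id, age_Id, f_diploma FROM dw_zones "
        "WHERE z_entry_date > ? ORDER BY record_id",
        (query_timestamp,),
    )
    return cur.fetchall()

new_rows = incremental_rows(conn, "2017-12-22")   # time stamp of Query 2
print(new_rows)                                   # [(3, 1, 40), (4, 3, 15)]
```

Only two of the four records are read, which is the source of the time saving when the number of incremental records is small relative to the warehouse.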

3.3.3 Retrieving Results from MQDB and Appending with Incremental Results

Once an equivalent query is found and no incremental updates are required, the results are fetched from the result table recorded in the “Materialized_query” table given in Table 3. If incremental updates are required, results are generated from the incremental warehouse data, and the new results are appended to the past results stored in MQDB. The metadata information of the query is then updated.

4 Experimental Results

The processing time of the queries discussed in our example is evaluated. A data warehouse is populated with more than 10,000 records collected from http://censusindia.gov.in. The tools used for this testing are the Python programming language and the MySQL database. The program is executed on a system with Windows 7, an Intel(R) Core(TM) 2 Duo CPU E8400 @ 3 GHz, 3000 MHz and 2 GB RAM.

Method A: Result generation from the data warehouse.
Total time for retrieving query result = time taken for query execution + extraction of results from the data warehouse.

Method B: Fetching results from MQDB without incremental updates.
Total time for retrieving query result = time taken for finding the equivalent query in the database + checking the time stamp value against the last data warehouse update + fetching results from MQDB.

Method C: Fetching past results from MQDB and performing incremental updates from the data warehouse.
Total time for retrieving query result = searching for the equivalent query in MQDB + checking the time stamp against the last data warehouse refresh + fetching the result from MQDB + performing the incremental update from the data warehouse.

The time taken to generate results using Methods A, B, and C for the queries is tabulated in Table 4. The percentage reduction in time between two methods is tabulated in Table 5. For Methods A and B it is calculated as

100 * (execution time of Method A − execution time of Method B) / execution time of Method A

and the reduction between Methods A and C is calculated in the same way. The execution times are plotted in Fig. 2.
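As a worked check of the reduction formula, the short Python sketch below applies it to the Method A, B, and C timings reported in Table 4 and reproduces the percentages of Table 5. The function name `reduction` and the dictionary layout are illustrative.

```python
def reduction(base, other):
    """Percentage reduction in time of `other` relative to `base`."""
    return round(100 * (base - other) / base, 2)

# (Method A, Method B, Method C) execution times in seconds, from Table 4
timings = {
    "Query 1": (0.7994, 0.0840, 0.6315),
    "Query 2": (1.1775, 0.0919, 0.8392),
    "Query 3": (0.8351, 0.0840, 0.7884),
    "Query 4": (0.7991, 0.0654, 0.7518),
}

table5 = {q: (reduction(a, b), reduction(a, c)) for q, (a, b, c) in timings.items()}
for query, (a_vs_b, a_vs_c) in table5.items():
    print(f"{query}: A vs B {a_vs_b:.2f}%, A vs C {a_vs_c:.2f}%")
```

For Query 1, for example, this gives 100 * (0.7994 − 0.0840) / 0.7994 = 89.49% for Method B and 100 * (0.7994 − 0.6315) / 0.7994 = 21.00% for Method C.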


Table 4 Query execution time using methods A, B, and C

| Queries | Equivalent query | Method A: time taken (s) | Method A: number of records | Method B: time taken (s) | Method B: number of records | Method C: time taken (s) | Method C: number of incremental records from data warehouse |
|---|---|---|---|---|---|---|---|
| Query 1 | Query 1 | 0.7994 | 14 | 0.0840 | 14 | 0.6315 | 14 |
| Query 2 | Query 2 | 1.1775 | 6286 | 0.0919 | 602 | 0.8392 | 602 |
| Query 2 | Eq_query 2 | 1.1775 | 6286 | 0.0919 | 602 | 0.8392 | 602 |
| Query 3 | Query 3 | 0.8351 | 14 | 0.0840 | 14 | 0.7884 | 14 |
| Query 3 | Eq_query 3 | 0.8351 | 14 | 0.0840 | 14 | 0.7884 | 14 |
| Query 4 | Query 4 | 0.7991 | 1 | 0.0654 | 1 | 0.7518 | 1 |
| Query 4 | Eq_query 4 | 0.7991 | 1 | 0.0654 | 1 | 0.7518 | 1 |

Table 5 Reduction in time (%) between two methods

| Queries | Equivalent query | Method A and method B (%) | Method A and method C (%) |
|---|---|---|---|
| Query 1 | Query 1 | 89.49 | 21.00 |
| Query 2 | Query 2 | 92.20 | 28.73 |
| Query 2 | Eq_query 2 | 92.20 | 28.73 |
| Query 3 | Query 3 | 89.94 | 5.59 |
| Query 3 | Eq_query 3 | 89.94 | 5.59 |
| Query 4 | Query 4 | 91.82 | 5.92 |
| Query 4 | Eq_query 4 | 91.82 | 5.92 |
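The percentages in Table 5 can be checked against the execution times in Table 4 with a few lines of Python. The function name `percent_reduction` and the dictionary layout are illustrative choices for this sketch; the numbers are the execution times reported in Table 4.

```python
def percent_reduction(time_a, time_other):
    """Reduction in time (%) of another method relative to Method A."""
    return 100 * (time_a - time_other) / time_a


# Execution times in seconds from Table 4: (Method A, Method B, Method C)
times = {
    "Query 1": (0.7994, 0.0840, 0.6315),
    "Query 2": (1.1775, 0.0919, 0.8392),
    "Query 3": (0.8351, 0.0840, 0.7884),
    "Query 4": (0.7991, 0.0654, 0.7518),
}

for name, (a, b, c) in times.items():
    print(f"{name}: A vs B {percent_reduction(a, b):.2f}%, "
          f"A vs C {percent_reduction(a, c):.2f}%")
# Query 1: A vs B 89.49%, A vs C 21.00%
# Query 2: A vs B 92.20%, A vs C 28.73%
# Query 3: A vs B 89.94%, A vs C 5.59%
# Query 4: A vs B 91.82%, A vs C 5.92%
```

The computed values agree with Table 5 to two decimal places.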

Fig. 2 Graph depicting query execution time using methods A, B, and C (x-axis: Queries, Query 1 to Query 4; y-axis: Time (Seconds); series: Method A, Method B, Method C)

5 Conclusion

It is observed that materializing and storing queries contributes significantly to reducing processing time. Experimental results show that more than 88% of the time is saved when results are fetched from MQDB with no incremental updates, as compared to generating results from the data warehouse. When incremental updates from the data warehouse are required, result retrieval is still faster by at least 5%. The execution time of Method B depends upon the number of result records in MQDB: the fewer the records in MQDB, the less execution time is taken.

References

1. Serrano, M., Trujillo, J., Calero, C., Piattini, M.: Metrics for data warehouse conceptual models understandability. Inf. Softw. Technol. 49(8), 851–870 (2007)
2. Bara, A., Lungu, I., Velicanu, M., Diaconita, V., Botha, I.: Improving query performance in virtual data warehouses. WSEAS Trans. Inf. Sci. Appl. 5(5) (2008). ISSN: 1790-0832
3. Sultan, F., Aziz, A.: Ideal strategy to improve data warehouse performance. Int. J. Comput. Sci. Eng. 02(02), 409–415 (2010)
4. Vanichayobon, S.: Indexing techniques for data warehouses’ queries. http://www.cs.ou.edu/~database/documents/vg99.pdf. Accessed 15 Sept 2016
5. Srinivasa, S.: Query processing issues in data warehouses. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.42.9042&rep=rep1&type=pdf
6. O’Neil, P., Quass, D.: Improved query performance with variant indexes. In: Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, pp. 38–49
7. Chakraborty, S., Doshi, J.: Performance evaluation of materialized query. Int. J. Emerg. Technol. Adv. Eng. 8(1), 243–249 (2018)

Big Data as Catalyst for Urban Service Delivery Praful Gharpure

Abstract Service-providing departments across the globe are opening up the data collected in their respective service areas as a public resource. It is increasingly established that this open data can be used to identify socioeconomic trends and improve public services, leading to economic growth. In 2013, McKinsey published a report titled “Open data: Unlocking innovation and performance with liquid information”, which identified more than $3 trillion in economic value globally that could be generated each year through enhanced use of open data [1]. On one side, such sets of information are an avenue for revenue for the service provider departments; on the other side, interdepartmental data exchange has the potential to transform service delivery. This paper explores the extent to which this can be leveraged under the Digital India Program.

Keywords Federation · Identity · Open data · City information

1 Introduction

In the recent past, service provider departments globally have been accepting, in principle, to share their data in open formats, i.e., datasets thus released shall be free to use, reuse, and redistribute. Once this is done, citizens, researchers, and domain experts shall be able to access this data for better understanding and meaningful interpretation. At the same time, information exchange amongst service provider departments can bring much-needed agility to the service delivery mechanism, giving satisfaction to the end user, i.e., the citizen. In India, the lack of credible data on public services has been a cause of concern for academicians, administrators, and investors. Further, duplication of efforts and service overlaps have resulted in a lack of end user satisfaction. This paper explores the potential of big data to bridge some of the gaps in service delivery, with reference to best practices and a way forward for implementation.

P. Gharpure (✉) Infrastructure & Planning Division, Tata Consultancy Services, Nagpur, India e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_5

2 Distinguishing Big Data from Open Data

Big data is a widely used term for high-volume information that is either put in a standard format or available as free text or in varying formats. Often such information has the potential to be processed as data for meaningful interpretation. Open data pertains to information that is freely available for use and for republishing after analysis as needed, without restrictions from copyright, patents, or other mechanisms of control. Classifying information in this category calls for clarity about its intended use, as not all information may be made available for free use in open forums. The term open data is used for information that is publicly available to individuals, academicians, scholars, and organizations for use in their respective initiatives to make informed decisions and solve complex issues. There are two dimensions to such publicly available data when it comes to its usage: data that can be used as it is, and data that, though publicly available, is licensed for its reuse. Figure 1 [1] provides some examples of the classification of data into proprietary, open, and big data categories based on the sources and information overlaps, and thereby distinguishes open data from big data. Big data thus constitutes key information from within departments which, though it cannot be made available in an open forum, can be used within departmental service provision initiatives. This voluminous information, i.e., big data, has the potential to bring much-needed agility to the overall process.

Fig. 1 Data sources and classification [1]


3 Urban Service Delivery Data Linkages

Urban service delivery essentially consists of extending the service requested by the end user through a set of activities falling under the following four stages.

• Service Initiation: The end user initiates the process by applying for a service or reporting an issue.
• Service Validation: The provider department carries out validation checks and processes the request for attending to the issue.
• Service Implementation: The requested service is put to work once validation is over and the user pays the requisite fees, if needed, or the work is funded from department budgets.
• Service Operation: Once the implementation is complete, the service is extended to the end user or, in case of a breakdown, the service is restored.

Since processes are getting automated, IT-enabled service delivery also generates a repository of all the services and issues reported. It starts with the user information and the area in which the citizen logged the issue, along with information regarding the action taken. If any infrastructure element is altered during the course of issue resolution, it is also logged in the asset management database. A close look at key urban services reveals that information validation is a key step in the overall process and that information exchange is of utmost importance. It is the data elements that come into play for such validation checks. At present, these elements exist in individual silos; they have the potential to transform service delivery dramatically if they form part of a common database accessible to all provider departments. This shall also give citizens the facility to record their information only once, thereby reducing the administrative burden on the system. Figure 2 gives a snapshot of the same.

Fig. 2 Datasets linkage to service delivery

There are several initiatives ongoing in the country under the umbrella of the Digital India Program which directly or indirectly call for information sharing amongst the various departments extending the services and for linking the sequence of these. One such example is the initiative to streamline the procedure for building permissions, which has a sequence of activities and procedures to be followed. These procedures cut across various departments for validation checks, often more than once, depending on the stage of the project. The same is discussed under Sect. 4 below. The authorities in urban areas, mainly the municipal corporations and other state government departments, are instrumental in extending urban services and collectively need to leverage this information to accelerate service delivery. In official transactions like statutory approvals, the information gathered may fall into either of two data types: big data or open data. Categorization of such information should not be a barrier to information exchange, as only internal validation is intended. For service delivery, the departments call for various identity parameters, which the service recipient provides. System-based auto-validations eliminate such requirements. There are various instances of duplication where the user submits validation documents at multiple departments even though the departments already hold the same information or can obtain it from another provider department for validation purposes. Even though IT implementation is progressing in different departments, each new identity created rarely integrates with existing identities, leading to additional costs and complexity.

The data linkages for service delivery are heavily dependent on the identity of the requester. The information sets forming the validation parameters can be linked through federated identity management, by forming a circle of trust [2] amongst the provider departments. Such linkage shall help to process service requests, as the information shall be available instantly. The confidentiality of data is also maintained, as the information remains within the same group of providers. Figure 3 represents the concept of the circle of trust. The various sets of information residing in the silos thus get linked through such a mechanism.


Fig. 3 Creating circle of trusts [2]

4 Urban Service Delivery Data Linkages

The processes for the various services extended cut across multiple departments and even touch back upon the department to which the request was initially submitted, for certain validation checks before the service is delivered. Figure 4 depicts such instances in the process of building permission [3].

Fig. 4 Information exchange channels in building projects [3]

Thus, there are touch points amongst the various service-providing departments where service- and user-related information is gathered; if not leveraged, this results in a lot of rework for the end user as well as the departments concerned. Data exchange shall minimize the rework loops generated in these processes. At present, services are extended by multiple provider departments, such as State Electricity Boards for power, Water Supply and Sewerage Boards for infrastructure, municipalities for service provision, and the Revenue Department for land; overall, all of them converge on land and its user data attributes. A common documentation of such information, once created, shall be comparable to a CMDB (Configuration Management Database) in the IT world. The benefits of a CMDB have been seen in the IT Service Management (ITSM) domain over the last three decades. The best practices of ITSM need to be leveraged as IT is implemented in service delivery [4].

The services depend on the end user identity getting mapped to multiple service delivery channels. Commonly used identity instruments link the customer of a service to many other provider departments, which maintain their own data that can easily be reused for meaningful validations. For example, suppose a user logs in to the site of the Regional Transport Office (RTO) to request a change of address and needs a document to validate the update. The site shall have a link to the State Electricity Board website, on which the user may or may not have an account. The user can reach that website using the link on the RTO/city portal page; at first login, the user's credentials shall be validated, or the user will be required to create a login if one does not already exist, and can then opt for federation. On completing this, the user's identities on the two accounts shall be linked, facilitating the required validation of the address field so that the RTO account is updated seamlessly. The same concept can be extended to various information types that can be attributed to a single end user. Figure 5 illustrates the concept of information exchange outlined here [5].

Information exchange through such touch points in the service delivery channel, where one department makes its citizen-interfacing data available to another, has the potential to bring about service transformation. Table 1 identifies some of the several existing information sources and the use of one provider department's information by another department, leading to a likely service transformation. The information linkages are derived from the author’s own research work as an urban planner [6].
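The RTO address-change example can be made concrete with a small Python sketch. It is a toy model under stated assumptions, not a description of any deployed federation protocol: the `Provider` class, the `form_circle_of_trust`, `federate`, and `update_from_linked` functions, and the account identifiers are all hypothetical names introduced for illustration.

```python
class Provider:
    """Hypothetical provider department holding its own user records."""

    def __init__(self, name):
        self.name = name
        self.accounts = {}    # local user id -> dict of attributes
        self.trusted = set()  # names of providers in the same circle of trust
        self.links = {}       # local user id -> {provider name: remote user id}


def form_circle_of_trust(*providers):
    """Make every provider in the group trust every other one."""
    for p in providers:
        p.trusted |= {q.name for q in providers if q is not p}


def federate(user_consents, local, local_id, remote, remote_id):
    """Link two accounts of one user; needs user opt-in and mutual trust."""
    if not user_consents:
        return False
    if remote.name not in local.trusted or local.name not in remote.trusted:
        return False
    if local_id not in local.accounts or remote_id not in remote.accounts:
        return False
    local.links.setdefault(local_id, {})[remote.name] = remote_id
    remote.links.setdefault(remote_id, {})[local.name] = local_id
    return True


def update_from_linked(local, local_id, remote, attribute):
    """Copy one validated attribute from the linked remote account."""
    remote_id = local.links.get(local_id, {}).get(remote.name)
    if remote_id is None:
        raise LookupError("accounts are not federated")
    value = remote.accounts[remote_id][attribute]
    local.accounts[local_id][attribute] = value
    return value


# Usage: the RTO account takes its address from the electricity board account.
rto = Provider("RTO")
board = Provider("State Electricity Board")
rto.accounts["rto-user-1"] = {"address": "old address"}
board.accounts["seb-user-9"] = {"address": "new address"}
form_circle_of_trust(rto, board)
federate(True, rto, "rto-user-1", board, "seb-user-9")
update_from_linked(rto, "rto-user-1", board, "address")
```

In this sketch the address never leaves the two departments in the circle of trust, which mirrors the confidentiality argument made for federated identity management.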


Fig. 5 Data exchange amongst service provision process [5]

A gap in terms of citizen expectation of urban service delivery has been voiced in several forums; however, a structured approach leveraging available data facilitates better service provision with citizen participation. Such an approach is based on key initial measures to be taken up to evolve a framework. There have been notable success stories in the recent past where departments from different states have carried out initiatives and have demonstrated the benefits of information exchange and its interpretation across services. However, in order to leverage these best practices, there is a need to formulate a four-point approach, as described below, which leads to an emerging framework.

(i) Infrastructure asset information system.
(ii) End user information with multi-identity instrument linkages.
(iii) Common interface link for citizen and administration with a service catalog.
(iv) Based on the above three, the processes adaptable by all providers need to be evolved.

The emerging framework shall thus be based on replicating best practices with a uniform set of procedures, giving the urban service delivery program the scalability to widen the service catalog as new services are introduced. A City Information Repository (CIR), formed through amalgamation of departmental information, provides a skeleton that can be continually updated, in real time, with the information parameters required as inputs for any new service. As mentioned earlier, the ongoing initiatives do synergize with the four-point framework concept, but only a few of the best practices are implemented in the country; if leveraged, they can provide fast-track solutions for adoption by other cities and bring agility to nationwide programs. Table 2 lists some such best practices from currently ongoing programs in India which offer a likely transformation opportunity in service delivery.

Table 1 Information linkage and transformation opportunities [6]

| Information source | Information | User department | Potential transformation |
|---|---|---|---|
| Municipality | Building permission | State water supply and sewerage board | Planning for future augmentation of services |
| Municipality | Building permission | Inspector general of stamps and registry (IGR) | Verification of documents for registration in real time |
| IGR | Ownership registration | Municipality | Linking permitted built-up spaces to usage |
| IGR | Ownership registration | Municipality | Update on multiple ownerships |
| IGR | Rental registration | Municipality | Update on usage of properties for taxation |
| IGR/Municipality | Marriage registration | Passport services | Update on marital status |
| Transport dept (RTO) | Vehicle registrations | Municipality | Parking space allocation |
| Transport dept (RTO) | Traffic violations | Insurance agencies | Insurance premium on driving records |
| Town Planning Dept | Road widths | Municipality | Planning for parking charges or schemes like permits for residential areas |
| IGR | Value of property registered | Municipality/Town planning dept | Real-time update of ready reckoner |
| Multiple departments | User information records | Multiple departments | Real-time verification and updating of records |
| Municipality | Occupancy certificate | IGR | Update on occupancy certificates of registered properties |
| Telecom service provider | Telephone numbers | Prepaid Taxi | Identify frequent calling customer to give preference |
| Telecom service provider | Subscriber address | Various departments | Online validation of subscriber address |
| Transport dept (RTO) | Vehicle number | Insurance agencies | Pricing premium based on vehicle history |
| Transport dept (RTO) | Vehicle ownership data | Insurance agencies, individuals | Vehicle history data available for a price |
| Transport dept (RTO)/Police | Vehicle number/Accident data | Insurance agencies, RTO | Pricing insurance on historical data |
| Banks | Signature | Various departments | Online verification of signature |
| Income tax department | PAN Number | Various departments | Online validation of PAN and signature |
| IGR | Document registration number | Various departments | Online validation of documents |
| Citizen | Parking on internal roads | Municipality | Issue parking permits on select lanes |

Table 2 Framework for replication of best practices

| Component | Subcomponent | Parameter | Best Practices | Transformation |
|---|---|---|---|---|
| CIDB | Service request management | Citizen information | 1. Karnataka resident data hub, CeG, government of Karnataka; 2. e Pragati program Goa; 3. e District, e municipality programs | 1. Elimination of duplicate records in beneficiaries schemes; 2. GOAP plans to eliminate 100 certificates with online validations |
| CIDB | Service issue management | Infrastructure information | 1. Surat municipal corporation; 2. Delhi geospatial: creation of Delhi state spatial data infrastructure; 3. Urban service information systems | 1. Availability of information as per citizen need through self service; 2. Combined resolution of multiple utility issues in single fix |
| Central help desk | Service request management | Single point of contact | 1. Sakala program Karnataka; 2. eLokseva services, Goa; 3. eULB program in Punjab, Haryana, Madhya Pradesh | 650+ services of 50 departments through single service help desk with extended reach through 40,000 offices and 206 help desks; 100/450 services mapped for delivery through central helpdesk |
| Service catalog | Service Portfolio | Service procedure information | Pimpri Chinchwad Municipal Corporation | Availability of all information at one place, reducing trips to offices for information |
| Asset management | Service capacity, availability, continuity | Infrastructure asset information | 1. Municipal asset management manual, Andhra Pradesh; 2. SCADA implementations at municipal corporations | 1. Lifecycle information of asset available; 2. Proactive maintenance possible; 3. Locating breakdown point faster; 4. Resolution/alternative fix process accelerated |

5 Visualizing Central Database

The discussion above highlights the fact that duplication of information validations in service delivery mechanisms leads to multiple interactions of the end user with different service providers for the same set of information. Infrastructure items maintained by different service provider departments are held as independent repositories within the individual departments. For a service, the end user goes to the departments individually. Linking these two information sets through the use of ICT in a structured framework has the potential to transform urban service delivery.

Fig. 6 City information repository (CIR) visualization


The initiatives are driven independently by departments, cities, and states, and sets of best practices are thereby being developed. However, the end user is deprived of a standard set of services through a formal “service catalogue”. The information from the initiatives and the replication of success stories can potentially converge to the CIR, as outlined in Fig. 6.
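As a rough illustration of how departmental records might converge into a CIR, the following Python sketch merges records from several departments under a shared citizen identifier and reports which attributes are collected by more than one department. The function name `build_repository`, the `citizen_id` key, and the sample attributes are assumptions made for this example; the paper does not prescribe a schema, and conflicting values are not reconciled here.

```python
from collections import defaultdict


def build_repository(department_records):
    """Merge per-department records into one repository keyed by citizen.

    department_records -- dict: department name -> list of dicts, each with a
                          'citizen_id' key plus attributes (assumed layout)
    Returns (repository, duplicates), where duplicates maps
    (citizen_id, attribute) to the departments that all collect it.
    """
    repository = defaultdict(dict)
    collected_by = defaultdict(set)
    for dept, records in department_records.items():
        for record in records:
            cid = record["citizen_id"]
            for attr, value in record.items():
                if attr == "citizen_id":
                    continue
                # Keep the first value seen; later copies are only counted.
                repository[cid].setdefault(attr, value)
                collected_by[(cid, attr)].add(dept)
    duplicates = {key: sorted(depts)
                  for key, depts in collected_by.items() if len(depts) > 1}
    return dict(repository), duplicates


records = {
    "Municipality": [
        {"citizen_id": "C1", "address": "12 Main Rd", "property_id": "P7"},
    ],
    "RTO": [
        {"citizen_id": "C1", "address": "12 Main Rd", "vehicle_no": "V1"},
    ],
}
repository, duplicates = build_repository(records)
```

Here `duplicates` shows that the address of citizen C1 is gathered by both departments, which is the kind of repeated collection a common repository is meant to remove.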

6 Conclusion

At present, a major initiative, namely the Atal Mission for Rejuvenation and Urban Transformation (AMRUT), is ongoing in the country with a target to cover 500 urban areas simultaneously. The program focuses on the provision of basic services (e.g., water supply, sewerage, urban transport) for citizens and provides social infrastructure elements within designated areas. The AMRUT service level improvement program calls for a decision support system based on user and spatial information available within the authorities, to make water supply and sewerage plans effective within states and urban local bodies, an approach which synergizes with the CIR concept put forward here. Indian cities need an approach to effectively monitor and assess the performance of projects, with improved information access and systematic analysis using business intelligence tools, which can help to improve the effectiveness of urban sector programs. Lack of information available to citizens on services, missing clarity on procedures in the service provider departments, repetitive gathering and transmitting of information for the same user across multiple departments, elongated cycle times, etc., are the well-known grievances of the end user of civic services, i.e., the urban citizen. The analysis above highlights the lack of harmony between end user demands and the features of ongoing projects. Thus, shortlisting initiatives using a customer-centric approach like the one outlined above shall lead to a faster launch of services that share a common user base. Given the pace at which end user expectations of service delivery are growing, there is a need to leverage the available information with IT tools to extend quality services, which shall lead to better maturity levels for service delivery.
Such an approach shall complement physical planning projects, which would be based on a universally adoptable process framework for service delivery leveraging “Big Data”.

References

1. The McKinsey Global Institute: Open data: unlocking innovation and performance with liquid information (2013)
2. Praful, G.: UID miles covered miles to go. Exp. Comput. (2010)
3. MOUD: Streamlining Approval Procedures for Real Estate in India (2013)
4. TSO, P.: ITIL Lifecycle (2007)
5. Gharpure, P.: National Conference of e Governance. Gandhinagar (2015)
6. Gharpure, P.: Transforming service delivery. Property Today, Oct 2012
7. Buchholtz, S., Bukowski, M., Sniegocki, A.: Big & Open Data in Europe: A growth Engine or Lost Opportunity. Warsaw, Centre for European Strategy Foundation (2014)
8. Centre for Study of Science, Technology and Policy: Reconceptualising Smart cities: A Reference Framework for India, Compendium of Resources (2015)
9. Council, S.C.: Smart Cities Open Data Guide (2014)
10. Praful, G.: Prioritizing Urban Service Management Initiatives (2008). www.ijedict.dec.uwi.edu
11. Government, H.: Open Data White paper. Cabinet Office Publication UK (2012)
12. new-big-data-vs-open-data-mapping-it-out (2013). www.opendatanow.com

Traffic Signal Automation Through IoT by Sensing and Detecting Traffic Intensity Through IR Sensors Sameer Parekh, Nilam Dhami, Sandip Patel and Jaimin Undavia

Abstract Traffic detection and controlling systems are based on the Internet of Things. The proposed system is built on this emerging concept, as it collects data and shares it with other devices to take effective decisions. The main objective of the dynamic traffic control system is to automate the current traffic controlling system in India. In the existing traffic control system, the traffic lights are controlled with fixed-timing switching signals. This system has many drawbacks because the actual density of the traffic is not monitored, so vehicles may wait for a longer time than needed. This problem can be solved by detecting the actual density of the traffic. The proposed system detects the traffic density using IR sensors and, accordingly, a micro-controller controls the switching of the traffic lights. This paper thus demonstrates the advantages of the dynamic traffic controlling system over the static, traditional traffic controlling system.



Keywords IoT (Internet of Things) · Traffic density detection · Traffic controlling · Traffic detection · Arduino · IR sensor · Android







S. Parekh (✉) ⋅ N. Dhami ⋅ S. Patel ⋅ J. Undavia Charotar University of Science & Technology, Anand, India e-mail: [email protected] N. Dhami e-mail: [email protected] S. Patel e-mail: [email protected] J. Undavia e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_6


1 Introduction

IoT (Internet of Things) is not just about connecting things; it is also about connecting people. Things connecting to the internet are a big deal because they can start sharing their experiences and knowledge with other things. The idea is to take a thing and add to it the ability to sense, communicate, touch, and control. The Internet of Things (IoT) has taken a significant step towards connecting four pillars (things, data, process, and even people), resulting in the Internet of Everything (IoE). The IoT is the inter-networking of physical devices, electronic devices, vehicles, buildings, and many more items embedded with software, sensors, actuators, and network connectivity, which enable these objects to record, collect, and exchange data. The scope of IoT is vast. The smart city is one of its most powerful applications and is generating curiosity among the world’s population; as shown in Fig. 1, it includes smart grids, urban security, virtual power plants, intelligent transportation, smart surveillance, smart homes, smart energy management systems, smart traffic management, environmental monitoring, etc.

Fig. 1 Categories of internet of things (IoT)


The smart city emerges from the effective usage of city resources for enhancing quality of life. In this context, the development of the Internet of Things (IoT) paradigm strongly encourages the smart city vision around the world. IoT devices can be deployed according to geolocation, and the data they collect can be evaluated by an analysis system. This collected data can then be used in several aspects of a smart city, such as monitoring vehicles, public parking lots, vehicular traffic controlling, weather forecasting, smart homes, water systems, environmental pollution, surveillance systems, etc. Of all of the above, traffic-related data is one of the most vital sources of data in a typical smart city. As per an article in The Times of India [1], the number of vehicles registered per day in India is almost 53,700, which shows that the number of vehicles keeps increasing at a faster rate than ever before. So, there has to be an accurate method to control the traffic. The goal of the proposed system is to accurately detect and control traffic in real time, which in turn helps to reduce fuel consumption, reduce time consumption, and clear traffic faster. There are two parts to traffic controlling: (1) detection of traffic density and (2) controlling. For traffic detection, many systems use components such as cameras with image processing, RFID, IR sensors, smartphones, loop detectors, and many more; of these, the proposed system uses IR sensors.

2 Related Work

This section reviews related work by others and the different methods used to detect and control traffic flow. Table 1 lists these methods along with loopholes in each method that can result in erroneous detection or controlling of traffic. Table 1 covers most of the methods for traffic detection and controlling; other techniques include IR combined with RFID, where RFID is installed in emergency vehicles and a GSM module is used for communicating traffic updates to an Android app [2]. That technique positions its IR sensors differently from the proposed system.

3 Proposed System

The system uses IR sensors for detection of the traffic (vehicles). The roads are divided into logical blocks, as shown in Fig. 2. The system then checks the readings of the multiple IR sensors to determine the density of the traffic. If the density in one lane is higher than in the other lanes, that lane gets higher priority and a longer green-signal duration for clearing out the traffic.

• It uses a video to count and classify vehicles, so in any case such as rain or fog it would not be able to find centroid and distance between marked borders

Table 1 Survey of present techniques for detecting traffic or controlling traffic flow, and their limitations

Name of paper: Smart traffic control system [3], March 2016
Method used:
• Infrared sensors are mounted in the dividers to detect vehicles
• Vehicles are equipped with Bluetooth and GPS for identification
• A message is sent to the control room using GSM; when an ambulance approaches the junction, it communicates with the traffic controller via Bluetooth
Tools: IR, Bluetooth/GPS, GSM
Possibility of erroneous results:
• If the traffic queue is too long, a vehicle's Bluetooth may be out of range and unable to connect
• Since the vehicles are equipped with GPS, it is unclear why IR sensors and Bluetooth are also needed to sense or identify traffic or vehicles; GPS alone could be used

Name of paper: A video surveillance system for traffic application [4], December 2014
Method used:
• Image processing
• The road is divided into sections; the background sections are those with few changes
• Sections can be extracted by motion accumulation
• The difference between consecutive frames is used to detect vehicle movement
Tools: Camera

Name of paper: Traffic surveillance by counting and classification of vehicles from video using image processing [5], November 2013
Method used:
• Image processing (Canny edge detection method)
• Identifies and locates sharp discontinuities in an image
• The discontinuities are abrupt changes in pixel intensity that characterize the boundaries of items in a scene
Tools: Camera

Possibility of erroneous results (camera-based systems [4, 5]):
• Not useful for individual vehicle detection
• Insufficient to accurately track occluded vehicles; tracking has to be initialized separately for each vehicle to handle occlusion better
• A taller vehicle blocks the vehicle behind it

Name of paper: Piezo based self sustainable traffic light system [6], June 2013
Method used:
• Piezoelectric power
• Piezoelectric elements convert mechanical stress into electric impulses; the mechanical stress is applied to the piezoelectric elements by vehicles moving on the road
• The energy produced by the piezoelectric cells is stored as capacitor energy, ½CV², and then used to power the traffic light system
Tools: Piezoelectric cell, piezoelectric converters
Possibility of erroneous results:
• The cost of the system is too high because the cost of piezoelectric elements is too high (40,000/- as mentioned)
• Requires high maintenance because it is a delicate system

56 S. Parekh et al.

Table 1 (continued)

Name of paper: Priority vehicle system control [7], September 2010
Method used:
• Preemption: the normal signal sequence is interrupted in deference to a special vehicle such as an ambulance
• Priority: green is held longer for the vehicle, or the signal reverts to green as soon as possible (a light emitter/receiver is used to detect the light code)
Tools: Light emitter/receiver
Possibility of erroneous results:
• If another vehicle uses the same light code, the system may detect it as a priority vehicle and change its signal sequence

Name of paper: Vehicle identification concept [7], September 2010
Method used:
• Signature loop detector: the vehicle has a mounted transmitter to signal the controller, and the road has a loop to detect the signal and its priority; the detector unit has a discriminator module
• The detector recognizes the vehicle passing over the loop
• Four factors determine which detector to use
Tools: Loop detector
Possibility of erroneous results:
• If another vehicle has a matching pose, the detector may identify it as a priority vehicle
• Higher cost

Name of paper: Traffic detection system using Android [8], June 2015
Method used:
• Uses a GPS-enabled Android phone; a user who is stuck in traffic sends a notification about the traffic to others
Tools: Android smartphone
Possibility of erroneous results:
• Users may not report traffic to others, which may increase the traffic
• Problematic when the network is unavailable
• Works only with Android phones

Traffic Signal Automation Through IoT by Sensing … 57


Fig. 2 Pictorial representation

In the proposed system, the sensors are placed under the road. A grid of sensors beneath the road surface senses the traffic above it, and because the sensors form a grid, the density of the traffic can be measured in terms of its length. There is a calculated gap between the first and second blocks of sensors (Fig. 2): traffic is considered normal until it reaches the second block, and until then the traffic signal works normally. Traffic is classified into three levels, “NORMAL”, “MEDIUM”, and “HEAVY” (Fig. 2). The density of each road is measured in these three categories, and time is allocated to each road accordingly (Fig. 3). If any or all of the sensors stop working, the signal falls back to normal operation with fixed time intervals. In addition, an Android application can control the traffic signals. Figure 2 shows a top view of the proposed system. The white and blue dots are the grid of IR sensors, and the arrows show the gap between two blocks (in meters). The distance between blocks depends on the length and width of the road, which is normally a rectangular area. Figure 3 shows that the lane with the higher vehicle density gets higher priority and more time than the other lanes. Figure 4 shows the flowchart of the experiment that supports the proposed system.
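The allocation logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the number of sensor blocks (three), the rule mapping covered blocks to traffic levels, the green times, and the fixed fallback interval are hypothetical values chosen for the example and are not taken from the working model.

```python
from typing import Optional, Sequence

# Hypothetical green times in seconds for each traffic level (not from the paper).
GREEN_TIME = {"NORMAL": 30, "MEDIUM": 45, "HEAVY": 90}
# Hypothetical fixed interval used when any sensor block stops responding.
FIXED_GREEN_TIME = 60


def classify_density(blocks: Sequence[Optional[bool]]) -> Optional[str]:
    """Classify one road from its sensor blocks, ordered from the stop line outward.

    True means the block is covered by vehicles, False means it is clear,
    and None means the block is not responding. Returns None on sensor failure.
    """
    if any(state is None for state in blocks):
        return None
    # Count consecutive covered blocks starting at the stop line (queue length in blocks).
    covered = 0
    for state in blocks:
        if not state:
            break
        covered += 1
    if covered <= 1:
        return "NORMAL"  # the queue has not reached the second block
    if covered == 2:
        return "MEDIUM"
    return "HEAVY"


def green_time(blocks: Sequence[Optional[bool]]) -> int:
    """Return the green time for a road, falling back to a fixed interval on failure."""
    level = classify_density(blocks)
    if level is None:
        return FIXED_GREEN_TIME
    return GREEN_TIME[level]


if __name__ == "__main__":
    roads = {
        "north": [True, True, True],
        "east": [True, False, False],
        "south": [True, None, False],  # one block failed: fixed interval is used
    }
    for name, blocks in roads.items():
        print(name, green_time(blocks))
```

In this sketch a single failed block is enough to trigger the fixed-interval fallback, which mirrors the contingency described in the text in the simplest possible way.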

Fig. 3 High traffic density gets high priority


Fig. 4 System flowchart (shown for two lanes only)

The fallback Android application (Fig. 5) can also be used in emergencies. For example, if an accident victim is in one of the cars waiting in traffic, the driver calls the authority, which verifies that the vehicle is an emergency vehicle, so that the vehicle gets a green light as soon as it reaches the signal. Likewise, for priority vehicles such as ambulances, the particular road can be given higher priority using this application.


Fig. 5 Fall back system

Fig. 6 Customized IR module

4 Components for Working Model

The components used for the working model are an Arduino Mega, IR sensors (Fig. 6), a Bluetooth module HC 05 (Fig. 7), and a 16 × 2 LCD display (Fig. 8).

Arduino Mega 2560: Designed for projects that require more I/O pins, RAM, and memory. It has 54 digital I/O pins, 16 analog input pins, and a larger space for the sketch [9].

IR Sensor: This sensor is used to detect obstacles or the motion of an object. It emits and detects infrared radiation, which is not visible to the human eye. In the proposed system, a customized IR module is used to detect the density of the traffic.


Fig. 7 Bluetooth module HC 05

Fig. 8 16 × 2 LCD display

16 × 2 LCD Display: An LCD (Liquid Crystal Display) is an electronic display module. A 16 × 2 LCD has 2 rows and can display 16 characters per row. Each character is displayed in a 5 × 7 pixel matrix.

HC 05 Bluetooth chip: The Bluetooth chip is used to set up a wireless serial connection. It is an easy-to-use Serial Port Protocol module. It accepts data and sends it through its TX pin, which is connected to the RX pin of the Arduino [10].

Blueprint of Model: The circuit diagram in Fig. 9a contains (1) an Arduino Mega board, (2) a breadboard, (3) 15 IR sensors, and (4) 6 LEDs. The circuit diagram in Fig. 9b contains (1) the same Arduino Mega board as in Fig. 9a, (2) a 16 × 2 LCD display, and (3) a Bluetooth module HC 05.

Fig. 9 a Connecting IR sensors and LEDs with Arduino using breadboard. b Connecting LCD display and Bluetooth chip with Arduino

Fig. 10 Duration of time interval in current versus proposed system (y-axis: duration of time interval in sec; x-axis: density and length of the traffic in meters, for normal, medium and high traffic)

5 Result Analysis

The conclusions are drawn from experiments comparing the current system [11] with the proposed system. Figure 10 shows the duration of the time interval in relation to the density of vehicles for high, medium, and normal traffic conditions. As the graph shows, the current system is static, whereas the proposed system allocates time dynamically for higher efficiency. Figure 11 shows the free time of the green signal in relation to the density of vehicles, that is, the time wasted after the vehicles have passed while the other lanes wait for the timer to complete. This wastes resources and time and is not efficient. Figure 12 shows the number of vehicles left to pass after the signal changes from green to red in the current system, compared with the proposed system, in which no vehicles are left behind in the high, medium, and normal scenarios.

Fig. 11 Free time left in current versus proposed system (y-axis: free time left in sec; x-axis: density and length of the traffic in meters, for normal, medium and high traffic)

Fig. 12 Number of vehicles left to pass in current versus proposed system (y-axis: number of vehicles left to pass; x-axis: density and length of the traffic in meters, for normal, medium and high traffic)

6 Conclusion

After comparing the results and demonstrating the proposed system, the following benefits were observed. The proposed system is cost effective because it uses components such as IR sensors, which are cheap compared with the components used in other techniques such as image processing, loop detection, piezoelectric elements, and GPS systems. The proposed system is time efficient because it allocates dynamic time intervals based on the density of the traffic, which prevents people from waiting for the next iteration. In addition, the model can detect traffic situations within the sub-roads of the main roads, and it is not affected by vehicles hidden behind taller vehicles. It also has a fallback system and manual control for emergency cases, which makes it more efficient and versatile in various situations.

The scope of the proposed system can be extended to notify the public of the real-time traffic volume at a signal. It can be further extended to calculate the time taken to travel from source to destination and the best route, with a live view of the signal timers. The collected data can also be used to answer questions such as: On which weekday is traffic highest? Which weather causes the most traffic jams? This system accurately detects the traffic and diverts it effectively.

References

1. Article from The Times of India. https://timesofindia.indiatimes.com/auto/miscellaneous/53700-vehicles-registered-across-country-every-day/articleshow/53747821.cms
2. Sonawane, V.P.A., Sagane, A., Mane, P., Vidhate, R.: Intelligent traffic control system. Int. J. Eng. Technol. Manage. Appl. Sci. 4(2)
3. Yawle, R.U., Modak, K.K., Shivshette, P.S., Vhaval, S.S.: Smart traffic control system. SSRG Int. J. Electr. Commun. Eng. (SSRG-IJECE) 3(3)
4. Savaliya, R., Kalaria, V.: A video surveillance system for traffic application. SIJ Trans. Comput. Sci. Eng. Appl. (CSEA) 2(8)
5. Meshram, S.A., Malviya, A.V.: Traffic surveillance by counting and classification of vehicles from video using image processing. Int. J. Adv. Res. Comput. Sci. Manage. Stud. 1(6)
6. Saxena, P., Khurana, M.S., Shukla, R.: Piezo based self sustainable traffic light system. Int. J. Sci. Eng. Res. 4(7)
7. U.S. Department of Transportation, Federal Highway Administration 1993. https://www.youtube.com/watch?v=j3CfU-f0JpI
8. Prakash, O., Aggarwal, M., Vishvesha, A., Kumar, B.: Traffic detection system using android. J. Adv. Comput. Commun. Technol. 3(3)
9. Arduino official website. https://store.arduino.cc/usa/arduino-mega-2560-rev3
10. https://diyhacking.com/arduino-bluetooth-basics/
11. https://smartnet.niua.org/sites/default/files/webform/Best%20Practices%20for%20Traffic%20Signal%20Operations%20in%20India.pdf

Performance Evaluation of Various Data Mining Algorithms on Road Traffic Accident Dataset Sadiq Hussain, L. J. Muhammad, F. S. Ishaq, Atomsa Yakubu and I. A. Mohammed

Abstract Many researchers spend much time searching for the best-performing data mining classification and clustering algorithms to apply to road accident datasets for predicting classes such as the cause of the accident, accident-prone locations, the time of the accident, and even the type of vehicle involved. This study was carried out using two data mining tools, Weka and Orange. It evaluated the Multi-layer Perceptron, J48, and BayesNet classifiers on 150 instances of an accident dataset using Weka. The results showed that the Multi-layer Perceptron classifier performed best with 85.33% accuracy, followed by BayesNet with 80% accuracy and J48 with 78.67% accuracy. The study also found the two best rules for association rule mining using the Apriori algorithm, with confidence 1.0 and lift 1.27 for rule one and confidence 0.91 and lift 1.15 for rule two. The clustering and dimensionality reduction techniques K-means and Self-Organizing Maps were also applied to the dataset using the Orange data mining tool, with a silhouette score of 0.7.



Keywords Data mining ⋅ Classification ⋅ Clustering ⋅ Association rule mining ⋅ Accident dataset

S. Hussain (✉) Dibrugarh University, Dibrugarh, Assam, India e-mail: [email protected] L. J. Muhammad ⋅ F. S. Ishaq ⋅ A. Yakubu Mathematics and Computer Science Department, Federal University, Kashere, Gombe State, Nigeria e-mail: [email protected] F. S. Ishaq e-mail: [email protected] A. Yakubu e-mail: [email protected] I. A. Mohammed Department of Computer Science, Yobe State University, Damaturu, Yobe State, Nigeria e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_7


1 Introduction

A road accident is any crash on a public road that involves one or more vehicles and kills or injures one or more people. Intentional acts of murder or suicide and natural disasters are therefore excluded [1, 2]. According to [3], the total number of road traffic deaths is estimated at 1.25 million per year, and most road traffic casualties occur in developing countries. According to [4], road traffic accidents killed more than 1.2 million people and injured between 20 and 50 million people in 2014, and were the ninth most common cause of death in that year. Road traffic accidents remain among the most important public health problems facing the world and are still one of the most common causes of death worldwide [1–3]. It has been reported [4] that half of those who die on the world’s roads are pedestrians, cyclists, and motorcyclists, who have little or no protection. According to [3], over 3400 people die on the world’s roads every day, and more than 10 million people are injured or disabled every year. Road traffic accidents have become a pressing problem worldwide [1].

The World Health Organization has worked with governmental and non-governmental partners around the world to raise the profile of the preventability of road traffic accidents. Many countries have organizations that maintain road safety to reduce the menace of fatal road traffic accidents [3]. Researchers have employed many techniques, especially statistical techniques, to identify the causes of road traffic accidents from historical road traffic accident datasets. Data miners have explored various parameters or variables for the causes of road accidents, as well as the behaviour of drivers, using different data mining tools and techniques.

Many researchers spend much time searching for the best-performing data mining algorithm for mining road traffic accident datasets. This study evaluated the performance of various data mining algorithms on a road traffic accident dataset.

2 Literature Review

In [1, 5], various parameters of road traffic accidents, including their causes, locations, and times along the Kano–Wudil Highway in Kano, Nigeria, were predicted using the ID3 decision tree classification algorithm. The accuracy of the classifier was 72.7273% for predicting the cause of the accident, and 80.6061% and 76.9697% for the accident-prone location and the time of the accident, respectively. The study was conducted using the Weka data mining software.

In [6], the K-means algorithm was applied to group road accident locations into high-frequency, moderate-frequency, and low-frequency categories. The algorithm took the accident frequency count as the parameter for clustering the locations. Association rule mining was then used to characterize the locations. However, the accuracy of the classifiers used in the study was not determined.

In [4], data mining algorithms were applied to the Fatal Accident dataset to determine the relationship between the fatality rate and other accident attributes, including surface condition, collision manner, light condition, weather, and drunk driving. The classifier used was Naïve Bayes, clustering was done with the K-means algorithm, and the Apriori algorithm was applied to generate association rules. The accuracy of the classifier was 67.95%. Apriori was run with a minimum support of 0.4 and a minimum confidence of 0.6. Weka was also used for the analysis.

Elfadil [7] predicted the causes of road traffic accidents by applying multi-class SVM (Support Vector Machines) to data collected from the Dubai Police Unit, United Arab Emirates. The model predicted the cause of an accident with an accuracy of 76.7%, and Weka was used for the analysis.

Dipo and Akinbola [8] collected accident data from the Lagos–Ibadan road in Nigeria and analyzed it using the ID3 decision tree classifier in Weka. The study predicted the cause and location of road accidents with an accuracy of 77.77%.

In [9], a Naïve Bayes classifier was used in Weka to predict the severity of accidents from collected traffic accident data. Three experiments were conducted. The first used seven attributes and achieved an accuracy of 87.252%. The second used eight attributes, including the earlier seven, and achieved 88.0613%. The third used 13 attributes, including the earlier eight, and achieved 89.4554%.

3 Methodology

Data mining is the process of extracting knowledge, including hidden knowledge, from past or historical data [1]. It is an interactive process for discovering novel, valid, understandable, and useful hidden patterns and relationships in data for making predictions [5]. The study used the following data mining algorithms.

3.1 J48

This classifier generates a decision tree based on the C4.5 algorithm, which was developed by Ross Quinlan [2].

3.2 BayesNet

This classifier delivers high accuracy on large databases and reduces computation time. A Bayesian network represents conditional dependencies among variables using a directed graph [1].

3.3 Multi-layer Perceptron

The Multi-layer Perceptron is a neural network classifier. It is well established as a promising and popular classifier [2].

3.4 Apriori Algorithm

The occurrence of an item may be predicted from the occurrence of other items in a transaction. Rules of the form X → Y that describe such relationships are called association rules. The support of an itemset is the frequency with which it occurs in the transactions, while the confidence of a rule X → Y is the fraction of the transactions containing X that also contain Y [10].
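Support and confidence, together with the lift value that rule miners commonly report alongside them, can be computed directly from a list of transactions. The following Python sketch is only an illustration: the function names and the four toy transactions are assumptions made for this example and are not taken from the accident dataset.

```python
from typing import FrozenSet, Sequence

Itemset = FrozenSet[str]


def support(transactions: Sequence[Itemset], itemset: Itemset) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: Sequence[Itemset], x: Itemset, y: Itemset) -> float:
    """Fraction of the transactions containing X that also contain Y."""
    return support(transactions, x | y) / support(transactions, x)


def lift(transactions: Sequence[Itemset], x: Itemset, y: Itemset) -> float:
    """Confidence of X -> Y relative to how often Y occurs overall."""
    return confidence(transactions, x, y) / support(transactions, y)


# Toy transactions, invented for illustration only.
transactions = [
    frozenset({"cause=WrongOvertaking", "time=Evening", "vehicle=SmallCar"}),
    frozenset({"cause=WrongOvertaking", "time=Evening", "vehicle=SmallCar"}),
    frozenset({"cause=WrongOvertaking", "time=Morning", "vehicle=Truck"}),
    frozenset({"cause=Speeding", "time=Evening", "vehicle=SmallCar"}),
]
x = frozenset({"cause=WrongOvertaking"})
y = frozenset({"vehicle=SmallCar"})

if __name__ == "__main__":
    print(support(transactions, x))        # 3 of 4 transactions contain X
    print(confidence(transactions, x, y))  # 2 of the 3 X-transactions contain Y
    print(lift(transactions, x, y))        # below 1: Y is less frequent given X
```

A lift above 1 indicates that the antecedent makes the consequent more likely than its overall frequency; a lift below 1, as in this toy data, indicates the opposite.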

3.5 Evaluation of Classifiers

The following measures were used to evaluate the performance of the classification algorithms and to identify a suitable algorithm for prediction on the road traffic accident dataset. Here tp, tn, fp, and fn denote the numbers of true positives, true negatives, false positives, and false negatives, P = tp + fn is the total number of positives, and N = tn + fp is the total number of negatives.

Accuracy is the proportion of true positives and true negatives among the total number of cases.

\[ \text{Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{True Positives} + \text{True Negatives} + \text{False Negatives} + \text{False Positives}} \tag{1} \]

Specificity is the number of correct negative classifications divided by the total number of negatives.

\[ \text{Specificity} = \frac{tn}{tn + fp} = \frac{tn}{N} \tag{2} \]

Sensitivity or Recall: Recall is the number of correct positive classifications divided by the total number of positives.

\[ \text{Recall} = \frac{tp}{tp + fn} = \frac{tp}{P} \tag{3} \]

Precision: Precision is the number of correct positive classifications divided by the total number of positive classifications.

\[ \text{Precision} = \frac{tp}{tp + fp} \tag{4} \]

3.6 K-Means Clustering

K-means clustering groups objects into clusters such that each object belongs to the cluster with the nearest mean. The goal of this clustering method is to reduce the intra-cluster variance, that is, to minimize the squared error [11].
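The idea of assigning each object to the nearest mean and then recomputing the means can be sketched as follows. This is a minimal pure-Python illustration of the standard Lloyd iteration on two-dimensional points, not the implementation used by Orange; the function names, the fixed initial centroids, and the four sample points are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points: Sequence[Point], centroids: Sequence[Point],
           max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = list(centroids)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to the cluster with the nearest mean.
        labels = [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def squared_error(points: Sequence[Point], centroids: Sequence[Point],
                  labels: Sequence[int]) -> float:
    """Sum of squared distances from each point to its cluster mean."""
    return sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))


# Sample points and starting centroids, invented for illustration.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
error = squared_error(points, centroids, labels)
```

For these sample points the two left-hand points form one cluster and the two right-hand points the other, and the squared error is the quantity that the method tries to minimize.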

3.7 Self-Organizing Map

The Self-Organizing Map is an unsupervised technique for visualizing high-dimensional data in low-dimensional views. This dimensionality reduction method uses an artificial neural network to produce a discretized representation of the training data [12].
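Before turning to the experiments, the evaluation measures of Sect. 3.5 (Eqs. 1 to 4) can be made concrete with a short Python sketch that computes them from the four confusion-matrix counts. The function names and the counts in the example are assumptions made for illustration; the counts are not taken from the study.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Eq. (1): correct classifications over all cases."""
    return (tp + tn) / (tp + tn + fn + fp)


def specificity(tn: int, fp: int) -> float:
    """Eq. (2): correct negatives over all negatives (N = tn + fp)."""
    return tn / (tn + fp)


def recall(tp: int, fn: int) -> float:
    """Eq. (3): correct positives over all positives (P = tp + fn)."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Eq. (4): correct positives over all positive classifications."""
    return tp / (tp + fp)


# Made-up counts for a two-class problem with 100 cases.
tp, tn, fp, fn = 40, 45, 5, 10

if __name__ == "__main__":
    print(accuracy(tp, tn, fp, fn))  # (40 + 45) / 100
    print(specificity(tn, fp))       # 45 / 50
    print(recall(tp, fn))            # 40 / 50
    print(precision(tp, fp))         # 40 / 45
```

These formulas are stated for a two-class problem; for a multi-class dataset, tools typically compute them per class and report an average.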

4 Experiment and Results

4.1 Data Set

The study used the road traffic accident dataset for the Kano–Wudil highway in Nigeria. The dataset was used in [1, 5] to predict the cause of accidents, accident-prone locations, and accident times along the highway. It contains four variables (Vehicle Type, Accident Time, Accident Cause, and Accident Location) and covers 30 months, from January 2014 to June 2016 (Fig. 1).

Fig. 1 Mosaic display of the accident time against accident location

4.2 Experiment and Results

The open source Weka data mining software, which performs various data mining tasks using machine learning algorithms, was used for the experiments. The study applied the Multi-layer Perceptron, J48, and BayesNet classifiers directly to the 150 instances of the road traffic accident dataset. The results are shown in Table 1.

Table 1 Comparison of different classifiers

Sl. no. | Data mining algorithm (classifier) | Accuracy (%) | Specificity | Sensitivity | Precision | Recall
1 | Multi-layer Perceptron | 85.33 | 0.473 | 0.853 | 0.848 | 0.853
2 | BayesNet | 80 | 0.624 | 0.800 | 0.769 | 0.800
3 | J48 | 78.67 | 0.787 | 0.787 | 0.619 | 0.787

Apriori rule mining in Weka was used to find the best association rules. The following two rules were found:

1. AccidentCause = WrongOvertaking AccidentLocation = LocationC 15 ⇒ VehicleType = SmallCar 15 lift:(1.27) lev:(0.02) [3] conv:(3.2)
2. AccidentTime = Evening AccidentCause = WrongOvertaking 32 ⇒ VehicleType = SmallCar 29 lift:(1.15) lev:(0.03) [3] conv:(1.71)

K-means clustering and the Self-Organizing Map (SOM) were applied to the dataset as unsupervised learning using the Orange data mining software. Orange is an open source data mining tool for both novice and expert users, with strong visualization and a large toolbox. A silhouette score of 0.7 was achieved, indicating meaningful clustering. Figures 2, 3, 4 and 5 visualize the clustering and unsupervised learning results.
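The two association rules reported above list, for each rule, the number of instances matching the antecedent (15 and 32) and the number of those that also match the consequent (15 and 29). As a quick check, the confidence of each rule is the ratio of these two counts, and dividing the confidence by the reported lift gives the implied overall frequency of the consequent. The Python sketch below is only an illustration; the function names are assumptions and are not part of Weka.

```python
def rule_confidence(antecedent_count: int, rule_count: int) -> float:
    """Confidence of X -> Y from the counts printed with the rule."""
    return rule_count / antecedent_count


def implied_consequent_support(conf: float, lift: float) -> float:
    """Since lift = confidence / support(Y), support(Y) = confidence / lift."""
    return conf / lift


conf_one = rule_confidence(15, 15)  # rule 1: 15 => 15
conf_two = rule_confidence(32, 29)  # rule 2: 32 => 29

if __name__ == "__main__":
    print(round(conf_one, 2), round(conf_two, 2))
    # Both rules share the consequent VehicleType = SmallCar, so the two
    # implied supports should agree up to the rounding of the reported lifts.
    print(round(implied_consequent_support(conf_one, 1.27), 2))
    print(round(implied_consequent_support(conf_two, 1.15), 2))
```

Both rules imply roughly the same overall share of small cars, which is what one would expect because they have the same consequent.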

Fig. 2 SOM accident cause

Fig. 3 SOM accident location

Fig. 4 Scatter plot visualization of K-means clustering

Fig. 5 Silhouette plot clustered by ‘accident location’

5 Conclusion and Future Work

Three data mining algorithms in Weka were applied to 150 records of road traffic accidents on the Kano–Wudil road, Nigeria. The experimental results showed that, for prediction on the road traffic accident dataset, the Multi-layer Perceptron is the most suitable and efficient of the three algorithms. In the experiment, the Multi-layer Perceptron classifier performed best with 85.33% accuracy, followed by BayesNet with 80% accuracy and J48 with 78.67% accuracy. The Multi-layer Perceptron is therefore recommended to scholars and researchers as an efficient data mining classifier for predictive tasks. The study also found the two best rules for association rule mining using the Apriori algorithm, with confidence 1.0 and lift 1.27 for rule one and confidence 0.91 and lift 1.15 for rule two. K-means clustering and the Self-Organizing Map were also applied to the dataset, with a silhouette score of 0.7. As future work, the algorithms may be tested with more data and different datasets for further performance evaluation.

References

1. Muhammad, L.J., Sani, S., Yakubu, A., Yusuf, M.M., Elrufai, T.A., Mohammed, I.A., Nuhu, A.M.: Using decision tree data mining algorithm to predict causes of road traffic accidents, its prone locations and time along Kano–Wudil highway. Int. J. Datab. Theory Appl. 10, 197–206 (2017)
2. Performance evaluation of the data mining models. www.shodhganga.inflibnet.ac.in/bitstream/10603/7989/14/14_chapter%205.pdf. Accessed 29 Dec 2017
3. Global status report on road safety: World Health Organization (2015). http://www.who.int/entity/violence_injury_prevention/publications/road_traffic/en/index.html. Accessed 25 Dec 2017
4. Global status report on road safety: time for action, WHO (2009)
5. Muhammad, L.J., Yakubu, A., Mohammed, I.A.: Data mining driven approach for predicting causes of road accident. In: 13th International Conference 2017, Information Technology for Sustainable Development, vol. 28, pp. 10–15. Nigeria Computer Society (2017)
6. Liling, L., Sharad, S., Gongzhu, H.: Analysis of road traffic fatal accidents using data mining techniques. In: 2017 IEEE 15th International Conference Software Engineering Research, Management and Applications (SERA) (2017)
7. Elfadil, A.M.: Predicting causes of traffic road accidents using multi-class support vector machines. In: Proceeding of the 10th International Conference on Data Mining, July 21–24, 2014, pp. 37–42. Las Vegas, Nevada, USA (2014)
8. Dipo, T.A., Akinbola, O.: Using data mining technique to predict cause of accident and accident prone locations on highways. Am. J. Datab. Theory Appl. 1(3), 26–38 (2012)
9. Jaideep, K., Chandra, P.S.: Mining road traffic accident data to improve safety on road-related factors for classification and prediction of accident severity. Int. Res. J. Eng. Technol. (IRJET) 03(10) (2016)
10. Rakesh, A., Ramakrishnan, S.: Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th International Conference on Very Large Data Bases, pp. 487–499, Sept 12–15 1994
11. Vattani, A.: k-means requires exponentially many iterations even in the plane. Discrete Comput. Geometry 45(4), 596–616 (2011). https://doi.org/10.1007/s00454-011-9340-1
12. Ultsch, A.: Emergence in self-organizing feature maps. In: Ritter, H., Haschke, R. (eds.), Proceedings of the 6th International Workshop on Self-Organizing Maps (WSOM ‘07). Bielefeld, Germany, Neuroinformatics Group (2007). ISBN 978-3-00-022473-7

Casper: Modification of Bitcoin Using Proof of Stake Nakul Sheth, Priteshkumar Prajapati, Ayesha Shaikh and Parth Shah

Abstract Proof of work is the Bitcoin mining algorithm in which miners solve a cryptographic puzzle using large amounts of electricity and computational power. The first miner to solve the puzzle generates a new block in the blockchain and is rewarded with some amount of bitcoins. With proof of stake, the mining process may not require such a large amount of computational power and electricity from the miner, so proof of stake may be the most promising mining algorithm for cryptocurrencies. In proof of work, the validation of a transaction depends on the computational power that a miner contributes to solving the cryptographic puzzle, whereas proof of stake is a blockchain consensus algorithm that depends on the amount of stake deposited by a validator into the network. Proof of stake reduces the risk of centralization and improves the energy efficiency of the blockchain.

Keywords Proof of work ⋅ Proof of stake ⋅ Bitcoin ⋅ Casper ⋅ Cryptocurrency ⋅ Blockchain ⋅ Block

N. Sheth (✉) ⋅ P. Prajapati ⋅ A. Shaikh ⋅ P. Shah Department of Information Technology, CSPIT CHARUSAT, Changa, Anand, Gujarat, India e-mail: [email protected] P. Prajapati e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_8

1 Introduction

Bitcoin, the first decentralized cryptocurrency, was introduced by Satoshi Nakamoto in 2009 [1]. It is represented by “BTC” in the cryptocurrency market [1]. Bitcoin is created and stored electronically in a public ledger, so it is not printed on paper like traditional currencies [2]. Bitcoin uses the proof of work mining algorithm to generate new bitcoins. Proof of work is a cryptographic puzzle that miners solve by contributing computational power to the blockchain; the miner who has the most power in the blockchain network can generate a new block and receives a reward from the network. Proof of work requires a large amount of computational power and electricity [2]. If one node has more computational power than the other nodes in the blockchain, the risk of centralization increases, and with it the chance that the blockchain can be modified [3]. All bitcoin transactions rely on public key cryptography. Bitcoin can be divided into the following units [1]:

• 1 millibitcoin (mBTC) = 10^-3 BTC
• 1 microbitcoin (µBTC) = 10^-6 BTC
• 1 satoshi (satBTC) = 10^-8 BTC

Casper is a proof of stake algorithm that can reduce this centralization risk. A validator has to deposit some amount of coins into the network to take part in generating a new block in the blockchain. If the network detects any malicious behaviour by a validator, that validator loses all of its deposited coins [4].

Characteristics of Bitcoins. Bitcoin provides several features that set it apart from traditional currencies [5].

Decentralized currency: Bitcoin is not controlled by any third party or by a country’s central bank. Every node in the network has access to the transactions and can generate new bitcoins. The network therefore cannot be controlled by a single authority that could tinker with the monetary system and cause a meltdown. Even if part of the network goes offline, transactions keep working [5].

Easy to set up: It is easy to set up a bitcoin account. Only a bitcoin wallet is required to start making bitcoin transactions. By contrast, a traditional bank requires authorized documents, and the user has to go through many authentication steps [5].

Anonymous currency: A bitcoin address does not reveal its owner, because users are not required to link an address, name, or any personal identity to a central authority [5].

Transparent currency: This is the main advantage of using bitcoin. All transactions that happen in the network are visible to all nodes participating in the blockchain. Anyone can see how many coins are stored in a bitcoin wallet, but no one can withdraw those coins without the secret key of that wallet [5].

Minuscule transaction fees: Bitcoin charges only a minuscule transaction fee, whereas a traditional bank may charge fees according to its rules and regulations [5].

Provides fast transactions: The owner of bitcoins can send money anywhere in the world without any currency conversion; the transfer completes once the transaction is confirmed [5].

Non-reversible: Once bitcoins are sent and the transaction enters the network, it cannot be reversed, unless the new owner sends the same amount of coins back [5].

Transaction in Bitcoin. A bitcoin is like an electronic chain of digital signatures [2]. Transactions in the network are grouped into blocks, and every block is stored in the public ledger. Each block contains transaction details and the hash of the previous block, and every block has its own hash. The details of a transaction contain three pieces of information:

1. The owner’s public key.
2. The hash of the previously verified transaction and the hash of the current transaction.
3. The bitcoin address, with a digital signature signed with the owner’s private key.

The transaction can be verified using the owner’s public key (Fig. 1). Storing the hash of the previously verified transaction prevents a user from double spending a coin: using this previous hash, the next owner can verify whether the bitcoin address has been used before. This verification is now done by special nodes of the network called miners. Miners verify transactions and add new blocks to the blockchain (Fig. 2).
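The way each block stores the hash of its predecessor can be sketched in a few lines of Python. This is a simplified illustration and not Bitcoin's actual block format: the field names, the JSON serialization, the all-zero hash used for the first block, and the single SHA-256 pass are assumptions made for the example.

```python
import hashlib
import json
from typing import Dict, List

# Assumed placeholder for the "previous hash" of the very first block.
GENESIS_PREV_HASH = "0" * 64


def block_hash(block: Dict) -> str:
    """Hash the block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"prev_hash": block["prev_hash"], "transactions": block["transactions"]},
        sort_keys=True,
    ).encode()
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: List[Dict], transactions: List[str]) -> None:
    """Create a block that points at the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    block = {"prev_hash": prev_hash, "transactions": transactions}
    block["hash"] = block_hash(block)
    chain.append(block)


def is_valid(chain: List[Dict]) -> bool:
    """Check that every block's hash is correct and links to its predecessor."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash or block["hash"] != block_hash(block):
            return False
        prev_hash = block["hash"]
    return True


chain: List[Dict] = []
append_block(chain, ["alice->bob: 1"])
append_block(chain, ["bob->carol: 1"])
append_block(chain, ["carol->dave: 1"])
valid_before = is_valid(chain)

# Tampering with an earlier block breaks its stored hash, so validation fails.
chain[1]["transactions"] = ["bob->mallory: 1"]
valid_after = is_valid(chain)
```

Because every hash covers the previous block's hash, changing any earlier block would require recomputing all later blocks as well, which is the property that proof of work makes expensive.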

Fig. 1 Structure of the transaction [2]


Fig. 2 Process of bitcoin transaction

2 Proof of Work

Proof of Work (PoW) is used to implement a distributed timestamp server on a peer-to-peer basis. It is similar to Adam Back's Hashcash algorithm [2, 6]. The PoW algorithm scans for a value such that, when the block is hashed with it (for example with SHA-256), the hash begins with a required number of zeros [2, 6]. This value is a number called the nonce, which is varied until the hash has the required number of zeros at the beginning [2].

Proof of Work is used to verify transactions over a blockchain network. Every node of the network tries to solve a cryptographic puzzle. The solution is accepted by majority decision: the system checks which answer is given by the largest number of nodes, and it gives some bitcoins as a reward to the node that solved the puzzle. If an attacker tries to modify a previous block, the attacker has to redo the proof of work for that block and for every block after it. If the resulting blockchain does not match the blockchain held by the honest nodes, the block is not considered valid and is treated as an orphan block.

To regulate the strength of PoW, a quantity called difficulty is defined. The difficulty increases or decreases according to how many blocks are added to the blockchain per hour. If new blocks are generated too fast, the difficulty is increased; if block generation is slow, the difficulty is decreased [2]. Proof of Work is a very time-consuming and costly process. In the current blockchain, a miner cannot mine a block with a single ordinary computer; miners need special hardware called ASICs, which are designed for mining (Figs. 3 and 4).
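The nonce search described above can be sketched in a few lines of Python. This is a simplified illustration, not Bitcoin's real procedure: Bitcoin hashes a binary block header with double SHA-256 and compares the result with a numeric target, whereas this sketch hashes a text string once and counts leading zero hex digits. The function names and the sample block data are assumptions.

```python
import hashlib

def digest_for(block_data, nonce):
    """SHA-256 hex digest of the block data concatenated with the nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()

def mine(block_data, difficulty):
    """Increment the nonce until the digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while not digest_for(block_data, nonce).startswith(prefix):
        nonce += 1
    return nonce, digest_for(block_data, nonce)

def verify(block_data, nonce, difficulty):
    """Verification needs a single hash, while mining needs many attempts."""
    return digest_for(block_data, nonce).startswith("0" * difficulty)

# Hypothetical block contents; difficulty 3 keeps the search short.
nonce, digest = mine("prev_hash|alice -> bob: 10", difficulty=3)
```

Each additional required zero hex digit multiplies the expected number of attempts by 16, which is how raising the difficulty slows down block generation.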


Fig. 3 Proof of work (PoW) [2]

[Fig. 3 flowchart labels: Begin Here; Previous Block Hash; Proposed New Block; Combine and Hash; Hash Number; Difficulty Value; Determine Target Value; Compare; Is HASH …; Increment Nonce and Try again]

… Sim(A, C). A and B are much more alike than A and C are (Tables 2 and 3).

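Table 2 Centered cosine correlation (users A, B, C, D; items PC1, PC2, PC3, HP1, HP2). Extracted cell values, cell positions lost: −1/3, −1/3, −1/3, −4/3, 5/3, 1, −1, 0, 2/3, 0

Table 3 User-Item matrix (users A, B, C, D; items PC1, PC2, PC3, HP1, HP2). Extracted cell values, cell positions lost: 4, 5, 5, 2, 1, 3, 3, 5, 3, 4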


Item-Based Collaborative Filtering

In the item-based approach, for an item i we try to find other similar items [7, 15, 21] and estimate the rating for item i from the ratings of those similar items. To predict the rating of user X for item I, we start with item I and find a neighborhood of items: the set of items that have been rated by user X and are similar to item I. In practice, item-based CF often outperforms user-based CF.
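The item-based prediction described above can be sketched as follows. The rating data, the item names `I1` to `I4`, the neighborhood size `k`, and the similarity-weighted average are all assumptions chosen for illustration; the values are not taken from Tables 2 and 3.

```python
from math import sqrt

# Hypothetical ratings[item][user]; a missing user means "not rated".
ratings = {
    "I1": {"u1": 5, "u2": 4, "u3": 1},
    "I2": {"u1": 4, "u2": 5, "u3": 2, "X": 4},
    "I3": {"u1": 1, "u2": 2, "u3": 5, "X": 2},
    "I4": {"u1": 5, "u2": 5, "u3": 1},  # target item, not yet rated by X
}

def centered(row):
    """Subtract the item's mean rating from each of its ratings."""
    mean = sum(row.values()) / len(row)
    return {user: r - mean for user, r in row.items()}

def centered_cosine(a, b):
    """Cosine similarity of mean-centered rating vectors (missing ratings count as 0)."""
    ca, cb = centered(a), centered(b)
    dot = sum(ca[u] * cb[u] for u in set(ca) & set(cb))
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(ratings, user, item, k=2):
    """Similarity-weighted average over the k most similar items the user has rated."""
    neighbors = [
        (centered_cosine(ratings[item], row), row[user])
        for name, row in ratings.items()
        if name != item and user in row
    ]
    top = sorted((n for n in neighbors if n[0] > 0), reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    return sum(sim * r for sim, r in top) / total if total else None

prediction = predict(ratings, "X", "I4")
```

In this toy data, item `I2` is positively correlated with `I4` and item `I3` is negatively correlated, so only `I2` enters the neighborhood and the prediction follows user X's rating of `I2`.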

2.2.3 Hybrid Collaborative Filtering

Hybrid collaborative filtering improves the quality of recommendations. It also overcomes CF problems such as sparsity and loss of information [7]. However, hybrid methods are complex and expensive to implement. Commercial recommender systems, such as the Google News recommender system, are hybrid [7].

2.3 Hybrid Recommender Systems

Combining collaborative filtering and content-based filtering can be more effective in many application areas [6]. Hybrid methods can give more accurate recommendations, and they try to overcome common problems of recommender systems such as cold start and sparsity [6]. Netflix is a good example of a service that uses a hybrid recommender system [22].

3 Comparing Recommendation Approaches

Advantages

Collaborative filtering:
1. CF systems do not require recognizable content information about users or items; they only use ratings [23].
2. CF systems base their assessment on other people's experience [23].
3. CF systems produce personalized recommendations, because they consider other people's experience [23].
4. CF systems suggest beneficial items simply by observing other people's behavior [23].

Content-based:
1. User independence: the system analyzes the user and item profiles for recommendation [24].
2. Transparency: CB recommends items based on their features [24].
3. No cold start: new items can be rated by a different number of users [24].

Hybrid:
1. Uses both CF and CB approaches and overcomes the disadvantages of both techniques.

Disadvantages

Collaborative filtering:
1. Cold-start problem: if no ratings are available, CF systems cannot recommend.
2. Poor accuracy when there is little data about a user's ratings [23].
3. Many CF algorithms are slow on huge amounts of data [23].
4. There is a diffusion problem when collecting information from diverse groups [23].

Content-based:
1. If the content does not contain enough information to discriminate items precisely, the recommendations will not be precise either [24].
2. Over-specialization: the degree of novelty is limited, since the system matches features [24].
3. New user: there is not enough information to build a solid profile of the user [24].
4. Synonymy: if two words are spelled differently, CB treats them as independent words even when they have the same meaning [24].

Hybrid:
1. Complex, because two approaches are combined.

4 Issues of Recommender Systems

(a) Cold-start problem: In the content-based approach, the system must match the features of an item against the relevant features in the user's profile. To do so, it must build a detailed model of the user's tastes and preferences by observing user behavior [20, 25]. If no user has rated an item previously, no such information is available; this is the cold-start problem [15, 20, 25].
(b) Sparsity: Some users have rated only a few items, while other approaches create neighborhoods of users from their profiles [26]. A user has to go through many items; otherwise the user may be related to the wrong neighborhood of items. Sparsity is thus a problem caused by lack of information [27].


(c) Long tail: The long tail is a serious issue. It consists of items that users would not have discovered on their own without a recommendation. The system may also find that many items deserve to be recommended, so the problem is how to rank those items [4, 26].

5 Conclusion

In this paper we discussed three approaches to recommender systems, each with its own advantages and disadvantages. Hybrid algorithms combine features of both collaborative and content-based algorithms. User-based algorithms are accurate but not scalable, whereas item-based algorithms are scalable but less precise. Finally, we reviewed several issues of recommender systems. These systems will keep improving in the future.

References

1. Goldberg, D., Nichols, D., Oki, B.M., Terry, D.: Using collaborative filtering to weave an information tapestry. Commun. ACM 35(12), 61–70 (1992)
2. Nagarnaik, P., Thomas, A.: Survey on recommendation system methods. In: 2015 2nd International Conference on Electronics and Communication Systems (ICECS), pp. 1496–1501. IEEE (2015)
3. Shah, L., Gaudani, H., Balani, P.: Survey on recommendation system. System 137(7) (2016)
4. Leskovec, J., Rajaraman, A., Ullman, J.D.: Mining of Massive Datasets (Sound Recording) (2016)
5. Khari, M., Kumar, P.: Evolutionary computation-based techniques over multiple data sets: an empirical assessment. Arab. J. Sci. Eng. 1–11 (2017)
6. Jannach, D., Zanker, M., Felfernig, A., Friedrich, G.: Recommender Systems: An Introduction. Cambridge University Press (2010)
7. Gallege, P., Sandakith, L.: Trust-based service selection and recommendation for online software marketplaces. Ph.D. Dissertation, Purdue University (2016)
8. Ha, T., Lee, S.: Item-network-based collaborative filtering: a personalized recommendation method based on a user’s item network. Inf. Process. Manage. 53(5), 1171–1184 (2017)
9. Polatidis, N., Georgiadis, C.K.: A multi-level collaborative filtering method that improves recommendations. Expert Syst. Appl. 48 (2016)
10. Haviv, A.: Recommendation Systems in eBay: One of the Largest Semi-Unstructured Marketplace. Newell-simon, 30 Nov
11. Johnson, R.: Advanced recommendations with collaborative filtering. Notre Dame
12. Wang, H., Wang, N., Yeung, D.-Y.: Collaborative deep learning for recommender systems. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1235–1244. ACM (2015)
13. Lardinois, F.: Four approaches to music recommendations: Pandora, Mufin, Lala, and eMusic. Read Write Web, Retrieved 12(19), 11 (2009)
14. Ricci, F., Rokach, L., Shapira, B.: Introduction to recommender systems handbook. In: Recommender Systems Handbook, pp. 1–35. Springer, US (2011)
15. Obadić, I., Madjarov, G., Dimitrovski, I., Gjorgjevikj, D.: Addressing item-cold start problem in recommendation systems using model based approach and deep learning. In: International Conference on ICT Innovations, pp. 176–185. Springer, Cham (2017)
16. Ullman, L.R.: Content Based Recommendations (Sound Recording). Stanford University (2016)
17. Ebrahim, Y.: Memory-Based vs. Model-Based Recommendation System, 13 October 2012
18. Ullman, L.R.: Implementing Collaborative Filtering (Sound Recording). Stanford University (2016)
19. Roberts, N.: Some of the Challenges of Collaborative Filtering. Quora (2016)
20. Cacheda, F., Carneiro, V., Fernández, D., Formoso, V.: Comparison of collaborative filtering algorithms: limitations of current techniques and proposals for scalable, high-performance recommender systems. ACM Trans. Web (TWEB) 5(1), 2 (2011)
21. Almazro, D., Shahatah, G., Albdulkarim, L., Kherees, M., Martinez, R., Nzoukou, W.: A survey paper on recommender systems. arXiv:1006.5278 (2010)
22. Techlabs, M.: How do recommendation engines work. And what are the benefits? (2017)
23. Krasnoshchok, O.: Collaborative filtering recommender systems: benefits and disadvantages. In: KES Conference (2014)
24. Tuan, D.: Recommender systems: how they work and their impacts, May 2012
25. Kohar, M., Rana, C.: Survey paper on recommendation system. Int. J. Comput. Sci. Inf. Technol. 3(2), 3460–3462 (2012)
26. Bai, B., Fan, Y., Tan, W., Zhang, J.: DLTSR: a deep learning framework for recommendation of long-tail web services. IEEE Trans. Services Comput. (2017)
27. Macmanus, R.: 5 Problems of Recommender Systems. Readwrite (2009)

Flower Pollination Optimization and RoI for Node Deployment in Wireless Sensor Networks

Kapil Keswani and Anand Bhaskar

Abstract Wireless Sensor Networks (WSNs) are a popular area of research in which a great deal of work has been done. Energy efficiency is one of the main focus areas because network lifetime is the most common issue. Many algorithms address clustering, forming clusters based on either distance or neighbor nodes. In our study, we focus first on the node deployment approach and then on the routing algorithm. In a WSN, node placement is essential for proper communication between the sensor nodes and the Base Station (BS). For better communication, nodes should be aware of their own location and that of their neighbor nodes. Better optimization of resources and performance improvement are the main concerns for a WSN. Optimal techniques should be used to place the nodes at the best possible locations to achieve the desired goal. For node placement, flower pollination optimization is used to generate better results. The Base Station is responsible for the communication of nodes with each other, and it should be reachable by the nodes. For this purpose, the Region of Interest (RoI) helps to choose the best location. Placing the Base Station in the middle is suitable for static node deployment; a different strategy is needed for a dynamic environment. Nodes should be connected to each other so that data are transmitted properly from the source to the Base Station. MATLAB simulation shows that the proposed methodology improves network performance in terms of dead nodes, energy, and the number of packets sent to the Base Station.



Keywords Wireless sensor network · Routing protocol · Flower pollination optimization · Genetic algorithm and region of interest

K. Keswani (✉) ⋅ A. Bhaskar Department of ECE, Sir Padampat Singhania University, Udaipur, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_17


1 Introduction

Wireless sensor systems are finding a huge number of applications made possible by this rapidly changing technology. One of the major problems with wireless sensor networks is limited power, which can severely limit network lifetime [1, 2]. To combat this problem, self-powered networks with rechargeable batteries are frequently implemented. Even with advances in energy-conserving routing protocols and energy-efficient sensors, irregularities within networks, such as excessive loads and non-ideal energy harvesting scenarios, can be devastating [3, 4]. Various alternatives are available to overcome these network issues and to help optimize network lifetime efficiently [5, 6].

2 Literature Survey

Isaq and Dixit [7] present a deterministic approach. The proposed protocol is simulated, and the results show a large reduction in network energy consumption compared with LEACH and other algorithms. Jamin et al. [8] define a non-cooperative, dynamic, extensive game model for routing in a WSN. They treat each sensor node as a player that has a set of strategies to choose from in order to find a route that meets certain objective functions. Bochem et al. [9] present Tri-MCL, which substantially improves on the accuracy of the Monte Carlo Localization algorithm. To do this, they use three different distance measurement algorithms based on range-free approaches. Bouachir et al. [10] present an opportunistic routing and data dissemination protocol for energy harvesting WSNs that relies on cross-layer mechanisms allowing coordination and synchronization between the routing protocol and the application layer service. Zheng and Luo [11] present an intelligent technique for modeling routing in WSNs. Liaqat et al. [2] note that several clustering structures have been proposed to control the energy hole problem in the network, but that energy hole avoidance remains a challenging issue.

3 Proposed Work

The proposed work first addresses node deployment and then data forwarding using an energy-efficient technique.

3.1 ROI (Region of Interest)

Coverage in a WSN is defined as the total area covered by the set of SNs placed in the ROI. This region is modeled as an m × n grid; each grid point has size 1 and is denoted G(x, y). Here we use the ROI to place the BS at the best location. This method is well suited to placing the BS in the network.

3.2 Genetic Algorithm (GA)

After obtaining the best position for the BS using the ROI, we obtain the population details using GA. GA works mainly on chromosomes, on which three operations (selection, crossover and mutation) are performed to obtain better results. Here it is used to generate the population for the given sensor area.

3.3 Node Deployment/Placement

Deployment of SNs in a WSN means determining the topology of the deployed nodes, that is, the coordinates of the SNs within the plane. An optimal placement approach therefore needs to be considered to achieve the desired goal.

3.4 Flower Pollination

Reproduction in plants occurs through the union of gametes. The pollen grains, which carry the male gametes, and the ovules, which bear the female gametes, are produced by different parts of the flower, so the pollen must be transferred to the stigma for this union. This process of transfer and deposition of pollen grains from the anther to the stigma of a flower is called pollination. Pollination is mostly facilitated by an agent. Fertilization is a result of pollination, and it is essential in agriculture for producing fruits and food crops [9, 12].

3.5 Graph Technique for Data Forwarding

Graphs represent the connectivity between nodes, and applying a graph method to data forwarding gives better results after the nodes have been deployed using the optimization technique. In the proposed work, connectivity is required for data forwarding, and a bipartite graph technique is used for this purpose (Fig. 1).


Fig. 1 Bipartite graph

Using the above technique, the whole network is connected so that communication becomes faster. When an event occurs, only one path is active at a time. With this flip-flop condition, the lifetime of the network is extended and its performance improves.

Proposed Algorithm:

Step 1: Create the initial population.
Step 2: Place the base station at the best position using RoI.
Step 3: Consider the x- and y-axes for the base station deployment.
Step 4: Calculate the Euclidean distance to each node:
$$\mathrm{Dist}(a, b) = \sqrt{(b_1 - a_1)^2 + (b_2 - a_2)^2}$$
Step 5: If (Connectivity = max), then obtain the position and place the BS; else go to Step 4.
Step 6: Apply the genetic algorithm for population generation.
Step 7: Perform selection on the population.
Step 8: Generate a new population using the mutation and crossover operators.
Step 9: Apply FPO for the node deployment.
Step 10: Find the best-fitted position for each node to avoid overlapping.
Step 11: Calculate the fitness function as
$$R(x, y, N) = 1 - \prod_{i=1}^{k} \bigl(1 - R(x, y, N_i)\bigr)$$
where R(x, y, N_i) is the probability that the point (x, y) is covered by sensor node N_i.
Step 12: If (Fitness > Threshold), then take the optimal position of the nodes; else calculate the fitness value again.
Step 13: Update them in the population.
Step 14: Connect each node using the bipartite graph for data forwarding.
Step 15: Data reach the destination efficiently.
Step 16: End (Fig. 2).
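The distance of Step 4 and the coverage fitness of Step 11 can be sketched in Python as follows. The paper does not state how R(x, y, N_i) is modeled, so the binary sensing model (probability 1 inside a sensing radius, 0 outside), the `sensing_radius` parameter, and the grid-averaged `coverage_ratio` are assumptions made only for illustration.

```python
from math import hypot

def dist(a, b):
    """Euclidean distance between points a = (a1, a2) and b = (b1, b2) (Step 4)."""
    return hypot(b[0] - a[0], b[1] - a[1])

def detection_probability(point, node, sensing_radius):
    """Assumed binary sensing model for R(x, y, N_i)."""
    return 1.0 if dist(point, node) <= sensing_radius else 0.0

def joint_coverage(point, nodes, sensing_radius):
    """R(x, y, N) = 1 - prod_i (1 - R(x, y, N_i)) (Step 11)."""
    miss = 1.0
    for node in nodes:
        miss *= 1.0 - detection_probability(point, node, sensing_radius)
    return 1.0 - miss

def coverage_ratio(nodes, m, n, sensing_radius):
    """Assumed aggregate: mean joint coverage over the m x n grid points G(x, y)."""
    total = sum(
        joint_coverage((x, y), nodes, sensing_radius)
        for x in range(m)
        for y in range(n)
    )
    return total / (m * n)

# Hypothetical deployment of two sensor nodes on a 10 x 10 region of interest.
nodes = [(2, 2), (7, 7)]
ratio = coverage_ratio(nodes, m=10, n=10, sensing_radius=3)
```

With a probabilistic sensing model, only `detection_probability` would need to change; the product form in `joint_coverage` stays the same.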

Fig. 2 Flowchart of proposed technique


4 Result Analysis

The proposed work is simulated in MATLAB 2013 to show node deployment in the WSN. There are 100 nodes and a BS, with RoI used to place the BS. The simulation parameters define the area, initial energy, transmission energy, receiving energy, packet size and number of SNs used in the analysis. A total of 3000 iterations are performed in each run.

The existing technique, in which the Flower Pollination Optimization algorithm is used for node deployment, is demonstrated first. The nodes are placed in the given scenario, each node communicates, and the output is reported in terms of dead nodes, energy and packets sent to the BS (Fig. 3). Fig. 4 shows the total number of packets sent to the BS, which is 10 (minimum) at iteration 1 and 63,112 (maximum) at iteration 3000. The energy of the nodes is a significant factor for good communication among them. At the initial iteration, each node has 50 J for the entire process, and this decreases with time. At the last iteration, the total energy of the nodes is approximately 0.9650 J (Fig. 5). The number of dead nodes starts to increase after 1596 iterations and reaches 91 by the last iteration, so 91 of the nodes are dead at the end (Fig. 6).

Next, the proposed work, in which FPO is used with GA and graph theory, is demonstrated. The results show that the proposed technique performs better than the existing technique (Fig. 7). Fig. 8 shows the total number of packets sent to the BS, which is 10 (minimum) at iteration 1 and 69,745 (maximum) at iteration 3000. At the initial iteration, each node again has 50 J for the entire process, and this decreases with time. At the last iteration, the total energy of the nodes is approximately 1.7117 J (Fig. 9). The number of dead nodes starts to increase after 1623 iterations and reaches 80 by the last iteration, so 80 of the nodes are dead at the end (Fig. 10).

Fig. 3 Node deployment in network

Fig. 4 Number of packet sent to BS

Fig. 5 Energy of nodes

Fig. 6 Number of dead nodes

Fig. 7 Node deployment in network

Fig. 8 Number of packet sent to BS
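The figures reported for the two runs can be compared with a short calculation. The sketch below only restates the numbers given in the text (packets sent to the BS, total residual energy, and dead nodes at iteration 3000); the dictionary keys and the `relative_change` helper are names introduced here for illustration.

```python
# Values at iteration 3000 as reported in the text.
existing = {"packets_to_bs": 63112, "residual_energy_j": 0.9650, "dead_nodes": 91}
proposed = {"packets_to_bs": 69745, "residual_energy_j": 1.7117, "dead_nodes": 80}

def relative_change(old, new):
    """Percentage change from the existing technique to the proposed one."""
    return (new - old) / old * 100.0

summary = {key: relative_change(existing[key], proposed[key]) for key in existing}

for key, change in summary.items():
    print(f"{key}: {change:+.1f}%")
```

On these numbers the proposed technique delivers roughly 10.5% more packets, retains roughly 77% more residual energy, and ends with about 12% fewer dead nodes.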


Fig. 9 Energy of nodes

Fig. 10 Number of dead nodes

5 Conclusion

Wireless Sensor Networks are an interesting area of research in which energy efficiency is a major concern because SNs have limited energy resources. Much work has been done in this field. The objective of our research was first to deploy the nodes at accurate positions so that accurate node locations are known. If node locations are known in advance, a better routing algorithm can be applied in the network, which saves energy and increases the lifetime of the nodes. Every node is power-limited and uses its power for detecting, processing, collecting and forwarding data; the main problem in wireless sensor networks is therefore to use the energy of the sensor nodes optimally so that the lifetime can be extended. Since the battery power of the sensor nodes cannot be changed or modified after deployment, the nodes are expected to operate for a relatively long duration.


Thus, for optimum utilization of energy and optimization of network lifetime, two nature-inspired intelligent algorithms are used together for node deployment, which gives better optimization of network lifetime.

References

1. Heinzelman, W., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication protocol for wireless microsensor networks. In: Proceedings of the 33rd Hawaii International Conference on System Sciences (HICSS’00)
2. Liaqat, M., Javaid, N., Akbar, M., Khan, Z.A., Ali, L., Hafizah, S., Ghani, A.: HEX clustering protocol for routing in wireless sensor network. IEEE 2014. ISSN 1550-445X/14
3. Roopashree, H.R., Dr. Kanavalli, A.: STREE: a secured tree based routing with energy efficiency in wireless sensor network. IEEE (2015). ISBN 978-1-4799-7623-2/15
4. Kulik, J., Heinzelman, W., Balakrishnan, H.: Negotiation-based protocols for disseminating information in wireless sensor networks. Wirel. Netw. 8(2/3), 169–185 (2002)
5. Fang, Z., Zhao, Z., et al.: Localization in wireless sensor networks with known coordinate database. EURASIP J. Wirel. Commun. Netw. 2010, 901283 (2010)
6. Khera, S., Dr. Mehla, N., Dr. Kaur, N.: Applications and challenges in wireless sensor networks. https://doi.org/10.17148/ijarcce.2016.5694
7. Isaq, A.S., Dixit, M.R.: Energy optimized data routing in wireless sensor network. IEEE (2015). ISBN 978-1-4799-7678-2/15
8. Jamin, B.S., Chatterjee, M., Samanta T.: Extensive game model for concurrent routing in wireless sensor network. IEEE (2015). ISBN 978-1-4673-7910-6/15
9. Bochem, A., Yuan, Y., Hogrefe, D.: Tri-MCL: synergistic localization for mobile ad-hoc and wireless sensor networks, p. 333. Arne Bochem, Under license to IEEE (2016). https://doi.org/10.1109/lcn.2016.61
10. Bouachir, O., Mnaouer, A.B., Touati, F., Cresciniy, D.: Opportunistic routing and data dissemination protocol for energy harvesting wireless sensor networks. IEEE (2016). ISBN 978-1-5090-2914-3/16
11. Zheng, W., Luo, D.: Routing in wireless sensor network using Artificial Bee Colony algorithm. IEEE (2014). ISBN 978-1-4799-7091-9/14
12. Chakravarthy, V.V.S.S.S., Chowdary, P.S.R., Panda, G. et al.: Arab. J. Sci. Eng. (2017). https://doi.org/10.1007/s13369-017-2750-5

Exploring Causes of Crane Accidents from Incident Reports Using Decision Tree

Krantiraditya Dhalmahapatra, Kritika Singh, Yash Jain and J. Maiti

Abstract Electrical Overhead Traveling (EOT) cranes in manufacturing industries serve the purpose of material handling in complex working environments. The complexity involved in human-machine interaction at the workplace makes it hazard- and incident-prone. In the current study, the decision tree (DT), an emerging data mining technique, is adopted to explore the underlying causes of incidents that happened in the studied plant from 2014 to 2016. The analysis yields interesting results, for example that more incidents happened during construction and maintenance activities and on weekends (Saturday, Sunday). Managerial implications are suggested for improving the safety management system of the studied plant.

Keywords EOT crane · Decision tree · CART · Occupational safety

1 Introduction

Occupational accident prevention is the key to effective safety management in any industry. Exploring the causes and the sequence of events that led to these accidents helps in deploying interventions to prevent further occurrences. Manufacturing industries have seen an increase in occupational accidents owing to the use of high-end machinery with complex working procedures for better productivity.

K. Dhalmahapatra (✉) ⋅ K. Singh ⋅ Y. Jain ⋅ J. Maiti Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India e-mail: [email protected] K. Singh e-mail: [email protected] Y. Jain e-mail: [email protected] J. Maiti e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_18


Design and operational complexity involved with heavy material transfer equipment like EOT cranes makes the workplace susceptible to various hazards and unexpected accidents. The current study focuses on incident investigation data from the slab casting units of an integrated steel plant in India, where EOT crane operations are predominant. Adverse conditions such as working at height, high temperature, poor visibility due to smoke, collisions and electrical malfunctions often lead to incidents (near miss and injury/property damage) in these units. Hence we analyzed 176 crane-related incidents that were recorded in the online safety management system (SMS) as incident reports in the years 2013–2016. These reports capture different attributes of an incident, such as date of incident, time of incident, brief description, incident category, impact, activity type and primary cause. The incident data are recorded by two methods. In the first, regular employees of the organization report the incident by logging in to the SMS directly. In the second, contractor supervisors report the incident manually on a paper template; a soft copy is then sent by e-mail to the safety manager of the concerned section or subsection, who logs the incident in the online SMS. For this study, the authors extracted the incident data from 2014 to 2016 in Excel format to determine the root causes behind the incidents.

Researchers have carried out quantitative and qualitative analyses of crane accidents to extract their major causes. Several studies based on questionnaire surveys sought the major factors affecting tower crane safety; operator proficiency was found to be the top factor leading to many accidents [1–4]. Reference [5] quantified the risk associated with human errors in EOT crane operations using HTA, SHERPA and fuzzy VIKOR. In [6], the chi-square test and proportional analysis were used to identify crane-related hazards from accident data after the introduction of new OSHA rules for crane safety. Quantitative approaches such as data mining have not been explored to their full potential in the field of crane accidents. The emergence of data mining approaches such as support vector machine (SVM), neural network (NN), self-organizing map (SOM) and decision tree (DT) has helped in analyzing accident data in various industries and suggesting preventive measures [7, 8]. The K-means algorithm has been adopted for analyzing risk factors in crane-related accidents and near misses from incident reports. By applying SOM and cluster analysis to accident data, [9] carried out risk assessment and explored common patterns of accidents. A comparative analysis using both multivariate analysis and artificial neural networks (ANN) has been done for the prediction of road accidents, which ultimately helps in road safety management [10]. The decision tree (DT) method is a data mining technique with great potential for analyzing accident data [7]. A DT is a top-down branching structure in which the information contained in a data set is systematically broken down and the relation between the various input and output variables is established. A DT can analyze both categorical and numerical data and can solve both regression and classification problems [11]. The rules generated by a decision tree often help to find behavioral patterns hidden in the accident data set [12].


Fig. 1 Schematic diagram of methodology adopted

This study starts with describing the methodology in Sect. 2 and a detailed schematic diagram for the methodology is shown in Fig. 1. Data pre-processing, Attribute selection, Decision tree and Assessment of rules are given in Sects. 2.1, 2.2, 2.3 and 2.4 respectively. Results and discussion are described in Sects. 3 and 4 respectively. Finally, conclusion of the study is given in Sect. 5.

2 Methodology

2.1 Data Pre-processing

To achieve high accuracy with an algorithm, satisfactory data quality has to be maintained by performing data pre-processing. In our study, we had 202 records in the dataset before pre-processing. The data were cleaned by executing the following processes in sequence. Data reduction. Data reduction reduces the dimensions of the data while maintaining the integrity of the original data. We adopted dimensionality reduction, which removes random attributes or redundant variables to reduce the data space. This was done manually, using domain knowledge to identify the significant attributes.


Duplicate data removal. After reducing the data space with respect to attributes, it is also important to check the data for redundancy at the tuple level. The crane accident data considered in the study were cleaned of duplicate records using Excel. Missing data imputation. Missing values in tuples can be handled in different ways, such as ignoring those tuples, filling in the values manually, or using a global constant to fill all missing values of an attribute. In our case, the attribute “Risk Score” had missing values, which were handled using the EM (expectation maximization) algorithm. Handling missing data was the last step of our data cleaning process. After the pre-processing steps were executed in sequence, we obtained 176 crane accident records with 37 attributes, which met our quality criteria for mining.

2.2 Attribute Selection

Both structured and unstructured data can be exploited with data mining techniques to obtain valuable insights. For our study, however, we considered only structured data. Before applying any mining technique, it is very important that the attributes selected give the most distinctive and clear picture of the stated objective. Since our goal was to study the hidden patterns in the crane accidents and to find the causes of these patterns, we selected 8 attributes as the most distinctive for our objective: Primary cause, Incident Location, Incident Category, Impact, Shift, Day, Month and Activity type.

2.3 Decision Tree

A decision tree is a widely used classification method represented as a tree structure, in which each internal node is a test on an attribute and each branch represents an outcome of that test. The leaf nodes hold the target variable or class label. For a given tuple whose target variable is unknown, its attribute values are tested along a path of the decision tree from the root to a leaf node, which holds the class label prediction for the tuple. To divide each node distinctly, attribute selection is performed using a splitting criterion. The splitting criterion determines the test to be performed at node N so that the node is partitioned in the best way. The measures most commonly used for this purpose in classification are entropy and the Gini index. The Gini index, developed by Corrado Gini in 1912, is used in CART and measures the impurity of a data partition T. The Gini index is given as:


$$\mathrm{Gini}(T) = 1 - \sum_{i} k_i^{2}$$

where $k_i$ is the probability that an arbitrary tuple in T belongs to class $c_i$.

The application of decision trees has broadened over the years and has found a strong base in exploring accident causes and severity. Reference [13] used the CART algorithm to identify the causes and types of traffic accidents in Brazil. Reference [14] used CHAID to predict the severity of bicycle crashes by establishing the relationship between 8 important categorical predictors of crashes and severity. For our purpose we use the CART algorithm to explore the patterns underlying the different incident categories, and use expert knowledge to identify the causes of these patterns.

CART. CART (classification and regression tree) is a cornerstone decision tree algorithm that adopts a greedy approach [15]. It performs well even when no predefined relationship between the independent and dependent variables is known. It is a non-parametric model that establishes the relationship between the independent variables (characteristic variables) and the dependent variable (target variable) [13]. Apart from the Gini index, CART also uses multivariate splits for attribute selection in some cases. A multivariate split considers more than one variable at a time when evaluating the splitting criterion. In our study we used Salford Predictive Modeler to run the CART algorithm with Incident Category as the target variable.

Rules Extraction (RE). Decision rules can be extracted from a decision tree and provide more insight into the data. These rules take the form X => Y, which can be read as IF X happens THEN Y occurs. Here X (the antecedent) is a set of conditions on the predictor variables and Y (the consequent) is the target (dependent) variable. The number of such rules grows with the number of levels in the tree. When the number of rules is large, they can be pruned on the basis of three parameters: the support, confidence and lift of each rule. In our case the total number of rules was limited by the small size of the dataset and the small number of attributes considered, so we were not required to prune the rules.
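The Gini index defined above, and its use as a splitting criterion, can be illustrated with a short sketch. The function names and the four toy rows are hypothetical; the authors ran CART in Salford Predictive Modeler, so this shows only the impurity arithmetic and not their implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(k_i^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(rows, attribute, left_values, target):
    """Impurity after a binary split: rows whose attribute is in left_values go left."""
    left = [r[target] for r in rows if r[attribute] in left_values]
    right = [r[target] for r in rows if r[attribute] not in left_values]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical toy data: AT = activity type, Category = NM (near miss) or I (injury).
rows = [
    {"AT": "Maintenance", "Category": "NM"},
    {"AT": "Maintenance", "Category": "NM"},
    {"AT": "Operation", "Category": "NM"},
    {"AT": "Operation", "Category": "I"},
]
before = gini([r["Category"] for r in rows])                    # impurity of the parent node
after = weighted_gini(rows, "AT", {"Maintenance"}, "Category")  # impurity after the split
```

A greedy tree builder would evaluate every candidate split this way and keep the one with the largest drop from `before` to `after`.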

2.4 Assessment of Rules

The decision rules obtained from the decision tree were then examined manually to design investigative questions. These questions were communicated to a domain expert for an opinion. The purpose of obtaining the expert's opinion was to find the major causes behind the patterns discovered in the accident data. Based on the expert's opinion, we proposed some interventions to prevent these accidents from occurring. The schematic diagram explaining the steps involved in arriving at the causes of accidents is given in Fig. 2.


Fig. 2 Decision tree results

3 Results

It can be observed from the tree that accidents occurring during construction and maintenance lead to near misses, and no other attribute is tested in this case. For operation activity, on the other hand, more distinctive attributes are required to classify tuples. The rules extracted from the tree using conditional operators are given in detail in Table 1. These rules were further studied to obtain investigative questions. The following inferences were drawn from the decision rules and communicated to the expert for an opinion:

(1) Near misses (NM) are more numerous in the case of Construction/Maintenance activity.
(2) During Activity Type (AT) {Operation}, a near miss occurs with Impact = {Fatality/First Aid}, otherwise {Injury/Property damage}.
(3) During Activity Type {Operation}, in the months April, December, January, November and October, a near miss occurs with Impact = {Equipment/Property damage}, otherwise Injury/Property damage (I) (in other months).
(4) Accidents happening on weekends (Fri, Sat, Sun) and at the start of the week (Mon) lead to a near miss, otherwise to Injury/Property damage (on other days).


Table 1 Decision tree rules

RN | TL | Rules (IF) | THEN NM (%) | THEN I (%) | No. of cases | % of cases
--- | --- | --- | --- | --- | --- | ---
1 | 1 | AT = {Construction Site, Maintenance} | 94.9 | 5.1 | 99 | 56.25
2 | 1 | AT = {Operation} | 71.4 | 28.6 | 77 | 43.75
3 | 2 | AT = {Operation} and Impact = {Fatality, First Aid} | 96.7 | 4.3 | 23 | 13.06
4 | 2 | AT = {Operation} and Impact = {Equipment/Property Damage} | 61.1 | 38.9 | 54 | 30.68
5 | 3 | AT = {Operation} and Impact = {Equipment/Property Damage} and Month = {April, December, January, November, October} | 90.5 | 9.5 | 21 | 11.93
6 | 4 | AT = {Operation} and Impact = {Equipment/Property Damage} and Month = {April, December, January, November, October} and Days = {Friday, Monday, Saturday, Sunday, Thursday} | 100 | 0 | 16 | 9.09
7 | 4 | AT = {Operation} and Impact = {Equipment/Property Damage} and Month = {April, December, January, November, October} and Days = {Tuesday, Wednesday} | 60 | 40 | 5 | 2.84
8 | 3 | AT = {Operation} and Impact = {Equipment/Property Damage} and Month = {February, March, May, June, July, August, September} | 42.4 | 57.6 | 33 | 18.75

TL Tree Level, RN Rule Number, AT Activity Type, NM Near Miss, I Injury/Property damage
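The case counts in Table 1 can be cross-checked against the 176 records retained after pre-processing, and the same numbers give a rough view of the support, confidence and lift measures mentioned in Sect. 2.3. The sketch below rests on assumptions: the parent links are inferred from how each IF condition extends another, the NM percentage is read as the rule's confidence, and the baseline near-miss share is estimated from the two first-level rules. None of these computations are reported in the paper.

```python
TOTAL = 176  # records retained after pre-processing

# (rule number, inferred parent rule or None, number of cases, NM %) read from Table 1
RULES = [
    (1, None, 99, 94.9),
    (2, None, 77, 71.4),
    (3, 2, 23, 96.7),
    (4, 2, 54, 61.1),
    (5, 4, 21, 90.5),
    (6, 5, 16, 100.0),
    (7, 5, 5, 60.0),
    (8, 4, 33, 42.4),
]


def support_pct(cases, total=TOTAL):
    """Share of all records covered by a rule, in percent."""
    return 100.0 * cases / total


def children_cover_parent(rules):
    """True if the case counts of each node's children sum to the node's own count."""
    cases = {rn: n for rn, _, n, _ in rules}
    child_sum = {}
    for _, parent, n, _ in rules:
        if parent is not None:
            child_sum[parent] = child_sum.get(parent, 0) + n
    return all(cases[p] == s for p, s in child_sum.items())


def baseline_nm_pct(rules=RULES, total=TOTAL):
    """Overall near-miss share in percent, estimated from the first-level rules."""
    nm_cases = sum(n * nm / 100.0 for _, parent, n, nm in rules if parent is None)
    return 100.0 * nm_cases / total


def lift(nm_pct, rules=RULES):
    """Ratio of a rule's NM confidence to the overall near-miss share."""
    return nm_pct / baseline_nm_pct(rules)


for rn, _, n, nm in RULES:
    print(rn, round(support_pct(n), 2), round(lift(nm), 2))
```

A lift above 1 marks a rule whose near-miss share exceeds the overall share; rules with low support or a lift close to 1 would be the natural candidates for pruning if the rule set were larger.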

These results were shared with the expert, whose opinion helped us understand the causes of these patterns. The causes given by the expert and the interventions we suggest are discussed in the Discussion section.

4 Discussion

Safety experts gave their suggestions after going through the results. Maintenance and construction activities involve more manual work and a larger workforce; hence near-miss cases are more frequent in these activities. In contrast, operation activities involve more automation and less workforce involvement, so the probability of near misses is lower. Crane movement happens primarily during "Operation"; it is therefore to be expected that hit/fall incidents occur, which (mostly) lead to fatality or first aid. During maintenance activities, by contrast, the equipment is static, so the probability of incidents with impact is lower. In the studied plant, December to April are the months when production pressure is highest in order to achieve the requisite turnover; maintenance work is carried out afterwards. Incidents are therefore expected to be more frequent in these months and less frequent during maintenance work. A lethargic approach by operators on Saturday and Sunday, owing to the absence of senior authorities, and Monday being the first day of the week, lead to a larger number of incidents.

5 Conclusion

The current study focused on exploring the underlying causes of accidents that occur frequently in sections of a specific steel plant where the use of EOT cranes is predominant. Most of the incidents are due to operator negligence or improper supervision at the workplace. Stringent safety guidelines, regular monitoring of supervisors' activities, adequate training in the standard operating procedures (SOPs) to be followed, and the use of proper personal protective equipment (PPE) while working are some managerial implications that can help plant management improve workplace safety and protect people from accidents.

References

1. Zhao, C., Zhang, J., Zhong, X., Zeng, J., Chen, S.: Analysis of accident safety risk of tower crane based on fishbone diagram and the analytic hierarchy process. Appl. Mech. Mater. 127, 139–143 (2012)
2. Wang, Q., Xie, L.: Safety analysis of tower crane based on fault tree. Appl. Mech. Mater. 163, 66–69 (2012)
3. Bao-Chun, C., Jian-Guo, C.: Fuzzy AHP-based safety risk assessment methodology for tower crane. J. Appl. Sci. 13(13), 2598–2601 (2013)
4. Shin, I.J.: Factors that affect safety of tower crane installation/dismantling in construction industry. Saf. Sci. 72, 379–390 (2015)
5. Mandal, S., Singh, K., Behera, R.K., Sahu, S.K., Raj, N., Maiti, J.: Human error identification and risk prioritization in overhead crane operations using HTA, SHERPA and fuzzy VIKOR method. Expert Syst. Appl. J. 42(20), 7195–7206 (2015)
6. Cho, C., Boafo, F., Byon, Y., Kim, H.: Impact analysis of the new OSHA cranes and derricks regulations on crane operation safety. KSCE J. Civ. Eng. 21(1), 54–66 (2016)
7. Mistikoglu, G., Gerek, I.H., Erdis, E., Mumtaz Usmen, P.E., Cakan, H., Kazan, E.E.: Decision tree analysis of construction fall accidents involving roofers. Expert Syst. Appl. 42(4), 2256–2263 (2015)
8. Raviv, G., Fishbain, B., Shapira, A.: Analyzing risk factors in crane-related near-miss and accident reports. Saf. Sci. 91, 192–205 (2017)
9. Moura, R., Beer, M., Patelli, E., Lewis, J., Knoll, F.: Learning from accidents: interactions between human factors, technology and organisations as a central element to validate risk studies. Saf. Sci. 99, 196–214 (2017)
10. De Luca, M.: A comparison between prediction power of artificial neural networks and multivariate analysis in road safety management. Transport 32(4), 379–385 (2017)
11. Konda, R.: Predicting Machining Rate in Non-Traditional Machining using Decision Tree Inductive Learning. Nova Southeastern University (2010)
12. De Oña, J., López, G., Abellán, J.: Extracting decision rules from police accident reports through decision trees. Accid. Anal. Prev. 50, 1151–1160 (2013)
13. da Cruz Figueira, A., Pitombo, C.S., de Oliveira, P.T.M.e.S., Larocca, A.P.C.: Identification of rules induced through decision tree algorithm for detection of traffic accidents with victims: a study case from Brazil. Case Stud. Transp. Policy 5(2), 200–207 (2017)
14. Prati, G., Pietrantoni, L., Fraboni, F.: Using data mining techniques to predict the severity of bicycle crashes. Accid. Anal. Prev. 101, 44–54 (2017)
15. Han, J., Kamber, M., Pei, J.: Data Mining: Concepts and Techniques (2011)

A Novel Controlled Rectifier to Achieve Maximum Modulation Using AC-AC Matrix Converter with Improved Modulation

K. Bhaskar and Parvathi Vijayan

Abstract This paper compares the conventional rectifier with the matrix rectifier (MR). It is well known that a conventional rectifier draws non-sinusoidal current from the supply and reduces the efficiency of the system. A PWM-controlled rectifier offers the advantages of reduced low-order harmonics and unity power factor control when compared with a conventional thyristor converter. However, PWM switching schemes are often hard to implement physically and hard to extend to regenerative operation. This paper proposes an alternative PWM strategy based on AC-AC matrix converter (MC) theory. By improving the modulation index (voltage ratio), the switching produces only higher-order harmonics, which can be filtered out easily with a suitably designed filter. The rectifier achieves unity power factor by drawing sinusoidal input current from a variable supply such as a renewable energy source. Four-quadrant and regenerative operation is made easy by a center-tapped DC source. The proposed rectifier can be implemented physically using a modified Venturini algorithm, which gives a real-time controlled dual DC output (positive and negative DC voltages). The results are presented and verified with a MATLAB Simulink model.

Keywords MC (Matrix converter) · Modulation · AC-AC converter · Controlled rectifier · MR (Matrix rectifier)

K. Bhaskar (✉) ⋅ P. Vijayan Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai 600062, India e-mail: [email protected] P. Vijayan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_19


1 Introduction

The conventional thyristor converter produces its output voltage by drawing non-sinusoidal input current with low-order harmonics and a lagging power factor, whereas the PWM-based rectifier gives a ripple-free DC voltage with improved power factor and fewer harmonics. The performance of PWM techniques can be increased by selecting the right modulation strategy. The demerit of this type of modulation strategy is that it requires more switches and the algorithm is difficult to implement in real time. This leads to the concept of the matrix converter (AC-AC), which requires fewer switches and gives less distortion in the output voltage. In this paper the conventional thyristor converter is replaced by a matrix rectifier, which performs AC-DC conversion from a variable input supply, rather than AC-AC conversion, by using the Venturini algorithm with the modulation index improved from 0.866 to 1. This reduction of the AC-AC converter to an AC-DC converter is achieved by setting the output frequency to zero, leaving one output phase disconnected and allowing DC current to flow in the other two phases. The converter can therefore easily be made to work in regeneration mode with unity power factor control.

2 Demerits of Conventional Rectifier

The conventional rectifier and the thyristor-based converter have some disadvantages which limit their use in real-time applications [1]:

• They draw non-sinusoidal input currents
• The power factor decreases
• Low-order harmonic content increases
• The efficiency of the system decreases gradually because of the non-sinusoidal currents drawn
• The attainable voltage transfer ratio $(v_o / v_{in})$ is limited to 0.5–0.866
• Regeneration or four-quadrant operation is difficult to attain
• The PWM switching algorithm is difficult to implement.

3 Need for Matrix Rectifier

The matrix converter (AC-AC) can be reduced to an AC-DC matrix rectifier, which overcomes the above-mentioned problems by improving the modulation strategy of the PWM technique with a switching theory called the Venturini switching algorithm [2]. It has been shown that matrix converters using PWM converters with inverter switching schemes are hard to implement in a real-time controller. The AC-AC matrix converter has some physical limits on the achievable output voltage: to control the output voltage, the output voltage envelope must be contained within the envelope of the input voltage. A new switching sequence using Venturini switching techniques is incorporated to increase the voltage transfer ratio, or modulation index, from 0.866 to 1. Reducing the AC-AC matrix converter to a controlled AC/DC rectifier yields three different output phase voltages (maximum, minimum and zero) owing to the center-tapped operation of the reduced converter model [3, 4]. This modification of the matrix converter overcomes the problems of conventional rectifiers: it draws sinusoidal currents and attains unity power factor, with only higher-order harmonics.

3.1 Merits of Matrix Rectifier over the Conventional Rectifiers

The matrix rectifier overcomes the problems of conventional rectifiers and PWM-based converters in the following ways:

• It maintains unity power factor
• It generates higher-order harmonics, which are easy to filter out
• It draws sinusoidal input current
• The voltage transfer ratio can reach 1
• Regeneration operation is attained
• Reactive energy storage components are not required in this topology
• The new Venturini switching theory is used for implementation.

3.2 Circuit for AC-AC Matrix Converter

The matrix converter is mostly used for AC-AC conversion only. In this paper the matrix converter is reduced to an AC-DC matrix rectifier [5]. This is achieved by setting the desired output frequency to zero and leaving one phase disconnected, which allows DC current to flow through a load connected across the other two phases (Fig. 1).

Fig. 1 Structure of three-phase AC-AC matrix converter

3.3 Reduction of Matrix Converter as Rectifier

To operate the matrix converter as a controlled rectifier, the output frequency is set to zero and the output voltage angle to 30° [2]. This arrangement gives three possible output voltages: maximum, zero and minimum. By incorporating the new switching strategy,

$$[V_o(t)] = \begin{bmatrix} V_o \cos 30^\circ + \dfrac{V_o}{2\sqrt{3}} \cos 3\omega_i t \\[4pt] 0 + \dfrac{V_o}{2\sqrt{3}} \cos 3\omega_i t \\[4pt] -V_o \cos 30^\circ + \dfrac{V_o}{2\sqrt{3}} \cos 3\omega_i t \end{bmatrix}$$

and the matrix converter modulation function [m(t)] reduces to (Fig. 2)

$$[m(t)] = \begin{bmatrix} m_{11}(t) & m_{12}(t) & 1 - m_{11}(t) - m_{12}(t) \\ m_{21}(t) & m_{22}(t) & 1 - m_{21}(t) - m_{22}(t) \\ m_{31}(t) & m_{32}(t) & 1 - m_{31}(t) - m_{32}(t) \end{bmatrix}$$

Fig. 2 Operation of a matrix converter as a DC rectifier
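A short numerical sketch may help in reading the two expressions of this section. It assumes the reconstructed form of the output voltage vector (a maximum, a zero and a minimum row that share the same third-harmonic term in the input frequency) and the row structure of the reduced modulation matrix. The function names and numeric values are hypothetical, and this is not the authors' Simulink model.

```python
import math

COS30 = math.cos(math.radians(30.0))


def output_voltages(v_o, omega_i, t):
    """Target output phase voltages for zero output frequency and a 30 degree angle."""
    # Third-harmonic term in the input frequency, common to all three rows.
    common = v_o / (2.0 * math.sqrt(3.0)) * math.cos(3.0 * omega_i * t)
    return (v_o * COS30 + common, common, -v_o * COS30 + common)


def complete_modulation_row(m_a, m_b):
    """Third entry of a modulation-matrix row, fixed by the row summing to one."""
    m_c = 1.0 - m_a - m_b
    if min(m_a, m_b, m_c) < 0.0:
        raise ValueError("duty cycles must be non-negative")
    return (m_a, m_b, m_c)


# Hypothetical values: unit amplitude, 50 Hz input, a few sample instants.
omega_i = 2.0 * math.pi * 50.0
for t in (0.0, 0.001, 0.004):
    v_max, v_zero, v_min = output_voltages(1.0, omega_i, t)
    print(round(v_max - v_min, 6), round(v_max - v_zero, 6))
```

Because the common term cancels in any difference between two rows, the voltage between the maximum and minimum outputs stays constant at 2 V_o cos 30° = √3 V_o, and each half measured against the zero row is V_o cos 30°, which is consistent with the dual (positive and negative) DC output described in the text.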


Fig. 3 Dual-supply, bidirectional AC-DC matrix rectifier

4 Different Topology

The different topologies that can be obtained with this modulation are [6]:

• Single-ended, bidirectional AC-DC matrix converter
• Single-ended, unidirectional AC-DC matrix converter
• Dual-supply, bidirectional AC-DC matrix converter
• Dual-supply, unidirectional AC-DC matrix converter (Fig. 3).

5 Simulation

MATLAB Simulink is used to simulate the matrix converter as a reduced, controlled AC-DC rectifier. Both the theory and the physical simulation results are presented in the paper [7–9]. The simulation is used to confirm the converter operation under different operating conditions and with different switching frequencies, so as to obtain an improved modulation index and a dual DC supply voltage (Figs. 4 and 5).

Fig. 4 Matrix rectifier

Fig. 5 Output waveform of matrix rectifier as dual DC voltage

6 Conclusion

Conventional rectifiers and PWM converters draw non-sinusoidal currents, which damage the source system and reduce its efficiency. The power factor falls, low-order harmonics are generated, and an energy storage component is required to obtain regeneration operation. To overcome these problems, the AC-AC matrix converter is reduced to an AC-DC converter with dual DC output, using improved modulation and a switching strategy called Venturini switching. This modulation of the matrix converter leads to:

(1) Improvement of the voltage transfer ratio up to a maximum of 1
(2) Regeneration without any energy storage components
(3) Dual DC output voltages (maximum, zero and minimum), obtained by center-tapped operation of the MC
(4) High-order harmonics, which can be eliminated easily
(5) A power factor improved to unity, with the MC drawing sinusoidal input currents.


References

1. Saravanan, P., Sureshkumar, R., Ramanujam, M., Arumugam: Comprehensive analysis of Z source DC/DC converters for DC power supply systems. IEEE Trans. Ind. Electron. (2012)
2. Sunitha, P.V., Thomas, V.: A Novel DC motor drive using single phase AC-DC matrix converter. In: Proceedings of 2nd National Conference on Emerging Trends in Engineering, NET2014, ISBN 978-93-83459-67-4, Bonfring (2014)
3. Mahendranand, N., Dr. Gurusamy, G.: THD analysis of matrix converter fed load. In: IEEE Eighth International Conference on Power Electronics and Drive Systems (PEDS2009), Taiwan, pp. 829–832 (2009)
4. Watthanasarn, C., Zhang, L., Liang, D.T.W.: Analysis and DSP based implementation of modulation algorithms for AC to AC matrix converter, pp. 1053–1058. IEEE (1996)
5. Sneha, E.V.S., Sajin, M.: Design and simulation of single phase matrix converter as a universal converter. Int. J. Eng. Res. Technol. (IJERT) 2(11), 1472–1474 (2007)
6. Venkatasubramanian, D., Natarajan, S.P., Baskaran, B., Suganya, S.: Dual converter controlled single phase matrix converter fed DC drive. ARPN J. Eng. Appl. Sci. 7(6), 672–679 (2012)
7. Kolar, J.W., Schafmeister, F., Round, S.D., Ertl, H.: Novel three-phase AC–AC sparse matrix converters. IEEE Trans. Power Electron. 22(5), 1649–1661 (2007)
8. Wheeler, P.W., Rodríguez, J., Clare, J., Empringham, L., Weinstein, A.: Matrix converters: a technology review. IEEE Trans. Ind. Electron. 49(2), 276–289 (2002)
9. Imayavaramban, M., Chaithanya, A.V.K., Fernandes, B.G.: Analysis and mathematical modeling of matrix converter for adjustable speed AC drives. In: Proceedings of IEEE Power System Conf. Expo, pp. 1113–1120 (2006)
10. Sing, G.K.: A research survey of induction motors operation with non-sinusoidal supply waveforms. Electric Power Syst. Res. (Elsevier) 75, 200–213 (2005)

Service Quality Parameters for Social Media-Based Government-to-Citizen Services

Sukhwinder Singh, Anuj Kumar Gupta and Lovneesh Chanana

Abstract As the digital delivery of Government services to citizens moves swiftly towards maturity, there is a need for multiple channels of service delivery. In today's ICT environment, social media applications have become one of the most commonly used and preferred platforms. Massive growth in the number of Internet users and large-scale social media penetration in India have opened up the potential for using social media in Government service delivery. Social media is steadily being adopted by Governments for delivering services to citizens, and some state and central Government organizations in India have started offering a few services over social media. The quality parameters of such social media-based service delivery will affect both the attraction of new users and the retention of existing users. It is therefore important to identify and prioritize the parameters that influence the quality of social media-based Government-to-Citizen services. This paper is based on an exploratory study to identify service quality parameters in the Indian state of Punjab. It highlights the outcomes of an online survey of Indian e-Government experts on the quality parameters for social media-based e-Government services in the state of Punjab.



Keywords e-Government · Social media · Government to citizen services · Quality parameters

S. Singh (✉) IKG Punjab Technical University, Kapurthala, India e-mail: [email protected] A. K. Gupta Chandigarh Group of Colleges, Mohali, India e-mail: [email protected] L. Chanana SAP India Private Limited, Bengaluru, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_20


1 Introduction

e-Governance may be simply defined as the use of Information and Communication Technology (ICT) in services provided by the Government to different stakeholder groups. The key objective of e-Governance is to deliver Government services to the doorstep of citizens on an anywhere-anytime basis. As e-Governance initiatives have gained acceptance all around and are reaching maturity in India, there is a demand for more channels to improve the service delivery mechanism. Correspondingly, the availability and usage of social media applications are rising exponentially. Social media may be defined as a set of online tools designed mainly for social interaction. Examples include Facebook, Twitter, Instagram, LinkedIn, Pinterest, YouTube, Wikipedia and WhatsApp. The pioneering initiative of the Indian Government is the flagship programme called Digital India, conceptualized as a holistic programme to transform India into a digitally empowered society and knowledge economy. The Digital India programme (www.digitalindia.gov.in) is framed on nine significant pillars, of which the use of social media for two-way communication between Government and citizens is one important facet. As initiatives at the central and state Government levels start to include social media in their service delivery, it is important to explore the quality parameters for such service delivery so that acceptance by citizens can be enhanced. Identifying these quality factors can help Government departments focus on the quality dimensions of the service delivery mechanism for social media-based e-Government services. A study in one state may also prove useful as a reference for other states with similar social, economic and cultural parameters. The geographical scope of the paper is limited to the state of Punjab in north India.

The paper is organized as follows: the next section covers the literature review on service quality factors for social media-based e-Government services as identified globally and, specifically, for India including the state of Punjab; the literature review is followed by a section highlighting the motivation for the research; the next section presents the methodology, followed by the results, findings and conclusions of the study. The last section highlights the scope for future research based on this study.

2 Literature Review

Many researchers and experts from different parts of the world have identified service quality parameters for e-Government and social media in their countries and states. Agourram [1] concluded that social media has gained popularity owing to factors such as ubiquity (availability of services at all times and places) and interactivity (interaction with a large community in very little time). ALotaibi et al. [2] identified reliability/access to the right information, ubiquity and interactivity as the important parameters influencing the service quality of social media-based delivery of Government services. Wang and Sinnott [3] regarded data privacy and data utility as the most important service quality parameters. Bonson et al. [4] identified factors such as fast navigation and relevance of content as influencing the quality of the social media-based service delivery mechanism of Governments in Europe. Dong [5] explored different aspects of social media-based delivery of services in Minnesota and identified important service quality parameters such as transparency in actions, accountability in case of failures and technological support. Mandal and McQueen [6] believed that easy access and quality parameters had a great impact on the acceptance of social media applications by users. Henrique et al. [7] conducted a survey in the United States as part of their research and highlighted technology capacity/management parameters, such as the quality of the social media website and application, as affecting the quality of social media in the service delivery mechanism of the US Government. According to Stowe [8], access to the correct account and correct information were the main quality factors for social media applications. Agarwal et al. [9] studied various dimensions of the service quality of e-Services and proposed a quality measurement instrument, EGOSQ, with seven categories: information, interaction, integration, accessibility, emotional engagement, active service recovery and assurance. Agarwal et al. [9] also presented a consolidated picture of the service quality parameters for e-Services identified by different researchers and experts in diverse countries and states, as shown in Table 1.

Zmud et al. [10] believed that policies for quality must ensure the relevance, accuracy, reliability and timeliness of information. Wu et al. [11] identified accuracy, diversity and novelty as the service quality parameters for social media. Lee and Kim [12] found that the strength of social networks, ease of use and social altruism have a great impact on the quality of social media in Government activities in OASIS. Landsbergen [13] explored factors such as the relevance, quality and trustworthiness of information that affect the quality of social media-based e-Government services in Columbus, Ohio, USA. Olumuyiwa Dele [14] identified technical efficiency, usability, real-time updating, privacy and security concerns, graphical layout and community connectedness, among others, as the most important service quality parameters for social media applications. Ampofo [15] believed that privacy and trust are the most important factors for social media service quality. Ellahi and Bokhari [16] conducted a survey of key quality factors of social networking websites and found efficiency, entertainment, community drivenness, privacy, user friendliness and navigability to be the most significant quality factors. Zhang and Gupta [17] concluded that service quality factors such as security, privacy and trustworthiness are of great importance in this regard. Chanana et al. [18] explored various service quality parameters for mobile-Government services in


Table 1 Proposed dimensions of e-service quality given by Agarwal et al. [9]

Study | Proposed dimensions
--- | ---
Kaynama and Black (2000) "E-QUAL" | 1. Responsiveness, 2. Content and purpose, 3. Accessibility, 4. Navigation, 5. Design and presentation, 6. Background, 7. Personalization and customization
Zeithaml et al. (2001) | 1. Reliability, 2. Responsiveness, 3. Access, 4. Flexibility, 5. Ease of navigation, 6. Efficiency, 7. Assurance/trust, 8. Security/privacy, 9. Price knowledge, 10. Site aesthetics, 11. Customization/personalization
Liljander et al. (2001) | 1. User interface, 2. Responsiveness, 3. Reliability, 4. Customization, 5. Assurance
Loiacono et al. (2000) "WEBQUAL" | 1. Information fit to task, 2. Interaction, 3. Trust, 4. Response time, 5. Design, 6. Intuitiveness, 7. Visual appeal, 8. Innovativeness, 9. Flow (emotional appeal), 10. Integrated communication, 11. Business processes, 12. Substitutability
Lin and Wu (2002) | 1. Information content, 2. Customization, 3. Response rate
Zeithaml (2002) | 1. Core e-SQ, 2. Efficiency, 3. Reliability, 4. Fulfillment, 5. Privacy, 6. Recovery-SQ, 7. Responsiveness, 8. Compensation, 9. Contact
Li et al. (2002) | 1. Tangibility, 2. Reliability, 3. Responsiveness, 4. Integration of communication, 5. Assurance, 6. Quality of information, 7. Empathy
Yang et al. (2004) | 1. Reliability, 2. Responsiveness, 3. Competence, 4. Ease of use, 5. Product portfolio, 6. Security
van Riel et al. (2004) | 1. Usability, 2. E-Scape design, 3. Customization, assurance, 4. Responsiveness
Yoo and Donthu (2001) "SITE-QUAL" | 1. Ease of use, 2. Processing speed, 3. Aesthetic design, 4. Interactive responsiveness
Zeithaml et al. (2005) "e-SQUAL" | 1. Website design, 2. Customer service, 3. Reliability, 4. Privacy
Agarwal et al. [9] | 1. Information, 2. Interaction, 3. Integration, 4. Access, 5. Corporate image, 6. Emotional engagement, 7. Active service recovery, 8. Assurance

Source Agarwal et al. [9]

India and concluded that these parameters were in line with the e-Government service quality parameters like Privacy, getting things done in the expected time frame, getting things done right the first time, fast navigation and availability of mobile services all the time, etc. Picazo et al. [19] concluded that quality factors like reliability of information, updated information, easy access to data, reliability of network and security, etc., influenced the use of social media in the public sector in Mexico.


Khan and Rahim [20] conducted a survey in Malaysia to explore service quality factors for social media-based Government services, such as credibility/reliability of the social media platform, access to the right information and design/layout quality. Patel and Jacobson [21] identified the need to redefine the service delivery mechanism to cope with the challenge of the continuously increasing population of India. Kalsi and Kiran [22], in “A strategic framework for good governance through e-governance optimization: A case study of Punjab in India”, argued that the service delivery mechanism for e-Government should be of high quality, so there is a need to explore the parameters that influence the quality of social media-based Government services. Sargunam and Vinod [23], in “Use of Social Media in Effective Implementation of e-Governance in India”, discussed how social media has the potential to change the style of governance in the present era. Banday and Matoo [24], in “Social Media in e-Governance: A Study with Special Reference to India”, discussed various issues that hinder the successful implementation of social media in e-Governance. Smith and Gallicano [25] identified peer linking, online interaction and crowdsourcing influence as significant service quality factors.

3 Research Objectives

Researchers indicate that the use of social media in e-Governance is still at a nascent stage, so there is a need to identify the service quality parameters that have a substantial impact on it. No research appears to have been carried out to identify and analyze the service quality factors for social media-based e-Government services specifically for India or for states within India. The specific research objectives are:

• To identify e-Government service quality parameters that can be considered for assessing the quality of social media-based Government services.
• To rank the parameters on which the quality of social media-based Government services can be measured.
• To identify the quality parameters which are most significant for social media-based Government service delivery in the Indian state of Punjab.

4 Methodology

This exploratory study has been carried out to draw on the expertise of professionals actively involved in the domain of e-Governance and social media in Government, Information Technology companies, academia, consulting and research organizations


etc. The study is based on an expert survey to identify and prioritize the quality factors for social media-based Government to Citizen (G2C) services in the state of Punjab. As an initial step, the study uses e-Government service quality factors identified globally by researchers and experts for other countries and states. The purpose of choosing these factors is twofold: first, to get a better idea of all the service quality factors identified globally and prepare a consolidated list, and second, to check whether the service quality parameters for e-Government are in line with the quality parameters for social media-based e-Government services in the state of Punjab. After a thorough survey of the literature, the set of e-Government service quality factors identified in “EGOSQ-User’s Assessment of e-Governance Online Services: A Quality Measurement Instrumentation” by Agarwal et al. [9] was found to be the most comprehensive, covering nearly all the dimensions and the work done by other researchers. The literature related to the following has been reviewed:

• service quality factors for e-Government
• service quality factors for social media (Fig. 1).

Based on the initially identified factors, the next phase of the research was a web-based survey inviting professionals and experts to respond to a questionnaire. The authors opted for a web-based survey because of the geographical spread of the respondents. The invited experts were requested to provide their basic and demographic information along with contact details for future reference.

4.1 Primary Data

The e-Government experts from almost all the domains were approached through an online questionnaire for getting the responses to survey questions. For the expert group from academics and researchers, the experts with at least one peer reviewed

Fig. 1 Service quality parameters for e-government and social media. Diagram labels: e-Government service quality parameters; social media service quality parameters; quality parameters for social media-based Government-to-Citizen service delivery mechanism


journal or conference publication regarding e-Government/social media were approached. Technical solutioning experts and business development professionals were targeted from information technology companies. Officials of the rank of Secretary, Joint Secretary, Director, Head of Department etc. were approached as experts from Government. Partners of consulting organizations were taken from Consultancy/Advisory sector. Heads of NGOs and project directors of electronic governance based projects funded by the multinational organizations were targeted from these organizations.

4.2 Sampling Technique

The exploratory research is based on stratified purposive sampling, with the target population consisting of expert groups of Government officials, academics, researchers, NGOs, consultants, IT industries, application developers, etc. Academic databases, reputed journals and publications were consulted to identify expert groups from academia and research. The experts from Government were drawn from the civil list and the directory of officials of the Ministry of Communication and Information Technology, the Department of Governance Reforms and the Department of Administrative Reforms and Public Grievances (DARPG). Stratified purposive sampling was chosen to obtain in-depth, quality responses from the invited experts. The 64 respondents who completed the survey included Government officials, IT and Telecom professionals, academicians, researchers and consulting professionals.

4.3 The Survey Instrument

An online web-based questionnaire was used as the survey instrument. Around 150 experts were requested to participate in the survey through a web link on www.surveymonkey.com. An e-mail containing the web link was sent to the target respondents. The survey questionnaire comprised eight questions in total. The first three questions asked for basic details such as name, gender and age. The fourth to sixth questions collected details such as sector/domain, designation and organization. The next question presented a set of 17 service quality factors to be ranked from 1 to 17. The experts had to rank the factors that may influence the quality of social media-based e-Government (G2C) service delivery in order of their impact/importance for the state of Punjab, with rank 1 denoting the most important factor and rank 17 the least important. The last question was open ended, allowing experts to suggest any other relevant quality parameter not included in the given set.


5 Survey Results

Experts involved in e-Governance from different fields such as Government, IT industries, consulting companies, academia, research, telecom and non-Government organizations were targeted to collect responses.

5.1 Profile of Respondents

Nearly 33% of the respondents belong to the age group of 35–44, followed by about 31% in the age group of 45–54. These percentages indicate that most of the respondents are senior and experienced. 23% of the respondents fall in the age group of 25–34, while 13% belong to the age group of 55–64. In terms of gender, 92% of the respondents are male while 8% are female. All the respondents are from India (Figs. 2 and 3). Figure 4 depicts the sector-wise distribution of the respondents. The largest share of the respondents, 42%, belongs to the Government sector, while 25% are from the IT industry, followed by 10% from Consultancies/Advisory. Respondents from Academia constitute 11%, Research 7%, Telecom 3% and other sectors 2%. Overall, the distribution reflects a healthy mixture of almost all the sectors expected to be involved in e-Governance.

5.2 Experts’ Ranking of Identified Factors

Fig. 2 Age profile of respondents

Fig. 3 Gender profile of respondents

Fig. 4 Sector-wise distribution of experts

As discussed above, the experts were presented with a set of 17 factors identified by researchers and scholars globally and requested to rank the factors in order of their impact under the social, economic and cultural conditions of the state of Punjab. Rank one denotes the most important parameter, having the highest impact on the quality of social media-based G2C services, while rank seventeen denotes the least important factor, having very little impact on the same. The experts’ responses were analyzed and the average rank was calculated for each quality parameter (Table 2).
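The averaging step described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' analysis script: the function name `average_ranks`, the three toy responses and the shortened factor labels are all hypothetical.

```python
def average_ranks(responses):
    """Average each factor's rank over all experts and order the factors.

    responses: list of dicts mapping factor name -> rank given by one expert
               (1 = most important). All dicts are assumed to share the same keys.
    Returns a list of (relative_importance, factor, average_rank) tuples.
    """
    factors = list(responses[0].keys())
    averages = {
        name: sum(r[name] for r in responses) / len(responses) for name in factors
    }
    # A lower average rank means a higher relative importance.
    ordered = sorted(averages.items(), key=lambda item: item[1])
    return [
        (position, name, round(avg, 2))
        for position, (name, avg) in enumerate(ordered, start=1)
    ]


# Hypothetical toy data: three experts ranking three factors.
toy_responses = [
    {"Timeliness": 1, "Privacy": 2, "Ease of use": 3},
    {"Timeliness": 2, "Privacy": 1, "Ease of use": 3},
    {"Timeliness": 1, "Privacy": 3, "Ease of use": 2},
]

for position, name, avg in average_ranks(toy_responses):
    print(position, name, avg)
```

With 64 respondents and 17 factors, the same function would take 64 dictionaries of 17 entries each; ties in the average would need an explicit tie-breaking rule, which the paper does not describe.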

5.3 Analysis of Results

As per the average ranks of the experts’ responses, the top five quality parameters having comparatively higher importance for social media-based Government-to-Citizen service delivery in the state of Punjab are:

1. Getting things done in the expected time frame
2. Privacy - protection of personal information
3. Getting things done right the first time
4. Ease of use of applications
5. Getting “updated” information through the application.


Table 2 Average ranks of the identified service quality factors

| S.no | Service quality parameters | Average rank | Relative importance |
|---|---|---|---|
| 1 | Getting things done in the expected time frame | 4.58 | 1 |
| 2 | Privacy - protection of personal information | 4.91 | 2 |
| 3 | Getting things done right the first time | 5.29 | 3 |
| 4 | Ease of use of applications | 6.25 | 4 |
| 5 | Getting ‘updated’ information through the application | 6.38 | 5 |
| 6 | Getting ‘useful’ information through the application | 6.51 | 6 |
| 7 | Fast navigation through applications without jams | 7.08 | 7 |
| 8 | Availability of social media based G2C services at all days and at all times | 7.40 | 8 |
| 9 | Financial security of online transactions | 9.45 | 9 |
| 10 | Transparency in actions on applications | 9.92 | 10 |
| 11 | A ‘wide’ range of services through applications | 10.23 | 11 |
| 12 | Easy to retrieve and use applications | 10.94 | 12 |
| 13 | Getting ‘reliable’ information | 11.28 | 13 |
| 14 | Customization according to needs | 11.63 | 14 |
| 15 | Accountability in case of service failure | 12.48 | 15 |
| 16 | Availability of online contact information of key stakeholders in case required | 13.86 | 16 |
| 17 | Opportunity to provide online interaction with other users | 14.66 | 17 |

Moreover, other equally important service quality parameters in next order of their priority are “Getting useful information through the application” and “Fast navigation through applications without jams”.

6 Findings

This paper has highlighted the outcomes of an online web-based survey of experts regarding the parameters that influence the quality of social media-based e-Government (G2C) service delivery in the state of Punjab. The survey results depicted in Fig. 5 indicate the following findings: “Getting things done in the expected time frame” has been ranked as the most important service quality factor. This is consistent with the expectation that e-government services are delivered in the expected time. The ranking is important for researchers because social media-based government service delivery may depend on various characteristics of social media, which is still developing, and thus it is important that the proposed application and functional design be robust.


Fig. 5 Graphical representation of results (average ranks)

“Privacy-protection of personal information” has been ranked as the next most significant service quality factor. The experts perceive that, as the Government has started delivering services over social media platforms, privacy of personal information assumes great significance. This perception may be attributable to social connectedness and the sharing of information with a very large community. “Getting things done right the first time” has been ranked as the third most important factor, which is also in line with the service quality factors for e-Government. This may be attributed to the importance the experts give to getting the information or transaction processed correctly within the application itself, so that users can avail themselves of the service right the first time. Repeated requests for the same service could reduce user satisfaction. “Ease of use of applications” has been ranked as the fourth most significant service quality factor. This can be attributed to the need for easy navigation and management of other application features over social media. Similarly, “Getting updated information through the application” has been ranked as the next most relevant parameter influencing the quality of the social media-based Government-to-Citizen service delivery mechanism. Besides the above five parameters, “getting useful information”, “fast speed of navigation”, “availability of social media-based G2C services at all times”, “security of financial transactions” and “transparency in actions on applications” have been identified as significant service quality factors. It is relevant to mention that the above survey was conducted as an exploratory survey (done with the same set of experts using the same questionnaire) to explore the factors affecting use of social media in e-Governance in Punjab.
The outcomes explained above have been found to be linked to the identified factors for successfully using social media in service delivery mechanism of the Government. The experts identified “IT literacy of Stakeholders”, “Awareness and Motivation of users”, “Acceptance by users”, “Government framework and strategy for using social media in e-Governance” and “e-Participation” as the most important factors having higher impact on use of social media in e-Governance and therefore the


service quality parameters identified above and related to these very factors have also been acknowledged by experts as important for the state of Punjab. Some of the respondents suggested a few more relevant service quality factors to be considered, which they felt were important but not contained in the above list. The suggested quality parameters for social media-based service delivery are “tracking the status of service up to completion of the task”, “ease of verification of documents over social media” and “availability of a ready action plan in case of any rumor/wrong information circulation”. We argue that ease for verification of documents over social media can be seen as a part of “ease of use of applications”. Similarly, tracking the status of service up to completion of the task can be seen as a part of “getting updated information through the application”. Moreover, availability of a ready action plan in case of any rumor/wrong information circulation can be seen as a part of “getting reliable information”.

7 Conclusion

The quality parameters significant for e-government services can also be seen as applicable to social media-based delivery of Government services. The findings of the study reveal that the most significant quality parameters for social media-based G2C services in the state of Punjab are “getting things done in the expected time frame”, “privacy”, “getting things done right the first time”, “ease of use of applications” and “getting updated information through the application”.

8 Scope for Future Work

The findings of this exploratory research may assist the State Government of Punjab in designing suitable strategies that take into account the quality parameters for delivery of social media-based Government services. The results of this survey can be used to identify sub-factors under each of the identified service quality parameters. The relation between the service quality parameters identified above and the factors affecting the use of social media in e-Governance can be another area for research. The study can also be extended to other states.

Acknowledgements The authors thank all the e-Government experts who participated in the online survey for their inputs and time, and IKG Punjab Technical University, Kapurthala (India).


References

1. Agourram, H.: The impact of national culture on online social network usage and electronic commerce transactions. Eur. Sci. J. 9–19 (2013)
2. ALotaibi, R.M., Ramachandran, M., Kor, A., Hosseinian-Far, A.: Factors affecting citizens’ use of social media to communicate with the government: a proposed model. Electr. J. e-Gov. 14(1), 60–72. www.ejeg.com (2016). ISSN 1479-439X
3. Wang, S., Sinnott, R.O.: Protecting personal trajectories of social media users through differential privacy. Comput. Secur. Elsevier 67, 142–163 (2017)
4. Bonson, E., Torres, L., Royo, S., Flores, F.: Local e-government 2.0: social media and corporate transparency in municipalities. Gov. Inf. Q. Elsevier 29, 123–132 (2012)
5. Dong, C.: Social Media Use in State Government: Understanding the factors affecting social media strategies in the Minnesota State Departments. HHH, University of Minnesota Digital Conservancy. http://hdl.handle.net/11299/172808 (2015)
6. Mandal, E., McQueen, R.: Extending UTAUT to explain social media adoption by microbusinesses. Int. J. Manag. Inf. Technol. 4(4), 1–11 (2012)
7. Henrique, G., Oliveria, M., Welch, E.W.: Social media use in local government: linkage of technology, task, and organizational context. Gov. Inf. Q. Elsevier 30(4), 397–405 (2013)
8. Stowe, L.: Verified Social Accounts Are More Important Than Ever. http://www.govtech.com/social/Verified-Social-Accounts-Are-More-Important-ThanEver.html (2015)
9. Agrawal, A., Shah, P., Wadhwa, V.: Foundations of e-Government. Chapter: EGOSQ-Users’ Assessment of e-Governance Online-Services: A Quality Measurement Instrumentation. GIFT Publications (2007)
10. Zmud, R.W., Lind, M.R., Young, F.W.: An attribute space for organizational communication channels. Inf. Syst. Res. 1(4), 440–445 (1990)
11. Wu, H., Yue, K., Pei, Y., Li, B., Zhao, Y., Dong, F.: Collaborative topic regression with social trust ensemble for recommendation in social media systems. Knowl. Based Syst. Elsevier 97, 111–122 (2016)
12. Lee, J., Kim, S.: Active e-Participation in local government: citizen participation values and social networks. In: EGPA Conference: Information and Communications Technologies in Public Administration, Bergen, Norway, 5–8 Sept, 2012
13. Landsbergen, D.: Government as part of the revolution: using social media to achieve public goals. Electr. J. e-Gov. 8(2), 135–147 (2010)
14. Olumuyiwa Dele, O.: An analysis of factors influencing the success of social networking websites: a case study. Faculty of Computing, Blekinge Institute of Technology, Sweden (2016)
15. Ampofo, L.: The social life of real-time social media monitoring. University of London-UK, J. Audience Recept. Stud. 8(1) (2011)
16. Ellahi, A., Bokhari, R.H.: Key quality factors affecting user’s perception of social networking websites. J. Retail. Consum. Serv. 20(1), 120–129 (2012)
17. Zhang, Z., Gupta, B.B.: Social media security & trustworthiness: overview and new direction. Future Gener. Comput. Syst. Elsevier (2016). https://doi.org/10.1016/j.future.2016.10.007
18. Chanana, L., Agarwal, R., Punia, D.K.: Service quality parameters for mobile government services in India. Global Bus. Rev. 17(1), 136–146 (2016)
19. Picazo, S., Gutierrez, I., Luna, L.F.: Understanding risks, benefits, and strategic alternatives of social media applications in the public sector. Gov. Inf. Q. Elsevier 29(2012), 504–551 (2012)
20. Khan, S., Dr. Rahim, N.Z.A.: Factors Influencing Citizens Trust on Government Social media. Advanced Informatics School, University Teknologi, Malaysia (2016)
21. Patel, H., Jacobson, D.: Factors influencing citizen adoption of e-government: a review and critical assessment. In: 16th European conference on information system, ECIS Proceedings paper 176, pp. 1058–1069 (2008)
22. Kalsi, N.S., Kiran, R.: A strategic framework for good governance through e-governance optimization. Emerald Insight 49, 170–204 (2015)
23. Sargunam, T., Vinod, V.: Use of social media in effective implementation of e-governance in India. Int. J. Commun. Netw. Syst., Integrated Intelligent Research (IIR) 03, 267–278 (2014)
24. Banday, M.T., Matoo, M.M.: Social media in e-governance: a study with special reference to India. Social Netw. Scientific Research, 47–56. http://www.scirp.org/journal/sn (2013)
25. Smith, B.G., Gallicano, T.D.: Terms of engagement: analyzing public engagements with organizations through social media. Comput. Human Behav. Elsevier 53(2015), 82–90 (2015)

Slot-Loaded Multiband Miniaturized Rectangular Microstrip Antenna for Mobile Communications

Sajeed S. Mulla and Shraddha S. Deshpande

Abstract In this paper, the design of a multiband miniaturized Rectangular Microstrip Antenna (RMSA) with selective frequency bands for mobile communications is proposed. Miniaturization techniques for the proposed microstrip antenna are discussed. A dual-band RMSA operating in orthogonal modes (TM01 and TM10) is designed first and then miniaturized using the slot-loading technique, after which two slot antennas are introduced for the extra WIFI and LTE bands. A 23% miniaturization of the multiband miniaturized RMSA relative to the dual-band RMSA is achieved without significant change in radiation characteristics. Radiation characteristics of the multiband miniaturized RMSA, such as radiation efficiency (𝜂), gain (G) and bandwidth (BW), are observed at three frequencies of interest.

Keywords Rectangular microstrip antenna ⋅ Miniaturized antenna ⋅ Multiband antenna ⋅ Compact microstrip antenna

1 Introduction

Ultra-wideband (UWB) antennas, discussed widely in the literature, cover a large frequency range, but their performance parameters such as gain and radiation efficiency do not remain uniform over that range. Although they cover a wide range of frequencies, this range is not used comprehensively in wireless communication. Mobile communications and wireless local area networks (WIFI) require only a small instantaneous frequency band; accordingly, compact dual-band or multiband antennas become an alternative to a UWB antenna.

S. S. Mulla (✉) ⋅ S. S. Deshpande Walchand College of Engineering, Sangli 416415, Maharashtra, India e-mail: [email protected] S. S. Deshpande e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_21


A multiband antenna built from a number of antenna elements gives the desired frequency bands, but the structure is complex and suffers from impedance mismatch [1]. A rectangular microstrip antenna (RMSA) or circular microstrip antenna (CMSA) operated in multiple modes at its resonant frequencies does not have the same radiation characteristics, such as radiation pattern and input impedance, for all modes, except for a number of the lower order modes [2]. In the CMSA and RMSA, all modes except the fundamental modes have fixed frequencies. Therefore, most of the frequency bands generated by the RMSA and CMSA are not suitable for mobile communications. The RMSA has the same radiation characteristics in the TM10 and TM30 modes [3]. Operating the RMSA or CMSA in multiple modes eliminates the need for single-frequency-band antenna elements, which reduces the size of the antenna structure. The problem of fixed frequency bands in the higher order modes of the RMSA (or CMSA) is eliminated by exciting it in orthogonal modes (TM10, TM01), with the feed located at an offset from the two principal axes of the RMSA (or CMSA). The ratio $f_1/f_2$ is approximately equal to $W/L$, where $f_1$ and $f_2$ are the resonance frequencies for TM10 and TM01, respectively, and $L$ and $W$ are the length and width of the RMSA, respectively. The RMSA operating in orthogonal modes provides orthogonal polarization, which is suitable for mobile communications. A mobile communication system requires small mobile devices that consume little power and are easy to handle. Such devices require a low-gain, low-profile miniaturized (compact) microstrip antenna (MSA) with the desired radiation characteristics. The compactness of an MSA is a trade-off between its size and performance: reducing the size of the MSA degrades performance parameters such as gain, bandwidth, radiation efficiency and input impedance. The miniaturization techniques discussed in [3, 4] are classified as follows:

(i) Compact MSA: $\lambda_e$ is inversely proportional to $\sqrt{\epsilon_r}$. This allows the compactness of the MSA to be increased by using a high-contrast substrate with a high dielectric constant ($\epsilon_r$).
(ii) Compact shorted MSA: Compactness is obtained by placing a shorting post at the zero-field location of the dominant mode of the MSA, but this provides lower radiation efficiency and gain. The shortening and folding technique is a popular technique for antenna miniaturization in mobile applications [5]. The planar inverted-F antenna (PIFA) has been designed using this technique.
(iii) Slot/stub-loaded MSA: The compactness of the MSA is increased by increasing the path length of the surface current (inductive effect), by cutting slots in the radiating patch [6, 7].
(iv) Artificial electromagnetic metamaterial-loaded MSA: The resonance frequency of the MSA is reduced significantly by integrating a metamaterial with the MSA. The metamaterial acts as a purely reactive circuit at the resonance frequency [8–11].

An antenna without a ground plane whose size is less than one wavelength ($\lambda_0 = c/f$) is termed an electrically small antenna (ESA) [12, 13]. The ESA satisfies Chu’s condition ($ka \le 1$), where $k = 2\pi/\lambda_0$ and $a$ is the radius of the Chu sphere that encloses the antenna. For a miniaturized antenna, the value of $ka$ ranges from 1 to 3.14 ($1 \le ka \le 3.14$) [12, 13].
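The ka criterion quoted above can be evaluated with a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the speed of light is rounded to 3 × 10^8 m/s, and the radii in the usage lines are made-up values rather than dimensions of the proposed antenna.

```python
import math

C0 = 3.0e8  # free-space speed of light in m/s (rounded value, an assumption)


def chu_ka(freq_hz, radius_m):
    """Return ka = (2*pi / lambda0) * a for an enclosing sphere of radius a."""
    wavelength0 = C0 / freq_hz
    return 2.0 * math.pi / wavelength0 * radius_m


def size_class(ka):
    """Classify an antenna using the ka ranges quoted in the text."""
    if ka <= 1.0:
        return "electrically small"
    if ka <= math.pi:
        return "miniaturized"
    return "neither"


# Hypothetical enclosing-sphere radii, chosen only to exercise both ranges.
for freq_hz, radius_m in [(1.5e9, 0.020), (2.43e9, 0.040)]:
    ka = chu_ka(freq_hz, radius_m)
    print(f"f = {freq_hz / 1e9:.2f} GHz, a = {radius_m * 1e3:.0f} mm, "
          f"ka = {ka:.3f}: {size_class(ka)}")
```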


This paper focuses on the design and optimization of a miniaturized multiband rectangular microstrip antenna. The complete design procedure for the dual-band rectangular microstrip antenna operating in the TM01 and TM10 modes is discussed in Sect. 2, where the antenna is designed and optimized. Section 3 proposes the slot-loading technique for miniaturizing the same dual-band antenna. Section 4 introduces the multiband miniaturized rectangular microstrip antenna (RMSA), in which an extra WiFi band is added using slot antennas integrated on the RMSA. The results and their discussion follow Sect. 4.

2 Design of Dual-Band RMSA

In this section, the design of a dual-band RMSA for frequencies f1 = 1.95 GHz and f2 = 2.15 GHz operating in orthogonal modes (TM10, TM01) is presented. The material selected for the dual-band RMSA is low-cost FR4 glass epoxy substrate with dielectric constant 𝜖r = 4.4, loss tangent tan 𝛿 = 0.02, copper (“cu”) thickness t = 35 µm and substrate height h = 1.59 mm. The following procedure is followed for the design of the dual-band RMSA.

2.1 RMSA Design for TM01 mode

The RMSA for the frequency f2 = 2.15 GHz operating in TM01 mode is designed with the help of the following procedure:

(i) The effective wavelength is calculated as

$$\lambda_e^{01} = \frac{c}{f_2\sqrt{\dfrac{\epsilon_r + 1}{2}}} \qquad (1)$$

(ii) The effective width in terms of $\lambda_e^{01}$ is given by

$$W_e^{01} \approx \frac{\lambda_e^{01}}{2} \qquad (2)$$

For the proposed design, $W_e^{01} = 38$ mm is selected.

(iii) The effective permittivity $\epsilon_e^{01}$ is given by

$$\epsilon_e^{01} = \frac{\epsilon_r + 1}{2} + \frac{\epsilon_r - 1}{2}\left[1 + \frac{10h}{W_e^{01}}\right]^{-1/2} \qquad (3)$$

(iv) The effective length $L_e^{01}$ is

$$L_e^{01} = \frac{c}{2 f_2 \sqrt{\epsilon_e^{01}}} \qquad (4)$$

(v) The physical length $L^{01}$ and physical width $W^{01}$ are

$$L^{01} = L_e^{01} - 2\Delta L^{01} \qquad (5)$$

$$W^{01} = W_e^{01} - 2\Delta W^{01} \qquad (6)$$

where $\Delta L^{01} = \Delta W^{01} = \dfrac{h}{\sqrt{\epsilon_e^{01}}}$.

(vi) The feed position is located where the input impedance of the RMSA is 50 Ω. The optimized feed location is $(0, L^{01}/4)$.

As per the above design procedure, the geometric parameters of the RMSA in TM01 mode are calculated as follows: $\epsilon_e^{01} = 4.12$, $L_e^{01} = 34.37$ mm, $L^{01} = 32.80$ mm, $W_e^{01} = 38$ mm, $W^{01} = 36.43$ mm, feed location (0, 8.2 mm), SUBX = 45 mm, and SUBY = 43 mm.
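The TM01 procedure above can be followed numerically with a short script. It is a sketch under stated assumptions rather than the authors' own calculation: the function names are hypothetical, c is rounded to 3 × 10^8 m/s, and intermediate values are not rounded, so the results differ from the quoted figures in the second decimal place.

```python
import math

C0 = 3.0e8    # speed of light in m/s (rounded value, an assumption)
EPS_R = 4.4   # FR4 dielectric constant, from the text
H_MM = 1.59   # substrate height in mm, from the text


def effective_permittivity(eps_r, h_mm, we_mm):
    """Eq. (3): effective permittivity for an effective width we_mm."""
    return (eps_r + 1) / 2 + (eps_r - 1) / 2 * (1 + 10 * h_mm / we_mm) ** -0.5


def design_patch(freq_hz, we_mm, eps_r=EPS_R, h_mm=H_MM):
    """Eqs. (3)-(6): effective length and physical dimensions, all in mm."""
    eps_e = effective_permittivity(eps_r, h_mm, we_mm)
    le_mm = C0 / (2 * freq_hz * math.sqrt(eps_e)) * 1e3
    delta_mm = h_mm / math.sqrt(eps_e)  # Delta L = Delta W
    return {
        "eps_e": eps_e,
        "Le": le_mm,
        "L": le_mm - 2 * delta_mm,
        "W": we_mm - 2 * delta_mm,
    }


f2 = 2.15e9
# Eq. (1): effective wavelength in mm; Eq. (2) would suggest We of about half of it.
lambda_e_mm = C0 / (f2 * math.sqrt((EPS_R + 1) / 2)) * 1e3
# The text selects We = 38 mm rather than lambda_e / 2.
tm01 = design_patch(f2, we_mm=38.0)

print(f"lambda_e = {lambda_e_mm:.2f} mm, lambda_e / 2 = {lambda_e_mm / 2:.2f} mm")
for key, value in tm01.items():
    print(f"{key} = {value:.2f}")
print(f"feed offset L/4 = {tm01['L'] / 4:.2f} mm")
```

The computed effective permittivity, effective length and physical dimensions agree with the values listed in the text to within about 0.05 mm; the small differences come from the text rounding the effective permittivity to 4.12 before computing the lengths.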

2.2 RMSA Design for TM10 mode

The RMSA for the frequency f1 = 1.95 GHz operating in TM10 mode is designed with the help of the following procedure:

(i) The effective width is selected as $W_e^{10} = L_e^{01}$, since the same RMSA is to be operated in orthogonal modes.

(ii) The effective permittivity $\epsilon_e^{10}$ is given by

$$\epsilon_e^{10} = \frac{\epsilon_r + 1}{2} + \frac{\epsilon_r - 1}{2}\left[1 + \frac{10h}{W_e^{10}}\right]^{-1/2} \qquad (7)$$

(iii) The effective length $L_e^{10}$ is calculated as

$$L_e^{10} = \frac{c}{2 f_1 \sqrt{\epsilon_e^{10}}} \qquad (8)$$

(iv) The physical length $L^{10}$ and physical width $W^{10}$ are given by

$$L^{10} = L_e^{10} - 2\Delta L^{10} \qquad (9)$$

$$W^{10} = W_e^{10} - 2\Delta W^{10} \qquad (10)$$

where $\Delta L^{10} = \Delta W^{10} = \dfrac{h}{\sqrt{\epsilon_e^{10}}}$.

(v) The feed position is located where the input impedance of the RMSA is 50 Ω. The optimized feed location is $(L^{10}/4, 0)$.

As per the above design procedure, the geometric parameters of the RMSA in TM10 mode are calculated as follows: $\epsilon_e^{10} = 4.10$, $L_e^{10} = 37.98$ mm, $L^{10} = 36.40$ mm, $W_e^{10} = 34.37$ mm, $W^{10} = 32.79$ mm, feed location (9.1 mm, 0), SUBX = 45 mm, and SUBY = 43 mm.
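Because the TM10 design reuses the TM01 effective length as its effective width, the two procedures can be chained. The sketch below does this and also compares the ratio of the two design frequencies with the ratio of the two resonant lengths. The helper name `resonant_dims` is hypothetical, c is rounded to 3 × 10^8 m/s, and no intermediate rounding is applied, so the figures differ slightly from those quoted in the text.

```python
import math

C0 = 3.0e8                 # speed of light in m/s (rounded value, an assumption)
EPS_R, H_MM = 4.4, 1.59    # FR4 dielectric constant and substrate height (mm), from the text


def resonant_dims(freq_hz, we_mm):
    """Return (eps_e, Le, L, W) in mm for one mode, given its effective width."""
    eps_e = (EPS_R + 1) / 2 + (EPS_R - 1) / 2 * (1 + 10 * H_MM / we_mm) ** -0.5
    le_mm = C0 / (2 * freq_hz * math.sqrt(eps_e)) * 1e3
    delta_mm = H_MM / math.sqrt(eps_e)
    return eps_e, le_mm, le_mm - 2 * delta_mm, we_mm - 2 * delta_mm


f1, f2 = 1.95e9, 2.15e9

# TM01 first (effective width 38 mm, as selected in the text) ...
_, le01, l01, _ = resonant_dims(f2, 38.0)
# ... then TM10, whose effective width is set equal to the TM01 effective length.
eps10, le10, l10, w10 = resonant_dims(f1, le01)

print(f"eps_e10 = {eps10:.2f}, Le10 = {le10:.2f} mm, L10 = {l10:.2f} mm, W10 = {w10:.2f} mm")
print(f"feed offset L10/4 = {l10 / 4:.2f} mm")
# The frequency ratio and the inverse ratio of the resonant lengths should be close.
print(f"f1/f2 = {f1 / f2:.3f}, L01/L10 = {l01 / l10:.3f}")
```

The two ratios printed at the end agree to within about one percent, which reflects the fact that each mode's resonance frequency is set mainly by the patch dimension along which that mode resonates.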

2.3 RMSA Design for Orthogonal (TM10 and TM01) mode

Following the above sections, the dual-band RMSA operating in orthogonal modes (TM10 and TM01) is designed, analyzed and optimized in ANSYS HFSS 18.1. The optimized geometric parameters are L = L10 = 36 mm and W = W10 = 32 mm. Good impedance matching is obtained at fx = 8 mm, fy = 8.5 mm. Figure 1 shows the schematic diagram of the dual-band RMSA. The reflection coefficient (𝛤) versus frequency (f) for the RMSA is shown in Fig. 2. It is noticed that, in addition

Fig. 1 Dual-band RMSA (Antenna-1)


Fig. 2 Simulated |𝛤| versus frequency of dual-band RMSA (axes: |𝛤| in dB versus f in GHz, 1.5 to 3.5 GHz; marked resonances: TM10, TM01, TM30)

to TM10 , and TM01 modes, one more higher order TM30 mode is obtained, which can be used for Long-Term Evolution (LTE) applications.

3 Miniaturization of Dual-Band RMSA

The MSA behaves as a parallel resonance circuit [14], whose resonance frequency is $f_n = \frac{1}{2\pi\sqrt{L_n C_n}}$, where n is a positive integer. The RMSA is miniaturized by increasing the inductive or capacitive effect of the antenna circuit at the resonance frequency. In the proposed miniaturized dual-band RMSA, the inductive effect is increased by lengthening the current path: rectangular slots of length Lx and Ly are cut along the X-axis and Y-axis, respectively, at the center of the RMSA, each with width Ws, as shown in Fig. 3. As the slot dimensions increase, the resonance frequency of the RMSA decreases. The effect of the width Ws on miniaturization is negligible in this case, since Ws is very narrow relative to the slot lengths Lx and Ly. Following the RMSA design, the proposed miniaturized dual-band RMSA with slot loading for the frequencies f1 = 1.5 GHz and f2 = 1.8 GHz is designed as follows.

(i) Miniaturized RMSA design for TM01 mode:
(a) Select the slot width Ws = 1 mm; a narrow slot is chosen to avoid impedance mismatch.
(b) Calculate $\lambda_{e2}^{01} = \frac{c}{f_2\sqrt{\epsilon_e^{01}}}$.
(c) For a narrow slot width ($W_s \ll \lambda_{e2}^{01}/10$), calculate $L_x = \lambda_{e2}^{01} - 2W$.
(ii) Miniaturized RMSA design for TM10 mode:
(a) Select the slot width Ws = 1 mm.
(b) Calculate $\lambda_{e1}^{10} = \frac{c}{f_1\sqrt{\epsilon_e^{10}}}$.
(c) For a narrow slot width ($W_s \ll \lambda_{e1}^{10}/10$), calculate $L_y = \lambda_{e1}^{10} - 2L$.

Fig. 3 Slot-loaded RMSA (Antenna-2)

The design, optimization, and full-wave analysis of the 3D model of the miniaturized RMSA shown in Fig. 3 were carried out in the ANSYS Electronics Desktop (ANSYS HFSS 18.1) simulation software [15]. ANSYS HFSS uses a numerical method, the Finite Element Method (FEM), to solve the differential equations. The optimum slot lengths along the X-axis and Y-axis selected for the desired resonance frequencies are Lx = 20 mm and Ly = 26.77 mm, respectively. The material selected for the dual-band RMSA is a low-cost FR4 glass epoxy substrate with dielectric constant $\epsilon_r$ = 4.4, loss tangent tan δ = 0.02, copper thickness t = 35 µm, and substrate height h = 1.59 mm.

As the graphs of |Γ| versus f in Fig. 4a, b show, the miniaturization of the RMSA depends on the dimensions of the slots: the lengths (Lx, Ly) control the degree of miniaturization of the proposed antenna. The slot-loaded RMSA is analyzed with the help of these |Γ| versus f graphs. The first and third frequency bands, of the TM10 and TM30 modes, respectively, shift toward lower frequencies as the length Ly increases from 24 to 26 mm with Lx = 19 mm held constant, as shown in Fig. 4a; the second frequency band, of the TM01 mode, shifts toward lower frequencies as the slot length Lx increases from 19 to 21 mm with Ly = 24 mm held constant, as shown in Fig. 4b.
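The slot-length steps (b) and (c) above can be illustrated with a minimal sketch. The effective permittivity values are not derived in this section, so the sketch assumes a common quasi-static (Hammerstad-type) estimate; that formula, the choice of which patch dimension enters it for each mode, and all function names are assumptions for illustration, not the authors' procedure.

```python
import math

C_MM_PER_S = 2.998e11  # speed of light in mm/s


def effective_permittivity(eps_r, h_mm, width_mm):
    """Quasi-static microstrip estimate (assumed here, not given in the text)."""
    return (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 * h_mm / width_mm)


def slot_length_mm(f_hz, eps_e, patch_dim_mm):
    """Steps (b) and (c): effective wavelength minus twice the patch dimension."""
    wavelength_mm = C_MM_PER_S / (f_hz * math.sqrt(eps_e))
    return wavelength_mm - 2 * patch_dim_mm


# Substrate and patch values quoted in the text.
EPS_R, H_MM, L_MM, W_MM = 4.4, 1.59, 36.0, 32.0

# Assumption: the TM01 estimate uses L as the effective width, TM10 uses W.
eps_e01 = effective_permittivity(EPS_R, H_MM, L_MM)
eps_e10 = effective_permittivity(EPS_R, H_MM, W_MM)

lx_mm = slot_length_mm(1.8e9, eps_e01, W_MM)  # TM01 mode, f2 = 1.8 GHz
ly_mm = slot_length_mm(1.5e9, eps_e10, L_MM)  # TM10 mode, f1 = 1.5 GHz

print(f"Lx estimate: {lx_mm:.2f} mm, Ly estimate: {ly_mm:.2f} mm")
```

Under these assumptions the estimates fall within a few millimetres of the optimized values Lx = 20 mm and Ly = 26.77 mm, which is the role the text assigns to the analytical step: a starting point that the full-wave simulation then refines.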

Fig. 4 Simulated |Γ| versus frequency of slot-loaded RMSA for a Lx = 20 mm, b Ly = 25 mm (axes: |Γ| in dB versus f in GHz; panel a shows curves for Ly = 24, 25, and 26 mm; panel b shows curves for Lx = 19, 20, and 21 mm; resonances marked TM10, TM01, and TM30)

4 Design of Multiband Miniaturized RMSA

The multiband RMSA operating in the TM10, TM01, and TM30 modes with frequencies f1 = 1.95 GHz, f2 = 2.15 GHz, and f3 = 3 GHz, respectively, was designed in Sect. 2, and the same antenna was then miniaturized with the slot loading technique for the frequencies f1 = 1.5 GHz, f2 = 1.8 GHz, and f3 = 2.7 GHz in Sect. 3. An additional frequency band at f = 2.43 GHz is required for local wireless networks (WIFI). To obtain the WIFI band, two narrow parallel slots, symmetric about the Y-axis and having length Lsl and width Wsl, are cut at x = 15.5 mm and x = −15.5 mm on the miniaturized RMSA, as shown in Fig. 5. The two slot antennas are placed symmetrically for proper impedance matching. The antenna slot length (Lsl) for the frequency f = 2.43 GHz is calculated with the following procedure:

(i) The effective wavelength is calculated as $\lambda_e = \frac{c}{f\sqrt{\epsilon_e}}$, with $\epsilon_e = \epsilon_e^{01}$, since the antenna slots are parallel to the Y-axis.
(ii) The effective length of the slot antenna is given by $L_{sle} = \frac{\lambda_e}{2}$.
(iii) The actual length of the slot antenna is $L_{sl} = \frac{L_{sle}}{1 + \frac{0.2}{\epsilon_e}}$.

The calculated value of the antenna slot length (Lsl) is 27.68 mm; however, the optimum value of Lsl selected for the desired result is 28 mm. Because the antenna slots are parallel to the Y-axis, the effective length of the slot located along the Y-axis becomes greater than Ly, so the first frequency band, of the TM10 mode, shifts again toward lower frequencies. The optimum value of the slot length Ly is therefore selected as 25 mm to obtain the desired resonance frequency of 1.5 GHz. For impedance matching, the feed positions are optimized to fx = 6.6 mm, fy = 6 mm. The multiband miniaturized RMSA is shown in Fig. 5.

A behavioral study of the slot antennas for WIFI was carried out with the help of the |Γ| versus frequency graph shown in Fig. 6b. The frequency band resonating due to the slot antennas shifts toward lower frequencies as the length Lsl increases from 27 to 29 mm with Wsl = 1 mm held constant. It is to be noted from Fig. 6b that the other frequency bands, of the TM modes, are unaffected by variation of the length (Lsl) of the slot antenna.
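The three-step slot-antenna procedure above can be written as a short function. This is only a sketch: the function name is hypothetical, and the effective permittivity passed in the example (4.07) is an assumed value from a quasi-static estimate, since the value the authors used is not stated in this section.

```python
import math

C_MM_PER_S = 2.998e11  # speed of light in mm/s


def slot_antenna_length_mm(f_hz, eps_e):
    """Steps (i) to (iii): half the effective wavelength, shortened by 1 + 0.2/eps_e."""
    wavelength_mm = C_MM_PER_S / (f_hz * math.sqrt(eps_e))  # step (i)
    effective_length_mm = wavelength_mm / 2                 # step (ii)
    return effective_length_mm / (1 + 0.2 / eps_e)          # step (iii)


# eps_e = 4.07 is an assumed illustrative value, not taken from the text.
lsl_mm = slot_antenna_length_mm(2.43e9, 4.07)
print(f"Lsl estimate: {lsl_mm:.2f} mm")
```

The result depends directly on the effective permittivity supplied, so it need not reproduce the 27.68 mm quoted in the text; the sketch only shows how the three steps chain together and that a longer slot corresponds to a lower resonance frequency.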

Fig. 5 Multiband miniaturized RMSA (Antenna-3)

Fig. 6 Simulated |Γ| versus frequency (axes: |Γ| in dB versus f in GHz): (a) reference RMSA and the slot-loaded RMSA; (b) multiband miniaturized RMSA for variable Lsl (27, 28, and 29 mm) with constant Wsl = 1 mm, with the slot antenna resonance marked

5 Result and Discussion

The simulated reflection coefficient (|Γ|) of the reference dual-band RMSA and the slot-loaded RMSA is shown in Fig. 6a. Very good agreement is observed between the analytical and simulated results. The simulated resonance frequencies of the reference RMSA are 1.93, 2.14, and 2.97 GHz, whereas the slot-loaded RMSA resonates at 1.5, 1.85, and 2.69 GHz. Both antenna structures resonate in the TM10, TM01, and TM30 modes, as shown in Fig. 6a. The resonance frequencies of the slot-loaded RMSA are reduced by 20.20, 12.61, and 10% relative to the first three respective frequencies of the reference RMSA. This reduction is achieved by increasing the slot lengths Lx and Ly.

Fig. 7 Simulated impedance versus frequency of the reference RMSA and the slot-loaded RMSA: a Rin, b Xin

Figure 7 shows the simulated input reactance Xin of the reference RMSA and the slot-loaded RMSA. The Xin values of the reference RMSA at its first three resonance frequencies are 88.21 Ω, 66.78 Ω, and 155.14 Ω, respectively; the Xin values of the slot-loaded RMSA at its resonance frequencies are 99 Ω, 90.17 Ω, and 155.28 Ω, respectively. As seen from the Xin plot, additional inductance is added in the slot-loaded RMSA at its resonance frequencies relative to the reference RMSA, which helps to shift the resonance frequencies lower.

A comparison of the dual-band RMSA (Antenna-1), the slot-loaded RMSA (Antenna-2), and the multiband miniaturized RMSA (Antenna-3) with respect to antenna parameters at the solution frequency (f = 2.43 GHz) is given in Table 1. The radiation efficiency of Antenna-3 decreases due to the addition of the slot antennas operating at the resonance frequency of 2.43 GHz.

Table 1 Comparison of microstrip antennas at solution frequency (f = 2.43 GHz)

Parameters                     Antenna-1   Antenna-2   Antenna-3
Directivity (D) (dB)           4.59        3.46        3.84
Gain (G) (dB)                  2.01        0.89        0.88
% Radiation efficiency (%η)    43.89       25          23

The simulated antenna parameters of the miniaturized multiband RMSA (Antenna-3) at f1 = 1.5 GHz, f2 = 1.85 GHz, and f3 = 2.43 GHz, calculated in HFSS, are given in Table 2. As shown in Fig. 8, the characteristics of the E-plane (φ = 0°) radiation pattern of the miniaturized multiband RMSA at the three frequencies (f1 = 1.5 GHz, f2 = 1.85 GHz, and f3 = 2.43 GHz) are nearly the same. The radiation pattern is directional, and the co-polarized electric field is strong relative to the cross-polarized electric field, as shown in Fig. 8.

Table 2 Simulated results of miniaturized multiband RMSA (Antenna-3) at f1 = 1.5 GHz, f2 = 1.85 GHz, f3 = 2.43 GHz

f (GHz)   D (dBi)   G (dBi)   %η      %BW
1.5       2.87      1.37      48      2
1.85      3.76      1.02      27.30   2.1
2.43      3.84      0.88      23      2

Fig. 8 Simulated E-plane pattern of multiband miniaturized RMSA, showing the simulated co-polarized (E-co sim.) and cross-polarized (E-cross sim.) fields at (a) f1 = 1.5 GHz, (b) f2 = 1.85 GHz, (c) f3 = 2.43 GHz
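The relation between the columns of Table 2 can be checked with a small sketch. Radiation efficiency is the ratio of gain to directivity when both are expressed as linear ratios. The tabulated efficiencies happen to agree with G/D computed directly on the listed numbers, which suggests, as an observation rather than something the text states, that the listed D and G values may be linear ratios even though the headers give decibel units. The function name below is hypothetical.

```python
def radiation_efficiency_percent(gain, directivity):
    """Efficiency as G / D, assuming both inputs are linear ratios (not decibels)."""
    return 100.0 * gain / directivity


# (f in GHz, D, G, tabulated efficiency in %) copied from Table 2.
table2 = [
    (1.5, 2.87, 1.37, 48.0),
    (1.85, 3.76, 1.02, 27.30),
    (2.43, 3.84, 0.88, 23.0),
]

computed = [radiation_efficiency_percent(g, d) for _, d, g, _ in table2]
for (f, _, _, tabulated), value in zip(table2, computed):
    print(f"{f} GHz: computed {value:.1f}%, tabulated {tabulated}%")
```

If the values were true decibel figures, the efficiency would instead follow from the difference G − D in dB, which gives noticeably different percentages for the same rows.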

6 Conclusion

The contributions of the research work are as follows: (1) Miniaturization: The conventional RMSA has only two useful modes (TM10, TM01); the higher order modes have poor radiation characteristics. The conventional RMSA operating in TM10 and TM01 has been miniaturized with the help of the slot loading technique. (2) Selective Frequency Bands: One more band, for WIFI, has been added by integrating the slot antenna into the miniaturized RMSA. Thus, the proposed antenna operates at three selective frequency bands of interest. The slot loading technique was used effectively for miniaturization of the RMSA in the proposed antenna, and 23% miniaturization was achieved relative to the conventional RMSA. The slot antenna for the additional frequency band of 2.43 GHz was integrated successfully with the dual-band orthogonal-mode RMSA. Although the radiation characteristics of the proposed antenna were affected slightly, the shape of the radiation patterns at the resonant frequencies did not change. The proposed antenna is useful for mobile communication. Resonating at four different frequency bands, viz., 1.5 GHz, 1.85 GHz, 2.43 GHz, and 3 GHz, it could be used for UMTS-FDD, GSM-1800, WIFI, and LTE applications, respectively.

References

1. Mulla, S., Deshpande, S.S.: Compact multiband antenna fed with wideband coupled line impedance transformer for improvement of impedance matching. Microw. Opt. Technol. Lett. 59(9), 2341–2348 (2017)
2. Ho, M.H., Chen, G.L.: Reconfigured slot-ring antenna for 2.4/5.2 GHz dual-band WLAN operations. IET Microw. Antennas Propag. 1(3), 712–717 (2007)
3. Kumar, G., Ray, K.: Broadband microstrip antennas. Artech House (2003)
4. Mulla, S., Deshpande, S.: Miniaturization of micro strip antenna: a review. In: 2015 International Conference on Information Processing (ICIP), pp. 372–377. IEEE (2015)
5. Chen, H.M., Lin, Y.F., Chen, C.H., Pan, C.Y., Cai, Y.S.: Miniature folded patch GPS antenna for vehicle communication devices. IEEE Trans. Antennas Propag. 63(5), 1891–1898 (2015)
6. Kandasamy, K., Majumder, B., Mukherjee, J., Ray, K.P.: Dual-band circularly polarized split ring resonators loaded square slot antenna. IEEE Trans. Antennas Propag. 64(8), 3640–3645 (2016)
7. Gupta, S.K., Sharma, M., Kanaujia, B.K., Gupta, A., Pandey, G.P.: Triple band annular ring loaded stacked circular patch microstrip antenna. Wirel. Pers. Commun. 77(1), 633–647 (2014)
8. Atallah, H.A., Abdel-Rahman, A.B., Yoshitomi, K., Pokharel, R.K.: Mutual coupling reduction in mimo patch antenna array using complementary split ring resonators defected ground structure. Appl. Comput. Electromag. Soc. J. 31(7) (2016)
9. Cheng, Y., Wu, C., Cheng, Z.Z., Gong, R.Z.: Ultra-compact multi-band chiral metamaterial circular polarizer based on triple twisted split-ring resonator. Prog. Electromag. Res. 155, 105–113 (2016)
10. Zuffanelli, S., Zamora, G., Aguilà, P., Paredes, F., Martín, F., Bonache, J.: On the radiation properties of split-ring resonators (SRRs) at the second resonance. IEEE Trans. Microw. Theory Tech. 63(7), 2133–2141 (2015)
11. Zarrabi, F.B., Mansouri, Z., Ahmadian, R., Rahimi, M., Kuhestani, H.: Microstrip slot antenna applications with SRR for wimax/wlan with linear and circular polarization. Microw. Opt. Technol. Lett. 57(6), 1332–1338 (2015)
12. Ghosh, B., Haque, S.M., Mitra, D., Ghosh, S.: A loop loading technique for the miniaturization of non-planar and planar antennas. IEEE Trans. Antennas Propag. 58(6), 2116–2121 (2010)
13. Ghosh, B., Haque, S.M., Mitra, D.: Miniaturization of slot antennas using slit and strip loading. IEEE Trans. Antennas Propag. 59(10), 3922–3927 (2011)
14. Chang, M.H., Chen, B.Y., et al.: Microstrip-fed ring slot antenna design with wideband harmonic suppression. IEEE Trans. Antennas Propag. 62(9), 4828–4832 (2014)
15. Ansys electronics desktop, ANSYS HFSS. ver. 18.1, Ansoft corporation. Pittsburgh, PA, USA

Prediction of a Movie’s Success Using Data Mining Techniques Shikha Mundra, Arjun Dhingra, Avnip Kapur and Dhwanika Joshi

Abstract A large number of movies are released every day. Predicting the success of a movie is a complex task, as various factors influence its performance at the box office. Since a huge amount of capital is involved in the production, marketing, promotion, and distribution of movies, this has long been a topic of interest not just for viewers but also for the media, production houses, and everyone else involved in these processes. We therefore decided to perform a study on this topic, using the IMDB dataset. In this Internet age, online publicity plays a major role in the success of a movie, so we included sentiment analysis of tweets related to movies in our study. We used a variety of data mining models to get predictions as accurate as possible.



Keywords Data mining ⋅ K-nearest neighbor ⋅ Random forest ⋅ Sentiment analysis ⋅ Movie prediction

S. Mundra (✉) ⋅ A. Dhingra ⋅ A. Kapur ⋅ D. Joshi
Manipal University Jaipur, Jaipur, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_22

1 Introduction

The film industry is not just an entertainment business but also an art. People around the world are passionate about movies. As a result, thousands of movies are produced every year in multiple languages and genres, with casts from different countries. Many websites and platforms have now emerged where people look for information about movies, such as IMDB, YouTube, and Rotten Tomatoes, which hold detailed information about the cast, direction, user and critic ratings, summary, and financial information of thousands of movies released worldwide over a period of more than 50–60 years. Such a dataset can thus provide the details needed for studying trends and predicting the success of a movie, provided all the necessary details of the movie whose success is to be predicted are available.

Movies are an essential source of entertainment for people globally, irrespective of whether they live in rural or urban areas, which language they speak, or which profession they belong to. Besides serving as a means of entertainment, movies provide a window onto society and also influence the decisions people make in their lives. So, it is crucial to study what factors determine whether a movie is a hit or a flop.

“Hollywood is the land of hunch and the wild guess.” For a long time, people have believed that the success of a movie is generally random and cannot be predicted mathematically. But recent research has shown that, by incorporating a variety of attributes, this can be achieved with very good efficiency. We propose to build a model to predict movie success using the IMDB dataset and sentiment analysis of tweets taken in real time from Twitter. Contrary to popular belief, movie success is not a “wild guess” but something that can be predicted using a mathematical model.

2 Related Work

Over the past few years, many researchers, production houses, and investors have tried to create machine learning models for predicting the success of a movie at the box office. A few that we found useful are summarized below.

[1] uses the IMDB dataset to predict movie popularity by building classifiers: decision tree, logistic regression, multilayer perceptron, and Naïve Bayes. Simple logistic and logistic regression were the best predictors, with accuracy of around 84%. The attributes that contributed the most information were the metascore and number of votes for each movie, the Oscar awards won by the movies, and the number of screens on which the movie is to be shown. It does not use sentiment analysis from Twitter or YouTube.

[2] uses the IMDB dataset to predict the rating of movies by clustering with three clusters (poor, excellent, average). According to this paper, the important factors are the actors and actresses appearing in the movie, each being over 90% relevant. The director also plays a large part, at around 55%, and the budget is quite significant at around 28%.

[3] uses a hidden-layer neural network and SVM to predict the box office profitability of movies, using a dataset from IMDB, Rotten Tomatoes, and Metacritic. Two output classes, success and failure, are used. SVM and NN reach 84% and 88% accuracy, respectively.

[4] uses sentiment analysis on tweets collected from Twitter to predict a movie as super hit, hit, average, or flop. It analyses very few movies and does not use any attribute other than tweet sentiments.


3 Proposed Technique and Algorithm See Fig. 1

3.1 Selection and Preprocessing the Data

Data is collected from two sources, IMDB and Twitter. The IMDB dataset contained 28 attributes, out of which we extracted 18 attributes that were used for the study. To include social media publicity, we performed sentiment analysis on tweets related to each movie. The steps involved are as follows:
(1) Removing rows containing NA fields.
(2) Removing attributes unrelated to our study, such as the IMDB link of the movie, color/b&w, and the number of faces in posters.
(3) Removing movies released before 2010, as they are not very relevant to our study and social media analysis cannot be done on them.
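The three cleaning steps can be sketched in plain Python as follows. The authors report doing this step in R; the version below is only an illustration, and the column names (`movie_imdb_link`, `color`, `facenumber_in_poster`, `title_year`) and the function name are assumptions about how the dataset is laid out.

```python
# Hypothetical names for the attributes the text describes as unrelated.
UNRELATED_COLUMNS = {"movie_imdb_link", "color", "facenumber_in_poster"}


def preprocess(rows, drop_columns=UNRELATED_COLUMNS, min_year=2010):
    """Apply the three steps in order: drop NA rows, drop columns, drop old movies."""
    cleaned = []
    for row in rows:
        # Step 1: skip rows with any missing field.
        if any(value is None or value == "" for value in row.values()):
            continue
        # Step 2: keep only the attributes relevant to the study.
        kept = {key: value for key, value in row.items() if key not in drop_columns}
        # Step 3: skip movies released before min_year.
        if int(kept["title_year"]) < min_year:
            continue
        cleaned.append(kept)
    return cleaned


sample_rows = [
    {"movie_title": "A", "title_year": "2012", "gross": "100", "movie_imdb_link": "x"},
    {"movie_title": "B", "title_year": "2005", "gross": "80", "movie_imdb_link": "y"},
    {"movie_title": "C", "title_year": "2015", "gross": "", "movie_imdb_link": "z"},
]
cleaned_rows = preprocess(sample_rows)
print(cleaned_rows)
```

In this toy input only movie "A" survives: "B" predates 2010 and "C" has a missing gross value.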

3.2 Data Transformation and Feature Selection

Binarization: To process Genre, we created a new field for each type of genre and marked it as 1 if the movie belongs to that genre and 0 otherwise. The attributes that we additionally calculated and added to the processed database are the average gross of the director, the average gross of all the actors, the PT–NT ratio, and the content rating. To calculate the average gross of the director and of all the actors, we averaged the gross of all movies of the corresponding actor or director and added the result as a new attribute. To incorporate the social media aspect, we added another new attribute, the PT–NT ratio, calculated as the ratio of positive tweets to the sum of positive and negative tweets. Content rating: all nominal values of this feature were mapped onto numeric values.

Fig. 1 Flow chart of the processing of data using multiple data mining techniques: (1) selection and preprocessing of the data (IMDB and Twitter; cleaning and coding data using R programming); (2) data transformation and feature selection (binarization; average gross of actor and director); (3) sentiment analysis (PT–NT ratio using Python); (4) data mining techniques (LDA, CART, KNN, Random Forest, SVM); (5) conclusion (RF has the highest accuracy, 93.17%)
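The genre binarization and the average-gross attribute described above can be sketched as follows. The field names, the `|` separator between genres, and the function names are assumptions made for illustration; the source does not specify them.

```python
from collections import defaultdict


def binarize_genres(movies):
    """Add one 0/1 field per genre; genres are assumed to be '|'-separated."""
    all_genres = sorted({g for m in movies for g in m["genres"].split("|")})
    for movie in movies:
        present = set(movie["genres"].split("|"))
        for genre in all_genres:
            movie["genre_" + genre] = 1 if genre in present else 0
    return all_genres


def add_average_gross(movies, person_key, new_key):
    """Attach the mean gross over all movies sharing the same person."""
    totals = defaultdict(lambda: [0.0, 0])  # person -> [sum of gross, count]
    for movie in movies:
        entry = totals[movie[person_key]]
        entry[0] += movie["gross"]
        entry[1] += 1
    for movie in movies:
        gross_sum, count = totals[movie[person_key]]
        movie[new_key] = gross_sum / count


movies = [
    {"director_name": "X", "genres": "Action|Drama", "gross": 100.0},
    {"director_name": "X", "genres": "Drama", "gross": 300.0},
    {"director_name": "Y", "genres": "Comedy", "gross": 50.0},
]
genres = binarize_genres(movies)
add_average_gross(movies, "director_name", "avg_gross_director")
print(genres, movies[0]["avg_gross_director"])
```

The same `add_average_gross` call could be repeated with an actor field to obtain the actor-based attribute.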

3.3 Sentimental Analysis

The conventional method of predicting movie success from ratings, genre, cast, director, producer, etc., does not give the real picture of a movie's success. The ratings can tell whether the plot of the movie is good or bad, but they cannot tell anything about the box office collections. Our focus is therefore on predicting the revenue a movie can generate, for which the social media factor comes in handy. Social media content contains rich information about viewers' preferences; for example, people often share their thoughts about movies on Twitter. We analyzed tweets about movies to predict several aspects of movie popularity.

4 Methodology

Tweepy, a Python library that provides access to the Twitter API, is used to retrieve data from Twitter. The streaming API of Tweepy is used to get the tweets relevant to our task. The API, tweepy.streaming.Stream, continually retrieves data relevant to given topics from Twitter's global stream of tweets. A high-level library built on top of the NLTK library is then used for the sentiment analysis, as explained in the following steps.

4.1 Authorize Twitter API Client

First, to interact with the Twitter API and parse tweets, we created a TwitterClient class that contains all the methods for these tasks. The __init__ function handles the authentication of the API client [4]. Tweepy supports accessing Twitter via OAuth, which is a bit more complicated than Basic authentication but offers significant benefits. Tweepy provides access to the well-documented Twitter API; with Tweepy, it is possible to get any object and use any method that the official Twitter API offers.

4.2 To Extract Tweets for a Specific Query, a GET Request Is Made to Twitter API

To fetch tweets, the Twitter API is called by the get_tweets method. In this method we use Extract_tweets = self.api.search(q = query, count = counts) to fetch the tweets.

4.3 Parse and Remove the Irrelevant Data

We have used the TextBlob module, a high-level library for sentiment classification. Tweets contain many special characters, symbols, and links, so the clean_tweet method is used to remove them: analysis = TextBlob(self.clean_tweet(tweet)). The text is then processed by the TextBlob library in the following manner:
(1) Tokenization of the tweet, i.e., the body of text is split into words, and each individual word is called a token.
(2) Stopwords (commonly used words that are irrelevant in text analysis, such as I, am, you, are) are eliminated from the tokens.
(3) POS (part of speech) tagging of each token is done to select only significant features/tokens such as adjectives, adverbs, and nouns.
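The cleaning, tokenization, and stopword steps above can be illustrated with the standard library alone. This is a sketch of the idea, not the authors' clean_tweet method or TextBlob's internals; the regular expression and the tiny stopword list are assumptions chosen for the example, and POS tagging is left out because it needs a trained tagger.

```python
import re

# Matches @mentions, URLs, and any character that is not alphanumeric or whitespace.
_NOISE = re.compile(r"(@[A-Za-z0-9_]+)|(\w+://\S+)|([^0-9A-Za-z \t])")

# A deliberately tiny, illustrative stopword list.
STOPWORDS = {"i", "am", "you", "are", "the", "a", "is"}


def clean_tweet(tweet):
    """Replace mentions, links, and special characters with spaces, then squeeze."""
    return " ".join(_NOISE.sub(" ", tweet).split())


def tokens_without_stopwords(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]


cleaned = clean_tweet("@user Loved #Inception! http://t.co/abc")
tokens = tokens_without_stopwords("I am so happy you are here")
print(cleaned, tokens)
```

The cleaned string keeps only the words that carry sentiment or topic information, which is what the later classification step consumes.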

4.4 Creation of Sentiment Classifier

The dataset used by TextBlob to train the sentiment classifier is a movie review dataset in which each review is already labeled as positive or negative. Positive and negative features are extracted from each positive and negative review, and a Naive Bayes classifier is then trained on the resulting labeled positive and negative features.
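To make the training step concrete, the following is a minimal multinomial Naive Bayes sketch with add-one smoothing, written from scratch on a toy labeled set. It is not TextBlob's implementation, and the four training examples and all function names are invented for illustration.

```python
import math
from collections import Counter


def train_naive_bayes(labeled_docs):
    """labeled_docs: iterable of (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(model, tokens):
    """Return the label with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


training = [
    (["great", "acting"], "pos"),
    (["loved", "it"], "pos"),
    (["boring", "plot"], "neg"),
    (["terrible", "acting"], "neg"),
]
model = train_naive_bayes(training)
print(classify(model, ["great", "movie"]), classify(model, ["boring"]))
```

A real classifier would be trained on thousands of labeled reviews; the toy set only shows how labeled features turn into class-conditional word probabilities.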

4.5 Classification of Tweets as Positive, Negative or Neutral Based on Sentiment Classifier

After the positive and negative features are labeled, classification is done by passing the tokens to the sentiment classifier, which classifies the tweet sentiment as positive, negative, or neutral by assigning it a polarity between −1.0 and 1.0. The TextBlob class classifies the polarity of tweets as follows:

if analysis.sentiment.polarity > 0: return "positive"
elif analysis.sentiment.polarity == 0: return "neutral"
else: return "negative"

Finally, the parsed tweets are returned.
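The polarity rule above can be isolated in a small function and applied to a batch of scores. The sketch assumes the polarity values have already been computed (for example by TextBlob) and lie in [−1.0, 1.0]; the function name and the sample scores are hypothetical.

```python
from collections import Counter


def sentiment_label(polarity):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label."""
    if polarity > 0:
        return "positive"
    if polarity == 0:
        return "neutral"
    return "negative"


# Hypothetical polarity scores for five tweets about one movie.
polarities = [0.8, 0.0, -0.3, 0.5, -0.1]
label_counts = Counter(sentiment_label(p) for p in polarities)
print(label_counts)
```

The resulting counts of positive and negative labels are the quantities that the statistical analysis step works with.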

4.6 Statistical Analysis

After labeling each tweet as positive, negative, or neutral with the sentiment classifier, we calculated the total count of positive tweets, named PT, and similarly the total count of negative tweets, named NT. To categorize a movie as flop, hit, or super hit, we developed a metric called the PT–NT ratio: the percentage of positive tweets relative to the percentage of (positive + negative) tweets. This attribute was then included in the analysis as follows: if the PT–NT ratio is greater than or equal to 5, the movie is a “Super hit”; if it is between 1.5 and 5, it is a “hit”; otherwise it is a “flop”.
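The threshold rule can be sketched as below, with one interpretive assumption stated plainly: a ratio of positive tweets to (positive + negative) tweets cannot exceed 1, so thresholds of 1.5 and 5 are only reachable if the ratio is taken as PT divided by NT. The sketch applies the thresholds to PT/NT on that assumption; it also assumes the boundary value 1.5 counts as a hit. Function names are hypothetical.

```python
def pt_nt_ratio(pt, nt):
    """Ratio of positive to negative tweet counts (assumed reading of the metric)."""
    if nt == 0:
        return float("inf") if pt > 0 else 0.0
    return pt / nt


def movie_category(ratio):
    """Apply the thresholds from the text: >= 5 super hit, 1.5 to 5 hit, else flop."""
    if ratio >= 5:
        return "Super hit"
    if ratio >= 1.5:
        return "Hit"
    return "Flop"


examples = [(60, 10), (30, 10), (10, 10)]
categories = [movie_category(pt_nt_ratio(pt, nt)) for pt, nt in examples]
print(categories)
```

If the metric is instead meant as PT/(PT + NT), the thresholds would have to be rescaled to the interval [0, 1] before a rule of this shape could be applied.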

5 Dataset The dataset initially contained attributes like the cast, director, audience and critic ratings, how many users rated/reviewed, IMDB link of that movie, financial information like gross, budget, date and year of release, etc. All the irrelevant, redundant and interdependent attributes were removed to reduce the size of data to be processed. After selection we obtained our target data with following attributes: Director name, Duration, Actor name 1, 2, 3, Gross, Genres, Number of voted users, Content rating, Budget, Title year, Movie name.

6 Experimentation and Results

We applied several data mining algorithms [5], namely Linear Discriminant Analysis, CART, K-nearest neighbor, Support Vector Machine, and Random forest, and checked the accuracy of each algorithm on our dataset. Given below are the confusion matrices and statistics for 761 samples, 12 predictors, and three classes: “FLOP”, “HIT”, and “SUPER HIT” (Fig. 2; Tables 1, 2, 3, 4 and 5).

Fig. 2 Variation of accuracy and kappa with each approach used

Table 1 Result using linear discriminant analysis (rows: predicted class; columns: actual class)

Prediction   Flop   Hit   Super hit
Flop         9      0     0
Hit          41     131   14
Super hit    0      5     36
Accuracy 0.7458, Kappa 0.4925

Table 2 Result using CART (rows: predicted class; columns: actual class)

Prediction   Flop   Hit   Super hit
Flop         37     1     0
Hit          13     128   6
Super hit    0      7     44
Accuracy 0.8856, Kappa 0.7961

Table 3 Result using K-nearest neighbor (rows: predicted class; columns: actual class)

Prediction   Flop   Hit   Super hit
Flop         32     3     0
Hit          17     124   6
Super hit    1      9     44
Accuracy 0.8475, Kappa 0.7267

Table 4 Result using SVM (support vector machine) (rows: predicted class; columns: actual class)

Prediction   Flop   Hit   Super hit
Flop         7      1     0
Hit          43     135   50
Super hit    0      0     0
Accuracy 0.6017, Kappa 0.0866

Table 5 Result using RF (random forest) (rows: predicted class; columns: actual class)

Prediction   Flop   Hit   Super hit
Flop         43     4     0
Hit          6      126   4
Super hit    1      6     46
Accuracy 0.9110, Kappa 0.8461
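The accuracy and kappa values reported with each confusion matrix can be recomputed from the counts. The sketch below uses the standard definition of Cohen's kappa, with rows as predicted classes and columns as actual classes, and applies it to the linear discriminant analysis and random forest counts; the function name is hypothetical.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Rows: predicted Flop, Hit, Super hit; columns: actual Flop, Hit, Super hit.
lda_matrix = [[9, 0, 0], [41, 131, 14], [0, 5, 36]]
rf_matrix = [[43, 4, 0], [6, 126, 4], [1, 6, 46]]

lda_accuracy, lda_kappa = accuracy_and_kappa(lda_matrix)
rf_accuracy, rf_kappa = accuracy_and_kappa(rf_matrix)
print(round(lda_accuracy, 4), round(lda_kappa, 4))
print(round(rf_accuracy, 4), round(rf_kappa, 4))
```

Kappa corrects accuracy for chance agreement, which is why the support vector machine result, which predicts almost everything as "Hit", has moderate accuracy but a kappa close to zero.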

7 Conclusion and Future Work

The proposed research aims to predict a movie's success. We used several data mining models for our experimentation, namely Random Forest, CART, K-nearest neighbor, Linear Discriminant Analysis, and Support Vector Machine. After applying these data mining approaches to the processed dataset, we found that the Random forest model is the best model for this particular dataset. It holds the maximum accuracy (93.17%) for predicting the outcome of the experiment compared with the other four models.

Several points remain to be covered in future work. Movies released 2–3 weeks before, and those to be released just after, the movie whose success is being predicted should also be incorporated in the study, as these have a significant impact on the success of a movie. Adding these attributes might improve the efficiency of the model. Another point we noticed is that releasing movies on festivals has become a trend recently, because people generally tend to go to movies during holidays, so upcoming holidays are also an important aspect in predicting movie success. YouTube is also an important aspect of online publicity. By studying the number of views, comments, likes, dislikes, and shares of trailers, songs, interviews, etc., the popularity of a movie can be gauged, and thus the projected box office collections can be estimated.


References

1. Latif, M.H., Afzal, H.: Prediction of Movies popularity Using Machine Learning. National University of Sciences and Technology, H-12, ISB, Pakistan
2. Saraee, M., White, S., Eccleston, J.: A Data Mining Approach to Analysis and Prediction of Movie Ratings. University of Salford, England
3. Predicting Movie Box Office Profitability A Neural Network Approach
4. Nithin, V.R., Pranav, M., Sarath Babu, P.B., Lijiya: A predicting movie success based on IMDB data. Int. J. Data Min. Tech. 03, 365–368 (2014)
5. https://www.analyticsvidhya.com/blog/2017/09/common-machine-learning-algorithms/
6. http://www.saedsayad.com/decision_tree.htm
7. http://www.geeksforgeeks.org/twitter-sentiment-analysis-using-python/

Brain Tumor Segmentation with Skull Stripping and Modified Fuzzy C-Means Aniket Bilenia, Daksh Sharma, Himanshu Raj, Rahul Raman and Mahua Bhattacharya

Abstract For medical image processing, brain tumor segmentation is one of the most researched topics. Diagnosis at an early stage plays a vital role in saving a patient’s life. It is troublesome to segment the tumor region from an MRI due to unavailability of a sharper edge and properly visible boundaries. In this paper, a combination of skull stripping methods and modified fuzzy c-means is presented to segment out the tumor region. After the acquired image is denoised, it is stripped of irrelevant tissues on the outer boundaries. It is further processed through the fuzzy c-means algorithm. The obtained results were proven to be better compared to the standard fuzzy c-means when applied on a sample of 100 MRIs. Keywords MRI ⋅ Fuzzy c-means ⋅ Skull stripping ⋅ Image denoising

A. Bilenia (✉) ⋅ D. Sharma ⋅ H. Raj ⋅ R. Raman ⋅ M. Bhattacharya
Atal Bihari Vajpayee Indian Institute of Information Technology and Management, Gwalior, Gwalior 474010, Madhya Pradesh, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_23

1 Introduction

Brain tumors can have different regions of origin, based on which they are classified as primary or metastatic. Primary tumors originate from brain cells, while metastatic ones spread from other regions of the body. Most current research focuses on glioma, a particular type of brain tumor that develops from the glial cells. Diagnosis at an early stage can immensely improve the treatment of a patient. Computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET) are some of the techniques used for diagnosis. Due to its wide availability, MRI is considered the standard method for tumor detection. Four MRI modalities are primarily used for glioma detection: T1, T2, T1-Gd (gadolinium contrast enhancement), and FLAIR [1]. Abnormalities of white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) create a need for a model that assesses these images efficiently. To arrive at a robust model, we refer to the works reported by various authors on segmentation with soft computing methods, optimization, and statistical methods of computation [2–4].

Segmenting the tumor region in an MRI involves manual diagnosis of the MRI, separating the affected tissues from the normal tissues [5]. This is a time-intensive operation that is interactive for the most part. An accurate estimate requires knowledge of various algorithms as well as of how their parameters need to be adjusted. Hence, there is a need for an automated segmentation method that computes the tumor boundary objectively.

The main method employed for clustering on an MRI image is fuzzy c-means (FCM). In recent years, it has been used for image segmentation [6, 7]. Compared with hard clustering techniques such as k-means, fuzzy c-means better preserves the information in the image [8]. Since pixels belonging to a local neighborhood may be highly correlated, this spatial information needs to be accounted for when characterizing the segmentation method. The FCM in its original form does not take this into consideration, and hence it has to be modified to make it less susceptible to noise.

This paper presents a modification of FCM applied after denoising the original image and skull stripping for maximal effect. Skull stripping acts as a preprocessing step that improves speed and accuracy during diagnosis by removing noncerebral regions, such as the skull area and scalp, from the brain images. With fuzzy clustering, each pixel is allocated to more than one cluster, with a membership value for each, essentially representing the image pixels as fuzzy sets.

2 Proposed Methodology

In this section, we describe the proposed methodology for image preprocessing and tumor segmentation using a combination of methods, namely, skull stripping, anisotropic filtering, and FCM. The diagram in Fig. 1 shows the methods used for the tumor segmentation.


Fig. 1 Flow diagram for the proposed methodology

2.1 Image Acquisition
In this paper, the proposed method is implemented on T1-weighted MR images of the human brain. Image acquisition is hardware-based and produces a representation of an object or a scene. The images used in the experiments were obtained clinically.

2.2 Image Denoising
Images are now in routine use across many areas of research and education, medical imaging among them. A central difficulty is the noise that is introduced when an image is acquired. Denoising in turn risks introducing artifacts and removing structures or details from the image. The anisotropic diffusion filter improves image quality while preserving the edges. It applies the law of diffusion (smoothing) to the pixel intensities and uses threshold functions to prevent diffusion across edges: wherever the gradient is high, the function suppresses smoothing.

\[ I_t = \operatorname{div}\bigl(c(x, y, t)\,\nabla I\bigr) \tag{1} \]

\[ \phantom{I_t} = c(x, y, t)\,\Delta I + \nabla c \cdot \nabla I \tag{2} \]

Here $\nabla$ is the gradient and $\Delta$ is the Laplacian operator. At time $t$, smoothing is achieved by setting the conduction coefficient to 1 in the interior of a region and 0 at its boundaries. Locating the boundaries requires an estimate of where the edges are. Let $E(x, y, t)$ be such an estimate, a vector-valued function with the following properties:
1. $E(x, y, t) = 0$ in the interior of a region.
2. $E(x, y, t) = k\,e(x, y, t)$ at an edge point, where $e$ is a unit vector and $k$ is the contrast.
3. If $E(x, y, t)$ is available, the conduction coefficient can be chosen as a function of its magnitude, $c = g(\lVert E \rVert)$.

A. Bilenia et al.
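The diffusion scheme of (1) and (2) can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: the function name `anisotropic_diffusion`, the exponential conduction function g, the four-neighbour explicit update, and the values of `kappa` and `step` are all choices made for this sketch.

```python
import numpy as np

def anisotropic_diffusion(image, n_iter=10, kappa=30.0, step=0.2):
    """Explicit diffusion update on a 2-D grayscale image (illustrative sketch)."""
    img = np.asarray(image, dtype=np.float64).copy()
    for _ in range(n_iter):
        # Differences to the four nearest neighbours; edge padding gives zero flux at the border.
        padded = np.pad(img, 1, mode="edge")
        d_north = padded[:-2, 1:-1] - img
        d_south = padded[2:, 1:-1] - img
        d_west = padded[1:-1, :-2] - img
        d_east = padded[1:-1, 2:] - img
        # Assumed conduction function g(d) = exp(-(d / kappa)^2):
        # close to 1 for small differences, close to 0 across strong edges.
        flux = sum(np.exp(-(d / kappa) ** 2) * d
                   for d in (d_north, d_south, d_west, d_east))
        # step <= 0.25 keeps the four-neighbour explicit scheme stable.
        img += step * flux
    return img
```

A larger `kappa` lets smoothing cross stronger edges, so in practice it would be tuned to the contrast of the images at hand.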

2.3 Skull Stripping
Skull stripping is a key preprocessing step in which the tissues consisting of white and gray matter are separated from the skull in an MRI image. The process needs to be accurate; otherwise, it leads to incorrect estimates of the cerebral and cortical thickness. Current methods depend heavily on geometric assumptions about features that may be inaccurately modeled or degraded by poor image registration. To find the threshold of the image and apply labels accordingly, we use Otsu's thresholding technique. This technique assumes that the intensity histogram is bimodal. If it is not, a Gaussian averaging filter is applied to make it so, which also smooths the image. The method minimizes the intra-class variance and thereby maximizes the inter-class variance, resulting in a very distinct background and foreground. The image is thus converted from grayscale into a binary image. Otsu's formulation is as follows:

\[ \sigma_w^2(t) = \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t) \tag{3} \]

Here, $\sigma_w^2(t)$ is the intra-class variance. It is calculated as the sum of the weighted variances of the two individual classes. The individual weights are calculated as follows:

\[ \omega_0(t) = \sum_{i=0}^{t-1} p(i) \tag{4} \]

\[ \omega_1(t) = \sum_{i=t}^{L-1} p(i) \tag{5} \]

Here, $p(i)$ is the probability of intensity $i$ among the individual pixels. Otsu's method also shows that minimizing the intra-class variance maximizes the inter-class variance, according to the following equations:

\[ \sigma_b^2(t) = \sigma^2 - \sigma_w^2(t) \tag{6} \]

\[ \sigma_b^2(t) = \omega_0(\mu_0 - \mu_T)^2 + \omega_1(\mu_1 - \mu_T)^2 \tag{7} \]

\[ \sigma_b^2(t) = \omega_0(t)\,\omega_1(t)\,\bigl[\mu_0(t) - \mu_1(t)\bigr]^2 \tag{8} \]
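As an illustration of (4), (5), and (8), the threshold search can be sketched in Python with NumPy. The function name `otsu_threshold` and the assumption of integer intensities in the range 0 to L−1 are choices made for this sketch; the Gaussian pre-smoothing step is left out.

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return the threshold t that maximises the between-class variance of (8)."""
    values = np.asarray(image, dtype=np.int64).ravel()
    hist = np.bincount(values, minlength=levels)[:levels]
    p = hist / hist.sum()                                  # p(i)
    i = np.arange(levels)
    # omega0(t) = sum of p(i) for i < t, as in (4); omega1(t) is the remainder, as in (5).
    omega0 = np.concatenate(([0.0], np.cumsum(p)[:-1]))
    omega1 = 1.0 - omega0
    first_moment = np.concatenate(([0.0], np.cumsum(i * p)[:-1]))
    mu_total = (i * p).sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        mu0 = first_moment / omega0                        # class means
        mu1 = (mu_total - first_moment) / omega1
        sigma_b2 = omega0 * omega1 * (mu0 - mu1) ** 2      # equation (8)
    # Thresholds that leave one class empty produce NaN or inf; treat them as 0.
    sigma_b2 = np.nan_to_num(sigma_b2, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(sigma_b2))
```

A binary mask then follows as `image >= otsu_threshold(image)`, with pixels below the threshold forming the other class.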

2.4 Fuzzy C-means Segmentation
Among unsupervised segmentation methods, clustering is the most widely applied in the medical field. Without any training on images, clustering partitions an image into clusters of pixels with similar intensities. The bias field in MRI represents the nonuniformity of the image, which affects regions of homogeneous tissue. The bias field therefore needs to be estimated accurately in order to aid the clustering process and improve the performance of automated MRI segmentation. It is typically modeled as a low-frequency multiplicative field [9]. The signal can be viewed as

\[ S_k = X_k G_k, \quad k \in \{1, 2, \ldots, n\} \tag{9} \]

Taking the logarithm of both sides, this can be rewritten as

\[ s_k = x_k + \beta_k, \quad k \in \{1, 2, \ldots, n\} \tag{10} \]

where $s_k = \log S_k$, $x_k = \log X_k$, and $\beta_k = \log G_k$.

This simplified multiplicative model is used in most bias correction methods to represent the bias field. Its limitation lies in the differences between the measured and true intensities. Fuzzy c-means clustering is modified using the multiplicative low-frequency bias field estimate. FCM is a generalization of k-means clustering that allows a pixel to belong to multiple clusters. It works by minimizing the following objective function [10]:

\[ J_q = \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ij}^{q}\, d(x_i, c_j) \tag{11} \]

where $q$ controls the fuzziness of the clustering, $u_{ij}$ is the fuzzy membership of data point $x_i$ in the cluster with centroid $c_j$, and $d$ is the distance between the data point and the centroid of cluster $j$. The memberships satisfy the following conditions:

\[ u_{ij} \in [0, 1], \qquad \sum_{j=1}^{m} u_{ij} = 1 \tag{12} \]

\[ 0 < \sum_{i=1}^{n} u_{ij} < n \tag{13} \]


The objective function is minimized subject to constraints (12) and (13). The membership function and the centroid of each cluster are obtained as follows:

\[ u_{ij} = \frac{1}{\sum_{k=1}^{m} \left( \dfrac{d(x_i, c_j)}{d(x_i, c_k)} \right)^{2/(q-1)}} \tag{14} \]

\[ c_j = \frac{\sum_{i=1}^{n} u_{ij}^{q}\, x_i}{\sum_{i=1}^{n} u_{ij}^{q}} \tag{15} \]

FCM iterates the membership and cluster-center updates until the change between iterations falls below an optimization threshold. The squared Euclidean distance is used as the similarity measure. FCM is then modified to exploit spatial information. The membership function is modified as follows:

\[ u_{ij} = \frac{u_{ij}^{m}\, s_{ij}^{m}}{\sum_{k=1}^{c} u_{kj}^{m}\, s_{kj}^{m}} \tag{16} \]

where $s_{ij}$ is the function that gives the probability of pixel $x_j$ belonging to the $i$th cluster. The modified FCM algorithm (mFCM) can be described as follows:
1. Set the number of clusters $c$ and the fuzzification parameter $q$ in (11). Initialize the centroid vector of the fuzzy clusters $V = [v_1, v_2, \ldots, v_c]$ randomly and set $\epsilon = 0.01$.
2. Compute $u_{ij}$ using (14).
3. Compute $c_j$ using (15).
4. Update $u_{ij}$ using (16).
5. Update $c_j$ using (15).
6. Repeat steps 4 and 5 until $\lVert u^{(k+1)} - u^{(k)} \rVert \le \epsilon$.

The algorithm first calculates the membership function in the spectral domain; the membership information of each pixel is then mapped to the spatial domain, where the spatial function is computed from it. The iterations are repeated until the maximal difference between the cluster centroids in two consecutive iterations becomes less than the threshold $\epsilon$. Once the threshold is reached, the clustering has converged, and each pixel is assigned to a cluster through defuzzification, which assigns the pixel to the cluster for which its membership is maximal.
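Steps 2 and 3 of the procedure, that is, the standard updates (14) and (15), can be sketched in Python with NumPy as follows. This is a minimal illustration on one-dimensional intensity values, not the authors' MATLAB implementation: the function name `fuzzy_c_means`, its default arguments, the use of the plain Euclidean distance with exponent 2/(q−1), and the stopping test on the membership change are assumptions of this sketch, and the spatial update (16) is omitted.

```python
import numpy as np

def fuzzy_c_means(x, n_clusters=2, q=2.0, eps=0.01, max_iter=100, seed=0):
    """Standard FCM on 1-D data; returns (centroids, memberships u[i, j])."""
    x = np.asarray(x, dtype=np.float64).ravel()
    rng = np.random.default_rng(seed)
    # Random initial centroids drawn from the data (assumes distinct values).
    centroids = rng.choice(x, size=n_clusters, replace=False)
    u = np.zeros((x.size, n_clusters))
    for _ in range(max_iter):
        # d(x_i, c_j): Euclidean distance, floored to avoid division by zero.
        d = np.maximum(np.abs(x[:, None] - centroids[None, :]), 1e-12)
        # Membership update (14): u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (q - 1)).
        ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (q - 1.0))
        u_new = 1.0 / ratio.sum(axis=2)
        # Centroid update (15): weighted mean of the data with weights u_ij^q.
        w = u_new ** q
        centroids = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        converged = np.abs(u_new - u).max() <= eps
        u = u_new
        if converged:
            break
    return centroids, u
```

Defuzzification then amounts to `u.argmax(axis=1)`, which assigns each data point to the cluster with the largest membership.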

3 Result
Testing the proposed method on different MR images gave satisfactory results, with structural similarity values between 0.9529 and 0.9942 (Table 1). We used MATLAB 2017a for the implementation. The approach was tested on MRI images sampled from the BRATS challenge dataset, of which three sets are shown. The images in each set show the successive phases of the proposed model, starting from the noisy image as acquired. After the image is denoised through anisotropic filtering, the redundant tissues are carved out of the image. Finally, the modified FCM is applied to the skull-stripped version of the image, and the segmented tumor boundary is presented at the far right (Fig. 2). The obtained segment is compared with the tumor area given in the dataset in terms of structural similarity, with the under-segmented and over-segmented areas reported in Table 1. The fuzziness parameter $q$ was tuned manually to obtain these segments. The initial estimate of the bias field for the fuzzy c-means algorithm was chosen as a constant image. The extracted areas of the sample images closely follow the expected tumor areas.

Table 1 SSIM of MRI samples based on area

Sample No.  Expected tumor area  Extracted tumor area  Under segment  Over segment  SSI
1           24.525               23.1719               1.3531         0             0.9529
2           12.3235              12.2895               0.034          0             0.9878
3           8.4768               6.1801                2.2967         0             0.9942
4           5.656                6.6787                0              1.0227        0.9903
5           15.184               14.3898               0.7942         0             0.9905
6           15.7185              15.663                0.0555         0             0.9879
7           15.4299              16.929                0              1.4991        0.9783
8           12.5171              13.2158               0              0.6987        0.9889
9           15.3733              14.9061               0.4672         0             0.9885
10          18.6545              18.1758               0.4787         0             0.9843

Fig. 2 From left to right: The original image, image after denoising through anisotropic filtering, skull-stripped image, and the segmented tumor
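The under-segment and over-segment figures reported in Table 1 appear to be the positive and negative parts of the difference between the expected and extracted tumor areas. Under that assumption, which matches the tabulated rows, the comparison can be sketched as a small Python helper; the function name `segmentation_error` is hypothetical.

```python
def segmentation_error(expected_area, extracted_area):
    """Split the area difference into (under-segmentation, over-segmentation)."""
    diff = round(expected_area - extracted_area, 4)
    under = diff if diff > 0 else 0.0    # tumor area that was missed
    over = -diff if diff < 0 else 0.0    # extra area that was included
    return under, over
```

For sample 1, the expected area 24.525 and extracted area 23.1719 give an under-segmentation of 1.3531 and no over-segmentation, as listed in the table.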

4 Conclusion
This paper proposed a combination of skull stripping and fuzzy clustering for the segmentation of brain MRI images. The approach segments pixels belonging to WM and GM satisfactorily. The method incorporates spatial neighborhood information into the membership functions, replacing the standard FCM algorithm, to account for the weight in each cluster. In tests on various MRIs, the results proved competitive with existing methods. The proposed method for automated brain MRI image segmentation is open to further investigation by radiologists. It can be enhanced by adding steps for classifying the tumor boundary under different MR modalities.

Acknowledgements This research was supported by ABV-Indian Institute of Information Technology and Management. The dataset for testing our implementations was acquired from BRATS 2015, from which 100 MRI files (.mat format in T1 modality) were sampled out of 700 images.

References
1. Dale, A.M., Halgren, E.: Spatiotemporal mapping of brain activity by integration of multiple imaging modalities. Curr. Opin. Neurobiol. 11(2), 202–208 (2001)
2. Karnan, M., Logheshwari, T.: Improved implementation of brain MRI image segmentation using ant colony system. In: 2010 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), pp. 1–4. IEEE (2010)
3. Srivastava, A., Asati, A., Bhattacharya, M.: A fast and noise-adaptive rough-fuzzy hybrid algorithm for medical image segmentation. In: 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 416–421. IEEE (2010)
4. Ayoobkhan, M.U.A., Chikkannan, E., Ramakrishnan, K.: Feed-forward neural network-based predictive image coding for medical image compression. Arab. J. Sci. Eng. 1–9 (2017)
5. Prastawa, M., Bullitt, E., Ho, S., Gerig, G.: A brain tumor segmentation framework based on outlier detection. Med. Image Anal. 8(3), 275–283 (2004)
6. Pham, D.L., Prince, J.L.: An adaptive fuzzy c-means algorithm for image segmentation in the presence of intensity inhomogeneities. Pattern Recognit. Lett. 20(1), 57–68 (1999)
7. Chen, W., Giger, M.L., Bick, U.: A fuzzy c-means (FCM)-based approach for computerized segmentation of breast lesions in dynamic contrast-enhanced MR images. Acad. Radiol. 13(1), 63–72 (2006)
8. Bradley, P., Bennett, K., Demiriz, A.: Constrained k-means clustering. Microsoft Res. 1–8 (2000)
9. Sled, J.G., Zijdenbos, A.P., Evans, A.C.: A nonparametric method for automatic correction of intensity nonuniformity in MRI data. IEEE Trans. Med. Imaging 17(1), 87–97 (1998)
10. Pohle, R., Toennies, K.D.: Segmentation of medical images using adaptive region growing. In: Proceedings of SPIE Medical Imaging, vol. 4322, pp. 1337–1346 (2001)

Extended Security Model over Data Communication in Online Social Networks P. Mareswara Rao and K. Rajashekara Rao

Abstract In online social networks (OSNs), privacy is a central and contested concern in real-time scenarios. Real-time software deployed in online social networks raises privacy issues of three kinds: surveillance, social privacy, and institutional privacy. These issues have so far been handled independently of one another because of their different priorities in the OSN setting. In this paper, we analyze them with a consistent focus on social privacy, motivated by the growing number of conflicts that arise when data are shared with different users without any confidentiality. Our approach is based on a hierarchical representation of how the data of different users are processed in online social networks. We present an advanced framework that addresses the above security issues in real-time cloud sharing applications. The proposed approach is further realized as Optimized and Active Learning Genetic Programming (O and ALGP) to provide efficient privacy protection in online social networks. Our experimental results demonstrate effective access control with respect to privacy for data accessibility in online social networks.





Keywords Online social networks · Information privacy · Risk assessment · Active learning and genetic programming · Information assurance



P. Mareswara Rao (✉)
Rayalaseema University, Kurnool, Andhra Pradesh, India
e-mail: [email protected]

K. Rajashekara Rao
Usha Rama College of Engineering and Technology, Telaprolu, Andhra Pradesh, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_24

1 Introduction
The arrival of real-time web-oriented applications has given rise to many types of social services on the web, including email, user Internet, texting, blogging, and web-based services. Among these, the technological phenomenon that has gained the greatest popularity is the social network site (SNS). Over the past few years, the number of members of such social networking services has been growing at a remarkable rate [1–4]. OSNs are systems in which different people are permitted to contribute their thoughts and new ideas and to form communities. Online social networks offer significant advantages both to individuals and to business sectors. Some of the important benefits of online social networks are:
• Enable people to stay connected with each other conveniently and effectively, even on an international level. The connectedness and closeness developed through social networking may contribute to increased self-esteem and satisfaction with life for some students [5, 6].
• Allow like-minded people to find and connect with each other.
• Provide a forum for new modes of online collaboration, education, experience sharing, and trust formation, such as the collection and exchange of reputation for businesses and individuals.
• In the business sector, a well-tuned SNS can enhance the collective knowledge of the organization and engage a broad range of its people in the strategic planning process [7, 8].
Since the success of an SNS depends on the number of users it attracts, there is pressure on SNS providers to encourage designs and behavior that increase the number of users and their connections. However, the security and access control mechanisms of SNSs are relatively weak by design, because security and privacy are not treated as the first priority in the development of social network sites [9, 10].
Thus, along with the benefits, significant privacy and security risks have also emerged in online social networking, and the study of SNS security issues has now become an extensive area of research. At present, most researchers concentrate on risk assessment but tend to neglect risk reduction. With risk assessment alone, information security (IS) risk only gets rated but not minimized or reduced, since risk reduction is quite complex and full of uncertainty. The uncertainty present in the traditional risk reduction process is one of the key factors that influence the effectiveness of Information Security Risk Management (ISRM). It is therefore important to address the uncertainty issue in the information security risk reduction process. In this paper, we develop an advanced risk identification methodology, Optimized and Active Learning Genetic Programming (O and ALGP), for information security in online social networks. The approach treats privacy protection as an optimization problem over the semantic data events that occur in an online community, combining all categories of events in a single protection procedure. The proposed approach uses the following operators: mutation, inheritance, crossover, and selection.

The rest of this paper is organized as follows. Section 2 describes related work on risk assessment in information security for online social networks. Section 3 gives the background on privacy in data sharing. Section 4 defines the proposed optimized model for communication and risk assessment in online social networks. Section 5 evaluates the performance of the proposed approach with respect to security parameters. Section 6 concludes the paper.

2 Related Work
Social networking is a key element of the threat to information. The associated risks cannot always be predicted and prevented, because social networks are made up of people (of different ages and with a great variety of personal characteristics) who wish to create, manage, and share information and to be part of the flow of information. This information complexity and the low level of trust in social networks make it possible to carry out activities that can be regarded as threatening in relation to a subject (user profile information space, geopolitical region, or the Internet as a space in general). Research into and analysis of social networking contributes to system modeling and design, gives a targeted international research community the opportunity to present the latest scientific knowledge on modeling socio-technical systems, and creates an opportunity for risk prediction and for recording and describing the relevance of the economic and technical domains. A social network is a specific environment for risk mitigation measures. Theories of technology and information technology have described various existing risks, including organizational, financial, environmental, political, legal, and security risks. During analysis, special attention should be paid to the standard ISO 31000 (Risk management - Principles and guidelines) (1), whose definition of risk comprises five characteristics of risk inherent to information systems:
1. An effect is a deviation from the expected, positive and/or negative;
2. Objectives can have different aspects (such as financial, health and safety, and environmental) and can apply at different levels (such as strategic, organization-wide, project, product, and process);
3. Risk is often characterized by reference to potential events and consequences, or a combination of these;
4. Risk is often expressed as a combination of the consequences of an event (including changes in circumstances) and the associated likelihood of occurrence;
5. Uncertainty is the state, even partial, of deficiency of information related to understanding or knowledge of an event, its consequences, or its likelihood.
Current information systems are exposed to various risks [11–13]. Awareness of potential risk conditions, and the response to them, is part of risk management. Risk management is a holistic process that consists of several interrelated components. The authors of A Risk Management Standard define the goal of the standard, explain why a standard is needed, establish that it is necessary to agree on terminology, and stress that the standard summarizes best practice. The standard describes the risk management process, the organization, and the risk management objectives. It also defines terms and risk categories and describes the risk policy. The various approaches to risk management do not differ in principle with regard to the process.

3 Background Approach
Recent years have shown that, both in media discourse and in research, the surveillance and social privacy perspectives are treated as separate issues. We next turn our attention to the corresponding privacy research traditions in computer science and give a brief overview of their assumptions, definitions of the privacy problem, methods, objectives, and proposed solutions.

a. Security as Protection from Surveillance, Social and Interface
The set of technologies that we refer to as Improved and Enhanced Security Methodology (I and ESM) grew out of cryptography and computer security research, and is therefore designed following security engineering principles such as threat modeling and security analysis. Traditional security technologies were developed for national security purposes and, later, to secure commercial information and transactions. They were designed to protect state and corporate secrets and to shield organizational operations from disruption [14, 15]. The privacy problems addressed by I and ESM are in many ways a reformulation of old security threats, such as confidentiality breaches or denial-of-service attacks. This time, however, ordinary citizens are the intended users of the technologies, and surveillance entities are the threatening parties from which they need protection. Unsurprisingly, the quintessential user of I and ESM is the "dissident" engaged in political dissent. The goal of I and ESM in the context of OSNs is to enable individuals to engage with others and to share and access information on the web, free from surveillance and interference. Ideally, only the information that a user explicitly shares is available to her or his intended recipients, while the disclosure of any other information to any other party is prevented. Moreover, I and ESM aims to enhance the ability of a user to publish and access information on OSNs by providing ways to circumvent control [16]. With respect to surveillance, the design of I and ESM starts from the activities that are possible in OSNs. Privacy is thus a necessity for users of online social networks who share sensitive information in order to carry out different individual tasks.

4 Proposed Methodology Implementation
a. Basic Procedure
The Optimized and Active Learning Genetic Programming (O and ALGP) method can be applied to represent and provide security in OSNs. In this section, we explain how solutions are obtained through the different operations and parameters, for example population size, crossover rate, and mutation rate, over systematic and semantic data representations. There is no definitive result on which settings are optimal, but settings such as the following are often used: population size 20–30, crossover rate 0.75–0.95, and mutation rate 0.001–0.005. The privacy issues discussed above are reorganized in terms of information security and other security events for data protection in online social networks, following the basic steps shown in Fig. 1. These steps are examined in order to derive privacy policies for data protection in each of the operations presented below.

Evaluation of Fitness
Each fitness value is associated with the systematic data. The chromosome representation that gives the higher value is preferred for semantic data privacy in real-time social network applications. The approach considers maximization first and minimization afterwards. The results are evaluated against all the constraints of the application. All infeasible solutions are eliminated, and fitness functions are computed for the feasible ones.

Operational Selection
Selection concludes the privacy evaluation of all representations within the current generation. The distinct features are extracted from the different resources present in online social networks.
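To make the parameter ranges quoted above concrete, the following Python sketch wires them into a generic genetic algorithm loop. It is an illustration only and not the O and ALGP implementation: the function name `run_ga`, binary tournament selection, single-point crossover, bit-flip mutation, and elitism are assumptions of this sketch, and the fitness function is supplied by the caller.

```python
import random

def run_ga(fitness, n_bits, pop_size=20, crossover_rate=0.9,
           mutation_rate=0.005, generations=50, seed=0):
    """Generic bit-string GA; returns (best individual, best fitness per generation)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(pop, key=fitness)
        history.append(fitness(best))
        next_pop = [best[:]]                      # elitism: carry the best individual over
        while len(next_pop) < pop_size:
            # Selection: binary tournament (an assumed selection scheme).
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            child = p1[:]
            if rng.random() < crossover_rate:     # single-point crossover
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit independently with probability mutation_rate.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

With `fitness=sum` the loop maximizes the number of ones in the bit string; because the best individual is always carried over, the recorded best fitness never decreases from one generation to the next.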


Fig. 1 Risk assessment in online social networks with respect to different features

Crossover Operation
Recombination of individuals produces new candidate solutions for privacy protection in online social networks. The crossover operation acts entirely on the bit representation and disregards the genetic code and other representations of genetic functions.

Mutation Operation
This operation is carried out by functional operations that alter individual solutions, so that changes in the related data processing of the social network, including recently processed data, are taken into account for security.

Risk Reduction Based Optimized and Active Learning Approach
The risk identification process starts from the different perspectives present in real-time application development, based on the process environment. Risk management can begin with an assessment. For example, consider the analysis of the research process in real-time organizations: relevant events are collected as information from different organizations. Each organization maintains its own perspective on event analysis, and other event-planning stages are applicable in event processing. These assessments can be grouped and ordered on the basis of individual assessments within the event processing of the assessment procedure. Resources are thus arranged in sequential order according to their relative importance in dynamic application development. These operations are developed within social network qualification processing, using the similarity of all the related data of online social network security.

b. Implementation
The O and ALGP contains only one basic data structure for real-time social networks: the population of individuals. Each individual, affectionately known as a critter, represents an element of the solution space of the optimization problem. The individuals in O and ALGP are essentially finite-length bit strings. Each such string is called a chromosome of fixed length n [17]. The bit string of the chromosome is the main repository of all the information about the corresponding solution. The variable values are represented in binary, so there must be a way of converting continuous values into binary values and vice versa. The step-by-step implementation procedure is shown below.

Alg. 1. Step-by-step procedure related to data accessibility in social networks
Input: A set of n nodes, an ordering on the nodes, an upper bound u on the number of parents a node may have, and a database D containing m cases
Output: For each node, a printout of the parents of the node
for i := 1 to n do
    πi := ∅;
    Pold := f(i, πi);
    OKToProceed := true;
    while OKToProceed and |πi| < u do
        let z be the node in Pred(xi) − πi that maximizes f(i, πi ∪ {z});
        Pnew := f(i, πi ∪ {z});
        if Pnew > Pold then
            Pold := Pnew;
            πi := πi ∪ {z};
        else OKToProceed := false;
    end {while};
    write('Node: ', xi, 'Parent of xi: ', πi);
end {for};
end {K2};
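The greedy parent search of Alg. 1 can be sketched in Python as below. The scoring function f is not specified in the text, so it is passed in as a hypothetical `score(node, parents)` callable; the function name `k2_parents` and the use of a dictionary for the result are likewise assumptions of this sketch.

```python
def k2_parents(order, score, max_parents):
    """Greedy parent selection for every node, following the loop of Alg. 1."""
    parents = {}
    for idx, node in enumerate(order):
        chosen = set()
        p_old = score(node, frozenset(chosen))
        candidates = set(order[:idx])            # Pred(x_i): nodes earlier in the ordering
        while len(chosen) < max_parents and candidates - chosen:
            # Pick the predecessor whose addition gives the highest score.
            z = max(candidates - chosen,
                    key=lambda c: score(node, frozenset(chosen | {c})))
            p_new = score(node, frozenset(chosen | {z}))
            if p_new > p_old:
                p_old = p_new
                chosen.add(z)
            else:
                break                            # no candidate improves the score
        parents[node] = chosen
    return parents
```

The result maps each node to the set of parents selected for it; with a score computed from the database D, the same loop would reproduce the printout of Alg. 1.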

The difference between the actual function value and the quantization level is the quantization error. The formulas for binary encoding and decoding of the $n$th variable $P_n$ are as follows.

For encoding:

\[ P_{norm} = \frac{P_n - P_{min}}{P_{max} - P_{min}} \]

\[ \text{chromosome}[m] = \operatorname{round}\Bigl\{ P_{norm} - 2^{-m} - \sum_{p=1}^{m-1} \text{chromosome}[p] \times 2^{-p} \Bigr\} \]

For decoding:

\[ P_{quant} = \sum_{m=1}^{n} \text{chromosome}[m] \times 2^{-m} + 2^{-(n+1)} \]

\[ Q_n = P_{quant}\,(P_{max} - P_{min}) + P_{min} \]

Each relation in an online social network follows these relations for its individual attributes, with $0 \le P_{norm} \le 1$, where
$P_{min}$ is the minimum value of the variable,
$P_{max}$ is the maximum value of the variable in the social network,
$\text{chromosome}[m]$ is the binary version of $P_n$,
$\operatorname{round}\{\cdot\}$ rounds its argument to the nearest integer, and
$Q_n$ is the quantized version of $P_n$ for the different features present in online social networks.

In general, this means that a sequence of 0s and 1s is used to represent the decision variables, the collection of which represents a potential solution to the problem. In O and ALGP, we apply genetic operators to an entire population at each generation. The procedure continues until the maximum number of generations is reached or the best possible solution is found.
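The encoding and decoding relations can be tried out with a short Python sketch. Two assumptions are made here and are not taken from the text: each bit is set by a threshold test on the remaining residual (one way of reading the round{·} operation), and P_min is added back on decoding so that the decoded value falls in the original range. The function names `encode` and `decode` are hypothetical.

```python
def encode(p, p_min, p_max, n_bits):
    """Quantize p in [p_min, p_max] into a list of n_bits binary digits."""
    residual = (p - p_min) / (p_max - p_min)      # P_norm in [0, 1]
    bits = []
    for m in range(1, n_bits + 1):
        bit = 1 if residual >= 2.0 ** -m else 0   # assumed threshold reading of round{.}
        bits.append(bit)
        residual -= bit * 2.0 ** -m
    return bits

def decode(bits, p_min, p_max):
    """Map a bit list back to the midpoint of its quantization interval."""
    n_bits = len(bits)
    p_quant = sum(b * 2.0 ** -m for m, b in enumerate(bits, start=1))
    p_quant += 2.0 ** -(n_bits + 1)               # half-step offset
    return p_quant * (p_max - p_min) + p_min
```

With four bits on the range [0, 1], the value 0.75 encodes to `[1, 1, 0, 0]` and decodes to 0.78125, so the quantization error is half of one quantization step.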

5 Experimental Results
Building on the discussion above, the experiments address privacy and security concerns for the information exchanged between users of an online community. The implementation operates on the application's own resources and on external resources, and on their interaction, as they arise in the information systems of an online community. The basic design for data transmission in online social networks, with its related features, is shown in Fig. 2. As the user interface in Fig. 2 shows, the implementation supports the user activities of the online community while applying appropriate data control at each user interface.


Fig. 2 User interface design for online data sharing with different features

A user of the social network proceeds through the following steps. The user first enters the communication system of the social network, registers with user credentials, and then logs in to the social network environment to access its services. The user profile can be configured with the various features available in the social network, including the settings that govern communication with other users. Each user then accesses services and exchanges data with other users through the communication links of the network. The basic operations performed for users of the social network are shown in Table 1. As Table 1 shows, separate test cases were generated for each module, covering the data communication events involved in protecting the services offered to the user. The applications were run with all services and their protection operations enabled, on dynamic real-time events. We could thereby identify how surveillance and social privacy researchers ask complementary questions. We believe that such reflection may help us better address the privacy problems we encounter as OSN users, regardless of whether we do so as activists or consumers.

Table 1 Step-by-step interface statistics related to online social networks

S. no  Module name                    Implementation design                  Expected output                               Result
i      Setup the server               Check GlassFish server                 Work in GlassFish                             Pass
ii     Setup the server               Check Apache server setup              Work in Apache                                Pass
iii    Register                       Verify each detail                     Display "failure"                             Pass
iv     Enter into system              Verify each detail                     Return to operational page                    Pass
v      Classify based on known terms  Classify each frame                    Check classification failure                  Pass
vi     Security for social privacy    Check social security concerns         Check all services in social networks         Pass
vii    Security for institutional     Check institutional security concerns  Check all operations with user credentials    Pass
viii   Security for surveillance      Check each report                      Check all operations related to surveillance  Pass

6 Conclusion

Following the discussion above, we argue that, given the entanglement between surveillance and social privacy in OSNs, privacy research needs a more holistic approach that benefits from the knowledge base of both perspectives. In general, accessing services in a social network involves data event management and data analysis operations, and the security of the social network depends on protecting the related data across all of these operations in order to provide privacy in real-time applications. To implement this efficiently, the application we developed takes a holistic approach to service access by the users of the social network and is linked to the relevant data management operations for commercial event security. A further improvement of our approach is to support multi-party access control features with respect to conflict resolution in online social networks.



Emotional Strategy in the Classroom Based on the Application of New Technologies: An Initial Contribution

Hector F. A. Gomez, Susana A. T. Arias, T. Edwin Fabricio Lozada, C. Carlos Eduardo Martínez, Freddy Robalino, David Castillo and P. Luz M. Aguirre

Abstract The main purpose of a class is for students to generate positive emotions and maintain them throughout the semester; in this way, students not only learn but also retain what they learn so that it can be applied to concrete, real situations. In this study we present a learning mechanism based on the inclusion of new information and communication technologies (NTICs) through our EVA robot. The initial results showed that the level of attention, in terms of positive emotions, was 100% of the class. Over the course of 2 months it decreased, owing to the lack of updates to the operation of the technology introduced in the process. Nevertheless, the results for the study period show a positive emotional level greater than 60% and a concentration level of 80%, all based on the analysis of speech and facial emotion. These are preliminary results on the inclusion of new technologies in the classroom, so we propose conclusions and recommendations that will improve our future research.

Keywords Emotions · Technologies · Student

H. F. A. Gomez (✉) · S. A. T. Arias · F. Robalino
Universidad Tecnica de Ambato, Ambato, Ecuador

T. E. F. Lozada · C. C. E. Martínez · P. Luz M. Aguirre
Universidad Regional Autónoma de Los Andes – UNIANDES, Ibarra, Ecuador

D. Castillo
Universidad Indoamerica, Quito, Ecuador

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_25


1 Introduction

Capturing students' attention in the classroom is vital for efficient academic performance. It has been observed that about 3 to 4% of adults have an attention deficit. This makes it important to address this type of problem in the classroom more effectively. The point is to identify which mechanisms are necessary and sufficient to keep learning from becoming boring and thereby discouraging students. The problem occurs regularly in some theoretical and numerical classes [1], so it needs to be tackled with new teaching techniques. ICTs are a fundamental element of the teaching-learning process and can change the paradigm of traditional education [2]. In this paper we introduce elements of ICT, based on the use of robotics, in search of solutions applied to the study area. The second-level Psychopedagogy students were thoroughly impressed when they first saw the Nao robot. During the class the level of attention was about 100%. Understanding the structure of the robot (EVA), its functionalities and its characteristics led the students to rate the class as the best they had had up to that point. This process was maintained for 2 months. We sought to maintain attention in an eminently theoretical lecture, Research Methodology of Life Stories, and to observe the influence of EVA on it over 2 months, with 6 h per week of face-to-face classes and 4 h per week of autonomous work. The results showed a high level of academic interest in this new form of teaching. We experimented with class videos to capture the students' emotions and processed the data with commercially available tools and with tools developed by our team.

1.1 Emotions

In the search for quality in education, the importance of emotions in the learning process and the way academic matters are handled in the classroom place the emphasis on the student's emotional side. This raises some questions about the affective side of the classroom. Is the teacher a source of motivation or of anguish in the educational process? If so, through what mechanisms are these sensations experienced? [3]. This research relates to maternal, paternal, family and community upbringing and to its strengthening in educational institutions, which are spaces for human relationships, and it recognizes affect as an emotion that supports student development. The methodology focuses on the daily narratives of students and teachers, together with reflections and analysis, and leads to the conclusion that classrooms are spaces that contribute to the formation of knowing, acting and relating. Educational institutions are responsible for a comprehensive education that encompasses the intellectual, emotional and academic development of students. We experience emotions in any space and time: with family, with friends, in our environment, at school, with our educators, and so on [4]. Educational institutions are therefore where the greatest experiences and the theoretical and practical foundations are acquired for developing emotions, which are shared with the people around us with greater or lesser intensity. The methodology is oriented toward pedagogical action and the use of emotional competences through emotional and experiential strategies. From the point of view of positive psychology, it is important to develop strengths in people, and especially in students, as the main component of the emotional side. Happiness includes positive emotions in their broadest and most flexible sense, as well as pleasant ones such as joy and gratitude. This does not imply the absence of negative emotions: happiness is possible with a certain dose of negative feelings, and what matters is that positive emotions predominate [5]. The purpose in education is to look for strategies that make people happy. In this work we use both commercially available tools and tools developed by our team, which help us identify the positive and negative emotions recorded during the teaching-learning process.

1.2 SentiStrength

Sentiment analysis aims to classify a given text, taken from an established document, as positive, negative or neutral. The research in [6], for example, analyzes how users interact with the profiles that Cronoshare maintains on social networks in order to determine whether the comments they generate are positive or negative. SentiStrength relates mood to an extended lexicon. It predicts the positive and negative sentiment of a text simultaneously and assigns each a value on a scale. It is currently very important to evaluate the opinions expressed in messages published on social networks such as Twitter. As a first approximation, analyzing emotions on Twitter means assigning each published message a value related to the emotional load it conveys [7]. Sentiment analysis on Twitter provides relevant data and information about communication over the Internet and is of interest in different areas of knowledge. It is worth emphasizing the analysis of information and of the sentiment reflected in many contexts, such as the desire to buy a product, the fear of an earth tremor, comments on a social issue and, why not, educational settings. The SentiStrength tool is available under license for academic and commercial purposes [8]. It works with short texts and analyzes the language in use. The tool is oriented toward the different needs that people have in the real context that surrounds them, and it yields positive or negative results that support appropriate decisions.
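To make the idea of simultaneous positive and negative scores concrete, the following is a minimal sketch of a dual-scale scorer. It mimics only the output format commonly reported for SentiStrength (a positive strength from 1 to 5 and a negative strength from -1 to -5); the word lists, the strengths and the function names `dual_scale_score` and `polarity` are invented for illustration and are not SentiStrength's lexicon or algorithm.

```python
# Toy dual-scale sentiment scorer (illustrative only, not SentiStrength itself).
# The lexicon entries and their strengths are hypothetical.
POSITIVE_TERMS = {"good": 2, "great": 3, "love": 4, "excellent": 5}
NEGATIVE_TERMS = {"bad": -2, "boring": -3, "hate": -4, "terrible": -5}


def dual_scale_score(text):
    """Return (positive, negative): positive in 1..5, negative in -5..-1."""
    positive, negative = 1, -1  # 1 and -1 mean "no sentiment detected"
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        # Keep the strongest positive and the strongest negative term seen.
        positive = max(positive, POSITIVE_TERMS.get(word, 1))
        negative = min(negative, NEGATIVE_TERMS.get(word, -1))
    return positive, negative


def polarity(text):
    """Collapse the two strengths into Positive, Negative or Neutral."""
    positive, negative = dual_scale_score(text)
    total = positive + negative
    if total > 0:
        return "Positive"
    if total < 0:
        return "Negative"
    return "Neutral"


if __name__ == "__main__":
    sentence = "I love this class, but the slides are boring."
    print(dual_scale_score(sentence), polarity(sentence))
```

Keeping the two strengths separate until the last step is the design point: a sentence can carry both a strong positive and a strong negative signal, and collapsing them is a separate decision.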

1.3 Google API

Google's development technologies have evolved in recent years to solve problems in different areas of human need. One example is Google Ajax Search: the Google Ajax API makes it possible to add Google search results to a website through JavaScript, so that the results are displayed easily and dynamically [9]. From a methodological point of view, integrating Google technologies helps to motivate students in theoretical and practical activities on problems of the social and cultural environment. Data mining currently focuses on categorizing items such as people's tastes, estimating prices, and analyzing text that computer processes can classify as positive or negative for decision making. The project described in [10] classifies images according to whether they express happiness or sadness, with the aim of approaching Big Data and data mining with images in the same way as is done with texts or prices. Google, with Cloud Vision, has invested considerable effort in analyzing sentiment in images in response to certain human needs, that is, in extracting the emotion from a face to predict whether it is happy or sad. Current technologies are no longer concentrated in a common transmission network; the trend today is to use web applications in the cloud. The cloud is the set of information servers deployed in data centers throughout the world, where millions of web applications and huge amounts of data are stored by thousands of organizations and companies, and from which hundreds of thousands of users download and directly execute the programs and software applications stored on the servers of providers such as Google, Amazon, IBM or Microsoft [11].

Google currently provides several cloud services under the name Google Cloud Platform (GCP), which groups computing resources, storage, databases and other services that help users. As indicated, SentiStrength is a short-text classifier. Google's Vision API detects individual objects and faces within images. HER [12] is software that identifies emotions from images and classifies text automatically using a text mining library, intercalation and a Bayesian generalized linear model, as proposed in [13] for the classification of texts. We seek to identify the number of students whose attention is maintained, whether the emotional level is positive when the robot is introduced in class, and whether we can recommend the procedure to other teacher-researchers. The results showed that more than 60% of the speech and emotions in class remained positive. Attention deteriorated in half of the class because no new routines were developed with EVA to attract attention, so new dynamic teaching mechanisms need to be identified. The following section describes prior work on maintaining students' interest and attention. The methodology section explains the mechanism used to obtain the levels of attention during the teaching-learning process in an experiment with 70 Psychopedagogy students over 2 months. Finally, we present the conclusions and future work of this first experimental contribution, based on the fact that the first impression of ICT is fundamental to avoiding boredom and improving academic performance.

2 State of the Art

Classroom management is not optional, since the very act of entering a classroom and standing in front of an audience changes the situation and the activities. The expectations held before entering the classroom, such as concentration and silence, are modified by the presence of a teacher. A classroom is a space for coexistence and a special one, because students spend a great deal of time there during a course. This can cause friction between the actors involved, and these continuous contacts may be neutral, positive or negative. In the effort to make students learn a great deal, it is often forgotten that there are different types of behavior in the classroom, and those who cannot or do not want to study are unconsciously left aside. It can therefore be concluded that some students have learned not to want to study, which increases the number of disengaged students who depend strictly on the solutions proposed by the teacher. A class that is of interest to everyone is based on proactive, formative and assertive discipline, which understands each conflict in the classroom as a sign of an emotional lack.

The student-centered strategies proposed in [14] were guided by constructive alignment and focused on promoting advanced levels of learning. They included interconnected activities and learning-oriented assessment methods, and they were shown to strengthen students' abilities for effective autonomous and collaborative learning. The teaching methods included inverted (flipped) classes, peer learning and role plays. The evaluation of the project was supported by the SPSS and ATLAS.ti tools. The experience took place at a Colombian university, where growing student numbers, competition between universities and limited resources present several challenges. The findings could be important for curricular development or the promotion of good teaching practices.

[15] provides an analysis of students' experiences of an approach to teaching theory that integrates theory teaching and data analysis. The argument supporting this approach is that theory is taught more effectively by using empirical data to generate and test propositions and hypotheses, thus emphasizing the dialectical relationship between theory and data through experiential learning. Bachelor of Commerce students in two substantive second-year organizational theory subjects were introduced to this learning method at a large multi-campus Australian university. Hanushek (2013) uses the TIMSS data to compare the performance of education systems in countries at different levels of economic development by means of educational production functions. The results show that countries with both high and low levels of development have general problems in the efficient use of resources. Specifically, family background exerts a strong influence on school results, so students from economically disadvantaged families perform worse than students from families with medium and high incomes. The model presented in [15] proposes a relationship between students' perceptions of their learning, their enjoyment of the experience and the expected future outcomes. The evaluation reveals that most students "enjoyed this way of learning", believed that the exercise helped their learning of substantive theory, computer applications and the nature of survey data, and felt that what they had learned could be applied elsewhere. The authors argue that this approach has the potential to improve the way theory is taught through the integration of theory, theory testing and theory development, moving away from teaching theory and analysis in discrete subjects and introducing iterative experiences in substantive subjects.

[16] discusses the complexities surrounding the teaching of a module on critical thinking and academic writing in a vocational graduate program. Students enrolled in this program are strongly focused on industry and often do not see the relevance of the module, even though most are international students with English as a second language. Obtaining student acceptance has been a challenge, and initial comments from students and discipline teachers were disappointing. However, this frustration triggered an innovative approach that took the evaluation design as the starting point for restructuring the module. The approach is based on the principles that underlie assessment for learning. Taking into account the diverse interests and backgrounds of the students was crucial in the restructuring and has led to a remarkable improvement in both the attitude and the participation of the students.

In this paper we intend to avoid boredom in class by introducing EVA and to determine whether attention levels based on positive emotions are maintained over a considerable period during which the robot is included in the teaching-learning process, covering theory, recommendations and evaluations. The following section describes the methodology used during the experimental phase.

3 Methodology

EVA was introduced during the classes, and its functionalities, processes and characteristics were discussed. The students were asked to develop functions so that EVA could speak, move, walk and dance. Figure 1 shows a picture of the class in operation. Figure 2 shows the same class as recorded on video; the video was analyzed with the SentiStrength tool and with Naive Bayes to classify the text (the speech in the class), and with Google's API and HER to identify emotions. Table 1 shows a sample of the collected data. Once the data are organized by columns, a relationship analysis is carried out between the columns produced by the tools (SentiStrength with the Google Vision API, and HER with R) using the following rules: Positive + Positive = Positive; Negative + Negative = Negative; Negative + Positive = Positive; Neutral + Negative = Negative; Neutral + Positive = Positive. These jointly processed results define a final textual value (Positive or Negative), which is then assigned its numerical value (Positive = 1, Negative = 0). With the numerical values (1, 0) in the column for the applied tools and in the column for the expert as final results, a Precision and Recall analysis follows, in which the relationship between the System (tools) column and the Expert column is classified as VP (true positive): V = 1, P = 1; VN (true negative): V = 1, N = 0; FP (false positive): F = 0, P = 1; FN (false negative): F = 0, N = 0.

Fig. 1 Detection of emotions during the classes (external universities)

Fig. 2 Detection of emotions during the classes (UTA-CLASS)

Table 1 Data obtained from videos and processing with tools
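The combination rules and the numerical encoding described in the methodology can be sketched as a small lookup. This is an illustrative reading of the rules as listed, not the authors' implementation: the names `COMBINATION_RULES`, `combine` and `encode` are hypothetical, the rules are assumed to be symmetric (the order of the two tool outputs does not matter), and pairs for which the text gives no rule, such as Neutral + Neutral, are rejected rather than guessed.

```python
# Pairwise combination of two tool labels, following the rules listed in the text.
# Assumption: rules are symmetric, so each pair is stored as an unordered set.
COMBINATION_RULES = {
    frozenset(["Positive"]): "Positive",              # Positive + Positive
    frozenset(["Negative"]): "Negative",              # Negative + Negative
    frozenset(["Negative", "Positive"]): "Positive",  # Negative + Positive
    frozenset(["Neutral", "Negative"]): "Negative",   # Neutral + Negative
    frozenset(["Neutral", "Positive"]): "Positive",   # Neutral + Positive
}

# Numerical value assigned to the final textual value.
NUMERIC_VALUE = {"Positive": 1, "Negative": 0}


def combine(label_a, label_b):
    """Return the final textual value for two tool outputs."""
    key = frozenset([label_a, label_b])
    if key not in COMBINATION_RULES:
        # The text defines no rule for this pair (for example Neutral + Neutral).
        raise ValueError(f"no rule given for {label_a} + {label_b}")
    return COMBINATION_RULES[key]


def encode(label_a, label_b):
    """Return 1 for a Positive final value and 0 for a Negative one."""
    return NUMERIC_VALUE[combine(label_a, label_b)]


if __name__ == "__main__":
    print(combine("Negative", "Positive"), encode("Neutral", "Negative"))
```

Raising an error for undefined pairs keeps the sketch honest about what the rules cover; a real pipeline would need an explicit policy for those cases.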

4 Experiment

The data used in this research come from several videos recorded in classrooms with university students (with teacher and student participation), with durations ranging from 1 min 9 s to 2 h 12 min. Text fragments were extracted from each video every 8 s, giving a total of 1218 records, which were processed with tools such as SentiStrength, the Google Vision API, HER (Human Emotion Recognition) and RStudio. The results obtained with the tools are expressed as textual values: positive (joy), negative (anger, sadness) and neutral (fear, surprise). The expert analyzes each of the tool results in the same way, using the same values. The counts VP = 972, VN = 23, FP = 65 and FN = 158 add up to 1218. From these we compute recall and precision,

\[ S = \frac{VP}{VP + FN}, \qquad P = \frac{VP}{VP + FP}, \]

which give S = 0.86 and P = 0.93. These are the basis for the F1 score,

\[ F_1 = \frac{2 \cdot P \cdot S}{P + S}, \]

which gives 0.89, a result that supports the validity of the data and allows conclusions to be drawn from the experiment. Table 2 summarizes the values detailed in this paragraph.
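The grouping of detected emotions into textual values can be illustrated with a short sketch. The emotion-to-polarity table follows the grouping stated in the text; everything else is an assumption made for illustration: the per-fragment score dictionaries, the rule of taking the strongest emotion, the Neutral default for empty input, and the function names `label_fragment` and `count_labels`.

```python
from collections import Counter

# Grouping of emotions into textual values, as stated in the text.
EMOTION_POLARITY = {
    "joy": "Positive",
    "anger": "Negative",
    "sadness": "Negative",
    "fear": "Neutral",
    "surprise": "Neutral",
}


def label_fragment(emotion_scores):
    """Map hypothetical per-emotion scores for one fragment to a textual value."""
    if not emotion_scores:
        return "Neutral"  # assumption: no detected emotion counts as Neutral
    # Assumption: the strongest detected emotion decides the label.
    strongest = max(emotion_scores, key=emotion_scores.get)
    return EMOTION_POLARITY[strongest]


def count_labels(fragments):
    """Count how many fragments fall under each textual value."""
    return Counter(label_fragment(scores) for scores in fragments)


if __name__ == "__main__":
    sample = [
        {"joy": 0.7, "anger": 0.2},
        {"fear": 0.5, "sadness": 0.4},
        {"sadness": 0.9},
    ]
    print(count_labels(sample))
```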

Table 2 F1 score confirming the results obtained by the tools

VP = 972 | FN = 158 | VP + FN = 1130
FP = 65 | VN = 23 | FP + VN = 88
VP + FP = 1037 | FN + VN = 181 | Total = 1218

Sensitivity (recall): S = VP/(VP + FN) = 0.86
Precision: P = VP/(VP + FP) = 0.93
F1 = 0.89
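As a check on the arithmetic, the following sketch recomputes recall, precision and the F1 score from the counts in Table 2 (VP = 972, FP = 65, FN = 158), using the standard definitions. The function name `precision_recall_f1` is hypothetical. The unrounded results are approximately 0.860 for recall, 0.937 for precision and 0.897 for F1, which is consistent with the reported 0.86, 0.93 and 0.89 if the reported figures were truncated rather than rounded.

```python
def precision_recall_f1(vp, fp, fn):
    """Standard precision, recall and F1 from true/false positive and false negative counts."""
    recall = vp / (vp + fn)          # S = VP / (VP + FN)
    precision = vp / (vp + fp)       # P = VP / (VP + FP)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Counts taken from Table 2; together with VN = 23 they sum to 1218 records.
    precision, recall, f1 = precision_recall_f1(vp=972, fp=65, fn=158)
    print(f"recall={recall:.4f} precision={precision:.4f} f1={f1:.4f}")
```

Note that the true negative count VN does not enter any of the three measures, which is why the function does not take it as an argument.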


5 Conclusions and Recommendations

The F1 score supports the conclusion that 50% of the phrases and faces analyzed are positive, 30% neutral and 20% negative. This makes it possible to conclude that the use of technology arouses high interest, especially in theoretical courses that do not usually reach such levels (Araujo Leal, Miranda and Souza 2013; Rodriguez Bernal 2017). It is necessary to implement strategies and methodologies that capture students' attention and thereby improve their academic performance. Over the 2 months of the study, 63.38% positive attention was obtained from the students. The rest declined into negativity and neutrality, so we recommend improving and changing the attention mechanism in class, in this case by improving EVA's functionalities. This implies that, although the robot caught the students' attention at the beginning, positive attention in the class begins to decline when the robot is observed continuously without showing anything new. Eighty-one percent of the time, phrases and faces remain coherent in emotion and discourse, so it can be considered that there is high interest in the teaching mechanism during the development and explanation of the class. The discourse of the teacher and that of the students were not separated, so future research will carry out an individual analysis in order to characterize the emotional state of each participant.

Permission Statements to Use the Images

We thank the universities that took part in this study, and we state that we hold the corresponding permissions for the images in this document, with the consent of the university authorities.

References

1. Martinez-Sierra Gustavo, M.G.-G.: Student's emotions in the high school mathematical class: appraisals in terms of a structure of goals. Int. J. Sci. Math. Educat. 349–369 (2017)
2. Danlie, N.: Designing and Research on English Listening Teaching Assisted by Computer Multimedia. Int. J. Emerg. Technol. Learn. 32–43 (2017)
3. Valle, L.G.: Colombia Médica (1997)
4. Cassà, È.L.: La educación emocional en la educación infantil (Emotional education in infant education), vol. 19, no. 193, pp. 153–167 (2005)
5. Vañó, C.A.C.: Aplicaciones Educativas de la Psicología Positiva
6. De Grado, T.F.: Escola Tècnica Superior d'Enginyeria Informàtica, Universitat Politècnica de València, Grado en Ingeniería Informática, Valencia (2016)
7. Baviera, T.: Técnicas para el análisis del sentimiento en Twitter: Aprendizaje Automático Supervisado y SentiStrength (Techniques for sentiment analysis in Twitter: Supervised Learning and SentiStrength), 33–50 (2016)
8. Santamaría, V.H.: Nuevas herramientas para el análisis de opinión en flujos de texto (2015)
9. Nielsen, E.S.: Integrando tecnologías de desarrollo de Google en las asignaturas de Inteligencia Artificial y Agentes Inteligentes (2009)
10. Diez, M.B.: Big Data para el análisis de sentimientos en imágenes (2017)
11. Olcina Valero, A.N.A.: Desarrollo de aplicaciones web con el API de Google Cloud (2017)
12. Araújo, J.R., Larrea, M.I.P., Alvarado, H.F.G., Morales, P.C., Ratté, S.: Neurociencias aplicadas al análisis de la percepcion: Corazon y emoción ante el Himno de Ecuador. Revista Latina de Comunicación Social (2015)
13. Swathi, V., Kumar, S.S., Perumal, P.A.: Novel Fuzzy-Bayesian classification method for automatic text categorization. IJSRST 2395–6011 (2017)
14. Rodriguez Bernal, C.M.: Student-centred strategies to integrate theoretical knowledge into project development within architectural technology lecture-based. Architect. Eng. Design. Manag. (2017)
15. Blunsdon, B., Reed, K., McNeil, N., McEachern, S.: Experiential learning in social science theory: an investigation of the relationship between student enjoyment and learning. High. Educat. Res. Develop. 43–56 (2010)
16. Strauss, P., Mooney, S.: Assessment for learning: capturing the interest of diverse students on an academic writing module in postgraduate vocational education. Teach. High. Educat. (2017)

Using Clustering for Package Cohesion Measurement in Aspect-Oriented Systems

Puneet Jai Kaur and Sakshi Kaushal

Abstract Packages are the basic programme units in any object-oriented system, including aspect-oriented systems, and allow dependent elements to be grouped. The cohesion of a package is the degree of encapsulation among the elements in the package. For any software to exhibit high quality, it must have high cohesion. Many metrics have been framed in the past to measure cohesion in aspect-oriented systems, but all of them are defined at the class level. Most research has focused on structural cohesion metrics, which measure cohesion from the software design extracted from the source code. The aim of this paper is to propose an approach that explores the use of the hierarchical clustering technique for improving the cohesion of packages in aspect-oriented systems. The results obtained from our approach are then compared with the already available metrics. The results show that our proposed approach determines the cohesion of packages more accurately.

Keywords AOSD · Packages · Cohesion · Package cohesion · Clustering · Hierarchical clustering · PCohA

P. J. Kaur (✉)
Department of Information Technology, U.I.E.T, Panjab University, Chandigarh, India

S. Kaushal
Department of Computer Science and Engineering, U.I.E.T, Panjab University, Chandigarh, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_26

1 Introduction

Packages are the reusable components that support maintainability and reusability in object-oriented systems. Software is rated good if it exhibits high quality. Good-quality software is easily maintainable if it reuses existing packages or develops new ones. Packages are groups of related elements (classes or interfaces).

Since the elements of a package are closely related, a package should exhibit maximum cohesion [1]. Packages with low cohesion contain independent classes, which means that the classes may perform different functions independently. Developing software using such weakly cohesive packages increases the overall effort. Thus, software must contain packages that have high cohesion among their elements and low coupling. Package-level cohesion plays an important role in software packaging, maintainability and reusability [2]. Cohesion is defined as the degree of interrelatedness among the elements; in other words, it is the degree of similarity among the elements. Grouping elements by similarity is known as clustering [4]. Clustering forms clusters from a given set of elements so that the elements in a cluster exhibit high similarity to each other [5]. Clustering techniques are also used in activities such as partitioning [6], architecture recovery [7] and restructuring [5, 8–11]. Aspect-Oriented Software Development (AOSD) is a newer concept than traditional development models; it concentrates on identifying cross-cutting concerns and modularizing them into different functional units [3]. All the metrics for measuring cohesion in an Aspect-Oriented System (AOS) are defined at the aspect/class level only. Since no work has been reported on measuring the cohesion of a package in AOS, we have defined a new metric, PCohA [20], for measuring cohesion at the package level for AOS. Clustering is one of the most commonly used techniques in data mining and has been widely used in feature extraction and data classification. Modularization is defined as the assemblage of a large number of elements into groups called modules, so that the elements in a group are more strongly related to each other than to the elements of other groups. Such groups of strongly related elements are known as clusters in cluster analysis.
A cluster is thus defined as a region of space containing a relatively high density of points, separated from other such regions by regions containing a relatively low density of points [4]. The goal of clustering is to extract an existing natural cluster structure of similar entities. On the basis of similarity measures, a number of clustering techniques are available, such as partitioning clustering, hierarchical clustering, graph clustering, grid-based clustering and density-based clustering. Among these, hierarchical clustering is the most widely used [12]. Hierarchical clustering decomposes data in the form of a tree by either splitting or merging the data points; that is, the data objects are grouped into a hierarchy on the basis of the similarity between them. At each step, a merge or split operation is applied to form the tree. The algorithms of hierarchical clustering are therefore broadly divided into two categories: agglomerative and divisive. An agglomerative algorithm starts by treating each data point as its own cluster. At each stage, the two most similar clusters are merged, and merging continues until only one cluster remains. Since each point initially has its own cluster, this is a bottom-up approach [6]. A divisive algorithm follows a top-down approach. Initially, there is a single cluster containing all the data points. This cluster is further divided into sub-clusters

Using Clustering for Package Cohesion Measurement …


based on the dissimilarity between the points of a cluster. This division continues until each data point has its own cluster [6]. Our framework focuses on the bottom-up approach. The hierarchical clustering algorithm is run with the statistical analysis tool XLSTAT, whose agglomerative hierarchical clustering generates clusters in the form of a dendrogram. The aim of this paper is to propose a framework that explores the use of clustering techniques in measuring the cohesion of packages in AOS using PCohA. The proposed approach has been evaluated using AspectJ examples available as open source [32, 33]. For empirical validation, the obtained results are compared with the structural cohesion metric LCOO and with PCohA. The analysis shows that the proposed approach helps in analysing the design structure of the software better than the other cohesion metrics. The rest of the paper is organized as follows: Sect. 2 gives a brief literature survey of AOSD and of work on using clustering for measuring cohesion. Section 3 defines our approach for measuring cohesion using clustering. Section 4 presents the implementation of our approach on AspectJ examples, followed by a comparison with some existing measures. Finally, Sect. 5 presents the conclusion and future work.

2 Literature Work

In this section, we discuss the various cohesion metrics defined for AOS and highlight the work related to the use of clustering in software development.

2.1 Cohesion Metrics for AOS

In the literature, there are few frameworks for measuring cohesion in AOS, and most of the measures are defined for the aspect-oriented programming language AspectJ. Chidamber and Kemerer [29] explored the concept of similarity in quality measurement for OO software. Zhao [13] proposed a framework for measuring aspect cohesion, which analyses the degree of coherence between an aspect's attributes and modules. Sant'Anna et al. [3] proposed a framework for designing a metric suite for measuring internal quality attributes, which includes metrics for cohesion, coupling, separation of concerns and size. The authors also proposed a metric that measures the lack of cohesion of a component, named Lack of Cohesion in Operations (LCOO). Another framework, given by Ceccato and Tonella [14], defined the cohesion metric Lack of Cohesion in Operations (LCO), measured as the number of operations working on different class fields minus the number of operations working on similar fields. Gelinas et al. [15] proposed ACoh for measuring the cohesion of an aspect using dependency analysis. According to the authors, two types of dependency connections can occur, one between a module and data and another between two modules. These connections are used to measure the cohesion of an aspect. Using similar dependency criteria, Kumar et al. [16] defined different types of connections that affect cohesion and designed a framework on these connections to define a cohesion metric for AOS termed Unified Aspect Cohesion (UACoh).

2.2 Clustering and Software Design

Lung et al. [17, 18] demonstrated the use of clustering techniques for restructuring and partitioning software systems. They concluded that the proposed method could easily recover software design and architecture details. Tzerpos and Holt [19] surveyed approaches to the software clustering problem and showed that they can be used effectively in a software context. Wiggerts [4] analysed clustering methods with a view to their use in the re-modularization of software. Lakhotia [21] compared various subsystem classification techniques in terms of entity description and clustering methods. Anquetil et al. [22, 23] analysed the parameters influencing the re-modularization of a software system using hierarchical clustering. Kang and Beiman [24, 25] applied a cohesion measure to the problem of restructuring, visualizing and quantifying software systems. They suggested restructuring modules during the design or maintenance phases using the relationship between input and output components. Kim and Kwon [26] worked on methods for restructuring modules in the software maintenance phase and suggested restructuring only those modules that perform multiple functions. Alkhalid et al. [27] performed software refactoring at the package level, introducing the Adaptive K-Nearest Neighbour (A-KNN) algorithm for this purpose. Sadaoui and Badri [28] gave a framework for improving the cohesion measurement of classes in object-oriented systems using clustering techniques. They concluded that their approach could interpret the cohesion of classes better than traditional structural cohesion metrics. The literature review shows that no work has been carried out on defining metrics for measuring cohesion at the package level for aspect-oriented systems.
Hence, we have designed a package cohesion metric for aspects, PCohA [1], to explore the significance of packaging in AOS. PCohA is measured by the dependency among the elements of a package. This paper proposes a new framework to measure package cohesion in AOS using hierarchical clustering for determining the design structure of the system. The proposed approach is evaluated using AspectJ examples available as open source. The next section discusses the framework for measuring cohesion using clustering.


3 Framework for Measuring Package Cohesion Using Clustering

This section presents our proposed framework for measuring package cohesion using clustering. In this framework, clusters are formed from the elements of the package (classes, aspects and interfaces) so that similar elements are grouped together. The proposed approach is described in Fig. 1. The steps for using clustering to find cohesion at the package level are: constructing the similarity matrix, performing hierarchical clustering and then calculating cohesion using PCohA.

3.1 Constructing the Similarity Matrix

An element of a package can be referenced by many other elements of the package, and it can in turn refer to many other elements. For example, if an element A creates an instance of another element B, then any call to B initiated by A is considered an access, or reference, to B. The similarity matrix is a square m × m matrix, where m is the number of elements of the package P. It captures the relationships between the elements of the package: if two elements are related to each other, the corresponding entry is 1; otherwise, it is 0.
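The construction described above can be sketched as follows. This is only a minimal illustration: it assumes the references between elements have already been extracted as (source, target) pairs, and the function name `build_similarity_matrix`, the input format, the symmetric treatment of a reference and the sample data are all hypothetical rather than taken from the paper.

```python
def build_similarity_matrix(elements, references):
    """Return an m x m 0/1 matrix for the m elements of a package.

    Entry [i][j] is 1 when element i and element j are related, i.e. when
    either one references the other, and 0 otherwise.
    """
    index = {name: i for i, name in enumerate(elements)}
    m = len(elements)
    matrix = [[0] * m for _ in range(m)]
    for source, target in references:
        i, j = index[source], index[target]
        if i == j:
            continue  # assumption: self-references are ignored
        matrix[i][j] = 1
        matrix[j][i] = 1  # assumption: the relation is treated as symmetric
    return matrix


# Hypothetical package: A creates an instance of B and calls it; C is unrelated.
elements = ["A", "B", "C"]
references = [("A", "B")]
S = build_similarity_matrix(elements, references)
for name, row in zip(elements, S):
    print(name, row)
```

For this toy input, A and B receive a 1 in each other's row, while the row and column of C stay at 0.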

3.2 Hierarchical Clustering

For performing clustering in XLSTAT, agglomerative hierarchical clustering is implemented using the similarity matrix S as input. The information from matrix S is used to develop nested partitions of the elements in the form of a hierarchical tree or dendrogram. For calculating the similarity among elements, the Jaccard coefficient is selected as the similarity measure, as it is mostly used for software entities [30].

Fig. 1 Framework for measuring cohesion using clustering (stages: Source, Similarity matrix, Cluster Formation, Constructing a Graph, Measuring Cohesion)
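The bottom-up procedure can be illustrated with a small, self-contained sketch. It is not XLSTAT's implementation: the linkage rule is not stated in the text, so average linkage is assumed here, and the function names, the `cut` threshold and the sample rows are hypothetical. Each row lists the elements that an element is related to, with the element itself included (also an assumption).

```python
def jaccard_dissimilarity(row_a, row_b):
    """1 - |A ∩ B| / |A ∪ B| for two 0/1 rows of equal length."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return 1.0 if either == 0 else 1.0 - both / either


def agglomerative_clusters(names, rows, cut):
    """Start with one cluster per element and repeatedly merge the two
    closest clusters until the closest pair is farther apart than `cut`."""
    clusters = [[i] for i in range(len(names))]

    def distance(c1, c2):
        # Average linkage (assumption): mean pairwise dissimilarity.
        total = sum(jaccard_dissimilarity(rows[i], rows[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, a, b = min(
            ((distance(clusters[a], clusters[b]), a, b)
             for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda candidate: candidate[0],
        )
        if d > cut:
            break  # cutting the dendrogram at this dissimilarity level
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(names[i] for i in cluster) for cluster in clusters]


# Hypothetical 0/1 rows for four elements.
names = ["A", "B", "C", "D"]
rows = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
print(agglomerative_clusters(names, rows, cut=0.6))
print(agglomerative_clusters(names, rows, cut=0.2))
```

Raising the cut level merges more elements into fewer clusters, which corresponds to reading the dendrogram at a higher dissimilarity index.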

3.3 Cohesion Measurement

Software is composed of different components such as classes and interfaces. The elements of an AOS may include aspects, classes, interfaces, etc. As defined earlier, a package is a group of related elements. Thus, an AOS package is a group of aspects, classes, interfaces and sub-packages that are related to each other in one way or another. Package cohesion is measured as the degree of dependency among its elements. Dependency is measured for all elements that are used directly or indirectly by the package. From the study of various frameworks defined for class cohesion measures [24], the package cohesion metric PCohA is defined as the measure of relatedness between the package members [20]. Based on dependency measures, PCohA is defined as the percentage of the number of relations of each element present in the package to other elements of the package, divided by the total number of elements in the aspect. Since clustering is a technique for forming groups (clusters) of related elements based on their similarity, in this paper we use clustering for measuring package cohesion. Consider the package as an undirected graph whose nodes represent the elements of the package. There is an edge between two nodes if the corresponding elements belong to the same cluster. As a cluster contains only related elements, all pairs of elements within a cluster are related. Since an edge represents a relation between the connected nodes, the total number of edges gives the total number of relations among the elements of the package. Thus, for any package with n elements, package cohesion using clustering can be calculated as in Eq. 1:

$$\mathrm{PCohAc} = \frac{\text{Total number of edges in graph}}{n(n-1)/2} \qquad (1)$$

That is, the total number of edges, which is the total number of relations in the package, is divided by the total number of possible pairs, n(n − 1)/2. The next section gives the empirical evaluation of our proposed framework.
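A minimal sketch of Eq. 1 is given below. It relies on the statement that all pairs of elements within a cluster are related, so a cluster of size k contributes k(k − 1)/2 edges. The function name `pcohac`, the handling of packages with fewer than two elements and the sample clusters are assumptions made for illustration.

```python
def pcohac(clusters):
    """Edges inside clusters divided by all possible pairs n(n-1)/2 (Eq. 1)."""
    n = sum(len(cluster) for cluster in clusters)
    if n < 2:
        return 0.0  # assumption: no pairs exist, so report zero cohesion
    # Every pair inside a cluster is joined by an edge.
    edges = sum(len(cluster) * (len(cluster) - 1) // 2 for cluster in clusters)
    return edges / (n * (n - 1) / 2)


# Hypothetical package of five elements split into two clusters.
clusters = [["a", "b", "c"], ["d", "e"]]
print(round(pcohac(clusters), 2))  # 3 + 1 = 4 edges out of 10 possible pairs
```

A single cluster containing every element gives the maximum value of 1, and a package of singleton clusters gives 0.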


4 Empirical Evaluation of Proposed Framework

In this section, the proposed framework is evaluated empirically by applying the proposed metric to AspectJ packages [32, 33]. To analyse the results of this approach, they are compared with the structural cohesion metric LCOO [31] and the package cohesion metric PCohA [1].

4.1 The Subject–Observer Protocol

This protocol is the AspectJ version of the subject/observer design pattern. The example consists of a coloured label that cycles through a set of colours and keeps a count of the number of cycles it has been through, and a button, an action item that records the number of times it is clicked. A subject/observer relationship is built between these two elements, where coloured labels are the observers and buttons are the subjects: coloured labels observe the clicks of buttons. The protocol is implemented using the interfaces Subject and Observer and the abstract aspect SubjectObserverProtocol. Apart from interfaces and aspects, the package contains four classes: Demo, ColorLabel, Button and Display. Applying the AHC algorithm of XLSTAT produces the dendrogram shown in Fig. 2. The dendrogram gives the following three clusters of elements: C1 = {Display}, C2 = {Button, ColorLabel, Observer, Subject, SubjectObserverProtocol} and C3 = {Demo, SubjectObserverProtocolImpl}. The connected graph showing the relations between elements of the package, obtained after clustering, is shown in Fig. 3. Elements in the same cluster are close to each other and thus similar. From the graph, it is clear that there are three clusters. The number of elements in the package is 8, and the number of relations is 11. From the definition of PCohAc, its value is 0.39. Table 1 gives the values of the package cohesion metric PCohA and the structural cohesion measure LCOO, which is a class-level cohesion metric. It is clear from Table 1 that the package is not cohesive. The value of PCohAc is greater than the value of PCohA, i.e. 0.21, but it still indicates that the elements of the package are weakly related. There are three clusters, one with five elements and the other two with one and two elements, at dissimilarity index 0.8. If the dissimilarity index is kept near 1, there are only two clusters, C1 = {Button, ColorLabel, SubjectObserverProtocol, Observer, Subject, Display} and C2 = {Demo, SubjectObserverProtocolImpl}, and the number of connected edges increases to 16. The value of PCohAc then becomes 0.6, which is much more desirable. It can be concluded that clustering helps in determining the structure of the package and also affects the measurement of cohesion. Hence, measuring cohesion using our approach can reflect the structure of the package better and can thus help in increasing the cohesion and overall quality of the software. A similar approach is followed for calculating PCohAc for the telecom project.

Fig. 2 Dendrogram for subject observer protocol (leaves: ColorLabel, Observer, Subject, SubjectObserverProtocol, Button, Demo, SubjectObserverProtocolImpl, Display; axis: dissimilarity)

Fig. 3 Cluster graph for subject observer protocol (nodes: SubjectObserverProtocolImpl, Demo, SubjectObserverProtocol, ColorLabel, Observer, Subject, Button, Display)

Table 1 Values of selected metrics

Metrics   PCohA   LCOO
Values    0.21    0
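The two PCohAc values quoted for this package can be reproduced from Eq. 1, assuming that every pair of elements inside a cluster is joined by an edge, so that a cluster of size k contributes k(k − 1)/2 edges. The cluster sizes are those listed in the text; the second value, about 0.57, corresponds to the rounded figure of 0.6.

```latex
\begin{align*}
\text{three clusters of sizes } 1, 5, 2:\quad
  & \mathrm{PCohAc} = \frac{0 + 10 + 1}{8 \cdot 7 / 2} = \frac{11}{28} \approx 0.39 \\
\text{two clusters of sizes } 6, 2:\quad
  & \mathrm{PCohAc} = \frac{15 + 1}{8 \cdot 7 / 2} = \frac{16}{28} \approx 0.57
\end{align*}
```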

4.2 Telecom

The telecom project developed in AspectJ is a simple simulation of a telephone system. The telephone system manages the calls that customers make, accept, merge and hang up. The simulation is developed in three configurations: basic, timing and billing, which are programmed as the classes BasicSimulation, TimingSimulation and BillingSimulation, respectively. All three configurations share a common superclass, AbstractSimulation. Other elements of the telecom package are the classes Call, Connection, Local and LongDistance. In total, there are seven elements in the package. Applying the AHC algorithm of XLSTAT produces the dendrogram shown in Fig. 4. From the graph shown in Fig. 5, it is clear that the number of elements in the package is 7 and the number of relations is 5. The values of PCohAc, PCohA and LCOO are given in Table 2.

Fig. 4 Dendrogram for telecom example (leaves: AbstractSimulation, BasicSimulation, LongDistance, Local, Connection, Customer, Call; axis: dissimilarity)

Fig. 5 Cluster graph for telecom (nodes: AbstractSimulation, BasicSimulation, Connection, LongDistance, Local, Call, Customer)

Table 2 Values of selected metrics

Metrics   PCohAc   PCohA   LCOO
Values    0.23     0.38    0.28

As in the observer package, this package also has low cohesion. The value of PCohAc is close to the value of LCOO, i.e. 0.28, which indicates that the elements of the package are not strongly related. There are three clusters, one with three elements and two with two elements each, at dissimilarity index 0.8. If the dissimilarity index is nearer to 1, there are only two clusters with 11 connected edges. PCohAc is then calculated as 0.5, which is good cohesion. Hence, measuring cohesion using our approach helps in increasing cohesion by reflecting the structure of the package better. Following a similar approach, the value of PCohAc is calculated for seven more packages. Table 3 gives the corresponding values of PCohAc, PCohA and LCOO for all packages. Figure 6 compares the three approaches for measuring cohesion. As is clear from Fig. 6, PCohAc gives higher values than PCohA for five packages. This information can be used to redesign the structure of other packages with the help of clustering, so that package cohesion can be increased by re-modularizing the existing packages. Since high cohesion and low coupling are the overall goal for software quality, clustering can help in achieving high cohesion and, in turn, high quality of the system.


Table 3 Values for PCohAc

Package name           No. of elements   No. of connected edges   No. of clusters   PCohAc   PCohA   LCOO
Subject observer       8                 11                       3                 0.39     0.21    0
Telecom                7                 5                        3                 0.23     0.38    0.28
Coordinator.spacewar   8                 10                       4                 0.35     0.15    0.40
Spacewar.spacewar      18                36                       4                 0.23     0.05    0.27
Aspectj observer       8                 9                        3                 0.32     0.46    0
Figure                 7                 7                        3                 0.33     0.25    0
Tracing                4                 1                        3                 0.16     0.33    0.06
Jsim.Queue             11                19                       3                 0.34     0.5     0.16
Cretrevism             9                 8                        4                 0.22     0.19    0.23

Fig. 6 Comparison between approaches for measuring cohesion (series: PCohA, LCOO, PCohAc; vertical axis from 0 to 0.6)
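As a consistency check, the PCohAc column of Table 3 can be recomputed from the element and edge counts with Eq. 1. The sketch below does this; the helper names are hypothetical, and truncation to two decimals (rather than rounding) is an assumption that happens to reproduce every reported value.

```python
import math

# Rows of Table 3: (package, elements, connected edges, reported PCohAc).
TABLE_3 = [
    ("Subject observer", 8, 11, 0.39),
    ("Telecom", 7, 5, 0.23),
    ("Coordinator.spacewar", 8, 10, 0.35),
    ("Spacewar.spacewar", 18, 36, 0.23),
    ("Aspectj observer", 8, 9, 0.32),
    ("Figure", 7, 7, 0.33),
    ("Tracing", 4, 1, 0.16),
    ("Jsim.Queue", 11, 19, 0.34),
    ("Cretrevism", 9, 8, 0.22),
]


def pcohac_from_counts(n, edges):
    """Eq. 1: number of edges divided by the n(n-1)/2 possible pairs."""
    return edges / (n * (n - 1) / 2)


def truncate(value, digits=2):
    """Drop, rather than round, everything after `digits` decimals."""
    factor = 10 ** digits
    return math.floor(value * factor) / factor


for package, n, edges, reported in TABLE_3:
    computed = truncate(pcohac_from_counts(n, edges))
    print(f"{package:22s} computed={computed:.2f} reported={reported:.2f}")
```

For example, the Telecom row gives 5/21 ≈ 0.238, which appears in the table as 0.23.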

5 Conclusion and Future Work

In this paper, clustering methods have been used to measure the cohesion of packages in aspect-oriented systems. Since package cohesion plays an important role in defining the quality of software, a new approach is proposed in this paper for measuring cohesion using clustering at the package level in AOS. In this approach, hierarchical clustering is used to improve the measurement of cohesion at the package level. Hierarchical clustering is performed by giving the similarity matrix, obtained from the elements of the package, as input to the XLSTAT tool. XLSTAT produces the clustering in the form of a dendrogram, which is then used to form graph-based clusters. From the graph-based clusters, package cohesion using clustering (PCohAc) is calculated. Our proposed approach has been evaluated using AspectJ packages taken from the Eclipse platform. The results have shown that the use of clusters for determining cohesion can better illustrate the structure of the package. This information can be used to redesign the structure of other packages with the help of clustering, so that package cohesion can be increased by re-modularizing the existing packages. Since high cohesion and low coupling are the overall goal for software quality, clustering can help in achieving high cohesion and, in turn, high quality of the system. The clustering approach can also help in determining the disparity between the functions implemented by the evaluated package.

References

1. Kaur, P.J., Kaushal, S.: Package level metrics for reusability in aspect oriented systems. In: International Conference on Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE), pp. 364–368. IEEE Amity University, Noida (2015)
2. Gupta, V., Chhabra, J.K.: Package level cohesion measurement in object oriented software. J. Brazilian Comput. Soc. Springer 18, 251–266 (2012)
3. Sant'Anna, C., Garcia, A., Chavez, C., Lucena, C., Von Staa, A.: On the reuse and maintenance of aspect oriented software: an assessment framework. In: 17th Brazilian Symposium on Software Engineering (2003)
4. Wiggerts, T.A.: Using clustering algorithms in legacy systems remodularization. In: Proceedings of the 4th Working Conference on Reverse Engineering, Washington, pp. 33–43 (1997)
5. Czibula, I.G., Czibula, G., Cojocar, G.S.: Hierarchical clustering for identifying crosscutting concerns in object-oriented software systems. In: Proceedings of the 4th Balkan Conference in Informatics, Thessaloniki, vol. 8, no. 3, pp. 21–28 (2009)
6. Anquetil, N., Fourrier, C., Lethbridge, T.: Experiments with hierarchical clustering algorithms as software remodularization methods. In: Proceedings of the Working Conference on Reverse Engineering, Benevento, 23–27 October 1999
7. Lung, C.-H.: Software architecture recovery and re-structuring through clustering techniques. In: Proceedings of the 3rd International Software Architecture Workshop, Orlando, pp. 101–104 (1999)
8. Czibula, I.G., Czibula, G.: A partitional clustering algorithm for improving the structure of object-oriented software systems. Studia Universitatis Babes-Bolyai, Series Informatica, vol. 3, no. 2 (2008)
9. Lung, C., Zaman, M., Nandi, A.: Applications of clustering techniques to software partitioning, recovery and restructuring. J. Syst. Softw. 73(2), 227–244 (2004)
10. Lung, C.-H., Zaman, M.: Using clustering technique to restructure programs. In: Proceedings of the International Conference on Software Engineering Research and Practice, Las Vegas, pp. 853–860 (2004)
11. Serban, G., Czibula, I.G.: Restructuring software systems using clustering. In: Proceedings of the 22nd International Symposium on Computer and Information Sciences, Ankara (2007)
12. Moldovan, G.S., Serban, G.: Aspect mining using a vector space model based clustering approach. In: Proceedings of Linking Aspect Technology and Evolution Workshop, Bonn, pp. 36–40 (2006)
13. Zhao, J.: Towards a Metric Suite for Aspect Oriented Software. Technical report, SE 136–25, Information Processing Society of Japan (IPSJ) (2002)
14. Ceccato, M., Tonella, P.: Measuring the effects of software aspectization. In: Proceedings of the First Workshop on Aspect Reverse Engineering, WARE (2004)
15. Gelinas, J.F., Badri, M., Badri, L.: A cohesion measure for aspects. J. Object Technol. 5(7), 97–114 (2006)
16. Kumar, A., Kumar, R., Grover, P.S.: Towards a unified framework for cohesion measurement in aspect-oriented systems. In: IEEE Proceedings of 19th Australian Software Engineering Conference, Perth, Western Australia, pp. 57–65 (2008)
17. Lung, C.H., Xu, X., Zaman, M., Srinivasan, A.: Program restructuring using clustering techniques. J. Syst. Softw. Elsevier 79, 1261–1279 (2006)
18. Lung, C.-H., Zaman, M., Nandi, A.: Applications of clustering techniques to software partitioning, recovery, and restructuring. J. Syst. Softw. 73(2), 227–244 (2004)
19. Tzerpos, V., Holt, R.C.: Software botryology: automatic clustering of software systems. In: Proceedings of 20th Annual International Conference of the IEEE (1998)
20. Puneet, Sakshi, Arun, Francesco: A framework for assessing reusability using package cohesion measure in aspect oriented system. Int. J. Parallel Prog. Special Issue on Programming Models and Algorithms for Data Analysis in HPC Systems. Springer, 1–22 (2017)
21. Lakhotia, A.: A unified framework for expressing software subsystem classification techniques. J. Syst. Softw. 36(3), 211–231 (1997)
22. Anquetil, N., Fourrier, C., Lethbridge, T.: Experiments with hierarchical clustering algorithms as software remodularization methods. In: Proceedings of Working Conference on Reverse Engineering (1999)
23. Anquetil, N., Lethbridge, T.C.: Comparative study of clustering algorithms and abstract representations for software remodularisation. In: IEEE Proceedings of Software, vol. 150(3), pp. 185–201 (2003)
24. Kang, B.K., Beiman, J.M.: Using design abstractions to visualize, quantify, and restructure software. J. Syst. Softw. 42(2), 175–187 (1998)
25. Kang, B.K., Beiman, J.M.: A quantitative framework for software restructuring. J. Softw. Mainten Res. Pract. 11(4), 245–284 (1999)
26. Kim, H.S., Kwon, Y.R.: Restructuring programs through program slicing. Int. J. Softw. Eng. Knowl. Eng. 4(3), 349–368 (1994)
27. Alkhalid, Alshayeb, Mahmoud: Software refactoring at the package level using clustering techniques. IET Software 5(3), 274–286 (2011)
28. Sadaoui, Badri, Badri: Improving class cohesion measurement: towards a novel approach using hierarchical clustering. J. Softw. Eng. Appl. Sci. Res. 5, 449–458 (2012)
29. Chidamber, S.R., Kemerer, C.F.: A metrics suite for object oriented design. IEEE Trans. Softw. Eng. 20(6), 476–493 (1994)
30. Naseem, Maqbool, Muhammad: Improved similarity measures for software clustering. In: 15th European Conference on Software Maintenance and Reengineering, IEEE, pp. 45–54 (2011)
31. Kaur, P.J., Kaushal, S.: Cohesion and Coupling measures for Aspect Oriented Systems. AETS/7/590, Elsevier, vol. 7, pp. 784–788 (2013)
32. http://eclipse.org/aspectj/doc/released/progguide/examples.html
33. http://abc.comlab.ox.uk/

Fungal Disease Detection in Maize Leaves Using Haar Wavelet Features

Anupama S. Deshapande, Shantala G. Giraddi, K. G. Karibasappa and Shrinivas D. Desai

Abstract Agriculture is the backbone of the Indian economy. Diseases in crops cause huge losses to the economy, and only early detection can reduce these losses. Manual detection of the diseases is not feasible. Automated detection of plant diseases using image processing techniques would help farmers detect diseases earlier and thus prevent huge losses. Maize is an important commercial cereal crop of the world. The aim of this study is the detection of two common fungal diseases of the maize leaf, common rust and northern leaf blight. The proposed system aims at early detection and further classification of leaves into common rust, northern leaf blight, multiple diseases, or healthy, using first-order histogram features and Haar wavelet features based on GLCM features. Two classifiers, namely k-NN and SVM, are considered for the study. With k-NN, the highest accuracy of 85% is obtained for k = 5; the accuracy obtained with SVM-based classification is 88%.





Keywords Maize leaf · Disease detection · First-order histogram features · K-NN classifier · SVM classifier · Haar wavelet · GLCM features







A. S. Deshapande (✉)
Department of Computer Science and Engineering, B.V. Bhoomaraddi College of Engineering and Technology, Hubli, India
e-mail: [email protected]

S. G. Giraddi ⋅ K. G. Karibasappa ⋅ S. D. Desai
School of Computer Science and Engineering, KLE Technological University, Hubli, India
e-mail: [email protected]

K. G. Karibasappa
e-mail: [email protected]

S. D. Desai
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_27


1 Introduction

Agriculture plays an important role in the Indian economy. India is the second largest producer of farm output in the world. According to a 2013 survey, agriculture and forestry together contribute 13.7% of GDP and account for about 50% of the workforce. Today, the economic contribution of agriculture to GDP is declining gradually for many reasons, such as climatic conditions, quality of seeds, diseases in plants, and maintenance issues. Direct yield losses caused by pathogens are estimated at roughly 20 to 40% of global agricultural productivity. Maize is a widely cultivated crop around the globe. It is a source of nutrition as well as of phytochemical compounds, which play an important role in preventing chronic diseases. Global maize production has grown at a CAGR of 3.4% over the last 10 years [1]. Maize is the third most important crop in India after wheat and rice and is cultivated throughout the year. In India, production of maize has increased at a CAGR of 5.5%, from 14 MnMT in 2004–05 to 23 MnMT in 2013–14 [1]. It is used not only as human and cattle food but also as a commercial crop for the production of cornstarch, corn oil, and baby corn. There is therefore a need to detect diseases in plants as early as possible to minimize the losses incurred. Manual detection of plant disease is inaccurate and time-consuming, so various image processing methods are used to detect the diseases that occur on leaves and stems.

Maize leaves can be affected by many diseases. Some of the common ones are common rust, northern leaf blight, brown spots, and southern rust. This study focuses on two main diseases: common rust and northern leaf blight. Angelique et al. [2] give details about maize crop diseases. In common rust, the rust produces distinctive reproductive structures called pustules that erupt through the leaf surface. These typically appear from mid-June to July. Susceptible and long-season hybrids and later planting increase the risk of yield loss. The disease is favored by high relative humidity (greater than 95%) and cool temperatures (60–75 °F). Northern leaf blight is typified by long lesions with tapered ends that are gray-green to tan in color. Residues lying near the soil increase the risk of disease. Susceptible hybrids and high-nitrogen soil also increase disease probability.

Tichkule and Gawali [3] used GLCM features along with K-NN and neural networks for the detection of diseases in cotton, grape, and wheat crops. K-NN provided a good accuracy of 95%. Devi et al. [4] proposed a decision support system to detect plant leaf disease using k-means clustering and FCM-based clustering. The image is segmented using graph cut segmentation, and FCM was found to perform better than k-means clustering. Joshi and Jadhav [5] proposed and built a system for rice plant disease detection and diagnosis using various features such as shape, color, and texture; it uses YCbCr segmentation, and classification is done by K-NN and a minimum distance classifier (MDC). Giraddi et al. [6] used Haar wavelet-based first-order features for diabetic retinopathy detection. Shantala et al. [7] conducted a study on Haar wavelet-based GLCM features for exudates detection in fundus images. Sachin and Shailesh [8] proposed a system for plant leaf disease detection in which segmentation is performed using the FCM clustering algorithm. Various shape features such as center of gravity, orientation, equivalent diameter, and eccentricity are then extracted from the candidate regions, which are classified using a backpropagation neural network. Mohmad Azmi and Muhammad [9] studied disease detection in orchid leaves. The color images are converted to grayscale and then thresholded using Otsu's method. Noise removal is performed using morphological operations, and the remaining regions are classified using a fuzzy classifier with the area property. Kamalpurkar [10] proposed a system for disease detection in grapefruits. The authors considered powdery mildew and downy mildew, which can cause heavy losses to grapefruit. Shape features of the leaf, such as the major axis, are used for classification with an artificial neural network. Mohamed [11] proposed a method of medical image coding using feedforward neural networks with graph-based segmentation of images; prediction is carried out in both lossless and near-lossless manner for comparing FPNN at the compression and decompression levels.

2 Proposed Method

The proposed system aims at detecting disease and classifying the maize leaf into four categories: healthy leaf, common rust, northern leaf blight, or multiple diseases. The schematic diagram of the proposed methodology is shown in Fig. 1.

A. Image Acquisition

Images of maize leaves are captured using a Samsung PL200 digital camera in the agricultural fields of the Agricultural University, Dharwad [8]. To maintain uniform brightness, the images are captured at 11 a.m. from a constant distance of two feet. All images are in PNG format. Sample images of each category are shown in Fig. 2. In total, two hundred images are captured: 50 healthy, 50 with common rust, 50 with northern leaf blight, and 50 with both diseases present.

B. Image Preprocessing

The images are of varying contrast, so preprocessing is necessary to correct nonuniform illumination. The RGB image is converted into the L*a*b* color space and the luminance component (L) is extracted. Contrast-limited adaptive histogram equalization (CLAHE) is applied to the L component, which is then concatenated with the a and b components. Finally, the image is converted from the L*a*b* color space back to RGB.


A. S. Deshapande et al.

Fig. 1 Flowchart of proposed system: Image Acquisition → Image Preprocessing → Feature Extraction → Classification (Healthy Leaf, Common Rust, Northern Leaf Blight, Multiple Diseases)

Fig. 2 Images of maize leaf: a Healthy leaf. b Leaf affected by common rust. c Leaf affected by northern leaf blight. d Leaf affected by both common rust and northern leaf blight

C. Feature Extraction

The red, green, and blue channels of the image are separated. The three channels of a sample image are shown in Fig. 3.

First-Order Histogram Features: First-order histogram statistical moments estimate properties of individual pixel values, ignoring the spatial interaction between image pixels [6]. The first-order histogram features considered in the study are described below

Fungal Disease Detection in Maize Leaves …


Fig. 3 a RGB image of maize leaf. b Red channel of image. c Green channel of image. d Blue channel of image

Fig. 4 Histogram of images. a Histogram of healthy image. b Histogram of leaf affected by common rust. c Histogram of leaf affected by northern leaf blight

[7]. Histograms of healthy and diseased leaves are shown in Fig. 4, and Table 1 shows sample feature values. The R, G, and B components of the image are separated, and six features are extracted from each of them, giving a total of 18 features per image. Mean: The mean is the average intensity level of the image or texture being examined, where n is the number of pixels and x_i is the intensity of the ith pixel.

Table 1 First-order histogram features of maize leaf images

Images                 Mean    Variance   Kurtosis   Skewness   Energy   Entropy
Healthy leaf           3.65    0.15       1.87       3.99       4.85     1.74
Common rust            3.09    0.42       0.35       1.38       5.61     1.53
Northern leaf blight   4.63    −0.18      0.16       1.94       2.99     2.53
Multiple diseases      5.18    0.31       0.24       1.66       1.91     3.00


$$m = \frac{1}{n}\sum_{i=0}^{n-1} x_i. \tag{1}$$

Variance: Variance is the dispersion of the pixel intensities around the mean or expected value. It indicates how widely the intensities are spread.

$$V = \frac{1}{n}\sum_{i=0}^{n-1} (x_i - m)^{2}. \tag{2}$$

Skewness: Skewness measures whether the distribution of the data is symmetric. For a normal distribution it is zero, and for any symmetric distribution it is close to zero. Darker regions tend to be more positively skewed than lighter ones. Skewness is related to the third-order moment. Here, $\sigma$ is the standard deviation, defined as $\sigma = \sqrt{V}$.

$$S = E\left[\left(\frac{x - m}{\sigma}\right)^{3}\right]. \tag{3}$$

Kurtosis: Kurtosis measures whether the data are heavy-tailed or light-tailed relative to a normal distribution. It is related to the fourth-order moment.

$$\text{Kurtosis} = E\left[\left(\frac{x - m}{\sigma}\right)^{4}\right]. \tag{4}$$
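The four moment-based features can be sketched in a few lines of Python. This is only an illustration of the definitions in Eqs. (1) to (4), not the authors' Matlab implementation; the function name `first_order_features` and the sample values are hypothetical, and returning zero for the higher moments of a constant channel is an assumption made here to avoid division by zero.

```python
import math

def first_order_features(pixels):
    """Return (mean, variance, skewness, kurtosis) of a flat list of intensities."""
    n = len(pixels)
    m = sum(pixels) / n                                   # Eq. (1)
    v = sum((x - m) ** 2 for x in pixels) / n             # Eq. (2)
    sigma = math.sqrt(v)
    if sigma == 0:
        # Assumption: for a constant channel the standardized moments are
        # undefined, so report them as 0.
        return m, v, 0.0, 0.0
    s = sum(((x - m) / sigma) ** 3 for x in pixels) / n   # Eq. (3)
    k = sum(((x - m) / sigma) ** 4 for x in pixels) / n   # Eq. (4)
    return m, v, s, k

# Hypothetical intensities of one color channel, flattened to a list.
channel = [1, 2, 3, 4, 5]
mean, var, skew, kurt = first_order_features(channel)
```

Applying the function to each of the R, G, and B channels gives four of the six first-order features per channel.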

The other two features extracted from the histogram are energy and entropy. Energy: Energy measures the extent to which pixel pair repetitions are present and therefore the uniformity of the image. Its value is high when the pixels are very similar. Here p(i, j) is the (i, j)th element of the co-occurrence matrix and G is the number of gray levels.

$$\text{Energy} = \sum_{i=0}^{G-1}\sum_{j=0}^{G-1} \left[p(i,j)\right]^{2}. \tag{5}$$

Entropy: Entropy is a measure of randomness used to characterize the texture of the input image. Its value is maximum when all elements of the co-occurrence matrix are equal.

$$\text{Entropy} = -\sum_{i=0}^{G-1}\sum_{j=0}^{G-1} p(i,j)\,\log p(i,j). \tag{6}$$

Feature Extraction using Haar wavelet and GLCM

Texture analysis plays an important role in image analysis tasks such as disease detection, machine vision, and content indexing of image databases. There are several approaches to texture analysis. In recent years, wavelet representation has emerged as a solid and unified mathematical framework for texture analysis. The


Haar transform has been used as a basic tool in the wavelet transform for feature extraction. Applying Haar wavelet filters produces four nonoverlapping sub-bands: Approximation A1, Horizontal H1, Vertical V1, and Diagonal D1. The low-frequency sub-band A1 can be further decomposed into four sub-bands A2, H2, V2, and D2 at the next coarser scale, as described by Subramanya et al. [12]. First- and second-order statistics of the wavelet detail coefficients provide very good discriminating texture descriptors. Using texture descriptors in the transform domain is similar to the way the human visual system processes images at multiple scales. Figure 5 shows the green channel of the maize leaf and the A1, D1, H1, and V1 coefficients of the image. The method applies the Haar wavelet to the preprocessed RGB image to obtain horizontal, vertical, and diagonal coefficients. Then,

Fig. 5 Haar wavelet features of maize leaves. a Maize leaf. b Approximation coefficient of image. c Diagonal coefficient of image. d Horizontal coefficient of image. e Vertical coefficient of image


a Gray-Level Co-occurrence Matrix (GLCM) is constructed in two directions (0° and 90°) for each of the coefficients. Four statistical features (contrast, correlation, energy, and homogeneity) are computed from each GLCM. From each of the coefficients obtained using the Haar wavelet transform, these four GLCM features are extracted in the two directions 0° and 90°, i.e., a total of eight features for one image. The GLCM-based texture features studied by Haralick et al. [13] and considered in the present study are defined below, where $P_{i,j}$ is the (i, j)th entry of the GLCM and N is the number of gray levels:

$$\text{Contrast} = \sum_{i,j=0}^{N-1} P_{i,j}\,(i - j)^{2}. \tag{7}$$

$$\text{Correlation} = \sum_{i,j=0}^{N-1} P_{i,j}\,\frac{(i - \mu_i)(j - \mu_j)}{\sqrt{\sigma_i^{2}\,\sigma_j^{2}}}. \tag{8}$$

$$\text{Energy} = \sum_{i,j=0}^{N-1} P_{i,j}^{2}. \tag{9}$$

$$\text{Homogeneity} = \sum_{i,j=0}^{N-1} P_{i,j} / R. \tag{10}$$
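The chain from wavelet coefficients to GLCM features can be illustrated with a small pure-Python sketch. It is not the authors' Matlab code. All function names and the 4 × 4 test image are hypothetical, the quantization of detail coefficients to integer gray levels is an assumption (the text does not say how this is done), and the homogeneity weight 1 / (1 + |i − j|) is likewise assumed because R in Eq. (10) is not defined in the text. Naming conventions for the horizontal and vertical detail bands also vary between libraries.

```python
import math

def haar_level1(img):
    """One-level 2-D Haar transform of an even-sized image (list of rows).

    Returns the approximation band and three detail bands (A, H, V, D).
    """
    A, H, V, D = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            p, q = img[r][c], img[r][c + 1]          # top-left, top-right
            s, t = img[r + 1][c], img[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((p + q + s + t) / 2)        # local average
            h_row.append((p + q - s - t) / 2)        # top minus bottom
            v_row.append((p - q + s - t) / 2)        # left minus right
            d_row.append((p - q - s + t) / 2)        # diagonal difference
        A.append(a_row)
        H.append(h_row)
        V.append(v_row)
        D.append(d_row)
    return A, H, V, D

def quantize(band, n_levels):
    """Map real-valued coefficients to integer gray levels 0..n_levels-1."""
    lo = min(min(row) for row in band)
    hi = max(max(row) for row in band)
    span = (hi - lo) or 1.0   # avoid division by zero for a constant band
    return [[min(n_levels - 1, int((x - lo) / span * n_levels)) for x in row]
            for row in band]

def glcm(levels, n_levels, dr, dc):
    """Normalized co-occurrence matrix for pixel pairs separated by (dr, dc).

    Assumes the image contains at least one valid pixel pair for the offset.
    """
    counts = [[0] * n_levels for _ in range(n_levels)]
    rows, cols = len(levels), len(levels[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[levels[r][c]][levels[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]

def glcm_features(P):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    cells = [(i, j, P[i][j]) for i in range(len(P)) for j in range(len(P))]
    mu_i = sum(i * p for i, j, p in cells)
    mu_j = sum(j * p for i, j, p in cells)
    var_i = sum((i - mu_i) ** 2 * p for i, j, p in cells)
    var_j = sum((j - mu_j) ** 2 * p for i, j, p in cells)
    contrast = sum(p * (i - j) ** 2 for i, j, p in cells)
    energy = sum(p ** 2 for i, j, p in cells)
    # Assumption: homogeneity weights each entry by 1 / (1 + |i - j|).
    homogeneity = sum(p / (1 + abs(i - j)) for i, j, p in cells)
    denom = math.sqrt(var_i * var_j)
    covariance = sum(p * (i - mu_i) * (j - mu_j) for i, j, p in cells)
    correlation = covariance / denom if denom > 0 else 0.0
    return contrast, correlation, energy, homogeneity

# Hypothetical 4x4 image with gray levels 0..3.
demo = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3]]
P0 = glcm(demo, 4, 0, 1)     # 0 degrees: neighbor to the right
P90 = glcm(demo, 4, -1, 0)   # 90 degrees: neighbor above
contrast0, correlation0, energy0, homogeneity0 = glcm_features(P0)

# GLCM features of one wavelet detail band, mirroring the pipeline in the text.
_, H1, _, _ = haar_level1(demo)
features_H1 = glcm_features(glcm(quantize(H1, 4), 4, 0, 1))
```

Calling `glcm_features` on the 0° and 90° matrices of one detail band yields the eight values described above.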

D. Classification

The features extracted in the previous phase are used for classification. Two classifiers are used in the study: the k-Nearest Neighbor (k-NN) classifier and Support Vector Machine (SVM)-based multiclass classification.

k-Nearest Neighbor (k-NN) classification: The features extracted from the first-order histogram of the images are used to train the k-NN classifier. k-NN classification gives its output in the form of class membership: an object is assigned to the class that is most common among its k nearest neighbors. The default distance metric is Euclidean distance. Classification is also carried out with other distance metrics, namely standardized Euclidean, Minkowski, and City Block. The algorithm is run for different values of k, the number of neighbors.

Support Vector Machine (SVM) classification: SVMs are supervised learning models used for classification. A set of training image features is given, each marked as belonging to one of the categories. The SVM builds a model that assigns a test image to one of the categories defined during training. The multiclass SVM model is built using the fitcecoc function in Matlab, which takes the training data along with the class labels into which the images are to be classified. Different kernel functions are considered in designing the SVM model.
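The k-NN decision rule described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab classifier used in the study; the function names, the two-dimensional feature vectors, and the labels are hypothetical and do not come from the paper's data.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def city_block(a, b):
    """City Block (Manhattan) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k=5, distance=euclidean):
    """Label the query with the majority class among its k nearest training vectors."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy feature vectors for two of the four classes.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["healthy"] * 3 + ["common rust"] * 3

label_a = knn_predict(train_X, train_y, (0.5, 0.5), k=3)
label_b = knn_predict(train_X, train_y, (5.5, 5.5), k=3, distance=city_block)
```

Passing a different `distance` function or value of `k` corresponds to the variations explored in the experiments.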


3 Results and Discussions

The system involves collecting maize leaf images and classifying them into the categories mentioned above. Matlab 2017a is used for the implementation. The proposed work aims at detecting diseases of the maize crop using k-NN and SVM-based classification techniques. Our database consists of 200 images, comprising three categories of diseased leaves along with healthy leaves. Ten cross-validations are carried out with both the k-NN classifier and the SVM classifier. The results obtained from these classification techniques are presented as graphs in the figures below.

k-NN classifier: Figure 6 presents the accuracy of k-NN classification for varying values of k and different distance metrics using first-order feature extraction. Figure 7 presents the accuracy obtained for k-NN classification using Haar wavelet feature extraction.

Fig. 6 Classification accuracy of k-NN with first-order feature extraction

Fig. 7 Classification accuracy of k-NN with Haar wavelet feature extraction


Fig. 8 Classification accuracy of SVM with first-order feature extraction

Fig. 9 Classification accuracy of SVM with Haar wavelet feature extraction

Support Vector Machines: The study is conducted with various kernel functions. The classification accuracies obtained with these kernel functions are shown in Fig. 8 for first-order feature extraction, and Fig. 9 shows the accuracy obtained from SVM classification using Haar wavelet feature extraction for the three different channels.

4 Conclusion and Future Scope

The proposed work detects maize crop diseases using first-order features and Haar wavelet-based GLCM features. Two classifiers, k-NN and SVM, are used in the study. The system detects two diseases, common rust and northern leaf blight, as well as multiple diseases in a single leaf. The k-NN classifier with the Euclidean distance metric yielded its highest accuracy of 84% (k = 5). SVM yielded its highest accuracy of 85% with the RBF kernel. The accuracy obtained from Haar wavelet feature extraction using k-NN classification is


85% for k = 7, and using SVM it is 88%. The accuracy of the system could be increased by using better-quality images of maize leaves. The study can be extended to detect other diseases of the maize crop, and diseases of other crops with small modifications. A study of disease detection using deep learning approaches with first- and second-order features is under implementation. Quality analysis of images can be performed in various color spaces, and there is scope for computing the features in other color spaces.

Acknowledgements We would like to thank the University of Agricultural Sciences, Dharwad [15] for their cooperation in obtaining maize crop images. We would like to thank Prof. V. B. Nargund, Dept of Plant Pathology, University of Agricultural Sciences, Dharwad for providing domain knowledge and Prof. Shantala Giraddi for the encouragement and guidance.

References

1. India Maize Summit (2014). http://ficci.in/spdocument/20386/India-Maize-2014_v2.pdf
2. Corn Foliar Diseases Identification and Management Field Guide. https://web.extension.illinois.edu/nwiardc/downloads/42809.pdf
3. Tichkule, S.K., Gawali, D.H.: Plant diseases detection using image processing techniques. In: 2016 Online International Conference on Green Engineering and Technologies (IC-GET), pp. 1–6. IEEE (2016)
4. Devi, R., Hemalatha, R., Radha, S.: Efficient decision support system for Agricultural application. In: 2017 Third International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), pp. 379–381. IEEE (2017)
5. Joshi, A.A., Jadhav, B.D.: Monitoring and controlling rice diseases using Image processing techniques. In: International Conference on Computing, Analytics and Security Trends (CAST), pp. 471–476. IEEE (2016)
6. Giraddi, S., Hiremath, P.S., Pujari, J., Gadwal, S.: Detecting abnormality in retinal images using combined haar wavelet and GLCM features. Int. J. Control Theor. Appl. 9(10), 4339–4346 (2016)
7. Giraddi, S., Gadwal, S., Jagadeesh, P.: Abnormality detection in retinal images using Haar wavelet and First order features. In: 2016 2nd International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT), pp. 657–661. IEEE (2016)
8. Jagtap, M.S.B., Hambarde, M.S.M.: Agricultural plant leaf disease detection and diagnosis using image processing based on morphological feature extraction. IOSR J. VLSI Signal Process. (IOSR-JVSP) 4, 24–30 (2014)
9. Bin MohamadAzmi, M.T., Isa, N.M.: Orchid disease detection using image processing and fuzzy logic. In: 2013 International Conference on Electrical, Electronics and System Engineering (ICEESE), pp. 37–42. IEEE (2013)
10. Kamlapurkar, S.R.: Detection of plant leaf disease using image processing approach. Int. J. Sci. Res. Publ. 6(2), 73–76 (2016)
11. Ayoobkhan, M.U.A., Chikkannan, E., Ramakrishnan, K.: Feed-forward neural network-based predictive image coding for medical image compression. Arabian J. Sci. Eng. 1–9 (2017)
12. Subramanya, S.R., Sabharwal, C.: Performance evaluation of hybrid coding of images using wavelet transform and predictive coding. In: Proceedings of Fourth International Conference on Computational Intelligence and Multimedia Applications, 2001. ICCIMA 2001, pp. 426–431. IEEE (2001)
13. Haralick, R.M., Shanmugam, K.: Textural features for image classification. IEEE Trans. Syst. Man Cybernet. 6, 610–621 (1973)
14. Pawar, P., Turkar, V., Patil, P.: Cucumber disease detection using artificial neural network. In: International Conference on Inventive Computation Technologies (ICICT), vol. 3, pp. 1–5. IEEE (2016)
15. University of Agricultural Sciences, Dharwad

Features Extraction and Dataset Preparation for Grading of Ethiopian Coffee Beans Using Image Analysis Techniques

Karpaga Selvi Subramanian, S. Vairachilai and Tsadkan Gebremichael

Abstract Coffee is the natural gift of Ethiopia. Export-quality washed coffee beans of Ethiopia are generally classified into two grades and sundried coffee beans into five grades, based on their number of defects. The objective of this research is to extract features of green coffee beans from images that would help an automated system classify the beans into different grades. Different image processing techniques are applied to the images to perform preprocessing, segmentation and feature extraction. The features extracted from the coffee bean images are broadly classified as morphological, textural and color. The morphological features include area, perimeter, major axis length, minor axis length and eccentricity. Energy, entropy, contrast and homogeneity carry the information relevant to texture. The individual color component values of the three primary colors, along with the hue, intensity and saturation values, are extracted as the color features of the coffee beans. Automated classification and machine learning algorithms need a data set for further processing. Data mining applications that discover patterns for the various grades, and knowledge discovery in general, need a highly accurate dataset. Hence, preparing such a data set by applying image processing techniques is the objective of this research study. The objective is realized with a dataset of 100 observations for each grade with morphological, textural, and color features.





Keywords Feature extraction · Texture features · Morphological features · Color features · Image analysis techniques · Ethiopian coffee bean dataset





K. S. Subramanian (✉)
School of Electrical and Computer Engineering, Ethiopian Institute of Technology-Mekelle, Mekelle University, Mekelle, Ethiopia
e-mail: [email protected]

S. Vairachilai
Faculty of Science and Technology, IFHE University, Hyderabad, India
e-mail: [email protected]

T. Gebremichael
School of Electrical and Computer Engineering, Mekelle University, Mekelle, Ethiopia

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_28


1 Introduction

Ethiopia is the birthplace of Arabica coffee. It is the largest producer of coffee beans in Sub-Saharan Africa and the fifth largest coffee producer in the world after Brazil, Vietnam, Colombia, and Indonesia [1]. Coffee beans are the country’s major export item and a source of high foreign exchange earnings. The name coffee is derived from the Ethiopian province “Kaffa”, where coffee first blossomed [2]. The coffee beans produced and exported by Ethiopia are of two types, namely wet (washed) and dry (unwashed). Washed coffee beans are divided into two grades and sundried coffee beans into five grades based on their number of defects. Of the total production of coffee beans in Ethiopia, 50% comes from gardens, 30% from semi-forest, 10% from plantations and 10% from forest [3]. According to [1], unwashed beans account for 70–80% of total coffee bean exports and washed beans for 20–30%. The Ethiopian Commodity Exchange (ECX) is an organization established by the Government of Ethiopia [4]. ECX provides a platform on which almost all coffee beans are sold on the ECX floor, either directly to importers or to domestic exporters [3]. Another organization, under the Ministry of Agriculture, is the Coffee Quality Inspection Center (CQIC), which is responsible for the further processing of exportable coffee beans that have passed through ECX. According to ECX and CQIC, when coffee beans are graded manually, 40% of the weight is given to physical analysis (raw) and 60% to the cup test (liquor). The number of defects is the primary parameter for grading coffee beans. A total of seven grades are considered, two for washed and five for unwashed beans. Sample coffee beans for all seven grades were collected from these organizations for the research. Currently, the above-mentioned organizations conduct grading manually. Manual inspection faces many challenges in identifying the right grade or quality of the coffee beans. Even though human vision is powerful and can easily identify things, its effectiveness and accuracy are reduced by various dependencies and environmental factors. It is also time-consuming and incurs high labor costs. Hence, the objective of the proposed research work is to extract the physical features of green coffee beans using machine vision and image processing techniques, which in turn can be used as input for the automatic classification of the coffee beans into their target grades by applying machine learning methods [5].

2 Literature Review

Many researchers have worked on similar problems of extracting features from images of different agricultural products and classifying the products into groups with various machine learning methods based on those features. The state-of-the-art literature is reviewed and its findings are summarized in this section.


A study by Sukhvir and Derminder [6] proposed a model that identifies the quality of rice grains by extracting their geometric features. The method extracted seven geometric features of individual rice grains from digital images, and those features were used to train a classifier that assigns the grains to three different classes. A study carried out by Meesha and Nidhi [7] discussed the classification of wheat using machine learning algorithms. They compared Neural Network (LM) and Support Vector Machine (OVR) classifiers that classified the wheat grains into four grades. Based on the results obtained, ANN was found to perform better than SVM. In another study, by Siddagangappa and Kulkarni [8], an automated system was applied to identify the type of grain and to analyze its quality. The paper used a probabilistic Neural Network classifier trained on morphological and color features for identification and classification. The performance achieved in this work was 98% for type identification, a quality analysis rate of 90%, and 92% for grading. Birhanu et al. [9] classified coffee beans from different provinces of Ethiopia, namely Hararghe, Jimma, Yirgachefe and Wellega. Image processing techniques were applied to extract color, morphological and texture features, which were used to predict where the coffee beans were harvested. An ANN was implemented, and accuracies of 95%, 100%, 87.7% and 100% were achieved for color, morphology, texture, and the combination of morphology and color, respectively. The objective of that research was only to predict the place of origin of the coffee beans, not to grade them. Faridah et al. [10] classified Indonesian coffee beans into six grades using texture and color features, employing a Neural Network algorithm. Betelihem [11] proposed a system that detects defective coffee beans. The system identifies broken coffee beans in a mixture of good and defective beans. Area and perimeter were used to calculate a defect metric; coffee beans whose metric value is greater than or equal to 0.65 are considered healthy and the others damaged. In all the above studies, no dataset was explicitly prepared; instead, feature extraction was performed on the fly to train the machine learning algorithm. Where a dataset was prepared, it was for a very specific application and could not be used as a common dataset for similar research studies. Hence, in this research study it was decided to prepare an accurate dataset that includes all the features of Ethiopian green coffee beans needed for grading according to ECX and CQIC. The prepared dataset is general and can be used with any machine learning method.


3 Proposed Methodology for Extracting (Preparing) the Coffee Bean Data Set

No data set had previously been prepared for export-standard coffee beans, and hence we propose to prepare a new data set from scratch. Sample coffee beans were collected from the Ethiopian Commodity Exchange (ECX) and the Coffee Quality Inspection Center (CQIC), the Ethiopian government authorities for coffee bean export and quality checking, respectively. The proposed methodology comprises different image analysis and feature extraction methods. The design of the techniques and the process flow are shown in Fig. 1.

3.1 Image Acquisition

The sample images were captured in a controlled, illuminated environment against a constant white background, with the camera held at a distance of 15 cm and at a 45° angle to the ground. A Canon digital camera with 16.2 MP was used, and uniform images with a resolution of 4608 × 3570 pixels were acquired. A total of 700 images were prepared, 100 for each grade.

3.2 Image Preprocessing

The captured image may be exposed to different environmental factors and affected by noise, so the required information may not be clearly visible. Preprocessing techniques help to improve the quality of the image by removing noise and adjusting the contrast while preserving the information. Based on the sample images acquired, it was decided to use median filtering and histogram equalization for preprocessing. Median filtering was the proper technique

Fig. 1 Block diagram of the proposed methodology


Fig. 2 Original sample image and its histogram

Fig. 3 Enhanced image and its histogram

for removing noise because it preserves the edge information, which was important to our work. Histogram equalization was applied to enhance the contrast, which was necessary to separate the coffee beans from the background. The original image and the image enhanced by histogram equalization are shown in Figs. 2 and 3, respectively. The next step after preprocessing was segmentation [12], the process of partitioning an image into different objects of interest. Here, the coffee bean objects need to be separated from the background. Binarization is the process of converting a grayscale image to a binary image represented by the two values 1 and 0. Thresholding creates a binary image by setting all pixels below some threshold value to zero and all pixels above it to one. Thresholding was applied to separate the coffee beans from the background. The resulting binary image still contains some noise due to the shadows of the beans. The morphological filters erosion and dilation were therefore applied in sequence to the binary image. First, erosion was applied to eliminate the shadows of the beans. Next, dilation was applied to the eroded image to enhance and improve boundary sharpness. These preprocessed images (a sample is shown in Fig. 4) were used to obtain the morphological features.
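The thresholding and erosion-then-dilation sequence can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation: the function names, the 3 × 3 structuring element, the threshold of 128, and the toy image are all assumptions, and beans are taken to be darker than the white background so that pixels below the threshold are marked as object pixels (as in the pseudo code of Algorithm 1).

```python
def threshold(gray, t):
    """Binarize: pixels darker than t become 1 (bean), the rest 0 (background)."""
    return [[1 if p < t else 0 for p in row] for row in gray]

def _morph(binary, keep):
    """Apply a 3x3 neighborhood rule; 'keep' decides the output from the window."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [binary[rr][cc]
                      for rr in range(r - 1, r + 2)
                      for cc in range(c - 1, c + 2)
                      if 0 <= rr < rows and 0 <= cc < cols]
            out[r][c] = 1 if keep(window) else 0
    return out

def erode(binary):
    """A pixel stays 1 only if every in-image pixel of its 3x3 window is 1."""
    return _morph(binary, all)

def dilate(binary):
    """A pixel becomes 1 if any pixel of its 3x3 window is 1."""
    return _morph(binary, any)

# Hypothetical 6x6 grayscale image: white background, one dark 3x3 "bean",
# and one isolated dark pixel standing in for shadow noise.
gray = [[255] * 6 for _ in range(6)]
for r in range(1, 4):
    for c in range(1, 4):
        gray[r][c] = 50
gray[5][5] = 60

binary = threshold(gray, 128)
cleaned = dilate(erode(binary))   # erosion removes the speck, dilation restores the bean
```

In this toy case the isolated pixel disappears while the 3 × 3 object is recovered at its original size.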


Fig. 4 Segmented binary image

3.3 Feature Extraction

Feature extraction is the process of extracting useful information from the image. The acquired sample images underwent the preprocessing techniques described in the sections above. The processed images were the input to the feature extraction phase, which produced a vector of values as the extracted features. The main features extracted from the coffee bean images are morphological, textural and color features. A total of sixteen features (five morphological, four textural and seven color) were extracted for all sample images.

Morphological features

Morphological features are the most important features for coffee bean grading, since they describe the shape and size of the beans and play a vital role in differentiating the grades. The main features used in this research work are:

Area: The size of the coffee bean, calculated as the number of pixels occupied by the bean.

Perimeter: The total number of pixels on the edge or boundary of the object.

Major Axis Length: The distance between the endpoints of the longest line that can be drawn through the coffee bean. The major axis endpoints were found by computing the pixel distance between every combination of border pixels on the bean boundary.


Minor Axis Length: The distance between the endpoints of the longest line that can be drawn through the bean perpendicular to the major axis. Together, the major and minor axis lengths describe the shape of the coffee bean.

Eccentricity: The ratio of the distance between the foci of the ellipse to its major axis length. The value lies between 0 and 1.

$$E = \sqrt{1 - \left(\frac{W}{H}\right)^{2}}, \tag{1}$$

where H is the major axis length and W is the minor axis length of the coffee bean.

Percentage of broken beans: This is a further morphological parameter. It is calculated by dividing the number of broken coffee beans by the total number of coffee beans and multiplying by 100, as given in Eq. (2).

$$\text{Percentage of broken beans} = \frac{\text{Number of broken beans}}{\text{Total number of beans}} \times 100 \tag{2}$$
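As an illustration of how these morphological quantities can be obtained from a segmented binary image, the following Python sketch labels connected components and measures each one. It is not the authors' code. The helper names are hypothetical, 4-connectivity and the boundary-pixel definition of perimeter are assumptions, and eccentricity is computed under the standard ellipse definition (focal distance divided by major axis length) from given axis lengths rather than from a fitted ellipse.

```python
import math
from collections import deque

def connected_components(binary):
    """Return a list of components, each a list of (row, col) pixels (4-connectivity)."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components

def area(pixels):
    """Area as the number of pixels occupied by the object."""
    return len(pixels)

def perimeter(pixels):
    """Number of object pixels that touch at least one non-object 4-neighbor."""
    inside = set(pixels)
    return sum(1 for (y, x) in pixels
               if any(n not in inside
                      for n in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))))

def eccentricity(major, minor):
    """Eccentricity of an ellipse with the given major and minor axis lengths."""
    return math.sqrt(1 - (minor / major) ** 2)

def percent_broken(n_broken, n_total):
    """Percentage of broken beans, as in Eq. (2)."""
    return 100.0 * n_broken / n_total

# Hypothetical segmented image with one 3x3 object and one 1x2 object.
binary = [[0, 0, 0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0, 0, 0],
          [0, 1, 1, 1, 0, 1, 1],
          [0, 1, 1, 1, 0, 0, 0],
          [0, 0, 0, 0, 0, 0, 0]]
beans = connected_components(binary)
areas = [area(b) for b in beans]
perimeters = [perimeter(b) for b in beans]
```

Averaging the per-component values over one image gives a single feature row of the kind stored in the dataset.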

Algorithm 1: Pseudo code for extracting morphological features from a sample image.

1. Read the acquired image
2. Convert the color image to a gray-level image
3. Enhance the gray image by applying histogram equalization and median filtering
4. For each pixel, repeat steps 5 and 6
5. If (pixel value < threshold value), set image intensity = 1
6. Else, set image intensity = 0
7. Count the number of objects by identifying the number of connected components
8. For all connected components in the image, calculate Area, Perimeter, Major Axis, Minor Axis and Eccentricity
9. Calculate the mean value of each feature and save.

Textural features

Texture parameters describe the structural arrangement of surfaces and the intensity variation of objects in the coffee bean image. They make it possible to visualize the pattern and surface properties of coffee beans.

Energy: Measures the concentration of intensity pairs in the co-occurrence matrix.

Entropy: Measures the degree of randomness of the intensity distribution.

Contrast: Measures the difference in strength between intensities in the image.


Homogeneity: Measures the homogeneity of the intensity variation within the image; it is inversely related to contrast.

Algorithm 2: Pseudo code for extracting textural features from a sample image.

1. Convert the color image to a gray-level image
2. Extract the GLCM from the gray-level image
3. Calculate the statistics from the GLCM
4. For all connected components in the image
5. Calculate the mean value of Homogeneity, Contrast, Correlation and Energy and save.

Color Features

In feature extraction, color is an important feature for image representation. After separating the R, G and B components of the image, the mean values of R, G and B are measured. In addition, the mean values of hue, intensity and saturation are calculated. The standard deviation of the R, G, B color values is a further feature in our work. The normalized color components are

$$r = \frac{R}{R + G + B} \tag{3}$$

$$g = \frac{G}{R + G + B} \tag{4}$$

$$b = \frac{B}{R + G + B} \tag{5}$$

The formulas that convert the RGB color space to HSI are as follows.

$$I = \frac{1}{3}(R + G + B) \tag{6}$$

$$S = 1 - \frac{3}{R + G + B}\,\min(R, G, B) \tag{7}$$

$$H = \arccos\left\{\frac{\frac{1}{2}\left[(R - G) + (R - B)\right]}{\left[(R - G)^{2} + (R - B)(G - B)\right]^{1/2}}\right\} \tag{8}$$

Algorithm 3: Pseudo code for extracting color features.

1. Separate the RGB components from the original color image
2. Obtain the HSI components from the RGB components
3. Find the mean value of R, G, B, H, S and I and save.
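The color conversions can be illustrated with a short Python sketch for a single pixel. It follows the normalization and HSI formulas given above, with two additions that are assumptions rather than part of the text: the hue is reflected to 360° − θ when B > G (the usual convention for covering the full color circle), and hue is reported as 0 for gray or black pixels where the formula is undefined. Function names are hypothetical.

```python
import math

def normalized_rgb(R, G, B):
    """Normalized color components r, g, b (each channel divided by R + G + B)."""
    total = R + G + B
    if total == 0:
        return 0.0, 0.0, 0.0     # assumption: report zeros for a black pixel
    return R / total, G / total, B / total

def rgb_to_hsi(R, G, B):
    """Convert one RGB pixel (components in [0, 1]) to (H in degrees, S, I)."""
    total = R + G + B
    I = total / 3.0
    if total == 0:
        return 0.0, 0.0, 0.0     # black: hue and saturation are undefined
    S = 1.0 - 3.0 * min(R, G, B) / total
    num = 0.5 * ((R - G) + (R - B))
    den = math.sqrt((R - G) ** 2 + (R - B) * (G - B))
    if den == 0:
        return 0.0, S, I         # gray pixel: hue is undefined, reported as 0
    # Clamp to [-1, 1] to guard against rounding error before acos.
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    # Assumption: reflect the angle when B > G so that hue spans 0..360 degrees.
    H = theta if B <= G else 360.0 - theta
    return H, S, I

red = rgb_to_hsi(1.0, 0.0, 0.0)
green = rgb_to_hsi(0.0, 1.0, 0.0)
blue = rgb_to_hsi(0.0, 0.0, 1.0)
```

Averaging H, S and I (and R, G, B) over the bean pixels of an image gives the mean color features listed in Algorithm 3.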

Table 1 Data set structure

Group       Feature             Sample 1   Sample 2   Sample 3   Sample 4   Sample 5   …   Sample 700
Morphology  Area                77840.87   72166.37   79240.87   71009.43   65895.09   …   57078.43
            Perimeter           1288.073   1282.303   1322.001   1255.566   1228.042   …   1080.44
            Major axis          382.1573   357.8846   383.4455   354.1265   329.8854   …   304.544
            Minor axis          257.515    254.6251   260.6888   254.1151   234.0021   …   236.8712
            Eccentricity        0.730117   0.696163   0.72611    0.685526   0.698531   …   0.623601
Texture     Contrast            0.109431   0.118927   0.141802   0.154326   0.1415     …   0.059983
            Homogeneity         0.94564    0.940835   0.929429   0.923117   0.929404   …   0.970123
            Correlation         0.962239   0.957347   0.946826   0.933018   0.901864   …   0.947038
            Energy              0.316276   0.300859   0.255949   0.258171   0.433231   …   0.629298
Color       Mean red            161.237    159.213    157.126    156.158    154.265    …   151.721
            Mean green          149.321    147.569    146.458    145.176    144.389    …   141.503
            Mean blue           139.891    137.891    136.782    135.236    134.27     …   130.302
            Mean saturation     0.476      0.454      0.39       0.368      0.355      …   0.31
            Mean intensity      0.792      0.777      0.759      0.724      0.711      …   0.444
            Mean hue            0.214      0.135      0.114      0.091      0.075      …   0.031
            Standard deviation  6.274239   5.932787   4.923659   4.632457   4.534563   …   4.032457


Table 2 Statistics of the features vectors

Group       Feature             Minimum      Maximum     Mean        Median      Standard deviation   Variance
Morphology  Area                49455.6      85338.83    68374.44    69324.67    7888.877             62,234,385
            Perimeter           1018.43      1412.662    1219.232    1231.174    87.45986             7649.227
            Major axis          285.6616     400.0826    347.59      349.6732    26.91645             724.4954
            Minor axis          217.3734     273.778     247.0171    248.8726    11.31733             128.0819
            Eccentricity        0.593971     0.759485    0.689958    0.69674     0.033682             0.001134
Texture     Contrast            0.044696     0.188232    0.124802    0.127016    0.021906             0.00048
            Homogeneity         0.90606339   0.9777721   0.9378811   0.9367678   0.0109379            0.0001196
            Correlation         0.88999064   0.9788017   0.9440837   0.9467684   0.0141146            0.0001992
            Energy              0.181624     0.682005    0.353039    0.31842     0.097421             0.009491
Color       Mean red            150.003      161.832     156.0443    156.423     3.267213             10.67468
            Mean green          140.301      149.911     145.275     145.369     2.45111              6.007939
            Mean blue           130.001      139.932     135.3784    135.456     2.596088             6.739674
            Mean saturation     0.3          0.598       0.383903    0.37        0.059926             0.003591
            Mean intensity      0.444        0.799       0.70577     0.729       0.08846              0.007825
            Mean hue            0.019        0.214       0.091204    0.091       0.043444             0.001887
            Standard deviation  4.032457     6.532212    4.935011    4.732457    0.750029             0.562543



4 Discussion and Analysis of the Result

The features extracted by applying the above-mentioned techniques were stored in an Excel file with an additional attribute for the grade. In total, 700 sample images were used, and 16 features were extracted from each, so each feature vector contains the values of 16 attributes. The attributes area, perimeter, major axis, minor axis and eccentricity describe the shape and size features. The texture details of the coffee beans are described by contrast, homogeneity, correlation, energy, and entropy. The mean values of the three primary colors and their hue, saturation and intensity attributes describe the color features of the coffee beans. The result can be used as a data set for different machine learning applications such as classification, clustering, and other predictive models. In our research work, we plan to develop an optimized classification model by conducting various experiments with different machine learning algorithms and different sets of attributes of the dataset. The output vectors obtained by the feature extraction method are shown in Table 1. The statistics of the features in Table 1 are summarized in Table 2.
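Summary statistics of the kind reported in Table 2 can be computed per feature column with Python's standard `statistics` module, as in the minimal sketch below. The function name `summarize` is hypothetical, and the use of the sample (n − 1) forms of standard deviation and variance is an assumption, since the text does not state which form was used. The example column contains only the six area values printed in Table 1, not the full 700 observations, so its output is not expected to match Table 2.

```python
import statistics

def summarize(values):
    """Return the six summary statistics used in Table 2 for one feature column."""
    return {
        "minimum": min(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Assumption: sample (n - 1) standard deviation and variance.
        "standard deviation": statistics.stdev(values),
        "variance": statistics.variance(values),
    }

# Area values of the six samples shown in Table 1 (a small excerpt of the data set).
area_excerpt = [77840.87, 72166.37, 79240.87, 71009.43, 65895.09, 57078.43]
area_summary = summarize(area_excerpt)

toy_summary = summarize([1, 2, 3, 4])
```

Running `summarize` over each of the 16 feature columns of the full data set would reproduce the layout of Table 2.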

5 Conclusion and Future Work

The main objective of this paper is to prepare a data set by extracting morphological, textural, and color features of exportable Ethiopian coffee beans. A total of 700 images were captured and 16 features were prepared for each image so that they can be used for automatic classification (grading). The extracted features can therefore be used as input to any machine-learning-based classification of coffee beans. They can also serve as an input data set for future research in the data mining area.


An Overview of Internet of Things: Architecture, Protocols and Challenges Pramod Aswale, Aditi Shukla, Pritam Bharati, Shubham Bharambe and Shekhar Palve

Abstract The Internet of Things (IoT) is a fast-emerging technology that is flourishing in many areas, such as smart homes, industry, health care, and many more. From a technological standpoint, it calls for new protocols and scenarios, because the existing protocols cannot handle the growing number of connected devices and the amount of data being transferred. The IoT can be defined as a pervasive, worldwide network that provides a system for monitoring and controlling the physical world with the help of protocols and IoT sensors. In the IoT, objects such as smartphones, smart watches, and similar gadgets are embedded with components such as sensors and linked to a common network, so that any individual can communicate with any resource at any time by using a source that is known in the network. This paper presents a survey of the architecture, protocols, applications, and challenges of the Internet of Things.

Keywords Internet of Things (IoT) ⋅ IoT architecture ⋅ IoT protocols

P. Aswale (✉) Department of E & TC, Sandip Institute of Engineering and Management, Nashik, Maharashtra, India
A. Shukla ⋅ P. Bharati ⋅ S. Bharambe ⋅ S. Palve Department of E & TC, Sandip Institute of Technology and Research Centre, Nashik, Maharashtra, India
A. Shukla ⋅ P. Bharati ⋅ S. Bharambe ⋅ S. Palve Savitribai Phule Pune University, Pune, Maharashtra, India
© Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_29

1 Introduction

The Internet of Things (IoT) has been expanding for several years. The basic aim of this technology is to let different objects, such as RFID tags, NFC devices, sensors, and mobile phones, interact among themselves by means of unique addressing schemes [1]. The first technology to support the IoT was RFID; in 1999 the first such device was designed, with a range of 10 cm to 200 m. The IoT came into the picture in 2010–2011, when the number of physical objects connected to the Internet grew to 12.5 billion, which corresponds to more than one device per person. Despite the importance of developing the IoT, it poses many challenges, such as scalability, reliability, economy, management, interoperability, mobility, and many more [2]. IoT devices typically have little memory, limited battery capacity, and unreliable radio conditions. Because the TCP/IP protocol is not well suited to this situation, developers began to devise new protocols [3]. The IoT requires protocols different from the existing ones. Hence, new protocols such as CoAP, MQTT, mDNS, DNS-SD, IEEE, and many more have evolved, each of which serves different applications. This paper presents a survey of the architecture, the different protocols of the IoT, and the challenges it faces.

2 Architecture

Because a large number of objects are to be connected through the IoT, it requires a flexible, layered architecture. The data capacity should be large enough that an increase in the number of objects does not create congestion [2]. Many IoT models have evolved, distinguished by their type of architecture. The most general model is the three-layer architecture, comprising the perception, network, and application layers. Because traditional architectures face certain challenges, the emerging formats have been designed to avoid them [2]. The types of IoT architecture are as follows: (a) three-layer architecture, (b) middleware architecture, (c) SOA-based architecture, (d) five-layer architecture, and (e) cloud-specific architecture [4].

2.1 Three-Layer Architecture

This is the most general model, suitable for remote health, smart grids, and so on [4]. The three-layer architecture is made up of the perception layer, the network layer, and the application layer. The perception layer is the lowest layer [5]. It consists of various sensors that detect objects and collect data [6]. The network layer, also referred to as the transmission layer, is the middle layer [5]. It is responsible for transmitting and processing the information that the sensors in the perception layer gather [4].


2.2 Middle-Ware Architecture

Middleware is basically software, or a collection of sub-layers, that sits between the technological layer and the application layer [7]. It addresses services such as reliability, scalability, and coordination. It is responsible for connecting the various parts of the network and provides a link for transmitting information among various LANs [4].

2.3 SoA Architecture

This architecture aims at developing a workflow of coordinated services and also ensures the reuse of software and hardware components. In SoA, four layers communicate with each other, namely the application layer, service layer, network layer, and perception layer. The service layer is further divided into two sublayers, the service management sublayer and the service composition sublayer. All layers except the service layer work as they do in the previous architecture models. The service layer is responsible for supporting the application layer. It comprises service composition, service management, service discovery, and service interface. Service discovery is used to discover the wanted service requests [5]. Service management is used to manage each object [4]. Service composition is responsible for communicating with the connected devices and for adding or removing objects as required, and the service interface provides the link for communication between all the services [5].

2.4 Five Layer Architecture

(a) Perception layer: This is the bottommost layer and corresponds to the physical layer of the OSI model. It consists of data acquisition technologies such as RFID, sensors, and two-dimensional objects that collect data on parameters such as weight, temperature, humidity, and many more.
(b) Object abstraction layer/network layer: It is mainly involved in transmitting data to the layers above through secured links. Information is transmitted to the middleware information-processing system via networking techniques such as W-wire, 3G, GSM, Wi-Fi, and so on.
(c) Service management/middleware layer: It provides a link between the application layer and the sensors to ensure competent interaction between software components.
(d) Application layer: It provides the different application services required by the customer, and all interpretation of the data takes place at this level [2].
(e) Business/management layer: It monitors and manages the remaining four layers. It looks after the complete set of IoT services and applications and produces high-level analysis reports. This layer helps higher authorities to take the right decisions [2].


2.5 Cloud-Specific Architecture

This architecture integrates virtualization with networking, storage, and computation to provide the extensibility needed to meet the differing, and at times competing, needs of various fields.

3 Protocols

Protocols define the format required for transmitting data from one device to another, whether on the same network or on different networks. The IoT requires different protocols for a number of activities [2]. Because the IoT connects a large number of objects, it creates enormous traffic and needs a massive data capacity. As a result, the IoT faces certain issues, particularly privacy and security, and new protocols therefore need to be developed [8]. Protocols are broadly classified as application protocols, service discovery protocols, infrastructure protocols, and other influential protocols [6] (Fig. 1).

Fig. 1 Classification of protocols [6]


Fig. 2 a CoAP architecture [6]. b CoAP message format [6]

3.1 Application Protocol

3.1.1 CoAP: Constrained Application Protocol

The Internet Engineering Task Force (IETF) developed CoAP, which mainly targets machine-to-machine applications [22]. HTTP is too complex for small IoT devices, so CoAP was specified on the basis of REST, i.e., Representational State Transfer. It is a web transfer protocol that helps to transmit data between servers and clients [3]. REST allows clients and servers to expose and consume web services, as the Simple Object Access Protocol (SOAP) does, but in a simpler way, using uniform resource identifiers (URIs) as nouns and the HTTP GET, POST, PUT, and DELETE methods as verbs [6] (Fig. 2). The request/response sublayer takes care of REST communication, while the messaging sublayer handles every single message transmitted between end users, detects duplicates, and ensures reliable communication [2, 6]. CoAP has a publish/subscribe architecture that supports multicast communication. The header consists of the following fields: Ver is the version, T is the type of transaction, OC is the option count, and Code indicates the request method (1–10) or response code (40–255) [6].
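The header fields listed above can be unpacked with simple bit operations. The sketch below is illustrative only and rests on assumptions the text does not state: it assumes the early-draft layout in which Ver and T take 2 bits each, OC takes 4 bits, Code takes one byte, and a 16-bit message ID follows. The published CoAP standard (RFC 7252) uses the same widths but replaces the option count with a token length. The function name `parse_coap_header` is hypothetical.

```python
import struct

def parse_coap_header(datagram: bytes) -> dict:
    """Decode the fixed 4-byte header in the layout assumed above."""
    if len(datagram) < 4:
        raise ValueError("a CoAP header needs at least 4 bytes")
    first, code, message_id = struct.unpack("!BBH", datagram[:4])
    return {
        "ver": first >> 6,           # 2-bit version
        "t": (first >> 4) & 0x03,    # 2-bit transaction type
        "oc": first & 0x0F,          # 4-bit option count (assumed width)
        "code": code,                # request method or response code
        "message_id": message_id,    # 16-bit ID, used to detect duplicates (assumed field)
    }

# Example bytes chosen for illustration: version 1, type 0, one option, code 1.
header = parse_coap_header(bytes([0x41, 0x01, 0x12, 0x34]))
print(header)
```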

3.1.2 MQTT: Message Queue Telemetry Transport

MQTT is used to collect measurements from remote sensors and send the data to servers [5]. It aims to link embedded devices and networks with applications and middleware. MQTT is lightweight, open, and easy to implement. It runs over TCP/IP or another network protocol that provides ordered, lossless, bidirectional links [6]. It comprises three components: subscriber, broker, and publisher. An interested device registers as a subscriber for certain topics and is informed by the broker whenever a publisher publishes on those topics. The publisher sends the information to the interested subscribers through the broker. The broker ensures security by checking the authorization of the publishers and subscribers [6]. Each client sends a CONNECT message to the broker to establish a connection. To keep the connection active, the client must send messages to the broker at regular intervals, in the form of either data or a ping. Clients can also transmit PUBLISH messages, which carry a topic and a message, or send SUBSCRIBE messages to receive packets. They can send UNSUBSCRIBE to stop receiving messages on a particular topic, and each packet sent is acknowledged [3] (Fig. 3).

Fig. 3 a Architecture of MQTT [6]. b Message format of MQTT [6]

The first 2 bytes of a message form the fixed header. In this format, the message type field identifies the kind of message, such as CONNECT, CONNACK, PUBLISH, or SUBSCRIBE. The DUP flag shows whether the message is a duplicate that the receiver may already have received. The QoS level field identifies one of three levels of QoS. The retain field tells the server to keep the last received PUBLISH message and deliver it to new subscribers as their initial message. The remaining length field gives the length of the rest of the message, that is, the optional part [6]. MQTT is currently used by the Facebook Messenger application, as it allows messages to be sent in milliseconds [9]. It is also used in many other applications, such as machine-to-machine messaging and health care monitoring [3].
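The fixed header described above can be decoded as follows. This is a minimal sketch under the MQTT 3.1.1 bit layout (4-bit message type, then the DUP, QoS, and RETAIN bits, followed by a variable-length remaining-length field of up to four bytes); the function name `parse_fixed_header` and the abbreviated type table are illustrative choices, not part of the paper.

```python
# Subset of MQTT control packet types, keyed by the 4-bit type code.
MESSAGE_TYPES = {1: "CONNECT", 2: "CONNACK", 3: "PUBLISH",
                 8: "SUBSCRIBE", 10: "UNSUBSCRIBE", 12: "PINGREQ"}

def parse_fixed_header(packet: bytes) -> dict:
    """Decode the message type, flags, and remaining length of an MQTT packet."""
    if len(packet) < 2:
        raise ValueError("an MQTT fixed header needs at least 2 bytes")
    flags = packet[0]
    remaining, multiplier, index = 0, 1, 1
    while True:
        if index >= len(packet) or index > 4:
            raise ValueError("malformed remaining-length field")
        byte = packet[index]
        remaining += (byte & 0x7F) * multiplier  # low 7 bits carry the value
        multiplier *= 128
        index += 1
        if not (byte & 0x80):                    # high bit clear: last length byte
            break
    return {
        "type": MESSAGE_TYPES.get(flags >> 4, "OTHER"),
        "dup": bool(flags & 0x08),
        "qos": (flags >> 1) & 0x03,
        "retain": bool(flags & 0x01),
        "remaining_length": remaining,
    }

print(parse_fixed_header(bytes([0x3B, 0x0A])))
```

For packets whose remaining length is below 128, the fixed header is exactly the 2 bytes mentioned in the text; longer packets extend the length field by up to three more bytes.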

3.1.3 XMPP: Extensible Messaging and Presence Protocol

XMPP is an IETF instant messaging (IM) standard that is used for multi-party chat, voice and video calls, and telepresence. It lets users communicate with each other by sending instant messages over the Internet, regardless of the operating system used. XMPP allows IM applications to achieve access control, authentication, end-to-end encryption, privacy measurement, and compatibility with the remaining protocols [6]. It supports both publish/subscribe and request/response messaging systems. It ensures low-latency message exchange and has TLS/SSL security [2].

Fig. 4 a Publish/subscribe mechanism in AMQP [6], b Message format of AMQP [6]

3.1.4 AMQP: Advanced Message Queuing Protocol

AMQP is an open standard application layer protocol for the IoT that focuses on message-oriented environments. It assures dependable communication through message delivery guarantees, namely at-most-once, at-least-once, and exactly-once delivery. It needs a dependable transport protocol such as TCP to transfer messages [6]. Communication is handled by two important components, exchanges and message queues. AMQP supports interoperation between clients and message middleware servers. It is a binary protocol [2]. There are two types of messages, namely bare messages, which are supplied by the sender, and annotated messages, which are seen by the receiver [6] (Fig. 4).
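The interplay of the two components named above can be illustrated with a toy, in-memory model. The sketch assumes the common AMQP arrangement in which an exchange routes each message to the queues bound to its routing key; the class `DirectExchange`, the routing keys, and the message bodies are hypothetical and do not use any real AMQP library.

```python
from collections import defaultdict, deque

class DirectExchange:
    """Toy exchange that routes each message to the queues bound to its routing key."""

    def __init__(self):
        self.bindings = defaultdict(list)  # routing key -> list of bound queues

    def bind(self, queue: deque, routing_key: str) -> None:
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key: str, body: str) -> int:
        """Append the body to every bound queue and return how many received it."""
        queues = self.bindings.get(routing_key, [])
        for queue in queues:
            queue.append(body)
        return len(queues)

exchange = DirectExchange()
temperature_queue = deque()
exchange.bind(temperature_queue, "sensor.temperature")
delivered = exchange.publish("sensor.temperature", "21.5")
unrouted = exchange.publish("sensor.humidity", "40")  # no queue bound to this key
print(delivered, unrouted, list(temperature_queue))
```

A real broker adds the delivery guarantees mentioned in the text (acknowledgements, persistence, redelivery), which this sketch leaves out.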

3.1.5 DDS: Data Distribution Service

DDS is a publish/subscribe protocol for real-time machine-to-machine communication, developed by the Object Management Group (OMG). Unlike other publish/subscribe protocols, DDS is based on a brokerless architecture and uses multicasting to provide better reliability and quality of service to applications [6]. The DDS architecture defines two layers, namely Data-Centric Publish-Subscribe (DCPS) and the Data-Local Reconstruction Layer (DLRL). DCPS delivers the data to the subscribers. DLRL is an optional layer and serves as the interface to the DCPS functionalities [6].


3.2 Service Discovery Protocol

3.2.1 Multicast DNS (mDNS)

mDNS is a protocol that can perform the operations of a unicast DNS server. It is flexible, because it uses DNS without any extra expenditure or configuration. It is an apt choice for embedded Internet-based applications for three reasons: (1) it needs no manual configuration or additional administration; (2) it can run without infrastructure; (3) it can continue working even if the infrastructure fails. mDNS resolves a name by transmitting an IP multicast message to all nodes in the local domain. Through this query, the client asks the device that holds the given name to reply. As soon as the target machine receives its name, it multicasts a response message that contains its IP address.
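The query that a client multicasts is an ordinary DNS question. The sketch below only builds the bytes of such a query and does not send them; it uses the standard DNS wire format (a 12-byte header followed by length-prefixed labels) and the well-known mDNS IPv4 group 224.0.0.251 on UDP port 5353. The host name `sensor.local` and the function name `build_query` are made up for illustration.

```python
import struct

MDNS_GROUP = "224.0.0.251"  # IPv4 multicast address used by mDNS
MDNS_PORT = 5353            # UDP port used by mDNS

def build_query(name: str, qtype: int = 1) -> bytes:
    """Build a one-question DNS query (qtype 1 = A record) for a host name."""
    # Header: ID 0, flags 0, one question, no answer/authority/additional records.
    header = struct.pack("!HHHHHH", 0, 0, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class 1 = IN
    return header + question

packet = build_query("sensor.local")
print(len(packet), packet.hex())
```

To issue the query, the packet would be sent over a UDP socket to `(MDNS_GROUP, MDNS_PORT)`; the device that owns the name then answers with its address, as described above.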

3.2.2 DNS-SD: DNS Service Discovery

The pairing of clients with the services they need by means of mDNS is termed DNS-based service discovery. With this protocol, clients can discover the required services in a particular network by using standard DNS messages [6]. DNS-SD uses mDNS to send packets to particular multicast addresses through UDP. The service discovery process consists of two important steps: (a) searching for the hostname of the needed service, and (b) pairing the IP addresses with their hostnames with the help of mDNS.
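The two steps above can be traced with a small in-memory lookup. This is a sketch under assumptions: the record tables stand in for the answers that devices on the local link would return over mDNS, and the service type, instance name, host name, and address are invented for illustration. DNS-SD itself uses PTR records to list service instances and SRV records to name the host and port, which is what the tables imitate.

```python
# Hypothetical records that devices on the local link would answer with.
PTR_RECORDS = {"_coap._udp.local": ["thermostat._coap._udp.local"]}
SRV_RECORDS = {"thermostat._coap._udp.local": ("thermostat.local", 5683)}
A_RECORDS = {"thermostat.local": "192.168.1.20"}

def discover(service_type: str) -> list:
    """Return (instance, host, port, ip) for every instance of a service type."""
    results = []
    for instance in PTR_RECORDS.get(service_type, []):
        host, port = SRV_RECORDS[instance]  # step (a): find the hostname of the service
        ip_address = A_RECORDS[host]        # step (b): pair the hostname with its IP
        results.append((instance, host, port, ip_address))
    return results

print(discover("_coap._udp.local"))
```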

4 Challenges

The IoT faces challenges such as energy supply, security, and many more. They differ from one another and fall into distinct categories [10].

1. Authenticating devices: Every device that uses sensors or similar objects must follow certain policy and proxy rules for confirmation, which authorize the sensor to make its information public. At present, providing security for such objects comes at a high cost [11].
2. Identification of the IoT environment: Identification is important in all layers of the IoT. It is the most important challenge, because the IoT will have vast applications and structures with different patterns and characteristics. It is more difficult in a distributed environment, which is the main focus of the IoT, but the challenge also remains valid for bounded and closed environments. Different parameters, such as security, governance, and privacy, determine the main use of the identifier. Hence, there is a need for a global reference for identification [11].
3. Data management: One of the most important issues is managing the data. Cryptographic mechanisms and related protocols are the most efficient options for protecting the data, but at times it is not possible to deploy these techniques. Hence, there should be policies for managing the data regardless of its type, but implementing them requires changes to many existing mechanisms [11].
4. Continuous operation: IoT applications are not like those of computers. Computers have humans to give them commands, whereas smart objects have to configure themselves without any human intervention, adjust to any situation, and act independently. They require appropriate sensors for making decisions [10].
5. Detection: In the context of the Internet of Things, the population of objects grows with the human population, as every human carries more than one device. These additional devices should be detected, and what is happening on them should also be known [10].

5 Conclusion

The Internet of Things is an evolving technology that is changing modern lifestyles by connecting a virtually limitless number of devices. In the near future, a web of IoT devices will connect devices worldwide, bringing nations closer. It will help people to connect and to obtain information anytime and anywhere in the world. This paper has presented a survey of the architecture, protocols, and challenges of the IoT.

References

1. Agrawal, S., Vieira, D.: A survey on Internet of Things. Abakós, Belo Horizonte 1(2), 78–95 (2013). ISSN: 23169451
2. Choudhary, G., Jain, A.K.: Internet of Things: a survey on architecture, technologies, protocols and challenges. In: IEEE International Conference on Recent Advances and Innovations in Engineering (ICRAIE-2016). Jaipur, India, 23–25 Dec (2016)
3. Florea, I., Rughinis, R., Ruse, L., Dragomir, D.: Survey of standardized protocols for the Internet of Things. In: 2017 21st International Conference on Control Systems and Computer Science
4. Belkeziz, R., Jaris, Z.: A survey on Internet of Things coordination. IEEE (2016)
5. Lin, J., Yu, W., Zhang, N., Yang, X., Zhang, H., Zhao, W.: A survey on Internet of Things: architecture, enabling technologies, security and privacy, and applications, pp. 2327–4662. IEEE (2016)
6. Al-Fuqaha, A., Guizani, M., Mohammadi, M., Aledhari, M., Ayyash, M.: Internet of Things: a survey on enabling technologies, protocols and applications. pp. 1553–877X (2015)
7. Atzori, L., Morabito, G., Lera, A.: The Internet of Things: a survey. https://www.researchgate.net/
8. Kraijak, S., Tuwanut, P.: A survey on Internet of Things architecture, protocols, possible applications, security, privacy, real-world implementation and future trends. In: Proceedings of ICCT (2015)
9. Năstase, L.: Security in the Internet of Things: a survey on application layer protocols. In: 2017 21st International Conference on Control Systems and Computer Science
10. Nalbandian, S.: A survey on Internet of Things: applications and challenges. In: Second International Congress on Technology, Communication and Knowledge (ICTCK). Mashhad Branch, Islamic Azad University, Mashhad, Iran, 11–12 Nov 2015
11. Zolanvari, M.: IoT security: a survey. http://www.cse.wustl.edu/%7Ejain/cse570-15/ftp/iot_sec/index.html

Assay: Hybrid Approach for Sentiment Analysis D. V. Nagarjuna Devi, Thatiparti Venkata Rajini Kanth, Kakollu Mounika and Nambhatla Sowjanya Swathi

Abstract We live in an age of massive online business, e-governance, and e-learning. All these activities involve transactions between customers, business people, service providers, and recipients. Usually, the recipients comment on the quality of the products and services. In this study, we propose an algorithm named ASSAY (which means analysis) to find the polarity at the document level. In our algorithm, we first classify the reviews of each domain using the naive Bayes and Support Vector Machine (SVM) algorithms, which belong to the machine learning approach, and then find the polarity at document level using HARN's algorithm, which belongs to the lexicon-based approach. In this algorithm, we use TextBlob for Parts of Speech (POS) tagging, and NV-Dictionary, ordinary dictionary, and SentiWordNet for extracting the polarities of features. We thus combine the machine learning and lexicon-based approaches to calculate the result at document level accurately. In this way, we obtain a result that is about 80–85% accurate, which is more accurate than HARN's algorithm of the lexicon-based approach alone.

Keywords NV-Dictionary ⋅ Ordinary dictionary ⋅ TextBlob

D. V. N. Devi (✉) ⋅ K. Mounika ⋅ N. Sowjanya Swathi Rajiv Gandhi University of Knowledge Technologies, Nuzvid, India
T. Venkata Rajini Kanth Jawaharlal Nehru Technological University, Hyderabad, India
© Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_30


1 Introduction

The process of computationally analyzing a piece of text (a review) provided by a customer and determining the sentiment or opinion it expresses is called opinion mining. A lot of research on opinion mining is under way in the technology industry. Dave et al. introduced sentiment analysis in 2003, and research on opinion mining has been growing briskly since 2000. Sentiment analysis is one of the fascinating applications of data mining. Nowadays the aim is automation, because it is difficult for service providers to read all the opinions and reviews manually, draw a conclusion from them, and take the decisions needed to upgrade the quality of the product or service. Web-based Sentiment Analysis (SA) has emerged as one of the effective computational tools for identifying and categorizing the opinions expressed in a piece of text about the merits and demerits of products and services. The main focus of this paper is to classify the products and then find the polarity at both sentence and document levels.

2 Related Work

Today, much research is under way on sentiment analysis and opinion mining. In our survey, we found many algorithms in the machine learning approach and a few in the lexicon-based approach. Among these, we use naive Bayes (NB) and support vector machine [1] from the machine learning approach to classify reviews into domains, and HARN's algorithm [2] from the lexicon-based approach to extract their polarity at sentence level [3]. Reference [4] validates its results with 10-fold cross-validation and shows that, compared with the baseline results, NB gives an improvement of +28.78% and SVM gives an improvement of +63.6%.

3 Illustrations

Figure 1 represents the evolution of sentiment analysis. In this hybrid method, we use both supervised learning and lexicon-based approaches to obtain the result. Using the naive Bayes and SVM algorithms, we first classify the reviews and then determine their polarity. To do so, we designed a new method that yields the polarity at the document level or the sentence level.


Fig. 1 Sentiment analysis

4 The Proposed Method

Figure 2 describes the classification algorithms (naive Bayes and SVM) used in this model.

4.1 Classification of Domains Using Machine Learning

Fig. 2 Algorithm for domain classification

Naive Bayes Classification [4–6]: Naive Bayes is a machine learning algorithm used for classification, and it is suited to high-dimensional data sets. Examples of its use include spam detection, sentiment analysis, and classification of domains. Mathematically, naive Bayes computes a conditional probability: P(A|B) is the probability that event A occurs given that event B has already occurred, whereas P(A) and P(B) are the probabilities of occurrence of events A and B, respectively. The sentence is given as input to the naive Bayes algorithm to predict its domain. In this step, we calculate the probability for the given review. If that probability is greater than 0.5, the review is classified; otherwise, the review is passed to the SVM algorithm for classification.

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

$$P(A \mid X) \propto P(B_1 \mid A) \times P(B_2 \mid A) \times \cdots \times P(B_n \mid A) \times P(A), \tag{1}$$

where
P(A|B) is the posterior probability of class,
P(A) is the prior probability of class,
P(B|A) is the likelihood, which is the probability of predictor given class,
P(B) is the prior probability of predictor, and
X = (B_1, ..., B_n) is the set of predictors, treated as independent given the class, so that (1) holds up to the normalizing constant P(X).
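The decision rule described above (accept the naive Bayes label when its probability exceeds 0.5, otherwise hand the review to SVM) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the function names, the two example domains, the word likelihoods, and the smoothing value for unseen words are all invented, and the SVM stage is represented only by a `fallback` callable.

```python
from math import prod

def posterior(priors: dict, likelihoods: dict, words: list) -> dict:
    """Normalized naive Bayes posterior P(domain | words)."""
    scores = {}
    for domain, prior in priors.items():
        # Words unseen for a domain get a small assumed smoothing probability.
        scores[domain] = prior * prod(likelihoods[domain].get(w, 1e-3) for w in words)
    total = sum(scores.values())
    return {domain: score / total for domain, score in scores.items()}

def classify(priors, likelihoods, words, fallback):
    """Accept the naive Bayes label if its probability exceeds 0.5, else defer."""
    probs = posterior(priors, likelihoods, words)
    best = max(probs, key=probs.get)
    return best if probs[best] > 0.5 else fallback(words)

# Hypothetical model with two domains.
priors = {"mobiles": 0.5, "books": 0.5}
likelihoods = {
    "mobiles": {"battery": 0.2, "story": 0.01},
    "books": {"battery": 0.01, "story": 0.2},
}
svm_stub = lambda words: "decided by SVM"  # stand-in for the second-stage classifier
print(classify(priors, likelihoods, ["battery"], svm_stub))
print(classify(priors, likelihoods, ["battery", "story"], svm_stub))
```

In the second call the two domains are equally likely, so neither probability exceeds 0.5 and the review is deferred to the fallback.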

Support Vector Machine (SVM) [1, 7]: SVM is an algorithm used for classification. When the reviews are classified into more than two domains, it is called multiclass SVM. In our work, we use multiclass SVM to classify into domains those Amazon reviews that are not classified by naive Bayes.

$$V_{ij} = \{S_{ij}, P_{ij}, N_{ij}\}, \tag{2}$$

where
V_ij is the overall polarity of a sentence,
P_ij is the positive polarity of that sentence, and
N_ij is the negative polarity of that sentence.

The overall domain classification for the five domains in the Amazon dataset is shown in Fig. 3.


Fig. 3 Domain classification

4.2 Algorithm for the Document Level Polarity

Input: The classified document.
Output: Polarity at document level.
Algorithm:

1. Take a document as an input.
2. Split the document into several sentences.
3. Apply Parts of Speech (POS) tagging to the sentence.
4. Polish the sentence by removing unnecessary words and focusing on adjectives, verbs, and adverbs.
5. Get the polarity of each word:
   (a) Check for the word in the NV-Dictionary. If the word is found, take that polarity.
   (b) If not, check for the word in the ordinary dictionary. If the word is found, take its polarity and append it to the NV-Dictionary.
   (c) If the word is not found by the above methods, get the polarity of the word from either TextBlob or SentiWordNet [8] and append it to both dictionaries.
6. Calculate the sum and the product of all polarities.
7. Process the alternation of words and polarities and assign the sentence polarity.
8. Repeat steps 1 to 7 until the polarity has been calculated for all sentences in the document.
9. Assign the document polarity based on the number of positive, negative, and neutral sentences.
10. Repeat steps 1 to 9 for all documents.
11. Draw the pie chart for each individual product and for the overall domain polarity.

(We use the two dictionaries to reduce the search time and to maintain the bulk of the data with its polarities.)
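Steps 5 and 9 of the algorithm can be sketched in a few lines. The code below is a simplified illustration, not the authors' program: it uses one flat NV-Dictionary instead of one dictionary per noun, the starting entries are invented, the `fallback` argument stands in for the TextBlob or SentiWordNet lookup, and the majority-vote rule in `document_polarity` is an assumption about how the sentence counts are turned into a document label.

```python
nv_dictionary = {"good": 1, "bad": -1}              # hypothetical starting entries
ordinary_dictionary = {"excellent": 1, "poor": -1}  # hypothetical starting entries

def word_polarity(word, fallback):
    """Steps 5(a)-(c): look a word up and cache the result for later lookups."""
    if word in nv_dictionary:                        # (a) found in the NV-Dictionary
        return nv_dictionary[word]
    if word in ordinary_dictionary:                  # (b) found in the ordinary dictionary
        nv_dictionary[word] = ordinary_dictionary[word]
        return nv_dictionary[word]
    polarity = fallback(word)                        # (c) stand-in for TextBlob/SentiWordNet
    nv_dictionary[word] = polarity
    ordinary_dictionary[word] = polarity
    return polarity

def document_polarity(sentence_polarities):
    """Step 9: label a document by comparing positive and negative sentence counts."""
    positives = sum(1 for p in sentence_polarities if p > 0)
    negatives = sum(1 for p in sentence_polarities if p < 0)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"

print(word_polarity("excellent", lambda w: 0))
print(document_polarity([1, -1, 1, 0]))
```

Caching each resolved word in the dictionaries is what lets later lookups stop at step (a), which matches the stated purpose of reducing search time.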


TextBlob [9]: TextBlob is a Python library, built on the Natural Language Toolkit (NLTK) and pattern, that provides text processing and text mining. It supports POS tagging, noun detection, word polarity (like SentiWordNet), and other natural language processing tasks. We use its POS tagging and word polarity methods in our algorithm.

Parts of speech tagging [9]: TextBlob takes a sentence as input for POS tagging. The output of the TextBlob function is a list of tuples, each containing a word and its POS tag. This list is split into two lists, which are used in the further steps.

NV-Dictionary [2]: This is the third phase in the process of obtaining the sentiment. The NV-Dictionary is a Noun–Verb dictionary, and it plays the key role in the algorithm. To create it, we make use of the dictionary type of the Python language. In this dictionary, verbs, adverbs, and adjectives act as keys, their respective polarities are the values of the corresponding keys, and the nouns are the names of the dictionaries. The NV-Dictionary is maintained as follows:

Samsung = {"good": 1, "bad": -1, "fast": 1}
Battery = {"good": 1, "draining": 1, "fast": -1}

Ordinary Dictionary: This is also a kind of file that contains a list of all words. We maintain three lists to store words according to their opinion: positive words are assigned to one list, negative words to another, and neutral words to a third. The structure of an ordinary dictionary is as follows:

All_Positive_Words = [good, better, best]
All_Negative_Words = [bad, worst, discourage]
All_Neutral_Words = [again, before, after]

N-Gram: In natural language processing, an n-gram is a contiguous sequence of items in a given sequence or sentence, which is used for predicting the next word. An n-gram may contain letters, words, numbers, symbols, etc. In our algorithm, we consider the sequences of words that were encountered in our dataset. For example, for the sentence "Samsung is better than Nokia", the n-grams are (Samsung, is, better, than, Nokia).
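Word n-grams such as the ones in the example can be generated with a sliding window. The helper below is a small sketch; the function name `ngrams` and the whitespace tokenization are assumptions, since the text does not say how the authors tokenize sentences.

```python
def ngrams(sentence: str, n: int) -> list:
    """Return all contiguous n-word tuples of a whitespace-tokenized sentence."""
    words = sentence.rstrip(".").split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

print(ngrams("Samsung is better than Nokia.", 1))  # single words
print(ngrams("Samsung is better than Nokia.", 2))  # adjacent word pairs
```

With n = 1 the result corresponds to the word list given in the example; larger n keeps neighbouring words together, which is useful for phrases such as "better than".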

4.3 Our Algorithm Special Rule

I. Our algorithm has one significant rule: if an adjective comes before a verb or an adverb, it increases the polarity of the verb or the adverb.
II. If an adjective comes after a verb or an adverb, it likewise increases the polarity of the verb or the adverb.

For example, consider "Samsung is very good". Here "good" is an adjective and "very" is an adverb. "Good" is a positive word, and "very" enhances the positive polarity of "good", so the total polarity of the sentence is positive.

Comparison sentences [10]: Using HARN's algorithm, we identify comparison-oriented sentences by their comparative words. In a comparison sentence, two different products are compared on a single feature. In most such cases, the sentence gives one product a positive polarity and the other a negative polarity. For example, for "Lenovo performance is better than Acer", the output is as follows:

According to Lenovo → Positive
According to Acer → Negative
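The handling of comparison sentences can be sketched as a lookup of comparative phrases. The code is a toy illustration, not HARN's algorithm: the phrase table, the function name `compare`, and the assumption that each product is the first word on its side of the comparative phrase are all simplifications introduced here.

```python
# Assumed comparative phrases with the polarity given to the (left, right) product.
COMPARATIVES = {"better than": (1, -1), "worse than": (-1, 1)}

def compare(sentence: str):
    """Return {product: polarity} for a two-product comparison, or None."""
    text = sentence.rstrip(".")
    for phrase, (left, right) in COMPARATIVES.items():
        marker = f" {phrase} "
        if marker in text:
            before, after = text.split(marker, 1)
            if not before.split() or not after.split():
                continue  # nothing to compare on one side
            first_product = before.split()[0]   # assumes the sentence starts with the product
            second_product = after.split()[0]
            return {first_product: left, second_product: right}
    return None

print(compare("Lenovo performance is better than Acer"))
```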

5 Implementation Process
Reviews are given to naïve Bayes and SVM classifiers for domain classification. After domain classification, we apply a new algorithm called the automated HARN’s algorithm. In this algorithm, we perform Parts of Speech (POS) tagging using TextBlob to extract verbs, adverbs, and adjectives. We then get the polarity at aspect level using the NV-Dictionary, the ordinary dictionary, and SentiWordNet. Next, we find the polarity at sentence level by computing the sum polarity and the product polarity, and we also use the polarity given by TextBlob. Finally, we average all sentence-level polarities to find the polarity at document level.

6 Result
Using our algorithm, we classify reviews and find their polarities at sentence level, document level, or whole-domain level. To calculate the result, we use the product and the sum of polarities at the word level. 1. Product of polarities: every polarity in the list is multiplied together. 2. Sum of polarities: every polarity in the list is added together. The domain-level polarity is represented in Fig. 4. E.g., in “Samsung is not bad”, the word “bad” is negative, but the sentence also contains “not”, so the overall polarity becomes “Positive”. First, we combine the sentence polarities identified by TextBlob, the sum of polarities, and the product of polarities. Second, we analyzed the


Fig. 4 Overall polarity for amazon domains

results and provided the conclusion at both sentence and document levels. Using this algorithm, we got 80–85% accuracy.
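The sum and product of word-level polarities can be sketched as below. Two choices here are assumptions made for illustration and are not stated in the text: the negation word “not” is given polarity −1, and neutral words (polarity 0) are skipped in the product so that they do not force the result to zero.

```python
from math import prod


def sum_polarity(polarities):
    """Sum of every word-level polarity in the list."""
    return sum(polarities)


def product_polarity(polarities):
    """Product of every non-neutral polarity in the list.

    Skipping zeros is an assumption of this sketch; without it, any
    neutral word would make the whole product zero.
    """
    non_neutral = [p for p in polarities if p != 0]
    return prod(non_neutral) if non_neutral else 0


def label(score):
    """Map a numeric score to a polarity label."""
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


# "Samsung is not bad": Samsung=0, is=0, not=-1 (assumed), bad=-1
example = [0, 0, -1, -1]
```

For this example the product is positive while the sum is negative, which illustrates why the product is useful for sentences containing a negation.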

6.1 Evaluation

During the evaluation of the result, we have considered True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). Using these four parameters, we formulate sensitivity, specificity, positive predictive value, and negative predictive value, and calculate these four values as shown in Table 1.

True Positive (TP): Hit
True Negative (TN): Correct rejection
False Positive (FP): False alarm
False Negative (FN): Miss

Table 1 Classification statistics
                     Actual positive   Actual negative
Predicted positive   1500 (TP)         160 (FP)
Predicted negative   250 (FN)          375 (TN)

$$\text{PRECISION} = \frac{TP}{TP + FP} = 90.36\% \qquad (1)$$

$$\text{RECALL} = \frac{TP}{TP + FN} = 85.7\% \qquad (2)$$

$$\text{ACCURACY} = \frac{TP + TN}{TP + FN + FP + TN} = 81.925\% \qquad (3)$$

With the counts in Table 1, Eq. (3) evaluates to 82.06%; the reported value is 81.925%.
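The evaluation measures can be recomputed from the counts in Table 1 with a few lines of Python. Accuracy is computed here with the standard definition (TP + TN) / (TP + FP + FN + TN), which is assumed to be the intended formula; specificity and negative predictive value are included because the text names them, using their usual definitions.

```python
# Counts taken from Table 1.
tp, fp, fn, tn = 1500, 160, 250, 375

precision = tp / (tp + fp)        # positive predictive value
recall = tp / (tp + fn)           # sensitivity
specificity = tn / (tn + fp)
npv = tn / (tn + fn)              # negative predictive value
accuracy = (tp + tn) / (tp + fp + fn + tn)

for name, value in [
    ("precision", precision),
    ("recall", recall),
    ("specificity", specificity),
    ("npv", npv),
    ("accuracy", accuracy),
]:
    print(f"{name}: {value * 100:.2f}%")
```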

7 Conclusion and Future Work
We started by designing HARN’s algorithm, obtained 75% accuracy, and found that it does not analyze the sentiment of some troublesome reviews. We then designed a new algorithm and achieved 80–85% accuracy. HARN’s algorithm requires space for storing the structure of sentences, and it becomes complicated as the number of sentences increases, because it saves the structures of numerous similarly structured sentences. Finding the desired sentence structure then becomes a difficult task that takes more time. To overcome this problem, we designed our algorithm so that it does not need any space to save those structures, which helps us get results with higher accuracy. Review sentences can be analyzed as positive, negative, or neutral opinions, which gives marketers a simple way to assess the impact of their products. With advancements in sentiment analysis, future work can include detecting and calculating the polarity of sentences written in message language (also known as shortcut language). If sentences of another language are written in English script, identifying the underlying language and obtaining the polarity is a big challenge.

References
1. Ahmad, S.R., Bakar, A.A., Yaakub, M.R.: Metaheuristic algorithms for feature selection in sentiment analysis. In: Science and Information Conference 2015, July 28–30, 2015, London, UK (2015)
2. Rajashekar, P., Akhil, G.: Sentiment analysis using HARN’s algorithm. In: 2016 IEEE Conference (2016)
3. Khan, A., Baharudin, B.: Sentiment classification using sentence-level semantic orientation of opinion terms from blogs. In: IEEE National Postgraduate Conference (2011)
4. Hassan, S., Rafi, M., Shaikh, M.S.: Comparing SVM and naïve Bayes classifiers for text categorization with Wikitology as knowledge enrichment. In: 2011 IEEE 14th International Conference on Multitopic Conference (INMIC) (2011)
5. Pang, B., Vaithyanathan, S., Lee, L.: Thumbs up? Sentiment classification using machine learning techniques. In: Empirical Methods in Natural Language Processing (2002)
6. Pang, B., Lee, L.: A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. The Association for Computational Linguistics, pp. 271–278 (2004)
7. WAD, W.A.A.: Machine learning algorithms in web page classification. Int. J. Comput. Sci. Informat. Technol. (IJCSIT) 4(5) (2012)
8. SentiWordNet. http://sentiwordnet.isti.cnr.it/. Last accessed 01 Sept 2018
9. Quora. https://www.quora.com/How-do-I-add-custom-words-positive-and-negatives-for-sentimentanalysis-using-textBlob-nltk-in-python (2018). Last accessed 01 Sept 2018
10. Lunando, E.: Indonesian social media sentiment analysis with sarcasm detection. In: 2013 International Conference on Advanced Computer Science and Information Systems (ICACSIS) (2013)

Animal/Object Identification Using Deep Learning on Raspberry Pi
Param Popat, Prasham Sheth and Swati Jain

Abstract In this paper, we explain how to implement a Convolutional Neural Network (CNN) to detect and classify an animal/object from an image. We classify the object provided to the CNN using the computational capabilities of a Raspberry Pi, a device which has relatively low processing power and a very small GPU. An image is fed to the Raspberry Pi, on which we run a Python-based program with some dependencies, viz. TensorFlow, etc., to identify the animal/object in the image and classify it into the appropriate category. We have tried to show that deep learning concepts like convolutional neural networks, and other such computation-intensive programs, can be implemented on an inexpensive and relatively less powerful device.
Keywords Animal identification ⋅ Object identification ⋅ Deep learning ⋅ Raspberry Pi ⋅ Convolutional neural networks

1 Introduction
Animal and object identification plays a crucial role in the daily activities of people in general as well as of specialists in certain professions, viz. photographers, etc. Animal identification is a tiresome and extremely difficult task for a photographer or even a layman. A wildlife photographer may spend hours capturing a perfect shot of a bird but may not know the species of the bird captured in the photograph. An identifier which takes the captured photograph as input and identifies the animal or bird present in it may come in handy in such situations.

P. Popat (✉) ⋅ P. Sheth ⋅ S. Jain
Institute of Technology, Nirma University, Ahmedabad 382481, Gujarat, India
© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_31


The identifier implemented here tries to build a bridge between devices with low computational capability and deep learning concepts that require high computational capability. We have implemented a Convolutional Neural Network on a Raspberry Pi. The Raspberry Pi is a card-sized computer which is relatively inexpensive and runs an open-source Linux-based operating system. It supports Python programming, which makes it easier to implement deep learning programs built on Python. Unlike a regular research laboratory computer or even a personal computer, it has minimal computational power and extremely limited memory and other resources. In the last few years, there has been a steady increase in the use of deep learning in almost every field of science, technology, business, analytics, graphics, photography and much more. Setting up the infrastructure for such implementations is quite expensive, so to find an inexpensive solution, we chose the Raspberry Pi and implemented the identifier on it. In the following sections, we elaborate on the architecture used for implementation, the specifications of the Raspberry Pi used, the implementation procedure and the results.

2 Architecture
2.1 Model Architecture
Our brain differentiates animals and various objects and recognises human faces by identifying various features. This is a relatively easy task for the human brain, but performing the same task with computers is tough. For this, we need to implement a concept from the field of deep learning known as Convolutional Neural Networks. Herein we have used the Inception v-3 model for implementing a Convolutional Neural Network (CNN). Inception v-3 [1] is a model designed for implementing a CNN, and it surpassed its ancestor GoogLeNet on the ImageNet benchmark. Conventionally, by applying a convolutional filter, only linear functions of the input could be realised. To overcome this and realise nonlinear functions, multiple perceptrons were linked together. These perceptrons are nothing but 1 × 1 convolutions and thus fit easily into the CNN framework. A variety of convolutions are applied in the Inception v-3 model. A typical inception module consists of 1 × 1, 3 × 3 and 5 × 5 convolution layers. The network cannot know in advance which convolution would be most suitable for extracting the feature map. Thus, in the Inception v-3 module, the outputs of all these convolutional layers are concatenated. Also, before performing higher order convolutions, a 1 × 1 convolution is applied, which results in a significantly lower number of computation operations. The reduction in the number of computations can be easily recognised


Fig. 1 Inception module as given in [1]

from the following data: Consider a batch of inputs (192@ 28 × 28) from which feature maps (32@ 28 × 28) are obtained by applying a 5 × 5 convolution. When the 5 × 5 convolution is applied directly, the number of computations is 5 × 5 × 28 × 28 × 192 × 32 = 120,422,400. If instead a 1 × 1 convolution with 16 output channels is applied before the 5 × 5 convolution, the result is only 28 × 28 × 192 × 16 + 5 × 5 × 28 × 28 × 16 × 32 = 12,443,648 computations. This reduction in operations, together with concatenating the different feature maps, results in improved accuracy for classifying the images (Fig. 1).
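The operation counts for this example can be checked with a short Python sketch. The helper `conv_ops` is a hypothetical name; it counts one multiplication per kernel element, output position, input channel and output channel, and assumes that the spatial size stays at 28 × 28 and that the 1 × 1 reduction produces 16 intermediate channels.

```python
def conv_ops(kernel, height, width, in_channels, out_channels):
    """Multiplications for a kernel x kernel convolution that keeps the spatial size."""
    return kernel * kernel * height * width * in_channels * out_channels


# 5 x 5 convolution applied directly: 192 input channels -> 32 feature maps.
direct = conv_ops(5, 28, 28, 192, 32)

# 1 x 1 reduction to 16 channels, followed by the 5 x 5 convolution.
reduced = conv_ops(1, 28, 28, 192, 16) + conv_ops(5, 28, 28, 16, 32)

print(f"direct:  {direct:,}")
print(f"reduced: {reduced:,}")
print(f"saving factor: {direct / reduced:.2f}")
```

Under these assumptions the 1 × 1 reduction cuts the number of multiplications by roughly an order of magnitude.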

2.2 System Specifications
2.2.1 Training Device Specification

We re-trained the model on a system with an Intel(R) Core(TM) i7-7700HQ CPU @ 2.80 GHz, 16 GB of RAM and an NVIDIA GeForce GTX 1050 Ti.

2.2.2 Raspberry Pi Specifications

The Raspberry Pi used for implementation of the identifier comes with the following specifications (Table 1).

Table 1 Raspberry Pi technical specifications (https://www.raspberrypi.org/products/raspberry-pi-3modelb/)
Raspberry Pi 3 Model B
CPU          Quad core 1.2 GHz Broadcom BCM2837 64bit
RAM          1 GB
USB ports    4

3 Need and Applications
∙ The identifier discussed above could be extremely helpful for implementing a traffic surveillance system. Roadways, especially highways, city main roads, etc., are prone to heavy congestion or to other issues like damage, which may eventually lead to multi-vehicle collisions and blockages and result in a number of deaths and injuries. A system which can identify such a situation from live CCTV footage and report the onset of such a blockage may trigger an alarm at the nearest authority, so that appropriate steps can be taken and such an event can be averted.
∙ A similar and frequent situation is accidents due to stray cattle roaming on the highways, especially in countries like India. This identifier would expedite the process of clearing the cattle. Thus, using the model, roads can be made much safer for humans, and cattle deaths due to such accidents can also be reduced.
∙ The identifier described here can be used to identify day-to-day objects, books with titles, apparel, etc., and can be linked with a powerful search engine which looks for the object over the internet and returns the cheapest and highest rated matching products from a plethora of e-commerce websites. This increases the ease of access for users, as they can directly point their camera at anything they want to purchase, and the identifier along with the search module would do the rest.
∙ A similar system can be used to differentiate the various currency denominations, which may come in handy for banks, differently abled people, foreign exchange counters, companies, traders, etc.
∙ This identifier could also be extremely helpful in the biological and medical fields. If we train it using the different types of images available after medical inspection, it can be used to differentiate between malignant and non-malignant tumours. The use of such an identifier would result in increased efficiency as well as accuracy in identifying various types of tumours. It can thus expedite the process of diagnosis and make it more efficient.


4 Implementation
4.1 Dataset
For training the model, we created a dataset for each of the 10 classes, viz. Cow, Cat, Dog, Horse, Rose, Sunflower, Pen, Chair, Table and Ball. For each of these classes, 900 images were gathered. The dataset was a good mix of photos taken with different backgrounds, exposures, compositions, angles and noise levels. The varied backgrounds helped us avoid the Tank Recognition Problem, and the rest of the variations helped make the identifier more robust.

4.2 Retraining the Model
In this project, we have used transfer learning to expedite the process of getting the model up and running. We retrain a fully trained model, in our case the Inception v-3 [1] model, instead of fully training a new model, which can take a large amount of time considering the large number of parameters an object recognition model has. Here, we retrain only the final layer of the model while leaving the other layers as they are. The retraining is divided into three steps, viz. bottlenecks, modifying the top layer, and validation. Retraining was carried out in a Python environment using the TensorFlow [2] library.

4.2.1 Bottlenecks
The layer before the final output layer of a CNN is informally known as the bottleneck. This layer produces a set of values that is good enough for the classifier to distinguish between all the classes it is asked to recognise. It thus acts as a compact summary of each image, since it contains the information the classifier needs to identify the object in the image (Fig. 2).

4.2.2 Modifying Top Layer

After completing the bottlenecks, we start modifying the top layer of the neural network. The accuracy of the training depends majorly upon the dataset used for the training purpose. It also depends upon the number of training steps provided.


Fig. 2 Creating bottlenecks for each image file

4.2.3 Validation
The most important part of the whole process is validation. Imagine a network which classifies the training images properly. The concern here is that if the network is classifying the images based on their backgrounds and not on the features of the object, then the model would fail when an image from outside the training dataset is used. This problem is known as overfitting. To avoid it, the dataset is normally divided into three parts in a ratio of 8:1:1: 80% of the images form the main training set, 10% are kept aside to run as validation frequently during training, and a final 10% are used less often as a testing set to predict the real-world performance of the classifier (Fig. 3). Retraining of the outermost layer of the Inception v-3 model is carried out using the Docker tool. Retraining is carried out by executing a Python command-line instruction in which we specify the directories for the bottlenecks, the model and the images. Initially, bottlenecks are created, the mapping is done onto a graph, and a graph file (extension .pb) is generated. The nodes of this graph represent the functions that were performed on the data (in the form of tensors). For each node, the incoming edge shows the input provided to the function, and the outgoing edge represents the output of that function. In the end, a text file is used to map the results to the user-defined tags. We use these generated files, i.e. the graph and tag files, while executing our Python program to identify the object.
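The 8:1:1 division of the images can be sketched in plain Python. This is only an illustration of the ratio; the function name `split_dataset`, the fixed random seed and the file names are assumptions, and the retraining tooling used by the authors may perform the split differently.

```python
import random


def split_dataset(paths, seed=0):
    """Shuffle image paths and split them 80% / 10% / 10%.

    Returns (training, validation, testing) lists. Integer arithmetic is
    used so that the split sizes are exact for any list length.
    """
    paths = list(paths)
    random.Random(seed).shuffle(paths)
    n_train = len(paths) * 8 // 10
    n_val = len(paths) // 10
    training = paths[:n_train]
    validation = paths[n_train:n_train + n_val]
    testing = paths[n_train + n_val:]
    return training, validation, testing


# Hypothetical file names for one class of 900 images.
dog_images = [f"dog_{i:03d}.jpg" for i in range(900)]
train_set, val_set, test_set = split_dataset(dog_images)
```

For a class of 900 images this gives 720 training, 90 validation and 90 testing images, with no image appearing in more than one set.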


Fig. 3 Validation being done

4.3 Transfer Learning and Execution
As the Raspberry Pi supports Python and provides a terminal, we use the terminal to execute our program. For execution, we need to transfer our learned model onto the Raspberry Pi. For this, we transfer the graph file generated while training the model along with the text file containing the tags of the objects and animals. After this, the program can be executed by typing the command:
python3 code-file.py --graph=name-of-graph.pb --image=img-file.jpg
Let us take an image of a dog for a test execution. The image taken for testing is shown in Fig. 4, and the output in terms of tag and confidence in percentage is given in Table 2. The output generated by the identifier for Fig. 4 clearly states that the image contains a dog: with an 89.49% match for the tag dog, our identifier successfully identifies a dog. We also see that some percentage of the result converges to cow and cat. This is because a dog shares some features with a cow and a cat, like structure of body and colour.

Table 2 Test result for Fig. 4
Object tag   Result (%)
Dog          89.49
Horse        1.41
Cat          1.19

Fig. 4 A dog

In a similar fashion, many test cases were executed on the Raspberry Pi. The details, along with the time taken to process an image and identify the object/animal, are discussed in Sect. 5.
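The command-line interface and the tag/confidence output described above can be sketched with the standard library alone. This is a hypothetical skeleton, not the authors' program: the `--labels` option, the default file name `labels.txt`, and the helper names are assumptions, and the step that loads the graph file and runs the image through it is deliberately left out.

```python
import argparse


def parse_args(argv=None):
    """Parse the --graph and --image options used on the command line."""
    parser = argparse.ArgumentParser(description="Identify the object in an image")
    parser.add_argument("--graph", required=True, help="retrained graph file (.pb)")
    parser.add_argument("--image", required=True, help="image file to classify")
    # Hypothetical option: text file holding one tag per line.
    parser.add_argument("--labels", default="labels.txt", help="tag file")
    return parser.parse_args(argv)


def top_predictions(tags, scores, k=3):
    """Return the k highest-scoring (tag, confidence in %) pairs."""
    ranked = sorted(zip(tags, scores), key=lambda pair: pair[1], reverse=True)
    return [(tag, round(score * 100, 2)) for tag, score in ranked[:k]]


if __name__ == "__main__":
    args = parse_args()
    print(f"graph={args.graph} image={args.image} labels={args.labels}")
```

Given the per-tag scores produced by the model, `top_predictions` yields a ranked list in the same tag/percentage form as Table 2.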

5 Results
For testing and generating results, we resized the test set images to 100 × 100. All the resized test images were provided as input to the identifier on the Raspberry Pi. After exhaustive testing of our identifier on the Raspberry Pi, we gathered data on the time taken to process different images and the time taken to identify the image. For different objects and animals, accuracy varied from 69% to 95%. In Table 3, we present the data in terms of tag and average accuracy. The model reaches a final test accuracy of 87.9% after 500 steps of validation. Norouzzadeh et al. [3] have provided a comparative analysis of the top-1 error of various architectures for detecting images that contain animals.

Table 3 Result after testing for all the tags
Tag          Average accuracy (%)
Dog          87.40
Cat          90.19
Cow          89.60
Horse        84.29
Rose         85.40
Sunflower    83.06
Pen          79.69
Chair        89.90
Table        87.66
Ball         90.45

6 Conclusions
By successfully implementing the animal and object identifier using a Convolutional Neural Network (CNN) on a Raspberry Pi, we could build a bridge between tasks requiring high computational resources, like classifying objects with a CNN, and devices with low computational power, like the Raspberry Pi. This opens up many opportunities for combining IoT with deep learning. The CNN model can be chosen from many available options, like the various versions of the Inception model [1] or MobileNets (224, 192, 160 or 128). MobileNets are comparatively faster than the Inception model, but this comes at the cost of accuracy, and hence the model should be selected according to the requirements and the available computational power.

References
1. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions (2014). CoRR, abs/1409.4842. arXiv:1409.4842
2. TensorFlow. https://www.tensorflow.org/
3. Norouzzadeh, M., Nguyen, A., Kosmala, M., Swanson, A., Palmer, M., Packer, C., Clune, J.: Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning (2017). CoRR, abs/1703.05830. arXiv:1703.05830
4. Verma, N., Sharma, T., Rajkumar, S., Salour, A.: Object identification for inventory management using convolutional neural network. In: 2016 IEEE Applied Imagery Pattern Recognition Workshop (AIPR). https://doi.org/10.1109/AIPR.2016.8010578
5. Dürr, O., Pauchard, Y., Browarnik, D., Axthelm, R., Loeser, M.: Deep Learning on a Raspberry Pi for real time face recognition. In: EG 2015. https://doi.org/10.2312/egp.20151036

Efficient Energy Harvesting Using Thermoelectric Module
Pallavi Korde and Vijaya Kamble

Abstract The increasing energy consumption of devices has led to the concept of energy harvesting (which simply means extracting energy from the environment). In this paper, environmental waste heat energy is extracted and converted into electric energy using the thermoelectric concept. The conversion depends on the temperature difference between the heat source and the environment. The waste heat energy we use is obtained from heat leakage from objects like a hot bag, boiling water, an electric heat pad, the human body, etc., and it is converted to charge real-time applications and to run various digital circuits. An ultra-low voltage step-up converter and a buck–boost circuit are used to obtain the potentials required for charging and running circuits. This paper gives information about the thermoelectric effects, the concept of TE modules, the use of a TEC as a TEG, and the simulation of an electric system design to charge real-time applications.
Keywords Energy harvesting ⋅ TE modules ⋅ Thermoelectric materials ⋅ Thermoelectricity ⋅ DC–DC buck–boost converter

P. Korde ⋅ V. Kamble (✉)
Bharatiya Vidya Bhavan’s, Sardar Patel Institute of Technology, Mumbai 400058, Maharashtra, India
© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_32

1 Introduction
Energy harvesting is simply extracting the available energy from the environment. There are basically three ways to harvest electricity from waste energy: piezoelectric, photovoltaic, and thermoelectric conversion, which produce energy from vibrations, light, and heat, respectively [1]. All these technologies are well developed and commercially available. Piezoelectric transducers


are more common, whereas photovoltaic transducers offer the highest available power densities. Despite these advantages, obtaining energy is a problem because vibrations and light are not available for long periods. The thermoelectric transducer efficiently turns heat energy into electrical energy. These transducers have high reliability, small size, low cost, low weight, long lifetime, intrinsic safety for hazardous electric environments, and precise temperature control, and they offer direct DC conversion with the help of a TE module. They are maintenance-free and environment-friendly because they use no refrigerant gases. Hence, the easiest way to harvest such waste energy is with the help of thermoelectric transducers. Thermoelectric modules are used in thermoelectric heat pumps that are capable of refrigerating solid or fluid objects. These modules are capable of increasing or reducing temperatures, a function typically used in household air conditioners and refrigerators. An important feature is the ease with which the temperature of a TE module can be precisely controlled, which is an advantage for scientific, military, and aerospace applications. They are also used with ocean thermals, steam engines, wireless sensors, industrial motors, biological sources (such as body motion and biomass), etc. One typical application is solar thermoelectric generation to produce electricity. Similarly, they can also be used for charging real-time applications.

2 Thermoelectricity
2.1 Thermoelectric Effects
Thermoelectricity involves the direct conversion of heat energy to an electrical potential and vice versa. The thermoelectric effects are the effects caused by this conversion. They comprise three thermodynamically reversible effects, namely, the Seebeck effect, the Peltier effect, and the Thomson effect.

2.1.1 Seebeck Effect
When a temperature difference (ΔT) is applied to a TE module, an electric potential (ΔV) arises in the conductive material. This is called the Seebeck effect. Mathematically, this effect is given by

$$\Delta V = S \cdot \Delta T \qquad (1)$$

where S is the Seebeck coefficient. Practically, if two electric wires are connected at both ends and a temperature difference is imposed between the two junctions, a voltage arises in the loop. The effect thus links temperature with voltage. When a voltmeter is put in the loop, we measure the electric potential as shown in Fig. 1a.


Fig. 1 a Seebeck effect. b Peltier effect. c Thomson effect seen in practice

2.1.2 Peltier Effect
When an electric current is applied to a TE module, a heat flux occurs at both sides. This is called the Peltier effect. The generated heat flux ($q_{Peltier}$) is directly proportional to the electric current (J) in the electrical circuit. Mathematically, this effect is given by

$$q_{Peltier} = -P \cdot J \qquad (2)$$

where P is the Peltier coefficient. The negative sign indicates the change in the polarities of the electric potential. Practically, if we connect two wires and apply a potential difference in the loop to produce an electric current, we get heat fluxes, one positive and one negative, running in and out of the junctions. As shown in Fig. 1b, the polarities get reversed.

2.1.3 Thomson Effect
The Thomson effect describes the heating or cooling per unit length that occurs when a current density passes through a conductor in which a uniform temperature gradient is present. It is an extension of the Peltier–Seebeck effects. Here, the heat source ($Q_{Thomson}$) is proportional to both the electric current and the temperature gradient. Mathematically, this effect is given by

$$Q_{Thomson} = -\mu_{Th} \cdot J \cdot \Delta T \qquad (3)$$

where $\mu_{Th}$ is the Thomson coefficient. Practically, this effect occurs not only at the junctions but also in the wires, as shown in Fig. 1c.
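The three relations, Eqs. (1) to (3), translate directly into code. The sketch below simply evaluates them as written in the text; the function names are hypothetical, and the numerical value used in the usage note is an illustrative assumption rather than a measured property of any module.

```python
def seebeck_voltage(seebeck_coefficient, delta_t):
    """Eq. (1): electric potential produced by a temperature difference."""
    return seebeck_coefficient * delta_t


def peltier_heat_flux(peltier_coefficient, current):
    """Eq. (2): heat flux produced by an electric current."""
    return -peltier_coefficient * current


def thomson_heat(thomson_coefficient, current, delta_t):
    """Eq. (3): heat source from current and temperature gradient."""
    return -thomson_coefficient * current * delta_t
```

As a usage example, with an assumed Seebeck coefficient of 200 µV/K and a temperature difference of 50 K, `seebeck_voltage(200e-6, 50)` evaluates Eq. (1) to about 10 mV per thermocouple, which is why many thermocouples are connected electrically in series inside a module.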


2.2 TE Module and Its Structure
Thermocouples are the basic units of a thermoelectric (TE) module. A properly calibrated thermocouple is a temperature sensor which converts a temperature gradient into a potential gradient when a temperature difference is applied to it, and converts a potential gradient into a temperature gradient when a potential difference is applied across its copper junctions. Thermocouples are made from thermoelectric materials, which are alloys of semiconductors like Bismuth–Telluride (Bi–Te), Lead–Telluride (Pb–Te), Silicon–Germanium (Si–Ge), and Bismuth–Antimony (Bi–Sb) [2]. Among these, Bismuth Telluride (Bi2Te3) works within the temperature range best suited for most electronic applications. Bi2Te3 is preferred over the other materials because it produces more power at lower temperatures, and hence it is commonly used to make TE modules. At higher temperatures, other alloys are preferred. The thermoelectric materials are heavily doped to provide individual units or couples having distinct n and p characteristics, with an excess and a deficiency of electrons, respectively [3]. Both kinds of thermoelectric module, thermoelectric coolers (TECs) and thermoelectric generators (TEGs), consist of many thermocouples [4, 5]. The thermocouples are connected electrically in series and thermally in parallel as shown in Fig. 2. The electrical series connection increases the voltage and power at the output, and the thermally parallel connection lets heat flow from one side to the other. This structure is sandwiched between two ceramic plates. TECs and TEGs look similar and are manufactured with the same dimensions [6]. An insulating filling material is used in all TECs but not in all TEGs. The height of the thermocouples also differs between TEGs and TECs, i.e., thermocouples in TECs are taller and thinner than those in TEGs [7]. The thicker element size in TEGs gives a larger heat flow through the device and a higher power output, and the shorter legs are optimized for power generation [8]. With this shorter length, however, TEGs are more susceptible to contact effects, which reduce the open-circuit voltage. The thermocouples in TECs are taller and have insulating material fillings, and hence contact effects are negligible in them, which can
Fig. 2 Internal structure of TE module


Fig. 3 Temperature effect of TEC module

achieve the maximum figure of merit and perform effectively. The thickness in TEGs is tightly controlled as it can affect the power output of the complete system, whereas in TECs the thickness is less critical. Applications of the thermoelectric effects can also be checked in the COMSOL Multiphysics simulation tool. Here, a TEC module application is created with dimensions of 7 mm × 8 mm × 2.5 mm [2]. The simulation is done at a higher temperature of 353 K, and the resulting temperature effect is shown in Fig. 3. In the manufacturing of a TE module, the encircling and wire materials are also important. Ceramic plates are used for encircling in both kinds of module because they have high strength and excellent wear resistance. To measure the electric current, two terminals made of conductors are used. For insulating these wires, Teflon and PVC are used in TEGs and TECs, respectively; Teflon works more efficiently than PVC at higher temperatures. Finally, the main point when operating any of these modules is that the material used at the cooler side must have high thermal conductivity and a low coefficient of thermal expansion [4].

2.3 Mathematical Formulae
The performance of any thermoelectric module is affected by its electrical resistivity (𝜌), thermal conductivity (k), and Seebeck coefficient (S). Hence, the figure of merit (ZT) at the temperature (T) at which the material is maintained is defined mathematically as

$$ZT = \frac{S^2 \cdot T}{\rho \cdot k} \qquad (4)$$

The figure of merit is used to characterize both device parameters and thermoelectric materials. In both cases, the same conditions of the thermoelectric material are assumed, and hence it is identical for the two different devices in the same temperature range. The coefficient of performance is defined mathematically as

$$\phi_{max} = \frac{T_c\left(\sqrt{1 + ZT} - \frac{T_h}{T_c}\right)}{(T_h - T_c)\left(\sqrt{1 + ZT} + 1\right)} \qquad (5)$$

where $T_c$ and $T_h$ are the temperatures at the cold and hot sides, respectively. The coefficient of performance of a thermoelectric module plays a very important role in the selection of the module [5].

3 Procedure

3.1 TECs Working as TEGs

TEGs are designed to work with high temperature gradients, which range from tens to hundreds of °C [9]. TECs achieve their highest efficiency at low temperature gradients, as their main goal is to sink heat to the atmosphere [10]. Another advantage is that TECs are commercially more widely available than TEGs. So, to work at lower temperatures, TEGs can be replaced by TECs. As discussed earlier, TEGs and TECs work on the Seebeck and Peltier effects, respectively, and as these effects are reversible, we can combine them. All materials have a Seebeck coefficient, but in many materials this coefficient is not constant with temperature. As the temperature increases, it increases too, which eventually degrades the performance of the system at lower temperature ranges. If a current is driven by this gradient, then a continuous Peltier effect is observed, producing a heat flux. Hence, both Seebeck and Peltier effects appear together. In this way, a TEG could potentially work as a TEC and vice versa. Mathematically, the two coefficients are related by

$$P = S \cdot T \qquad (6)$$
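The relation in Eq. (6) can be made concrete with a short Python sketch, reading P as the Peltier coefficient. The Seebeck coefficient, temperature, and drive current below are assumed values chosen only for illustration, and the heat-flow helper uses the standard relation Q = P · I, which is not stated in the paper itself.

```python
def peltier_coefficient(seebeck_v_per_k, temperature_k):
    """Peltier coefficient P = S * T from Eq. (6), in volts."""
    return seebeck_v_per_k * temperature_k


def peltier_heat_flow(seebeck_v_per_k, temperature_k, current_a):
    """Heat carried per unit time by a current I, Q = P * I, in watts."""
    return peltier_coefficient(seebeck_v_per_k, temperature_k) * current_a


# Assumed, illustrative inputs (not values reported in the paper)
S = 200e-6  # Seebeck coefficient in V/K
T = 300.0   # junction temperature in K
I = 2.0     # drive current in A

p = peltier_coefficient(S, T)
q = peltier_heat_flow(S, T, I)
print(f"P = {p * 1000:.1f} mV, Q = {q * 1000:.1f} mW")
```

The same S therefore describes both directions of operation: it sets the voltage obtained per kelvin of temperature difference and, through P, the heat pumped per ampere of current.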

When we apply a temperature difference between the two sides of a TEG module, we get an electric potential across the wires [7]. When we apply an electric current across the wires of a TEC module, a heat flux is observed at the sides of the module in reversed directions, i.e., the cold side of the TEG becomes the hot side of the TEC and the hot side of the TEG becomes the cold side of the TEC. Hence, to combine these two effects, we have to substitute a TEC for the TEG in the reverse direction, or one can even move the heat sink from one side to the other, attaching a load between the wires to obtain an electric potential. As we apply a temperature difference between the plates, an electric potential is observed across the load, as shown in Fig. 4.

Fig. 4 Use of TEC as TEG for harvesting heat energy

As the TEC module works at lower temperatures, care must be taken for its survivability. Since TECs are intended for cooling, they must be operated below 100 °C. Constituents of the solder material can rapidly diffuse into the thermoelectric material at higher temperatures and degrade the performance of the TEC, which can even cause failure of the module [11]. This process is controlled by applying a diffusion barrier onto the TE material in the TEC during manufacturing. But some manufacturers of TECs employ no barrier material at all between the solder and the TE materials, which leads to only short-term survivability at higher temperatures. So some qualification testing must be performed to assure long-term operation of a TEC as a TEG [12]. The maximum operating temperature is always given in the datasheet of the TEC module, which is helpful for a researcher working with it. Hence, for power generation at temperatures below 100 °C, TECs can be used instead of TEGs.

Here, we are dealing with the TEC1-12706 module. This TEC module is manufactured by Hebei I.T. (Shanghai) Co., Ltd. with 127 thermocouples. It can withstand a maximum voltage of around 16.4 V and a maximum operating temperature of 125 °C. A patch of the material of one square centimeter in area can produce up to 30 µW of power. A thermoelectric device placed on a hot body will generate power as long as the ambient air is at a lower temperature than that of the body. Waste heat energy from daily objects such as a hot bag, an electric heat pad, boiling water, and the human body is converted to DC electric potential with the help of this module. The potential generated using the hot bag is 145 mV, as shown in Fig. 5. The generated electric potentials are shown in Table 1.


Fig. 5 Practical harvesting of heat energy from a hot pad with the help of TEC module

Table 1 TEC module output ranges

Waste heat producing objects    Produced output voltage at TEC (mV)
Hot bag                         145
Electric heat pad               205
Boiling water                   334
Human body                      96
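To give a feel for how much power these voltages could deliver, the sketch below applies the maximum power transfer relation P = V² / (4R) to each entry of Table 1. It assumes that the tabulated voltages are open-circuit values and that the module resistance of 1.98 Ω used in the paper's LTspice simulation applies to every source; both are assumptions, so the results are rough upper bounds rather than measurements.

```python
# Module resistance taken from the paper's LTspice setup; assumed constant for all sources
MODULE_RESISTANCE_OHM = 1.98

# Output voltages from Table 1, in millivolts (assumed to be open-circuit values)
TABLE_1_MV = {
    "Hot bag": 145,
    "Electric heat pad": 205,
    "Boiling water": 334,
    "Human body": 96,
}


def matched_load_power_mw(v_open_mv, r_internal_ohm):
    """Power into a load equal to the source resistance, P = V^2 / (4 R), in mW."""
    v = v_open_mv / 1000.0
    return v * v / (4.0 * r_internal_ohm) * 1000.0


powers = {
    name: matched_load_power_mw(mv, MODULE_RESISTANCE_OHM)
    for name, mv in TABLE_1_MV.items()
}

for name, power in powers.items():
    print(f"{name}: {power:.2f} mW")
```

Under these assumptions the available power scales with the square of the voltage, so the boiling-water case offers roughly twelve times the power of the human-body case.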

3.2 Conversion Circuit

To power real-time applications, we must step up these potentials. As we obtained potentials in the range of 20–400 mV, the LTC 3108 is best suited for this step-up. This IC operates from inputs as low as 20 mV and uses compact step-up transformers to harvest the power of the system. It has selectable output voltages of 2.35, 3.3, 4.1, and 5 V, which can be obtained by combining the Vs1 and Vs2 pins with the VLDO and Vaux pins. This IC is used in low-power applications in which energy harvesting generates the system power, as battery power is inconvenient for many applications [13]. It can also be used to charge a supercapacitor or rechargeable battery, using energy harvested from a Peltier or photovoltaic cell. In the LTspice simulation shown in Fig. 6a, we have given an input potential of 145 mV obtained from the hot bag, with a TEC module resistance of 1.98 Ω. As we used a small external 1:20 step-up transformer, the input potential must be greater than or equal to 60 mV. The 1:20 transformer is chosen as it is more efficient than the 1:50 and 1:100 transformers. We have selected the 5 V output by joining the Vs1 and Vs2 pins to the Vaux pin. A capacitor of 10 µF is used at the output for trickle charging. The output of this simulation is shown in Fig. 6b, where the green line gives the output potential of 5 V. One can use this 5 V DC output voltage to charge mobile phones using a USB socket [14].

Fig. 6 LTC 3108 harvesting circuit with its output

To change the output voltage from 5 V to other required voltages, we needed a DC-to-DC converter whose output voltage magnitude can be either greater than or less than the input voltage magnitude. Hence, we chose two different topologies: buck converter and boost converter. Both of them can produce a range of output voltages lower or higher than the input voltage. These are types of switched-mode power supply. They are managed mainly by the switching pulses. One can use an NPN or PNP transistor for switching, but we have used a power supply with a specific pulse configuration. The simulation circuit in LTspice IV combines the LTC 3108 energy harvesting circuit and the buck–boost configuration, as shown in Fig. 7a. We have selected the specific voltages of +12 and −12 V, as most practical electronic appliances work on those voltages. We have selected both positive and negative voltages because many electronic digital circuits and embedded systems operate on them. The outputs for the simulation of the stated combination of harvesting circuit and buck–boost circuit are shown in Fig. 7b. Here, the blue and green lines give the +12 V (boosted) and −12 V (bucked) potentials, respectively.

Fig. 7 Circuit combining of LTC3108 harvesting and buck–boost converter with its output

By applying any of the output voltages from Table 1, one can get different output voltages from the combination of the LTC 3108 harvesting circuit and the DC–DC buck–boost converter circuit. Here, the voltages in the simulation results are achieved using the 145 mV output from the hot bag. All the simulation output voltages are summarized in Table 2.

Table 2 Final voltages achieved

Circuit used                                                            Final output voltage achieved
LTC 3108 energy harvesting circuit                                      5 V
LTC 3108 energy harvesting circuit with buck–boost converter circuit    −12 V, +12 V
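The switching pulses mentioned above set the conversion ratio of the second stage. The paper does not report the duty cycles or the exact converter topology, so the following sketch is only an estimate: it assumes ideal, lossless converters in continuous conduction, a boost stage for +12 V and an inverting buck–boost stage for −12 V, both fed from the 5 V output. The function names are hypothetical.

```python
def boost_duty(v_in, v_out):
    """Ideal boost converter: V_out = V_in / (1 - D), solved for D."""
    if v_in <= 0 or v_out <= v_in:
        raise ValueError("boost requires 0 < v_in < v_out")
    return 1.0 - v_in / v_out


def inverting_buck_boost_duty(v_in, v_out_magnitude):
    """Ideal inverting buck-boost: |V_out| = V_in * D / (1 - D), solved for D."""
    if v_in <= 0 or v_out_magnitude <= 0:
        raise ValueError("voltages must be positive magnitudes")
    return v_out_magnitude / (v_out_magnitude + v_in)


V_HARVESTED = 5.0  # output of the LTC 3108 stage, from Table 2

d_pos = boost_duty(V_HARVESTED, 12.0)                 # duty cycle for +12 V
d_neg = inverting_buck_boost_duty(V_HARVESTED, 12.0)  # duty cycle for -12 V

print(f"+12 V rail: D = {d_pos:.3f}")
print(f"-12 V rail: D = {d_neg:.3f}")
```

Real converters need somewhat higher duty cycles than these ideal figures because of switch, diode, and inductor losses.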

4 Conclusion

The paper mainly focuses on converting waste heat into electric potential using a TEC module by combining the Seebeck and Peltier effects. The electric potential obtained from waste heat is in the range of 20–440 mV for daily objects such as a hot bag, an electric heat pad, boiling water, and the human body. Such a small electric potential is stepped up effectively with a 1:20 transformer attached externally to the ultra-low-voltage step-up regulator IC LTC 3108. The output electric potential of 5 V can be used to charge many electronic appliances such as mobile phones and rechargeable batteries. Further, this potential is converted to −12 V and +12 V by the buck–boost stage for further use.

5 Future Scope

Here, the harvesting of heat energy from heat-producing objects is done practically, but the rest of the system is evaluated only in simulation. One can implement this design and use the system practically. This configuration is compact in size, so it can be used as a portable charger. As heat is available at all times, the electronic device could be charged whenever it is needed.

References

1. Ugalde, C., Anzurez, J., Lazaro, I.I.: Thermoelectric coolers as alternative transducers for solar energy harvesting. In: IEEE Electronics, Robotics and Automotive Mechanics Conference, pp. 637–641. Morelos (2010)
2. Pallavi, K., Vijaya, K.: Comparative study of TE materials for heat energy harvesting with the help of TEC module. Int. J. Adv. Res. Sci. Eng. 06(11), 1–11 (2017)
3. Bulat, L.P.: Effect of electrons and phonons temperature mismatch on thermoelectric properties of semiconductors. In: XVI International Conference on Thermoelectrics, Proceedings ICT ’97, pp. 567–570. Dresden (1997)
4. Nesarajah, M., Frey, G.: Thermoelectric power generation: Peltier element versus thermoelectric generator. In: IECON 2016, 42nd Annual Conference of the IEEE Industrial Electronics Society, pp. 4252–4257. Florence (2016)
5. Buist, R.J., Lau, P.G.: Thermoelectric power generator design and selection from TE cooling module specifications. In: XVI International Conference on Thermoelectrics, Proceedings ICT ’97, pp. 551–554. Dresden (1997)
6. Maneewan, S., Zeghmati, B.: Comparison power generation by using thermoelectric modules between cooling module and power module for low temperature application. In: 26th International Conference on Thermoelectrics, pp. 290–293. Jeju Island (2007)
7. Poh, C.S., et al.: Analysis of characteristics and performance of thermoelectric modules. In: 5th International Symposium on Next-Generation Electronics (ISNE), pp. 1–2. Hsinchu (2016)
8. Chen, M.: Adaptive removal and revival of underheated thermoelectric generation modules. IEEE Trans. Ind. Electron. 61(11), 6100–6107 (2014)
9. Zaman, H.U., Shourov, C.E., Al Mahmood, A., Siddique, N.E.A.: Conversion of wasted heat energy into electrical energy using TEG. In: IEEE 7th Annual Computing and Communication Workshop and Conference (CCWC), pp. 1–5. Las Vegas, NV (2017)
10. Ishiyama, T., Yamada, H.: Suppression of heat leakage by cooling thermoelectric device for low-temperature waste-heat thermoelectric generation. In: IEEE International Telecommunications Energy Conference (INTELEC), pp. 1–5. Osaka (2015)
11. Rocha Liborio Tenorio, H.C., Vieira, D.A., De Souza, C.P.: Measurement of parameters and degradation of thermoelectric modules. IEEE Instrum. Meas. Mag. 20(2), 13–19 (2017)
12. Anatychuk, L.I., Lysko, V.V.: Methods for assuring high quality electric and thermal contacts when measuring parameters of thermoelectric materials. J. Thermoelectr. 4, 83–92 (2014)
13. Kannan, H., Reshme, T.K., Parthiban, P.: Thermoelectric charger. In: 2016 Online International Conference on Green Engineering and Technologies (IC-GET), pp. 1–4. Coimbatore (2016)
14. Yap, Y.Z., Naayagi, R.T., Woo, W.L.: Thermoelectric energy harvesting for mobile phone charging application. In: IEEE Region 10 Conference (TENCON), pp. 3241–3245. Singapore (2016)

Refining Healthcare Monitoring System Using Wireless Sensor Networks Based on Key Design Parameters

Uttara Gogate and Jagdish Bakal

Abstract Wireless sensor networks (WSNs) can be effectively used for continuous monitoring of patients in hospitals and in homes for elderly and baby care. The proposed system is a smart healthcare monitoring system using a WSN, which can continuously monitor patients admitted to the hospital without any interference from wires around the patient's bed. We emphasize the advantage of a wireless sensor network over a wired system by attaching various advanced sensors to this network to collect various body parameters of a patient, such as blood pressure, temperature, heart rate, pulse rate and blood oxygen level (SpO2), wirelessly. A kind of WSN called WBAN is used for this purpose. Many design requirements, such as flexibility, miniaturization, portability, non-intrusiveness, and low cost, are studied and considered to make the system more efficient. Different available wireless standards are compared for the intermediate communication based on various parameters. The proposed system is a healthcare monitoring system with ESP8266 NodeMcu WiFi wireless communication using Arduino Nano boards. This system can detect abnormal health conditions of patients, issue an alarm in emergency conditions, and send an SMS/e-mail to the physician, caregiver and relatives of the patient.

Keywords Healthcare monitoring system · WSN · Arduino · NodeMcu WiFi

U. Gogate (✉)
Department of Information Technology, TSEC, University of Mumbai, Bandra, Mumbai, India
e-mail: [email protected]; [email protected]

J. Bakal
S.S.Jondhale C.O.E. Dombivli (E), Dombivli, Maharashtra, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_33


1 Introduction

In current healthcare systems, different vital body parameters of a patient have to be monitored continuously, closely or remotely, by experts in the medical field. Previously the process was carried out manually by trained nursing staff. This work was tedious, and there were chances of errors due to wrong placement of electrodes and probes, and due to fatigue. So there was a need for a monitoring system that can measure different parameters efficiently and correctly. Wireless sensor network based healthcare systems have been attracting attention in this decade and are currently being applied to improve healthcare around the world [1]. Healthcare monitoring using a wireless sensor network is a wireless network based health parameter monitoring system that eliminates the use of complex wires and electrodes, so that patients can move around freely without being tethered by wires. Many vital body parameters are measured using sensors and sent to a server via wireless networks for storage and further processing [2]. A small-area wireless sensor network (range of 2 m), called a wireless body area network (WBAN), is used to collect the sensor data [3].

2 Literature Survey

WBAN can be used in different ways in many medical applications, such as:

Wearable WBAN
a. Assessing Soldier Fatigue and Battle Readiness
b. Aiding Professional and Amateur Sport Training
c. Sleep Staging
d. Asthma
e. Wearable Health Monitoring

Implant WBAN
a. Cardiovascular Diseases
b. Cancer Detection

Remote Control of Medical Devices
a. Ambient-Assisted Living (AAL)
b. Patient Monitoring
c. Tele-medicine Systems [3]

All healthcare systems that are based on WSN must follow some specific design requirements, such as flexibility, miniaturization, portability, non-intrusiveness and low cost [4]. For healthcare applications the system should have some abilities like:

Table 1 Comparison of four systems based on key designing parameters

System: CodeBlue
  Application: Mass casualty; real-time physiological status monitoring
  Availability: Less, because of bandwidth limitations
  Affordability: High cost
  Compatibility: Multihop routing network
  Efficiency: Lack of reliable communication
  Integrity: The system integrates an RF-based localization system called MoteTrack
  Wearability: Wearable sensor; describes a new, miniaturized sensor mote designed for medical use

System: MoteCare
  Application: In homes and hospitals, to care for the aged and those in poor health
  Availability: Less, because of degrees of software and hardware limitations
  Affordability: Low cost
  Compatibility: Lack of standardized WSN communication protocols for development of WSN applications
  Efficiency: Difficult to troubleshoot; difficult user interface
  Integrity: Robust system
  Wearability: Wearable

System: AMON
  Application: High-risk cardiac/respiratory patients; inside and out yards, home and hospitals
  Availability: Less; the design needs improvements
  Affordability: High cost
  Compatibility: GSM/universal mobile telecommunications systems
  Efficiency: Poor results due to high measurement noise
  Integrity: Required medical accuracy on all the measurements; the duration of measurements is too long
  Wearability: Wearable, wrist-worn

System: SMART
  Application: Inside and outside; both onsite and during transportation
  Availability: High, because it is easy to add a mobile component for the caregiver; geopositioning system
  Affordability: Low cost; inexpensive commodity components
  Compatibility: Highly compatible; PDA design
  Efficiency: Battery life is about three hours; sufficient wireless network capacity for reliable communication
  Integrity: Robust; open-platform hardware and software for ease of modification
  Wearability: Wearable (waist pack); portable


1. Wearability: ability to wear on the body. The system must be light in weight [5].
2. Portability: ability to carry. The system must be small in size.
3. Tolerance: ability to withstand. The system must be robust and must consume low power.
4. Affordability: ability to afford. The system should be low in cost and affordable to common people.
5. Compatibility: ability to be compatible. The system must be compatible with standards-based interface protocols for heterogeneous wireless communication in different communication tiers.
6. Integrity: ability to integrate. The system must allow simplified integration into different tiers of WBAN and E-health applications.
7. Suitability: ability to support patient-specific calibration. The system must be calibrated according to specific thresholds suitable to the specific patient [6].

Many researchers have worked to design systems for specific applications in healthcare, considering different parameters like minimal weight, miniature form factor, low power operation, and patient-specific calibration [7]. Four state-of-the-art systems are compared in Table 1 based on different design parameters. Table 1 provides a summary of the HCWSN characteristics used to meet the specific needs. Four previously proposed state-of-the-art HCWSN systems, CodeBlue [8, 9], MoteCare [10, 11], AMON [12, 13] and SMART [14], are surveyed on the basis of the key system architecture requirements mentioned. It is observed that most systems use wired intra-BAN communication and ZigBee or WiFi inter-BAN communication [15]. Therefore, availability and efficiency are reduced. We aim to use wireless sensors in tier-I of our proposed system, with the existing WiFi communication and Internet facility of the hospital used for inter-BAN and beyond-BAN communication. This will greatly help to increase the integrity, reliability and affordability of the system.

3 Proposed System

In order to achieve the objectives of the system, the modules of the project are summarized as follows:

• Sensors: Various bio-sensors are used to acquire medical parameters from patients. More advanced sensors are used for more accurate results.
• Wireless sensor network: Used to transmit and receive sensor data from patient to server. Different communication technologies are compared based on various parameters to select the right technology for the different communication tiers of WBAN. WiFi is used for tier-1 communication. Arduino boards are used for integrating the system.
• Monitor at ICU server and main server: A GUI is used to display and update the parameters of patients in real time.
• Alert system: An Android-based/mobile phone GSM system is used to send alert messages to authorized users.

4 Implementation

Step 1: Attach the sensory device to the body of the patient and turn it on.
Step 2: The sensors collect the data and transmit it through the ESP8266EX WiFi module attached to the Arduino Nano board.
Step 3: At the receiving end, the transmitted data is received on the laptop/PC.
Step 4: The received data is processed and displayed using custom software developed in VB6.
Step 5: All the parameters, viz. body temperature, stress, heart rate and ECG of the patient, are displayed using the specially developed software.
Step 6: The processed data is stored in a database for further reference and sent to the main server.
Step 7: If the received values exceed the medically predefined thresholds, alerts are sent to the corresponding doctors, nurses and relatives.
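The threshold check in Step 7 can be sketched as follows. This is a minimal Python illustration and not the authors' VB6 software: the `Reading` structure, the parameter names, and the numeric limits are all hypothetical placeholders chosen for the example, and they are not clinical guidance or values from the paper.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limits for illustration only; real thresholds would be set
# by medical staff and calibrated per patient.
THRESHOLDS = {
    "body_temperature_c": (35.0, 38.0),
    "heart_rate_bpm": (50.0, 120.0),
}


@dataclass
class Reading:
    patient_id: str
    parameter: str
    value: float


def check_reading(reading: Reading, thresholds=THRESHOLDS) -> Optional[str]:
    """Return an alert message if the value is outside its limits, else None."""
    limits = thresholds.get(reading.parameter)
    if limits is None:
        return None  # no threshold configured for this parameter
    low, high = limits
    if reading.value < low or reading.value > high:
        return (
            f"ALERT patient {reading.patient_id}: "
            f"{reading.parameter}={reading.value} outside [{low}, {high}]"
        )
    return None


alert = check_reading(Reading("1", "heart_rate_bpm", 130.0))
normal = check_reading(Reading("1", "body_temperature_c", 36.8))
print(alert)
print(normal)
```

Keeping the limits in a per-parameter table makes it straightforward to substitute patient-specific values, in line with the suitability requirement discussed earlier.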

5 Result Analysis

The proposed system [15] is implemented using sensors for vital parameter measurements. Simple Arduino boards such as the Arduino Uno and Nano were used to transmit data from the sensors and to process the data received from the sensors, using ZigBee or WiFi for wireless communication. Many revisions of the proposed system design were made to meet the requirements of an efficient healthcare monitoring system. Two such attempts are compared in Table 2.

Table 2 Comparison of proposed and revised systems

                                 Proposed system                 Revised system
1. Name of sensor                Type of sensor                  Type of sensor
   Temperature                   LM 35                           DS18B20
   Skin response sensor          Galvanic skin response          Galvanic skin response
   Pulse and heart rate sensor   LED/LDR based pulse oximeter    MAX 30100
   ECG                           –                               AD 8232
2. Name of board                 Arduino Uno                     Arduino Genuino/Nano
3. Communication                 ZigBee                          WiFi


Fig. 1 Proposed System architecture based on WSN

Using the revised system in Fig. 1, we can measure various vital body parameters of patients inside and outside the hospital or at home in real time, even when the patient is moving around. We used the system to collect actual data from 10 patients. Figure 2 shows sample results of temperature and pulse rate, and Fig. 3 shows the ECG of patient 1. In the revised system the sensors are connected to an Arduino Nano board. As the board and sensors are very small compared to the previous Arduino Uno boards, the form factor is considerably reduced. NodeMcu ESP8266 WiFi wireless communication is used to make the communication simpler, low cost and energy efficient.

Fig. 2 Sample sensor result showing temperature and pulse rate of patient 1

Fig. 3 Sample ECG sensor result showing heart rate of patient 1

The above changes have the following effects:

1. As the form factor of the sensors and board is considerably reduced, the size of the system is reduced.
2. Communication is over WiFi, therefore availability and energy efficiency are increased.
3. The number of sensors is increased, therefore the complexity of the system is increased.

By revising the proposed system, many parameters are achieved:

1. Wearability: due to miniaturization of components, the size of the sensors and board is reduced; therefore the form factor is considerably reduced.
2. Portability: due to reduced weight and size, the system becomes portable.
3. Affordability: components in the revised system are cheaper than in the previous system, so the cost is reduced.
4. Availability: communication is over WiFi, therefore availability is increased.
5. Accuracy: more advanced sensors are used, which increases the accuracy and precision of the data.
6. Integrity: as existing WiFi from the home or hospital can be used, integrity is increased.
7. Efficiency: the number of low-energy sensors is increased, therefore the complexity and efficiency of the system are increased.

6 Conclusion

The wireless healthcare monitoring system design proposed here exhibits properties such as light weight, miniaturized form factor, low power consumption and patient-specific calibration. It is a low-cost, easy-to-operate and user-friendly system that can measure vital body parameters like temperature, pulse rate, galvanic skin response, heart rate and ECG successfully. As the board and sensors are very small compared to the previous Arduino Uno boards, the form factor is considerably reduced. The low-cost NodeMcu ESP8266 WiFi communication chip is used for wireless communication to make the communication more reliable by increasing its availability. Thus, with the use of more advanced and low-cost system components like sensors and boards, our revised system becomes wearable, more cost effective and efficient.

7 Future Scope

In the system implementation, GSR sensor readings are not yet shown on the monitor. They are to be added later. Blood pressure and respiration rate sensors with high precision and accuracy are to be added. The existing system is to be extended to a larger number of sensors and a larger number of patients in order to propose a prototype healthcare system. According to suggestions by doctors, new sensors, such as ones for urine level and saline level detection, can be added to the system.

Acknowledgements Author Uttara Gogate and Author Jagdish Bakal want to thank Dr. R. A. Marathe, Dr. H. M. Thakur and the management and staff of M. D. Thakur Memorial Hospital, Dombivli, Thane, MS (India) for their help.

Conflict of Interest: Author Uttara Gogate and Author Jagdish Bakal declare that they have no conflict of interest.

Funding There is no funding source.

Disclosure of potential conflicts of interest Author Uttara Gogate and Author Jagdish Bakal declare that they have no potential conflict of interest.

Informed consent Informed consent was obtained from all individual participants included in the study.

Ethical approval Ethical approval was obtained from the institute ethical committee of S. S. Jondhale C.O.E. Dombivli, India. This article does not contain any studies with animals performed by any of the authors. All procedures performed in studies involving human participants (whose written consent was obtained) were in accordance with the ethical standards of the institutional ethical research committee and with comparable ethical standards of the Indian medical council.


References

1. Giordano, S., Puccinelli, D.: When Sensing Goes Pervasive. Networking Laboratory, ISIN-DTI, SUPSI, Switzerland
2. Gogate, U., Bakal, J.: Evaluation of performance parameters of healthcare monitoring system. In: Proceedings of Conference ICIRTE 2017 (2017)
3. Movassaghi, S., Abolhasan, M., Lipman, J., Smith, D.: Wireless body area networks: a survey. IEEE Commun. Surv. Tutor. Accepted for publication. IEEE (2013)
4. Mahesh Kumar, D.: Healthcare monitoring system using wireless sensor network. Int. J. Advanc. Netw. Appl. 04(01), 1497–1500 (2012). ISSN: 0975-0290
5. Milenković, A., Otto, C., Jovanov, E.: Wireless sensor networks for personal health monitoring: issues and implementation (2006)
6. Zhang, S., Qin, Y.P., Mak, P.U., Pun, S.H., Vai, M.I.: Real time medical monitoring system design based on intra-body communication. J. Theor. Appl. Informat. Technol. 47(2), 649–652 (2013)
7. Filipe, L., Fdez-Riverola, F., Costa, N., Pereira, A.: Wireless body area networks for healthcare applications: protocol stack review. Int. J. Distrib. Sens. Netw. 2015 (2015)
8. Shnayder, V., Chen, B., Lorincz, K., Fulford-Jones, T.R.F., Welsh, M.: Sensor Networks for Medical Care. Harvard University, Technical Report TR-08–05 (2005)
9. Malan, D., Fulford-Jones, T., Welsh, M., Moulton, S.: CodeBlue: an ad hoc sensor network infrastructure for emergency medical care. In: Proceedings of MobiSys 2004 Workshop on Applications of Mobile Embedded Systems (WAMES 2004), June 2004
10. Navarro, K.F., Lawrence, E.: WSN applications in personal healthcare monitoring systems: a heterogeneous framework. In: Second International Conference on e-Health, Telemedicine, and Social Medicine (2010)
11. Navarro, K.F., Lawrence, E., Debenham, J.: Intelligent network management for healthcare monitoring. In: Proceedings of 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems on Trends in Applied Intelligent Systems, IEA/AIE 2010, Cordoba, Spain, June 1–4, 2010, Part III (2010)
12. Anliker, U., et al.: AMON: a wearable multiparameter medical monitoring and alert system. IEEE Trans. Informat. Technol. Biomed. (TITB) 8, 415–427 (2004)
13. Anliker, U., Beutel, J., Dyer, M., Enzler, R., Lukowicz, P., Thiele, L., Troster, G.: A systematic approach to the design of distributed wearable systems. IEEE Trans. Comput. 53(8), 1017–1033 (2004)
14. Curtis, D., Shih, E., Waterman, J., Guttag, J., Bailey, J., et al.: Physiological signal monitoring in the waiting areas of an emergency room. In: Proceedings of BodyNets 2008. Tempe, Arizona, USA (2008)
15. Gogate, U., Bakal, J.W.: Smart healthcare monitoring system based on wireless sensor networks. In: International Conference on Computing, Analytics and Security Trends (CAST). IEEE (2016)

OLabs of Digital India, Its Adaptation for Schools in Côte d’Ivoire, West Africa

Hal Ahassanne Demba, Prema Nedungadi and Raghu Raman

Abstract This exploratory study examined the potential for adapting Online Labs (OLabs), an innovative educational initiative of Digital India, for use in secondary schools in Côte d’Ivoire, West Africa. OLabs provides exposure to and participation in scientific experiments for students attending low-resource schools that cannot provide expensive laboratory equipment. Following a site visit in India, we examined Côte d’Ivoire curriculum documents and interviewed a variety of Ivorian stakeholders, including school principals, science teachers, and students, to evaluate needs and identify which experiments might be best suited for the Côte d’Ivoire context. Thirty experiments across the three scientific disciplines were identified as appropriate, and of these, 12 were translated from English to French, the official language of instruction in Côte d’Ivoire. These 12 experiments were published online and tested in Côte d’Ivoire by Ivorian teachers (n = 5) in terms of connectivity, user registration, accessibility of the translated experiments, and clarity of their content. 80% of the participants in this test rated the connectivity of the platform as good and indicated that the translated experiments were easily accessible. However, their deployment, monitoring, and empirical evaluation will need to be supported by public authorities and organizations of goodwill.

Keywords Simulation · Experimental practice · Adaptation · Usability · Training

H. A. Demba
Université de Cergy Pontoise, Cergy, France
e-mail: [email protected]

P. Nedungadi (✉)
Center Research in Analytics & Technologies for Education (CREATE), Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India
e-mail: [email protected]

R. Raman
Amrita School of Business, Amrita Vishwa Vidyapeetham, Coimbatore, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_34


1 Introduction The lack of experimental equipment and poor school systems in Africa in general and Côte d’Ivoire in particular constitute a significant handicap for the teaching and learning of experimental sciences—physics, chemistry, and biology. A major solution to overcome this problem involves usage of computer simulations in the teaching of experimental sciences. Several scientific studies have shown the positive contributions of this approach, including improvement in the level of the student conceptualization [1] and understanding the behavior of phenomena despite knowledge gaps in mathematics [2]. Use of scientific computer simulations offers additional benefits such as avoiding exposure to dangerous experiments [3] and it addresses problems related to lack of funding for lab equipment [4]. Use of simulations could not only address the pervasive lack of laboratories in secondary schools [5] in Africa but also attract and motivate students as digital technologies have become part of their culture. Successful adaptation, however, will require the adaption of technologies to the context of the African environment. In addition, potential learners would need to develop an understanding of the scientific method: experimentation, observation, interpretation, and conclusion. Unfortunately, many existing applications do not address these requirements. They are usually partial and only focus on teaching/learning a single concept. The integration of digital technologies in teaching and learning in schools has sometimes been a source of tension. Generally imposed by policymakers, these technologies—known under the name of Interactive Learning Environment (ILE)— often seem to appear without any introduction or training to ensure smooth transition and adaptation into existing pedagogical activities. Learning activities involve complex processes which depend on complex variables such as psychological, sociological, cultural contextual, and pedagogical approach. 
All of these variables must be adequately considered to improve the probability of successful outcomes. We therefore draw on the approach of Tricot et al. [6], which focuses on utility, usability, and acceptability and offers a parsimonious way to consider these complex variables. According to them, utility relates to the educational effectiveness of the tool: Does the ILE allow students to learn what they are supposed to learn? Usability relates to the ability to manipulate the ILE: Is it easy to handle, use, and reuse without wasting time and without making handling mistakes? Acceptability concerns the decision to use the ILE: Is it compatible with the values, the culture, and the organization in which it must be adopted? One aspect of acceptability is language; thus, in adapting OLabs to Côte d’Ivoire schools, each experiment must be translated into French. Adapting instruments to their users (teachers and students) is, therefore, a critical factor in their educational integration. Hence, we analyzed the adaptability of these online experiments to the Ivorian context and conducted a pilot study.

OLabs of Digital India, Its Adaptation for Schools …


2 Online Labs (OLabs) in Digital India

Online Labs (OLabs) is a national initiative under Digital India that aims to provide training on OLabs to 30,000 teachers in all Indian states. OLabs is available in local Indian languages, is offered at both urban and rural schools, and is hosted on the National Knowledge Network (NKN) to support scaling to millions of students across the nation. OLabs is an innovative web-based pedagogical tool for practical laboratory experiments (Fig. 1). It includes laboratory experiments in physics, chemistry, and biology for students in grades 9–12, and its content is based on the NCERT curriculum. An OLabs experiment contains a well-explained theory section, a real-lab and simulator manual, an interactive simulation, animations, real laboratory videos, quizzes, and resources. These online experiments are particularly helpful for students whose schools have no facilities for practical lab experiments or lack apparatus because of its high cost. A similar project named “MEDSIM” is used by medical students to visualize, learn, and practice a variety of medical skills and procedures through computer-based medical simulations [7]. OLabs is hosted on the Amrita VLCAP platform [8], which allows multilingual authoring and publication of the various sections of an experiment, such as Theory, Procedure, Simulation, and Animation, and also allows links to external objects. The use of OLabs in the classroom constitutes a paradigm shift in the pedagogical activity of Ivorian educational institutions. The triangle of analysis, a

Fig. 1 Online Labs (OLabs) for science experiments


Fig. 2 The structure of a human activity system

sequence of activity (Fig. 2), according to the Engeström model [9], reflects this complexity. In this model, the subject is a student or a group of students formed for the activity. The object is the topic being taught or learned. Tools and signs are the set of tools used in the activity (online experiments, teaching language, or blackboard). The community is the set of students in the class. Division of labor represents the different actions that the actors carry out to achieve the same goal. Rules help maintain and regulate action within the classroom (instructions, prescriptions, guidance of students, and validation of students’ answers during the construction of knowledge). We believe that taking all these poles into account can show the real impact of OLabs on teaching and learning. Appropriate improvements could then be introduced into the teacher training strategy for their use. It is under such conditions that the strategy for using OLabs will be built for the Ivorian context.

3 Educational System in India and Côte d’Ivoire

According to the website of Ecole du Monde (Schools of the World), the education system in India typically has five cycles of education, whereas the education system in Côte d’Ivoire is divided into four cycles. Although the two educational systems are different, some equivalencies exist (Fig. 3). Thus, the middle school, secondary school, and higher secondary school levels of the Indian education system correspond to the levels of secondary education in the Ivorian education system. In the Indian education system, experimental science is generally taught through the end of middle school, whereas in Côte d’Ivoire it is taught through the end of primary school. In India, differentiated science subjects begin in Class 9 [10], but in Côte d’Ivoire this separation begins in the 6th grade, so the teaching of differentiated science subjects does not begin at the same level in the two systems. The objective of this research is to study the possibilities of successfully adapting the OLabs online approach to STEM learning [11] in Côte d’Ivoire. This involves analyzing the


Fig. 3 Comparing education system in India and Côte d’Ivoire

content of the online experiments of classes 9 and 10 in the Indian education system, assessing how these fit with the programs for teaching physics, chemistry, and biology at the secondary level in Côte d’Ivoire, selecting the relevant experiments, customizing and translating them into French, and finally piloting OLabs experiments in the Ivorian context.

4 Research Hypothesis

To conduct this study, we propose the following hypothesis: given that there is an equivalence of levels between the Indian and Ivorian education systems, it is possible to adapt some online experiments used in secondary education in India to secondary education in Côte d’Ivoire.

• It is possible to find correspondences between the proposed experiments taught in classes 9 and 10 (India) and the experiments taught in secondary schools in Côte d’Ivoire.
• It is possible to adapt the proposed experiments by updating their context and translating them into French.


The variables of interest were the connectivity to the website, the access to the translated online experiments, and the clarity of the French language used to express the content of these online experiments.

4.1 Procedure, Participants, and Field of Study

Initially, the Côte d’Ivoire team visited a school in Kerala, India, to observe how OLabs is used in the classroom, and compared the OLabs content with the official curriculum documents of Côte d’Ivoire to identify the online experiments appropriate for the Ivorian educational context. This step involved gathering opinions from the interviewed participants: science teachers (n = 9), students (n = 27), and Ivorian teachers (n = 5). Thirty online experiments were identified for this trial: 11 in physics, 12 in chemistry, and 7 in biology. Of these, 12 (4 in physics, 4 in chemistry, and 4 in biology) were selected for translation from English to French, the official language of instruction in Côte d’Ivoire. These 12 experiments were integrated into the curriculum. The use of OLabs in the classroom was then carefully observed and assessed.

4.2 Data Collection and Analysis

The data of this research concern the online experiments and the material conditions of their use in the classroom. The data on the online experiments include their aim, the specific class in which they are used, information on how they have been modeled, and the technical accessibility of the content. The data on the conditions of use include the mode of deployment, accessibility for online use, the hardware required, the comfort level of teachers and students, the frequency of use, the problems encountered, and their resolution. The data on classroom use include the scenarios in which each experiment was used (the moment it was introduced in the course, the interactions of students while the teacher used the online experiment, the experimental practices of students, and the activities proposed by the students to measure learning outcomes after the use of the online experiment), the utility and usability of the online experiments presented by the teacher, and the students’ attitude toward these online experiments. The resources used for data collection were OLabs brochures, the OLabs website, mailing lists for user feedback, information retrieved from the Internet, guides and official curriculum documents for the teaching of physics and chemistry at the secondary level in Côte d’Ivoire, e-mail exchanges with the Science and Technology department of ENS Abidjan in Côte d’Ivoire, and interviews. The interviews were conducted in India with the teachers and students participating in the study. Data regarding the online experiments were processed according to the following procedure: compare online experiments to the


content of educational programs in Côte d’Ivoire, select usable experiments for Côte d’Ivoire, extract the content of the online experiments, and translate that content into French. By consulting guides and teaching programs (secondary school physics and chemistry programs in Côte d’Ivoire) and the Science and Technology personnel of ENS Abidjan, 30 usable online experiments appropriate for the Ivorian context were identified: 11 in physics, 12 in chemistry, and 7 in biology. The translated content of the Theory, Procedure, and Resources sections was updated using Amrita VLCAP [12]. Relevant images were attached using the Insert tool of the editor. Then, the links to the Animation, Simulator, and Video sections were inserted in the editor.

5 Results

5.1 Experiments Corresponding to the Ivorian Context

At the secondary level in Côte d’Ivoire, the same experiment can be presented at different class levels, using concepts suited to each level. Thirty online experiments were identified as usable in Côte d’Ivoire secondary schools: 11 in physics (P), 12 in chemistry (C), and 7 in biology (B). Online experiments P3, P5, P8, P11, C5, C7, C9, C10, C11, B4, and B7 can each be used in more than one class, as shown in Table 1. Because some of the 30 usable online experiments may be used in multiple contexts in Côte d’Ivoire schools, they may be counted more than once; the original 30 thus correspond to 51 usable online experiments when the different application cases and experimental techniques are considered. In terms of adaptation, especially as regards language, we translated 12 online experiments: 4 in physics, 4 in chemistry, and 4 in biology. Table 1 lists the online experiments translated into French.

5.2 Instance and Pilot of Online Labs for Côte d’Ivoire

A version of OLabs was created for the Ivorian context in the initial pilot. Five Ivorian teachers participated in the test of this version, and each completed an online questionnaire. The graphs in Fig. 4 show the result of this test.


Table 1 List of usable online experiments translated into French

Online experiment (number of experiments included) | Class in India | Class in Côte d’Ivoire
Physics
P1 Verification of Archimedes’ principle (02) | Class 9 | 3ème
P2 Determination of density of solid (02) | Class 9 | 3ème
P3 Verification of Newton’s 3rd law of motion using two spring balances (01) | Class 9 | 3ème; 2nde
P4 Newton’s second law (01) | Class 9 | Tle
Chemistry
C1 Separation of mixtures using different techniques (05) | Class 9 | 5ème
C3 Chemical reactions (05) | Class 9 | 5ème, 4ème, 3ème; 2nde
C7 Determine pH with pH indicator strips/universal indicator solution (02) | Class 10 | 3ème; 2nde, Tle
C10 Saponification: the process of making soap (01) | Class 10 | 1ère
Biology
B1 Onion and cheek cells (02) | Class 9 | 1ère D
B3 Detection of starch in food samples (01) | Class 9 | 3ème
B4 Life cycle of a mosquito (01) | Class 9 | 5ème; 4ème
B6 Asexual reproduction in amoebae and yeast (02) | Class 10 | 3ème

Fig. 4 Bar charts showing the results of the pilot


6 Discussion

This exploratory study suggests that Online Labs (OLabs), developed in India, may be successfully adapted for use in secondary schools in Côte d’Ivoire, West Africa. This would allow students attending low-resource schools to participate in a wide variety of experiments in biology, chemistry, and physics. The preferred mode of deployment of OLabs is through the Internet, which allows anywhere, anytime access to virtual experiments from a desktop computer, a laptop, or a tablet. An offline deployment mode is also available for schools. OLabs makes important teaching aids available to teachers and students of secondary education in Côte d’Ivoire to improve the teaching and learning of physics, chemistry, and biology. It helps to compensate for the lack of experimental equipment in secondary schools in Côte d’Ivoire. Having established that OLabs can be adapted in Côte d’Ivoire, the next steps are the following:

1. Development of educational scenarios, by subject and class, to show exactly when the experiments must be used.
2. Organization of teacher training workshops in two steps (for new computer users, and for the use of online experiments based on the pedagogical scenarios).
3. Monitoring and evaluation of the use of the online experiments in ordinary classrooms.

According to Usoh and Slater (1995), “in an immersive virtual environment ideal, all the sensory input from the user is continuously supplemented by sensory stimuli from the virtual environment” [13]. Some OLabs experiments are highly immersive at the animation level and, as future work, could be augmented with haptics so that the user can act directly on the hardware (take, turn, open, pull, etc.) rather than just click [14, 15]. In some experiments, the learner receives detailed guidance. According to Tricot [16], however, a learner who is guided too much learns little or nothing. Future studies can compare the performance and motivation of learners based on the amount of guidance provided by the system [17].

7 Conclusion

The objective of this research was to study the possibilities for adapting and using, in the Ivorian context, the OLabs used in secondary schools in India for the teaching and learning of physics, chemistry, and biology. The study found that 30 existing online experiments used in secondary schools in India are potentially usable in secondary schools in Côte d’Ivoire. The 12 translated experiments were deployed and tested by teachers from Côte d’Ivoire. In this test, 80% of participants rated connectivity to the platform as good and indicated that the translated experiments were easily accessible. All participants (100%) found the content translated into French to be clear. In addition, the study showed that OLabs are visually immersive. The study also suggested that introducing OLabs into the classroom induces a paradigm shift in the educational activity of the teacher. Direct


observations of videotaped courses should be conducted on the school grounds to analyze their actual impact on teachers and learners. This could lead to refinements either to the interactive applications or to the teacher training strategy. There has thus far been no scientific study of the adaptation of an eLearning innovation such as OLabs in Côte d’Ivoire. Thus, an exploratory approach was judged suitable for this study [18]. In addition, the participants of this study are the “social” actors relevant to the subject of research [19]. The diversity of information sources allowed the collection of different opinions in different contexts [20]. The use of various data collection instruments led to some data triangulation [21]. We believe that OLabs, given its quality, is an important tool for improving the teaching and learning of experimental sciences in Côte d’Ivoire and in other West African countries. The comparison of curricula enriches the school curriculum and the teacher training programs. We see the adaptation of OLabs as a significant opportunity for West African countries to provide virtual labs that make up for the current paucity of resources for experimental sciences in Côte d’Ivoire. The OLabs initiative could substantially help West African countries under the Indo-Africa cooperation between the government of India and those of West African countries.

Acknowledgements The first author was supported by the CV Raman International fellowship under the India-Africa initiative of the Department of Science & Technology, Government of India. Online Labs for School Experiments (OLabs) is an initiative of the Ministry of Electronics and Information Technology, Government of India. The current research study is in full accordance with the ethical standards established by the Ethical Research committee at our institution. Informed consent was obtained from the participants in our study.

References

1. Durey, A., Beaufils, D.: L’ordinateur dans l’enseignement des sciences physiques: Questions de didactique. Actes des 8ème Journées Nationales Informatique et pédagogie des Sciences Physiques, Udp et INRP, pp. 63–74 (1998)
2. Frising, F., Cardinael, G.: L’aide informatique aux travaux pratiques de Physique: avant, pendant et après la manipulation. 8èmes Journées Informatique et Pédagogie des Sciences Physiques, Montpellier (1998)
3. Milot, M.C.: Place des nouvelles technologies dans l’enseignement de la physique-chimie (1996)
4. Richoux, B., Salvetat, C., Beaufils, D.: Simulation numérique dans l’enseignement de la physique: enjeux, conditions. Bulletin de l’Union des physiciens 842, 497–521 (2002)
5. Kolikant, Y.B.D.: Digital students in a book-oriented school: Students’ perceptions of school and the usability of digital technology in schools. J. Education. Technol. Soc. 12(2), 131 (2009)
6. Tricot, A., Plégat-Soutjis, F., Camps, J.F., Amiel, A., Lutz, G., Morcillo, A.: Utilité, utilisabilité, acceptabilité: interpréter les relations entre trois dimensions de l’évaluation des EIAH. In: Environnements Informatiques pour l’Apprentissage Humain, pp. 391–402. ATIEF, INRP (2003)
7. Nedungadi, P., Raman, R.: The medical virtual patient simulator (MedVPS) platform. Intelligent Systems Technologies and Applications, pp. 59–67. Springer, Cham (2016)


8. Nedungadi, P., Ramesh, M.V., Pradeep, P., Raman, R.: Pedagogical Support for Collaborative Development of Virtual and Remote Labs: Amrita VLCAP. In: Auer, M.E., Azad, A.K.M., Edwards, A., de Jong, T. (eds.) Cyber-physical laboratories in engineering and science education. Springer, Cham (2018)
9. Benayed, M., Verreman, A.: Evaluation d’un dispositif d’apprentissage collaboratif à distance. In: Colloque TiceMed (2006)
10. Nedungadi, P., Malini, P., Raman, R.: Inquiry based learning pedagogy for chemistry practical experiments using OLabs. Advances in Intelligent Informatics, pp. 633–642. Springer, Cham (2015)
11. Raman, R., Haridas, M., Nedungadi, P.: Blending Concept Maps with Online Labs for STEM Learning. Advances in Intelligent Informatics, pp. 133–141. Springer, Cham (2015)
12. Raman, R., Nedungadi, P., Achuthan, K., Diwakar, S.: Integrating collaboration and accessibility for deploying virtual labs using vlcap. Int. Trans. J. Eng. Manag. Appl. Sci. Technol. 2(5), 547–560 (2011)
13. Tyndiuk, F.: Référentiels Spatiaux des Tâches d’Interaction et Caractéristiques de l’Utilisateur influençant Performance en Réalité Virtuelle. Doctoral dissertation, Université Victor Segalen-Bordeaux II (2005)
14. Ramesh, M.V., Pradeep, P., Divya, P.L., Devi, R.A., Rekha, P., Sangeeth, K., Rayudu, Y.V.: AMRITA remote triggered wireless sensor network laboratory framework. In: SenSys, pp. 24–1 (2013)
15. Ramesh, G., Sangeeth, K., Maneesha, R.: Flexible extensible middleware framework for remote triggered wireless sensor network lab. Intelligen. Syst. Technol. Appl. (2016)
16. Tricot, A.: Guidages, apprentissages et documents: Réponse à Françoise Demaizière. Notions en Quest. 8, 105–108 (2004)
17. Raman, R., Achuthan, K., Nedungadi, P., Diwakar, S., Bose, R.: The vlab oer experience: modeling potential-adopter student acceptance. IEEE Trans. Educat. 57(4), 235–241 (2014)
18. Trudel, L., Simard, C., Vonarx, N.: La recherche qualitative est-elle nécessairement exploratoire? Recherches qualitatives, pp. 38–45 (2006)
19. Savoie-Zajc, L.: Comment peut-on construire un échantillonnage scientifiquement valide? Recherches Qual. 99–111 (2006)
20. Grimes, D., Warschauer, M.: Learning with laptops: a multi-method case study. J. Education. Comput. Res. 38(3), 305–332 (2008)
21. Karsenti, T., Collin, S., Harper-Merrett, T.: Intégration pédagogique des TIC: Succès et défis de 87 écoles africaines. IDRC, Ottawa, CA (2011)

Energy Harvesting Based on Magnetic Induction

A. A. Gaikwad and S. B. Kulkarni

Abstract A growing body of research in energy conservation concentrates on the application of energy management sensors in power distribution systems. This study presents an energy harvesting system that uses the stray magnetic energy produced by the electric current in a power line of a distribution system to supply low-power devices such as wireless sensor network nodes. The method uses ferrite toroidal cores, tested and validated with various numbers of secondary turns. The design relies on magnetic field theory and the properties of ferromagnetic materials to obtain energy for powering small electronic devices. The first step determines the magnetic field around the current-carrying wire using magnetic field theory. The second step tests the proposed harvester while varying the number of secondary turns. The third step compares the selected cores to identify the one that gives the best result.



Keywords Energy harvesting ⋅ Magnetic fields ⋅ Magnetic flux density ⋅ Ferrites ⋅ Permeability ⋅ Electromagnetic induction ⋅ Voltage

A. A. Gaikwad (✉) Electrical, Electronics and Power, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India
S. B. Kulkarni Electrical Engineering, Government Polytechnic Washim, Washim, Maharashtra, India
© Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_35

1 Introduction

Recently, the expansion of low-power devices and the demand for wireless sensors have together driven the development of energy harvesting. Energy harvesting is the collection and conversion of small amounts of readily available energy in the surrounding environment into usable electrical energy. The harvested energy can be used directly or stored for later use. This gives an


effective source of energy for applications where no mains power is available and where other renewable sources such as wind and solar are not practical. Energy harvesting is largely maintenance free and widely applicable. Recent designs favor harvested energy for battery-less devices or, alternatively, for devices with an integrated battery charger. Energy harvesting is very important in applications such as remote corrosion monitoring systems, implantable devices and remote patient monitoring, structural monitoring, the Internet of Things (IoT), and equipment monitoring. An energy harvester is a device that generates energy from external sources. Because the energy from harvested sources is sometimes irregular and small, the system must be designed to harvest and store power efficiently. Energy harvesting based on stray magnetic induction is receiving considerable attention; it is also suitable for measuring variables on network lines. Previous studies [1–9] have introduced several methods of energy harvesting near power lines: (1) harvesting from the electric field around the power line; (2) harvesting magnetic energy by magnetic induction close to the current-carrying power line; and (3) piezoelectric energy harvesting to power MEMS sensors in smart grid applications. Several examples of energy harvesting based on magnetic induction have been reported. Roscoe and Judd [10] developed an inductive harvester for locations where a magnetic field is present around line conductors; their results show that 300 μW can be obtained at a magnetic flux density of 18 μT.
In [11], the authors described an optimized inductive harvester, for use where a magnetic field exists around a current-carrying conductor, that provides 17.64 W of power with a primary conductor current of about 102.5 Arms. In [12], a miniature harvesting device was introduced consisting of five layers of a cylindrical core made of flexible magnetic material with 280 secondary coil turns; it generated 0.87 Vrms at a line current of 13.5 A. In [13], experiments showed that a ferrite-based energy harvesting system is more flexible than nanocrystalline and iron powder cores. The present study uses electromagnetic energy harvesting from the stray magnetic field generated by the electric current in a single-phase power cord of a distribution system, which can supply energy to the small energy management circuits of low-power devices.

2 Magnetic Field Theory

The magnetic field available for energy harvesting around current-carrying conductors can be explained by Ampere’s law as follows. The electric current flowing through a wire produces a magnetic field in concentric circles around the wire. Therefore, as indicated by Ampere’s


law, the magnetic flux density at a distance r from an infinitely long wire (length l ≫ r) carrying an alternating current is

\[ B = \frac{\mu_0 I}{2\pi r} \quad (\mathrm{T}) \tag{1} \]

where B is the magnetic flux density at a distance r from the wire and I is the current in the conductor. Equation (1) shows that the magnetic flux density is directly proportional to the current: it increases as the current increases. Figure 1 shows the distribution of the magnetic flux density around a conducting wire in the built environment. The conductor is 18 AWG and carries a current of 1–10 A. The magnetic flux density plotted in Fig. 1 is calculated using the permeability of free space (μ0 = 4π × 10⁻⁷ H/m) and Eq. (1), for radial distances of 2–30 mm from the current-carrying conductor. According to Fig. 1, at a radial distance of 2 mm the magnetic flux density is less than 0.1 T at low current and rises to more than 0.6 T at a current of 10 A. The magnetic flux density around the conducting wire also decreases as the radial distance from the conductor increases.
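The dependence of Eq. (1) on current and distance can be sketched numerically. The following is a minimal illustration that only evaluates Eq. (1) in free space, with no core present; the function name `flux_density` and the sampled currents and distances are choices made for this example, not part of the original study.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of free space, H/m


def flux_density(current_a: float, distance_m: float) -> float:
    """Eq. (1): flux density in tesla at distance_m from a long straight wire."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return MU_0 * current_a / (2 * math.pi * distance_m)


if __name__ == "__main__":
    # Illustrative sample points inside the 1-10 A and 2-30 mm ranges of the text.
    for current in (1, 5, 10):
        for r_mm in (2, 10, 30):
            b = flux_density(current, r_mm / 1000)
            print(f"I = {current:2d} A, r = {r_mm:2d} mm -> B = {b:.3e} T")
```

Evaluated this way, Eq. (1) gives about 1 mT at 10 A and 2 mm in free space, and the value scales linearly with current and inversely with distance.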

Fig. 1 Magnetic flux density around conducting wire for current 1–10 A


The magnetic flux Φ can be concentrated inside a ferromagnetic core. For a core of cross-sectional area A, it is given by

\[ \Phi = B A \tag{2} \]

The induced voltage in the secondary turns is given by Faraday’s and Lenz’s law:

\[ V_s = N_2 \frac{d\Phi}{dt} \tag{3} \]

where N2 is the number of secondary winding turns.
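Equations (1) to (3) can be chained into a rough estimate of the secondary voltage. The sketch below is an idealized model under assumptions that are not stated in the paper: a closed toroidal core that stays linear (flux density scaled by the relative permeability), a sinusoidal primary current, and a hypothetical mean core radius of 15 mm. The function name and parameter names are also illustrative.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of free space, H/m


def induced_voltage_rms(i_rms, turns, mu_r, area_m2, mean_radius_m, freq_hz=50.0):
    """Ideal RMS secondary voltage from Eqs. (1)-(3) for a sinusoidal current."""
    # Eq. (1) scaled by mu_r: flux density inside the core (assumed linear, unsaturated).
    b_rms = MU_0 * mu_r * i_rms / (2 * math.pi * mean_radius_m)
    # Eq. (2): flux through the core cross section.
    flux_rms = b_rms * area_m2
    # Eq. (3): for a sinusoid, the time derivative contributes a factor 2*pi*f.
    return turns * 2 * math.pi * freq_hz * flux_rms


# 9 A, 2500 turns, mu_r = 5000, 0.8 cm^2 and 50 Hz are taken from the text;
# the 15 mm mean radius is a hypothetical value chosen for illustration.
v_ideal = induced_voltage_rms(9.0, 2500, 5000, 0.8e-4, 0.015)
print(f"Ideal secondary voltage: {v_ideal:.1f} Vrms")
```

With these inputs the ideal figure is roughly 37.7 Vrms, well above the 8.3 Vrms reported for the J-type core. The sketch ignores core saturation, leakage flux, and winding losses, so it is better read as an upper bound than as a prediction of the measurements.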

3 Experimental Details

To validate the theoretical analysis, an experimental setup was built with a multimeter and a clamp meter for measuring voltage and current, respectively. An autotransformer was used to hold the single-phase AC supply at a constant 230 V, 50 Hz. Ten switches turn resistive loads ON and OFF to set different values of load current through the ferrite core. Two ferrite cores were selected with the same cross-sectional area (0.8 cm²) and relative permeabilities (μr) of 5000 and 3000. By varying the resistive load from 200 W to 2 kW, the primary current can be varied from 0 to 10 A. The experimental setup is shown in Figs. 2 and 3.
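The relation between the switched load and the primary current can be checked with a short calculation. The sketch assumes a purely resistive load (so I = P/V) and ten equal 200 W steps; the equal step size is a hypothetical reading of "ten switches" and "200 W to 2 kW", not something the text states.

```python
SUPPLY_V = 230.0  # supply voltage from the text, V
STEP_W = 200.0    # hypothetical load added per switch, W


def primary_current(load_w: float, supply_v: float = SUPPLY_V) -> float:
    """RMS current drawn by a purely resistive load of load_w watts."""
    return load_w / supply_v


# Current after closing 1, 2, ..., 10 switches.
currents = [primary_current(STEP_W * n) for n in range(1, 11)]

for n, amps in enumerate(currents, start=1):
    print(f"{n:2d} switch(es): {STEP_W * n:6.0f} W -> {amps:.2f} A")
```

Under these assumptions the full 2 kW load corresponds to about 8.7 A, which sits inside the 0 to 10 A range quoted above.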

Fig. 2 Experimental setup


Fig. 3 Ferrite core for energy harvesting with copper coil turns (0.255 mm diameter)

4 Experimental Results

Several experiments were performed on the J-type (μr = 5000) and T-type (μr = 3000) ferrite cores with the number of secondary turns varied from 1000 to 2500. The results are plotted in Figs. 4 and 5 for the J-type and T-type cores, respectively.

[Plot: measured induced voltage (V) versus primary current (A), with curves for 1000, 1500, 2000, and 2500 secondary turns]

Fig. 4 Measured induced voltages (V) versus primary currents (IP) with number of turns (Ns) of the secondary as a parameter for J-type ferrite core

[Plot: measured induced voltage (V) versus primary current (A), with curves for 1000, 1500, 2000, and 2500 secondary turns]

Fig. 5 Measured induced voltages (V) versus primary currents (IP) with number of turns (Ns) of the secondary as a parameter for T-type ferrite core

[Plot: measured induced voltage (V) versus primary current (A) for the T-type and J-type cores]

Fig. 6 Best obtained results

In terms of induced voltage, the best results were obtained with 2500 secondary turns wound on the J-type ferrite core, followed by the T-type ferrite core, as shown in Fig. 6.


Table 1 Experimental results

Type of ferrite core | Relative permeability μr | Measured induced voltage Vrms (V) | Measured induced voltage Vpeak (V) | Secondary coil turns Ns | Primary current Ip (Arms)
J-type | 5000 | 8.3 | 11.7362 | 2500 | 9
T-type | 3000 | 8 | 11.312 | 2500 | 9

Table 1 lists the RMS primary current, the induced RMS voltage, and the peak voltage determined at the terminals of the secondary coil.
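The peak values in Table 1 are consistent with multiplying the RMS voltage by 1.414, the usual approximation of √2 for a sinusoidal waveform. The short check below assumes the secondary voltage is sinusoidal; the helper name `peak_from_rms` is illustrative.

```python
import math


def peak_from_rms(v_rms: float, factor: float = math.sqrt(2)) -> float:
    """Peak value of a sinusoid from its RMS value."""
    return v_rms * factor


# Using the rounded factor 1.414 gives the tabulated peak values.
j_peak = round(peak_from_rms(8.3, 1.414), 4)
t_peak = round(peak_from_rms(8.0, 1.414), 3)
print(j_peak, t_peak)
```

Using the unrounded factor √2 changes the J-type peak only in the third decimal place.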

5 Conclusion

The experiments showed that it was possible to generate an AC voltage of 8.3 Vrms with the J-type ferrite core and 8 Vrms with the T-type ferrite core at a primary current of 9 Arms with 2500 secondary turns. It is concluded that the relative permeability and the number of secondary turns both influence the induced voltage: the larger the number of turns, the higher the induced AC voltage in the secondary. A ferrite core is also convenient because it combines high magnetic permeability with considerably reduced eddy current losses.

References

1. Yuan, S., Huang, Y., Zhou, J., Xu, Q., Song, C., Thompson, P.: Magnetic field energy harvesting under overhead power lines. IEEE Trans. Power Electron. 30(11), 6191–6202 (2015)
2. Pai, P., Chen, L., Chowdhury, F.K., Tabib-Azar, M.: Non-intrusive electric power sensors for smart grid. Sensors, 1–4. IEEE (2012)
3. Chang, K.S., Kang, S.M., Park, K.J., Shin, S.H., Kim, H.S., Kim, H.S.: Electric field energy harvesting powered wireless sensors for smart grid. J. Electric. Eng. Technol. 7(1), 75–80 (2012)
4. Hu, J., Yang, J., Wang, Y., Wang, S.X., He, J.: A nonintrusive power supply design for self-powered sensor networks in the smart grid by scavenging energy from AC power line. IEEE Trans. Indust. Electron. 62(7), 4398–4407 (2015)
5. Khan, F.U.: Energy Harvesting from the stray electromagnetic field around the electrical power cable for smart grid applications. Sci. World J. (2016)
6. Tashiro, K., et al.: Energy harvesting of magnetic power-line noise. IEEE Trans. Magn. 47(10), 4441–4444 (2014)
7. Vo Misa N., Noras, M.A.: Energy harvesting from electromagnetic field surrounding a current carrying conductor. In: Proc. ESA Annual Meeting on Electrostatics (2013)
8. Wen, Y., Li, P., Yang, J., Peng, X.: Harvester Scavenging AC Magnetic Field Energy of Appliance Cords Using Piezomagnetic/Piezoelectric Me Transducers
9. Hosseinimehr, T., Tabesh, A.: Magnetic field energy harvesting from AC lines for powering wireless sensor nodes in smart grids. IEEE Trans. Industr. Electron. 63(8), 4947–4954 (2016)


A. A. Gaikwad and S. B. Kulkarni

10. Roscoe, N.M., Judd, M.D.: Harvesting energy from magnetic fields for power condition monitoring sensors. IEEE Sens. J. (2013)
11. Wang, W., Huang, X., Tan, L., Guo, J., Liu, H.: Optimization design of an inductive energy harvesting device for wireless power supply system overhead high-voltage power lines. Energies 9(4), 242 (2016)
12. Bhuiyan, R.H., Dougal, R.A., Ali, M.: A miniature energy harvesting device for wireless sensors in electric power system. IEEE Sens. J. 10(7), 1249–1258 (2010)
13. Oliveira de Moraes Júnior, T., Percy Molina Rodriguez, Y., Protásio de Souza, C., Cleudson Sousa Melo, E.: Experimental results on magnetic cores for magnetic induction-based energy harvesting. Instrum. Viewpoint 14, 65–65 (2013)

Design of Asset Tracking System Using Speech Recognition

Ankita Pendse, Arun Parakh and H. K. Verma

Abstract Speech recognition has developed significantly in the past decade. Advances in digital speech processing are enabling affordable applications across many kinds of human/machine communication. The technology has been applied in fields such as security, medicine, and media, and more recently in web browsing through Google Voice Search. This paper discusses a speech-activated assistive system for locating misplaced items. The design uses Bluetooth and the open-source Android platform; Bluetooth makes the system user-friendly, compact, wireless, and secure. Microcontroller-based asset tracking tags are developed and attached to the items that are most commonly misplaced. These items can then be tracked by giving speech commands to the Android application created for this purpose. A buzzer interfaced with the microcontroller in every tracking tag assists the tracking process. This work paves the way for wider use of speech recognition in developing innovative assistive technologies.

Keywords Speech recognition · Assistive technology · Bluetooth

A. Pendse (✉) ⋅ A. Parakh ⋅ H. K. Verma Shri G.S. Institute of Technology and Science, Indore, India e-mail: [email protected] A. Parakh e-mail: [email protected] H. K. Verma e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_36

1 Introduction

Keys, wallets, and purses have become an important part of our daily life. Because they are prone to getting lost or misplaced, a lot of time and energy goes into finding them. Saving this time and energy would make day-to-day life more efficient. Asset tracking systems offer a solution. This paper discusses a system that uses smartphones for asset tracking. Smartphones are now used by a large market, and this popularity means that users can easily adapt to any system built on smartphone technology. The asset tracking tags developed here work on Bluetooth technology, which makes them compatible with smartphones. To reduce manual effort by the user, speech recognition is used as the mode of operation [1, 2]. Since the introduction of the speech search feature, this technology has been in great demand among smartphone users. This natural mode of operation needs less training and can assist elderly and differently abled people.

Speech recognition has come a long way since the 1960s, when it was in its infancy. In 1967, Leonard E. Baum and J. A. Eagon developed the hidden Markov model. This probabilistic method was then applied to speech recognition and has become the basis of most speech recognition algorithms used today [3]. Speech-based operation is a major contributing factor to the system developed here. It sets an example of how speech-based automation can be done and encourages increased use of speech as a mode of command and operation in day-to-day activities. Building assistive technologies on popular technologies such as Bluetooth and Android ensures widespread reach and easy acceptability.

1.1 Organization of the Paper

The Introduction briefly discusses the need for an asset tracking system and the use of state-of-the-art technologies for this purpose. The books and research papers studied during development are discussed under Literature Survey. The working of the system as a whole and its integral processes are elaborated under Design. Implementation lists the software and hardware components used to create the asset tracking system. Experiments and Results describe and analyze the conditions under which the system was tested and the results obtained. Finally, conclusions are drawn about the performance of the system, and possible future improvements are discussed.

2 Literature Survey

A considerable amount of research has been carried out on this topic. Technologies such as asset tracking, speech recognition, and Android-based applications have been studied, and some of this work is described here.

Nusbaum et al. [1] discuss voice recognition and synthesis in terms of intelligibility and naturalness, and propose a new methodology for determining the naturalness of a synthetic voice independently of its intelligibility. Raman and Gries [2] propose an interactive computing system, ASTER, for audio formatting of electronic documents; it converts documents written in LaTeX to audio documents. The book by Roe and Wilpon [3] is a compilation of work on speech recognition technology by various eminent contributors. It discusses the need for, the initial developments in, and the contemporary form of speech recognition techniques.

Ahmad et al. [4] provide a comprehensive study of existing asset tracking technologies. The technologies are studied under aspects such as active and passive operation and indoor and outdoor performance, and are then differentiated on several criteria. Bisio et al. [5] discuss two processes: detection of the proximity of an asset and location finding for mobile objects. The paper emphasizes automatically switching the GPS unit ON and OFF as required, thus maximizing the battery lifetime of smartphones. The design proposed by Dofe et al. [6] aims at an asset tracking system for people who often misplace their items. The system uses an Android application as the wireless signal transmitter and Bluetooth low energy tags as receivers.

Tarneja et al. [7] discuss an Android-based application capable of recognizing speech commands. This interactive application is designed to operate all the basic Android smartphone functions using speech commands. Nafis and Safaet Hossain [8] discuss ways in which speech recognition can be achieved in real time; their system is designed to help people with hearing disabilities interact more efficiently. Rodman and Williams [9] propose a system that makes telephone calls to voters to play a short campaign message. The calls are programmed so that, if unanswered, they leave a message on the answering machine. DeHaemer et al. [10] compare two computer input modes, the keyboard and Continuous Voice Recognition (CVR), on criteria such as accuracy, task completion time, correction count, word count, and user confidence. Khilari and Bhope [11] provide a comprehensive study of speech recognition, which is presented as an innovative and convenient means of communication between humans and machines. Sutar Shekhar et al. [12] discuss the development of a system based on the Android operating system that implements voice recognition to create an intelligent assistive system. Frewat et al. [13] discuss Mel frequency cepstral coefficients for speech processing and propose techniques for achieving higher accuracy in voice control processes. Lin et al. [14] propose waterproof Bluetooth low energy tags for dementia patients, designed to be worn all the time.

3 Design

The main components of the proposed system are the speaker/user, the smartphone-based Android application, and the tracking tag [4]. The speaker initiates the tracking process by giving voice commands to the system. These speech commands act as passwords tied to specific Bluetooth IDs in the tracking tags [5, 6]. On receiving a speech command, the Android application first converts it into a text string [7]. Because the system relies on this text, it is critical that the speaker gives clear, precise, and proper commands [8, 9]. The text obtained is authenticated against the preset text string. On successful authentication, the smartphone Bluetooth is paired with that of the specific tracking tag. The user can then command the tracking tag buzzer to switch ON or OFF using speech commands (as shown in Fig. 1).

Fig. 1 Asset tracking process

The primary mode of operation here is speech recognition. The following advances have converged to make continuous speech recognition (CSR) possible:
• better speech modeling techniques, resulting in higher accuracy CSR,
• better recognition search strategies, reducing the time needed for high-accuracy recognition, and
• improved audio perception of microphones.

Speech recognition technology is pervasive and will strongly influence how humans communicate with machines and with other humans. Most algorithms used for this purpose today are based on the hidden Markov model (HMM), a probabilistic model that consists of states and transitions between the states. Unlike a Markov model, which has a single output per state, the outputs of an HMM state are probabilistic (as shown in Fig. 2). It is therefore also called a doubly probabilistic or doubly stochastic method. In speech recognition, an HMM can be used to compute the probability of generating a sequence of spectra. The process starts in state 1: given an input speech spectrum that has been quantized to one of a large number of templates, the probability of that spectrum is found from a lookup table. If a transition is then made from state 1 to state 2, the previous output probability is multiplied by the transition probability from state 1 to state 2. A new spectrum is computed over the next frame of speech and quantized, and the corresponding output probability is determined from the output probability distribution of state 2.

Fig. 2 Three-state hidden Markov model [3]

That probability is multiplied by the previous product, and the process is continued until the model is exited. The result of multiplying the sequence of output and transition probabilities gives the total probability that the input spectral sequence was “generated” by that HMM using a specific sequence of states. For every sequence of states, a different probability is obtained. For recognition, the probability computation is performed for all possible phoneme models and all possible state sequences. The sequence that results in the highest probability is declared to be the recognized sequence of phonemes [3].
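The probability computation described above can be sketched in a few lines of Python. This is only a minimal illustration: the three-state left-to-right model, the transition and output tables, and the three-template alphabet ("a", "b", "c") are hypothetical values chosen for the example and do not come from the paper. The sketch scores every state sequence exhaustively, which is feasible only for tiny models; practical recognizers typically use dynamic programming such as the Viterbi algorithm to reach the same best sequence.

```python
from itertools import product

# Hypothetical 3-state left-to-right HMM; all numbers are illustrative only.
TRANS = {  # TRANS[i][j]: probability of moving from state i to state j
    1: {1: 0.6, 2: 0.4},
    2: {2: 0.7, 3: 0.3},
    3: {3: 1.0},
}
EMIT = {  # EMIT[state][template]: output probability lookup table
    1: {"a": 0.8, "b": 0.1, "c": 0.1},
    2: {"a": 0.1, "b": 0.8, "c": 0.1},
    3: {"a": 0.1, "b": 0.1, "c": 0.8},
}


def path_probability(states, observations):
    """Multiply output and transition probabilities along one state sequence."""
    prob = EMIT[states[0]][observations[0]]
    for prev, cur, obs in zip(states, states[1:], observations[1:]):
        prob *= TRANS[prev].get(cur, 0.0) * EMIT[cur][obs]
    return prob


def best_path(observations, start_state=1):
    """Score every state sequence starting in start_state and keep the best."""
    best = (0.0, None)
    for tail in product(TRANS, repeat=len(observations) - 1):
        states = (start_state,) + tail
        prob = path_probability(states, observations)
        if prob > best[0]:
            best = (prob, states)
    return best
```

Calling `best_path(["a", "b", "c"])` returns the highest-scoring state sequence together with its probability, which mirrors the step in which the sequence with the highest probability is declared the recognized one.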

4 Implementation

The technologies combined to develop this system are discussed here.

4.1 Hardware Tools Used

Each tracking tag consists of a Bluetooth module HC-05, an Arduino Pro Mini, a piezo buzzer, and a battery. Bluetooth is a wireless technology that uses radio waves to let devices exchange data over a short range. The Bluetooth module HC-05 (shown in Fig. 3) is a Bluetooth SPP (Serial Port Protocol) module designed for wireless serial connection setup. It has a footprint of 12.7 mm × 27 mm and provides an operational range of over 10 m.

Fig. 3 Bluetooth module HC-05

The Arduino Pro Mini is a microcontroller board based on the ATmega328 (shown in Fig. 4). The board was developed for applications and installations with space constraints. Here, the Arduino microcontroller is programmed to compare the input text strings with predefined ones and, accordingly, to set the output pins connected to the piezo buzzer high or low.

Fig. 4 Arduino Pro Mini
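The comparison logic performed on the tag can be illustrated as follows. The sketch is written in Python for readability and is not the Arduino firmware itself; the command strings `"keys on"` and `"keys off"` and the whitespace and case normalisation are assumptions made for the example, since the paper does not list the actual strings or any preprocessing.

```python
# Hypothetical command strings; the paper does not list the actual ones.
BUZZER_ON = "keys on"
BUZZER_OFF = "keys off"


def normalise(text):
    """Lower-case the text and collapse whitespace before comparison (assumed step)."""
    return " ".join(text.lower().split())


def next_buzzer_state(received, current_state):
    """Return the buzzer pin level (True = high) after one received string."""
    command = normalise(received)
    if command == BUZZER_ON:
        return True
    if command == BUZZER_OFF:
        return False
    return current_state  # unrecognised text leaves the pin unchanged
```

Leaving the pin unchanged for unrecognised text is one possible design choice; it means a string intended for a different tag does not silence or trigger this tag's buzzer.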

4.2 Software Tools Used

Arduino IDE (Integrated Development Environment) is cross-platform software written in the Java programming language. It provides a simple single-click mechanism to compile and upload programs to the Arduino microcontroller, and it supports the C and C++ programming languages.

App Inventor is an open-source web application originally developed by Google. It uses a graphical interface for coding, similar to the Scratch user interface, in which users create software by dragging and dropping visual objects (blocks of code). Amateur programmers can therefore create software applications for the Android OS using App Inventor.

Google Cloud Speech API is a cloud-based API that enables developers to convert audio to text by applying neural network models. The API recognizes over 80 languages and variants to support a global user base. It applies advanced deep learning neural network algorithms to achieve high accuracy, and its accuracy improves over time as Google improves its internal speech recognition technology.

Speech commands given by the user to the Android application create vibrations in the air. These vibrations are analog in nature. The sound is digitized by sampling at frequent intervals, and the digitized wave is filtered to remove noise. The filtered wave is then divided into frequency bands [10]. Speech input patterns are not always uniform, so the input speech waveform must be regulated in terms of speed and volume to make it easier to match against the sample sound templates in the system database. The regulated signal is further divided into small segments: vowels are segmented to a 100th of a second and consonants to a 1000th of a second. These segments are matched with the sample templates to identify the phonemes, which are then put together to create the text string [11–13].
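Two of the preprocessing steps described above, volume regulation and segmentation, can be sketched as follows. This is a simplified illustration under stated assumptions: the signal is a plain list of samples, volume is regulated by peak scaling, and segments have a fixed duration. The function names are hypothetical, and the actual processing inside the speech API is not described in the paper.

```python
def normalise_volume(samples, peak=1.0):
    """Scale samples so that the largest absolute value equals peak."""
    largest = max((abs(s) for s in samples), default=0.0)
    if largest == 0.0:
        return list(samples)  # silent or empty input is returned unchanged
    return [s * peak / largest for s in samples]


def split_segments(samples, sample_rate, segment_seconds):
    """Cut the signal into consecutive segments of the given duration."""
    size = max(1, round(sample_rate * segment_seconds))
    return [samples[i:i + size] for i in range(0, len(samples), size)]
```

For example, with a sample rate of 1000 samples per second and `segment_seconds=0.01` (a 100th of a second), each segment holds 10 samples, and the last segment may be shorter.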

5 Experiments

The performance of the system was observed for several parameters. Google Speech API is used for speech recognition, so the overall performance of the system depends on how efficiently and accurately it recognizes speech and converts it into text strings. The system was tested for range of operation (distance between tracking tag and smartphone) and voice command perception range (distance between speaker and smartphone). The system is powered by a 9 V battery, with a piezo buzzer as the load. The test results, shown in Fig. 5, establish that the range of operation of the system is a radius of 10 m around the asset tracking tag. The range of perception of voice commands by the smartphone is 3 m.

Fig. 5 Test results for range of operation

6 Results

The test results indicate that the system has an operation range of 10 m, which is limited by the range of the Bluetooth module. Trial 1 and Trial 3, unlike Trial 2 and Trial 4, show a system efficiency below 100% within the operation range. This drop in efficiency is due to failures in speech-to-text conversion, which can result from causes such as low internet speed or unclear speech commands. The use of a Bluetooth module in the asset tracking tags ensures successful communication with the smartphone within the range of operation, irrespective of physical obstacles between them. A voice command perception range of 3 m using Google Speech API on the smartphone ensures the feasibility of the tracking process.

7 Conclusion and Future Scope

Speech recognition technology is of particular interest because it offers a direct mode of communication between humans and machines. In addition, as the number of important assets in daily life that are prone to being misplaced increases, so does the need for innovative tracking methods. The system developed here shows how speech recognition, together with Android and Bluetooth technology, paves the way in this direction by providing an easy and more accessible way of communication and operation. Future development of this system aims to enhance the quality of speech recognition by including various languages and accents. The range of operation can also be increased by incorporating technologies such as IoT (Internet of Things) and GPS into the system [14].

References
1. Nusbaum, H.C., Francis, A.L., Henly, A.S.: Measuring the naturalness of synthetic speech. Int. J. Speech Technol. 2(1), 7–19 (1997). https://doi.org/10.1007/BF02277176
2. Raman, T.V., Gries, D.: Audio formatting making spoken text and math comprehensible. Int. J. Speech Technol. 1(1), 21–31 (1995). https://doi.org/10.1007/BF02277177
3. Roe, D.B., Wilpon, J.G.: Voice Communication between Humans and Machines. ISBN 978-0-309-04988-7. https://doi.org/10.17226/2308
4. Ahmad, S., Lu, R., Ziaullah, M.: Bluetooth an optimal solution for personal asset tracking: a comparison of bluetooth, RFID and miscellaneous anti-lost traking technologies. Int. J. u and e Serv. Sci. Technol. 8(3) (2015). https://doi.org/10.14257/ijunesst.2015.8.3.17
5. Bisio, I., Sciarrone, A., Zappatore, S.: Asset tracking architecture with bluetooth low energy tags and ad hoc smartphone applications. In: 2015 European Conference on Networks and Communications, June 2015. https://doi.org/10.1109/eucnc.2015.7194118
6. Dofe, R.S., Jadhav, S.K., Pingle, B.: Smart object finder by using android and bluetooth low energy. Int. J. Innovat. Sci. Res. 13(1) (2015)
7. Tarneja, R., Khan, H., Agrawal, R.A., Patil, D.D.: Voice commands control recognition android app. Int. J. Eng. Res. Gen. Sci. 3(2), 145–150 (2015). ISSN: 2091-2730
8. Nafis, N.A., Safaet Hossain, M.D.: Speech to text conversion in real-time. Int. J. Innovat. Sci. Res. 17(2), 271–277 (2015). ISSN: 2351-8014
9. Rodman, R.D., Williams, O.: Voter response to computerized campaigning. Int. J. Speech Technol. 1(1), 33–40 (1995). https://doi.org/10.1007/bf02277178
10. DeHaemer, M.J., Wright, G., Richards, M.H., Dillon, T.W.: Keep talking performance effectiveness with continuous voice recognition for spreadsheet users. Int. J. Speech Technol. 1(1), 41–48 (1995). https://doi.org/10.1007/BF02277179
11. Khilari, P., Bhope, V.P.: Implementation of speech to text conversion. Int. J. Innovat. Res. Sci. Eng. Technol. 4(7), 3067–3072 (2015). https://doi.org/10.15680/IJIRSET.2015.0407167
12. Sutar Shekhar, S., Pophali Sameer, S., Kamad Neha, S., Deokate Laxman, J.: Intelligent voice assistant using android platform. Int. J. Advanc. Res. Comput. Sci. Manag. Stud. 3(3), 55–60 (2015). ISSN: 2231-2803
13. Frewat, G., Baroud, C., Sammour, R., Kaseem, A., Hamad, M.: Android voice recognition application with multi speaker function. In: Mediterranean Electrotechnical Conference, April 2016. https://doi.org/10.1109/melcon.2016.7495395
14. Lin, Y.-J., Chen, H.-S., Su, M.-J.: A cloud based bluetooth low energy tracking system for dementia patients. In: Eighth International Conference on Mobile Computing and Ubiquitous Networking (ICMU), January 2015. https://doi.org/10.1109/icmu.2015.7061043

Cybercrime: To Detect Suspected User’s Chat Using Text Mining

Khan Sameera and Pinki Vishwakarma

Abstract In this fast-growing era of modern technology, communicating with one another has become an important need. Media such as social networks are used to communicate business and organizational details, among other things. As technology advances, crimes can be committed in newer ways. Many criminal activities use Social Networking Sites (SNS) to extract information. Such activities include spamming, cyber prediction, cyber threatening, killing, blackmailing, and phishing. Suspicious messages can be transferred via different SNS, mobile phones, or other sources. Most crime-related information on the web is in text format, which makes tracing these criminal activities a tedious task. Analyzing the “crime process” involves detecting and exploring the crime and identifying the criminals. In criminology, text mining techniques are used to describe the complex relationships within crime datasets. Text mining is an effective way to detect and predict criminal activities: it is the process of extracting interesting information, knowledge, or patterns from unstructured text drawn from different sources. The proposed approach uses a text mining algorithm to continuously check for suspicious words, even when they appear as short forms or phrases, and then builds the social graph of the users. In this framework, the n-gram technique is used with the SCHITS (Self-Customized Hyperlink-Induced Topic Search) algorithm to find suspected messages, users, sessions, etc. Information is extracted using social graph-based text mining, and the suspected user’s profile is also found.

Keywords Social network analysis (SNA) investigation · Social graph · Cybercrime

K. Sameera (✉) ⋅ P. Vishwakarma Shah & Anchor Kutchhi Engineering College, Mumbai, India e-mail: [email protected] P. Vishwakarma e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_37


1 Introduction

In this busy modern world, people have become fond of the Internet and its technologies. Over the past few years, the Internet has been used extensively for real-life applications such as sending e-mails, distance learning, online searching, and chatting in collaborative environments. Various chat tools and chat rooms are now available on the Internet, and their emergence has enriched communication between Internet users all over the world. With the growth of Internet technology, both legal and illegal activities have been increasing, and cybercrime on social media is increasing day by day. One of the greatest boons of Internet technology is its ability to connect individuals around the globe through social networking sites, but analyzing these sites for criminal activities is difficult. Social network analysis deals with analyzing the behavior of groups, organizations, or individuals and determining their behavioral patterns. It has become a major tool for counterterrorism applications.

A Social Networking Site (SNS) lets individuals construct a profile, which is either public or private. An SNS maintains a list of users with whom one can communicate, share connections, and share information. SNS users communicate through messaging, blogs, and chat, and by sharing music files and video. SNS have become very important in human life and are now the main communication medium between users and organizations after verbal and nonverbal communication. An important problem that has emerged with the breakthrough of Internet technology is its misuse for the abduction of young teenagers and children via chat rooms or e-mails, which are difficult to monitor. Many searching and network mining tools exist for structured documents, but they fail to analyze unstructured documents such as chat logs or instant messages. Monitoring such chat rooms will help in the detection of crime and, to some extent, in crime prevention.

On social media, many users carry out suspicious activities in the form of text, images, audio, etc. and exchange them online with other users. A vast amount of information is stored in text form, collected from sources such as newspapers, blogs, social networking sites, and e-mail. Because this text is unstructured, the required information cannot be extracted directly. Data mining techniques used to extract information from text documents include clustering, classification, neural networks, decision trees, and visualization. Unlike data mining, which handles only structured data, text mining can work with unstructured or semi-structured datasets such as e-mails, full-text documents, and HTML files, where the information is in natural-language text. In this paper, we focus on text mining. With text mining techniques, criminal activities can be detected and predicted by extracting interesting information from unstructured text from different sources [1, 2]. According to new statistics, more than 16,000 criminal activities on social media, including Facebook and Twitter, have been reported over the last year. It has become very important to monitor online forums in order to detect criminal or, more broadly, terrorist activities. There is not much research on monitoring chat room conversations for potentially harmful activities. Current monitoring techniques are basically manual, which makes them costly, tedious, difficult, and time-consuming. Our project is therefore a good option for detecting such activities: it uses a text mining algorithm to continuously check for suspicious words, even when they appear as code words, short forms, or phrases, and then builds the social graph of the users [3]. In the social graph, a weighted graph is formed that maps each user and the user's group on the network [8].

2 Related Work

Cybercrime on social media is increasing day by day. One of the greatest boons of Internet technology is its ability to connect individuals around the globe through social networking sites, but analyzing these sites for criminal activities is difficult. Social network analysis deals with analyzing the behavior of organizations, groups, and individuals and determining their behavioral patterns, and it has become a major tool for counterterrorism applications. Text mining is an effective way to detect and predict criminal activities. Our project is therefore a good option for detecting such activities, since it uses a text mining algorithm to continuously check for suspicious words even when they appear as code words or short forms [4]. Since many social networking sites make information publicly available, many criminal cases can be solved by analyzing this information.

An emoticon is a pictorial representation of a facial expression built from characters, usually punctuation marks, numbers, and letters, to express a person’s feelings or mood, or as a time-saving shorthand. Emoticons are generally used for nonverbal communication. The most commonly used emoticons are “:-(”, “;-)”, and “:-)”, which represent a frown, a wink, and a smile, respectively. Emoticons are mostly used on social networking sites, blogs, etc., so text mining experts and researchers need to devise generic and scalable techniques for analyzing such noisy, unstructured textual data.

3 Proposed System

Suspicious messages sent through SNS or instant messaging are difficult to trace, which hinders cybersecurity. The existing general forensic search tools have some shortcomings. We propose a framework that identifies and predicts suspicious chat logs on SNS. The main objective is to develop a system that detects and monitors suspicious activity over a network and finds the suspected user’s profile [5]. The system architecture of our project is shown in Fig. 1. It consists of data extraction and normalization, suspicious word detection, construction of the social network graph, identification of the user group, and user profile detection. In the first step, the information components of the chat logs are extracted and normalized to remove noise and errors and to neutralize slang. In the second step, the n-gram technique is used to extract the vocabulary of the chat. The HITS algorithm is then applied to find the suspected user. Finally, a social network graph is created to find the user’s profile, user group, etc.

Fig. 1 Proposed social network graph based on text mining framework for chat log investigations

3.1 Data Extraction and Normalization

In data extraction and normalization, the raw chat log is converted into machine-readable format. It consists of three subtasks described below.

3.1.1 Information Component Extraction

In the first step, messages and chats are extracted from the web or any SNS. The gathered information can later be used by security agencies, the police, or crime branch investigators. We extracted a large sample dataset of messages and chats on which to perform text mining and data analysis. Chat messages come in various formats, depending on the application platform and its architecture. Logs can be HTML files, XML files, etc., each of which contains the discussion and information of one or more chat sessions.

3.1.2 Noise and Slang Normalization

Almost every text file contains dirty data, so a method is first needed to clean these records. The most common forms of noise in chat messages are stop words, unnecessarily repeated punctuation marks or letters (for example, “helloooo” and “tcccc”), and other information that is not required in the later checking process. Regular expressions are used to normalize such repeated characters. Slang expressions, which have no place in a standard dictionary, are also common in chat messages. We used a word net that contains slang expressions from different sources together with their equivalent standard words.
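The normalization step can be illustrated with a short Python sketch. It assumes that a run of three or more identical characters is noise and that slang is resolved through a lookup table; the `SLANG` entries are hypothetical examples, since the lexicon actually used is not listed in the paper.

```python
import re

# Hypothetical slang table; the paper's own lexicon is not given.
SLANG = {"tc": "take care", "gr8": "great", "u": "you"}


def collapse_repeats(word):
    """Reduce any run of three or more identical characters to one."""
    return re.sub(r"(.)\1{2,}", r"\1", word)


def normalise_message(message):
    """Lower-case, drop punctuation, collapse repeats, and expand slang."""
    words = re.findall(r"[a-z0-9']+", message.lower())
    cleaned = [collapse_repeats(w) for w in words]
    return " ".join(SLANG.get(w, w) for w in cleaned)
```

With these assumptions, `normalise_message("Helloooo!!! tcccc")` yields `"hello take care"`. The threshold of three keeps legitimate double letters such as the "ll" in "hello" intact.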

3.2 Vocabulary Extraction

We define the vocabulary as the set of key terms exchanged among the participating users over the complete lifetime of their communication. The vocabulary extraction process therefore aims to identify the vocabulary of the community involved in the chat discussions. It comprises four subtasks: n-gram extraction, stop-word removal, stemming, and case folding.
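The four subtasks can be combined as in the following minimal Python sketch. The order of the steps, the small stop-word list, and the crude suffix-stripping stemmer are all assumptions made for illustration; a real system would more likely use an established stemmer and a full stop-word list.

```python
STOP_WORDS = {"the", "a", "an", "is", "at", "to", "and", "of"}  # tiny sample list


def stem(word):
    """Very crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def ngrams(tokens, n):
    """Return all contiguous n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_vocabulary(message, max_n=2):
    """Case-fold, remove stop words, stem, then collect 1..max_n grams."""
    tokens = [stem(t) for t in message.lower().split() if t not in STOP_WORDS]
    vocab = set()
    for n in range(1, max_n + 1):
        vocab.update(ngrams(tokens, n))
    return vocab
```

For the message "Meet at the Bridge tonight", this produces the unigrams `("meet",)`, `("bridge",)`, `("tonight",)` and the bigrams `("meet", "bridge")` and `("bridge", "tonight")`.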

3.3 Key Information Extraction

After normalization, the system extracts the key information from the chat log, identifies which user used the suspicious words, and finds the chat sessions and group of that suspected user. To perform these functions, a self-customized HITS algorithm [5] is used to compute hub and authority scores. The pseudocode for the Self-Customized Hyperlink-Induced Topic Search (SCHITS) algorithm is shown in Fig. 2.

Fig. 2 The pseudocode for Self-Customized Hyperlink-Induced Topic Search (SCHITS) algorithm

Algorithm 1. SCHITS
Input: Web graph G = (V, E), defined for nodes p, q in V
Output: auth, the set of authority pages; hub, the set of hub pages
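Since only the input and output of Algorithm 1 survive here, the following Python sketch shows the standard HITS iteration that SCHITS builds on, not the self-customized variant itself. Treating users as hubs and suspicious words as authorities, with an edge from a user to each word that user sent, is an assumption made for the example.

```python
import math


def hits(edges, iterations=50):
    """Standard HITS on a directed graph given as (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the nodes pointing at a node.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        # Hub score: sum of authority scores of the nodes a node points at.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        hub = new_hub
        # Normalise both score vectors to unit Euclidean length.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth
```

In this reading, a word used by many users receives a high authority score, and a user who sends many high-authority words receives a high hub score.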

3.4

Social Network Graph Construction

A chat session contains a group of users communicating or interacting with each other, and such interactions establish a kind of relationship between them. A social network graph of the users is constructed as a weighted graph from their interaction patterns in the group chat sessions. After the investigation, the system shows the entire group and the other users connected with the suspected user. It also shows the hub and authority scores of that user. Suspected users are indicated by red-filled circles. The system calculates a weight that reflects how many times the suspected user chats with other users.

Cybercrime: To Detect Suspected User’s Chat Using Text Mining

3.5

User Group Identification

Using a clustering method, we can identify user groups. Here, clustering refers to finding the users who use the same key information components. The proposed system builds a social network graph to show the suspected user’s group.

3.6

User Profile Identification

In this block, we find suspicious user profiles by using text mining analysis. After extracting the suspected words, the system first identifies information about the user, such as the profile, and then continuously analyzes the user’s activities or behavior.

4 Results and Discussion

The proposed system achieves the following objectives.

4.1

Social Network Creation

Here, the system allows users to create an account and log in. It then connects each user with the people to whom they are socially connected. Users can find friends on the network, chat with them, block other users, etc. Thus, the system first creates a social network that connects all the users (Fig. 3).

Fig. 3 A sample screenshot of social network creation

Fig. 4 Chat log investigation report

4.2

Chat Log Investigation

All user activities are reported to the admin. With the help of a sentiment word analyzer, the system can find the suspected user and allow the admin to investigate their chats. All information related to that user, such as profile, IP address, e-mail ID, date and time, etc., is shown to the admin (Fig. 4). The system also calculates the hub and authority scores of suspicious words. In this dataset, two users, [email protected] and [email protected], communicated with each other, as shown in Table 1.

4.3

Social Network Graph

After the investigation, the system shows the entire group and the other users connected with the suspected user. It also shows the hub and authority scores of that user. Suspected users are indicated by red-filled circles. The system calculates a weight that reflects how many times the suspected user chats with other users.

Table 1 Hub and authority score

| ID | Node | Hub score | Authority score |
|----|------|-----------|-----------------|
| 1 | [email protected] | 0.70710 | 0.70710 |
| 2 | [email protected] | 0.70710 | 0.70710 |

Fig. 5 Screenshot of social network graph

The weights of the graph can be calculated by the following formula:

$$w_{ij} = \frac{\deg(v_i, v_j) \times \left(\deg(v_i) + \deg(v_j)\right)}{2 \times \deg(v_i) \times \deg(v_j)} \times \frac{\Phi_{v_i} \cdot \Phi_{v_j}}{|\Phi_{v_i}|\,|\Phi_{v_j}|} \qquad (1)$$

where
deg(v_i, v_j): the degree of the sessions in which the two users interacted with each other;
deg(v_i) and deg(v_j): the number of nodes corresponding to the users v_i and v_j, respectively, in the social network graph (Fig. 5).
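As a rough illustration of Eq. (1), the sketch below reads the formula as the product of a structural term, deg(v_i, v_j) × (deg(v_i) + deg(v_j)) / (2 × deg(v_i) × deg(v_j)), and a cosine-style term in Φ. The text does not define Φ, so treating it as a numeric vector per user (for example, a term-frequency vector) is purely an assumption, as are the function names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def edge_weight(deg_ij, deg_i, deg_j, phi_i, phi_j):
    """Edge weight w_ij under the assumed reading of Eq. (1).

    deg_ij: number of sessions shared by the two users
    deg_i, deg_j: degrees of the two users (must be non-zero)
    phi_i, phi_j: ASSUMED per-user vectors standing in for the undefined Phi
    """
    structural = deg_ij * (deg_i + deg_j) / (2 * deg_i * deg_j)
    return structural * cosine(phi_i, phi_j)


if __name__ == "__main__":
    print(edge_weight(1, 2, 4, [1.0, 0.0], [1.0, 0.0]))
```

Under this reading, two users with identical Φ vectors and equal degrees who share every session get a weight of 1, while users with orthogonal Φ vectors get a weight of 0.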

5 Conclusion

As social media plays a vital role in one’s life, there is also a chance of criminal activities being carried out online. Text mining techniques are an effective way to detect and predict such activities. The proposed system uses a text mining algorithm to continuously check for suspicious words, even when they appear as code words or short forms. Cybercrime: To Detect Suspected User’s Chat is an innovative approach. The proposed framework uses the “Self-Customized Hyperlink-Induced Topic Search” (SCHITS) algorithm to identify key users, key terms, and sessions, and applies social network graph-based text mining to find the suspected user, together with the suspected user’s profile, on a network.

References

1. Nasa, D.: Text mining techniques: a survey. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2(4) (2012)
2. Mahesh, T.R., Suresh, M.B., Vinayababu, M.: Text mining: advancements, challenges and future directions. Int. J. Rev. Comput. (2009–2010), IJRIC & LLS
3. Adamic, L., Zhang, J., Bakshy, E., Ackerman, M.: Knowledge sharing and yahoo answers: everyone knows something. In: Proceedings of the 17th International Conference on World Wide Web. ACM (2008)
4. Lew, A., Mauch, H.: Introduction to Data Mining Principles, SCI. Springer (2006) (Text mining)
5. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999)
6. Abulaish, M., Anwar, T.: A web content mining approach for tag cloud generation. In: Proceedings of the 13th International Conference on IIWAS, pp. 52–59 (2011)
7. Anwar, T., Abulaish, M.: Web content mining for alias identification: a first step towards suspect tracking. In: Proceedings of the IEEE International Conference on ISI, pp. 195–197 (2011)

Techniques to Extract Topical Experts in Twitter: A Survey

Kuljeet Kaur and Divya Bansal

Abstract An Online Social Network (OSN) such as Facebook, Twitter, or Google+ socially connects users around the world. Through these social media platforms, users generally form a virtual network that is based on mutual trust without any personal interaction. As more and more users join OSNs, identifying topical experts becomes a necessity to ensure the relevance and credibility of the content provided by various users. In this paper, we review the existing techniques for the extraction of topical expertise in Twitter. We provide an overview of the various attributes, datasets, and methods adopted for topical expertise detection and extraction.

Keywords Topical experts · OSN · Twitter · Security

K. Kaur (✉) ⋅ D. Bansal
PEC University of Technology, Chandigarh, India
e-mail: [email protected]
D. Bansal
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_38

1 Introduction

Various OSNs allow real-time information to be exchanged with a wide audience in a fraction of a second. On microblogging sites like Twitter, a post may be viewed as a micropost (e.g., a 140-character tweet). Microblogs also help users get their microposts to the audience in microseconds. As with sensors, where real-time data arrives every second, every micropost has a short life span because numerous posts arrive from varied locations each second [1]. As the number of OSN users grows exponentially, a credible search system is a must for finding relevant users. In other words, the question is how a user can rely on and trust the content he comes across on Twitter. Credible, relevant, and reliable authors or experts on a specific topic are termed topical experts. Recognizing this, Twitter launched the WTF (Who to Follow) service in 2010 to extract experts related to a topic. However, it is found that WTF [2] sometimes returns users whose Twitter profile description (called “bio”, a 160-character personal description) contains the query term but who are not actually related to the topic. The traditional approaches identify topical experts using attributes like tweets, profile bio, lists, number of followers, etc. Our work consolidates the existing approaches and highlights future directions. The paper is organized as follows: the next section describes the methodology used to carry out this review. In Sect. 3, we explain the motivation behind the review. In Sect. 4, we highlight the attributes used in the various approaches, followed by Sect. 5, which presents a comparative analysis of the work done. Section 6 presents the discussion, and in Sect. 7, we discuss research directions. Finally, Sect. 8 concludes the paper.

2 Review Methodology

The papers included in this review were selected from major databases such as IEEE Xplore, ACM Digital Library, and Google Scholar. The databases returned around 50 papers, out of which those published after 2009 were shortlisted. The titles and abstracts of the shortlisted papers were then read. Finally, 13 papers were selected whose titles and abstracts were closely related to topical expert extraction in Twitter. Compiling the various approaches in the form of a review is of prime importance to new researchers who wish to extend work in this domain.

3 Motivation Behind Review

As the number of Twitter users is growing exponentially, identifying influential users is of utmost importance. For any user, the question of whose content to read in order to get updated and reliable information comes down to the identification of topical expertise. Mining such experts for a topic, in order to keep in close contact with them or to follow them, is therefore one of the most important domains explored by various researchers. This paper lists the existing approaches, which will help researchers review the work done in this area.

4 Attributes Used in Various Studies

The previous studies show that different authors have used different combinations of attributes to find authoritative users on a topic. The features used for categorizing and distilling topical experts in these studies are listed below:

• Tweet [3]: a 140-character message, which can be textual or can contain links to multimedia content,
• Retweet [3]: a forwarded tweet,
• Mentions [3]: replies to messages with @Username,
• Hashtags [3]: #topic or #keyword presents all tweets related to a topic or keyword,
• #Followers [3]: the number of users who receive your tweets in their timelines,
• #Followings [3]: also called friends, whose tweets you receive on your timeline,
• Bio [3]: a 160-character personal description, and
• Lists [3]: a 140-character name and an optional description, used for managing followers.

5 Comparative Analysis of Various Approaches

OSNs are the fastest way of disseminating real-time information. Twitter acts as both a microblogging and a social networking site. Following a user on Twitter does not require any access right from that person; thus, the circle of virtual friends grows to form the social network. Table 1 presents a detailed analysis of the previous studies on topical expert extraction, along with their contributions. As information grows exponentially with the number of users, a credible search system is needed to find relevant users. By relevant users, we mean the experts on a topic as well as the seekers, who help in spreading messages to a larger, anonymous audience.

Weng et al. [4] found influential twitterers on a specific topic. The authors noted that Twitter itself attributes more influence to users with a larger number of followers. However, focusing only on “following” relationships is not reliable, because users often follow a friend back either out of courtesy or because of common interests (homophily). Thus, to find the influential twitterers, the TwitterRank approach is applied, which takes both link structure and topic sensitivity into account. To analyze the link structure in the collected data, all friends and followers of each user were considered along with their tweets. LDA [5] was used to mine the topics of a user from tweets for topic sensitivity, followed by a ranking of the users’ influence. The results showed that active twitterers are not necessarily influential twitterers. They either share some followers or followings of one another. The experiments showed the highest similarity between this algorithm and TSPR [6], due to topic sensitivity.

Pal and Counts [7] identified topical authorities on the basis of tweets, mentions, and graphs, using a Gaussian mixture model clustering method. Tweets related to the oil spill, iPhone, and world cup were mined using sample substrings. A self-similarity score showed the level of expertise of a user in a specific topic. The clustering algorithm [8] was applied on 17 features, followed by a ranking of authors in the three selected categories. The survey conducted showed that users find the tweets of the top authors identified by this approach useful and interesting. It also showed that users trust either quality content or renowned authors presented to them. The two most important features identified are topical signal (the extent of an author’s involvement with a topic) and mention impact (@username while replying or referring to other users).

Table 1 Comparative analysis of existing approaches

| Author | Attributes used | Methodology | Dataset | Results |
|--------|-----------------|-------------|---------|---------|
| Weng et al. [4] | Tweets, following relation | LDA approach, graph-based | Top 1000 Singapore-based twitterers from Twitterholic.com | Topic-sensitive influential twitterers tracked with improved accuracy |
| Pal and Counts [7] | Tweets | 17 features used, clustering-based | 5 days’ tweets from firehose dataset | Tweets collected for the selected three categories seemed authoritative and informative to the users |
| Bhattacharya et al. [9] | Tweets, lists | Semantic approach based on lists, profile, and tweets | 38.4 M Twitter users’ profiles | Identified topical groups on niche topics, and missing members if any |
| Purohit et al. [11] | Tweets, profile metadata | Three methods proposed, modified tf-idf approach | Twitter profiles, Wikipedia, personal websites, and US labor statistics | 92.8% of summaries rated informative in the best case and 70% in the worst case |
| Wagner et al. [12] | Lists, bio, tweets, retweets | LDA approach | WeFollow directory and Twitter profiles | Best results with bio and lists |
| Pohl et al. [13] | Tweets | Modified tf-idf and online clustering algorithm | 1943 tweets on Hurricane Sandy, 2012 | Online clustering algorithm with minimized clusters uncovered all sub-events |
| Canini et al. [14] | Tweets, link structure | LDA approach, tf-idf approach | WeFollow directory, Twitter profiles | Content and social status of experts affect trust of followers |
| Ghosh et al. [15] | Lists | Mining lists metadata, ranking experts | 54 M Twitter profiles | Better performance than the Twitter WTF service for more than 52% of the queries |
| Wu et al. [17] | Lists | Snowball sampling and ranking experts | Firehose dataset of 42 M users | Elite users are responsible for spreading the content to a larger audience |
| Sharma et al. [19] | Lists | Mining lists metadata and ranking experts | 54 M Twitter profiles | Cloud of attributes, describing a person’s interests, is generated |

Bhattacharya et al. [9] utilized lists to find topical groups (experts and seekers) and analyzed their characteristics. The study highlighted many differences between topical and bond-based groups in terms of size, member type, interests, etc. It found that community detection algorithms [10] cannot be applied because of the weak connectivity between experts and seekers. From the collected data, the experts on a topic were found first, followed by the seekers, and the two were then merged to form a topical group. The approach successfully discovered niche topical groups. It is noteworthy that the number of experts is directly proportional to the number of seekers. The approach resulted in a single connected component covering 90% of the experts, which shows good interconnectivity among experts.

Purohit et al. [11] proposed approaches to automatically generate informative summaries of users within a limited number of characters. Three approaches were used to generate the summaries, namely, occupation pattern-based, link triangulation-based, and user classification-based. In an evaluation by users that considered readability, specificity, and interestingness metrics, 92.8% of the summaries generated by Link Triangulation were considered informative and useful. For users who were less popular and active, meformer data (self-descriptive content written by the user himself) was used to generate the summary. Wikipedia pages were also considered as a source of informer data for generating summaries in the Link Triangulation method, which showed the highest favorability.

Wagner et al. [12] examined which user-related content, out of tweets, retweets, bio, and lists, makes a good topical expertise profile. Two experiments were conducted by choosing experts with a known topic of expertise from the WeFollow directory. The first experiment resulted in the worst expertise judgments when the participants were shown only content (tweets and retweets), and in the best when contextual information (bio and tweet) was shown. The second experiment, conducted to measure the similarity of the topics inferred from the four types of user-related data, found that lists performed best, revealing 77.67% of the exact topic of interest of expertise. The similarity of the topics shown by tweets and retweets is also noteworthy. Another contribution is the finding that the bio plays an important role in the inference of topics.

Pohl et al. [13] present the implications of using social media data for emergency management. The dynamic selection of features from the incoming data using an online clustering algorithm uncovered sub-events (effects of events or crises). Terms were extracted from the incoming data, and those with the highest frequency were given maximum importance and used for clustering. The evaluation on real-time data from Hurricane Sandy (2012) showed that online and offline clustering behave similarly, but that in terms of quality the online algorithm outperforms the offline one. Another noteworthy finding is the smaller number of clusters in the online clustering algorithm, because earlier sub-events are ignored.

Canini et al. [14] concentrate on finding which factors users trust more when judging the credibility of authors. The experiment showed that the more quantitative the content is, the more trust is earned. Thus, content and social structure affect credibility to a great extent. Based on these two factors, an algorithm is proposed to find topical experts and rank them automatically. A comparison between this algorithm and a professional expert-ranking algorithm shows results in favor of the proposed approach.

As lists depend on crowd wisdom, Ghosh et al. [15] proposed a topical expert search system that uses a single feature, Twitter Lists, for inferring the topical experts. The methodology includes the collection and mining of all public lists of 54 million users who joined Twitter before August 2009. Mining the metadata generated many topics; each user was ranked according to an algorithm [16], and each member was then associated with a topic according to his rank. The membership of a user in many lists, created by many users, adds certain topics to the category in which the user is an expert. Unlike previous studies in this context, which use either the user’s own information (bio and tweets) or the network graph to extract experts on a topic, this study relies only on the wisdom of crowds (Twitter Lists), which makes it unique. The analysis shows that Cognos [15] provides better results than the official WTF service for more than 52% of the queries. Another noteworthy result is that WTF relies more on organizational accounts, whereas Cognos follows personal accounts to get the information; this implies not relying only on the standard news agencies but giving equal importance to each Twitter user.

Wu et al. [17] contributed significantly by classifying users topically into elite and ordinary users, by relating the lifespan of content to the type of content, and by describing how information flows indirectly to a larger audience. Lists are used for finding elite users through snowball sampling [18]. The elite users mined for each of the four categories are found to be more active. Although elite users constitute only 0.05% of the total population, they account for 50% of the attention on Twitter. It is also found that textual content has a shorter lifespan than multimedia content. The two-step flow policy highlights the forwarding of elite users’ content to a wider audience, either via retweets (acknowledged content) or by reintroducing content (unacknowledged).

The study by Sharma et al. [19] is related to a previous study [20], which used a machine learning technique to find the semantic topic of a web page. Ramage et al. [21] used LDA [5] to analyze the content of tweets semantically. Kim et al. [22] applied the chi-square distribution to tweets to associate them topically. The study by Sharma et al. generated a cloud of attributes by mining the lists’ metadata and associating the mined topics with the members of each list. The inferred attributes include information from the bio, perceptions of users, and topics of expertise. To check accuracy and reliability, ground truths and human feedback were considered. The analysis showed 94% of the evaluations to be accurate.
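The list-based idea that recurs in the studies above (mining list metadata and attributing the mined topics to list members) can be sketched as follows. This is only a toy illustration under stated assumptions: the data layout, the stop-word list, and the functions `infer_topics` and `rank_experts` are hypothetical, and the ranking by raw list counts is a simplification rather than the algorithm used by Cognos [15] or by Sharma et al. [19].

```python
import re
from collections import Counter, defaultdict

STOP_WORDS = {"a", "and", "best", "in", "of", "people", "the"}  # illustrative subset


def infer_topics(lists):
    """Map each list member to a Counter of terms found in list names/descriptions.

    `lists` is a sequence of dicts with the hypothetical keys
    "name", "description", and "members".
    """
    topics = defaultdict(Counter)
    for lst in lists:
        text = (lst["name"] + " " + lst["description"]).lower()
        terms = set(re.findall(r"[a-z]+", text)) - STOP_WORDS
        for member in lst["members"]:
            topics[member].update(terms)  # each list votes once per term
    return topics


def rank_experts(topics, query):
    """Rank users by the number of lists that associate them with the query term."""
    scores = {user: counts[query] for user, counts in topics.items() if counts[query] > 0}
    return sorted(scores, key=lambda user: (-scores[user], user))


if __name__ == "__main__":
    sample_lists = [
        {"name": "Tech news", "description": "people in tech", "members": ["alice", "bob"]},
        {"name": "tech-gurus", "description": "", "members": ["alice"]},
        {"name": "Food", "description": "best food blogs", "members": ["carol"]},
    ]
    print(rank_experts(infer_topics(sample_lists), "tech"))
```

Because every list contributes at most one vote per term, a user's score reflects how many independent list curators associate that user with the topic, which is the crowd-wisdom signal the surveyed studies emphasize.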

6 Discussions

The studies discussed above rely more on self-provided information (bio) and “following” relationships. The generation of automatic summaries of Twitter users from tweets, bio, mentions, etc., is quite subjective in nature. The results of the above studies may vary if they are validated on a wider audience, with varied topics and a larger sample space. Moreover, outcomes based on the analysis of social media data may have far-reaching consequences if applied in the real world, for example to policy decisions in government, business, or any individual organization. The outcome, i.e., the intelligent information, may need to be tested on the basis of certain parameters that discount irrelevant or disinterested information. The degree of inaccuracy thus has to be analyzed for each and every analytic method used in mining OSN data intelligently. Only then will it be possible to utilize the true potential of OSNs for generating intelligence and situation learning.

7 Research Directions

A lot of work has already been done on utilizing Twitter data for various purposes. As OSNs cover data from ordinary users, spammers, and experts, useful data must be extracted in order to build intelligent recommendation systems. One important way of collecting credible data in the form of tweets could be via “Lists” on a certain topic, which provide a way to follow, in a single timeline, a class of users who are believed to be topical experts. After topical experts are extracted by mining the lists’ metadata, a semantic analysis of their tweets, followed by annotations supporting their opinions, can help in predicting sensitive issues such as terrorism and riots, and the countermeasures required to cope with them. Other useful applications of topical experts could be business forecasting, market research, financial decision making, identifying the stage of a product life cycle, opinion polls, crime patterns, social trends, econometrics, and the response of stakeholders to specific topics such as a plebiscite or referendum. However, appropriate weightage may have to be accorded in the algorithm to discount misleading information campaigns by rival opinion makers.

8 Conclusions

OSNs provide real-time data covering a global audience and global topics. To analyze this data intelligently, various approaches are used, which rely on different attributes. From the survey, it is concluded that lists, a crowdsourced feature, can indicate the topic of expertise of their members if they are created carefully. The reason for favoring lists over other attributes lies in their association with crowd wisdom. With crowdsourcing as their prime feature, lists are also the best way to differentiate elite users from general OSN users. Other attributes that are provided by the user himself, such as a fake bio, may mislead the search system. Moreover, previous studies have shown that combining other attributes with lists generates more accurate results.

References

1. Microblogging (wiki article). http://en.wikipedia.org/wiki/Microblogging. Accessed 12 Feb 2016
2. Sullivan, D.: Twitter Improves “Who to Follow” Results & Gains Advanced Search Page. http://selnd.com/wtfdesc (2011). Accessed 15 Feb 2016
3. Malhotra, A., Totti, L., Meira, W., Jr., Kumaraguru, P., Almeida, V.: Studying user footprints in different online social networks. In: Proceedings of ACM International Conference on Advances in Social Networks Analysis and Mining, Washington DC, USA, pp. 1065–1070 (2012)
4. Weng, J., Lim, E., Jiang, J., He, Q.: Twitterrank: finding topic-sensitive influential Twitterers. In: Proceedings of 3rd ACM International Conference on Web Search and Data Mining, New York, USA, pp. 261–270
5. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
6. Haveliwala, T.H.: Topic-sensitive pagerank. In: Proceedings of 11th ACM International Conference on World Wide Web, pp. 517–526 (2002)
7. Pal, A., Counts, S.: Identifying topical authorities in microblogs. In: Proceedings of 4th ACM International Conference on Web Search and Data Mining, New York, USA, pp. 45–54 (2011)
8. Clustering Algorithms (wiki article). https://en.wikipedia.org/wiki/Cluster_analysis. Accessed 3 Mar 2016
9. Bhattacharya, P., Ghosh, S., Kulshrestha, J., Mondal, M., Bilal Zafar, M., Ganguly, N., Gummadi, K.P.: Deep Twitter diving: exploring topical groups in microblogs at scale. In: Proceedings of 17th ACM Conference on Computer-Supported Cooperative Work & Social Computing, New York, USA, pp. 197–210 (2014)
10. Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)
11. Purohit, H., Dow, A., Olonso, O., Duan, L., Haas, K.: User taglines: alternative presentations of expertise and interest in social media. In: Proceedings of IEEE International Conference on Social Informatics, Lausanne, Switzerland, pp. 236–243
12. Wagner, C., Liao, V., Pirolli, P., Nelson, L., Strohmaier, M.: It’s not in their tweets: modeling topical expertise of Twitter users. In: Proceedings of IEEE International Conference on Social Computing, Amsterdam, The Netherlands, pp. 91–100 (2012)
13. Pohl, D., Bouchachia, A., Hellwagner, H.: Online processing of social media data for emergency management. In: Proceedings of IEEE International Conference on Machine Learning and Applications, Miami, Florida, pp. 408–413 (2013)
14. Canini, K.R., Suh, B., Pirolli, P.L.: Finding credible information sources in social networks based on content and social structure. In: Proceedings of 3rd IEEE International Conference on Social Computing, Boston, Massachusetts, pp. 1–8 (2011)
15. Ghosh, S., Sharma, N., Benevenuto, F., Ganguly, N., Gummadi, K.P.: Cognos: crowdsourcing search for topic experts in microblogs. In: Proceedings of 35th ACM International Conference on Research and Development in Information Retrieval, New York, USA, pp. 575–590 (2012)
16. Clarke, C.L.A., Cormack, G.V., Tudhope, E.A.: Relevance ranking for one to three term queries. Informat. Process. Manag. 36(2) (2000)
17. Wu, S., Hofman, J.M., Mason, W.A., Watts, D.J.: Who says what to whom on Twitter. In: Proceedings of 20th ACM International Conference on World Wide Web, New York, USA, pp. 705–714 (2011)
18. Snowball Sampling (wiki article). https://en.wikipedia.org/wiki/Snow-ball_sampling. Accessed 26 Feb 2016
19. Sharma, N., Ghosh, S., Benevenuto, F., Ganguly, N., Gummadi, K.P.: Inferring Who-is-Who in the Twitter social network. Proc. ACM SIGCOMM Comput. Commun. Rev. 42(4), 533–538 (2012)
20. Dill, S., Eiron, N., Gibson, D., Gruhl, D., Guha, R., Jhingran, A., Kanungo, T., Rajagopalan, S., Tomkins, A., Tomlin, J.A., Zien, J.Y.: Semtag and seeker: bootstrapping the semantic web via automated semantic annotation. In: Proceedings of 12th ACM International Conference on World Wide Web, New York, USA, pp. 178–186 (2003)
21. Ramage, R., Dumais, S., Liebling, D.: Characterizing microblogs with topic models. In: Proceedings of 4th AAAI Conference on Weblogs and Social Media, Washington DC, USA, pp. 130–137
22. Kim, D., Jo, Y., Moon, C., Oh, A.: Analysis of Twitter lists as a potential source for discovering latent characteristics of users. In: Proceedings of ACM Workshop on Microblogging, Georgia, USA (2010)

Comparison of BOD5 Removal in Water Hyacinth and Duckweed by Genetic Programming

Ramkumar Mahalakshmi, Chandrasekaran Sivapragasam and Sankararajan Vanitha

Abstract In this study, macrophyte-based plants such as duckweed (Lemna minor) and water hyacinth (Eichhornia crassipes) are considered for the removal of biochemical oxygen demand (BOD5) from domestic wastewater. The maximum BOD5 removal for duckweed and water hyacinth is almost the same (99%). Experiments are conducted to obtain a wide range of data for mathematical modeling. BOD5, retention time (t), and wastewater temperature (Tw) are the parameters considered for modeling, and the models for the two plants are compared. The study reveals that, for both plants, a similar functional relationship exists between BOD5 and retention time in the removal of BOD5. This function is found to be linear. The study also shows that Tw is an important parameter, as it influences the treatment systems. Genetic programming (GP) based modeling is effective for understanding the wetland system by comparing the removal of BOD5.

Keywords Genetic programming · Duckweed and hyacinth · BOD5 removal · Wastewater temperature

1 Introduction The population is increasing at an alarming rate which leads to urbanization and industrialization to meet the human needs. The wastages that are related to the associated activities often end up in the environment as pollutants. There are many R. Mahalakshmi ⋅ C. Sivapragasam (✉) ⋅ S. Vanitha Center for Water Technology, Department of Civil Engineering, Kalasalingam Academy of Research and Education, Krishnankoil 626126, Tamil Nadu, India e-mail: [email protected] R. Mahalakshmi e-mail: [email protected] S. Vanitha e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_39

401

402

R. Mahalakshmi et al.

types of wastes and can be classified into different types such as solid waste, liquid waste, hazardous waste, nonhazardous waste, industrial, surface runoff, residential, etc. These wastes should be treated to lessen the impact on the environment. Different methods such as physical, chemical, and biological exist to treat the domestic wastewater. Most of the conventional methods are expensive and not eco-friendly to treat wastewater. Numerous researchers have conducted many studies and researches to treat the wastewater in an economical and eco-friendly way [1]. Constructed wetland system is one such a method to treat the wastewater using plants. Different plants are used to treat different types of wastewater. Many researchers have reported that duckweed and hyacinth can remove pollutants such as BOD5, phosphate, nitrate, and chemical oxygen demand (COD) effectively in domestic sewage [2, 3]. Duckweed and Hyacinth are very effective in shallow ponds system [4, 5]. Many researchers have considered BOD5 and retention time for duckweed and water hyacinth-based treatment system and observed an efficiency of more than 90% BOD5 removal. Some researchers have reported that wastewater temperature (Tw) is an important parameter in these treatment systems [6, 7]. Akratos et al. [6] observed a maximum BOD5 removal of 89% in water hyacinth while Korner et al. [7] observed a maximum removal of 94% in duckweed. Mietto et al. [8] reported that there is a significant removal of pollutants on phragmites australis-based constructed wetland for every 1 °C rise in Tw. Due to the complexity involved in constructed wetland systems, it is difficult to understand its behavior and is often considered as “black box” [9]. Many researchers studied about process-based models to understand the complex behavior of wetland system [9–11]. Many researchers have also carried out regression modeling to understand how the input parameters affect output in modeling constructed wetlands [12–14]. 
Even with a limited data set, genetic programming (GP) based mathematical models can help in understanding wetland system behavior, but such studies are very limited. Vanitha et al. [15] modeled dissolved oxygen (DO) to establish the superiority of such models. This study models BOD5 removal in duckweed and water hyacinth-based wetland systems with a limited data set. Because Tw is an important parameter, its significance in BOD5 removal is also considered. To the authors’ knowledge, no such study has been reported previously.

2 Methodology and Experimental Setup

The domestic wastewater is collected from a nearby wastewater treatment plant after primary treatment. Duckweed and water hyacinth plants are taken from a nearby pond and thoroughly cleaned with distilled water. To construct the wetland systems, 15 L of wastewater is taken for the duckweed setup, and 25 L of wastewater is taken for the hyacinth setup. The depth of both the duckweed and the hyacinth pond is 30 cm. The plants are then spread over the surface of each pond. The initial characteristics of the sewage are tested in the laboratory. The BOD5 at the inlet ranges between 140 and 170 mg/l, the BOD/COD ratio is 0.32, and the pH at the inlet is 7.88. Samples are collected every day for BOD5 testing. The experimental setups are run for both plants, and the data collected up to the point of maximum removal are used for modeling.

3 Genetic Programming

This algorithm is based on Darwin’s theory of natural selection and survival of the fittest. It operates on parse trees to generate a mathematical model (equation) relating the output to the inputs. The algorithm generates an initial population of equations through random combinations of the terminal set and the functional set. The terminal set consists of the constants and variables that govern the process. The functional set consists of arithmetic operators and mathematical functions. Choosing the correct combination of terminal set and functional set gives a physically meaningful model, which requires an understanding of the process being modeled. The initial population is refined to evolve a better model by using crossover, mutation, elitism, or a combination of these. The mutation operator broadens the search space, elitism carries the best model from the previous generation into the new population, and the crossover operator exchanges information between two parents [16]. Successive generations are produced until the fitness criteria are met; the run is controlled by settings such as the population size and the number of generations. GP is implemented using the Discipulus tool [17].
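The loop described above can be illustrated with a minimal tree-based GP sketch in Python. It is not the Discipulus implementation used in this study; it only shows how a terminal set, a functional set, crossover, mutation, and elitism fit together. The variable names `t` and `Tw`, the synthetic target, and all settings (population size, elite fraction, mutation rate, tree size limit) are assumptions chosen for illustration.

```python
import math
import operator
import random

# Functional set: basic arithmetic only (an assumption for this sketch).
FUNCTIONS = [operator.add, operator.sub, operator.mul]
# Terminal set: two hypothetical input variables plus random constants.
VARIABLES = ["t", "Tw"]


def random_tree(depth, rng):
    """Return a random parse tree: a terminal or a tuple (function, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return rng.choice(VARIABLES)
        return round(rng.uniform(-1.0, 1.0), 2)
    return (rng.choice(FUNCTIONS), random_tree(depth - 1, rng), random_tree(depth - 1, rng))


def evaluate(tree, row):
    """Evaluate a parse tree for one data row (a dict of variable values)."""
    if isinstance(tree, tuple):
        func, left, right = tree
        return func(evaluate(left, row), evaluate(right, row))
    return row[tree] if isinstance(tree, str) else tree


def fitness(tree, rows):
    """Mean squared error against the target 'y'; lower is better."""
    total = 0.0
    for row in rows:
        diff = evaluate(tree, row) - row["y"]
        total += diff * diff
    mse = total / len(rows)
    return mse if math.isfinite(mse) else float("inf")


def paths(tree, prefix=()):
    """Yield the path (sequence of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        yield from paths(tree[1], prefix + (1,))
        yield from paths(tree[2], prefix + (2,))


def subtree(tree, path):
    """Return the node reached by following path from the root."""
    for index in path:
        tree = tree[index]
    return tree


def replaced(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replaced(tree[path[0]], path[1:], new)
    return tuple(parts)


def crossover(a, b, rng):
    """Graft a random subtree of parent b onto a random node of parent a."""
    donor = subtree(b, rng.choice(list(paths(b))))
    return replaced(a, rng.choice(list(paths(a))), donor)


def mutate(tree, rng):
    """Replace a random node with a freshly generated subtree."""
    return replaced(tree, rng.choice(list(paths(tree))), random_tree(2, rng))


def evolve(rows, pop_size=60, generations=20, seed=0):
    rng = random.Random(seed)
    population = [random_tree(3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda tree: fitness(tree, rows))
        elite = population[: pop_size // 5]  # elitism: best models survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            child = crossover(rng.choice(elite), rng.choice(elite), rng)
            if rng.random() < 0.2:
                child = mutate(child, rng)
            if len(list(paths(child))) <= 63:  # crude size limit against bloat
                children.append(child)
        population = elite + children
    return min(population, key=lambda tree: fitness(tree, rows))


# Synthetic rows from a made-up target y = 0.5 * t + Tw (illustration only).
rows = [{"t": float(t), "Tw": float(tw), "y": 0.5 * t + tw}
        for t in (0, 2, 4, 6) for tw in (1, 2, 3)]
best = evolve(rows)
print("best MSE:", fitness(best, rows))
```

Because the elite models are copied unchanged into each new population, the best error found can never get worse from one generation to the next, which is the role the text assigns to the elitist operator.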

4 Modeling of GP

GP models are evolved for BOD5out from the input parameters BOD5in, t, and Tw for the duckweed and water hyacinth-based wetland systems. For duckweed, the data are divided into training, testing, and validation sets in the proportions 50%, 35%, and 15%, respectively. The corresponding proportions for hyacinth are 57%, 29%, and 14%. These proportions are obtained by trial and error. Addition, subtraction, multiplication, division, and the exponential function are used for generating the models. The functional form of the input parameters is shown in Eq. (1).


Table 1 Training, testing, and validation values for duckweed model

Data         Residual BOD5in (%)   t (h)   Tw (°C)   Residual BOD5out (%)
Training     100                   48      17        79
Training     100                   24      16.7      50
Training     100                   48      24        38.1
Training     100                   72      24        25
Training     100                   48      16.9      17
Training     100                   96      16        3
Testing      100                   18      25        56
Testing      100                   42      12.7      47
Testing      100                   66      13.3      5
Testing      100                   24      25        73.8
Validation   100                   72      17.3      26
Validation   100                   96      25        7.1

$$\mathrm{BOD5}_{\mathrm{out}} = f(\mathrm{BOD5}_{\mathrm{in}},\ t,\ T_w) \tag{1}$$

where BOD5out is the residual outlet BOD5 (%), BOD5in is the residual inlet BOD5 (%), t is the time (h), and Tw is the wastewater temperature (°C).

The data collected from the experiments are mixed and then separated so that the training data cover the full range of values. The remaining data are used for testing and validation. The training, testing, and validation values used in GP modeling for the duckweed and hyacinth models are shown in Tables 1 and 2.

4.1 Performance Measure

Normalized Root Mean Square Error (NRMSE) is taken as the performance measure for the models and is given in Eq. (2).

$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{X_{\max} - X_{\min}} \tag{2}$$

where Xmax is the maximum value in the range of observed data, Xmin is the minimum value in the range of observed data, and RMSE is the root mean square error (as per Eq. (3)).


Table 2 Training, testing, and validation values for hyacinth model

Data         Residual BOD5in (%)   t (h)   Tw (°C)   Residual BOD5out (%)
Training     100                   24      26        87.93
Training     100                   48      27        75.4
Training     100                   24      26        78.57
Training     100                   72      25        57.81
Training     100                   48      27        62.5
Training     100                   72      25        48.21
Training     100                   120     26        36.06
Training     100                   96      26        34.37
Training     100                   72      26        21.31
Training     100                   96      26        16.55
Training     100                   192     26        1.37
Training     100                   144     25        2.36
Testing      100                   24      26        89.06
Testing      100                   48      26        77.58
Testing      100                   24      26        72.13
Testing      100                   72      25        38.27
Testing      100                   96      27        14.75
Testing      100                   120     24        1.65
Validation   100                   48      26        71.87
Validation   100                   96      26        33.92
Validation   100                   144     25        14.84

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[(X_m)_i - (X_s)_i\right]^2} \tag{3}$$

where X is any variable subjected to modeling, the subscript m denotes the observed value, the subscript s denotes the simulated value, and n is the number of observations.
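Equations (2) and (3) translate directly into a few lines of Python. The sketch below is only an illustration of the two formulas; the function names and the sample values are assumptions and are not data from this study.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between paired observed and simulated values, Eq. (3)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - s) ** 2 for m, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse(observed, simulated):
    """RMSE divided by the range (max - min) of the observed values, Eq. (2)."""
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("observed values have zero range")
    return rmse(observed, simulated) / span


# Made-up values for illustration, not data from the paper.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 39.0]
print(round(nrmse(observed, simulated), 4))
```

Dividing by the observed range makes the error dimensionless, so models fitted to data with different spreads can be compared on the same scale.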

5 Results and Discussions

The results from the GP run using the data sets are shown in Table 3. The prediction of BOD5 is quite accurate in both the duckweed and hyacinth models. C code is obtained once the GP run is finished; it is decoded to obtain the evolved equations, which are shown below. The evolved equation for the duckweed model is shown in Eq. (4):


Table 3 Comparison of actual residual BOD5out (%) and predicted residual BOD5out (%) for duckweed and hyacinth model

Duckweed model
Actual residual BOD5out (%)   Predicted residual BOD5out (%)
56                            63.8
47                            43.3
5                             26.7
73.8                          59.6
26                            23.7
7.1                           9.2

Water hyacinth model
Actual residual BOD5out (%)   Predicted residual BOD5out (%)
89.06                         83.3
77.58                         70.1
72.13                         83.3
38.27                         40.2
14.75                         28.5
1.65                          17.5
71.87                         70.1
33.92                         27.5
14.84                         12.4

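$$\mathrm{BOD5}_{\mathrm{out}} = 0.7\,(\mathrm{BOD5}_{\mathrm{in}} - t) + 0.2\,T_w \tag{4}$$

The evolved equation for the water hyacinth model is shown in Eq. (5):

$$\mathrm{BOD5}_{\mathrm{out}} = 0.5\,(\mathrm{BOD5}_{\mathrm{in}} - t) + T_w \tag{5}$$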

The NRMSE values for the duckweed and hyacinth models are 0.08 and 0.1, respectively, so the two models have similar error levels. Both models reproduce the decrease of BOD5 with time, and they have a similar structure: retention time, wastewater temperature, and BOD5in enter the expression for BOD5out in the same way. In both models, BOD5out decreases linearly with time and increases linearly with wastewater temperature. This shows that retention time plays a major role in both the duckweed and water hyacinth models. However, the influence of wastewater temperature is higher in the water hyacinth model (coefficient of 1 on Tw in Eq. (5)) than in the duckweed model (coefficient of 0.2 in Eq. (4)). Figure 1 compares the actual BOD5out with the values predicted by GP for the duckweed and water hyacinth models. As seen from Fig. 1, the duckweed model predicts better than the water hyacinth model, which indicates that the duckweed model is the better of the two.
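The different temperature sensitivity of the two models can be checked by evaluating Eqs. (4) and (5) as printed. The Python sketch below does this for a hypothetical operating point; the function names and the chosen inputs are assumptions. The printed coefficients appear to be rounded, so the outputs need not reproduce the predicted values in Table 3.

```python
def bod5_out_duckweed(bod5_in, t, tw):
    """Eq. (4) as printed: residual outlet BOD5 (%) for the duckweed model."""
    return 0.7 * (bod5_in - t) + 0.2 * tw


def bod5_out_hyacinth(bod5_in, t, tw):
    """Eq. (5) as printed: residual outlet BOD5 (%) for the water hyacinth model."""
    return 0.5 * (bod5_in - t) + tw


# Hypothetical operating points: 100% residual inlet BOD5 at 25 degrees C.
for hours in (24, 48, 72):
    print(hours, bod5_out_duckweed(100, hours, 25.0), bod5_out_hyacinth(100, hours, 25.0))

# Change in predicted residual BOD5 for a 1 degree C rise in Tw at t = 48 h.
rise_duckweed = bod5_out_duckweed(100, 48, 26.0) - bod5_out_duckweed(100, 48, 25.0)
rise_hyacinth = bod5_out_hyacinth(100, 48, 26.0) - bod5_out_hyacinth(100, 48, 25.0)
print(rise_duckweed, rise_hyacinth)
```

In both equations the time term enters with a negative sign, while a 1 °C rise in Tw shifts the hyacinth prediction five times as much as the duckweed prediction.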

Fig. 1 Observed and predicted values of BOD5out for duckweed and water hyacinth model (plot of BOD5out (mg/l) against days 1–9; series: observed and predicted BOD5out for duckweed, observed and predicted BOD5out for water hyacinth)

6 Conclusion

It is well known that the reduction in BOD5 concentration depends on the retention time. The GP models capture the relationship between BOD5in and retention time in the functional form (BOD5in − t) for both duckweed and hyacinth, which means that BOD5 removal is linear in the retention time. Tw is an important parameter in macrophyte-based treatment systems. This study shows that Tw has a higher influence on BOD5 removal in hyacinth than in duckweed.

References

1. Dhoti, S., Dixit, S.: Water quality improvement through macrophyte-a review. Environ. Monit. Assess. 152, 149–153 (2009). https://doi.org/10.1007/s10661-008-0303-9
2. Priya, A., Avishek, K., Pathak, G.: Assessing the potentials of Lemna minor in the treatment of domestic wastewater at pilot scale. Environ. Monit. Assess. 184, 4301–4307 (2012). https://doi.org/10.1007/s10661-011-2265-6
3. Mandi, L.: Marrakesh wastewater purification experiment using vascular aquatic plants Eichhornia Crassipes and Lemna Gibba. Water Science Technology 29(4), 283–287 (1994)
4. Iram, S., Ahmad, I., Riaz, Y., Zahra, A.: Treatment of wastewater by lemna minor. Pak. J. Bot. 44(2), 553–557 (2012)
5. Valipour, A., Raman, V.K., Motallebi, P.: Application of shallow pond system using water hyacinth for domestic wastewater treatment in the presence of high total dissolved solids and heavy metal salts. Environ. Eng. Manag. J. 9(6), 853–860 (2010)
6. Akratos, C.S., Tsihrintzis, V.A.: Effect of temperature, HRT, vegetation and porous media on removal efficiency of pilot-scale horizontal subsurface flow constructed wetlands. Ecol. Eng. 29, 173–191 (2007). https://doi.org/10.1016/j.ecoleng.2006.06.013
7. Korner, S., Vermaat, E.J., Veenstra, S.: The capacity of duckweed to treat wastewater: Ecological considerations for a sound design. J. Environ. Qual. 32, 1583–1590 (2003)
8. Mietto, A., Politeo, M., Breschigliaro, Borin, M.: Temperature influence on nitrogen removal in a hybrid constructed wetland system in northern Italy. Ecol. Eng. 75, 291–302 (2015). https://doi.org/10.1016/j.ecoleng.2014.11.027
9. Langergraber, G.: Modeling of Processes in Subsurface Flow Constructed Wetlands: A Review. Vad. Zone J. 7(2), 830–842 (2008). https://doi.org/10.2136/vzj2007.0054
10. Langergraber, G., Simunek, J.: Modeling variably saturated water flow and multicomponent reactive transport in constructed wetlands. Vad. Zone J. 4, 924–938 (2005). https://doi.org/10.2136/vzj2004.0166
11. Kadaverugu, R.: Modelling of subsurface horizontal flow constructed wetlands using openFOAM. Model Earth Syst. Environ. 2, 55 (2016). https://doi.org/10.1007/s40808-016-0111-0
12. Heide, T., Roijackers, R., Nes, E., Peeters, E.: A simple equation for describing the temperature dependent growth of free-floating macrophytes. Aquat. Bot. 84, 171–175 (2006). https://doi.org/10.1016/j.aquabot.2005.09.004
13. Guo, Y., Liu, Y., Zeng, G., Hu, X., Xu, W., Lin, Y., Liu, S., Sun, H., Huang, H.: An integrated treatment of domestic wastewater using sequencing batch biofilm reactor combined with vertical flow constructed wetland and its artificial neural network simulation study. Ecol. Eng. 64, 18–26 (2014). https://doi.org/10.1016/j.ecoleng.2013.12.040
14. Tomenko, V., Ahmedb, S., Popova, V.: Modelling constructed wetland treatment system performance. Ecol. Model. 205, 355–364 (2007). https://doi.org/10.1016/j.ecolmodel.2007.02.030
15. Vanitha, S., Sivapragasam, C., Nampoothiri, N.: Modelling of dissolved oxygen using genetic programming approach. In: First International conference on Lecture Notes in computer science, vol. 10398, pp. 445–452. Springer, India (2017). https://doi.org/10.1007/978-3-319-64419-6_56
16. Koza, J.R.: Genetic Programming: on the programming of computers by natural selection, 1st edn. MIT Press, Cambridge, MA (1992)
17. Sivapragasam, C., Muttil, N., Muthukumar, S.L., Arun, V.M.: Prediction of algal blooms using genetic programming. Mar. Pollut. Bull. 60, 1849–1855 (2010). https://doi.org/10.1016/j.marpolbul.2010.05.020

Mathematical Modeling of Gradually Varied Flow with Genetic Programming: A Lab-Scale Application

Chandrasekaran Sivapragasam, Poomalai Saravanan, Kaliappan Ganeshmoorthy, Atchutha Muhil, Sundharamoorthy Dilip and Sundarasrinivasan Saivishnu

Abstract This study suggests how research interest can be inculcated in undergraduate students, and how advanced knowledge can be mined, by using ICT-based modeling tools to extend the scope of the conventional experiments that students study in their curriculum. The experiment on flow over a rectangular notch from the Civil Engineering curriculum is taken as an example. Conventionally, the objective of the experiment is to find the coefficient of discharge for the notch. Here, however, an attempt is made to redefine the objectives beyond the scope of the curriculum by modeling the flow profile past the notch. In the presence of the notch, the flow behavior gets modulated. The application of genetic programming results in a new research finding and is found to be highly useful for drawing an insightful understanding of the process being studied. This study is also important for disseminating the importance of such data mining methods and for promoting interdisciplinary research from the early stages of engineering education.

Keywords Inculcating research aptitude ⋅ Genetic programming ⋅ Flow over notch ⋅ Fluid mechanics ⋅ Gradually varied flow





C. Sivapragasam ⋅ P. Saravanan (✉) ⋅ K. Ganeshmoorthy ⋅ A. Muhil ⋅ S. Dilip ⋅ S. Saivishnu
Center for Water Technology, Department of Civil Engineering, Kalasalingam Academy of Research and Education, Krishnankoil 626126, Tamil Nadu, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_40

1 Introduction

Education is primarily a search for knowledge. Undergraduate research can be considered effective if an investigation makes an original intellectual or creative contribution to the domain [1]. According to constructivist theory, learning is a process of integrating new knowledge with prior knowledge so that advancement of knowledge is enhanced by the individual [2]. Almost every curriculum blends theory and laboratory courses, and innovative teaching approaches are being suggested, tried, and implemented with the explicit objective of inculcating research aptitude in undergraduate students. Danowitz et al. [3] suggest a team-taught course in the third year of study in which students prepare and present research proposals that can also serve as a research plan for the remainder of their study. Although such broader approaches are very important, another perspective is to bring into practice a culture of knowledge discovery even through the conventional experiments that an undergraduate student is introduced to as early as the second or even the first year of study. Many tools and techniques are now available that can be integrated with conventional experiments to mine knowledge beyond the stated objectives of the experiments. Genetic programming (GP) is one such technique. The main aim of this study is to demonstrate through an example how software can help students extend even a simple conventional laboratory experiment to mine knowledge beyond its stated objectives. The example is taken from the Fluid Mechanics laboratory course, one of the basic engineering laboratory courses in the second year of the undergraduate curriculum for many disciplines such as Civil Engineering, Mechanical Engineering, Chemical Engineering, and Electrical Engineering. Although many laboratory manuals give elaborate details on how to conduct the experiments and record the readings, there is no explicit emphasis on how to take students beyond the scope of the curriculum to a search for knowledge.
Some laboratory manuals place more emphasis on interpreting the results, preparing good scientific reports to communicate the findings, and/or extending the conventional experimental setups to more complex experiments [4]. The example case study considered is the experiment on flow over a rectangular notch in an open channel. Conventionally, the objective of the experiment is to estimate the coefficient of discharge (Cd) of the given notch for a range of discharges [4]. This study demonstrates that, even in the absence of a full-fledged experimental setup for studying gradually varied flows (GVFs) in an open channel, the flow-over-notch experiment can be extended to demonstrate the behavior of GVF through the use of relevant data analysis software. The next section briefly describes the methods of computation available for determining the water profiles of GVF. GP is then introduced to analyze the data because of its ability to evolve physically meaningful mathematical models from data. A demonstration is given of how to frame a research problem and how insightful research conclusions can be derived.


2 Methods of GVF Computations

Quantitative information on the variation of the flow depth and flow velocity along a channel is required in many engineering applications. Methods of computation of GVF are broadly classified into three groups, namely, the graphical integration method, the direct integration method, and the step methods, among which the standard step and direct step methods are widely used [5, 6].

2.1 Direct Step Method

The direct step method is a simple step method applicable to prismatic channels. It is based on the dynamic equation of gradually varied flow, Eq. (1), which is applied in its finite difference form to determine the water profiles [7].

$$\frac{dE}{dx} = S_0 - S_f \tag{1}$$

where E is the specific energy, x is the distance along the channel, S0 is the bottom slope of the channel, and Sf is the average value of the friction slope.
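In finite difference form, Eq. (1) gives the distance between two sections of known depth as Δx = ΔE / (S0 − S̄f), where S̄f is the mean friction slope of the two sections. The Python sketch below applies this step by step. It is an illustration under stated assumptions, not part of this study: the channel is taken as rectangular, the friction slope is computed from Manning's equation in SI units, and the channel dimensions, roughness, and discharge are hypothetical values.

```python
G = 9.81  # gravitational acceleration (m/s^2)


def specific_energy(y, discharge, width):
    """E = y + v^2 / (2g) for a rectangular section."""
    velocity = discharge / (width * y)
    return y + velocity ** 2 / (2 * G)


def friction_slope(y, discharge, width, manning_n):
    """Friction slope from Manning's equation (SI units), an assumed resistance law."""
    area = width * y
    hydraulic_radius = area / (width + 2 * y)
    velocity = discharge / area
    return (manning_n * velocity) ** 2 / hydraulic_radius ** (4 / 3)


def direct_step(y_start, y_end, discharge, width, manning_n, bed_slope, steps=10):
    """Return (x, y) pairs; x is positive downstream of the starting section."""
    dy = (y_end - y_start) / steps
    x, y1 = 0.0, y_start
    profile = [(x, y1)]
    for _ in range(steps):
        y2 = y1 + dy
        delta_e = (specific_energy(y2, discharge, width)
                   - specific_energy(y1, discharge, width))
        mean_sf = 0.5 * (friction_slope(y1, discharge, width, manning_n)
                         + friction_slope(y2, discharge, width, manning_n))
        x += delta_e / (bed_slope - mean_sf)  # finite difference form of Eq. (1)
        y1 = y2
        profile.append((x, y1))
    return profile


# Hypothetical backwater case: depth falls from 2.0 m at a control to 1.3 m upstream.
profile = direct_step(y_start=2.0, y_end=1.3, discharge=6.0, width=3.0,
                      manning_n=0.015, bed_slope=0.001)
for x, y in profile:
    print(f"x = {x:9.1f} m, y = {y:.2f} m")
```

For these inputs the flow is subcritical and deeper than normal depth, so every step has a negative Δx: the computation marches upstream from the control section, as expected for a backwater curve.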

2.2 Standard Step Method

The standard step method is also applicable to non-prismatic channels. Unlike prismatic channels, non-prismatic channels have varying depth and slope, which makes a field survey necessary to collect the required data at all the sections. In such cases, the distance between stations is known, and the depth is determined from it. This method does not require the dynamic equation of gradually varied flow; rather, the water profiles are determined from the law of conservation of energy [7].

$$Z_1 + y_1 + \alpha_1 \frac{v_1^2}{2g} = Z_2 + y_2 + \alpha_2 \frac{v_2^2}{2g} + h_f + h_e \tag{2}$$

where Z is the elevation of the channel bed above the datum, y is the depth of the flow, α is the energy coefficient, v is the velocity of flow, hf is the friction loss, and he is the eddy loss. The details of the direct step method and the standard step method can be found in any standard textbook on open channel hydraulics.

3 Genetic Programming

This algorithm is based on Darwin’s theory of natural selection and survival of the fittest. It uses parse trees to generate equations (mathematical models) that establish an appropriate relationship between the output and input variables. The algorithm generates an initial population of equations through random combinations of the terminal set and the functional set. The terminal set consists of the constants and variables that govern the process. The functional set consists of arithmetic operators (such as addition and subtraction) and mathematical functions (such as exponential and logarithmic functions). For the mathematical models to be physically meaningful, it is necessary to choose the correct combination of terminal set and functional set, which requires an understanding of the process being modeled. The initial population is refined to evolve a better model by using crossover, mutation, elitism, or a combination of these. The mutation operator broadens the search space, elitism carries the best model from the previous generation into the new population, and the crossover operator exchanges information between two parents [8]. Successive generations are produced until the fitness criteria are met; the run is controlled by settings such as the population size and the number of generations. In this study, GP is implemented using the Discipulus tool [9, 10].

4 Modeling the GVF with GP

Flow over a rectangular notch is the example case study considered here. The specific objective is to mathematically model the shape of the resulting GVF profile as the flow passes over the notch. Figure 1 illustrates the resulting shape. The presence of the notch redefines the profile shape. Experiments are conducted in the laboratory setup for flow over a rectangular notch. For a given discharge (Q), the depth of flow (y) is measured at three fixed points along the flow direction (A, B, and C), as shown in Fig. 1. This is repeated for a range of discharges between 0.218 and 1.124 m3/s. The observed data are split into three sets, viz., training, testing, and validation, for developing the GP-based mathematical model (Table 1). Only basic arithmetic functions are considered in GP modeling since the physical characteristics of the process do not mandate the use of more complex functions such as trigonometric, logarithmic, or exponential functions.

Fig. 1 Flow over rectangular notch

5 Results and Discussions

The GP-predicted depth of flow for the validation set is shown in Table 2. The data are divided into training, testing, and validation sets in the proportions 51%, 31%, and 17%, respectively. These proportions are obtained by trial and error. The functional form of the input parameters is shown in Eq. (3).

Table 1 Observed data for the case study

Training
Q (m3/s)   x (cm)   y (cm)
0.218      1.5      9.1
0.240      1.5      10.3
0.490      1.5      12.4
0.781      1.5      13.3
1.124      1.5      15.9
0.272      3        13.1
0.327      3        13.5
0.619      3        14
1.042      3        15.3
1.124      3        17.6
0.272      6        15.1
0.327      6        15.7
0.665      6        17
0.781      6        17.5
1.042      6        18

Testing
Q (m3/s)   x (cm)   y (cm)
0.327      1.5      12.3
0.218      3        11.7
0.327      3        13.5
0.490      3        13.8
0.781      3        14.5
1.042      3        15.3
0.490      6        16.3
0.619      6        16.6
1.124      6        19.4

Validation
Q (m3/s)   x (cm)   y (cm)
0.665      1.5      12.8
1.042      1.5      14
0.240      3        12.3
0.665      3        14.2
0.218      6        14.5


Table 2 Measured and predicted depth for validation data

Discharge (m3/s)   Distance (cm)   Measured depth (cm)   GP predicted depth (cm)   ANN predicted depth (cm)
0.665              1.5             12.8                  12.73                     14.89
1.042              1.5             14                    14.56                     14.71
0.240              3               12.3                  12.37                     12.20
0.665              3               14.2                  14.23                     15.43
0.218              6               14.5                  15.25                     11.90

$$y = f(Q, x) \tag{3}$$

where y is the depth of flow (cm), Q is the discharge (m3/s), and x is the distance (cm). As seen from Table 2, GP predicts the depth of flow for a given discharge and location quite accurately. A comparison is also made with a more popular regression method, namely, the back-propagation artificial neural network (ANN). GP performs better than ANN. Moreover, ANN does not reveal the mathematical relationship between the input and output variables [11]. The exact form of the mathematical equation for this model can be derived from the code evolved by GP. The decoded GP-evolved model is shown in Eq. (4):

$$y = 8.53 + x + 4Q \tag{4}$$
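The comparison between GP and ANN can be quantified from the validation rows of Table 2. The Python sketch below uses the root mean square error as the error measure, which is an assumption because the text does not state which measure was used, and it also evaluates Eq. (4) with its printed (apparently rounded) coefficients. The function and variable names are illustrative.

```python
import math

# Validation rows from Table 2:
# (Q in m3/s, x in cm, measured depth, GP predicted depth, ANN predicted depth), depths in cm.
TABLE_2 = [
    (0.665, 1.5, 12.8, 12.73, 14.89),
    (1.042, 1.5, 14.0, 14.56, 14.71),
    (0.240, 3.0, 12.3, 12.37, 12.20),
    (0.665, 3.0, 14.2, 14.23, 15.43),
    (0.218, 6.0, 14.5, 15.25, 11.90),
]


def eq4_depth(discharge, distance):
    """Depth of flow (cm) from Eq. (4) as printed: y = 8.53 + x + 4Q."""
    return 8.53 + distance + 4 * discharge


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


measured = [row[2] for row in TABLE_2]
rmse_gp = rmse(measured, [row[3] for row in TABLE_2])
rmse_ann = rmse(measured, [row[4] for row in TABLE_2])
rmse_eq4 = rmse(measured, [eq4_depth(row[0], row[1]) for row in TABLE_2])
print(f"RMSE (cm): GP {rmse_gp:.2f}, ANN {rmse_ann:.2f}, Eq. (4) {rmse_eq4:.2f}")
```

On these five validation points the GP predictions have an RMSE of roughly 0.4 cm against roughly 1.6 cm for ANN, and the printed form of Eq. (4) gives an error close to that of the tabulated GP predictions.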

A closer look at the equation reveals the following:
(a) The profile of GVF is linear in the presence of the barrier (rectangular notch in this case), whereas in the absence of the notch it would most likely be parabolic. This is an interesting insight obtained from GP modeling (Fig. 2).
(b) The constant 8.53 reflects the effect of the notch, which creates a pool of water about 11 cm deep upstream of the notch. The flow over this pool and the notch modulates the form of the profile (Fig. 3).
(c) The students felt that GP-based mathematical modeling helped them better recognize the importance of mathematics and its relationship to engineering applications.

Fig. 2 Comparison of actual depth with predicted depth (plot of depth of water against locations 1–5; series: actual depth, depth predicted by GP, depth predicted by ANN)

Fig. 3 Effect of notch

6 Conclusions

The following conclusions can be drawn from this study:
(a) The shape of the flow profile is modified by the presence of the notch and can be modeled as a linear function of discharge.
(b) ICT tools can be used very effectively to inculcate research interest and aptitude in young students through their judicious application in simple, conventional laboratory experiments.
(c) GP appears to be a promising and simple tool for data mining and knowledge discovery from laboratory experiments.

Acknowledgements The authors would like to thank Mr. Vallamkonda Rakesh Ramayya and Mr. Rajesh Kumar Thangaia Viswanathan of Centre for Water Technology for their valuable help and support to complete the work.


References

1. Lewiston, M.E.: Enhancing Research in the Chemical Sciences at Predominantly Undergraduate Institutions. Report from the Undergraduate Research Summit, Bates College National Science Foundation (2003)
2. Anshu.: Nurturing research aptitude in undergraduates medical students. Med. Stud. Res. 14(2), 50–51 (2009). https://doi.org/10.1111/j.1365-2923.2010.03792.x
3. Danowitz, A.M., Brown, R.C., Jones, C.D., Taylor, C.E.: A combination course and lab-based approach to teaching research skills to undergraduates. J. Chem. Educ. 93(3), 434–438 (2016). https://doi.org/10.1021/acs.jchemed.5b00390
4. Sivapragasam, C., Deepak, M., Vanitha, S.: Experiments in Fluid Mechanics & Hydraulic Machinery, 1st edn. Lambert Academic Publishing, Germany (2016)
5. Chow, V.T.: Open channel hydraulics, 1st edn. McGraw-Hill, New York (1959). https://doi.org/10.1126/science.131.3408.1215-a
6. Kumar, A.: Integral Solutions of the Gradually Varied Equations for Rectangular and Triangular Channels. Proc. Inst. Civ. Eng. 65, 509–515 (1978). https://doi.org/10.1680/iicep.1978.2802
7. Subramanya, K.: Flow in open channels, 4th edn. Tata McGraw-Hill Education (2009)
8. Koza, J.R.: Genetic Programming: On the Programming of computers by Natural Selection, 1st edn. MIT Press, Cambridge, MA (1992)
9. Sivapragasam, C., Muttil, N., Muthukumar, S.L., Arun, V.M.: Prediction of algal blooms using genetic programming. Mar. Pollut. Bull. 60, 1849–1855 (2010). https://doi.org/10.1016/j.marpolbul.2010.05.020
10. Muttil, N., Lee, J.H.W., Jayawardena, A.W.: Real-time prediction of coastal algal blooms using genetic programming. In: Hydroinformatics, pp. 890–897 (2004). https://doi.org/10.1142/9789812702838_0110
11. Sivapragasam, C., Vanitha, S., Muttil, N., Suganya, K., Suji, S., Selvi, M.T., Selvi, R., Sudha, S.J.: Monthly flow forecast for Mississippi River basin using artificial neural networks. Neural. Comput. Appl. 24, 1785–1793 (2014). https://doi.org/10.1007/s00521-013-1419-6

A Hybrid Intrusion Detection System for Hierarchical Filtration of Anomalies

Pragma Kar, Soumya Banerjee, Kartick Chandra Mondal, Gautam Mahapatra and Samiran Chattopadhyay

Abstract A Network Intrusion Detection System (NIDS) examines network traffic to reveal malicious activities and network attacks. The diversity of approaches to NIDS, however, is matched by the drawbacks associated with these techniques. In this paper, an NIDS is proposed that aims at hierarchical filtration of intrusions. The experimental analysis has been performed using KDD Cup’99 and NSL-KDD, and it shows that the proposed technique detects attacks with high accuracy, high detection rates, and low false alarm rates. The run-time analysis of the proposed algorithm demonstrates the feasibility of its use and its improvement over existing algorithms.

Keywords NIDS ⋅ KDD Cup’99 ⋅ NSL-KDD ⋅ Feature selection ⋅ Preprocessing ⋅ Decision tree ⋅ Isolation forest ⋅ K-nearest neighbor

P. Kar ⋅ S. Banerjee ⋅ K. C. Mondal ⋅ S. Chattopadhyay (✉)
Jadavpur University, Kolkata 21218, West Bengal, India

G. Mahapatra
Research Centre Imarat, DRDO, Ministry of Defence, Govt of India, Hyderabad 500069, Telangana, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_41

1 Introduction

Networks are of great importance because they provide a platform for important daily activities and for the storage and exchange of abundant information, which also makes them prone to infringement. Machine learning techniques improve the applicability of a Network Intrusion Detection System (NIDS). By inspecting the pattern of network traffic, the system can identify a malicious connection either by matching it with known patterns (misuse/signature-based detection) or by evaluating its significant deviation from normal connections (anomaly-based detection). Misuse-based detection guarantees high true positive rates but fails to detect an intrusion whose pattern is not prerecorded. Anomaly-based detection requires no previous knowledge about a malicious connection, but it suffers from high false positive rates. Hybrid intrusion detection methods integrate the two techniques to utilize their combined benefits. Supervised and unsupervised learning algorithms are strongly affected by the relevance of the features used to classify the data samples. In this paper, an NIDS is presented that incorporates hierarchical filtration of anomalies (HFA). The completeness of the approach is reflected in its ability to resolve the problems associated with the datasets used and to identify intrusions. The approach assumes that selecting relevant features results in a compact dataset in which attacks differ from normal connections in terms of some pattern exhibited by the features. Experimental evaluation shows that the system performance, measured by accuracy, detection rate, false alarm rate, and run time, improves significantly over conventional approaches. The rest of the paper is organized as follows: Sect. 2 presents a survey of related work. The proposed algorithm is described in Sect. 3. Section 4 presents the feasibility and efficiency of the proposed algorithm through experimental results. Section 5 concludes the paper and discusses the future scope of this research.

2 Related Work The concept of NIDS integrates multiple concepts that encircle the core notion of identifying a malicious activity in the network. Hence, the research field encompasses, but is not limited to, feature selection, data processing techniques, and intrusion detection algorithms. Amiri et al. [1] presented feature selection techniques and an intrusion detection model (Least Squares Support Vector Machine) and compares their performances. The forward feature selection algorithm obtains the features that contribute at maximizing the mutual information between the input and class output, by employing a greedy selection procedure. Modified mutual information-based feature selection additionally minimizes the feature replication during the selection, whereas the Linear correlation-based feature selection method reduces the complexity of the former method. The conventional feature selection techniques can be combined to optimize the feature selection process. Peng et al. [2] presented a feature selection technique that aims at minimizing the redundancy, maximizing the relevance, and minimizing the classification error by considering the mutual information between a class and the selected feature set. In this approach, selection of candidate feature set is followed by forward and backward selection of relevant features. The dataset used to analyze the performance of an approach can be refined

A Hybrid Intrusion Detection System for Hierarchical Filtration of Anomalies


by preprocessing techniques such as normalization, discretization [3], and balancing. In recent years, much intrusion detection research has involved machine learning techniques. The approach presented in [4] provides a simple and efficient technique that produces a feature of reduced dimension. However, the algorithm does not take into account the redundancy and class imbalance issues associated with the dataset. It also fails to detect U2R and R2L attacks and has a high time complexity. In [4, 5], a clustering technique is applied to the KDD Cup’99 dataset to derive the new feature. However, cluster analysis on the dataset reveals heterogeneity of the clusters. The authors of [6] presented a hybrid intrusion detection technique that builds a profile for normal connections and then detects anomalies against it. A combination of the C4.5 decision tree and a one-class support vector machine (SVM) forms the base of this approach. In [7], BIRCH clustering is used, followed by elimination of unimportant features pertaining to each attack class. Intrusion detection is finally performed by combining the SVM models built for each attack class. Although the performance of this system was significant for DoS and probe attacks, its detection of U2R and R2L attacks was not remarkable. Wang et al. [8] described the statistical drawbacks of the KDD Cup’99 dataset: redundant records and class imbalance produce biased results, so pre-processing techniques are required to refine the dataset. According to [9], smurf and neptune records dominate the other records in the dataset. If the training set is populated with unevenly distributed records, the training process is distorted, which impedes the learning algorithm. The NSL-KDD dataset is a corrected version of KDD Cup’99 in which redundancy is removed based on all 41 features. It also incorporates some additional unknown attacks into the testing set.

3 Proposed Algorithm

3.1 Related Concepts

ID3: ID3 [10] is a decision tree algorithm, a supervised learning technique that aims at classifying data into corresponding classes. Let DS be the given dataset with F features and C different class labels. The algorithm calculates the entropy EN(DS) of DS. If frac(n) denotes the proportion of data in DS belonging to class n, then the entropy is calculated by Eq. 1 and the Information Gain (IG) for each feature Fi ∈ F is calculated by Eq. 2.

$$EN(DS) = -\sum_{n=1}^{C} \mathrm{frac}(n)\,\log_2 \mathrm{frac}(n) \tag{1}$$

P. Kar et al.

$$IG(F_i, DS) = EN(DS) - \sum_{p=1}^{D} \mathrm{frac}(p)\,EN(p) \tag{2}$$

Here, D is the number of subsets created by splitting DS over Fi, frac(p) is the proportion of data in DS that falls in subset p, and EN(p) is the entropy of subset p. The feature with the highest IG is selected, and the entire process is repeated for the remaining features until all data points are classified or all features are selected.

k-Nearest Neighbor: k-Nearest Neighbor (k-NN) [11] is a popular supervised learning technique in which an unknown data point is assigned a class label based on the class labels of its k nearest neighbors. The k data points nearest to the unlabeled point in the n-dimensional feature space are selected based on the Euclidean distance, and the class to which the majority of these neighbors belong is assigned to the unknown data point.

Isolation Forest: Isolation forest (IForest) [12] is a learning technique that can be used for “isolating” an anomaly from the normal instances. A data point can be isolated by randomly splitting the dataset in an iterative manner. An abnormal data point can be isolated in s splits, whereas a normal data point requires t splits, where s < t. Each split is performed by randomly selecting a feature F and a corresponding split value p such that Fmin ≤ p ≤ Fmax. This scenario can be represented by an isolation tree, where s ∝ pathlength(root, node). Multiple isolation trees are created to form the IForest. The path length of a particular data point is estimated as the average of its path lengths over all the isolation trees. The data point is considered an outlier (anomaly) if its average path length is significantly shorter than that of the others.
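As a concrete illustration of Eqs. 1 and 2 in the ID3 description above, the following sketch computes the entropy of a small labelled dataset and the information gain of one categorical feature. The toy records, the feature name `protocol`, and the helper names are assumptions made for illustration; they are not taken from the paper's datasets or implementation.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """EN(DS): Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(records, labels, feature):
    """IG(F_i, DS): entropy of DS minus the weighted entropy of its subsets."""
    subsets = defaultdict(list)
    for record, label in zip(records, labels):
        subsets[record[feature]].append(label)  # split DS over the feature
    total = len(labels)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Hypothetical toy data: four connections described by one feature.
records = [{"protocol": "tcp"}, {"protocol": "tcp"},
           {"protocol": "icmp"}, {"protocol": "icmp"}]
labels = ["normal", "normal", "attack", "attack"]

dataset_entropy = entropy(labels)                      # two balanced classes
gain = information_gain(records, labels, "protocol")   # feature separates them
print(dataset_entropy, gain)
```

In this toy case the feature splits the data into two pure subsets, so the information gain equals the full dataset entropy of 1 bit.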

3.2 Algorithm Description

The proposed approach, Hierarchical Filtration of Anomalies (HFA), is a hybrid intrusion detection system that comprises three main stages: segregation, pre-processing, and intrusion detection, as shown in Fig. 1. The approach requires two categories of training data: a dataset containing only normal records (TN) and a dataset containing both normal and attack records (TV). For the KDD Cup’99 dataset, the training set (TV) and the testing set (TV′) are formed by tenfold cross-validation. In this method, the entire dataset is segmented into 10 non-overlapping parts, and the system is trained and tested over 10 iterations. In each iteration, one of the ten segments serves as the testing set and the remaining nine parts are used as training data. Each training dataset is used to train the decision tree model, and the corresponding testing dataset is combined with the training dataset to form the final testing dataset TV′. For the NSL-KDD dataset, the available training and testing datasets are used.


Fig. 1 The HFA Process

The data pre-processing stage is described in Algorithm 1. Experimental analysis showed that choosing 19 features using the Fisher score [13] balances the relevance of each class and hence improves the performance of the proposed algorithm. Algorithm 2 describes the intrusion detection stage.

4 Experimental Results

4.1 Result Analysis on KDD Cup’99 Dataset

The KDD Cup’99 dataset [16] contains 494,020 records, from which the training and testing sets are derived by tenfold cross-validation. The average result after training and testing the system is presented in this section. Each TV has 444,618 records, TN contains 97,278 records, and each TV′ initially contains 49,402 instances. The record counts of TV after redundancy removal and after removal of class imbalance are 79,791 and 222,444, respectively. Figure 2a graphically presents the distribution of records belonging to different classes in TV at various pre-processing stages.

Algorithm 1: Pre-processing stage

Input: TN, TV, TV′
Output: TN with selected features and TV with distinct and balanced records in the training phase; TV′ with selected features in the testing phase.
Select a subset of features Fs ⊆ F from the feature space F, from TN and TV during the training phase and from TV′ during the testing phase. Map non-numeric feature values to numeric values. TN ← TN with Fs features, TV ← TV with Fs features, TV′ ← TV′ with Fs features.
Select distinct records from each class C ∈ {Normal, DoS, Probe, U2R, R2L} in TV. TV ← TV with distinct records.
For each data point dm of a minority class in TV, select its K nearest neighbors, where ki ∈ K and each ki is represented by the features Fi ∈ F. Calculate the synthetic data point [14] as Syntheticval(Fi) = dm(Fi) + (ki(Fi) − dm(Fi)) × r, where r ∈ [0, 1].
Apply Edited Nearest Neighbor (ENN) [15] on TV. TV ← TV with balanced records.
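The synthetic-point formula in Algorithm 1 (the SMOTE interpolation of [14]) can be sketched as follows. This is a minimal illustration under stated assumptions: the feature vectors are plain lists of numbers, the neighbor is supplied by the caller rather than found by a nearest-neighbor search, one random r is drawn per synthetic point, and the function name `synthetic_point` is hypothetical.

```python
import random


def synthetic_point(d_m, k_i, rng):
    """Interpolate between a minority record d_m and one of its neighbors k_i.

    Implements Synthetic(F_i) = d_m(F_i) + (k_i(F_i) - d_m(F_i)) * r
    with a single r drawn uniformly from [0, 1) (an assumption; r could
    also be drawn separately for each feature).
    """
    r = rng.random()
    return [a + (b - a) * r for a, b in zip(d_m, k_i)]


rng = random.Random(0)          # fixed seed so the sketch is repeatable
d_m = [1.0, 10.0]               # hypothetical minority-class record
k_i = [3.0, 20.0]               # hypothetical nearest neighbor of d_m
new_point = synthetic_point(d_m, k_i, rng)
print(new_point)
```

With a single r per point, the synthetic record always lies on the segment joining d_m and k_i, so the oversampled class stays within the region already occupied by minority records.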


Algorithm 2: Intrusion detection stage

Input: TN, TV in the training phase; TV′ in the testing phase.
Output: Connections predicted as normal or anomaly.
Build the ID3 model, the Isolation forest model and the k-NN classifier.
Train the ID3 model and the k-NN (k = 1) classifier model using TV.
Train the Isolation forest model for outlier detection using TN.
ID3n ← ∅, ID3a ← ∅, kNNn ← ∅.
foreach recordi ∈ TV′ do
    Test recordi using the ID3 model to check whether it is normal or attack.
    if recordi is classified as Normal then
        Add recordi to ID3n
    else if recordi is classified as Attack then
        Add recordi to ID3a
foreach recordj ∈ ID3a do
    Test recordj using the k-NN classifier.
    if recordj is classified as Normal then
        Add recordj to kNNn
    else if recordj is classified as Attack then
        Classify recordj as DoS, Probe, U2R or R2L attack and block the connection.
Combine ID3n and kNNn to form Normalt.
foreach recordk ∈ Normalt do
    Test recordk using the Isolation forest model to check whether it is normal or attack.
    if recordk is classified as Normal then
        Predict recordk as Normal (Inlier).
    else if recordk is classified as Attack then
        Predict recordk as Attack (Outlier) and block the connection.
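The three-level filtration of Algorithm 2 can be sketched as a cascade of already-trained predictors. In this minimal sketch the three models are passed in as plain callables; the names `id3_predict`, `knn_predict`, and `iforest_predict`, the label strings, and the toy threshold rules standing in for trained models are all assumptions for illustration, not the authors' implementation.

```python
def hfa_predict(records, id3_predict, knn_predict, iforest_predict):
    """Return one final label per record, following the HFA cascade.

    id3_predict(r)     -> "Normal" or "Attack"
    knn_predict(r)     -> "Normal", "DoS", "Probe", "U2R" or "R2L"
    iforest_predict(r) -> "Normal" (inlier) or "Attack" (outlier)
    """
    final = {}
    normal_t = []                            # ID3n and kNNn combined
    for idx, record in enumerate(records):
        if id3_predict(record) == "Normal":
            normal_t.append(idx)             # record goes to ID3n
            continue
        attack_class = knn_predict(record)   # second level: records in ID3a
        if attack_class == "Normal":
            normal_t.append(idx)             # record goes to kNNn
        else:
            final[idx] = attack_class        # classified attack, blocked
    for idx in normal_t:                     # third level: outlier check
        final[idx] = iforest_predict(records[idx])
    return [final[i] for i in range(len(records))]


# Toy stand-ins for trained models, operating on a single numeric feature.
id3 = lambda x: "Attack" if x > 50 else "Normal"
knn = lambda x: "DoS" if x > 80 else "Normal"
iforest = lambda x: "Attack" if x < 0 else "Normal"

predictions = hfa_predict([10, 60, 90, -5], id3, knn, iforest)
print(predictions)
```

A record reaches the final Normal label only after passing the decision tree or the k-NN classifier and then the outlier check, so an attack missed at the first two levels can still be filtered out at the third.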

Fig. 2 Distribution of records at pre-processing steps: (a) KDD Cup’99, (b) NSL-KDD


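Table 1 Confusion matrices for intrusion detection stage: decision tree, k-NN classification, outlier detection on KDD Cup’99 (rows: actual class, columns: predicted class)

Decision tree

| Actual ↓ / Predicted → | Normal | Attack |
|---|---|---|
| Normal | 73,239 | 1316 |
| Attack | 1327 | 195,964 |

k-NN classification

| Actual ↓ / Predicted → | Normal | DoS | Probe | U2R | R2L |
|---|---|---|---|---|---|
| Normal | 491 | 364 | 89 | 352 | 20 |
| DoS | 37 | 75,412 | 146 | 0 | 1 |
| Probe | 29 | 158 | 41,365 | 16 | 0 |
| U2R | 1411 | 185 | 0 | 32,315 | 4823 |
| R2L | 281 | 2 | 0 | 36 | 39,747 |

Outlier detection

| Actual ↓ / Predicted → | Normal | Attack |
|---|---|---|
| Normal | 68,970 | 4760 |
| Attack | 2763 | 322 |

Table 2 Overall system performance on KDD Cup’99 dataset

| Actual ↓ / Predicted → | Normal | Attack |
|---|---|---|
| Normal | 68,970 | 5585 |
| Attack | 2763 | 194,528 |

| Accuracy (%) | Detection rate (%) | False alarm (%) |
|---|---|---|
| 96.92 | 97.20 | 7.49 |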

Table 1 presents the intermediate results of the individual intrusion detection models as confusion matrices. In k-NN classification, the accuracy rates for the Normal, DoS, Probe, U2R, and R2L classes are 37.31%, 99.75%, 99.51%, 83.42%, and 99.20%, respectively. The overall system performance on the KDD Cup’99 dataset is shown in Table 2. A comparison with the classification accuracy of CANN [4] shows that, unlike CANN, our approach detects all four attack types accurately. The classification accuracies of CANN for Normal, DoS, Probe, U2R, and R2L are 97.04%, 99.68%, 87.61%, 3.85%, and 57.02%, respectively, whereas in the proposed system the overall accuracy for the Normal class is 92.5% and the accuracies for DoS, Probe, U2R, and R2L attacks are 99.75%, 99.51%, 83.42%, and 99.20%, respectively. Hence, HFA outperforms CANN in classification accuracy for every attack class. Although the false alarm rate is relatively higher in our approach, the total runtime for CANN, including data preparation, training, and testing, is 30 h, whereas the total runtime of our system is 1.12 h for tenfold cross-validation on the KDD Cup’99 dataset. Each round of training and testing takes approximately 6.73 min. This indicates the feasibility of our approach.
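The overall accuracy and false alarm rate in Table 2 can be recomputed from its confusion-matrix counts under the usual definitions, accuracy = (TN + TP) / total and false alarm rate = FP / (FP + TN), with attack as the positive class. These definitions are assumed here, since the text does not state them; the detection rate is left out because its exact definition is not given. The variable names are illustrative.

```python
# Overall KDD Cup'99 counts from Table 2 (actual class -> predicted class).
normal_as_normal = 68970    # true negatives
normal_as_attack = 5585     # false positives (false alarms)
attack_as_normal = 2763     # false negatives
attack_as_attack = 194528   # true positives

total = (normal_as_normal + normal_as_attack
         + attack_as_normal + attack_as_attack)

# Share of all records whose final label matches the actual label.
accuracy = 100.0 * (normal_as_normal + attack_as_attack) / total

# Share of actual normal records that were flagged as attacks.
false_alarm = 100.0 * normal_as_attack / (normal_as_normal + normal_as_attack)

print(f"accuracy = {accuracy:.2f}%, false alarm = {false_alarm:.2f}%")
```

Under these definitions the counts give an accuracy of about 96.93% and a false alarm rate of about 7.49%, in line with the reported 96.92% and 7.49%.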

4.2 Result Analysis on NSL-KDD Dataset

TV for NSL-KDD [17] initially contains 35,567 records, TN contains 18,974 records, and TV′ initially contains 21,040 instances. The record counts of TV after redundancy removal and after removal of class imbalance are 30,683 and 102,692, respectively. Figure 2b graphically presents the distribution of records belonging to different classes in TV at various pre-processing stages. Hence, TV′ finally contains 123,732 instances without class labels. Table 3 shows the confusion matrices for the intermediate intrusion detection stages. In k-NN classification, the accuracy rates for the Normal, DoS, Probe, U2R, and R2L classes are 44.3%, 98.49%, 97.65%, 98.03%, and 96.88%, respectively. Table 4 presents the confusion matrix for the overall system, obtained by combining the number of records predicted as normal or attack at the intermediate stages, and hence measures its performance. Figure 3a presents a visual comparison of the accuracy of the approach at different steps of intrusion detection on the two datasets. Figure 3b shows that the time required to run the entire algorithm is greater for the KDD Cup’99 dataset, because the runtime is proportional to the number of records in the dataset. Table 5 compares the accuracy of the decision tree, k-NN, and HFA on the preprocessed datasets. It can be inferred that, although the accuracy of HFA on the Normal class is lower than that of the decision tree algorithm, its accuracy on the attack classes is higher than that of the other two methods on both datasets, with the exception of U2R on KDD Cup’99. This supports the ability of HFA to eliminate most intrusions from the normal traffic.

Table 3 Confusion matrices for intrusion detection stage: decision tree, k-NN classification, outlier detection on NSL-KDD (rows: actual class, columns: predicted class)

Decision tree

| Actual ↓ / Predicted → | Normal | Attack |
|---|---|---|
| Normal | 36,824 | 711 |
| Attack | 4835 | 81,362 |

k-NN classification

| Actual ↓ / Predicted → | Normal | DoS | Probe | U2R | R2L |
|---|---|---|---|---|---|
| Normal | 315 | 143 | 69 | 122 | 62 |
| DoS | 262 | 31,239 | 204 | 4 | 7 |
| Probe | 294 | 139 | 18,810 | 1 | 26 |
| U2R | 27 | 0 | 1 | 14,595 | 266 |
| R2L | 214 | 8 | 15 | 245 | 15,005 |

Outlier detection

| Actual ↓ / Predicted → | Normal | Attack |
|---|---|---|
| Normal | 33,651 | 3488 |
| Attack | 3596 | 2036 |

Table 4 Overall system performance on NSL-KDD dataset

| Actual ↓ / Predicted → | Normal | Attack |
|---|---|---|
| Normal | 33,651 | 3884 |
| Attack | 3596 | 82,601 |

| Accuracy (%) | Detection rate (%) | False alarm (%) |
|---|---|---|
| 93.95 | 95.5 | 10.34 |

Fig. 3 Comparison of system performance on KDD Cup’99 and NSL-KDD: (a) comparison of accuracy at intermediate steps, (b) comparison of run time at intermediate steps

Table 5 Accuracy (%) of decision tree, k-NN and HFA

| Method | KDD Cup’99 Normal | KDD Cup’99 DoS | KDD Cup’99 Probe | KDD Cup’99 U2R | KDD Cup’99 R2L | NSL-KDD Normal | NSL-KDD DoS | NSL-KDD Probe | NSL-KDD U2R | NSL-KDD R2L |
|---|---|---|---|---|---|---|---|---|---|---|
| Decision tree | 98.2 | 97.45 | 97.52 | 94.8 | 99.0 | 97.34 | 93.29 | 93.15 | 96.78 | 85.1 |
| k-NN (k = 1) | 90.19 | 97.40 | 92.03 | 80.29 | 98.70 | 90.0 | 92.10 | 93.76 | 95.44 | 82.68 |
| HFA (k = 1) | 92.5 | 99.75 | 99.51 | 83.42 | 99.20 | 89.65 | 98.49 | 97.65 | 98.03 | 96.88 |

5 Conclusion

In this paper, we introduced a hierarchical filtration process for anomalies, in which attacks are filtered out by stepwise assessment of the testing dataset. The algorithm aims to classify as many attacks as possible and also provides scope for detecting attacks that might not be properly classified by the intermediate filtration and classification. This provision means that even if a particular attack cannot be classified into one of the known classes, it can at least be removed from the normal traffic, thus promoting in-depth filtration and elimination of anomalies from the traffic. Experimental analysis shows that the proposed approach performs efficiently in the presence of both known and unknown attacks. While the system achieves desirable accuracy on both datasets, future research will focus on lowering the false alarm rate and further tuning the classification technique.

References

1. Amiri, F., Yousefi, M.R., Lucas, C., Shakery, A., Yazdani, N.: Mutual information-based feature selection for intrusion detection systems. J. Netw. Comput. Appl. 34(4) (2011)
2. Peng, H., Long, F., Ding, C.: Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 27(8), 1226–1238 (2005)
3. Deshmukh, D.H., Ghorpade, T., Padiya, P.: Intrusion detection system by improved preprocessing methods and Naïve Bayes classifier using NSL-KDD’99 Dataset. In: IEEE Electronics and Communication Systems (ICECS). IEEE (2014)
4. Lin, W.-C., Ke, S.-W., Tsai, C.-F.: CANN: an intrusion detection system based on combining cluster centers and nearest neighbors. Knowl. Based Syst. 78, 13–21 (2015)
5. Tsai, C.-F., Lin, C.-Y.: A triangle area based nearest neighbors approach to intrusion detection. Pattern Recognit. 43(1), 222–229 (2010)
6. Kim, G., Lee, S., Kim, S.: A novel hybrid intrusion detection method integrating anomaly detection with misuse detection. Expert Syst. Appl. 41(4), 1690–1700 (2014)
7. Horng, S.-J., Su, M.-Y., Chen, Y.-H., Kao, T.-W., Chen, R.-J., Lai, J.-L., Perkasa, C.D.: A novel intrusion detection system based on hierarchical clustering and support vector machines. Expert Syst. Appl. 38(1), 306–313 (2011)
8. Wang, Y., Yang, K., Jing, X., Jin, H.L.: Problems of KDD Cup’99 dataset existed and data preprocessing. In: Applied Mechanics and Materials, vol. 667, pp. 218–225. Trans Tech Publications (2014)
9. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD CUP’99 data set. In: IEEE Computational Intelligence for Security and Defense Applications, CISDA, pp. 1–6. IEEE (2009)
10. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1, 81–106 (1986)
11. Mitchell, T.: Machine Learning. McGraw Hill, New York (1997)
12. Liu, F.T., Ting, K.M., Zhou, Z.-H.: Isolation forest. In: Proceedings of ICDM (2008)
13. Xue-qin, Z., Chun-hua, G., Jia-jun, L.: Intrusion detection system based on feature selection and support vector machine. In: Communications and Networking in China, ChinaCom’06, pp. 1–5. IEEE (2006)
14. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority oversampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
15. Wilson, D.L.: Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. Syst. Man Cybern. 2(3), 408–421 (1972)
16. http://www.kdd.org/kdd-cup/view/kdd-cup-1999
17. https://github.com/defcom17/NSL_KDD

A Novel Method for Image Encryption

Ravi Saharan and Sadanand Yadav

Abstract In general, the playfair cipher uses a 5 × 5 matrix to generate the key matrix for encryption, which supports only 25 uppercase letters. To overcome this drawback, various extended playfair cipher techniques are used. Here, a new cryptographic method of image encryption is proposed, which contains two stages to make it secure. In the first stage, we first perform a rotation operation and then apply a diffusion operation (XOR) on two consecutive pixels. In the second stage, the extended playfair cipher (EPC) technique is applied along with a secret key. In EPC, we use a 16 × 16 key matrix, which gives better results compared to other playfair cipher techniques. The proposed encryption technique is implemented on images to evaluate its security and performance on the basis of key size analysis, key sensitivity analysis and statistical analysis.





Keywords Playfair cipher · XOR · Rotation (flip left to right) · Extended playfair cipher (EPC) · Encryption · Decryption





R. Saharan (✉) ⋅ S. Yadav
Central University of Rajasthan, Bandarsindri, NH-8, Ajmer 305817, Rajasthan, India
e-mail: [email protected]
S. Yadav
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_42

1 Introduction

Nowadays, a large number of digital images are sent from one person to another over the Internet. Exchanging information over the Internet and storing digital data on open networks create conditions in which illegitimate users can obtain very important information. With such digital advancement, images have become an essential part of our lives in areas such as medical imaging systems, the military, cable TV, national security agencies, diplomatic affairs and online personal photo albums [1]. We therefore need authentic, fast and robust security techniques to encrypt digital images. The main goal of protecting an image is to maintain confidentiality, authenticity and integrity [2]. Generally, cryptography is divided into two categories [3]: symmetric and asymmetric key cryptography. In symmetric key cryptography, the same key is used for encryption and decryption, while in asymmetric key cryptography, a public key is used for encryption and a private (secret) key is used for decryption.

1.1 Categorization of Image Encryption Algorithms

Image encryption algorithms are mainly categorized into three groups [4]:

1. Position permutation-based (shuffling).
2. Value transformation-based.
3. Visual transformation-based.

Chen and Chen [5] described the SCAN methodology for image encryption, in which images are encrypted using scan patterns generated by a variety of scanning paths. Sankpal and Vijaya [6] described a chaotic approach that combines confusion and diffusion operations; in both operations, several chaotic maps are used as the secret key. Jolfaei and Mirghadri [7] also used a chaotic method, combining pixel shuffling with the W7 stream cipher and using the Henon map to generate a permutation matrix. Al-Husainy [8] proposed a method that combines a diffusion operation (XOR) followed by a confusion operation (circular right rotation); to improve security, both operations are performed multiple times. Bibhudendra et al. [9] and Acharya et al. [10] presented methods for generating self-invertible key matrices, which are useful in extended hill cipher algorithms. Panigrahy et al. [11] proposed a method to encrypt images using a self-invertible key matrix in the hill cipher algorithm. Chand and Bhattacharyya [12] proposed a playfair cipher that uses a 6 × 6 matrix with four iteration steps. Srivastava and Gupta [13] proposed an image encryption method that uses the extended playfair cipher followed by an LFSR. Hans et al. [14] proposed a playfair cipher using rotation and random swap patterns. Several other variants of the playfair cipher exist, such as an extension that uses a 16 × 16 matrix and a modified playfair cipher that uses a 10 × 9 rectangular matrix [15]. Iqbal et al. [16] proposed a method that uses the playfair technique (6 × 6 key matrix) with Excess-3 code (XS3) and the Caesar cipher.

In Sect. 2, the proposed technique is explained; Sect. 3 presents the experiments, results and security analysis; Sect. 4 gives the conclusion; and Sect. 5 outlines future work.


2 Proposed Work

The proposed work uses a combined approach for image encryption. The method has two stages:

1. In the first stage, we perform a rotation operation (flip left to right) followed by a diffusion operation, which is done with XOR.
2. In the second stage, we apply the extended playfair cipher technique to the output image of the first stage with the help of a secret key (password).

The architecture of the proposed work is given in Fig. 1.

2.1 Extended Playfair Cipher Algorithm

Step 1: Read an alphanumeric password, eliminate the repeated characters from it, and convert the remaining characters into their equivalent ASCII values.
Step 2: Construct a key matrix of dimension 16 × 16 by filling in the password values first, followed by the remaining numbers between 0 and 255.
Step 3: Apply the working rule of the extended playfair cipher, which is the same as that of the traditional one.

In this section, we discuss two algorithms: 1. the encryption algorithm and 2. the decryption algorithm.
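Steps 1 and 2 above can be sketched as follows. This is a minimal illustration under assumptions that the text leaves open: the remaining values are filled in ascending order, characters with codes above 255 are ignored, and the function name `build_key_matrix` is hypothetical.

```python
def build_key_matrix(password):
    """Return a 16 x 16 key matrix holding every value 0..255 exactly once."""
    seen = set()
    values = []
    for ch in password:
        code = ord(ch)
        if code < 256 and code not in seen:   # drop repeated characters
            seen.add(code)
            values.append(code)
    # Fill the rest of the matrix with the values not used by the password.
    values.extend(v for v in range(256) if v not in seen)
    return [values[row * 16:(row + 1) * 16] for row in range(16)]


matrix = build_key_matrix("sada")
print(matrix[0])
```

For the password ‘sada’, the repeated ‘a’ is dropped, so the first row starts with the codes of ‘s’, ‘a’ and ‘d’ and continues with the unused values from 0 upward.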

Fig. 1 Architecture of proposed work (input image and password → rotation operation followed by diffusion operation → extended playfair cipher algorithm → output image)

2.2 Encryption Algorithm

To create the encrypted image IE from the input image, the following steps are taken:

Step 1: Select an alphanumeric password (secret key) of length N.
Step 2: The number of pixels in the input image is length × width (L × W). Starting from the first pixel, select two consecutive pixels X and X + 1 from the input image and apply a rotation operation (flip left to right) to pixel X + 1.
Step 3: Perform an XOR operation between the rotated pixel X + 1 and pixel X, and store the result at X.
Step 4: Repeat Steps 2 and 3 until all the pixels have been traversed.
Step 5: Apply the extended playfair cipher (EPC) with the password to the image generated by the previous steps.
Step 6: The result is the final encrypted image.

2.3 Decryption Algorithm

To create the decrypted image ID from the encrypted image IE, the following steps are used:

Step 1: Use the same secret key of length N that was selected in the encryption algorithm.
Step 2: Take the encrypted image IE and apply the extended playfair cipher (EPC) again using the password.
Step 3: Select two consecutive pixels X and X − 1, starting from the last pixel (L × W) of the image, and perform a rotation operation (flip left to right) on pixel X − 1.
Step 4: Perform an XOR operation between the rotated pixel X − 1 and pixel X, and store the result at X. Then perform a rotation operation (flip left to right) on the result of the XOR operation and store this result at X − 1.
Step 5: Continue until all the pixels have been traversed.
Step 6: The result is the original input image.
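A minimal sketch of the first-stage diffusion and one way to undo it is given below. It rests on several assumptions that the text does not settle: a pixel is an 8-bit value, the rotation (flip left to right) of a pixel is read as reversing the order of its bits, and the pixels are processed as a flat sequence. The inverse shown is the direct algebraic inverse of this forward pass rather than a transcription of the decryption steps above, and the EPC stage is omitted. The function names are hypothetical.

```python
def flip_bits(pixel):
    """Reverse the bit order of an 8-bit value (assumed meaning of 'flip')."""
    return int(f"{pixel:08b}"[::-1], 2)


def diffuse(pixels):
    """Forward pass: X <- X XOR flip(X + 1), from the first pixel onward."""
    out = list(pixels)
    for i in range(len(out) - 1):
        out[i] ^= flip_bits(out[i + 1])   # out[i + 1] is still the original
    return out


def undiffuse(pixels):
    """Inverse pass: walk backwards so each neighbor is restored first."""
    out = list(pixels)
    for i in range(len(out) - 2, -1, -1):
        out[i] ^= flip_bits(out[i + 1])   # out[i + 1] is already recovered
    return out


image = [12, 200, 7, 255, 0, 33]          # hypothetical flattened pixel row
cipher = diffuse(image)
restored = undiffuse(cipher)
print(cipher, restored == image)
```

Because the last pixel is left unchanged by the forward pass, the backward walk can recover every earlier pixel in turn, which is what makes this particular reading of the diffusion step reversible.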

3 Experiments, Results and Security Analysis

To evaluate the proposed technique, it is tested on several images of different sizes, including the standard test images ‘Lena’, ‘Cameraman’ and ‘Baboon’. We also carried out security analyses of the proposed technique, namely key space analysis, correlation analysis, statistical analysis and key sensitivity analysis, to demonstrate the security features of the proposed method.

3.1 Key Space Analysis

For a sound cryptosystem, the key space should be large enough to make a brute force attack infeasible. In the proposed work, the key space is 256!, the number of possible arrangements of the 16 × 16 key matrix, which is large enough to resist exhaustive search. Figure 2 shows encrypted versions of the standard test image Lena obtained with the keys ‘sada’ and ‘sadanand’.
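To give a sense of scale for the stated key space, the sketch below expresses 256! in bits. It counts every arrangement of the key matrix; how many of these arrangements are reachable from passwords of a given length is a separate question that is not addressed here. The variable names are illustrative.

```python
import math

# Number of distinct orderings of the 256 entries in the 16 x 16 key matrix.
key_space = math.factorial(256)

# Express the size in bits: log2(256!) computed from the exact integer.
key_space_bits = math.log2(key_space)

print(f"log2(256!) is about {key_space_bits:.1f} bits")
```

The result is roughly 1684 bits, far beyond the 128 to 256 bits of common symmetric keys.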

3.2 Key Sensitivity Analysis

For sensitivity analysis, a small change is made to the secret key and the modified key is used for decryption, which yields a completely different image, as shown in Fig. 3.

3.3 Statistical Analysis

In cryptanalysis, the statistical attack is one of the most important attacks, and any effective cryptosystem must withstand it. To verify this security feature, we compare the histogram of the standard test image Lena with that of the encrypted image (IE) in Fig. 4 (see also Table 1).

Fig. 2 a Standard test image Lena. b Encrypted image(IE) with ‘sada’ key. c Encrypted image (IE) with ‘sadanand’ key


Fig. 3 a Standard test image Lena. b Encrypted image(IE). c Decrypted image (ID) with wrong key

Fig. 4 a Histogram for standard test image Lena, and b histogram of the encrypted image (IE)

4 Conclusion

We presented an image encryption method with two stages, both of which require only a single password. The experiments and results indicate that the encrypted image resists attacks such as the brute force attack and the statistical attack. It can be concluded that the combined two-stage approach satisfies the security parameters examined and that, on the basis of the experimental results, the proposed technique is suitable for the types of images tested.


Table 1 Experiments, results and security analysis are done based on standard test images like ‘Lena’, ‘Baboon’ and ‘Cameraman’ [17]

5 Future Work

To make the proposed algorithm more secure, we plan to implement a third stage, in which a randomization technique is applied with the help of a unique code generator. To make the algorithm more robust, we also plan to redesign it to include the avalanche effect.

References

1. Rajput, A.S., Mishra, N., Sharma, S.: Towards the growth of image encryption and authentication schemes. In: 2013 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 454–459. IEEE, Aug 2013
2. Kumar, M., Aggarwal, A., Garg, A.: A Review on various digital image encryption techniques and security criteria. Int. J. Comput. Appl. 96(13) (2014)
3. Stallings, W.: Cryptography and Network Security: principles and practices. Pearson Education India (2006)
4. Patel, K.D., Belani, S.: Image encryption using different techniques: a review. Int. J. Emerg. Technol. Adv. Eng. 1(1), 30–34 (2011)
5. Chen, C.S., Chen, R.J.: Image encryption and decryption using SCAN methodology. In: 2006 IEEE Seventh International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT’06), pp. 61–66. IEEE, Dec 2006
6. Sankpal, P.R., Vijaya, P.A.: Image encryption using chaotic maps: a survey. In: 2014 Fifth International Conference on Signal and Image Processing (ICSIP), pp. 102–107. IEEE, Jan 2014
7. Jolfaei, A., Mirghadri, A.: An image encryption approach using chaos and stream cipher. J. Theor. Appl. Inf. Technol. 120–122 (2010)
8. Al-Husainy, M.A.F.: A novel encryption method for image security. Int. J. Secur. Appl. 6(1) (2012)
9. Bibhudendra, A.: Novel methods of generating self-invertible matrix for hill cipher algorithm. Int. J. Secur. 1(1), 14–21 (2006)
10. Acharya, B., Panigrahy, S.K., Patra, S.K., Panda, G.: Image encryption using advanced hill cipher algorithm. Int. J. Recent Trends Eng. (2009)
11. Panigrahy, S.K., Acharya, B., Jena, D.: Image encryption using self-invertible key matrix of hill cipher algorithm. In: International Conference on Advances in Computing (2008)
12. Chand, N., Bhattacharyya, S.: A Novel Approach for Encryption of Text Messages Using PLAY-FAIR Cipher 6 by 6 Matrix with Four Iteration Steps. Int. J. Eng. Sci. Innov. Technol. 3, 2319–5967
13. Srivastava, S.S., Gupta, N.: A novel approach to security using extended playfair cipher. Int. J. Comput. Appl. pp. 0975–8887 (2011)
14. Hans, S., Johari, R., Gautam, V.: An extended Playfair Cipher using rotation and random swap patterns. In: 2014 International Conference on Computer and Communication Technology (ICCCT), pp. 157–160. IEEE, Sept. 2014
15. Basu, S., Ray, U.K.: Modified Playfair Cipher using Rectangular Matrix. In: IJCA, pp. 0975–8887 (2012)
16. Iqbal, Z., Gupta, B., Gola, K.K., Gupta, P.: Enhanced the security of playfair technique using excess 3 code (XS3) and ceasar cipher. Int. J. Comput. Appl. 103(13) (2014)
17. www.mit.edu
18. Nithin, N., Bongale, A.M., Hegde, G.P.: Image Encryption Based on FEAL Algorithm (2013)
19. Dey, S.: SD-EI: a cryptographic technique to encrypt images. In: 2012 International Conference on Cyber Security, Cyber Warfare and Digital Forensic (CyberSec), pp. 28–32. IEEE, June 2012
20. Goyal, P., Sharma, G., Kushwah, S.S.: Network Security: A Survey Paper on Playfair Cipher and its Variants. Int. J. Urban Des. Ubiquitous Comput. 3(1), 9 (2015)
21. Dhenakaran, S.S., Ilayaraja, M.: Extension of Playfair Cipher using 16 × 16 Matrix. Int. J. Comput. Appl. 48(7) (2012)

PPCS-MMDML: Integrated Privacy-Based Approach for Big Data Heterogeneous Image Set Classification

D. Franklin Vinod and V. Vasudevan

Abstract In the current digital world, Big Data and Deep learning are two fast-maturing technologies. Classification algorithms from Deep learning provide key advances in major applications. Deep learning automatically learns hierarchical representations in deep architectures using supervised and unsupervised methods for classification. Image classification is an active research area, and applying it to big data is a great challenge. An analysis of big data shows that the veracity characteristic threatens the privacy requirements of shared data. When data is shared for the feature selection process, both users and databank holders need privacy. Moreover, since the feature selection process influences the performance of a classifier, a privacy-based feature selection process is necessary. In this paper, we propose an integrated technique using the PPCS (privacy-preserving cosine similarity) and MMDML (multi-manifold deep metric learning) algorithms for secure feature selection and efficient classification on cancer image datasets.

Keywords Data privacy · Big data · Deep learning · Image classification

D. Franklin Vinod (✉) · V. Vasudevan
Department of IT, Kalasalingam Academy of Research and Education, Krishnankoil, Tamil Nadu, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_43

1 Introduction

The steady growth in computational power and in the amount of data generated from the web, video cameras, sensors, social media, etc., gave rise to the term Big Data. The model of generating and consuming data has changed. In the past, organizations generated data and everyone else consumed it; now all of us generate data and all of us consume data. Big Data [1] refers to data whose size exceeds the storage, processing and computational capacity of traditional database systems and data analysis methods. Big data is characterized by five 'V' dimensions. Volume represents the large quantity of data: the data volume is increasing exponentially, and many organizations generate data at a rate of terabytes per day. Such data has essential value, so it cannot be discarded. Velocity represents fast-moving data: it is generated quickly and needs to be processed quickly, because late decisions miss opportunities. Variety represents the various formats, types and structures of data. Veracity represents data in doubt: the data may be incomplete, noisy, inconsistent, etc. Value represents the usefulness of the data to a given business community. Among the various issues that these dimensions raise, classification is an important task to be performed on big data. Classification predicts the class of data based on given conditions. The large amount of data collected and stored makes the classification process difficult. Image set classification is used to recognize an object from a set of image instances captured from various viewpoints or under various illuminations. The important issue in image classification is how to model a classifier efficiently and apply it to every image set, since the samples are nonlinear. There are several approaches to image classification. In this paper, we concentrate on image classification using deep learning for cancer datasets. Deep learning [1, 2] automatically learns hierarchical representations in deep architectures using supervised and unsupervised methods for classification. Many deep learning algorithms are designed for unsupervised learning and make efficient use of unlabelled data, which other algorithms cannot do. Also, in big data, the major part of the data is unlabelled.
So, applying deep learning to the classification of big data can provide efficient performance. Real-world data has a large number 'n' of features. The performance of a classification algorithm degrades when it is fed all 'n' features, so selecting appropriate features for the classification algorithm is a challenging process. From the raw feature set, a feature selection algorithm chooses the subset of features necessary for efficient classification. Even though feature selection algorithms can increase classification accuracy, they fail to give sufficient consideration to data privacy. Big data cannot be universally welcomed unless the privacy and security challenges are well addressed [3, 8]. The privacy of a user is violated when querying the databank with an input image to identify similar features reveals intrinsic information about that user, and vice versa. So, privacy protection is a quickly emerging research activity in the big data environment. Even though a few relevant survey/review papers have been published on the elementary concepts of protecting privacy in big data, they leave out many essential points of view [4, 6]. In this paper, we give a solution for protecting privacy in the big data environment using integrated PPCS and MMDML algorithms.


In this paper, we survey the related work in Sect. 2. In Sect. 3, we propose an integrated PPCS-MMDML technique that selects a subset of features securely and provides efficient classification accuracy. The results and discussion of our proposed mechanism are presented in Sect. 4.

2 Related Work

Arandjelovic et al. [5] suggested a statistical approach for face recognition using the concept of manifold density divergence. The authors used the Kullback–Leibler divergence and a flexible mixture model to describe the probability density function of the samples. They also compared their work with state-of-the-art methods such as MSM (Mutual Subspace Method) and simple KLD. The recognition rate produced by this method was only about 94%, which leaves room for improvement. Mehmood et al. [6] pointed out the necessity of big data privacy protection. They analysed different privacy preservation methods for big data and the current demands on the existing mechanisms. They also compared different integrity verification schemes with respect to features and limitations. They examined the privacy demands and issues at every stage of the big data life cycle and discussed the merits and demerits of existing privacy-preserving methods. Sheikhalishahi and Martinelli [7] described a procedure for privacy-aware collaborative information sharing for data classification. They proposed a method to exclude inappropriate sets of features with respect to accuracy and privacy. This method satisfies the data providers' privacy requirements while the data remains valuable for classification. They implemented this work in a distributed environment and used a secure weighted average protocol.

3 PPCS-MMDML

In general, when a user sends an input cancer image to a databank to collect similar features using a feature selection algorithm, each party may obtain extra information about the other. With this extra information, compromising the network and stealing private data become possible. This is a serious threat to the privacy of the users and to the databank's sensitive information. To address this, we propose an integrated technique, PPCS and MMDML, to obtain better classification accuracy on cancer image datasets with privacy. PPCS identifies suitable, high-quality features securely, and MMDML takes the features identified by PPCS as input for efficient classification.

3.1 PPCS Working

Feature selection is a technique that identifies a subset of significant features for constructing a classification model. It discards unnecessary and unrelated features. PPCS [8] performs the same task effectively and securely. The algorithm securely computes the cosine similarity between the features of the user's input image and the features of every image in the databank, without the user and the databank owner sharing their data with each other. The cosine similarity measure lies between −1 and 1. The similarity between the features of the user's input image and those of a databank image is highest when the angle between the two feature vectors is zero. Once all cosine similarities are measured, the features with the highest cosine values are added to an initially empty subset, which is given as input to the MMDML classifier.
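To make the selection criterion concrete, the following is a minimal sketch of plain (non-private) cosine similarity and of picking the most similar databank entries. The function names, the dictionary layout of the databank and the parameter `k` are illustrative assumptions and are not taken from the paper; the privacy-preserving protocol itself is described in Sect. 3.2.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between feature vectors a and b (value in [-1, 1])."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def top_k_similar(query, databank, k):
    """Return the names of the k databank vectors most similar to the query.

    `databank` is a hypothetical dict mapping an image name to its feature vector.
    """
    scored = sorted(
        ((cosine_similarity(query, vec), name) for name, vec in databank.items()),
        reverse=True,
    )
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    databank = {"img1": [1, 0], "img2": [0, 1], "img3": [1, 1]}
    print(top_k_similar([1, 0.1], databank, k=2))
```

Vectors pointing in the same direction score 1, orthogonal vectors score 0, and opposite vectors score −1, which matches the range stated above.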

3.2 PPCS Algorithm

The steps of the cosine similarity computing algorithm are as follows:

UA = user image A, with feature vector $\vec{a} = (a_1, a_2, \ldots, a_n)$
DB = databank image B, with feature vector $\vec{b} = (b_1, b_2, \ldots, b_n)$

Step 1: UA computation
$k_1, k_2, k_3, k_4$ are the given security factors
Identify large prime numbers $\alpha$, $p$ with $|p| = k_1$, $|\alpha| = k_2$
Apply $a_{n+1} = a_{n+2} = 0$
Select a large random number $s \in \mathbb{Z}_p$ and $n + 2$ random numbers $c_x$, $x = 1, 2, \ldots, n + 2$, with $|c_x| = k_3$
For every $a_x$, $x = 1, 2, \ldots, n + 2$

$$C_x = \begin{cases} s\,(a_x \cdot \alpha + c_x) \bmod p, & a_x \neq 0 \\ s \cdot c_x \bmod p, & a_x = 0 \end{cases}$$

End for
Determine $A = \sum_{x=1}^{n} a_x^2$
Preserve $s^{-1} \bmod p$ as secret
Send $(\alpha, p, C_1, \ldots, C_{n+2})$ to DB

Step 2: DB computation (computed for all databank images with respect to the input)
Apply $b_{n+1} = b_{n+2} = 0$
For every $b_x$, $x = 1, 2, \ldots, n + 2$

$$D_x = \begin{cases} b_x \cdot \alpha \cdot C_x \bmod p, & b_x \neq 0 \\ r_x \cdot C_x \bmod p, & b_x = 0 \end{cases}$$

where $r_x$ is a random number with $|r_x| = k_4$
End for
Determine $B = \sum_{x=1}^{n} b_x^2$ and $D = \sum_{x=1}^{n+2} D_x \bmod p$
Send $(B, D)$ to UA

Step 3: UA computation
Determine $E = s^{-1} \cdot D \bmod p$
Determine

$$\vec{a} \cdot \vec{b} = \sum_{x=1}^{n} a_x \cdot b_x = \frac{E - (E \bmod \alpha^2)}{\alpha^2}$$

$$\cos(\vec{a}, \vec{b}) = \frac{\vec{a} \cdot \vec{b}}{\sqrt{A}\,\sqrt{B}}$$

Now, the subset is formed without sharing information between the user and the databank. Hence, there is no possibility of security attack from both sides and the user and databank private data are secure.
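The three steps can be simulated in a single process to check that the user recovers the exact dot product. The sketch below is an illustration under stated assumptions, not the authors' implementation: the primes are two known Mersenne primes chosen for convenience, the bit lengths `K3` and `K4` and the bound `MAX_FEATURE` are hypothetical toy values, features are assumed to be small non-negative integers, and all function names are invented for this example.

```python
import math
import secrets

# Toy parameters chosen only for illustration (assumptions, not the paper's values).
P = 2**521 - 1        # prime modulus p (a Mersenne prime)
ALPHA = 2**127 - 1    # prime alpha (a Mersenne prime)
K3 = 64               # bit length of the user's random masks c_x
K4 = 64               # bit length of the databank's random masks r_x
MAX_FEATURE = 2**16   # features are assumed to be integers in [0, MAX_FEATURE)


def user_step1(a):
    """Step 1 (UA): mask the feature vector and keep s^-1 mod p secret."""
    assert all(0 <= ax < MAX_FEATURE for ax in a)
    padded = list(a) + [0, 0]                  # a_{n+1} = a_{n+2} = 0
    s = secrets.randbelow(P - 2) + 2           # random s in [2, p - 1]
    masked = []
    for ax in padded:
        c = secrets.randbits(K3) | 1           # nonzero random mask c_x
        if ax != 0:
            masked.append(s * (ax * ALPHA + c) % P)
        else:
            masked.append(s * c % P)
    A = sum(ax * ax for ax in a)
    s_inv = pow(s, -1, P)                      # modular inverse, Python 3.8+
    return masked, (s_inv, A)


def databank_step2(b, masked):
    """Step 2 (DB): combine the masked values with the databank vector b."""
    assert all(0 <= bx < MAX_FEATURE for bx in b)
    padded = list(b) + [0, 0]                  # b_{n+1} = b_{n+2} = 0
    D = 0
    for bx, Cx in zip(padded, masked):
        if bx != 0:
            D += bx * ALPHA * Cx
        else:
            r = secrets.randbits(K4) | 1       # nonzero random mask r_x
            D += r * Cx
    B = sum(bx * bx for bx in b)
    return B, D % P


def user_step3(secret, B, D):
    """Step 3 (UA): strip the masks and compute the cosine similarity."""
    s_inv, A = secret
    E = s_inv * D % P
    dot = (E - E % ALPHA**2) // ALPHA**2       # only the alpha^2 terms survive
    return dot, dot / (math.sqrt(A) * math.sqrt(B))


if __name__ == "__main__":
    a, b = [3, 0, 4, 1], [2, 5, 0, 1]
    masked, secret = user_step1(a)
    B, D = databank_step2(b, masked)
    print(user_step3(secret, B, D))
```

The recovery in Step 3 works because every term that carries a random mask stays below $\alpha^2$ in total, while the products $a_x b_x$ are multiplied by $\alpha^2$; the parameter sizes must be chosen so that neither part overflows, which is why the sketch bounds the feature values.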

3.3 MMDML

The suitable features from PPCS are taken as input to the MMDML network. The network consists of L + 1 layers. For a given image $x_{ci}$, the output of the first layer of the MMDML network is given by Eq. (1):

$$h_{ci}^{1} = s\left(W_c^{1} x_{ci} + b_c^{1}\right) \qquad (1)$$

where $W_c^{1}$ is the projection matrix of the respective layer, $b_c^{1}$ is the bias vector to be learned for that layer, and $s$ denotes the component-wise nonlinear activation function. Thus, at the top layer, the network output is given by Eq. (2):


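$$h_{ci}^{L} = s\left(W_c^{L} h_{ci}^{L-1} + b_c^{L}\right) \qquad (2)$$

where $h_{ci}^{L-1}$ represents the output of the layer prior to the output layer. As each image set is considered a manifold, the distance between manifolds must be maximized in order to improve the classification performance. The distance-based criterion given in Eq. (5) is used in MMDML to minimize the intra-manifold distance in Eq. (3) and to maximize the inter-manifold distance in Eq. (4). The intra-manifold distance is given by

$$D_1\left(h_{ci}^{L}\right) = \frac{1}{K_1} \sum_{p=1}^{K_1} \left\| h_{ci}^{L} - h_{cip}^{L} \right\|_2^2 \qquad (3)$$

and the inter-manifold distance by

$$D_2\left(h_{ci}^{L}\right) = \frac{1}{K_2} \sum_{q=1}^{K_2} \left\| h_{ci}^{L} - h_{ciq}^{L} \right\|_2^2 \qquad (4)$$

where $h_{cip}^{L}$ and $h_{ciq}^{L}$ denote the top-layer outputs of the $K_1$ intra-manifold and $K_2$ inter-manifold neighbours of $x_{ci}$. The criterion to be optimized is

$$\min_{f_c} \sum_{i=1}^{N_c} \left( D_1\left(h_{ci}^{L}\right) - D_2\left(h_{ci}^{L}\right) \right) \qquad (5)$$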
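The layer mapping and the distance-based criterion can be sketched in a few lines. This is a minimal illustration, assuming `tanh` as the component-wise activation $s$ and plain Python lists for vectors and matrices; the function names are hypothetical. The objective is written as intra-manifold distance minus inter-manifold distance, so that minimizing it pulls intra-manifold neighbours together and pushes inter-manifold neighbours apart, as the text describes.

```python
import math


def layer(W, b, x):
    """One layer: s(W x + b), with tanh assumed as the activation s."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def forward(layers, x):
    """Apply the layers in order; `layers` is a list of (W, b) pairs."""
    h = x
    for W, b in layers:
        h = layer(W, b, h)
    return h


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_sq_dist(h, neighbours):
    """Average squared distance from h to its neighbours (form of Eqs. 3 and 4)."""
    return sum(sq_dist(h, n) for n in neighbours) / len(neighbours)


def objective(samples):
    """Sum over samples of intra-manifold minus inter-manifold distance.

    `samples` is a list of (h, intra_neighbours, inter_neighbours) triples
    holding top-layer outputs.
    """
    return sum(mean_sq_dist(h, intra) - mean_sq_dist(h, inter)
               for h, intra, inter in samples)


if __name__ == "__main__":
    identity_layer = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
    print(forward([identity_layer], [0.5, -0.5]))
    print(objective([([0, 0], [[1, 0]], [[3, 4]])]))
```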

The MMDML algorithm also uses the stochastic sub-gradient descent algorithm to select the optimized values of $W_c^{l}$, $b_c^{l}$. The algorithmic steps of the optimization are given in [9]. The classification process operates on the testing image set $X_{iq}$, where $i$ represents the image number and $q$ represents the class. A label $L_q$ is assigned according to the distance between the training image set $X_c$ and the testing image set $X_q$, as given in Eq. (6):

$$L_q = \arg\min_{c} \, d\left(X_q, X_c\right) \qquad (6)$$

The distance is calculated as follows. Each testing sample $x_{qj}$ is mapped to its feature space value $h_c(x_{qj})$ by the learned deep network, and its Euclidean distance to every training sample $h_{ci}$ is computed; the smallest of these distances is taken as the distance between the sample $x_{qj}$ and the manifold. These point-to-manifold distances are then averaged, and the average is taken as the distance between the manifolds $X_q$ and $X_c$.
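The classification rule can be sketched as an averaged point-to-manifold distance followed by an arg-min over classes. The sketch assumes that each class has its own learned mapping, represented here by a hypothetical dictionary `embed` from class label to a Python function; the identity function stands in for the learned network in the usage example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manifold_distance(test_feats, train_feats):
    """Average, over test samples, of the distance to the nearest training sample."""
    return sum(min(euclidean(x, h) for h in train_feats)
               for x in test_feats) / len(test_feats)


def classify(test_set, training_sets, embed):
    """Return the class label whose training manifold is closest to the test set.

    `training_sets` maps a class label to its raw training samples, and
    `embed` maps a class label to that class's feature mapping (an assumption
    standing in for the learned class-specific network).
    """
    def distance_to(c):
        test_feats = [embed[c](x) for x in test_set]
        train_feats = [embed[c](x) for x in training_sets[c]]
        return manifold_distance(test_feats, train_feats)

    return min(training_sets, key=distance_to)


if __name__ == "__main__":
    training_sets = {"bone": [[0, 0], [1, 0]], "brain": [[10, 10], [11, 10]]}
    embed = {c: (lambda x: x) for c in training_sets}
    print(classify([[0.5, 0.1], [0.9, 0.2]], training_sets, embed))
```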

3.4 MMDML Algorithm

Input: Image dataset features from PPCS, parameter λ, learning rate μ, and number of iterations T.
Output: $W_c^{l}$ and $b_c^{l}$
Step 1: Initialization of $W_c^{l}$ and $b_c^{l}$ with appropriate values.
Step 2: (Calculation of intra-manifold and inter-manifold)
For t = 1, 2, ..., T do
    Compute the intra-manifold and inter-manifold neighbours.
    For l = 1, 2, ..., L do
        Compute $h_{ci}^{l}$, $h_{cip}^{l}$, $h_{ciq}^{l}$ using the deep networks.
    End
Step 3: Optimization by stochastic sub-gradient descent algorithm
    Compute the best values of $W_c^{l}$ and $b_c^{l}$
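The neighbour computation in Step 2 can be illustrated as a nearest-neighbour search. This is a sketch under assumptions: intra-manifold neighbours are taken to be the `k1` nearest samples of the same class, inter-manifold neighbours the `k2` nearest samples of any other class, and the dictionary layout and function names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def manifold_neighbours(sample, own_class, features, k1, k2):
    """Return (intra, inter) neighbour lists for one sample.

    `features` maps a class label to the list of feature vectors of that
    class; `sample` must be one of the objects stored under `own_class`.
    """
    same = [v for v in features[own_class] if v is not sample]
    other = [v for c, vs in features.items() if c != own_class for v in vs]
    intra = sorted(same, key=lambda v: sq_dist(sample, v))[:k1]
    inter = sorted(other, key=lambda v: sq_dist(sample, v))[:k2]
    return intra, inter


if __name__ == "__main__":
    features = {"a": [[0, 0], [1, 0], [5, 5]], "b": [[0, 2], [9, 9]]}
    print(manifold_neighbours(features["a"][0], "a", features, k1=1, k2=1))
```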

4 Results and Discussion

For experimental evaluation of the proposed PPCS-MMDML algorithm, we apply it to the classification of brain, bone and breast cancer datasets that are available in public repositories [10, 11]. Table 1 compares the running time of traditional MMDML and PPCS-MMDML, and Fig. 1 illustrates the comparison. It shows that the proposed method takes less time than the traditional one on all three datasets. Table 2 compares the classification accuracies of MMDML and PPCS-MMDML on the three datasets. The results indicate that our proposed mechanism provides higher classification rates while preserving privacy.

Fig. 1 Comparison of running time between MMDML and PPCS-MMDML for various datasets (x-axis: dataset; y-axis: running time in ms, × 10^5)

Table 1 Comparison of running time between MMDML and PPCS-MMDML

S. no.   Datasets        Running time (× 10^5 ms)
                         MMDML     PPCS-MMDML
1        Bone cancer     1.7       1.4
2        Brain cancer    2.1       1.8
3        Breast cancer   2.8       2.2

Table 2 Comparison of classification accuracy rates of MMDML and PPCS-MMDML

S. no.   Datasets        Classification accuracy (%)
                         MMDML     PPCS-MMDML
1        Bone cancer     66.5      74.2
2        Brain cancer    68.5      72.6
3        Breast cancer   71.6      77.8
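The relative differences implied by Tables 1 and 2 can be derived with simple arithmetic. The short sketch below only recomputes quantities from the reported values; the variable and function names are illustrative.

```python
# Values copied from Table 1 (running time, x 10^5 ms) and Table 2 (accuracy, %).
# Each tuple is (MMDML, PPCS-MMDML).
RUNNING_TIME = {
    "Bone cancer": (1.7, 1.4),
    "Brain cancer": (2.1, 1.8),
    "Breast cancer": (2.8, 2.2),
}
ACCURACY = {
    "Bone cancer": (66.5, 74.2),
    "Brain cancer": (68.5, 72.6),
    "Breast cancer": (71.6, 77.8),
}


def time_reduction_percent(table):
    """Relative running-time reduction of PPCS-MMDML with respect to MMDML."""
    return {name: round(100 * (base - new) / base, 1)
            for name, (base, new) in table.items()}


def accuracy_gain_points(table):
    """Accuracy gain of PPCS-MMDML over MMDML, in percentage points."""
    return {name: round(new - base, 1) for name, (base, new) in table.items()}


if __name__ == "__main__":
    print(time_reduction_percent(RUNNING_TIME))
    print(accuracy_gain_points(ACCURACY))
```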

5 Conclusion

The PPCS-MMDML algorithms select a suitable feature subset securely and increase the performance of image set classification. In existing deep learning algorithms, classification performance suffers from unsuitable and unnecessary features. In addition, the feature selection process does not take into account the privacy needed between the user and the databank. The analysis on large datasets shows that secure feature selection has a significant role before classification in the big data environment. The proposed work performs well in the big data environment for three cancer datasets. The experimental results show that improved performance is achieved by integrating the privacy-based feature selection algorithm with the classification algorithm in PPCS-MMDML.

References

1. Chen, X.W., Lin, X.: Big data deep learning: challenges and perspectives. IEEE Access, pp. 514–525 (2014)
2. Bengio, Y., Bengio, S.: Modeling high-dimensional discrete data with multi-layer neural networks. In: Proceedings of Advances in Neural Information Processing Systems, pp. 400–406 (2000)
3. Lei, X., et al.: Information security in big data: privacy and data mining. IEEE Access, pp. 1149–1176 (2014)
4. Jain, P., Gyanchandani, M., Khare, N.: Big data privacy: a technological perspective and review. J. Big Data 3(25) (2016). https://doi.org/10.1186/s40537-016-0059-y
5. Arandjelovic, O., Shakhnarovich, G., Fisher, J., Cipolla, R., Darrell, T.: Face recognition with image sets using manifold density divergence. In: CVPR, pp. 581–588 (2005)
6. Mehmood, A., et al.: Protection of big data privacy. IEEE Access, pp. 1827–1834 (2006)
7. Sheikhalishahi, M., Martinelli, F.: Privacy-utility feature selection as a tool in private data classification. In: Omatu, S., Rodríguez, S., Villarrubia, G., Faria, P., Sitek, P., Prieto, J. (eds.) 2017 14th International Conference Distributed Computing and Artificial Intelligence, DCAI. Advances in Intelligent Systems and Computing, vol. 620. Springer, Cham (2018)
8. Vinod, D.F., Vasudevan, V.: A bi-level security mechanism for efficient protection on graphs in online social network. In: Arumugam, S., Bagga, J., Beineke, L., Panda, B. (eds.) Theoretical Computer Science and Discrete Mathematics, ICTCSDM 2016. Lecture Notes in Computer Science, vol. 10398. Springer, Cham (2017)
9. Lu, J., Wang, G., Deng, W., Moulin, P., Zhou, J.: Multi-manifold deep metric learning for image set classification. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, pp. 1137–1145 (2015)
10. OSIRIX. Available: http://www.osirix-viewer.com/resources/dicom-image-library/
11. Suckling, J., Parker, J., Dance, D., Astley, S., Hutt, I., Boggis, C., Ricketts, I., et al.: Mammographic Image Analysis Society (MIAS) database v1.21 [Dataset] (2015). https://www.repository.cam.ac.uk/handle/1810/

Polynomial Time Subgraph Isomorphism Algorithm for Large and Different Kinds of Graphs Rachna Somkunwar and Vinod M. Vaze

Abstract Graphs are used to represent complex structures in pattern recognition and computer vision. In various applications, these complex structures must be classified, recognized, or compared with one another. Except for special classes of graphs, graph matching has exponential complexity in the worst case; however, there are algorithms that show an acceptable execution time as long as the graphs are not too large. In this work, we introduce a new polynomial time algorithm for Subgraph Isomorphism, the COPG algorithm, which is efficient for large and different kinds of graphs. Subgraph Isomorphism is used for deciding whether there exists a copy of a pattern graph in a target graph. The COPG algorithm is based on three phases: Clustering, Optimization, and Path Generation. The performance of the new approach is evaluated on different types of graphs, sizes of graphs, and numbers of graphs. The dataset and test set contain 10,000 graphs and subgraphs with 10,000 nodes. They also contain different kinds of graphs and subgraphs such as Generalized, M2D, M3D, and M4D. The performance of the new approach is compared with the Ullman and VF series algorithms in terms of space and time complexity.

Keywords Graphs · Pattern matching · Subgraph isomorphism · Clustering · Subgraph · Graph isomorphism

R. Somkunwar (✉) · V. M. Vaze
Shri Jagdishprasad Jhabarmal Tibrewala University, Jhunjhunu 333001, Rajasthan, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_44

1 Introduction

Graphs are a universal and powerful data structure, useful in several areas such as physics, chemistry, biology, social networking, molecules, pattern matching, etc. The basic idea of these applications is to check whether the pattern of one graph occurs in another graph, which is exactly subgraph isomorphism, where one pattern represents an input graph and the other represents a target graph. All graphs used in this paper are finite, undirected, simple, and generalized. For some special classes of graphs, such as permutation graphs and interval graphs, Graph Isomorphism can be solved in polynomial time [1, 2]. Subgraph isomorphism is an NP-complete problem. Moreover, by slightly adapting known proofs [3, 4], it can be shown that Subgraph Isomorphism is NP-complete when P and Q are disjoint unions of paths or of complete graphs [5]. Surveys of the scientific literature on Subgraph Isomorphism algorithms used in pattern matching are given in [6–9]. According to these references, the exact matching algorithms proposed so far are based on graph indexing and tree search. To overcome the drawback of memory size, various recent algorithms such as GraphQL, QuickSI, GADDI, SPath, and TurboIso [10–14] are used. The most widely used method to establish a subgraph isomorphism relies on backtracking in a search tree. To prevent the search tree from growing unnecessarily large, different refinement strategies have been proposed, such as the one by Ullman [15], forward-checking and looking-ahead [16], or discrete relaxation [17]. Another method is used in [18, 19], where an association graph is constructed and maximal clique detection is performed to find a possible graph match. In [20], a graph partitioning approach is used to reduce the complexity of the subgraph detection problem. In [21], the authors introduced a similarity-based categorical data clustering technique to address poor results, complexity and high time consumption. In [22], five different kinds of clustering methods are used for each category of industry. The VF2 [23] algorithm uses a DFS strategy, which requires more memory. The VF3 [24] algorithm is proposed for large and dense graphs and is an advanced version of the VF [25] and VF2 algorithms. In this paper, we propose a new algorithm for the problem of verifying subgraph isomorphism from a set of model graphs to an input graph.
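For reference, the classical approach mentioned above, backtracking in a search tree, can be sketched as follows. This is a generic baseline with exponential worst-case running time, not the COPG algorithm proposed in this paper. It assumes undirected simple graphs stored as dictionaries from a node to its neighbour set, checks the non-induced variant of the problem (every pattern edge must map to a target edge), and uses a simple degree test for pruning; all names are illustrative.

```python
def is_subgraph_isomorphic(pattern, target):
    """Return True if `pattern` occurs in `target` (non-induced subgraph isomorphism).

    Both graphs are dicts mapping a node to the set of its neighbours.
    """
    # Match high-degree pattern nodes first so dead ends are found early.
    order = sorted(pattern, key=lambda u: -len(pattern[u]))

    def extend(mapping):
        if len(mapping) == len(order):
            return True                      # every pattern node is mapped
        u = order[len(mapping)]
        used = set(mapping.values())
        for v in target:
            if v in used or len(target[v]) < len(pattern[u]):
                continue                     # already taken, or degree too small
            # Every already-mapped neighbour of u must map to a neighbour of v.
            if all(mapping[w] in target[v] for w in pattern[u] if w in mapping):
                mapping[u] = v
                if extend(mapping):
                    return True
                del mapping[u]               # backtrack
        return False

    return extend({})


if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    square_with_diagonal = {
        "a": {"b", "c", "d"}, "b": {"a", "c"},
        "c": {"a", "b", "d"}, "d": {"a", "c"},
    }
    print(is_subgraph_isomorphic(triangle, square_with_diagonal))
```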

2 The Proposed COPG Algorithm

The COPG algorithm is based on three phases:
Phase 1: Cluster Creation.
Phase 2: Optimization.
Phase 3: Path Generation.

Algorithm 1 COPGAlgo: It uses three phases. Inputs are G1 (subgraph) and G (full graph), G11 is the subgraph structure. The procedure returns the response time.

Algorithm COPGAlgo(G1, G)
1. Call Phase1(G1, G);
2. Call Phase2(G1, G);
3. Call Phase3(G1, G);
4. if (G11 ⊆ G)
5.     return response time
6. return 0

Fig. 1 Execution process of three phases

The execution process of phase 1, phase 2, and phase 3 is shown in Fig. 1. In phase 1, clusters are created for each node of the subgraph G1 using the minimum distance. Each cluster can have at most n matched nodes of the full graph G.

$$\mathrm{MinDist} = \sqrt{(I_2 - I_1)^2 + (O_2 - O_1)^2}$$

Reducing the size of a cluster is called optimization. In phase 2, clusters are optimized so that only closely matched nodes remain in each cluster. For optimization, the relation (R) of the nodes of the clusters is calculated first. R1 and R2 are two different relations, representing the relation between clusters and the relation between subnodes of clusters, respectively.

Algorithm 2 Phase1: Inputs are G1 (subgraph) and G (full graph). The procedure returns clusters (S1, S2, …, Sn).

Algorithm Phase1(G1, G)
1. Count number of nodes in G1 and G
2. for every node of G1 do
3.     find matching nodes in G
4.     create clusters of matching nodes in G
5. return clusters
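A possible reading of Phase 1 is sketched below. The paper does not define the quantities I and O or the exact matching rule, so the sketch makes two loudly stated assumptions: each node is described by a hypothetical pair of numeric descriptors (I, O), and a full-graph node "matches" a subgraph node when its MinDist to that node is the smallest observed. The function names and the dictionary layout are invented for illustration.

```python
import math


def min_dist(p, q):
    """MinDist between two node descriptors p = (I1, O1) and q = (I2, O2)."""
    (i1, o1), (i2, o2) = p, q
    return math.sqrt((i2 - i1) ** 2 + (o2 - o1) ** 2)


def phase1_clusters(sub_nodes, full_nodes):
    """Create one cluster of closest full-graph nodes per subgraph node.

    Both arguments map a node name to its (I, O) descriptor pair; the
    matching rule (keep the nodes at minimum distance) is an assumption.
    """
    clusters = {}
    for u, du in sub_nodes.items():
        dists = {v: min_dist(du, dv) for v, dv in full_nodes.items()}
        best = min(dists.values())
        clusters[u] = [v for v, d in dists.items() if d == best]
    return clusters


if __name__ == "__main__":
    sub = {"s1": (1, 1), "s2": (4, 4)}
    full = {"a": (1, 1), "b": (4, 5), "c": (1, 1)}
    print(phase1_clusters(sub, full))
```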

If R1 and R2 are the same, the related nodes are kept in that particular cluster; otherwise they are discarded from the cluster.

Algorithm 3 Phase2: Inputs are G1 (subgraph) and G (full graph). The procedure returns optimized clusters or the original cluster.

Algorithm Phase2(G1, G)
1. Get all Clusters (S1, S2, …, Sn)
2. For i := S1 to Sn do
3.     R1 := Relation of S1 with other Cluster
4.     R2 := Relation of all nodes of S1 with other Cluster
5.     If (R1 = R2)
6.         return original Cluster
7.     else
8.     {
9.         Optimize Cluster
10.        return Optimized Cluster
11.    }

In phase 3, the optimized clusters are sorted in ascending order and a duplicate copy of them is created for further use. The duplicate copies of the clusters are called clones. The path is generated from the optimized clusters. The path is the subgraph structure G11, which is then matched with the model graphs. If a match is found, the graphs are isomorphic and the total execution time is returned; otherwise the graphs are not isomorphic.

Algorithm 4 Phase3: Inputs are G1 (subgraph) and G (full graph). The procedure returns the subgraph structure G11.

Algorithm Phase3(G1, G)
1. Sort filtered clusters in ascending order
2. Create duplicate copies of sorted clusters // clones
3. Generate Path // Subgraph structure G11
4. return subgraph structure

3 Experimental Setup

To generate the different types of graphs, such as random graphs (general graphs), M2D, M3D, M4D, bounded and valence graphs, a graph generator program is needed that produces the dataset of graphs and subgraphs for testing the subgraph isomorphism algorithm. For the generation of the dataset, we used different C language programs from http://www.mivia.unisa.it. For each experiment, we generated 1000 model graphs with 10,000 nodes. The graphs generated here are undirected and unlabeled. Unlabeled graphs are difficult to handle because they carry no weights and it is difficult to identify an individual node. The COPG algorithm is implemented in C++ on a 32-bit Linux SUSE OS (13.01). The experimental environment is an Intel (R) Core (TM) i3 CPU with 4 cores, 2 GB RAM, a 512 GB HDD, and a 3.20 GHz processor speed.

4 Results and Discussion

All the experiments in this section relate to subgraph isomorphism. To compare the performance with existing algorithms, a number of tests are repeated for large and different types of graphs. The execution is then estimated by determining the total number of basic computation steps that are executed while searching for a subgraph isomorphism. Each test is repeated 10 times for different permutations of the vertices of the input graph with the model graphs. We started our tests with 10 nodes and 15 edges and ended with a graph consisting of 10,000 nodes. Table 1 shows the experimental results for general, M2D, M3D, and M4D graphs, with the total execution time in ms. Execution time varies as the number of nodes varies. For graph size 1000, the general, M2D, M3D, and M4D graphs are found to be isomorphic, with execution times of 6581 ms, 48,336 ms, 7056 ms, and 2654 ms, respectively. Note that VF2 and Ullman report no result (−) for several of the large graphs (600 to 1000 nodes), whereas the COPG algorithm produces a result in every case.

Table 1 Result of subgraph isomorphism using general, M2D, M3D, and M4D graphs

Graph size   VF2 (time in ms)                    Ullman (time in ms)                 COPG algorithm (time in ms)
(nodes)      Gen      M2D      M3D      M4D      Gen      M2D      M3D      M4D      Gen      M2D      M3D      M4D
100          471      1278     233      331      988      15,800   2013     952      223      8017     164      228
200          1490     14,649   723      316      28,232   358,894  955      512      461      10,224   359      203
300          9302     2500     7240     2000     175      30,084   44,614   768      978      7505     1053     778
400          4378     3731     2772     2456     280      36,934   6782     1152     1107     6479     1207     932
500          −        6715     4158     1228     36,934   56,123   9494     1728     1574     8679     1516     1074
600          −        60,455   6237     1842     19,194   −        13,261   2592     2736     52,499   3540     923
700          −        70,896   9355     2763     −        −        18,565   3888     3204     66,774   4726     5200
800          −        90,474   14,035   4144     −        −        25,991   5832     4603     88,487   4473     3455
900          −        67,012   12,533   6216     −        −        36,387   8748     4522     47,242   3963     2595
1000         −        106,747  32,299   9324     −        −        50,640   13,122   6581     48,336   7056     2654

Fig. 2 Comparison between Ullman, VF2 with graph isomorphism algorithm (GIAlgo) for M2D graphs

Figure 2 shows the comparative performance of the Graph Isomorphism Algorithm (GIAlgo) with the Ullman and VF2 algorithms. Figures 3 and 4 show the comparative performance of the proposed COPG algorithm (SGIAlgo) with the Ullman and VF2 algorithms for Generalized, M2D, M3D, and M4D graphs. The time performance of the new approach is found to be better than that of the Ullman and VF2 algorithms.

Fig. 3 Comparison between Ullman, VF2 with subgraph isomorphism algorithm (SGIAlgo) for general and M2D graphs


Fig. 4 Comparison between Ullman, VF2 with subgraph isomorphism algorithm (SGIAlgo) for M3D and M4D graphs

5 Conclusion and Future Scope

In this work, we have introduced a polynomial time subgraph isomorphism algorithm, named COPG, for large and different kinds of graphs. The algorithm is described in detail in Sect. 2. We have performed an extensive experimental evaluation, with a dataset and test set of 1000 graphs with 10,000 nodes for graph and subgraph isomorphism. Our proposed algorithm is also compared with the Ullman and VF series algorithms, and the comparison shows that it is faster than these algorithms. The COPG algorithm works for undirected graphs, unlabeled graphs and different kinds of graphs such as generalized, M2D, M3D, and M4D graphs. In future, this approach can be extended to parallel and multicore systems.

References

1. Colbourn, C.J.: On testing isomorphism of permutation graphs. Networks 11, 13–21 (1981)
2. Lueker, G.S., Booth, K.S.: A linear time algorithm for deciding interval graph isomorphism. J. ACM 183–195 (1979)
3. Garey, M.R., Johnson, D.S.: Computers and Intractability: a guide to the theory of NP-completeness. W.H. Freeman and Company (1979)
4. Damaschke, P.: Induced subgraph isomorphism for cographs is NP-complete. In: WG'90. Lecture Notes in Comput. Sciences, vol. 487, pp. 72–78 (1991)
5. Garey, M.R., Johnson, D.S.: Computers and Intractability: a guide to the theory of NP-Completeness. Freeman (1979)
6. Conte, D., Foggia, P., Sansone, C., Vento, M.: Thirty years of graph matching in Pattern Recognition. Int. J. Pattern Recogn. 18(3), 265–298 (2004)
7. Vento, M.: A long trip in the charming world of graphs for pattern recognition. Pattern Recognit. 1–11 (2014)
8. Foggia, P., Percannella, G., Vento, M.: Graph matching and learning in pattern recognition on the last ten years. Int. J. Pattern Recogn. 28(1) (2014)
9. Livi, L., Rizzi, A.: The graph matching problem. Pattern Anal. Appl. 16(3), 253–283 (2013)
10. He, H., Singh, A.: Graphs-At-A-Time: query language and access methods for graph databases. In: Proceedings of the 2008 ACM SIGMOD International, pp. 405–417 (2008)
11. Shang, H., Zhang, Y., Lin, X., Yu, J.X.: Taming verification hardness: An efficient algorithm for testing subgraph isomorphism. Proc. VLDB Endow. 1(1), 364–375 (2008)
12. Zhang, S., Li, S., Yang, J.: Gaddi: Distance index based subgraph matching in biological networks. In: EDBT, pp. 192–203. ACM (2009)
13. Zhao, P., Han, J.: On graph query optimization in large networks. Proc. VLDB Endow. 3(1–2), 340–351 (2010)
14. Han, W., Lee, J.-h., Lee, J.: Turbo Iso: towards ultrafast and robust subgraph isomorphism search. In: Large Graph Databases, SIGMOD, pp. 337–348 (2013)
15. Ullman, J.R.: An Algorithm for Subgraph Isomorphism. J. Assoc. Comput. Mach. 23(1), 31–42 (1976)
16. Haralick, R.M., Elliot, G.L.: Increasing Tree Search Efficiency for Constraint Satisfaction Problems. Artif. Intell. 14, 263–313 (1980)
17. Kim, W.Y., Kak, A.C.: 3-D Object Recognition Using Bipartite Matching Embedded in Discrete Relaxation. IEEE Trans. Pattern Anal. Mach. Intell. 13, 224–251 (1991)
18. Falkenhainer, B., Forbus, K.D., Gentner, D.: The Structure-Mapping Engine: Algorithms and Examples. Artif. Intell. 41, 1–63 (1990)
19. Myaeng, S.H., Lopez-Lopez, A.: Conceptual graph matching: A Flexible Algorithm and Experiments. Exp. Theor. Artif. Intell. 4, 107–126 (1992)
20. Blake, R.E.: Partitioning Graph Matching with Constraints. Pattern Recognit. 27(3), 439–446 (1994)
21. Narayana, G.S., Vasumathi, D.: An attributes similarity-based K-medoids clustering technique in data mining. Arab. J. Sci. Eng. 1–14 (2017)
22. Bhatnagar, V., Ritanjali, M., Pradyot, R.J.: Comparative performance evaluation of clustering algorithms for grouping manufacturing firms. Arab. J. Sci. Eng. 1–13 (2017)
23. Cordella, L.P., Foggia, P., Sansone, C., Vento, M.: A (sub)graph isomorphism algorithm for matching large graphs. IEEE Trans. Pattern Anal. Mach. Intell. 26(10), 1367–1372 (2004)
24. Carletti, V., et al.: Challenging the time complexity of exact subgraph isomorphism for huge and dense graphs with VF3. IEEE Trans. Pattern Anal. Mach. Intell. (2017)
25. Cordella, L.P., Foggia, P., Sansone, C., Vento, M.: An improved algorithm for matching large graphs. In: 3rd IAPR-TC15 Workshop on GBR, pp. 149–159 (2001)

Study of Different Document Representation Models for Finding Phrase-Based Similarity Preeti Kathiria and Harshal Arolkar

Abstract To find phrase-based similarity among documents, the text data stored within the documents must first be analyzed before any machine learning algorithm is applied. As analysis of textual data is difficult, the text needs to be broken into words or phrases, or converted into a numerical measure. To convert text data into a numerical measure, the well-known bag-of-words with term frequency model or the TF-IDF model can be used. The converted numerical data, words, or phrases are then stored in some form such as a vector, tree, or graph, known as a document representation model. The focus of this paper is to show how different document representation models can store words, phrases, or converted numerical data to find phrase-based similarity. Phrase-based similarity methods make use of word proximity, so they can be used to find syntactic similarities between documents in a corpus. The similarity is calculated based on the frequency of words or the frequency of phrases in sentences. This paper analyzes and compares different representation models on different parameters to find phrase-based similarity.

Keywords Document representation model · Document index graph model · Phrase-based document similarity

P. Kathiria (✉), Nirma University, Ahmedabad, India
H. Arolkar, GLS University, Ahmedabad, India

© Springer Nature Singapore Pte Ltd. 2019. S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_45

1 Introduction

To group similar text documents in a dataset, the similarity among all the text documents must be measured. Groups can then be established from the measured similarities using a threshold value. Text data differs from ordinary numerical data in that the feature set of a text document is the set of words that represent the document. Algorithms designed for numerical features cannot be applied directly to textual features; the words must first be converted into usable numerical measures. The method used for this conversion comes from the field of information retrieval, where the vector space model is used along with cosine similarity to compute similarities between documents in a dataset. The vector space model first requires that documents be represented as a bag-of-words; this representation records the set of unique words in a document and the number of times each appears in it. Weighting methods such as tf–idf are then used in conjunction with cosine similarity to compute the similarity of documents.

Although the vector space model has been used extensively in both research and industrial applications, it has a limitation. In the bag-of-words representation, the document is represented as a set of single terms only, and all similarity is calculated from the occurrence of single terms. The model can be extended to incorporate word sequencing with the help of a matrix of n-grams. But as the document size grows, the n-gram matrix becomes sparse, and it can only be used for fixed-length phrases. Finding similar phrases of any length in a pair of documents and giving them importance is also useful and can lead to more accurate similarity measures, because phrases preserve the order in which the terms appear in the documents. Other document representation models, such as the document index graph and the suffix tree, are therefore needed. In the document index graph model, the document is fragmented into sentences and each sentence is segmented into individual words; the words are the nodes of the graph, and the edges preserve the ordering of the words in the sentence. All the documents can be stored within a single graph, from which similar phrases can be identified easily and further analysis can be done. The suffix tree model uses a trie data structure to store the sentences of all the documents, so that similar patterns among documents can be found.
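The bag-of-words and tf–idf conversion described above can be sketched in a few lines of Python. This is only an illustrative sketch: the two sentences are made-up examples, the helper names `bag_of_words` and `tf_idf` are hypothetical, and the weighting tf · log(N/df) is just one common tf–idf variant, not necessarily the one used in the cited works.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, drop periods, and count each word."""
    return Counter(text.lower().replace(".", "").split())

def tf_idf(bags):
    """Weight each term count by log(N / df), one common tf-idf variant."""
    n_docs = len(bags)
    # Document frequency: in how many bags does each term appear?
    df = Counter(term for bag in bags for term in bag)
    return [{t: tf * math.log(n_docs / df[t]) for t, tf in bag.items()}
            for bag in bags]

# Two made-up one-sentence documents (illustration only).
docs = ["The cat sat on the mat.", "The dog sat on the log."]
bags = [bag_of_words(d) for d in docs]
weights = tf_idf(bags)

print(bags[0])            # word order is lost; only the counts remain
print(weights[0]["cat"])  # a term unique to one document keeps a positive weight
print(weights[0]["the"])  # a term present in every document is weighted down to zero
```

Because each document becomes an unordered count vector, the phrases "cat sat" and "sat cat" are indistinguishable, which is the limitation that motivates the phrase-based models discussed in this paper.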

2 Related Work

Solka [10] gives a brief introduction to the theory and methods of text data mining, including how to extract features from text data, how to build the bag-of-words document representation model from a text document, how to use the vector space model with a bigram proximity matrix to find document similarity, and how to reduce the dimensionality of the matrix. That work used about 1,100 extended abstracts. After preprocessing steps such as stop word removal and stemming, the documents were converted into bigram proximity matrices, which were sparse and high dimensional. From these matrices, an interpoint distance matrix was computed using the cosine similarity measure. Because the matrices were high dimensional, isometric mapping (ISOMAP) was applied to reduce them to six dimensions, and finally clustering was performed. A number of multidimensional scaling methods exist for dimensionality reduction: classical, metric, or nonmetric.

Hammouda and Kamel [2, 3] used the document index graph representation model for similarity computation between documents. They propose storing the text in the form of a document index graph, a new document representation model that enables phrase matching and similarity calculation between documents. A phrase-based similarity measure is devised on the basis of single terms and phrases. Finally, different standard document clustering methods were applied to the constructed similarity matrix, and the quality of the clusters was measured.

The method proposed by Chim and Deng [1] represents the documents in the form of a suffix tree and also gives a soft clustering approach to cluster similar documents together. A suffix tree breaks down a sentence or string into its constituent words or characters, organizing it around the structure of a sentence and making it easier to detect similar patterns. Based on the shared suffixes, the base cluster candidates are identified, and the final clusters are formed using a similarity threshold value.

Li and Chung [6] proposed a method that uses frequent word sequences to cluster similar documents (CFWS). As documents can run to several thousand words, the suffix tree becomes much larger, so the dimension of the documents should be reduced. Also, a document can cover more than one topic, so it should be possible to place it in more than one cluster. To compact a document, they find the frequent word sequences in the documents. A generalized suffix tree (GST) is then built from the documents after their dimension has been reduced to the frequent word sequences. After the GST has been formed, the paper proposes an agglomerative clustering method to form the clusters.
In [7], the same authors proposed another text document clustering algorithm, CFWMS (clustering based on frequent word meaning sequences). They extended the earlier CFWS method to CFWMS by using frequent word meaning sequences to quantify the similarity between documents. Frequent word meaning sequences were found using synonyms and hyponyms/hypernyms provided by the WordNet ontology. The results show that the CFWMS algorithm performed better than the CFWS algorithm.

A method of creating an ontology using WordNet is given by Kathiria and Ahluwalia [5]. In this approach, a naïve Bayes classifier is used to assign labels to documents, and the vector space model is used to calculate similarity against the input document. Clusters are formed on the basis of a threshold on the calculated similarity. Text summarization is also performed to shorten the text, and the important terms are carried over into the ontology. The ontology structure is based on the hierarchy of the lexical relations of each word in the ontology, starting with the domain to which the word belongs, moving to the meaning or definition of the word, and then to the lexical relations of synonyms, hypernyms, and hyponyms.

The performance of the two algorithms used in [1] and [7] was evaluated by Rafi et al. [9]. They ran their experiments on a Windows 7 PC using C# 3.5 on different datasets. The computational requirements of CFWS [7] are much higher than those of STC [1], and using the F-measure criterion, the paper deems the STC method to be superior.

Momin et al. [8] presented an extension of the document index graph model that evaluates weighted phrase-based similarity between web documents, with the clustering algorithm embedded in document index graph construction and phrase matching. The similarity between each cluster and a new document is computed, and the new document is put in the appropriate cluster. The cluster information for that document is also stored in the graph structure, so this small amount of extra storage yields better performance.

3 Classification of Document Representation Model for Finding Phrase-Based Document Similarity Methods

For finding phrase-based document similarity, the document representation models are classified as follows:

• Document similarity based on the phrase by document model using n-grams.
• Phrase-based document similarity based on the document index graph model.
• Phrase-based document similarity based on the suffix tree representation model.

Let us look at each of them in detail, using the following three text documents as an example. For simplicity, each text document comprises a single sentence.

D1 Young boys like to play basketball.
D2 Half of young boys play football.
D3 Almost all boys play basketball.

3.1 Document Similarity Based on Phrase by Document Model Using N-Grams [4]

The term n-gram here refers to a combination of n adjacent words from a sentence. The model is named after the value of n: if n is 1, the model is called a unigram model; if n is 2, a bigram model; and if n is 3, a trigram model. To create a bigram matrix from the above set of documents, stop words such as "to" and "of" are first removed from the documents. Bigrams are then identified from the documents using the bigram proximity matrix as given in [10]. The bigrams for document D1 are (young, boys), (boys, like), (like, play), (play, basketball); for document D2 they are (half, young), (young, boys), (boys, play), (play, football); and for document D3 they are (almost, all), (all, boys), (boys, play), (play, basketball).
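The bigram extraction just described can be sketched as follows. The helper names `tokenize` and `ngrams` are hypothetical, and the stop word list is assumed to contain only the two words named in the text ("to" and "of"); a real system would use a fuller list.

```python
STOP_WORDS = {"to", "of"}  # assumption: only the stop words named in the text

def tokenize(sentence):
    """Lowercase, strip the trailing period, and drop stop words."""
    words = sentence.lower().rstrip(".").split()
    return [w for w in words if w not in STOP_WORDS]

def ngrams(words, n=2):
    """Return all runs of n adjacent words as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

docs = {
    "D1": "Young boys like to play basketball.",
    "D2": "Half of young boys play football.",
    "D3": "Almost all boys play basketball.",
}

bigrams = {doc_id: ngrams(tokenize(text)) for doc_id, text in docs.items()}
for doc_id, grams in bigrams.items():
    print(doc_id, grams)
```

Calling `ngrams(words, n=3)` on the same token lists would give the trigram variant of the model.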

Table 1 Bigram matrix for the documents D1, D2, and D3

Bigrams/documents | D1 | D2 | D3
(young, boys) | 1 | 1 | 0
(boys, like) | 1 | 0 | 0
(like, play) | 1 | 0 | 0
(play, basketball) | 1 | 0 | 1
(half, young) | 0 | 1 | 0
(boys, play) | 0 | 1 | 1
(play, football) | 0 | 1 | 0
(almost, all) | 0 | 0 | 1
(all, boys) | 0 | 0 | 1

Now, these bigrams can be stored in a matrix with bigrams as rows and documents as columns. Table 1 shows the generated matrix. The frequency of occurrence of each bigram is stored against each document in the matrix [4]. Instead of raw term frequencies, term frequency-inverse document frequency weights can also be used in each feature vector. By treating each column of the matrix as a vector, the cosine similarity measure can be applied directly to the generated matrix. Given two document vectors $\vec{d_1}$ and $\vec{d_2}$, the cosine of the angle θ between them is given by

$$\cos\theta = \frac{\vec{d_1} \cdot \vec{d_2}}{|\vec{d_1}|\,|\vec{d_2}|} \tag{1}$$

Here $|\vec{d_1}|$ denotes the norm of the vector $\vec{d_1}$. Larger values of this measure indicate that the documents are more similar, while smaller values indicate less match. The pairwise n-gram matching is performed with an overall time complexity of O(N), where N is the number of unique phrases throughout the document set. The limitation is that the matrix generated by this method is high dimensional and sparse for large documents, so computationally efficient matrix techniques are required to store and analyze it. In addition, the n-gram document representation model only considers fixed-length phrases.
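As a worked example of Eq. (1), the columns of Table 1 can be compared pairwise. This is a minimal sketch in plain Python; the function name `cosine` and the dictionary layout are choices made here for illustration.

```python
import math

# Columns of Table 1: one entry per bigram row, in the table's row order.
vectors = {
    "D1": [1, 1, 1, 1, 0, 0, 0, 0, 0],
    "D2": [1, 0, 0, 0, 1, 1, 1, 0, 0],
    "D3": [0, 0, 0, 1, 0, 1, 0, 1, 1],
}

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors, as in Eq. (1)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_u * norm_v)

for a, b in [("D1", "D2"), ("D1", "D3"), ("D2", "D3")]:
    print(a, b, cosine(vectors[a], vectors[b]))
```

Each document has four bigrams (norm 2) and each pair shares exactly one bigram, so every pairwise similarity is 1/(2 · 2) = 0.25.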

3.2 Phrase-Based Document Similarity Based on Document Index Graph Model [2, 3]

The method proposes storing the text in the form of a document index graph (DIG). Given a set S of words in a sentence, for example "Young boys like to play basketball.", the sentence is segmented into individual words, which are then labeled as vertices v1, …, vn. Edges are formed between consecutive vertices: v1 -> v2, v2 -> v3, and so on until vn − 1 forms an edge with vn. If a word that is already a vertex of the graph is encountered, the graph is traversed to check how far the similarity extends; if a word differs at any point, that word is entered as a new vertex; and if all or some number of contiguous words can be traversed along edges already entered in the graph, a similar phrase is said to be encountered. The graph is built after stripping stop words. A sample DIG for the same set of documents mentioned above, after stop word removal, is shown in Fig. 1.

Fig. 1 Document index graph for the documents D1, D2, and D3

The total number of vertices in the graph equals the total number of unique words in the documents for which the graph is constructed. The matching phrases encountered are maintained in a list L. The phrase-based similarity formula thus becomes

$$sim_p(d_1, d_2) = \frac{\sqrt{\sum_{i=1}^{p} \left[ g(l_i) \cdot \left( f_{i1} w_{i1} + f_{i2} w_{i2} \right) \right]^2}}{\sum_{j} \left( |s_{j1}| \cdot w_{j1} + |s_{j2}| \cdot w_{j2} \right)} \tag{2}$$

where

$$g(l_i) = \left( \frac{|mp_i|}{|s_i|} \right)^{\gamma}$$

and the sum in the numerator runs over the p matching phrases in L. Here $|mp_i|$ is the length of the matching phrase, and $|s_i|$ is the length of the original sentence. The formula rewards longer matching phrases: if two sentences match almost or entirely, the value of $g(l_i)$ is higher. The symbol γ is the sentence fragmentation factor, and its value is always greater than or equal to 1. The paper does not state how to derive the best value for γ; through experimentation, the authors report that γ = 1.2 gives the best results. $f_{i1}$ and $f_{i2}$ are the frequencies of the matching phrase in the two documents; $|s_{j1}|$ and $|s_{j2}|$ are the lengths of the sentences in the documents being considered; and $w_{j1}$, $w_{j2}$ are the weights. The weights are assigned in the following fashion:

(i) If the phrase match occurs in parts of the document such as the title, meta keywords, meta description, or section headings, a high significance is allotted.

(ii) Phrases with terms in bold, italics, hyperlink text, or table captions are awarded a medium level of significance.
(iii) Phrases found in any other part of the document are awarded a low level of significance.

Since the method checks similarity for web documents, allotting significance in this way is possible, because the HTML markup of web documents carries the required structure. The method also calculates single-term similarity, since relying on phrases alone is not enough. The final similarity formula becomes

$$sim(d_1, d_2) = \alpha \cdot sim_p(d_1, d_2) + (1 - \alpha) \cdot sim_t(d_1, d_2) \tag{3}$$

The value of α lies in the interval [0, 1]. Called the similarity blend factor, α determines the weight of the phrase similarity measure relative to the single-term similarity. There are several advantages of this representation model.

(i) Only a single graph needs to be constructed for all documents under consideration. All phrase matching can therefore be done with this one graph, and multiple instances need not be constructed.
(ii) The DIG delivers complete information about full phrase matching between every pair of documents. This gives the amount of overlap between every pair of documents.
(iii) The method moves from representing texts as a bag-of-words to a graph-based format. The graph structure makes it possible to check for similar phrases in the documents. For example, if the text were split into vectors of sentences, checking whether a similar sentence occurs would take O(N) time, where N is the number of sentences in the document, whereas with the graph approach the complexity is O(w), where w is the number of words in the sentence in either document; N is typically much larger than w. This is the time complexity of finding the similar phrases for a single sentence.
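The graph construction and phrase matching described in this section can be sketched as below. This is a simplified illustration, not the implementation of Hammouda and Kamel: the class `DocumentIndexGraph`, its methods, and the stop word list are assumptions made here, and phrases are matched by a separate query over a finished graph rather than incrementally while each document is inserted.

```python
from collections import defaultdict

STOP_WORDS = {"to", "of"}  # assumption: only the stop words named in the text

def tokenize(sentence):
    words = sentence.lower().rstrip(".").split()
    return [w for w in words if w not in STOP_WORDS]

def _contains(longer, shorter):
    """True if `shorter` occurs as a contiguous run inside `longer`."""
    n = len(shorter)
    return any(longer[k:k + n] == shorter for k in range(len(longer) - n + 1))

class DocumentIndexGraph:
    def __init__(self):
        self.vertices = set()  # one vertex per unique word
        # edges[(u, v)][doc_id] -> list of (sentence_index, position_of_u)
        self.edges = defaultdict(lambda: defaultdict(list))
        self.sentences = defaultdict(list)  # doc_id -> tokenized sentences

    def add_document(self, doc_id, sentences):
        for s_idx, sentence in enumerate(sentences):
            words = tokenize(sentence)
            self.sentences[doc_id].append(words)
            self.vertices.update(words)
            for pos in range(len(words) - 1):
                self.edges[(words[pos], words[pos + 1])][doc_id].append((s_idx, pos))

    def _occurrences(self, u, v, doc_id):
        return self.edges.get((u, v), {}).get(doc_id, [])

    def matching_phrases(self, doc_a, doc_b):
        """Maximal word sequences (length >= 2) of doc_a that also occur in doc_b."""
        found = set()
        for words in self.sentences[doc_a]:
            for i in range(len(words) - 1):
                for s_idx, pos in self._occurrences(words[i], words[i + 1], doc_b):
                    end = i + 1  # index in doc_a of the last matched word
                    # Follow consecutive edges while doc_b continues the same run.
                    while (end + 1 < len(words)
                           and (s_idx, pos + end - i)
                           in self._occurrences(words[end], words[end + 1], doc_b)):
                        end += 1
                    found.add(tuple(words[i:end + 1]))
        return {p for p in found
                if not any(p != q and _contains(q, p) for q in found)}

dig = DocumentIndexGraph()
dig.add_document("D1", ["Young boys like to play basketball."])
dig.add_document("D2", ["Half of young boys play football."])
dig.add_document("D3", ["Almost all boys play basketball."])

print(len(dig.vertices))                 # number of unique words
print(dig.matching_phrases("D1", "D2"))
print(dig.matching_phrases("D2", "D3"))
```

All three documents live in the one `dig` object, which mirrors advantage (i) above: no per-pair structure has to be built before two documents can be compared.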

3.3 Phrase-Based Document Similarity Based on Suffix Tree Representation Model [1]

The method proposes a similarity approach that represents the documents in the form of a suffix tree. The suffix tree is built over sentences, with each node constructed by moving a pointer from word to word through a string in the document. A suffix tree for the same set of documents mentioned above is shown in Fig. 2.

Fig. 2 Suffix tree for the documents D1, D2, and D3 [7]

The suffix tree is constructed by moving word by word through each sentence. The first node inserted is the entire sentence, in this instance "Young boys like to play basketball"; the pointer then moves to the next word and inserts "boys like to play basketball", and nodes keep being inserted in this fashion. If a shared prefix is detected, the node is split. For example, the suffix "young boys play football" of the sentence "Half of young boys play football" shares the prefix "young boys" with "Young boys like to play basketball"; in this case, the node is broken into two different nodes, "young boys" and "play football", which together form the path. Each node of the suffix tree stores the length of the word sequence, the documents in which the word sequence occurs, and the number of times the word sequence occurs in each document. For example, the node for "young boys" shows that it occurs in documents 1 and 2, once each. This information from each node is used to calculate tf–idf for the phrases during similarity computation and also during the clustering operation.

Similarity is computed node by node. Each node has a value tf(i, d), which denotes the frequency of the ith term in document d, and df(i), which denotes the number of documents containing the ith term. These are used to calculate the weight of the node for each document that is part of the node:

$$w(i, d) = \left(1 + \log tf(i, d)\right) \cdot \log\left(1 + \frac{N}{df(i)}\right) \tag{4}$$

Here i stands for the phrase, d for the document, and N for the total number of documents. After calculation, these weights are entered into the document vector of each document. The final similarity is the cosine similarity of the two documents d1 and d2:

$$Sim(d_1, d_2) = \frac{\vec{d_1} \cdot \vec{d_2}}{|\vec{d_1}|\,|\vec{d_2}|} \tag{5}$$
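A small worked example of the node weight in Eq. (4), using the "young boys" node discussed above. The function name `node_weight` is hypothetical, the natural logarithm is an assumption (the text does not state the base), and returning 0 for an absent phrase is a convention chosen here.

```python
import math

def node_weight(tf, df, n_docs):
    """Eq. (4): w(i, d) = (1 + log tf(i, d)) * log(1 + N / df(i))."""
    if tf == 0:
        return 0.0  # phrase does not occur in this document
    return (1 + math.log(tf)) * math.log(1 + n_docs / df)

# Node "young boys": occurs once in D1 and once in D2, so df = 2 with N = 3.
w_young_boys = node_weight(tf=1, df=2, n_docs=3)

# The single word "boys" occurs in all three documents, so df = 3.
w_boys = node_weight(tf=1, df=3, n_docs=3)

print(w_young_boys, w_boys)
```

The phrase that occurs in fewer documents receives the larger weight, which is the usual inverse-document-frequency effect; these per-node weights are what fill the document vectors compared in Eq. (5).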

The advantage of using a suffix tree instead of a graph is that similar patterns or substrings among sentences in two documents are also taken into consideration. A suffix tree breaks down a sentence or string into its constituent words or characters, organizing it around the structure of a sentence and making it easier to identify similar patterns. The method constructs the suffix tree in O(mN) time, where m is the number of nodes constructed and N is the number of documents. The method catches common phrases of any length, but it stores high redundancy in the form of suffixes.

Table 2 Comparative analysis of document representation models

Document representation model | Full phrase matching (any length phrase) | Data structure used for implementation | Space complexity | Compaction required | Time complexity for construction of data structure | Time complexity for phrase matching
Phrase by document model using n-grams | Fixed length | Matrix | High | Yes | O(N), where N is the number of unique phrases throughout the documents under consideration | O(N) for literal plagiarism; O(N²) if lexical as well as syntactic changes are considered
DIG model | Dynamic | Graph | Low | No | O(N), where N is the number of new words in the new document | O(N), where N is the number of new words in the new document
Suffix tree model | Dynamic | Trie | High | Yes | For N documents of average length m words: (1) suffix tree algorithm: O(N m²); (2) Ukkonen's algorithm: O(N m); (3) Zamir's algorithm: O(N log N) | O(N²) for all pairwise similarity for N documents
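The word-level suffix structure of this section can be sketched with a plain (uncompressed) trie. This is an illustrative simplification and not the method of [1]: a true suffix tree merges single-child chains into one edge and can be built in linear time, whereas the naive insertion below is quadratic in sentence length. The names `Node`, `insert_suffixes`, and `lookup` are hypothetical.

```python
from collections import defaultdict

class Node:
    def __init__(self):
        self.children = {}                  # next word -> child node
        self.doc_counts = defaultdict(int)  # doc_id -> occurrences of this phrase

def insert_suffixes(root, doc_id, words):
    """Insert every suffix of the sentence, one word per trie level."""
    for start in range(len(words)):
        node = root
        for word in words[start:]:
            node = node.children.setdefault(word, Node())
            node.doc_counts[doc_id] += 1

def lookup(root, phrase):
    """Return {doc_id: count} for a phrase, or {} if it never occurs."""
    node = root
    for word in phrase:
        node = node.children.get(word)
        if node is None:
            return {}
    return dict(node.doc_counts)

docs = {
    "D1": "Young boys like to play basketball.",
    "D2": "Half of young boys play football.",
    "D3": "Almost all boys play basketball.",
}

root = Node()
for doc_id, text in docs.items():
    insert_suffixes(root, doc_id, text.lower().rstrip(".").split())

print(lookup(root, ("young", "boys")))       # documents sharing this phrase
print(lookup(root, ("play", "basketball")))
```

Every suffix is stored separately, so the word "basketball" appears at the end of many different paths; this is the redundancy noted above, and it is why compaction is listed as required for the suffix tree model in Table 2.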

4 Conclusion

As shown in this paper, a document can be represented using different models, such as the phrase by document model using n-grams, the document index graph model, and the suffix tree model. The comparative analysis of these three models on different parameters is presented in Table 2. The analysis shows that the document index graph (DIG) model offers complete information about full phrase matching between every pair of documents. It demands less storage space when new documents introduce only a few new words. The model constructs the graph and performs phrase matching in almost linear time.

References

1. Chim, H., Deng, X.: Efficient phrase-based document similarity for clustering. IEEE Trans. Knowl. Data Eng. 20(9), 1217–1229 (2008)
2. Hammouda, K.M., Kamel, M.S.: Phrase-based document similarity based on an index graph model. In: Proceedings of 2002 IEEE International Conference on Data Mining ICDM 2003, pp. 203–210. IEEE (2002)
3. Hammouda, K.M., Kamel, M.S.: Document similarity using a phrase indexing graph model. Knowl. Inf. Syst. 6(6), 710–727 (2004)
4. Hussein, A.S.: Visualizing document similarity using n-grams and latent semantic analysis. In: SAI Computing Conference (SAI), pp. 269–279. IEEE (2016)
5. Kathiria, P., Ahluwalia, S.: A Naive method for ontology construction. Int. J. Soft Comput. Artif. Intelligen. Appl. (IJSCAI) 5(1), 53–62 (2016)
6. Li, Y., Chung, S.M.: Text document clustering based on frequent word sequences. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 293–294. ACM (2005)
7. Li, Y., Chung, S.M., Holt, J.D.: Text document clustering based on frequent word meaning sequences. Data Knowl. Eng. 64(1), 381–404 (2008)
8. Momin, B.F., Kulkarni, P.J., Chaudhari, A.: Web document clustering using document index graph. In: International Conference on Advanced Computing and Communications, 2006. ADCOM 2006, pp. 32–37. IEEE (2006)
9. Rafi, M., Maujood, M., Fazal, M.M., Ali, S.M.: A comparison of two suffix tree-based document clustering algorithms. In: International Conference on Information and Emerging Technologies (ICIET), pp. 1–5. IEEE (2010)
10. Solka, J.L.: Text data mining: theory and methods. Stat. Surv. 2, 94–112 (2008)

Predicting Consumer’s Complaint Behavior in Telecom Service: An Empirical Study of India, Sri Lanka, and Bangladesh

Amandeep Singh and P. Vigneswara Ilavarasan

Abstract This article tests a predictive model of Consumer Complaint Behavior (CCB) in the telecom industry of three countries (India, Sri Lanka, and Bangladesh). It uses part of a dataset that explores the service efficiency of electricity, telecom, and government services. The data were collected from microentrepreneurs. Logistic regression was used to validate the model, which included factors covering business characteristics, telecom service characteristics, complaining behavior, and demographic details.

Keywords Customer complaint behavior · Predictive model · Logistic regression · Telecom · India · Sri Lanka and Bangladesh

A. Singh (✉), ABV-Indian Institute of Information Technology and Management, Gwalior, Gwalior 474015, Madhya Pradesh, India
P. V. Ilavarasan, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, Delhi, India

© Springer Nature Singapore Pte Ltd. 2019. S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_46

1 Introduction

In the highly competitive environment of the telecommunications industry, companies are focusing more and more on retaining their existing customers. They aim to increase customer satisfaction in order to make their customer base loyal. Loyal customers stay longer with the company and are a good source of revenue. A dissatisfied customer, on the other hand, not only switches to another company but also spreads negative word of mouth, hurting the firm's reputation. In service industries, customer dissatisfaction has a large negative impact on a company's profitability [2]. It is clear that the higher the customer satisfaction, the higher the customer loyalty (or customer retention). The goal of telecom companies should therefore be to achieve maximum customer satisfaction by avoiding service failures and properly addressing customer complaints.

2 Literature Review

There have been many studies in the field of customer complaint behavior. According to [3], management can learn whether customers are satisfied from their exit and voice. Exit refers to customers ceasing to use the company's product or service, and voice refers to customers' complaints. Different machine learning and data mining approaches have been used to understand CCB in the telecom industry [3]. In [4], the Chinese mobile phone industry is studied to establish the relationship between customer satisfaction, complaints, and loyalty; the study concludes that direct complaints, if addressed properly, can lead to customer satisfaction. In [1], social influences are found to have a positive impact on the churn rate of individuals, and the social leaders of groups can influence the churn of an entire group. Diaz-Aviles et al. [5] identified the relationship between the type of problem, the mobile network provider, and the tendency to complain. In [6], real-time telecom customer data related to mobile internet usage (download rate, retransmitted packets, etc.) and customer care calls are analyzed to identify good and bad experiences and the related complaining behavior.

3 Methodology

For this model, we used secondary data from a project [8] undertaken by LIRNEasia, a think tank based in Colombo, Sri Lanka. The project aimed to understand the delivery of government services (business registration, electricity) to microentrepreneurs (referred to as MEs hereafter) in regions of good weak governance and then to compare it with telecom service delivery in the same regions. The data were collected from bottom of the pyramid (BoP) microentrepreneurs in three countries: India, Sri Lanka, and Bangladesh.

3.1 Sample Description

The sample size of the original study was 3180, out of which the current project extracted 1101 respondents for study: MEs who possess mobile phones and are either complainers or non-complainers of telecom services. The following data were used for analysis:

• Status of business: Data related to business location, business type, whether the ME prefers communicating with customers and suppliers in person or over the phone, location of customers, profit investment by the ME, and growth status of the business.
• Telecom company info: Data related to the total number of SIM cards possessed by the ME, connection type, monthly recharge, and whether the user has changed primary service provider over the past years.
• Mobile device information: Data related to the availability and use of mobile features such as touch screen, camera, mobile Internet, email, dual SIM, MMS, Google Maps, SMS, games, and social media apps.
• Complaining behavior in other scenarios: Data regarding customer complaints to the electricity department and about business registration.
• Demography: Information regarding gender, age, education level, and marital status, and also whether the user is a bank account holder, the only earner of the family, etc.

3.2 Predictive Model Using Logistic Regression

We have used the following independent variables:

(1) Connection type: whether the ME has a prepaid or postpaid connection
(2) Bank account holder: whether the ME has a bank account
(3) Change in service provider: whether the ME has switched to another network in recent times
(4) Extra services use: how many extra mobile services are used [5]
(5) Gender
(6) Age
(7) Education/Profession
(8) Number of SIMs
(9) Service provider
(10) Total recharge per week
(11) Number of recharges in a month
(12) Monthly expense on the mobile phone
(13) Bill receive: how the customer gets bills, i.e., paper or digital invoices
(14) Bill payment: mode of payment, in person or by online transaction
(15) Nature of mobile phone [7]
(16) Satisfaction level of customers: in the questionnaire, respondents were asked different types of questions related to the telecom operator's service and whether they are satisfied or not
(17) Nature of problem: billing problem, call drop, or coverage problem
(18) Complaint handled at the service center
(19) Contact with the telecom provider: via email, call, SMS, online chat, letters, etc.; this variable is also included in the model

Please refer to the Appendix for further details.
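For reference, binary logistic regression models the probability of the outcome as a logistic function of a linear combination of the predictors. The general textbook form is sketched below; the symbols are notation introduced here (x_1, …, x_k for the independent variables listed above, β_0, …, β_k for the fitted coefficients) and are not taken from the study.

```latex
P(\text{complainer} = 1 \mid x_1, \ldots, x_k)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}},
\qquad
\ln\frac{P}{1 - P} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
```

In this form, exponentiating a coefficient, e^{β_i}, gives the multiplicative change in the odds of complaining for a one-unit increase in x_i.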

3.3 Revised Model

To find a structure that offered a better fit to the data and that might also overcome some of the shortcomings of the initial model, we created a revised model on the basis of multiple correlations. For the revised model, we shortlisted 15 variables on the basis of the Pearson correlation coefficient, keeping variables that correlate significantly with other variables, i.e., with a p-value of 0.05 or less. The following variables were selected:

• Number of SIMs: has a significant correlation with the other variables. A larger number of SIMs indicates that the customer uses the service more.
• Complaint not attended: a clear indicator of customer satisfaction. If the telecom operator does not attend to a complaint properly, the customer may switch to another operator. This variable has a positive correlation with complaint tendency.
• Change in service provider: dissatisfied customers change their telecom operator. This dissatisfaction arises when the telecom operator does not handle complaints properly. This variable has a positive correlation with complaint tendency.
• Age: has a negative correlation with complaint tendency, meaning that older people complain less than younger people.
• Bank account holder: has a positive correlation with complaint tendency; people who have a bank account are more likely to complain than customers who do not.
• Formal education: has a negative correlation with complaint tendency, showing that people with more education complain less.
• Nature of mobile phone: has a positive correlation with complaint tendency; smartphone owners complain more.
• Customer satisfaction: has a positive correlation.
• Number of recharges per month: has a negative correlation with complaint tendency.
• Direct contact: has a negative correlation.
• Contact through call center: customers who contact the telecom operator through the call center have a higher tendency to complain.
• Coverage problem and call drop problem: have a negative correlation with complaint tendency.
• Gender: has a positive correlation.
• Private telecom company: has a positive correlation.

After selecting these variables, we entered them into the logistic regression with complaint tendency as the dependent variable, using SPSS. The Forward LR method was used. After multiple iterations, five independent variables were retained (Fig. 1).
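The correlation screening step can be illustrated with a small sketch. The eight data points below are made up purely for illustration and are not from the LIRNEasia dataset; the function names are hypothetical. The sketch computes the Pearson coefficient and the associated t statistic, which is then compared against a t distribution with n − 2 degrees of freedom to obtain the p-value (that lookup is left out here because it needs a statistics package).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical toy data: 1 = yes, 0 = no (NOT taken from the study).
complained       = [1, 1, 0, 0, 1, 0, 1, 0]
has_bank_account = [1, 1, 0, 0, 1, 1, 1, 0]

r = pearson_r(has_bank_account, complained)
t = t_statistic(r, len(complained))
print(f"r = {r:.3f}, t = {t:.3f}")
```

A positive r here would correspond to the "positive correlation with complaint tendency" wording used in the list above; a variable would be shortlisted only if the resulting p-value is 0.05 or less.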


Fig. 1 Final model to predict customer complaint behavior

4 Result 4.1

Final Variables Statistic Result

In the study, 1101 records are considered for the model creation. While data cleaning process, dependent variable complaint tendency was already re-coded as Non-Complainer-0 and Complainer-1. These same values are used by SPSS as well without changing them. Method-Entry is used for logistic regression. In this method, all the predictive variables are added at once, and then the final output is shown in Table 1. In our Table 1 Results of logistic regression to predict complaint behavior of customers

Variable

Wald statistics

Significance

Odds

Call drop Coverage Complaint not attend Bank account holder Education

32.348 18.530 15.025

0.000 0.000 0.000

0.432 0.340 2.731

28.073

0.000

2.125

9.016

0.003

0.785

470

A. Singh and P. V. Ilavarasan

Table 2 Predicted versus observed complaining behavior

| Observed: Have you complained about these problem to the telecom service provider | Predicted: No | Predicted: Yes | Percentage (%) |
|---|---|---|---|
| No | 330 | 156 | 67.9 |
| Yes | 194 | 243 | 55.6 |
| Overall percentage | | | 62.1 |

In our model, R²CS = 0.578, which suggests a moderate model. However, R²N = 0.780 suggests that the model has good predictive capability. The model predicted complaining behavior correctly in 62.1% of cases (Table 2).
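The percentages in Table 2 can be rechecked from the raw counts. The sketch below is only an illustration: it assumes the four counts are laid out with observed classes as rows and predicted classes as columns, and the function names are hypothetical rather than taken from the study.

```python
# Counts from Table 2, assuming rows are observed classes and columns are predicted classes.
confusion = {
    "No": {"No": 330, "Yes": 156},
    "Yes": {"No": 194, "Yes": 243},
}


def class_accuracy(matrix, label):
    """Share of observed `label` cases that were also predicted as `label`."""
    row = matrix[label]
    return row[label] / sum(row.values())


def overall_accuracy(matrix):
    """Share of all cases that lie on the diagonal (correct predictions)."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


pct_no = round(100 * class_accuracy(confusion, "No"), 1)
pct_yes = round(100 * class_accuracy(confusion, "Yes"), 1)
pct_all = round(100 * overall_accuracy(confusion), 1)
print(pct_no, pct_yes, pct_all)
```

Under that layout, the three computed values agree with the percentage column reported in Table 2.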

5 Conclusions

The study showed that customers with a bank account complain more than non-holders. Also, those who have complained about banking services have also complained about telecom services. Less educated customers complain more; unlike educated customers, they might take time to exit. Call drop and coverage-related issues are the main problems that trigger complaining behavior. In the era of Big Data, information about customers is easily available to companies, and identifying the above factors that predict consumer complaint behavior can be very useful to telecom companies. Using it, they can handle customer complaints better and more promptly. They can reduce the number of complaints by closely monitoring the complainers and resolving their issues even before they complain. This will increase the level of customer satisfaction and give the company an edge in this competitive market.


Appendix

| Category | Question | Coding |
|---|---|---|
| ICT access and use | Please tell me how many active mobile SIM cards/connections you have in total that you regularly use | Change in number |
| | Is your mobile phone connection prepaid | 1 = No |
| | What is average total recharge per week | Change in number |
| | How many times do you recharge in a month | Change in number |
| | What is your average total monthly expense for the mobile phone | Change in number |
| | How do you get your bills | 1 = Hard Copy, 2 = SMS, 3 = Email, 4 = Check online Bill, 5 = I don’t Know |
| | How do you normally make payments | 1 = Through dealer, 2 = Through registered office, 3 = I pay amount to 3rd party, 4 = I give money to my family/friend, 5 = Online payment, 6 = other |
| CRM in telecom | Number of service used | Change in number |
| | Who is Telecom Service Provider (Dummy-Public Company) | 1 = No |
| | Who is Telecom Service Provider (Dummy-Private Company) | 1 = No |
| | Smartphone availability | Change in number |
| | Have any of your major complaints not been attended by the service provider | 1 = No |
| | Did you change your primary telecom service provider in the last year | 1 = No |
| | How do you normally contact your service provider (Dummy Variable-Direct Contact) | 1 = Other |
| | How do you normally contact your service provider (Dummy Variable-Direct Contact) | 1 = Other |
| | Are you satisfied with the services provided to you by telecom operator? | Ranking (0–11) |
| | Please tell me the mobile phone-related problem you have faced (Dummy Variable-Billing Problem) | 1 = No |
| | Please tell me the mobile phone-related problem you have faced (Dummy Variable-Call Drop) | 1 = No |
| | Please tell me the mobile phone-related problem you have faced (Dummy Variable-Coverage) | 1 = No |
| | Gender | |
| | Do you have any bank account in your name | 1 = No |
| | How much highest formal education have you had | 1 = Illiterate, 6 = Master’s Degree |


Computational Intelligence in Embedded System Design: A Review

Jonti Talukdar, Bhavana Mehta and Sachin Gajjar

Abstract

A system whose elementary function is not computation but which is controlled by a computational system (microprocessor, microcontroller, digital system processors, custom-made hardware with dedicated software) embedded within it is referred to as an Embedded System (ES). These systems are used in many applications such as consumer electronics, business and office equipment, communication systems, automobiles, industrial control, and medical systems. Over the years, ES have gone through a radical renovation from traditional single-function systems to a novel class of Intelligent Embedded Systems (IES), which are flexible and offer an improved consumer experience. The resurgence of Computational Intelligence (CI) paradigms has led to the design of IES that use adaptive mechanisms to exhibit intelligent behavior in multifaceted and dynamic real-world environments. CI offers flexibility, independent behavior, and robustness against changing real-world environments and communication failures. However, ES designers are generally unaware of the prospective CI paradigms, challenges, and opportunities available in the literature. This gap makes it difficult to adopt and expand the use of CI paradigms in ES design. This paper aims to fill this gap and nurture collaboration by providing a detailed introduction to ES and their characteristics. An extensive survey of CI applications as well as enabling technologies for IES design is presented and will serve as a guide for using CI paradigms in the design of IES.



Keywords Intelligent embedded systems ⋅ Computational intelligence paradigms ⋅ Embedded systems design ⋅ Soft computing



J. Talukdar (✉) ⋅ B. Mehta ⋅ S. Gajjar
Department of Electronics and Communication Engineering, Institute of Technology, Nirma University, Ahmedabad, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_47


1 Introduction

An Embedded System is a computing system designed to perform a dedicated or narrow range of functions with a minimal amount of human intervention. Today, advanced standards in device networking, ubiquitous access to the Internet, miniaturization of processors due to advances in VLSI technology, reduction in power consumption due to CMOS technology, availability of simulators and emulators for system design, and smart algorithms for intelligent learning and decision-making have taken the field of Embedded System design to a new generation of Intelligent Embedded Systems (IES) [1, 2]. Computational Intelligence (CI) paradigms consist of adaptive mechanisms that facilitate intelligent behavior in a complex, uncertain, and changing environment. These paradigms mimic nature to solve complex problems and exhibit an ability to learn or adapt to new situations, to generalize, abstract, discover, and associate [3]. Inspired by biology and nature, some of these paradigms are fuzzy logic, artificial neural networks, evolutionary computing, and reinforcement learning. However, due to the fundamental difference between desktop and embedded systems, many paradigms need to be reviewed to determine how efficiently they can be utilized for embedded systems design. Thus, the objective of this paper is to provide prospective IES designers with an insight into CI paradigms as well as their enabling technologies. The rest of the paper is organized as follows: Sect. 2 discusses some design metrics necessary for IES design, Sect. 3 discusses the enabling technologies, and Sect. 4 gives a brief overview of the CI paradigms widely used in IES design. Section 5 concludes the paper.

2 Design Metrics for Intelligent Embedded Systems

Design of IES is in many ways far more challenging than that of personal computers. Specifically, the system must fulfill the desired functionality while the implementation optimizes numerous design constraints. The overall design of IES is ultimately based on the evaluation of cost versus benefit tradeoffs, which depends on the heterogeneity of complex design environments. This necessitates the use of automated design tools and smart algorithms to understand the diversity of options available to designers. Hence, design automation for IES needs to follow a model-based approach with flexible and modular design blocks for subsystem-level optimization [4]. When used in combination with hybrid optimization algorithms, this allows exploration of the multi-objective design space to evaluate the optimum design criteria. Figure 1 shows typical multi-objective design space exploration using model-based design methodology, where instead of using a standard stepwise design format, designers utilize dynamic objects from a repository and tailor the system architecture based on the design specifications. Since no single design solution exists, model-based design methodology allows designers to obtain different system specifications with different cost versus performance tradeoffs and to select the solution that provides the best tradeoff in the design space. The set of such best-tradeoff solutions is known as the Pareto-optimal front [4].

Fig. 1 Model-based IES Design. Designers choose appropriate data models (architecture requirements) from the repository to test and simulate in a hardware–software co-design environment

A design metric is a quantifiable feature of a system’s implementation [5]. Common relevant metrics for ES design are discussed below:
• Power: The energy consumed by the system in unit time. It determines the lifetime of battery-based systems.
• Performance: The execution time of the system on standard instruction sets.
• Fault Tolerance: The ability of the system to work correctly under extreme PVT (process, voltage, and temperature) conditions. It is directly related to noise sensitivity.
• Time Criticality: Required for the design of real-time systems, which work with optimum scheduling and deadlines associated with their tasks.
• Precision: A check on the system’s functionality in terms of its degree of accuracy.
• Physical Size: The physical space occupied by the system. It is strongly correlated with the physical scaling (FET channel length) and the memory elements in the system.
• Flexibility: The ability to change the system functionality based on the application requirements at low cost.
• Time-to-market: The amount of time required to design, manufacture, and put the system on the market for customers.
• Safety: The assurance that the system will not cause harm to the environment or to the individuals using it.
• Reactiveness: The ability of the system to react to changes in the external environment.

The next section discusses the enabling technologies that facilitate the implementation of IES and are used by designers to achieve these objectives.
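The notion of a Pareto-optimal front over competing metrics can be made concrete with a small sketch. It assumes only two metrics, power and execution time, both to be minimized; the candidate designs and their numbers are hypothetical and serve purely as an illustration.

```python
from typing import List, Tuple

# (name, power in mW, execution time in ms); lower is better for both metrics.
Design = Tuple[str, float, float]


def dominates(a: Design, b: Design) -> bool:
    """a dominates b if it is no worse on both metrics and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep every design that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical candidate architectures.
candidates = [
    ("A", 120.0, 4.0),
    ("B", 80.0, 6.5),
    ("C", 150.0, 3.5),
    ("D", 130.0, 6.0),  # worse than A on both metrics
    ("E", 60.0, 9.0),
]

front = pareto_front(candidates)
print([d[0] for d in front])
```

Design D drops out because A is better on both metrics; the remaining designs each represent a different power versus performance tradeoff, and choosing among them is left to the designer.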

3 Enabling Technologies

A key factor when deciding whether to use computationally intelligent models in the design of IES is their applicability in reducing human intervention during regular operation. Biologically inspired CI paradigms lead to better and highly dependable systems which are autonomous, resource efficient, and self-organizing, and which require less supervision and maintenance [2]. CI-based systems thus exhibit the following key characteristics, making them highly desirable in the field:
1. Autonomy: Intelligent systems have the ability to execute time-critical tasks with reduced human intervention and interaction.
2. Dependability: Intelligent systems exhibit a self-stabilizing property when operating in dynamic situations, with minimal performance loss or breakdown.
3. Efficiency: CI paradigms provide highly flexible and adaptable solutions which increase utility and performance metrics through resource optimization.
4. Pro-activeness: Smart systems not only react to a given stimulus but also exhibit goal-oriented behavior by displaying initiative and adapting to multiple possible actions through informed decisions.
5. Modeling: Intelligent systems solve NP-hard problems using generic self-organizing techniques, thus liberating human designers from modeling and implementation issues, shortening design cycles, and accelerating deployment.
6. Self-awareness: Systems incorporating CI paradigms possess different forms of self-monitoring abilities, enabling them to monitor real-time health, detect anomalies, cope with attacks, and detect faults in a real-time environment [6].
7. Low Maintenance: CI-based systems are self-learning in nature and require less frequent servicing, enabling unsupervised operation over a long time.

Several techniques such as smart CI methodologies, real-time optimization, efficient reconfigurable hardware architectures, and distributed cloud computing have made IES design possible. An analysis of these techniques is presented below.

3.1 Soft Computing Technologies

Soft computing techniques are a special set of intelligent algorithms that exploit the tolerance for uncertainty, randomness, and imprecision inherent in dynamic situations to attain distinct and explicit solutions which are robust to error [2]. Embedded systems consist of several sensor nodes which act as key interfaces between the analog and digital domains and which are difficult to model digitally. Hence, soft computing plays an important role in introducing additional information to increase the degree of confidence and accuracy of such borderline systems. Such techniques are discussed below.

Sensor Fusion: This is the process of combining and processing primary sensory data, or secondary data derived from primary data, to produce enhanced representations of the overall process environment [7]. It enhances the spatiotemporal coverage of data, reducing ambiguity and uncertainty and thereby making the system robust to interference. Sensor fusion enhances the dependability of real-time systems by providing an extended view of the process environment through a combination of multiple sensory nodes, redundant data, and smart fusion algorithms. The majority of sensor-fusion algorithms are based on smoothing, Kalman filtering, and inference methods [7] and have potential applications in real-time process control for IES.

Neuro-Fuzzy Systems: Although fuzzy systems work well in creating an inference method for the implementation of decision and control algorithms, especially with imprecise sensor data, they lack effective mechanisms to learn from real-time data and are difficult to auto-tune [8]. Their amalgamation with neural networks overcomes this limitation by providing a self-organizing internal structure which can be trained on real-time data using if-then rules, allowing real-time IES design.

Multi-agent Systems (MASs): MASs consist of an interconnection of several widely distributed agents which act independently to achieve the same overall objective within the network [2]. All agents operate autonomously in a swarm network and proactively exhibit goal-directed behavior. New event-triggered strategies for decision-making in MASs [9] allow them to function beyond the capacity of a similar single monolithic integrated system. The key benefit of using MASs for IES design is the ability to design a system consisting of an ensemble of multifunctional CI-based submodules working together in a distributed environment. Moreover, MASs also enhance overall performance through parallelism, and reliability through modular redundancy and design flexibility. More details on using ensemble CI designs are discussed in Sect. 4.
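As an illustration of the Kalman-filtering style of sensor fusion mentioned above, the following minimal sketch fuses two noisy sensors that measure the same quantity. It assumes a static scalar state with no process noise, and the temperature value, noise variances, and function names are all hypothetical; practical fusion algorithms are considerably more elaborate.

```python
import random


def kalman_update(estimate, variance, measurement, meas_variance):
    """One scalar Kalman measurement update; returns the new (estimate, variance)."""
    gain = variance / (variance + meas_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


def fuse(readings, sensor_variances, prior=0.0, prior_variance=1e6):
    """Fuse time-ordered readings; each item holds one value per sensor."""
    estimate, variance = prior, prior_variance
    for sample in readings:
        for value, meas_var in zip(sample, sensor_variances):
            estimate, variance = kalman_update(estimate, variance, value, meas_var)
    return estimate, variance


# Hypothetical setup: two temperature sensors with different noise levels.
random.seed(0)
true_temp = 25.0
variances = (4.0, 1.0)
readings = [
    tuple(random.gauss(true_temp, v ** 0.5) for v in variances)
    for _ in range(200)
]

estimate, variance = fuse(readings, variances)
print(round(estimate, 2), variance)
```

The fused variance ends up far below that of either individual sensor, which is the sense in which redundant data reduces uncertainty.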

3.2 Cloud Computing and IoT

The development of an integrated cloud framework for embedded environments has great potential for providing end users with simple, flexible, and cost-effective access to pre- and post-processing cloud infrastructure without the need to install additional compute architecture. The integration of embedded systems with cloud computing can be facilitated at three different levels, described below [10]:

Infrastructure as a Service: This model provides all the hardware infrastructure necessary for fully functional deployment to the cloud, which includes server space, storage, and network peripherals, depending on design requirements.

Platform as a Service: This model provides a functionally tested and integrated development environment to deploy IES, in addition to the hardware infrastructure.

Software as a Service: This is the topmost layer, which provides a complete software solution as a service.

Migrating certain control tasks to the cloud not only reduces the cost of hardware but also allows easier and faster resource sharing among related services. It allows the subsystems to be functionally independent and improves the maintenance of individual blocks, thus improving the overall life cycle of the system. Since the majority of smart CI paradigms require large compute infrastructure to be functionally useful, shifting such processes to the cloud and deploying them centrally saves compute power and increases the real-time performance of IES. In addition, the convergence of smart embedded devices with IoT [11] allows designers to deploy sensor data to the cloud, giving them access to advanced cost-effective APIs provided by third parties to develop a holistic system tailored to end-user needs. The development of integrated modular architectures [12] targeted specifically at embedded systems promises an optimized implementation of time- and mission-critical functions due to the virtualization of computing and networking resources. This approach has grown popular among system architects as it allows the development of more integrated systems which can host critical and noncritical functions, support heterogeneous networks, facilitate robust resource partitioning, and host a wide variety of diverse processes with variable timing constraints. With the integration of sensor management, IoT, and cloud computing infrastructure for IES design under a single umbrella, security and timing constraints remain the key challenges in the overall implementation of end-to-end CI-based distributed systems.

3.3 Hardware and Verification Technologies

The development of CI paradigms in the context of IES design can be implemented at all levels of abstraction, from block-level behavioral modeling of individual subsystems to RTL-level design. The resurgence of dynamically reconfigurable hardware [13] has led to the development of new and better approaches to hardware/software partitioning with better on-chip transistor utilization. Recent reconfigurable embedded architectures [13] consist not only of a reconfigurable coprocessor coupled closely with a general-purpose CPU, but also of a reconfigurable routing and data path coupled with the memory hierarchy, leading to significant computational speedups, design reusability, and greater potential for Instruction-Level Parallelism (ILP). This also increases the flexibility of application for the same design cost, allowing effective implementation of a wide variety of CI paradigms. In addition to dynamic reconfigurability, specialized architectures based on neuromorphic computing and Spiking Neural Networks (SNNs) [14] are also being implemented on resource-constrained embedded platforms due to their high power efficiency and compute capability, allowing easier deployment of resource-intensive CI paradigms such as neural networks, fuzzy logic, and genetic algorithms. Moreover, the development of Spiking Neuromorphic Engines (SNEs) [14] provides a better alternative to traditional neural networks by deploying fast and power-efficient neuromorphic architecture for embedded cognitive applications.

Since IES are often distributed real-time systems, it is essential to test such systems by modeling them on a set of finite-state variables and distinguishing every possible state-space transition. This allows the system to be more robust to unintended transitions. Hence, the development of automated model checking tools for the verification of finite-state concurrent systems, based on methods such as preemptive scheduling and complex timing analysis [15], has led to significant improvement in the reliability of IES design. Figure 2 shows the interrelationship between synthesis and verification of IES design. It is essential to note that the overall design process is regularized as an interplay between verification and synthesis strategies, with an emphasis on hardware–software co-design to identify the set of architectural requirements best suited to the end-user application.

Fig. 2 Interrelationship between verification and synthesis during IES design
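The idea of checking every reachable state-space transition of a finite-state model can be sketched with an explicit-state reachability search. This is only a toy illustration, not the tooling surveyed in [15]: the heater controller, its states, and the safety property are hypothetical.

```python
from collections import deque


def successors(state):
    """Hypothetical heater controller; a state is (mode, heater_on)."""
    mode, _heater_on = state
    if mode == "idle":
        return [("heating", True)]
    if mode == "heating":
        return [("holding", True), ("fault", False)]
    if mode == "holding":
        return [("idle", False), ("heating", True)]
    if mode == "fault":
        return [("idle", False)]
    return []


def is_safe(state):
    """Safety property: the heater must never be on while in the fault mode."""
    mode, heater_on = state
    return not (mode == "fault" and heater_on)


def check(initial, next_states, safe):
    """Breadth-first search over reachable states.

    Returns (True, None) if every reachable state is safe, otherwise
    (False, path) where path is a shortest counterexample from `initial`.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not safe(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return False, path[::-1]
        for nxt in next_states(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return True, None


ok, trace = check(("idle", False), successors, is_safe)
print(ok, trace)
```

When the property fails, the returned path acts as a counterexample trace, which a designer can use to locate the unintended transition.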

4 Computational Intelligence Paradigms

CI refers to sets of adaptive techniques that enable intelligent performance in multifaceted environmental conditions [16]. Engelbrecht [3] defines CI as a set of computational methods that are capable of accepting raw numerical data and processing them by exploiting representational parallelism and pipelining in the application, generating reliable and timely outputs while remaining fault tolerant. Some of the well-referenced CI paradigms, namely fuzzy logic, neural networks, evolutionary computing, and reinforcement learning, are discussed below.

4.1 Fuzzy Logic

Classical set theory works on binary logic, where element states are crisply quantized. This does not mirror natural analog inputs, which leads to imprecision. Fuzzy logic is multivalued and does not have a crisp boundary, allowing an object to be a partial member of a set [17]. Fuzzy systems describe dynamic system behavior using linguistic if-then rules: if antecedent(s) then consequent(s). Antecedents are combined using logical ‘OR’ and ‘AND’ operators. Antecedents and consequents form the input and output of the fuzzy space, respectively. The overall decision-making process involves mapping the input space to the output space and is known as the Fuzzy Inference System. The procedure of fuzzy inference involves: (i) fuzzification of inputs and outputs, (ii) defining membership functions, (iii) applying fuzzy operators and rules, (iv) aggregation of all outputs, and (v) defuzzification. Fuzzification is the process of mapping non-fuzzy inputs to their fuzzy representation by applying membership functions such as triangular, trapezoidal, or Gaussian functions. IES designers can use multivalued fuzzy logic to optimize design metrics and meet competing design goals such as power efficiency and performance. Fuzzy logic is flexible, scalable, fault tolerant, cheap, and resource efficient, allowing a low time-to-market while meeting design specifications [18].
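The inference steps listed above can be sketched for a single input. The example below is a deliberately simplified illustration: it assumes triangular membership functions for a temperature input, one rule per fuzzy set, and a weighted average of fixed output values for defuzzification (a zero-order Sugeno-style shortcut, not the full aggregation of output sets). All set boundaries and output values are hypothetical.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at or beyond a and c, rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over temperature in degrees Celsius: (a, b, c).
INPUT_SETS = {
    "cold": (-10.0, 5.0, 20.0),
    "warm": (10.0, 22.5, 35.0),
    "hot": (25.0, 40.0, 55.0),
}

# Rules of the form "if temperature is <set> then fan duty is <value> percent".
RULES = {"cold": 20.0, "warm": 50.0, "hot": 90.0}


def fan_duty(temp_c):
    """Fuzzify the input, fire each rule, and defuzzify by weighted average."""
    strengths = {name: triangular(temp_c, *abc) for name, abc in INPUT_SETS.items()}
    total = sum(strengths.values())
    if total == 0.0:
        raise ValueError("temperature outside the modelled range")
    return sum(strengths[name] * RULES[name] for name in RULES) / total


print(fan_duty(22.5), fan_duty(30.0))
```

At 30 degrees the input is partly "warm" and partly "hot", so the output lies between the two rule values instead of jumping from one to the other as a crisp threshold would.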

4.2 Neural Networks

Neural networks (NNs) are biologically inspired computational models that can generalize and approximate randomized mathematical functions [19]. NNs possess the ability to learn from given data and are vital in applications requiring real-time learning and adaptation. The NN architecture consists of multiple layers, each composed of several neurons (nodes), which are interconnected in a directed graph fashion [20]. The net output of an individual neuron depends on the weighted sum of the outputs of the other neurons connected to it, passed through an activation function. The weights of individual neurons throughout the layers are adapted locally based on the minimization of an error function. The ability of neurons to fire for only a specific set of inputs allows for better and smarter decision-making, which is extremely important in time-critical and situation-specific operations of IES. Developments in reconfigurable and parallel computing (Sect. 3.3) have led to IES applications in fields such as computer vision.
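The weighted-sum, activation, and error-minimization steps described above can be illustrated with a single neuron. This is a minimal sketch under assumed choices that the text does not prescribe: a sigmoid activation, a squared-error function, plain gradient descent, and a logical AND gate as hypothetical training data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, inputs):
    """Weighted sum of the inputs passed through the activation function."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train(samples, epochs=5000, lr=0.5):
    """Adapt weights by gradient descent on the squared error 0.5 * (out - target)^2."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            out = forward(weights, bias, inputs)
            # Derivative of the error with respect to the pre-activation value.
            delta = (out - target) * out * (1.0 - out)
            weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
            bias -= lr * delta
    return weights, bias


# Hypothetical training data: the neuron should fire only for the input (1, 1).
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_GATE)
predictions = [round(forward(weights, bias, x)) for x, _ in AND_GATE]
print(predictions)
```

A multilayer network repeats the same forward computation layer by layer and propagates the error term backwards through the layers; the single neuron here only shows the basic mechanism.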

4.3 Evolutionary Computing

IES design requires concurrent optimization of several competing objectives such as power dissipation, cost, and reliability. Evolutionary Computation (EC) is a bio-inspired optimization process which analyzes the cost–benefit tradeoffs in heterogeneous design spaces. It samples the design space to find multiple true solutions and selects the best of them (also known as the Pareto-optimal front) [4]. EC uses computational models of evolutionary processes such as natural selection, reproduction, and survival of the fittest as the basis for determining optimized solutions [3]. The quality of the optimized solution is evaluated based on a fitness function. EC algorithms are especially important in system-level design and synthesis of multi-objective design spaces, where the designer is ultimately provided with a set of tradable solutions rather than only one optimal front [21]. The EC process flow is shown below:
1. Population Initialization: The design space is populated with multiple solutions.
2. Fitness Assignment: Each point in the solution set is associated with a fitness value based on the optimization problem.
3. Selection and Recombination: This step eliminates all the low-quality solutions and combines the remaining solutions (parents) to create new solutions.
4. Mutation: Some solutions obtained previously are further modified by a delta amount based on the optimization parameters.
5. Termination and Output: The process is terminated after iterating for a certain number of generations, and the best outputs are mapped to the solution space.

Designers can utilize EC algorithms for synthesizing IES architectures based on dynamic selection through design space exploration of Pareto-optimal fronts.
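The five-step process flow above can be sketched as a small evolutionary loop. For brevity, the example assumes a single real-valued design parameter and a single hypothetical fitness function, whereas the multi-objective case discussed in the text would rank solutions by Pareto dominance; the population size, mutation range, and seed are arbitrary illustrative choices.

```python
import random


def fitness(x):
    """Hypothetical single-objective score: higher is better, with the peak at x = 3."""
    return -(x - 3.0) ** 2


def evolve(generations=60, pop_size=30, delta=0.3, seed=1):
    rng = random.Random(seed)
    # 1. Population initialization: random points in the design range [0, 10].
    population = [rng.uniform(0.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        # 2. Fitness assignment: rank the solutions by their fitness value.
        ranked = sorted(population, key=fitness, reverse=True)
        # 3. Selection and recombination: keep the better half as parents and
        #    create children by averaging randomly chosen parent pairs.
        parents = ranked[: pop_size // 2]
        children = [
            (rng.choice(parents) + rng.choice(parents)) / 2.0
            for _ in range(pop_size - len(parents))
        ]
        # 4. Mutation: perturb each child by at most +/- delta, clipped to the range.
        children = [
            min(10.0, max(0.0, c + rng.uniform(-delta, delta))) for c in children
        ]
        population = parents + children
    # 5. Termination and output: return the best solution found.
    return max(population, key=fitness)


best = evolve()
print(best)
```

Because the parents are carried over unchanged, the best solution never gets worse from one generation to the next, which is a common design choice (elitism) rather than a requirement of EC.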

4.4 Reinforcement Learning

Reinforcement Learning (RL) is an important area of machine learning with a focus on modeling next-generation networked wireless IES, which include MIMO, device-to-device networks, and femtocell and small-cell-based heterogeneous networks [22]. An RL-based system consists of the agent (learner) and the environment, where the agent follows a decision-making function (policy) to specify and execute an action that maximizes the reward, based on the sensed environment [3]. The environment may or may not be represented by a mathematically modeled probability distribution. Several models for RL exist; the most popular are listed below.

Markov Decision Process (MDP): MDP plays a crucial role in situations where the outcomes are partly random and partly under the control of a decision maker. An MDP can be generalized to model an agent decision process in which the agent cannot observe the underlying state directly. Hence, it must maintain a probability distribution over the set of possible states, depending on the set of observations and their probabilities. This generalization is known as a Partially Observable Markov Decision Process (POMDP). In networked IES, where the network constitutes the environment and users are regarded as agents, the MDP/POMDP constitutes the ideal tool for supporting decision-making [22]. One of the applications of MDP/POMDP is in point-to-point wireless IES architecture for Energy Harvesting (EH).

Q-Learning: This is a model-free reinforcement learning technique used for optimal selection of an action for any given MDP. Q-learning algorithms have shown promising results in dynamic power management for IES design, where their usage has led to a reduction in power consumption in nonstationary workload environments, as well as for congestion routing in NoC (Network on Chip) based IES [23].
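A minimal sketch of tabular Q-learning on a toy MDP is given below. The environment is a hypothetical five-state corridor with a reward only at the rightmost state, and the learning rate, discount factor, and exploration rate are arbitrary illustrative values; it is not a model of the power-management or routing applications cited above.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Deterministic corridor: reward 1 on reaching the goal, 0 otherwise."""
    nxt = min(N_STATES - 1, max(0, state + action))
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[state])
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            nxt, reward, done = step(state, ACTIONS[a])
            # Model-free update toward reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q = q_learning()
policy = [ACTIONS[row.index(max(row))] for row in q[:GOAL]]
print(policy)
```

The agent never uses the transition function directly when updating its estimates; it only observes the next state and reward, which is what makes the technique model-free.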
An overview of the IES design utility for all the CI paradigms discussed above as well as their comparative performance analysis is given in Table 1.

Table 1 Summary of the utility of CI paradigms for IES design

| CI paradigm | Computational complexity | IES optimization metrics | Applications |
|---|---|---|---|
| Fuzzy logic | Low | Power-efficient, flexible, fault tolerant, scalable | Closed loop sensor systems |
| Neural networks | High | Robust, real-time, reactive, fault tolerant, flexible | Vision and AI |
| Evolutionary computing | Medium | Flexible, high performance, robust, safety-critical | Reconfigurable systems |
| Reinforcement learning | Medium | Real-time, robust, reactive, fault tolerant, power-efficient | Communication and Networking |


5 Conclusion

The heterogeneous design requirements of intelligent embedded systems necessitate the use of electronic design automation along with smart CI paradigms to find the best Pareto-optimal solution within the design space. In this paper, numerous design metrics of IES are briefly introduced, and the CI paradigms used by designers to address the challenges are concisely discussed. In addition, an overall assessment of CI paradigms in terms of their applicability to IES design is presented, which will serve as a guide for IES designers. A brief overview of key enabling technologies for facilitating IES design is also given. In spite of the considerable success of CI paradigms in IES design, the primary concern is that most of these paradigms are still in the development stage, with only a few of them having reached the hardware and microarchitecture development stages. The aim of the IES design community for the future should be to improve the existing applicability of CI paradigms and refine them toward well-performing real-world applications.

References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.

11. 12. 13. 14.

15.

Altran: Intelligent Systems Pave the Way to a Smart World Sponsored. France (2013) Elmenreich, W.: Intelligent methods for embedded systems. In: WISES (2013) Engelbrecht, A.: Computational Intelligence: An Introduction, 2nd ed. Wiley (2007) Eisenring, M., Thiele, L., Zitzler, E.: Conflicting criteria in embedded system design. IEEE Design Test Comput. 51–59 (2000) Wolf, W.: Computers as Components. Elsevier Publications (2005) Santambrogio, M.D., et al.: Enabling technologies for self-aware adaptive systems. In: IEEE Adaptive Hardware and Systems (AHS) Conference (2010) Elmenreich, W.: Sensor fusion in time-triggered systems (2002) Oh, S., Pedrycz, W.: Identification of fuzzy systems by means of an auto-tuning algorithm and its application to nonlinear systems. Fuzzy sets Syst. (2000) Hu, W., Liu, L., Feng, G.: Output consensus of heterogeneous linear multi-agent systems by distributed event-triggered/self-triggered strategy. IEEE Trans. Cybern. (2017) Hallmans, D., et al.: Challenges and opportunities when introducing cloud computing into embedded systems. In: IEEE 13th International Conference on Industrial Informatics (INDIN) (2015) Cai, H., et al.: IoT-based big data storage systems in cloud computing: Perspectives and challenges. IEEE Internet Things J. 4(1), 75–87 (2017) Jakovljevic, M., Insaurralde, C.C., Ademaj, A.: Embedded cloud computing for critical systems. In: IEEE 33rd. Digital Avionics Systems Conference (DASC) (2014) Li, Y., et al.: Hardware-software co-design of embedded reconfigurable architectures. In: Proceedings of the 37th Annual Design Automation Conference. ACM (2000) Liu, T., Wen, W.: A fast and ultra-low power time-based spiking neuromorphic architecture for embedded applications. In: 18th International Symposium on Quality Electronic Design (ISQED). IEEE (2017) Cordeiro, L.: Automated Verification and Synthesis of Embedded Systems using Machine Learning. arXiv:1702.07847 (2017)


16. Venayagamoorthy, G.K.: A successful interdisciplinary course on computational intelligence. IEEE Comput. Intell. Mag. 4(1), 14–23 (2009)
17. Zadeh, L.: Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. Syst. Man Cybern. 3(1), 28–44 (1973)
18. Yong, K., et al.: Computational complexity of general fuzzy logic control and its simplication for a loop controller. Fuzzy Sets Syst. 11(2), 215–224 (2005)
19. Ferrari, S., Stengel, R.F.: Smooth function approximation using neural networks. IEEE Trans. Neural Netw. 16(1), 24–38 (2005)
20. Talukdar, J., Mehta, B.: Human action recognition system using good features and multilayer perceptron Network. In: IEEE 6th International Conference on Communication and Signal Processing, pp. 1–6 (2017)
21. Blickle, T., Jurgen, T., Thiel, L.: System-level synthesis using evolutionary algorithms. Des. Autom. Embed. Syst. 23–58 (1998)
22. Jiang, C., Zhang, H., Ren, Y., Han, Z.: Machine learning paradigms for next generation wireless communication. IEEE Wirel. Comm. 24(2), 98–105 (2017)
23. Farahnakian, F., Daneshtalab, M., Polsila, J., Ebrahimi, M.: Q-learning based congestion-aware routing algorithm for on-chip network. In: IEEE 2nd International Conference on Networked Embedded Systems for Enterprise Applications (NESEA) (2011)

Knowledge-Based Approach for Word Sense Disambiguation Using Genetic Algorithm for Gujarati

Zankhana B. Vaishnav and Priti S. Sajja

Abstract This paper proposes a knowledge-based overlap approach that uses Genetic Algorithms (GAs) for Word Sense Disambiguation (WSD). Genetic Algorithms have been explored to solve search and optimization tasks in AI. The WSD problem is to resolve which meaning of a polysemous word should be used in the surrounding context of a given text. Several approaches have been explored for WSD in English, Chinese, Spanish, and also for some Indian regional languages. Despite the extensive research in NLP for Indian languages, research on WSD in the Gujarati language is very limited. A knowledge-based approach uses a machine-readable knowledge source. We propose to use the Indo-Aryan WordNet for Gujarati as the lexical database for WSD.





Keywords Genetic algorithm · Supervised learning · Polysemy · Semantic network · Natural language processing · Word sense disambiguation · WordNet





Z. B. Vaishnav (✉)
Sardar Patel University, Vallabh Vidhyanagar, Gujarat, India

P. S. Sajja
G.H. Patel PG Department of Computer Science, Sardar Patel University, Vallabh Vidhyanagar, Gujarat, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_48

1 Introduction

Natural languages like English are ambiguous compared to programming languages. The meaning of a word is determined by its surrounding context in a sentence. For instance, consider the following sentences:

(a) He left IIT in 1982 and settled at Delhi, where he took pupils for the university.


(b) Her pupils were large, making her eyes look dark.

The word "pupil" has a different meaning in each of these sentences. A word can have several meanings or senses, which is termed ambiguity, and a word with several meanings is called a polysemous word. Assigning one of the meanings of a polysemous word according to the context of the text is known as Word Sense Disambiguation (WSD). In Artificial Intelligence, natural language processing and understanding is a hard problem that needs common sense knowledge. The difficulty of WSD has held back the use of NLP techniques in many real-world applications that could benefit from WSD and improve their performance.

1.1 Machine Translation (MT)

The automatic conversion of one natural language into another is known as Machine Translation (MT). MT systems can benefit from word sense disambiguation because it helps them select the better translation candidate. For instance, Fig. 1 shows a Google translation from Gujarati to English, where it is clear that the same word has two different meanings according to the context.

1.2 Information Retrieval (IR)

It is very important to resolve ambiguity in a query before retrieving information. For example, a query word may have several different meanings. So, finding the relevant sense of an ambiguous word in a particular query will give a more accurate answer to the query. Given the increasing interest in regional languages in NLP, WSD might be very useful in determining the correct meaning of words.

Fig. 1 Google translation shows the need for WSD

2 Types of Techniques for WSD

Word Sense Disambiguation approaches can be broadly categorized into Knowledge-Based, Supervised, and Unsupervised approaches. Figure 2 shows a number of the most commonly used techniques that fall under these approaches. The list is not exhaustive.

2.1 Supervised Approach

In a supervised approach, classifiers assign the correct sense to each word, and the process has two phases. In the training phase, a sense-tagged corpus is used to train the classifiers, which extract semantic and syntactic features from the corpus. In the testing phase, the classifiers are used to find the most appropriate sense of a word based on its surrounding context.

Fig. 2 Types of approaches for WSD (tree diagram: WSD branches into Supervised, Unsupervised, and Knowledge Based; the techniques shown are Naive Bayes, Decision List, Decision Tree, SVM, Maximum Entropy, Context Clustering, Co-occurrence Graphs, Selectional Preferences, Overlap of Sense Definition, and Graph Based)

Naïve Bayes Approach: A set of probabilistic parameters is estimated from features of the training examples. For a new example, these parameters are combined using Bayes' rule and the category with the maximum probability is assigned. The approach has been applied to WSD with considerable success. In [1], the authors built a WSD system for the Urdu language using a Naïve Bayes classifier. They used the corpus from [2], which consists of 18 million words. Four words were selected for the WSD process according to their frequency. Only the base form is considered for all the words as Urdu is an agglutinative language. Sentences containing these words were fetched from the corpus and manually tagged with senses; 80% of the sentences were used for training the Bayesian classifier and 20% for testing. In [3], the authors investigated the Naïve Bayes classifier for Hindi WSD. They used features such as collocations, context, an unordered list of words, nouns, and vibhaktis. From a manually tagged corpus, 60 Hindi nouns were evaluated. A precision of 77.52% was obtained without morphological enhancement and 86.11% with it. The results indicate that rich features improve the classifier beyond what an unordered list of words achieves. Naïve Bayes classifiers assume that all features are independent of each other given the sense. This assumption keeps the model simple to train in a supervised setting.

Decision List: A decision list is an ordered list of weighted conditions, where the most specific conditions appear at the beginning of the list and general conditions appear at the bottom. High weights are assigned at the beginning of the list and lower weights at the bottom. A function determines the association between a weighted condition and a class. For a new example, the rules in the list are evaluated in order and the class is assigned accordingly. The authors in [4] proposed an algorithm that uses a decision list for WSD in the testing corpora. All words in a sentence are disambiguated using the property that nearby words give consistent clues to the sense of a target word. A large set of collocations (linguistic units) for the target word was collected and the probability distribution over all such units was calculated. Then they calculated the log-likelihood ratio, where a higher ratio means more predictive evidence.

Neural Network: To resolve ambiguities in a Korean-to-Japanese machine translation system, [5] proposed a WSD technique using neural networks. Generally, neural networks use a large number of features. The authors reduced the number of features to a practical size by using concept codes rather than lexical words. The 2-layer neural network achieved an average precision of 82.2%, which was significant for a real-world MT system.

Support Vector Machine: In [6], the authors implemented WSD using support vector machines and multiple knowledge sources. The SVM performs optimization to find the hyperplane with the largest margin that separates the training examples into two classes. The side of the hyperplane on which a test example lies decides its classification. Part-of-speech (POS) tags of neighboring words, context, collocations, and syntactic relations are used as knowledge sources. They evaluated their system on the English lexical sample tasks of SENSEVAL-2 and SENSEVAL-1. Recall was 0.656 over all SENSEVAL-2 test words; over all SENSEVAL-1 test words it was 0.796 with dictionary examples and 0.776 without.
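To make the Naïve Bayes idea above concrete, the following is a minimal sketch rather than the system of [1] or [3]. It assumes bag-of-words context features and add-one smoothing, and the tiny English training set for "pupil" is invented for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(tagged_examples):
    """tagged_examples: list of (context_words, sense) pairs (toy data)."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in tagged_examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab


def classify(context_words, model):
    """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        # Add-one smoothing (an assumption of this sketch).
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context_words:
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Invented, hand-tagged toy contexts for the ambiguous word "pupil".
examples = [
    (["student", "school", "teacher"], "student"),
    (["university", "teacher", "class"], "student"),
    (["eye", "dark", "light"], "eye"),
    (["eye", "iris", "large"], "eye"),
]
model = train_naive_bayes(examples)
print(classify(["teacher", "university"], model))
```

The independence assumption shows up in the inner loop: each context word contributes its own log-probability term, with no interaction between words.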

2.2 Unsupervised Approaches

The supervised approach requires manually created training data, which is very expensive. This problem is known as the knowledge acquisition bottleneck. Unsupervised approaches address it by relying on the idea that the sense of a particular word depends on its neighboring words.

Context Clustering: In [7], the author performed word sense discrimination using context clusters. Contexts of the ambiguous word in the training text are mapped to context vectors in Word Space by summing the vectors of the words in the context. Clusters are obtained by grouping context vectors, and each cluster is represented by a sense vector. An occurrence of the ambiguous word is disambiguated by mapping its context to a context vector in Word Space and assigning it to the sense with the closest sense vector.

Co-occurrence Graphs: In [8], the authors collected a text corpus consisting of paragraphs. From this corpus, a co-occurrence graph for the target word is built in which the vertices are the words in the text. Two words that co-occur in one text are connected by an edge, and each edge is assigned a weight that measures the relative frequency of the co-occurring words. Once the hubs that represent the senses of the word are selected, they are linked to the target word with edges of weight 0, and the Minimum Spanning Tree (MST) of the whole graph is calculated and stored. The spanning tree is used to perform WSD.
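The final step of context clustering described above, assigning a context to the closest sense vector, can be sketched as follows. This is an illustrative sketch and not the implementation of [7]: the two-dimensional word vectors and sense vectors are invented, and cosine similarity is assumed as the closeness measure.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def context_vector(context_words, word_vectors):
    """Sum the vectors of the context words that have a known vector."""
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    for w in context_words:
        for i, x in enumerate(word_vectors.get(w, [])):
            total[i] += x
    return total


def nearest_sense(context_words, word_vectors, sense_vectors):
    """Assign the context to the sense whose sense vector is closest."""
    cv = context_vector(context_words, word_vectors)
    return max(sense_vectors, key=lambda s: cosine(cv, sense_vectors[s]))


# Hypothetical 2-D word space and sense vectors (cluster centroids).
word_vectors = {
    "school": [1.0, 0.0], "teacher": [0.9, 0.1],
    "eye": [0.0, 1.0], "iris": [0.1, 0.9],
}
sense_vectors = {"student": [1.0, 0.1], "eye_part": [0.1, 1.0]}
print(nearest_sense(["teacher", "school"], word_vectors, sense_vectors))
```

In a real system the sense vectors would come from clustering the context vectors of the training text, which this sketch does not attempt.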

2.3 Knowledge-Based Approaches

Knowledge-based approaches came into existence during the 1970s–1980s. Several types of resources, such as dictionaries, thesauri, WordNet, SemCor, and Wikipedia, are used to provide the relevant meaning of a word in a context. A knowledge-based approach uses either hand-coded rules or grammar rules to disambiguate words.

Selectional Preferences: Selectional preferences capture relations between words and encode common sense. In this approach, the senses that accord with common sense are selected. The number of occurrences of word pairs with syntactic relations is counted, and the senses of words are identified from these counts. In [9], the authors acquired preferences for grammatical relations between nouns and adjectives or verbs. WordNet Synsets are used to define the sense inventory. They exploited the hyponym relationship for nouns, the troponym relationship for verbs, and the "similar-to" relationship for adjectives. Words are disambiguated by finding the sense with the maximum probability.

Overlap Approach: In [10], the author proposed an algorithm that disambiguates words in short sentences. The sense glosses of a target word are compared with the glosses of each surrounding word in the sentence. After the comparisons, the sense whose gloss has the maximum overlap is assigned to the target word. The glosses are taken from dictionaries such as the Oxford Advanced Learner's Dictionary. This algorithm is known as the Lesk algorithm. The authors in [11] adapted the original Lesk algorithm to use WordNet as the knowledge source. They evaluated this algorithm on examples from SENSEVAL-2 with 73 target words and attained 31.7% accuracy. The original algorithm finds overlaps only between the glosses of the surrounding words, whereas the algorithm in [11] also compares the glosses of words that are related to the words in the context.

In [12], the author proposed a Genetic Algorithm (GA) for WSD in the Arabic language. In a preprocessing phase, which includes stop-word removal, tokenization, and stemming, a sentence is converted into a bag of words called the context bag. Arabic WordNet is used to retrieve senses, which are reduced to bags of words called sense bags. A GA is then applied to map words to senses according to the context.
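The gloss-overlap idea behind the Lesk algorithm described above can be sketched as follows. This is a simplified illustration, not the algorithm of [10] or [11] in full: the glosses are invented, assumed to be already tokenized with stop-words removed, and overlap is counted as the number of shared distinct words.

```python
def lesk_overlap(gloss_a, gloss_b):
    """Number of distinct words shared by two glosses."""
    return len(set(gloss_a) & set(gloss_b))


def disambiguate(target_senses, context_glosses):
    """Pick the target sense whose gloss overlaps most with the context glosses.

    target_senses: dict mapping a sense id to its gloss (list of words).
    context_glosses: glosses (lists of words) of the surrounding words.
    """
    def score(sense_id):
        gloss = target_senses[sense_id]
        return sum(lesk_overlap(gloss, g) for g in context_glosses)

    return max(target_senses, key=score)


# Hypothetical glosses for "pupil" and for the context words "eyes", "dark".
target_senses = {
    "pupil#student": ["learner", "enrolled", "school", "teacher"],
    "pupil#eye": ["dark", "opening", "centre", "iris", "eye"],
}
context_glosses = [
    ["organ", "sight", "eye"],
    ["little", "light", "dark", "colour"],
]
print(disambiguate(target_senses, context_glosses))
```

The sense ids such as `pupil#eye` are placeholders; with a WordNet as the knowledge source they would be Synset identifiers.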

3 Gujarati Language and Gujarati WordNet

Gujarati is the official language of the state of Gujarat in India. More than 46 million people in India and abroad speak Gujarati. The Gujarati language is closely related to Hindi. WSD can be formulated as a search problem and solved by exploring the solution search space using evolutionary algorithms. Several approaches have been investigated for WSD in languages like English, Chinese, and Spanish, including knowledge-based approaches and machine learning-based approaches. However, research on WSD in the Gujarati language is relatively limited.

3.1 Gujarati WordNet

WordNets offer a rich network of concepts through semantic and lexical relations. WordNets have emerged as a very useful resource for computational linguistics and many natural language processing applications. A WordNet is a lexical database that comprises synonym sets (Synsets), glosses, and the positions of Synsets in relations. A synonym set in a WordNet represents a lexical concept. The gloss gives a definition of the underlying lexical concept and an example sentence to illustrate it. Since the development of Princeton WordNet [13], WordNets have been built in many other languages. Hindi WordNet was the first WordNet for an Indian language [14]. Based on Hindi WordNet, WordNets for 17 different Indian regional languages are being built. Gujarati WordNet contains Gujarati words used in a family's day-to-day life. The words are grouped into Synsets. It also provides definitions and examples, and records a number of relations among these Synsets [15, 16]. Currently, IndoWordNet contains 26,503 nouns, 2805 verbs, 5828 adjectives, and 445 adverbs for the Gujarati language. For example, the word has six Synsets in total in Gujarati WordNet. Figure 3 shows three Synsets with Synset ID, synonyms, gloss, and POS. WordNet also shows hypernymy, hyponymy, and some other relations.

Fig. 3 Example of Synsets in Gujarati WordNet

Figure 4 shows an overview of the proposed approach. A text T is transformed into a context bag {w_1, w_2, …, w_k} in a preprocessing phase.

3.2 Preprocessing

Tokenization: Tokenization divides a text into pieces, called tokens.

Stop-word removal: A stop-word is a word that does not add any meaning to the text. Whether a word should be treated as a stop-word can depend on the context of the application. There is no standard stop-word list for the Gujarati language.

Fig. 4 Overview of the proposed approach (flow diagram with blocks: Unstructured Text, Preprocessing, Context Bag, Gujarati WordNet, Word Senses in terms of Gloss and Ontology, Sense Bag, Genetic Algorithm)

POS Tagging: POS tagging assigns a part of speech to each word in a text. If the part of speech of a word is not known, the Synsets associated with all of its possible parts of speech have to be used.

Stemming: Stemming is the process of converting related words to a base form by ignoring inflectional and derivational endings. It is useful for a morphologically rich language like Gujarati, where a single word takes many forms.

After the preprocessing phase, the text T has been converted into a bag of words called the context bag. To find the overlap between the context and the senses from the WordNet, sense bags are then obtained: the senses of each word w_i in the context bag are retrieved from WordNet as glosses, which are reduced to bags of words (sense bags).
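The preprocessing steps above can be sketched as a small pipeline that turns a text into a context bag and glosses into sense bags. This is only an illustration on English text: the stop-word list, the suffix list, and whitespace tokenization are placeholders invented for the sketch, and a Gujarati system would need its own stop-word list, stemmer, and WordNet lookup. POS tagging is omitted.

```python
# Placeholder resources (assumptions, not Gujarati resources).
STOP_WORDS = {"her", "were", "the", "a", "of", "in"}
SUFFIXES = ("ing", "es", "s")
PUNCTUATION = ".,;:!?()\"'"


def tokenize(text):
    """Whitespace tokenization with surrounding punctuation stripped."""
    tokens = (t.strip(PUNCTUATION) for t in text.lower().split())
    return [t for t in tokens if t]


def stem(token):
    """Crude suffix stripping that keeps at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def to_bag(text):
    """Tokenize, drop stop-words, and stem: the result is a bag (set) of words."""
    return {stem(t) for t in tokenize(text) if t not in STOP_WORDS}


def build_sense_bags(glosses_by_sense):
    """Reduce each sense gloss to a bag of words (a sense bag)."""
    return {sense_id: to_bag(gloss) for sense_id, gloss in glosses_by_sense.items()}


context_bag = to_bag("Her pupils were large.")
sense_bags = build_sense_bags({"s1": "the dark opening in the eye"})
print(context_bag, sense_bags)
```

Using the same `to_bag` function for both the context and the glosses keeps the two bags comparable, which is what the overlap computation relies on.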

3.3 Genetic Algorithm

A GA is used to find the most appropriate mapping from the words w_i to senses in the context T. The best individual is returned by the GA. We need to define the following elements:

• A representation of an individual of the population as a chromosome. An individual can be represented by an integer string in which each gene is an index into the possible senses of the corresponding word.
• Generate an initial population and decide the population size.


• Decide the fitness function, which determines the fitness of an individual and hence whether it can reproduce. The fitness is measured by word sense relatedness. The most commonly used measures are the Lesk and extended Lesk measures. The Lesk measure favors the senses that lead to the maximum overlap between the sense definitions of two or more words.
• Decide genetic operators such as crossover and mutation, and their rates. Mutation randomly changes the new offspring.
• Decide the method for selecting parents to generate offspring. There are many methods for selecting the best chromosomes, for example, roulette wheel selection, Boltzmann selection, tournament selection, rank selection, steady-state selection, and elitism.
• Decide the termination condition. The termination condition can be a number of evaluations or a number of generations.
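The elements listed above can be put together in a short sketch. This is one possible instantiation under stated assumptions, not the authors' implementation: the fitness is the sum of pairwise overlaps between the chosen sense bags (a Lesk-style measure), selection is by tournament with elitism, crossover is single-point, termination is a fixed number of generations, and the sense bags and parameter values are invented.

```python
import random


def fitness(chromosome, sense_bags):
    """Sum of pairwise overlaps between the sense bags selected by the genes."""
    chosen = [sense_bags[i][gene] for i, gene in enumerate(chromosome)]
    return sum(len(chosen[i] & chosen[j])
               for i in range(len(chosen))
               for j in range(i + 1, len(chosen)))


def run_ga(sense_bags, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Return the best chromosome: one sense index per word."""
    rng = random.Random(seed)
    n = len(sense_bags)

    def random_individual():
        return [rng.randrange(len(sense_bags[i])) for i in range(n)]

    def tournament(population):
        a, b = rng.sample(population, 2)
        return a if fitness(a, sense_bags) >= fitness(b, sense_bags) else b

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        best = max(population, key=lambda c: fitness(c, sense_bags))
        next_population = [best[:]]  # elitism: keep the best individual
        while len(next_population) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            child = p1[:]
            if n > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            for i in range(n):  # mutation: resample a gene at random
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(len(sense_bags[i]))
            next_population.append(child)
        population = next_population
    return max(population, key=lambda c: fitness(c, sense_bags))


# Hypothetical sense bags: sense_bags[i][g] is sense g of word i.
sense_bags = [
    [{"learner", "school", "teacher"}, {"eye", "dark", "opening"}],  # "pupil"
    [{"organ", "sight", "eye"}, {"hole", "needle", "thread"}],       # "eye"
    [{"little", "light", "dark"}, {"evil", "wicked"}],               # "dark"
]
best = run_ga(sense_bags)
print(best, fitness(best, sense_bags))
```

For a three-word sentence the search space is tiny and exhaustive search would do; the GA becomes relevant when the number of sense combinations grows with sentence length.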

4 Conclusion and Future Work

Word Sense Disambiguation is a very important task in many NLP applications. Extensive research has been done on WSD for the English language. WSD research exists for only a few Indian languages, and there is negligible research on Gujarati Word Sense Disambiguation. Compared to other Indian languages, many successful methods have not yet been explored for Gujarati. In future work, different aspects of Gujarati WordNet and the operators of the Genetic Algorithm can be investigated for the implementation of the WSD process. The approach can be generalized to Indo-Aryan languages, where the Indo-Aryan WordNet for each language can be used to disambiguate ambiguous words.

References
1. Naseer, A., Hussain, S.: Supervised Word Sense Disambiguation for Urdu Using Bayesian Classification. Center for Research in Urdu Language Processing, Lahore, Pakistan (2009)
2. Ijaz, M., Hussain, S.: Corpus based Urdu lexicon development. In: The Proceedings of Conference on Language Technology (CLT07), University of Peshawar, Pakistan, vol. 73, Aug 2007
3. Singh, S., Siddiqui, T.J., Sharma, S.K.: Naïve Bayes classifier for Hindi word sense disambiguation. In: Proceedings of the 7th ACM India Computing Conference, p. 1. ACM, Oct 2014
4. Parameswarappa, S., Narayana, V.N., Yarowsky, D.: Kannada word sense disambiguation using decision list. Int. J. Emerg. Trends Technol. Comput. Sci. (IJETTCS) 2(3), 272–278 (2013)
5. Chung, Y.J., Kang, S.J., Moon, K.H., Lee, J.H.: Word sense disambiguation in a Korean-to-Japanese MT system using neural networks. In: Proceedings of the 2002 COLING Workshop on Machine translation in Asia, vol. 16, pp. 1–7. Association for Computational Linguistics, Sept 2002
6. Lee, Y.K., Ng, H.T., Chia, T.K.: Supervised word sense disambiguation with support vector machines and multiple knowledge sources. In: Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text (2004)
7. Schütze, H.: Automatic word sense discrimination. Comput. Linguist. 24(1), 97–123 (1998)
8. Agirre, E., Martínez, D., de Lacalle, O.L., Soroa, A.: Two graph-based algorithms for state-of-the-art WSD. In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pp. 585–593. Association for Computational Linguistics, July 2006
9. McCarthy, D., Carroll, J.: Disambiguating nouns, verbs, and adjectives using automatically acquired selectional preferences. Comput. Linguist. 29(4), 639–654 (2003)
10. Lesk, M.: Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In: Proceedings of the 5th Annual International Conference on Systems Documentation, pp. 24–26. ACM, June 1986
11. Banerjee, S., Pedersen, T.: An adapted Lesk algorithm for word sense disambiguation using WordNet. In: International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136–145. Springer, Berlin, Heidelberg, Feb 2002
12. Menai, M.E.B.: Word sense disambiguation using evolutionary algorithms–application to Arabic language. Comput. Hum. Behav. 41, 92–103 (2014)
13. Miller, G.A., Beckwith, R., Fellbaum, C., Gross, D., Miller, K.J.: Introduction to WordNet: an on-line lexical database. Int. J. Lexicogr. 3(4), 235–244 (1990)
14. Narayan, D., Chakrabarti, D., Pande, P., Bhattacharyya, P.: An experience in building the indo wordnet-a wordnet for Hindi. In: First International Conference on Global WordNet, Mysore, India, Jan 2002
15. Chatterjee, A., Joshi, S.R., Khapra, M.M., Bhattacharyya, P.: Introduction to tools for IndoWordNet and word sense disambiguation. In: 3rd IndoWordNet Workshop, International Conference on Natural Language Processing (2010)
16. Bhensdadia, C.K., Bhatt, B., Bhattacharyya, P.: Introduction to Gujarati Wordnet. In: Third National Workshop on IndoWordNet Proceedings (2010)

Compressive Sensing Approach to Satellite Hyperspectral Image Compression

K. S. Gunasheela and H. S. Prasantha

Abstract Hyperspectral image (HSI) processing plays a very important role in satellite imaging applications. Sophisticated sensors on board the satellite generate huge hyperspectral datasets, since they capture a scene across different wavelength regions of the electromagnetic spectrum. On a satellite, both the memory available for storage and the bandwidth available to transmit data to the ground station are limited. As a result, compression of hyperspectral satellite images is necessary. This research work proposes a new algorithm called SHSIR (sparsification of hyperspectral image and reconstruction) for the compression and reconstruction of HSI acquired using a compressive sensing (CS) approach. The proposed algorithm is based on the linear mixing model assumption for hyperspectral images. Compressive sensing measurements are generated using measurement matrices containing Gaussian i.i.d. entries. The HSI is reconstructed using Bregman iterations, which improve the reconstruction accuracy as well as the noise robustness. The proposed algorithm is compared with state-of-the-art compressive sensing approaches for HSI compression, and it performs better than existing techniques in terms of both reconstruction accuracy and noise robustness.

Keywords Hyperspectral image · Compressive sensing · SHSIR algorithm

K. S. Gunasheela (✉) ⋅ H. S. Prasantha
Nitte Meenakshi Institute of Technology, Bengaluru 560060, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_49


1 Introduction

A hyperspectral image of a particular area captured by a satellite consists of hundreds of bands corresponding to different wavelength regions of the electromagnetic spectrum. The resources available on board the satellite are always very limited, in terms of both storage space and communication bandwidth. As a result, compression of HSI is necessary. Many lossless and lossy algorithms [1] have been proposed in the literature for the compression of hyperspectral images. A recent development is the use of compressive sensing [2] to greatly reduce the number of samples required to reconstruct the image, thereby greatly enhancing compression performance with acceptable image quality. Convex optimization [3] strategies are used for image reconstruction. This research work proposes a new CS-based algorithm for the compression and reconstruction of HSI based on the linear mixing [4] model for HSI and convex optimization techniques.

2 Sparsification of Hyperspectral Image and Reconstruction (SHSIR) Algorithm

Consider a 3D hyperspectral image in matrix format given by $X \in \mathbb{R}^{b \times p}$, where b denotes the number of spectral bands and p denotes the number of pixels in a particular band. The columns of X correspond to spectral vectors at particular pixel positions. The measurement vector Y can be modeled as

$Y = \beta(X) + \vartheta$    (1)

where β is a linear operator matrix and ϑ refers to noise in the measurements and system modeling errors. Compressive sensing is performed in the spectral domain. Different measurement matrices containing Gaussian i.i.d. entries are used to perform compressive sensing. Hence, β can be written in block diagonal form as follows:

$\beta := \mathrm{Bdg}(\beta_1, \beta_2, \ldots, \beta_p)$    (2)

where $\beta_i$ calculates k inner products between known Gaussian vectors and the spectral vectors in X, with $k \ll b$. In total, $k \times p$ measurements are obtained, achieving a compression ratio of $k/b$. Under the linear mixing model, the hyperspectral image can be modeled as

$X = FT$    (3)

where F is a mixing matrix whose columns correspond to the spectral signatures of endmembers and T corresponds to the abundance coefficients. The mixing matrix is estimated using a robust minimum volume simplex algorithm (RMVSA [5]). From the model in Eq. (3), the linear operator matrix β on X can be written as follows:

$W := \beta(X) = \beta(FT)$    (4)

The main aim of the SHSIR algorithm is to estimate the coefficients $T'$. The hyperspectral image can then be reconstructed as

$X' = FT'$    (5)
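The measurement model of Eqs. (1)–(3) can be illustrated with a small sketch. It is a toy under stated assumptions and not the authors' MATLAB code: matrices are plain nested lists, each pixel gets its own k × b block of standard Gaussian entries drawn on the fly, and the function names `mix` and `measure` are hypothetical.

```python
import random


def mix(F, T):
    """Linear mixing model X = F T, with F of size b x q and T of size q x p."""
    q, p = len(T), len(T[0])
    return [[sum(F[r][m] * T[m][c] for m in range(q)) for c in range(p)]
            for r in range(len(F))]


def measure(X, k, noise_std=0.0, seed=0):
    """Spectral-domain CS: k Gaussian inner products per pixel (column of X).

    Returns a k x p measurement matrix, i.e. a compression ratio of k/b.
    """
    rng = random.Random(seed)
    b, p = len(X), len(X[0])
    Y = [[0.0] * p for _ in range(k)]
    for i in range(p):
        x_i = [X[band][i] for band in range(b)]  # spectral vector of pixel i
        for row in range(k):
            g = [rng.gauss(0.0, 1.0) for _ in range(b)]  # one row of beta_i
            y = sum(gj * xj for gj, xj in zip(g, x_i))
            if noise_std > 0.0:
                y += rng.gauss(0.0, noise_std)  # additive measurement noise
            Y[row][i] = y
    return Y


# Toy example: b = 3 bands, q = 2 endmembers, p = 2 pixels, k = 2 measurements.
F = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
T = [[0.2, 0.5], [0.8, 0.5]]
X = mix(F, T)
Y = measure(X, k=2)
print(len(Y), len(Y[0]))
```

Drawing a separate Gaussian block for every pixel is what makes the overall operator block diagonal; reusing one block for all pixels would be a different design.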

The convex optimization problem to solve for the abundance coefficients $T'$ can be formulated as follows:

$\min_{T'} \; 0.5\,\| Y - WT' \|^{2} + \mu\, I(ET') + i_{R_+}(T')$    (6)

$i_{R_+}(T')$ is an indicator function. It is zero when $T'$ belongs to the nonnegative orthant; otherwise it is positive infinity. μ is a penalty parameter. For a given frame I, $I(ET')$ is an isotropic total variation regularizer [6] given by

$I(ET') = \sum \sqrt{ E_{h(i,j)}(T')^{2} + E_{v(i,j)}(T')^{2} }$    (7)

$E_{h(i,j)}(T')$ and $E_{v(i,j)}(T')$ correspond to horizontal and vertical backward differences. The total variation regularizer is incorporated to ensure the smoothness of the reconstructed image. Equation (6) can be written in the following equivalent form by introducing a new variable per regularizer:

$\min_{T', G_1, G_2, G_3} \; 0.5\,\| Y - WG_1 \|^{2} + \mu\, I(G_3) + i_{R_+}(G_2)$ subject to $G_1 = T'$, $G_2 = T'$, $G_3 = ET'$    (8)

In compact form, Eq. (8) can be written as follows:

$\min_{T', G} \; a(G)$ subject to $G = AT'$    (9)
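The two regularization terms in Eq. (6) can be evaluated directly for a small example. The sketch below computes an isotropic total variation from horizontal and vertical backward differences, as in Eq. (7), and the nonnegativity indicator. It assumes a single 2-D abundance map stored as nested lists and sets the difference to zero at the first row and column; the boundary handling used by the authors is not stated in the text, so this choice is an assumption.

```python
import math


def isotropic_tv(img):
    """Isotropic total variation with backward differences.

    Assumption of this sketch: differences across the first row or
    column boundary are taken as zero.
    """
    total = 0.0
    for i in range(len(img)):
        for j in range(len(img[0])):
            dh = img[i][j] - img[i][j - 1] if j > 0 else 0.0  # horizontal
            dv = img[i][j] - img[i - 1][j] if i > 0 else 0.0  # vertical
            total += math.sqrt(dh * dh + dv * dv)
    return total


def nonneg_indicator(img):
    """0 if every entry is nonnegative, +infinity otherwise."""
    return 0.0 if all(v >= 0 for row in img for v in row) else float("inf")


smooth = [[0.5, 0.5], [0.5, 0.5]]
edge = [[0.0, 1.0], [0.0, 1.0]]
print(isotropic_tv(smooth), isotropic_tv(edge), nonneg_indicator(edge))
```

A constant map has zero total variation while a map with a sharp edge is penalized, which is how the regularizer favors smooth reconstructions.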

To solve the constrained optimization problem (9), adaptive ADMM (alternating direction method of multipliers [7]) is used. By adding Bregman distance terms to the adaptive ADMM equation, we get the adaptive-Bregman ADMM equation as follows:

$B(T', G, E) = D_{da}(G, G^{L}) + D_{da}(T', T'^{L}) + \frac{\tau^{L}}{2} \left\| AT' - G - E + \frac{\varphi^{L}}{\tau^{L}} \right\|^{2}$    (10)

$D_{da}(G, G^{L})$ and $D_{da}(T', T'^{L})$ are Bregman distance [8] terms. L represents the iteration number, $\tau^{L}$ is the penalty parameter, and $\varphi^{L}$ is the dual variable.

2.1 Algorithm

Algorithm 1 SHSIR

Initialization: set $L = 0$
Select: $\tau \ge 0$
Parameter selection: $T'^{(0)}, G_1^{(0)}, G_2^{(0)}, G_3^{(0)}, E_1^{(0)}, E_2^{(0)}, E_3^{(0)}, \varphi^{(0)}/\tau$
Parameter updates:
$T'^{(L+1)} \leftarrow \arg\min_{T'} B(T', G_1^{(L)}, G_2^{(L)}, G_3^{(L)}, E_1^{(L)}, E_2^{(L)}, E_3^{(L)})$
$G_1^{(L+1)} \leftarrow \arg\min_{G_1} B(T'^{(L+1)}, G_1, G_2^{(L)}, G_3^{(L)})$
$G_2^{(L+1)} \leftarrow \arg\min_{G_2} B(T'^{(L+1)}, G_1^{(L+1)}, G_2, G_3^{(L)})$
$G_3^{(L+1)} \leftarrow \arg\min_{G_3} B(T'^{(L+1)}, G_1^{(L+1)}, G_2^{(L+1)}, G_3)$
Update: Bregman iterations:
$E_1^{(L+1)} \leftarrow E_1^{(L)} - T'^{(L+1)} + G_1^{(L+1)} + \varphi^{(L)}/\tau^{(L)}$
$E_2^{(L+1)} \leftarrow E_2^{(L)} - T'^{(L+1)} + G_2^{(L+1)} + \varphi^{(L)}/\tau^{(L)}$
$E_3^{(L+1)} \leftarrow E_3^{(L)} - T'^{(L+1)} + G_3^{(L+1)} + \varphi^{(L)}/\tau^{(L)}$
$da^{(L+1)} \leftarrow da^{(L+1)} - \tau^{L} A^{tr} (AT'^{(L+1)} - G)$
Update: $\tau^{(L+1)}$, $L$
until the stopping criterion is satisfied.

The steps involved in the SHSIR algorithm are given in Algorithm 1, which is the expansion of Eq. (10). In the algorithm, the main aim is to solve for the variable $T'$ at each iteration. This kind of problem is referred to as a quadratic problem with a block circulant system matrix. It is solved efficiently in the Fourier domain.


3 Results

This section demonstrates the noise robustness and reconstruction ability of the proposed SHSIR algorithm. The proposed algorithm is compared with some existing hyperspectral compressive sensing methods (HCSM). The simulations were run in MATLAB 2016b on a Windows 10 system with 8 GB RAM, 1 TB ROM, an Intel i5 processor, and a 2 GB Nvidia graphics card. The URBAN [9] HSI dataset is used as the experimental data for image reconstruction. The URBAN dataset has dimensions 307 × 307 × 210, that is, it is a hyperspectral cube of 210 bands in which each band comprises 307 × 307 pixels. For the experiments, the image is cropped to 200 × 200 pixels in each band to reduce the computational complexity and time.

Three state-of-the-art HCS-based compression techniques are considered for comparison: orthogonal matching pursuit (OMP [10]), reweighted Laplace prior based HCS (RLPHCS [11]), and structured-sparsity-based hyperspectral blind compressive sensing (SSHBCS [12]). OMP is a greedy sparse learning technique. RLPHCS is a structured-sparsity-based sparse learning technique. Both OMP and RLPHCS use off-the-shelf dictionaries to sparsify the HSI. SSHBCS is also a structured-sparsity-based sparse learning technique, but unlike OMP and RLPHCS it performs sparse estimation from the measurements using a learned dictionary. All three models are compared with the proposed model.

To evaluate the images reconstructed by the different methods, three measures are considered: peak signal-to-noise ratio (PSNR [13]), structural similarity index measure (SSIM [14]), and spectral angle mapper (SAM [15]). These measures quantify the performance difference between the proposed model and the state-of-the-art techniques considered for comparison. Specifically, PSNR indicates the average data similarity between the original image and the reconstructed image. Hence, a higher PSNR indicates a better reconstruction.

The sampling rate is the ratio of the dimension of the measurements to that of the actual HSI. It ranges from 0.1 to 0.5 in the experiments, to demonstrate the performance of the algorithm at different sampling rates. A sampling rate of 0.1 corresponds to reconstructing the image from only 10% of the input samples, which is done by a random selection of rows in the measurement matrix β. Afterward, additive white Gaussian noise is added to the measured HSI at the different sampling rates to mimic noise corruption in hyperspectral compressive sensing (HCS). This gives an SNR of 20 dB in the measured HSI. The algorithm is also verified for reconstruction under different levels of noise: in these experiments, the SNR ranges from 5 to 40 dB.
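Two of the measures mentioned above have short closed forms that can be sketched directly. The sketch uses the standard definitions of PSNR (10 log10 of the squared peak value over the mean squared error) and of the spectral angle (arccosine of the normalized inner product of two spectral vectors). The peak value of 1.0 and the flat-list data layout are assumptions of the sketch, and the exact normalization used in the experiments is not given in the text.

```python
import math


def psnr(original, reconstructed, peak=1.0):
    """Peak signal-to-noise ratio in dB for two equal-length sequences."""
    n = len(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def spectral_angle(u, v):
    """Angle in radians between two nonzero spectral vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding before taking arccos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


original = [0.0, 1.0]
reconstructed = [0.1, 0.9]
print(psnr(original, reconstructed))
print(spectral_angle([1.0, 0.0], [0.0, 1.0]))
```

A higher PSNR means a smaller mean squared error, while a smaller spectral angle means the reconstructed spectrum points in nearly the same direction as the original, regardless of its magnitude.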

3.1 Reconstruction Comparison Under Different Sampling Rate/Noise Level

Table 1 shows the SSIM scores of the different approaches on the URBAN dataset as the sampling rate varies, with the noise level held constant at 20 dB. At a sampling rate of 0.1, the proposed SHSIR achieves an SSIM score of 0.8727, which is 2.9%, 9%, and 33% higher than the SSHBCS, RLPHCS, and OMP approaches, respectively. The average SSIM of the SHSIR approach is 0.9516, which is 7.32%, 17.85%, and 25.04% higher than the SSHBCS, RLPHCS, and OMP approaches, respectively.

Table 1 SSIM scores for URBAN dataset under different sampling rate (best results are indicated in bold)

Sampling rate   OMP      RLPHCS   SSHBCS   SHSIR
0.1             0.6555   0.8004   0.8481   0.8727
0.15            0.7749   0.7964   0.8728   0.8973
0.2             0.8247   0.7899   0.8793   0.9308
0.25            0.8236   0.7954   0.8882   0.9597
0.3             0.8046   0.8051   0.8908   0.9688
0.35            0.7827   0.8136   0.898    0.9752
0.4             0.7566   0.8196   0.8988   0.9838
0.45            0.7286   0.823    0.903    0.9881
0.5             0.6981   0.8238   0.901    0.9880

Figure 1 shows the PSNR versus sampling rate curves for the URBAN dataset. The sampling rate (SR) varies between 0.1 and 0.5 in steps of 0.05, with the noise level held constant at 20 dB. At an SR of 0.1, the proposed SHSIR model achieves a PSNR of 30.75, which is 10.9% higher than SSHBCS, 12.2% higher than RLPHCS, and 36.59% higher than OMP. At an SR of 0.5, the SHSIR model achieves a PSNR of 43, which is 25.23%, 29.35%, and 35.16% higher than the SSHBCS, RLPHCS, and OMP models, respectively. The average PSNR of the SHSIR algorithm over SR 0.1–0.5 is 37.4, which is 17.9%, 21.6%, and 26.54% higher than the existing SSHBCS, RLPHCS, and OMP models, respectively.

Fig. 1 PSNR versus sampling rate curves

Table 2 shows the SSIM scores of the different approaches on the URBAN dataset as the noise level varies. The sampling rate is held constant at 0.4, and the SNR is varied from 5 to 40 dB. At 5 dB SNR, the proposed SHSIR achieves an SSIM score of 0.9233, which is 48.23%, 53.3%, and 87% higher than the SSHBCS, RLPHCS, and OMP approaches, respectively. The average SSIM of the SHSIR approach is 0.9235, which is 10.16%, 16%, and 30.26% higher than the SSHBCS, RLPHCS, and OMP approaches, respectively. The SSIM score of SHSIR remains almost constant across noise levels, which supports the noise robustness of the proposed algorithm.

Table 2 SSIM scores for URBAN dataset under different noise level (best results are indicated in bold)

SNR (dB)   OMP      RLPHCS   SSHBCS   SHSIR
5          0.1192   0.3601   0.4071   0.9233
10         0.2387   0.5476   0.6547   0.9236
15         0.4331   0.7103   0.8070   0.9236
20         0.7566   0.8196   0.8988   0.9236
25         0.8035   0.8929   0.9447   0.9236
30         0.8983   0.9398   0.9694   0.9236
35         0.9460   0.9634   0.9769   0.9236
40         0.9569   0.9722   0.9783   0.9236

Figure 2 shows the SAM versus noise level bars for the URBAN dataset. The sampling rate is held constant at 0.4, and the Gaussian noise is varied so that the SNR ranges from 5 to 40 dB, to analyse the performance of the algorithm at different noise levels. The average SAM of the SHSIR algorithm for the URBAN dataset is 2.62 (SNR 5–40 dB), which is almost three, four, and seven times lower than that of the existing SSHBCS, RLPHCS, and OMP models, respectively.

Fig. 2 SAM versus noise level bars

Figure 3 shows the reconstruction results of the different approaches on the URBAN data together with their error-magnitude maps. Comparing Fig. 3d with Fig. 3a–c, the reconstruction error of SHSIR is lower than that of OMP, RLPHCS, and SSHBCS.
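The percentage gains quoted for the average SSIM under varying sampling rate appear to be relative improvements over each baseline, i.e., (proposed − baseline) / baseline. The short Python sketch below checks this reading against the reported SSIM scores (sampling rates 0.1 to 0.5, noise level 20 dB). The formula is an inference from the reported numbers rather than something the paper states, and the helper name is illustrative.

```python
from statistics import mean

# SSIM scores at sampling rates 0.1, 0.15, ..., 0.5 (URBAN dataset, 20 dB),
# copied from the reported results.
ssim = {
    "OMP":    [0.6555, 0.7749, 0.8247, 0.8236, 0.8046, 0.7827, 0.7566, 0.7286, 0.6981],
    "RLPHCS": [0.8004, 0.7964, 0.7899, 0.7954, 0.8051, 0.8136, 0.8196, 0.823, 0.8238],
    "SSHBCS": [0.8481, 0.8728, 0.8793, 0.8882, 0.8908, 0.898, 0.8988, 0.903, 0.901],
    "SHSIR":  [0.8727, 0.8973, 0.9308, 0.9597, 0.9688, 0.9752, 0.9838, 0.9881, 0.9880],
}


def percent_gain(proposed, baseline):
    """Relative improvement of `proposed` over `baseline`, in percent."""
    return 100.0 * (proposed - baseline) / baseline


average = {name: mean(scores) for name, scores in ssim.items()}
gains = {
    name: percent_gain(average["SHSIR"], average[name])
    for name in ("SSHBCS", "RLPHCS", "OMP")
}

for name, gain in gains.items():
    print(f"SHSIR vs {name}: {gain:.2f}% higher average SSIM")
```

Under this definition the sketch reproduces the average SSIM of 0.9516 and the 7.32%, 17.85%, and 25.04% gains reported in the text.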


Fig. 3 The first row shows the reconstruction results and the second row shows the reconstruction error-magnitude maps of the 90th band from the URBAN dataset (sampling rate = 0.2, measurement SNR = 20 dB). a OMP, b RLPHCS, c SSHBCS, d SHSIR, e Original band

4 Conclusion

This research work proposes a new algorithm for the compression and reconstruction of hyperspectral data acquired by compressive sensing. The proposed SHSIR algorithm exploits two main properties of hyperspectral data: the high spatial correlation between abundance fractions and the small number of endmembers required to represent the data. The performance of the algorithm at different sampling rates and under different noise levels is presented in the paper. The improvement in performance is obtained because we incorporate the RMVSA algorithm for endmember extraction, which increases the extraction accuracy. Another main reason for the better performance is that we use the Bregman solver for the optimization problem, which improves the reconstruction accuracy of the HSI dataset. The main limitation of the SHSIR algorithm is computation time, since we are dealing with huge hyperspectral datasets. However, this cost can be tolerated because reconstruction takes place at the ground station, where resources are available in abundance. The computational complexity at the encoder side is greatly reduced because compressive sensing acquires fewer samples. The proposed algorithm is compared with state-of-the-art CS-based compression algorithms, and the experimental results demonstrate the superiority of the proposed technique over them.

Acknowledgements This work was carried out as part of research work at Nitte Meenakshi Institute of Technology (Visvesvaraya Technological University, Belgaum). We are thankful to the institution for its kind support.


References

1. Gunasheela, K.S., Prasantha, H.S.: Satellite Image Compression-Detailed Survey of the Algorithms, Proceedings of ICCR in LNNS Springer, vol. 14, pp. 187–198 (2017)
2. Donoho, D.: Compressed sensing. IEEE Trans. Inf. Theory 52(4), 1289–1306 (2006)
3. Tropp, J.: Just relax: convex programming methods for identifying sparse signals. IEEE Trans. Inf. Theory 51, 1030–1051 (2006)
4. Martín, G., Bioucas-Dias, J.M.: Hyperspectral blind reconstruction from random spectral projections. In: Proc. IEEE JSTARS, 2390–2399 (2016)
5. Agathos, A., Li, J., Bioucas-Dias, J.M., Plaza, A.: Robust minimum volume simplex analysis for hyperspectral unmixing. In: 22nd European Signal Processing Conference (EUSIPCO). Lisbon, pp. 1582–1586 (2014)
6. Chambolle, A.: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20, 89–97 (2004)
7. Xu, Z., Figueiredo, M.A.T., Goldstein, T.: Adaptive ADMM with spectral penalty parameter selection. In: International Conference on Artificial Intelligence and Statistics (AISTATS), vol. 54, pp. 718–727, July 2017
8. Yin, W., Osher, S., Goldfarb, D., Darbon, J.: Bregman iterative algorithms for l1-minimization with applications to compressed sensing. SIAM J. Imaging Sci. 142–168 (2008)
9. http://www.tec.army.mil/Hypercube
10. Tropp, J.A., Gilbert, A.C.: Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inf. Theory 53, 4655–4666 (2007)
11. Zhang, L., Wei, W., Zhang, Y., Tian, C., Li, F.: Exploring structural sparsity by a reweighted Laplace prior for hyperspectral compressive sensing. IEEE Trans. Image Process. 25, 4974–4988 (2016)
12. Zhang, L., Wei, W., Zhang, Y., Shen, C., van den Hengel, A., Shi, Q.: Dictionary learning for promoting structured sparsity in hyperspectral compressive sensing. IEEE Trans. Geosci. Remote Sens. 54(12), 7223–7235 (2016)
13. Peng, Y., Meng, D., Xu, Z., Gao, C., Yang, Y., Zhang, B.: Decomposable nonlocal tensor dictionary learning for multispectral image denoising. In: IEEE Conference on CVPR Columbus USA, pp. 2949–2956 (2014)
14. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004)
15. Yuhas, R.H., Boardman, J.W., Goetz, A.F.H.: Determination of semi-arid landscape endmembers and seasonal trends using convex geometry spectral unmixing techniques. In: Fourth Annual JPL Airborne Geosci. Workshop Washington, vol. 1. (1993)

Development of Low-Cost Embedded Vision System with a Case Study on 1D Barcode Detection

Vaishali Mishra, Harsh K. Kapadia, Tanish H. Zaveri and Bhanu Prasad Pinnamaneni

Abstract In the trend toward miniaturization and smart systems/devices, many industries still work with large and costly computer-based systems rather than embedded systems. The work discussed in this paper focuses on the development of a small, low-cost, low-power embedded vision-based one-dimensional (1D) barcode detection and decoding system that combines a camera with an embedded system. 1D barcodes are prevalent in retail, pharma, automobile, and many other industries for automatic product identification. A real-time 1D barcode localization and decoding algorithm was developed in Python using the OpenCV library. The image processing task is performed by embedded systems, to show that their performance is comparable to that of a computer. Barcode detection results and a comparison of computation time across different hardware platforms are discussed.

Keywords Embedded vision system · Barcode · OpenCV · Machine vision · Image processing

V. Mishra ⋅ H. K. Kapadia (✉) ⋅ T. H. Zaveri ⋅ B. P. Pinnamaneni Tirupati 517502, Andhra Pradesh, India e-mail: [email protected] V. Mishra e-mail: [email protected] T. H. Zaveri e-mail: [email protected] B. P. Pinnamaneni e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_50



1 Introduction

Just as humans see through their eyes while the brain processes and recognizes what is seen, a camera and a computer enable machines to capture visual information and process it in order to perform a task. Human vision cannot keep pace with today's manufacturing techniques, which demand fast, accurate inspection to achieve speedy production and quality products. The industrial application of vision technology is termed machine vision [1, 2]. Machine vision replaces manual inspection with automated inspection using industrial cameras, illumination, sensors, and image processing, which helps to eliminate human error, increase production speed, and improve product quality. A few applications of machine vision are automatic PCB inspection, label inspection on products, robot guidance and checking the orientation of components, packaging inspection, medical vial inspection, reading of serial numbers, product surface inspection, and many more.

The implementation of computer vision on embedded systems is referred to as embedded vision [3]. Embedded systems are microprocessor- or microcontroller-based systems found in many commercial and industrial appliances such as automobiles, surveillance systems, and medical equipment. The availability of powerful, low-cost, and energy-efficient processors allows developers to use computationally complex computer vision algorithms in the real world. ARM processor-based embedded systems are becoming popular due to their lower cost, lower power consumption, compactness, and performance comparable to that of general-purpose CPUs. Numerous powerful ARM-based development boards are available on the market at reasonable prices, which has allowed their use in industrial as well as commercial image processing applications. The advanced driver assistance system (ADAS) is one such application of embedded vision that is receiving growing attention from developers around the world.

The rest of the paper is organized as follows. Section 2, 'Literature Survey', gives a brief introduction to embedded vision platforms and the research carried out in the related field. Section 3, 'Embedded Vision System', explains embedded vision, its applications, and the proposed system model; it also includes the algorithm steps used in developing this application and a comparison of computation time.

2 Literature Survey

Due to the increase in embedded vision applications, many companies are developing embedded boards that can replace a traditional CPU. Figure 1 shows ARM/embedded processor-based development boards designed for various embedded and/or vision applications. From the left, these popular embedded vision platforms are the Blackfin embedded vision system, OpenMV Cam M7, Odroid C2, Raspberry Pi 2B, and NVIDIA Jetson TK1.

Fig. 1 Embedded development boards [18]

The Blackfin embedded vision starter kit is a high-performance board with software development tools, costing $199. It has a dual-core BF609 processor, an onboard 720p color CMOS sensor, and illumination lights, which provide users with an off-the-shelf platform for embedded vision application development. Target applications include machine vision, barcode scanners, robotics vision, and security. The OpenMV Cam M7, costing $55, has a 216 MHz ARM Cortex M7 processor and an onboard camera, and can be programmed in Python. It runs machine vision algorithms such as color tracking and face detection, and can control I/O in the real world. The Odroid C2 is a low-cost 64-bit 1.5 GHz ARMv8 Cortex single-board computer. It costs $40 and can run open-source operating systems such as Ubuntu, Android, Arch Linux, and Debian. It can be used as a set-top box for home theater, as a general-purpose computer for gaming and web browsing, and as a controller for home automation. Raspberry Pi is the most popular single-board computer. Its latest model, the Raspberry Pi 3 Model B, costs around $33. The Raspberry Pi 2 has a 32-bit 900 MHz quad-core ARMv7 Cortex-A7 processor. It can run ARM GNU/Linux distributions and Microsoft Windows 10. Raspberry Pi supports open-source technologies such as communication and multimedia web technologies, which make it suitable for use as an embedded system.

Researchers around the globe have presented work on 1D and 2D barcode localization, decoding, and barcode segmentation techniques. However, few have attempted to develop embedded vision-based applications for barcode detection. A brief literature survey is presented here.

Fernández-Robles and Alegre [4] presented work on the integration of embedded systems with an industrial camera for an image processing application. The authors discussed the advantages of embedded systems and image acquisition on a Raspberry Pi (Linux) and a Toshiba NB200-12N notebook (Windows OS), along with their specifications. They reported a failure to interface a GigE camera with the Raspberry Pi 2. Image acquisition using the camera and a camera API written in C++ with OpenCV was shown.

Rowe et al. [5] presented an embedded vision system comprising a CMOS camera and a low-cost, fast microcontroller. Their basic idea was to replace costly frame grabbers and traditionally used host computers for simple vision applications. On the developed $109 system, they implemented color blob tracking at 16.7 fps. They compromised on the horizontal resolution of the image because of the limited memory of the microcontroller. The authors presented the developed hardware, image processing, and camera setting modes.


Chai and Hock [6] presented work on EAN-13 barcodes with the image processing performed by a mobile phone. Barcode localization followed by decoding according to the EAN-13 barcode specification was discussed. Liyanage [7] presented a barcode decoding algorithm for real-time embedded applications such as point-of-sale terminals, without the use of cameras. The paper discussed distortions of the barcode itself, such as improper printing, scratches, and fading, and distortions introduced during capture, such as uneven illumination, blur, noise, and skewed images, all of which can give incorrect decoding results. To address these issues, the author proposed scanning multiple lines, which improves robustness but increases computation time. Tekin and Coughlan [8] described a Bayesian algorithm for decoding UPC barcodes from low-quality images captured by mobile phone cameras, and gave a detailed explanation of the deformable template. The proposed algorithm is somewhat complicated but robust against distortions such as uneven illumination, low resolution, missed edges, and noisy images. Tropf and Chai [9] proposed an algorithm to localize 1D barcodes on consumer products in the discrete cosine transform (DCT) domain. Mobile phone cameras use the JPEG codec, which is an accepted compression standard. The paper explained the localization algorithm and the features of 1D barcodes in the DCT domain. DCT coefficients also provide angle and orientation information.

Two conclusions can be drawn from the above survey. First, many researchers have worked on barcode decoding applications for consumers using mobile phones. Second, many industries still use computer-based systems, which are space consuming and costly. Although embedded systems are growing, not much work has been done on them in this context. In industry, there is a need for high-resolution cameras and smaller systems. Hence, we present a model of 1D barcode decoding for an industrial application using industrial high-resolution cameras and embedded systems in order to reduce cost, space, and power consumption.

3 Embedded Vision System

Embedded vision [10] refers to embedded systems that extract meaningful information from images. It involves the implementation of computer vision algorithms on embedded systems. Embedded systems are special-purpose microprocessor-based single-board computers found in many everyday appliances such as automobiles, surveillance systems, and medical equipment.

A typical machine vision system comprises an industrial-grade computer and an industrial camera, which consume more power and have a large footprint. Such a system carries hardware overhead that is not required by all applications and increases the overall cost. The miniaturization trend in electronic circuits introduced the single-board computer (SBC), i.e., a computer on a small single board, which is compact and can be paired with small cameras without housing, called board-level cameras [11]. This allows easy integration into compact systems. SBCs run the open-source Linux operating system on ARM processors instead of the x86 processors used in computers. In a nutshell, embedded systems offer the advantages of a small, compact, lean, lightweight structure, lower power consumption, and a smaller footprint. Moreover, the system is customized, which makes it cost effective.

Embedded systems have huge potential applications [12] in the consumer, medical, automotive, entertainment, retail, industrial, and aerospace domains. Examples include automotive applications such as number plate recognition and driver assistance systems; security applications such as surveillance for face or movement recognition; industrial applications such as barcode reading or OCR; medical applications such as tumor detection and vein detection; and consumer applications such as interactive gaming and gesture-controlled smart TVs.

The proposed embedded system model is shown in Fig. 2. The objective of the work is to perform a machine vision application (1D barcode decoding) and to replace a costly, bulky, and power-consuming computer-based system with a low-cost and compact embedded vision system.

Fig. 2 Block diagram of embedded vision system [16]

As seen in the figure, a product (a carton or bottle with a barcode, label, OCR text, or logo on it) passing on a conveyor is sensed by a proximity sensor interfaced to the embedded platform. The sensor triggers the industrial camera, which captures an image of the object and transfers it to the embedded platform (SBC). The SBC performs barcode decoding and communicates the result to a server for verification. If the barcode result is found to be incorrect, an actuator is triggered to push the object into a reject bin. If the barcode result is correct, the object is collected in an accept bin.

The hardware requirements for the system are an ARM processor-based SBC, a camera, and its interface (USB 2.0, GigE, or USB 3.0); the software requirement is an IDE for image processing. The Raspberry Pi 2B has a 32-bit ARM Cortex-A7 processor and is widely used as a cost-effective SBC. A USB 2.0/GigE camera interface was chosen because the Raspberry Pi does not have USB 3.0, and the image processing algorithm was developed in Python using OpenCV and custom functions.
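The accept/reject flow of the proposed system can be summarized as a small Python sketch. The four callables are hypothetical stand-ins for the camera, decoder, server, and actuator interfaces, which the paper does not specify; treating an undecodable barcode as a reject is also an assumption.

```python
def inspect_product(capture_image, decode_barcode, verify_with_server, push_to_reject_bin):
    """Handle one proximity-sensor trigger and return 'accept' or 'reject'.

    All four arguments are hypothetical callables standing in for the
    hardware and server interfaces of the real system.
    """
    image = capture_image()                  # industrial camera capture
    code = decode_barcode(image)             # None if nothing could be decoded
    if code is not None and verify_with_server(code):
        return "accept"                      # object continues to the accept bin
    push_to_reject_bin()                     # actuator pushes the object out
    return "reject"


# Usage with stand-in callables instead of real hardware.
rejected = []
good = inspect_product(
    capture_image=lambda: "image",
    decode_barcode=lambda image: "12345",
    verify_with_server=lambda code: code == "12345",
    push_to_reject_bin=lambda: rejected.append("pushed"),
)
bad = inspect_product(
    capture_image=lambda: "image",
    decode_barcode=lambda image: None,
    verify_with_server=lambda code: True,
    push_to_reject_bin=lambda: rejected.append("pushed"),
)
```

Passing the interfaces in as callables keeps the decision logic testable on a desktop before the camera and actuator are wired to the SBC.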

3.1 Barcode Detection

Barcodes [13, 14] are ubiquitous and are used in retail, shipping and packaging, health care, education, logistics, transportation, and other industries. A barcode encodes product-related information such as where the product was manufactured and the manufacturer's and product's unique numbers. In a 1D barcode, information is encoded in the varying widths of black bars and white spaces. Barcodes are important because they help keep track of large stockpiles and save cost and time by eliminating manual inventory work, which eliminates human error and increases production speed. Handheld barcode scanners are available, but they require human intervention and are time consuming for stock-taking. Hence, there is a need for an automated system that can decode barcodes at higher speed.

Barcode decoding involves localization, angle correction, and decoding. Localization extracts the barcode region from the entire image and reduces the overall area to be decoded. Barcodes in captured images are often rotated; the angle correction step orients the barcode horizontally. Images captured using the industrial camera and the Raspberry Pi camera module [15, 16] are shown in Figs. 3 and 4.

Fig. 3 Industrial camera image

Fig. 4 Raspberry Pi camera image

The captured images may be in color and are converted to grayscale, since barcode decoding does not require color information. The captured images may contain text, logos, etc., along with the barcode, so the barcode has to be detected and localized. First, edges are detected by subtracting the gradients in the x- and y-directions computed with the Sobel operator. Each gradient is found by convolving the image with the Sobel kernel for the x- or y-direction. The convolution formula is given in (1):

$$g(x, y) = \sum_{k=-n_2}^{n_2} \sum_{j=-m_2}^{m_2} h(j, k)\, f(x - j,\, y - k) \qquad (1)$$


Fig. 5 Barcode sample image, localization, and angle correction

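Here, f is the input pixel value, g is the output gradient, m and n are the dimensions of the kernel, m_2 = m/2 and n_2 = n/2 (rounded down to an integer for odd kernel sizes), and h is the kernel. The Sobel kernels are given in (2):

$$h_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad h_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} \qquad (2)$$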

To remove noise, the image is blurred by convolving it with an averaging filter kernel. A morphological closing operation followed by erosion and dilation is then performed to form a blob covering the barcode. Angle correction using a Hough transform corrects skewed barcodes in the captured images and assists accurate decoding. The corrected barcode image is then decoded according to the barcode type, and the result is communicated to the server for verification (Fig. 5).
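The gradient step of the localization can be sketched in plain Python as a direct transcription of the convolution formula (1) with the Sobel kernels (2). This is an illustration only: the paper's implementation uses OpenCV, whereas the sketch avoids it, treats pixels outside the image as zero, and reads "subtracting the gradients" as |g_x| − |g_y| clipped at zero. These choices are assumptions, not the authors' code.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def convolve(image, kernel):
    """Evaluate formula (1) at every pixel; out-of-range pixels count as zero."""
    rows, cols = len(image), len(image[0])
    n2, m2 = len(kernel) // 2, len(kernel[0]) // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0
            for k in range(-n2, n2 + 1):
                for j in range(-m2, m2 + 1):
                    yy, xx = y - k, x - j  # true convolution: kernel is flipped
                    if 0 <= yy < rows and 0 <= xx < cols:
                        total += kernel[k + n2][j + m2] * image[yy][xx]
            out[y][x] = total
    return out


def barcode_response(image):
    """Strong where horizontal gradients dominate, as across vertical bars."""
    gx = convolve(image, SOBEL_X)
    gy = convolve(image, SOBEL_Y)
    return [
        [max(0, abs(a) - abs(b)) for a, b in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]


# A tiny grayscale test image with one vertical dark-to-bright edge.
stripes = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
gx = convolve(stripes, SOBEL_X)
gy = convolve(stripes, SOBEL_Y)
response = barcode_response(stripes)
```

On this test image the x-gradient is large at the edge while the y-gradient is zero in the interior rows, which is the property that makes the subtraction highlight barcode regions with vertical bars.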

3.2 Experimental Results

The barcode decoding application was tested on computers running Windows and Linux, a Raspberry Pi 2B [17], an Odroid C2 [15], and an NVIDIA Jetson TK1. Table 1 compares the computer, Odroid C2, Raspberry Pi 2B, and NVIDIA Jetson TK1 systems. Table 2 compares the image acquisition time and the computation time (barcode localization and decoding) for online captured images and offline/stored images. For both online and offline processing, the image size is 640 × 480 and the times are in seconds. From the comparison, it is evident that the Odroid C2 and NVIDIA Jetson TK1 show comparable performance and can be used for specific application development.

Table 1 System comparison

System              CPU                                                           RAM                OS
Computer            Intel® Core™ i3-4170, 64-bit, 3.70 GHz × 4                    4 GB               Windows 7/Ubuntu 16.04
Odroid C2           Amlogic S905 SoC, 4 × ARM Cortex-A53, 2 GHz, 64-bit ARM v8    2 GB 32-bit DDR3   Ubuntu 16.04
Raspberry Pi 2B     Broadcom BCM2836, 4 × ARM Cortex-A7, 900 MHz, 32-bit ARM v7   1 GB 32-bit DDR2   Raspbian-Jessie
NVIDIA Jetson TK1   NVIDIA 2.32 GHz ARM quad-core Cortex-A15 CPU                  2 GB DDR3L         Ubuntu 14.04

Table 2 Online execution time comparison

System                  Acquisition   Processing (online)   Processing (offline)
Computer (Linux OS)     0.560         0.260                 0.246
Computer (Windows OS)   0.991         0.507                 1.471
Odroid C2               1.001         2.084                 2.145
Raspberry Pi 2B         2.176         12.751                8.845
NVIDIA Jetson TK1       0.995         1.258                 1.794
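To put the execution times in perspective, the short Python sketch below expresses each platform's online processing time as a multiple of the Linux computer's time. The assignment of the reported values to platforms follows the column order of the execution time table as extracted here, which is an assumption; the variable names are illustrative.

```python
# Online processing times in seconds for a 640 x 480 image, as reported in
# the execution time comparison (platform assignment assumed from column order).
online_processing = {
    "Computer (Linux OS)": 0.260,
    "Computer (Windows OS)": 0.507,
    "Odroid C2": 2.084,
    "Raspberry Pi 2B": 12.751,
    "NVIDIA Jetson TK1": 1.258,
}

baseline = online_processing["Computer (Linux OS)"]
slowdown = {name: seconds / baseline for name, seconds in online_processing.items()}

for name, factor in slowdown.items():
    print(f"{name}: {factor:.1f}x the Linux computer's processing time")
```

Read this way, the Jetson TK1 and Odroid C2 take roughly 5 and 8 times as long as the Linux computer, while the Raspberry Pi 2B takes about 49 times as long, which is consistent with the text singling out the first two boards.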

4 Conclusion

This paper discussed the development of an embedded vision system that integrates an industrial camera with an SBC for a 1D barcode decoding application. The computational cost in terms of execution time was compared across platforms, and the results indicate that embedded systems can deliver performance comparable to that of a computer for this application and can replace computer-based systems. Challenges faced in the development include the integration of the industrial camera with embedded platforms, camera drivers, open-source software development, the user interface, debugging, and troubleshooting. Manufacturers of industrial cameras have started to support integration of their cameras with popular SBCs such as the Raspberry Pi, NVIDIA Jetson TK1, and Odroid C2.

References

1. UKIVA: Machine Vision Handbook. www.ukiva.org (2007)
2. SICK: Machine Vision Introduction. www.sickivp.com (2006)
3. Changing the face of machine vision, Tuesday Aug 2016. https://www.imveurope.com/feature/changing-face-machine-vision
4. Fernández-Robles, L., Alegre, E.: Design and implementation of an embedded system for image acquisition of inserts in a headtool machine. In: XXXVI Jornadas de Automática, Bilbao, España (2015)
5. Rowe, A., Rosenberg, C., Nourbakhsh, I.: A Low Cost Embedded Color Vision System. http://www.cs.cmu.edu/~cmucam/Publications/iros-2002.pdf (2002)
6. Chai, D., Hock, F.: Locating and Decoding EAN-13 Barcodes from Images Captured by Digital Cameras. In: ICICS (2005)
7. Liyanage, J.P.: Efficient Decoding of Blurred, Pitched, and Scratched Barcode Images. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.330.5850&rep=rep1&type=pdf (2007–2016)
8. Tekin, E., Coughlan, J.: A Bayesian algorithm for reading 1D barcodes. In: Canadian Conference on Computer and Robot Vision, 2009. CRV '09 (2009)
9. Tropf, A., Chai, D.: Locating 1-D bar codes in DCT-domain. In: Proceedings of 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Toulouse, France (2006)
10. Embedded vision. http://www.baslerweb.com/en/support/knowledge-base/embedded-vision (2016)
11. PC-based Machine Vision Versus Smart Camera Systems, Jan 2017. http://www.thomasnet.com/articles/automation-electronics/smart-camera-versus-pc-based-machine
12. Applications for Embedded Vision, 2011–2017. http://www.embedded-vision.com/applications-embedded-vision
13. Joseph, E., Pavlidis, T.: Bar code waveform recognition using peak locations. IEEE Trans. Pattern Anal. Mach. Intell. 630–640 (1994)
14. Shams, R., Sadeghi, P.: Barcode recognition in highly distorted and low resolution images. In: IEEE International Conference on Acoustics, Speech and Signal Processing (2007)
15. Karstens, F.: Machine Vision goes Embedded Vision, Sept 2016. http://www.qualitymag.com/articles/93541-machine-vision-goes-embedded-vision
16. Integration of vision into embedded system. https://www.vision-systems.com/articles/print/volume-22/issue-1/features/integration-of-vision-in-embedded-systems.html (2017)
17. Odroid: http://www.hardkernel.com/main/products/prdt_info.php (2013)
18. Torsten Wehner, M.D.: Selection and integration of ARM®-based boards for machine vision applications, Apr 2016. http://www.baumer.com/fileadmin/user_upload/international/Downloads/WP/Baumer_ARM-for-machine-vision_EN_20160401_WP.pdf

Path Planning of Mobile Robot Using PSO Algorithm

S. Pattanayak, S. Agarwal, B. B. Choudhury and S. C. Sahoo

Abstract Path planning of mobile robots is emerging as a prominent research field. This paper presents particle swarm optimization (PSO) for optimizing the path length of a mobile robot. The proposed approach minimizes the path length of the mobile robot while avoiding any physical contact with the obstacles between the starting and destination points. The method uses a static environment for estimating the path length between two points. In total, six obstacles are considered in this evaluation. MATLAB was used to implement the PSO approach.

Keywords Mobile robot · Path planning · PSO

S. Pattanayak Department of Production Engineering, Indira Gandhi Institute of Technology, Sarang 759146, Odisha, India e-mail: [email protected]
S. Agarwal ⋅ B. B. Choudhury (✉) ⋅ S. C. Sahoo Department of Mechanical Engineering, Indira Gandhi Institute of Technology, Sarang 759146, Odisha, India e-mail: [email protected] S. Agarwal e-mail: [email protected] S. C. Sahoo e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_51

1 Introduction

The expansion of industries and their need for continuous production, work in hazardous situations, and unattended manufacturing operations limit the role of human workers. Thus, it is crucial to develop robots that can be controlled through a cellular phone, laptop, or remote controller. Determining the path by which the robot reaches its goal point is a challenging task for the designer.



So the path should be selected in such a way that collisions with obstacles are avoided and the path length is as short as possible. Path planning and the determination of path length are carried out under two environment conditions, static and dynamic [1]. In a static environment, the robot has complete knowledge of its surroundings and the obstacles, whereas in a dynamic environment the positions of the obstacles change over time. Path selection depends on factors such as the shortest route, the lowest cost, and the minimum travel time.

Various computational approaches have been followed for selecting the optimal path. Chaari et al. [1] examined the capabilities of tabu search (TS) for estimating the global path and compared the results with those of a genetic algorithm (GA) approach. The TS approach was reported to provide a nearly optimal solution with a shorter execution time than the GA approach. Solea and Cernega [2] proposed a new control strategy for obstacle avoidance in static and dynamic environments for wheeled mobile robots used in flexible manufacturing systems. This approach delivers optimum results in both static and dynamic environments. Kumari [3] developed indoor navigation for a wheeled robot so that the robot can move to any location without human intervention. This is made possible by integrating a map of the environment into the system. The robot is equipped with an infrared sensor for detection, range determination, and obstacle avoidance. Amith et al. [4] implemented normal probability weight distribution (NPWD) for an outdoor environment in which the robot moves from one static node to another along a planned path. A heuristics-based shortest path (HSP) algorithm was also employed for solving complex problems. Brassai et al. [5] applied the travelling salesman problem (TSP) to evaluate the optimal path between two points in a working space. Duchon et al. [6] developed a modified A* (A star) algorithm to optimize the computational time and path length of the mobile robot. The modifications comprise the Basic Theta*, Phi*, RSR (rectangular symmetry reduction), and JPS (jump point search) approaches. For determining the optimized path length, the Basic Theta* algorithm was found to be convenient. Yusof et al. [7] adopted the PSO soft computing approach for trajectory path planning and determined the shortest route according to the lowest cost. Some predetermined waypoints are considered for path planning. This approach contributes to the mobility of visually impaired people without a caretaker. Han and Seo [8] used a path improvement algorithm based on the former and latter points (PI_FLP) for estimating the shortest and smoothest path. A surrounding point set (SPT) was also used to obtain a set of points surrounding the obstacles, which helps in solving path planning problems. Zhou et al. [9] compared PSO with the tangent vector based artificial potential field (TVAPF) approach to identify the more efficient path planning method. PSO was reported to provide improved path planning efficiency and to overcome the limitations of the APF approach. Rulong et al. [10] implemented a fuzzy control system to demonstrate robotic operations. The developed approach was found to be effective for determining the path length in a multi-robot system.

Path Planning of Mobile Robot Using PSO Algorithm


In this paper, an attempt is made to find the shortest route by which the robot reaches its destination without any collision. A static environment with complete information about the positions of the obstacles is considered in this study. The optimum path length between the starting point and the end point for the mobile robot is estimated using the PSO soft computing approach. The PSO algorithm is programmed in MATLAB.

2 Experimental Setup

This study begins with the development of an autonomous mobile robot for assessing the path length between the starting point and the goal point. A photographic view of the mobile robot, which also identifies its components, is presented in Fig. 1. The robot can run smoothly, bypassing the obstacles in its path without any human involvement. The positions of the starting point and goal point of the environment used to determine the path length are presented in Table 1. The six obstacles are highlighted as pink circles in Fig. 2. The starting and goal point coordinates are set as (0, 0) meter

Fig. 1 Photographic view of a mobile robot


S. Pattanayak et al.

Table 1 Environment condition

Sl. no. | No. of static obstacles | Coordinates of starting point | Coordinates of end point
3 | 6 | (0, 0) | (8, 9)

Fig. 2 Pictorial view of the environment

Table 2 Size and position of the obstacles

Obstacle number | X-axis coordinate (m) | Y-axis coordinate (m) | Radius of the circle (m)
Obstacle 1 | 1.7 | 1.7 | 1.4
Obstacle 2 | 2.5 | 5.5 | 1.6
Obstacle 3 | 4.5 | 8.5 | 0.8
Obstacle 4 | 5.0 | 3.0 | 0.9
Obstacle 5 | 7.1 | 5.5 | 1.5
Obstacle 6 | 0.1 | 4.5 | 0.6

and (8, 9) meter, respectively. The size and position of the obstacles are presented in Table 2. During the estimation of the path length, the size and position of each obstacle remain constant for every test.

3 Algorithm

The PSO soft computing approach needs only a few programming commands to reach the global optimum by optimizing the path length. The approach works well when complete knowledge of the obstacles present is available. After every iteration, each particle updates its position and velocity according to the following formulas:

Table 3 PSO parameters

Sl. no. | Parameter name | Parameter value
1 | Number of particles (i) | (10–10,000)
2 | Number of iteration (it) | 20
3 | Acceleration coefficient (c1, c2) | c1 = c2 = 1.5
4 | Inertia weight (w) | 1

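$$\mathrm{prtpos}_{i}^{j} = \mathrm{prtpos}_{i-1}^{j} + \mathrm{prtvel}_{i}^{j} \tag{1}$$

$$\mathrm{prtvel}_{i}^{j} = w \cdot \mathrm{prtvel}_{i-1}^{j} + c_1 r_1 \left(\mathrm{pbest}_{i-1}^{j} - \mathrm{prtpos}_{i-1}^{j}\right) + c_2 r_2 \left(\mathrm{gbest}_{i-1} - \mathrm{prtpos}_{i-1}^{j}\right) \tag{2}$$

where
$\mathrm{prtpos}_{i}^{j}$: position of the jth particle in the ith iteration
$\mathrm{prtvel}_{i}^{j}$: velocity of the jth particle in the ith iteration
$\mathrm{pbest}_{i}^{j}$: best position of the jth particle up to the ith iteration
$\mathrm{gbest}_{i}$: best position within the swarm
$c_1, c_2$: acceleration coefficients
$r_1, r_2$: random numbers between 0 and 1 (they vary at each iteration)
$w$: inertia weight (constant)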

The parameters whose values are fixed in the PSO algorithm for evaluating the global best value (gbest) are listed in Table 3.
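The update rules can be illustrated with a minimal Python sketch. It is not the authors' MATLAB program: it assumes the standard PSO form, in which the inertia weight multiplies the previous velocity, and the function name `pso_step` and its argument layout are hypothetical. The default values of `w`, `c1`, and `c2` are taken from Table 3.

```python
import random

def pso_step(positions, velocities, pbest, gbest, w=1.0, c1=1.5, c2=1.5, rng=random):
    """Apply one velocity-and-position update to every particle, in place.

    positions, velocities, pbest: one coordinate list per particle.
    gbest: coordinate list of the best position found by the swarm.
    rng: any object with a random() method returning a value in [0, 1).
    """
    for j in range(len(positions)):
        for d in range(len(positions[j])):
            # Fresh random factors for every coordinate update.
            r1, r2 = rng.random(), rng.random()
            velocities[j][d] = (w * velocities[j][d]
                                + c1 * r1 * (pbest[j][d] - positions[j][d])
                                + c2 * r2 * (gbest[d] - positions[j][d]))
            # The new position is the old position plus the new velocity.
            positions[j][d] += velocities[j][d]
    return positions, velocities
```

A particle that already sits at both its personal best and the global best with zero velocity does not move, which is a quick sanity check on the update.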

4 Results

After every iteration, the position and velocity of the particles are updated using Eqs. (1) and (2). Each particle position is then checked to see whether it lies inside an obstacle. If it does, that swarm is discarded and the update is repeated. If not, the global best (gbest) and particle best (pbest) values are computed for the particles. Finally, the termination condition is checked. If it is satisfied, the algorithm terminates; otherwise it goes on to evaluate the fitness function again. The fitness function is therefore vital here. A series of tests was conducted to find the shortest path between the coordinates (0, 0) and (8, 9) using the PSO algorithm. The shortest path in this static environment for the mobile robot is presented in Fig. 3, where the obstacles are drawn as small pink circles and the travel trajectory of the mobile robot from the origin to the goal point is displayed as a black line. This figure also represents the optimal path for the robot. The path length was updated after every iteration and is plotted against the iterations in Fig. 4. The results of the test series are tabulated in Table 4. From Fig. 4 and Table 4, the optimum path length is found to be 13.1182 m.

Fig. 3 Shortest path using PSO

Fig. 4 Length of the path in every iteration
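The obstacle check and the path-length fitness described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the obstacle data come from Table 2, the helper names `inside_obstacle` and `path_length` are hypothetical, and the check tests waypoints only, not the segments between them.

```python
import math

# Obstacles from Table 2 as (x, y, radius), all in metres.
OBSTACLES = [(1.7, 1.7, 1.4), (2.5, 5.5, 1.6), (4.5, 8.5, 0.8),
             (5.0, 3.0, 0.9), (7.1, 5.5, 1.5), (0.1, 4.5, 0.6)]

def inside_obstacle(point, obstacles=OBSTACLES):
    """Return True if the point lies inside or on any circular obstacle."""
    x, y = point
    return any(math.hypot(x - ox, y - oy) <= r for ox, oy, r in obstacles)

def path_length(waypoints):
    """Sum of the straight-line distances between consecutive waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))
```

With these helpers, a candidate path would be rejected whenever `inside_obstacle` is true for one of its waypoints, and `path_length` would serve as the quantity to minimize.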

Table 4 Path length obtained by using PSO

Test number | Length of the path (m)
1 | 13.1785
2 | 15.0095
3 | 13.5114
4 | 13.7753
5 | 15.8573
6 | 15.5295
7 | 13.9383
8 | 13.1182
9 | 15.7435
10 | 13.5097
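The reported optimum can be read off the ten tabulated lengths with a few lines of Python. The sketch below uses only the tabulated values; the straight-line distance between (0, 0) and (8, 9) is added as an obstacle-free lower bound for comparison, which is an illustration and not a result from the experiments.

```python
import math
import statistics

# Path lengths from Table 4, in metres, in test order.
lengths = [13.1785, 15.0095, 13.5114, 13.7753, 15.8573,
           15.5295, 13.9383, 13.1182, 15.7435, 13.5097]

best = min(lengths)                          # shortest path over all tests
best_test = lengths.index(best) + 1          # 1-based test number of the best run
mean_length = statistics.mean(lengths)       # average over the ten tests
straight_line = math.dist((0, 0), (8, 9))    # lower bound that ignores obstacles

print(best_test, best, round(mean_length, 4), round(straight_line, 4))
```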

5 Conclusion

In this work, a collision-free optimal trajectory for a mobile robot in a static environment is established using the PSO algorithm. The simulation work shows that the PSO soft computing approach is simple and easy to implement, and that it determines the shortest path in less time than other existing approaches. Among all the tests conducted, the optimum trajectory length is found to be 13.1182 m.

References

1. Chaari, I., Koubaa, A., Ammar, A., Trigui, S., Youssef, H.: On the adequacy of tabu search for global robot path planning problem in grid environments. In: 5th International Conference on Ambient Systems, Networks and Technologies, vol. 32, pp. 604–613. Elsevier B.V (2014)
2. Solea, R., Cernega, D.C.: Obstacle avoidance for trajectory tracking control of wheeled mobile robots. In: Proceedings of the 14th IFAC Symposium on Information Control Problems in Manufacturing, Bucharest, Romania, vol. 472, pp. 279–290. Springer, Heidelberg (2013)
3. Kumari, C.L.: Building algorithm for obstacle detection and avoidance system for wheeled mobile robot. Glob. J. Res. Eng. Electr. Electron. Eng. 12(11), 11–14. Global Journals Inc., USA (2012)
4. Amith, A.L., Singh, A., Harsha, H.N., Prasad, N.R.: Normal probability and heuristics based path planning and navigation system for mapped roads. In: International Multi-Conference on Information Processing (IMCIP), vol. 89, pp. 369–377. Elsevier B.V (2016)
5. Brassai, S.T., Iantovics, B., Enachescu, C.: Artificial intelligence in the path planning optimization of mobile agent navigation. In: Emerging Markets Queries in Finance and Business, vol. 3, pp. 243–250. Elsevier B.V (2012)
6. Duchon, F., Babinec, A., Kajan, M., Beno, P., Florek, M., Fico, T., Jurisica, L.: Path planning with modified a star algorithm for a mobile robot. Model. Mech. Mech. Syst. 96, 59–69. Elsevier Ltd. (2014)
7. Yusof, T.S.T., Toha, S.F., Yusof, H.M.: Path planning for visually impaired people in an unfamiliar environment using particle swarm optimization. In: IEEE International Symposium on Robotics and Intelligent Sensors, vol. 76, pp. 80–86. Elsevier B.V (2015)
8. Han, J., Seo, Y.: Mobile robot path planning with surrounding point set and path improvement. Appl. Soft Comput. 57, 35–47. Elsevier B.V (2017)
9. Zhou, Z., Wang, J., Zhu, Z., Yang, D., Wu, J.: Tangent navigated robot path planning strategy using particle swarm optimized artificial potential field. Optik - Int. J. Light Electr. Opt. 158, 639–651. Elsevier B.V (2018)
10. Rulong, X., Qiang, W., Lei, S., Lei, C.: Design of multi-robot path planning system based on hierarchical fuzzy control. Adv. Control Eng. Inf. Sci. 15, 235–239. Elsevier Ltd. (2011)

An Application of Maximum Probabilistic-Based Rough Set on ID3

Utpal Pal and Sharmistha Bhattacharya (Halder)

Abstract The Iterative Dichotomiser 3 (ID3) recursively partitions the problem domain, producing the subsequent partitions as decision trees. However, the classification accuracy of ID3 decreases when it deals with very large datasets. The maximum probabilistic-based rough set (MPBRS) is a sophisticated approach for the reduction of insignificant features. This paper presents an application of MPBRS to ID3 as a method for eliminating insignificant attributes. For comparison, the paper further investigates ID3 using the Pawlak rough set and the Bayesian decision-theoretic rough set (BDTRS). The experimental results, obtained with the R language on datasets collected from the UCI Machine-Learning repository, show that MPBRS-based ID3 induces an enriched decision tree, resulting in improved classification.

Keywords ID3 · MPBRS · Feature reduction

1 Introduction

Since the inception of the rough set concept by Pawlak [7], several enhancements and extensions have been proposed [13, 15, 17]. MPBRS [5, 7] is a rough set technique for feature set minimization. It computes an indiscernibility relation and set approximations that remain unaffected by feature reduction. It also produces a positive region that contains the highest number of entities compared with other rough set based methods. For this reason, the resultant feature set retains all the significant features and rejects all the insignificant ones. ID3 [8, 9] is a widely used supervised decision tree algorithm for classification problems [16]. It is accepted in both industry and academia for its simplicity, comprehensibility, and classification accuracy. But for very large datasets, especially those with a huge number of features, the quality of the induced tree is worse. Moreover, ID3 cannot properly handle missing values, outliers, and mixed-type data. To overcome these shortcomings of ID3, MPBRS-based feature minimization is introduced before decision tree induction. This paper also investigates ID3 with respect to the Pawlak rough set [7] and BDTRS [1, 6]. This research work uses the R language [4, 10] (Version 3.3.2 for Windows) and experiments on three (3) different standard secondary data sets. The theoretical and mathematical background of the Pawlak rough set, BDTRS, MPBRS, and decision trees is outlined in Sect. 2. Section 3 discusses decision tree learning by MPBRS-based ID3. Experimental results are explained in Sect. 4, followed by a conclusion in Sect. 5.

U. Pal (✉) ⋅ S. B. (Halder)
Tripura University, Agartala, Tripura, India
e-mail: [email protected]
S. B. (Halder)
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_52

2 Mathematical Background

2.1 Pawlak Rough Set Theory [7, 14]

An information system is defined as $S = (U, At, \{V_a \mid a \in At\}, \{I_a \mid a \in At\})$, where $U$ is a set of entities, $At$ is a set of features, $V_a$ is the set of values of $a \in At$, and $I_a : U \to V_a$ is an information function. It maps an entity in $U$ to one value in $V_a$. The approximations of $X \subseteq U$ with respect to $R$ can be defined according to its lower ($\underline{R}X$) and upper ($\overline{R}X$) approximations.

$$\underline{R}X = \bigcup \left\{ [x]_R \mid P(X \mid [x]_R) = 1,\ [x]_R \in \pi_R \right\} \tag{1}$$

$$\overline{R}X = \bigcup \left\{ [x]_R \mid P(X \mid [x]_R) > 0,\ [x]_R \in \pi_R \right\} \tag{2}$$

The positive region is computed as:

$$POS_R(X) = \underline{R}X = \bigcup \left\{ [x]_R \mid P(X \mid [x]_R) = 1,\ [x]_R \in \pi_R \right\} \tag{3}$$

2.2 The Bayesian Decision Theoretic Rough Set [1, 2, 6, 12]

Consider $D_{POS}$ as the positive region in the BDTRS model for feature reduction. For an equivalence class $[x]_C \in \pi_A$,

$$D_{POS}([x]_C) = \left\{ D_i \in \pi_D : P(D_i \mid [x]_C) > P(D_i) \right\} \tag{4}$$

The members of a discernibility matrix, $M_{D_{POS}}$, for equivalence classes $[x]_C$, $[y]_C$ are defined as

$$M_{D_{POS}}([x]_C, [y]_C) = \left\{ a \in C : I_a(x) \neq I_a(y) \wedge D_{POS}([x]_C) \neq D_{POS}([y]_C) \right\} \tag{5}$$

The positive-decision reduct is taken as the prime implicant of the reduced disjunctive form of the discernibility function. It is defined as

$$f(M_{D_{POS}}) = \bigwedge \left\{ \bigvee M_{D_{POS}}([x]_C, [y]_C) : \forall x, y \in U,\ M_{D_{POS}}([x]_C, [y]_C) \neq \emptyset \right\} \tag{6}$$

2.3 The Maximum Probabilistic-Based Rough Set [5, 7, 14]

MPBRS is a robust feature reduction method among the rough set methods. The precision of a class $[x]_C \in \pi_C$ for predicting a decision $D_i \in \pi_D$ is denoted by $P_{\max}(D_i, [x]_C)$ and is defined as follows:

$$P_{\max}(D_i, [x]_C) = \frac{|[x]_C \cap D_i|}{\max_j |[x]_C \cap D_j|} \tag{7}$$

For a decision class $D_i \in \pi_D$, the MPBRS upper and lower approximations with respect to $\pi_C$ are

$$\overline{apr}_{\max(\pi_C)}(D_i) = \left\{ x \in U : P_{\max}(D_i, [x]_C) = \frac{|[x]_C \cap D_i|}{\max_j |[x]_C \cap D_j|} > 0 \right\} \tag{8}$$

$$\underline{apr}_{\max(\pi_C)}(D_i) = \left\{ x \in U : P_{\max}(D_i, [x]_C) = \frac{|[x]_C \cap D_i|}{\max_j |[x]_C \cap D_j|} = 1 \right\} \tag{9}$$

The positive region of $D_i \in \pi_D$ with respect to $\pi_C$ is

$$POS_{\max(\pi_C)}(D_i) = POS_{\max}(D_i, \pi_C) = \left\{ x \in U : P_{\max}(D_i, [x]_C) = \frac{|[x]_C \cap D_i|}{\max_j |[x]_C \cap D_j|} = 1 \right\} \tag{10}$$

Attribute/Feature Significance [7]. The consistency factor (C.F) may be defined as $\gamma(C, D) = |POS_C(D)| / |U|$. The data set is consistent if $\gamma(C, D) = 1$. The significance of a feature $a$ can be defined as $\sigma_{(C,D)}(a) = 1 - \gamma(C - \{a\}, D) / \gamma(C, D)$, where $0 \le \sigma(a) \le 1$.
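The precision measure and the positive region can be made concrete on a toy decision table. The following Python sketch is not the authors' R implementation; the function names, the dictionary-based table layout, and the sample data are all hypothetical, and the denominator is taken as the largest overlap of the equivalence class with any decision class.

```python
from collections import Counter, defaultdict

def equivalence_classes(rows, features):
    """Group row indices that agree on every feature in `features`."""
    classes = defaultdict(list)
    for idx, row in enumerate(rows):
        classes[tuple(row[f] for f in features)].append(idx)
    return list(classes.values())

def p_max(rows, eq_class, decision, value):
    """Overlap of the class with decision `value`, divided by the largest overlap."""
    counts = Counter(rows[idx][decision] for idx in eq_class)
    return counts.get(value, 0) / max(counts.values())

def positive_region(rows, features, decision, value):
    """Objects whose equivalence class has precision 1 for decision `value`."""
    region = set()
    for eq_class in equivalence_classes(rows, features):
        if p_max(rows, eq_class, decision, value) == 1:
            region.update(eq_class)
    return region

# Toy decision table: condition features "a" and "b", decision "d".
rows = [
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 0, "d": "no"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "yes"},
]
pos_yes = positive_region(rows, ["a", "b"], "d", "yes")
pos_no = positive_region(rows, ["a", "b"], "d", "no")
```

In this toy table the class containing objects 0, 1, and 2 is not pure, yet it still falls in the positive region of "yes" because "yes" is its most frequent decision. This is how the measure admits more objects into the positive region than a strict probability-one condition would.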

2.4 Decision Tree: ID3 [8, 9]

Decision trees are a type of supervised machine learning in which the data is repeatedly split according to a certain parameter. In such a tree, each internal (non-leaf) node represents a condition on a feature, each branch represents an outcome of the condition, and each leaf shows a decision class. Traversing a path from the root to a leaf gives a classification rule. A perfect decision tree accounts for most of the data while minimizing the number of levels. Apart from ID3, several other algorithms for generating such optimal trees have been devised. Decision trees are simple to understand and interpret, and they can be combined easily with other decision-making techniques.

3 Decision Tree Learning Based on MPBRS-Based ID3

3.1 Attribute Reduction by MPBRS

The steps of the MPBRS-based attribute reduction algorithm [5] are shown in Fig. 1. The method first computes the indiscernibility relations (equivalence classes) using rough set theory. Next, the precision is computed for each decision class and each indiscernibility relation. If the precision equals 1, the decision is a member of the positive region. The algorithm repeats the procedure for each indiscernibility relation. The discernibility matrix contains the names of those attributes for which the information function differs and for which the positive region also differs. Solving the discernibility matrix gives the reduced attribute set.

Fig. 1 Steps involved in MPBRS-based attribute reduction: Data Set → Indiscernibility Relations → MPBRS-based Positive Region → Discernibility Matrix → Reduced Attribute Set

3.2 ID3-Based Tree Induction

ID3 [8, 9] is a simple and widely used algorithm among the various decision tree methods. It adopts a greedy approach to construct a decision tree recursively. Starting with an initial training set (dataset) and its associated decision classes, the set is successively partitioned into smaller subsets. Taking the original data set as the root node, the algorithm iterates through every unused attribute and calculates the information gain of each attribute. It then selects the attribute with the largest information gain. The data set is then split on the selected feature to produce subsets of the data. The algorithm continues to run on each subset and stops in one of the following cases:

• All the entities of a particular node belong to the same class.
• No features are left, i.e., the feature set is empty.
• There are no entities left in a branch.
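The attribute-selection step can be sketched in Python as follows. This is a minimal illustration of entropy-based information gain as used by ID3, not the R code used in the experiments; the function names and the toy rows are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, feature, target):
    """Entropy of the target minus the weighted entropy after splitting on `feature`."""
    labels = [row[target] for row in rows]
    remainder = 0.0
    for value in set(row[feature] for row in rows):
        subset = [row[target] for row in rows if row[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

def best_feature(rows, features, target):
    """Greedy choice: the unused feature with the largest information gain."""
    return max(features, key=lambda f: information_gain(rows, f, target))

# Toy data: "x" determines the class exactly, "y" carries no information.
rows = [
    {"x": 0, "y": 0, "d": "a"},
    {"x": 0, "y": 1, "d": "a"},
    {"x": 1, "y": 0, "d": "b"},
    {"x": 1, "y": 1, "d": "b"},
]
chosen = best_feature(rows, ["x", "y"], "d")
```

A full ID3 implementation would call `best_feature` recursively on each subset until one of the three stopping cases listed above is reached.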

3.3 Attribute Reduction and Tree Induction in R Environment

MPBRS-based ID3 is induced in two phases: feature reduction, followed by decision tree induction on the reduced feature set. The following commands are executed in the R environment on the housing [3] data. The output, along with the significance of the eliminated features, is given in Table 1.

>MpDisMatrix = MpbrsDtDisMat(HousingDecTable, PositiveRegion, range.object = NULL, return.matrix = FALSE)
>ReducedData = FS.one.reduct.computation(MpDisMatrix)

The function MpbrsDtDisMat() [5] is the implementation of the MPBRS-based discernibility matrix, and the function FS.one.reduct.computation() [11] converts the discernibility matrix to the reduced attribute set. The commands in the R environment to build the MPBRS-ID3 decision tree, T2 (Fig. 2), are:

>Modfit1 HData Modfit2 <- train(MEDV ~ ., method = "rpart", data = HData)

The domain of the decision column is grouped into six (6) equivalence classes: C1, C2, …, C6. The number of classification rules decreases from eight (8) for T1 to six (6) for T2. It can be observed that the rule ‘If (“RAD” > 3) && (“RAD” 52) then “MEDV” = C3’ corresponds to node C3 of T1.

4 Experimental Results and Discussion

The CervicalCancer (858 entities, 36 features) [18], Spambase (4601 entities, 57 features) [19], and housing (506 entities, 14 features) [3] datasets were chosen for the experiments. All three data sets are analyzed by the MPBRS, BDTRS, and Pawlak methods for feature minimization. First, preprocessing tasks such as missing value completion and discretization are performed on the raw data set, D, using methods available in various R packages. Second, the function train(), an implementation of a decision tree, is applied to D. The resultant tree (T1) obtained from the housing data is given in Fig. 2. Third, feature reduction is carried out on D to eliminate the insignificant features. The new sample data set is named Dred. For example, applying MPBRS to the housing data reduces the feature set from thirteen (13) to nine (9). The deleted features, “ZN”, “NOX”, “PTRATIO”, and “INDUS”, have low attribute significance scores (Table 1). Furthermore, the calculated C.F remains unchanged after reduction. Lastly, the ID3 algorithm is executed on Dred, which gives a simplified tree, T2, as shown in Fig. 2. All four steps are also performed for the CervicalCancer and Spambase data sets, and Table 2 shows the detailed results. The MPBRS approach gives the best feature reduction for all the data sets. Since the total number of condition attributes goes down, quantities such as tree height also decrease. As a result, a reduction in tree complexity is achieved. The average rule-length is the sum of the heights of all the rules divided by the number of rules in a tree. It is worst (maximum) when ID3 works on the raw data set and minimum for MPBRS-based ID3. As a result, the maximum classification accuracy is attained by MPBRS-based ID3, as shown in Fig. 3. However, MPBRS-based ID3 has the drawback of a longer average execution time. The total execution time is the sum of the time required for feature reduction and the time required for tree construction. The nominal overhead in execution time is tolerable given the improvement in classification accuracy.

Table 2 Comparative analysis of ID3, MPBRS-based ID3, BDTRS-based ID3, and Pawlak-based ID3

Dataset | Models/methods | Feature set | Tree height | No. of leaves | Avg. rule-length | Execution time (s) | Accuracy of classification (%) | C.F
Cervical Cancer | ID3 | 36 | 10 | 09 | 8.24 | 0.97 | 90.32 | 1
Cervical Cancer | MPBRS-ID3 | 26 | 09 | 08 | 8.01 | 1.33 | 91.93 | 1
Cervical Cancer | BDTRS-ID3 | 27 | 09 | 08 | 8.03 | 1.27 | 91.89 | 1
Cervical Cancer | Pawlak-ID3 | 32 | 10 | 09 | 8.22 | 1.14 | 91.07 | 1
Spam base | ID3 | 57 | 21 | 20 | 14.33 | 2.7 | 86.55 | 1
Spam base | MPBRS-ID3 | 44 | 18 | 17 | 13.12 | 3.25 | 88.22 | 1
Spam base | BDTRS-ID3 | 49 | 19 | 18 | 13.38 | 3.22 | 88.01 | 1
Spam base | Pawlak-ID3 | 52 | 20 | 19 | 14.01 | 3.04 | 87.11 | 1
Housing | ID3 | 13 | 05 | 04 | 3.12 | 0.58 | 88.45 | 1
Housing | MPBRS-ID3 | 09 | 04 | 03 | 2.88 | 0.91 | 88.71 | 1
Housing | BDTRS-ID3 | 09 | 04 | 03 | 2.88 | 0.91 | 88.60 | 1
Housing | Pawlak-ID3 | 12 | 05 | 04 | 2.98 | 0.87 | 88.60 | 1

Fig. 3 Representing classification accuracy by ID3 and various rough set based approach on ID3 (bar chart of classification accuracy for ID3, MPBRS-ID3, BDTRS-ID3, and Pawlak-ID3 on the Cervical-Cancer, Spam-base, and housing data sets)

5 Conclusion

The performance of ID3 is a challenge when the data set is very large, especially in terms of the number of features. In this research work, the classification efficiency of ID3 is enhanced by introducing MPBRS as a tool for the reduction of insignificant features. MPBRS-based feature reduction maintains the indiscernibility as well as the set approximation of the information system. This is further supported by calculating the consistency factor and the attribute significance. This paper also shows that the accuracy percentage of MPBRS-ID3 is the highest among all the methods discussed, for all three datasets used. Moreover, the comparative analysis shows that the decision tree produced by MPBRS-ID3 has the advantage of a shorter height, fewer nodes, and lower tree complexity compared with the other methods mentioned in this work.

References

1. Halder, S.B.: A study on bayesian decision theoretic rough set. Int. J. Rough Sets Data Anal. (IJRSDA) 1(1), 1–14 (2014)
2. Halder, S.B., Debnath, K.: Attribute reduction using bayesian decision theoretic rough set models. Int. J. Rough Sets Data Anal. (IJRSDA) 1(1), 15–31 (2014)
3. Harrison, D., Rubinfeld, D.L.: Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 5, 81–102 (1978)
4. Ihaka, R., Gentleman, R.: R: a language for data analysis and graphics. J. Comput. Graph. Stat. 5, 299–314 (1996)
5. Pal, U., Halder, S.B., Debnath, K.: A study on CART based on maximum probabilistic-based rough set. In: MIKE, vol. 10682, pp. 1–12. LNAI-Springer (2017). https://doi.org/10.1007/978-3-319-71928-3_39
6. Pal, U., Halder, S.B., Debnath, K.: R implementation of bayesian decision theoretic rough set model for attribute reduction. In: I3SET, vol. 11. LNNS-Springer (2017). https://doi.org/10.1007/978-981-10-3953-9_44
7. Pawlak, Z.: Rough sets. Int. J. Comput. Inform. Sci. 11, 341–356 (1982)
8. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1, 81–106 (1986)
9. Quinlan, J.R.: Simplifying decision trees. Int. J. Man-Mach. Stud. 27, 221–234 (1987)
10. R Development Core Team: R: A Language and Environment for Statistical Computing. Vienna, Austria, 2011: The R Foundation for Statistical Computing. ISBN: 3-900051-07-0. http://www.R-project.org/. 08 June 2016
11. Riza, L.S., Janusz, A., Bergmeir, C., Cornelis, C., Herrera, F., Slezak, D., Benitez, J.M.: Implementing algorithms of rough set theory and fuzzy rough set theory in the R package “RoughSets”. Inf. Sci. (ELSEVIER), 287, 68–89 (2014)
12. Skowron, A., Rauszer, C.: The discernibility matrices and functions in information systems. In: Slowiski, R. (ed.) Intelligent Decision Support. Handbook of Applications and Advances of the Rough Set Theory. Kluwer Academic Publishers, Dordrecht, pp. 311–362 (1992)
13. Slezak, D., Ziarko, W.: Bayesian rough set model. In: Proceedings of the International Workshop on Foundation of Data mining, Japan, pp. 131–135, 9 Dec 2002
14. Yao, Y.Y.: Generalized rough set models. In: Polkowski, L., Skowron, A. (eds.) Rough Sets in Knowledge Discovery, pp. 286–318. Physica-Verlag, Heidelberg (1998)
15. Yao, Y.Y.: Probabilistic approaches on rough sets. Expert Syst. 20, 287–297 (2003)
16. Zhiling, C., Qingmin, Z., Qinglian, Y.: A method based on rough set to construct decision tree. J. Nanjing Univ. Technol. 27, 80–83 (2005)
17. Ziarko, W.: Variable precision rough set model. J. Comput. Syst. Sci. 46, 39–59 (1993)
18. https://archive.ics.uci.edu/ml/datasets/Cervical+cancer+%28Risk+Factors%29. 21 Dec 2017
19. http://archive.ics.uci.edu/ml/datasets/Spambase. 21 Dec 2017

A Novel Algorithm for Video Super-Resolution

Rohita Jagdale and Sanjeevani Shah

Abstract Video super-resolution is a technique that generates a high-resolution video sequence from multiple low-resolution frames. This paper presents a novel algorithm for video super-resolution (NA-VSR) to improve the resolution quality of images as well as video. The method combines interpolation and image enhancement. Bicubic interpolation is employed to increase the pixel density, and luminance compensation is used to obtain the super-resolved view of the interpolated frame. The algorithm is designed in MATLAB 2016, and the quality of the image is estimated with design metrics such as the peak signal-to-noise ratio (PSNR) and the structural similarity index method (SSIM). Experimental results show that the quality of the high-resolution output video of the NA-VSR method is good compared with previously published methods such as Bicubic, SRCNN, and ASDS. The PSNR of the proposed method is improved by 7.84 dB, 6.92 dB, and 7.42 dB compared with Bicubic, SRCNN, and ASDS, respectively.

Keywords Video super-resolution · Bicubic interpolation · Image enhancement

1 Introduction

Nowadays, video super-resolution is a very challenging area of research and development. It is used in many applications, such as medical, satellite, surveillance, and consumer electronics. Video super-resolution is a process that generates high-resolution video from a number of low-resolution video frames by different techniques. High-resolution images can be constructed by fusing multiple scenes of the same image. The resolution of an image or video needs to be increased for human understanding and automatic machine perception. In digital images, resolution can be improved in terms of pixel, spatial, spectral, temporal, and radiometric resolution. The resolution of the image depends on the image sensors and the image data acquisition system. The major challenges in super-resolution are image registration, computation efficiency, performance limits, and robustness limits. Image up-scaling can be done in two ways: interpolation and super-resolution. Interpolation for image up-scaling simply enlarges an LR grid and fills its missing pixels with appropriate values, while SR incorporates quality enhancement to obtain HR images. Image up-scaling methods can be categorized into two fields: (i) low-complexity oriented up-scaling, which aims to produce HR images of reasonable quality with low computational complexity; and (ii) high-quality oriented up-scaling, which seeks to maximize the quality of HR images even at the expense of high computational complexity. Super-resolution algorithms are classified into two main types: reconstruction-type algorithms and learning-type algorithms. Reconstruction-type algorithms generally combine multiple successive frames of the same scene to create a high-resolution frame. In learning-type algorithms, a nonlinear mapping between low-resolution and high-resolution patches is learnt.

R. Jagdale (✉)
E&TC Engineering, Maharashtra Institute of Technology, Savitribai Phule Pune University, Pune, India
e-mail: [email protected]
S. Shah
E&TC Engineering, Smt. Kashibai Navale COE, Savitribai Phule Pune University, Pune, India
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_53

2 Related Work

An edge orientation super-resolution method [1] is proposed for quality enhancement. It performs enhancement from the initial resolution of the input image toward the target resolution of the enhanced output image and does not require any intermediate interpolated image. A convolutional neural network based approach [2] is described for enhancing the spatial resolution of a video sequence. Enhanced video frames are obtained by motion compensation of consecutive low-resolution frames. Sparse coding and belief propagation are used for video enhancement in [3]. Candidate match pixels are computed from multiple frames by sparse coding, and the candidate pixels are then weighted by the nonlocal means method. Super-resolution with depth estimation is introduced in [4] for asymmetric stereoscopic video, where a unified energy function is used to model the video super-resolution and the stereo matching. A virtual view assisted system [5] is used for video super-resolution and enhancement. This method fills the missing pixels in the super-resolved frame with spatially interpolated pixels, and the texture characteristics of the neighbors of each missing pixel guide the decision mechanism. A blind super-resolution algorithm [6] is designed to enhance the spatial resolution of videos. In this method, the frame is up-sampled with nonuniform interpolation super-resolution, and the blur is estimated through a multi-scale process. A multi-frame super-resolution method [7] is proposed to reconstruct a high-resolution frame from multiple low-resolution frames. This method consists of three steps. First, the region of interest is calculated with the help of the motion vector for the optimal match. Second, the optimal patch is selected with respect to the image degradation model. Third, the reconstructed image and the optimal patch are combined. This method is also well suited to hardware implementation in various consumer electronics systems. A segmentation-based method [8] is introduced for super-resolution, in which a scale-invariant nonlocal means approach is used. When zooming in and out, the scale of frames and objects may change, which makes it very difficult to find the same patches for super-resolution. To avoid this problem, the image is segmented to find the region of interest at a different scale. Video super-resolution using exemplary images of semantic components [9] is introduced by Xu Chen. This method selects HR facial components from the database in a temporally consistent manner by ranking candidates according to how well they match the multiple LR input frames. It uses a pixel-based matching technique to align the selected HR candidates with the input, in order to refine their position resulting from the initial facial component key-point detection. The method also includes a new temporal stabilization term in the final energy function optimization stage for blending the HR facial components into the LR video. A Bayesian MAP method is introduced for efficient video super-resolution. In this method, the potential is increased by combining the lost data pixels from different input frames [10].

3 Super-Resolution Technique

A novel algorithm is designed for image as well as video super-resolution. The image datasets set1, set5, set14, BSD100, and BSD200 are used for experimentation and testing. For the video super-resolution experiments, videos are downloaded from the website. A MATLAB function is used for frame extraction. Noise in the frame is removed by the median filter, and the pixel density of the image is increased by bicubic interpolation. An image enhancement technique is used to enhance the contrast. All frames of a video are processed one by one, and finally all the high-resolution frames are combined and converted to a high-resolution video.

3.1

Median Filter

The median filter is used to remove unwanted noise from a signal or an image. Digital images are sometimes corrupted by impulse noise, which consists of random pixels with very high contrast compared to their surrounding pixels. Impulse noise degrades image quality and appears as a sprinkle of bright and dark spots on the image. The median filter blurs the image slightly at low noise densities and can remove thin lines, but it provides reasonable noise removal performance with a simple implementation. The standard median filter is also implemented as a rank selection filter. It is also referred to as a median smoother: it eliminates impulse noise by replacing the luminance value of the center pixel of the filtering window with the median of the luminance values of the pixels within the window.
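The window operation described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function name `median_filter`, the 3 × 3 default window, and the edge-replication rule at the image border are all assumptions made for illustration.

```python
from statistics import median

def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size window.

    `image` is a list of rows of gray levels. Border pixels reuse the
    nearest valid pixel (edge replication), which is an assumption here.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp row index
                    cc = min(max(c + dc, 0), cols - 1)  # clamp column index
                    window.append(image[rr][cc])
            out[r][c] = median(window)
    return out

# A flat gray patch corrupted by one bright and one dark impulse.
noisy = [[100] * 5 for _ in range(5)]
noisy[1][1] = 255
noisy[3][2] = 0
clean = median_filter(noisy)
```

In this toy patch every window contains at most two impulses out of nine values, so the median restores the flat gray level everywhere.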

3.2 Bicubic Interpolation

Image interpolation is widely used for resolution enhancement. It is the process of estimating pixel values at new, unknown locations from the surrounding known pixels. The three main interpolation methods are nearest neighbor, bilinear, and bicubic interpolation. The proposed algorithm uses bicubic interpolation to estimate the unknown pixels of low-resolution frames. It calculates the unknown value at an arbitrary position from a weighted average of the 16 closest pixels. The intensity value assigned to the point (x, y) is obtained using the equation

$$v(x, y) = \sum_{i=0}^{3} \sum_{j=0}^{3} a_{ij} x^{i} y^{j} \qquad (1)$$
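As a small worked illustration of Eq. (1), the sketch below only evaluates the polynomial for a given 4 × 4 coefficient array. In a real interpolator the 16 coefficients a_ij are solved from the 16 neighboring pixels; that step is omitted here, and the coefficient values and the name `bicubic_value` are hypothetical.

```python
def bicubic_value(a, x, y):
    """Evaluate v(x, y) = sum_{i=0..3} sum_{j=0..3} a[i][j] * x**i * y**j."""
    return sum(a[i][j] * x**i * y**j for i in range(4) for j in range(4))

# Hypothetical coefficients: only a00, a10, a01, and a11 are non-zero,
# so the surface reduces to v(x, y) = 10 + 2x + 4y + 8xy.
a = [[0.0] * 4 for _ in range(4)]
a[0][0], a[1][0], a[0][1], a[1][1] = 10.0, 2.0, 4.0, 8.0

v_center = bicubic_value(a, 0.5, 0.5)  # 10 + 1 + 2 + 2
```

With these coefficients the value at the cell center (0.5, 0.5) is 15, which can be checked by hand from the reduced expression.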

3.3 Image Super-Resolution

Different techniques are used for image resolution enhancement in different image processing applications. In many of them, the subjective quality of the image is enhanced for human interpretation. The subjective quality of an image depends on contrast, which is very important for distinguishing one object from another or from its background. Contrast is determined by the difference in color and brightness level between objects. For better human visual perception, a color vision approach is used, based on either the RGB or the HSV color space. The RGB model is represented by red, green, and blue, and the HSV model by hue, saturation, and value. Figure 1 shows a flow diagram of the image resolution enhancement technique. In this technique, the V channel, which contains the brightness information, is transformed for image enhancement. The first step is to convert the RGB image to an HSV image. The second step is to extract the H, S, and V channels from the HSV image. The third step is to transform the V channel by scaling it with a suitable magnification factor. The final step is to concatenate the new V channel with the old H and S channels and convert the HSV image back to an RGB image.


Fig. 1 Flow diagram of image super-resolution (blocks: Interpolated Image, Convert RGB to HSV, H Channel, S Channel, V Channel, V Channel Magnification, Concatenate HSV Image, Convert HSV to RGB Image, RGB Image)

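The transform from RGB to HSV is given as

$$H = \begin{cases} \text{undefined}, & \text{if } MAX = MIN \\ 60^\circ \times \dfrac{G - B}{MAX - MIN} + 0^\circ, & \text{if } MAX = R \text{ and } G \ge B \\ 60^\circ \times \dfrac{G - B}{MAX - MIN} + 360^\circ, & \text{if } MAX = R \text{ and } G < B \\ 60^\circ \times \dfrac{B - R}{MAX - MIN} + 120^\circ, & \text{if } MAX = G \\ 60^\circ \times \dfrac{R - G}{MAX - MIN} + 240^\circ, & \text{if } MAX = B \end{cases}$$

$$S = \begin{cases} 0, & \text{if } MAX = 0 \\ 1 - \dfrac{MIN}{MAX}, & \text{otherwise} \end{cases}$$

$$V = MAX \qquad (2)$$

where MAX and MIN are the largest and smallest of the R, G, and B components of the pixel.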

That is, the Value (brightness) of a pixel is the maximum of its R, G, and B components.
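The conversion in Eq. (2) and the V-channel scaling step can be sketched per pixel in Python. This is an illustrative sketch only: `rgb_to_hsv_eq2`, `enhance_pixel`, and the gain of 1.2 are hypothetical, components are assumed to lie in [0, 1], and the enhancement helper uses the standard-library `colorsys` module, in which hue is expressed in [0, 1) rather than in degrees.

```python
import colorsys

def rgb_to_hsv_eq2(r, g, b):
    """RGB in [0, 1] -> (H in degrees or None, S, V), following Eq. (2)."""
    mx, mn = max(r, g, b), min(r, g, b)
    if mx == mn:
        h = None  # hue is undefined for gray pixels
    elif mx == r:
        h = 60.0 * (g - b) / (mx - mn) + (0.0 if g >= b else 360.0)
    elif mx == g:
        h = 60.0 * (b - r) / (mx - mn) + 120.0
    else:  # mx == b
        h = 60.0 * (r - g) / (mx - mn) + 240.0
    s = 0.0 if mx == 0 else 1.0 - mn / mx
    return h, s, mx

def enhance_pixel(r, g, b, gain=1.2):
    """Scale V by `gain` (a hypothetical factor), keep H and S, return RGB."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return colorsys.hsv_to_rgb(h, s, min(1.0, v * gain))  # clip V at 1

pure_blue = rgb_to_hsv_eq2(0.0, 0.0, 1.0)
brighter = enhance_pixel(0.5, 0.25, 0.25)
```

Because only V changes, the hue and saturation of the pixel are preserved while its brightness increases, which is the intent of the flow in Fig. 1.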

3.4 NA-VSR Algorithm

Step 1: Frame extraction
Read the input low-resolution video ‘o’.
Convert the video into ‘n’ frames.
Save the frames as x.jpg (x varies from 1 to n).

Step 2: Noise removal
Select one frame ‘x’ and remove noise from it with the median filter to generate frame ‘u’.

Step 3: Bicubic interpolation
Apply bicubic interpolation to frame ‘u’ to obtain output ‘t’.

Step 4: Image enhancement
Convert the RGB frame to HSV to obtain ‘hsvImage’.
Extract the color channels.
Calculate the ‘v’ transform and combine it with the old H and S channels.
Convert from HSV back to RGB to obtain ‘RGBImage’.
Generate the HR image.
Go back to Step 2 for the next frame.

Step 5: Compute design metrics
Generate the HR video.

4 Experimental Results

The NA-VSR method is tested on both single images and videos. The experiments use the image datasets Set1, Set5, Set14, BSD100, and BSD200, which consist of different types of images. Experiments are implemented on an Intel i5-62200U processor at 2.30 GHz with 128 GB RAM, using MATLAB 2016. Various design metrics are studied to compare image quality with previous methods. In NA-VSR, image quality is assessed by the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM). Figure 2 shows a few of the test images used in our experiments. The images are displayed row-wise as Peppers, Tiger, Baboon, Lena, Penguin, Monarch, Zebra, Woman, and Butterfly.
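The paper does not state the exact PSNR formula it used. The sketch below uses the standard definition, PSNR = 10 log10(peak² / MSE), with an assumed peak of 255 for 8-bit images; the function names and the tiny 2 × 2 images are illustrative only.

```python
import math

def mse(reference, test):
    """Mean squared error between two equally sized gray images (lists of rows)."""
    total, count = 0.0, 0
    for ref_row, test_row in zip(reference, test):
        for p, q in zip(ref_row, test_row):
            total += (p - q) ** 2
            count += 1
    return total / count

def psnr(reference, test, peak=255.0):
    """PSNR in dB: 10 * log10(peak^2 / MSE); infinite for identical images."""
    err = mse(reference, test)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

ref = [[100, 100], [100, 100]]
out = [[110, 90], [100, 100]]   # two pixels off by 10 gray levels
value_db = psnr(ref, out)       # MSE is 50, so PSNR is about 31 dB
```

A higher PSNR means the reconstructed image is closer to the reference, which is why the tables report it alongside SSIM.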

4.1 Single Image Super-Resolution

This section describes the performance of the NA-VSR method, which has been evaluated through experiments on more than 100 different images. In the first experiment, the low-resolution sample image peppers.jpg, of size 256 × 256, is taken for enhancement. The high-resolution image obtained has size 768 × 768. The PSNR and SSIM obtained for the NA-VSR method are 38.9942 dB and 0.9846, respectively, which are better than those of the other methods. Figure 3 shows the visual results; the HR image generated by our method has sharp edges with few artifacts. Figure 4 shows the design steps of image super-resolution. A low-resolution (LR) input image of size 500 × 353 is interpolated with the bicubic method to 1,186 × 819 and then enhanced by the NA-VSR method to generate a high-resolution (HR) image of the same size, 1,186 × 819, with a PSNR of 35.55 dB and an SSIM of 0.972.


Fig. 2 Gallery of test images

Table 1 lists the PSNR and SSIM values of the Bicubic, SRCNN, ASDS, and NA-VSR methods for different images from Set1, Set5, Set14, BSD100, and BSD200. The PSNR of the NA-VSR method is 7.84 dB, 6.92 dB, and 7.42 dB higher than that of the Bicubic, SRCNN, and ASDS methods, respectively.

4.2 Video Super-Resolution

To perform video super-resolution, many videos are collected from the web. The algorithm is tested on different video formats such as avi, 3gp, and mp4. The first video sequence, xylophone.mp4, contains 141 low-resolution frames, from which 20 frames are selected for testing. The size of a single frame used for experimentation is 320 × 240. To present the results in the paper, only 5 frames are selected from each test video sequence, and the average PSNR and SSIM values are calculated. Figure 5 shows the SR results of 5 random frames of the xylo.mp4 video sequence for NA-VSR, Bicubic, SRCNN [11], and ASDS-SR. Adaptive sparse domain selection (ASDS) with adaptive regularization (AReg) [12] is designed for image deblurring and super-resolution; for local image smoothness, autoregressive models are learned from the training dataset. Super-resolution using a deep convolutional neural network (SRCNN) uses a three-convolution-layer approach for image super-resolution [11]. For video super-resolution, the ASDS and SRCNN methods are applied to single frames. For frame 1 of the xylophone sequence, the PSNR and SSIM are 36.42 dB and 0.955 for the proposed method, 28.73 dB and 0.853 for bicubic, 30.85 dB and 0.891 for SRCNN, and 33.09 dB and 0.897 for ASDS (Table 2).

Fig. 3 Reconstruction results and comparison with existing methods for the single image peppers.jpg: a input LR image, b Bicubic/PSNR: 31.705423 dB, c SRCNN/PSNR: 33.989990 dB, d ASDS/PSNR: 31.37 dB, e NA-VSR/PSNR: 38.9942 dB

Fig. 4 Design flow of NA-VSR algorithm (panels: LR, Bicubic, HR NA-VSR)

Table 1 Design metric PSNR (dB) and SSIM values for a single image (each cell gives PSNR / SSIM)

Input image     Bicubic         SRCNN           ASDS            NA-VSR
Tiger.jpg       28.63 / 0.823   28.13 / 0.854   27.71 / 0.813   35.55 / 0.972
peppers.jpg     31.70 / 0.922   33.99 / 0.942   31.57 / 0.876   38.99 / 0.985
Baboon.png      23.21 / 0.542   23.67 / 0.609   23.39 / 0.555   32.11 / 0.940
Lena.jpg        28.98 / 0.859   31.29 / 0.904   32.33 / 0.850   34.67 / 0.929
Monarch.png     29.43 / 0.921   32.81 / 0.949   31.75 / 0.942   38.06 / 0.995
Zebra.bmp       26.63 / 0.793   29.29 / 0.852   28.62 / 0.847   35.63 / 0.979
Woman.bmp       28.56 / 0.889   31.36 / 0.929   31.04 / 0.922   36.39 / 0.969
Butterfly.bmp   24.04 / 0.819   27.95 / 0.906   26.74 / 0.894   33.92 / 0.955
Penguin.png     32.73 / 0.893   34.51 / 0.912   33.25 / 0.886   37.38 / 0.975

Table 2 shows the PSNR and SSIM values computed for five frames, numbered 1, 5, 17, 43, and 59. These frames are randomly selected according to major changes in object movement. The results of the previous state-of-the-art methods are produced with the code from the authors' websites. The bicubic and SRCNN code does not provide SSIM calculation, so it was modified to compute it.

5 Conclusion

In this paper, a novel algorithm for video super-resolution (NA-VSR) is proposed. Noise is removed from the frames by the median filter, and the pixel density of the image is increased by bicubic interpolation. NA-VSR performs image resolution enhancement in the HSV color space. The peak signal-to-noise ratio and structural similarity index are computed for NA-VSR and compared with previous methods. The PSNR of the proposed method is higher by 7.84 dB, 6.92 dB, and 7.42 dB than that of bicubic, SRCNN, and ASDS, respectively. The visual quality of the output images and video is also better: the images are smooth, and corners and edges are rendered clearly.


Fig. 5 Experimental results for the xylo.mp4 video sequence, a original input frame, b Bicubic, c SRCNN, d ASDS, e NA-VSR


Table 2 PSNR (dB) and SSIM values of different algorithms for frames 1, 5, 17, 43, and 59

Input video    Method   Metric   Frame 1   Frame 5   Frame 17   Frame 43   Frame 59
Football.avi   Bicubic  PSNR     27.91     27.71     27.77      26.95      29.11
                        SSIM     0.824     0.808     0.807      0.795      0.834
               SRCNN    PSNR     30.64     30.38     30.48      29.64      31.76
                        SSIM     0.881     0.869     0.867      0.859      0.884
               ASDS     PSNR     31.93     31.48     31.33      30.76      32.14
                        SSIM     0.873     0.863     0.859      0.856      0.869
               NA-VSR   PSNR     36.86     36.73     36.83      36.54      37.32
                        SSIM     0.985     0.984     0.984      0.984      0.985
Carphone.avi   Bicubic  PSNR     29.86     30.24     30.37      30.81      31.19
                        SSIM     0.869     0.879     0.885      0.892      0.899
               SRCNN    PSNR     31.87     32.37     32.56      32.99      33.59
                        SSIM     0.908     0.916     0.921      0.925      0.933
               ASDS     PSNR     33.22     33.90     34.11      34.56      35.00
                        SSIM     0.916     0.925     0.931      0.934      0.938
               NA-VSR   PSNR     36.66     37.01     37.07      37.39      37.49
                        SSIM     0.970     0.973     0.976      0.978      0.980
Dog.mp4        Bicubic  PSNR     30.91     31.49     31.29      31.46      31.60
                        SSIM     0.908     0.914     0.914      0.915      0.919
               SRCNN    PSNR     33.01     33.74     33.54      33.63      33.89
                        SSIM     0.935     0.941     0.941      0.942      0.944
               ASDS     PSNR     34.17     34.58     34.73      34.83      34.81
                        SSIM     0.931     0.933     0.935      0.936      0.936
               NA-VSR   PSNR     37.11     37.39     37.45      37.48      37.56
                        SSIM     0.969     0.972     0.974      0.973      0.975
Xylo.mp4       Bicubic  PSNR     28.73     28.88     28.94      28.86      28.93
                        SSIM     0.853     0.861     0.862      0.858      0.862
               SRCNN    PSNR     30.85     31.05     31.14      30.98      31.10
                        SSIM     0.891     0.899     0.899      0.896      0.901
               ASDS     PSNR     33.09     33.35     33.45      33.14      33.23
                        SSIM     0.897     0.903     0.904      0.899      0.901
               NA-VSR   PSNR     36.42     36.71     36.71      36.59      36.62
                        SSIM     0.955     0.962     0.963      0.960      0.962

References

1. Choi, J.S., Kim, M.: Super-interpolation with edge-orientation-based mapping kernels for low complex 2× upscaling. IEEE Trans. Image Process. 25(1) (2016)
2. Kappeler, A., Yoo, S., Dai, Q., Katsaggelos, A.K.: Video super resolution with convolutional neural networks. IEEE Trans. Comput. Imag. 2(2) (2016)
3. Barzigar, N., Roozgard, A., Verma, P.: A video super resolution framework using SCOBeP. IEEE Trans. Circuits Syst. Video Technol. 26(2) (2016)
4. Zhang, J., Cao, Y., Zha, Z.J., Zheng, Z.: A unified scheme for super resolution and depth estimation from asymmetric stereoscopic video. IEEE Trans. Circuits Syst. Video Technol. 26(3) (2016)
5. Jin, Z., Tillo, T., Yao, C., Xiao, J., Zhao, Y.: Virtual view assisted video super resolution and enhancement. IEEE Trans. Circuits Syst. Video Technol. 26(3) (2016)
6. Faramarzi, E., Rajan, D., Felix, C.A.: Blind super resolution of real life video sequences. IEEE Trans. Image Process. 25(4) (2016)
7. Jeong, S., Yoon, I., Jeon, J., Paik, J.: Multi-frame example-based super-resolution using locally directional self-similarity. In: IEEE International Conference on Consumer Electronics (ICCE) (2015)
8. Yang, S., Liu, J., Li, Q., Guo, Z.: Segmentation-based scale-invariant nonlocal means super resolution. In: IEEE International Symposium on Circuits and Systems (ISCAS) (2014)
9. Chen, X., Choudhury, A., van Beek, P., Segall, A.: Facial video super resolution using exemplar components. In: IEEE International Conference on Image Processing (ICIP) (2015)
10. Bui-Thu, C., Le-Tien, T., Do-Hong, T., Nguyen-Duc, H.: An efficient approach based on Bayesian MAP for video super-resolution. In: The 2014 International Conference on Advanced Technologies for Communications (ATC'14)
11. Dong, C., Loy, C.C., He, K., Tang, X.: Image super resolution using deep convolutional networks. IEEE Trans. Image Process. (2016)
12. Dong, W., Zhang, L., Shi, G., Wu, X.: Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization. IEEE Trans. Image Process. (2015)

Counting the Number of People in Crowd as a Part of Automatic Crowd Monitoring: A Combined Approach

Yashna Bharti, Ravi Saharan and Ashutosh Saxena

Abstract This paper describes a new technique for counting the number of people in a crowd as a part of automatic crowd monitoring. The technique combines the two domains of crowd size estimation: approximate crowd size estimation and counting the exact number of people in the crowd. A simple technique based on image features is used to approximate the crowd size. Based on this estimate, the crowd image is assigned to one of several classes, and an exact crowd counting technique suitable for that class is then applied to obtain the number of people in the crowd. Combining the two techniques may increase the time complexity, but it also gives a significant increase in accuracy, which is the primary concern. The technique would be useful for agencies responsible for the security of gatherings, helping them avoid crowd-related disasters, and for organizations responsible for reporting the number of people attending public events.

Keywords Crowd density ⋅ People counting ⋅ Feature extraction ⋅ Human detection

Y. Bharti (✉) ⋅ R. Saharan
Central University of Rajasthan, Bandarsindri, NH-8, Ajmer 305817, Rajasthan, India

A. Saxena
CMR Technical Campus Hyderabad, Hyderabad, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_54

1 Introduction

With the advent of civilization, human gatherings have become a common phenomenon. In today's world of increasing population, places all over the globe, such as railway stations and shopping malls, face huge gatherings of crowds on a daily basis. Open events are organized, be it musical events, social events like marathons, or political rallies, which attract a large number of people. Managing such a large number of people is crucial, as they pose a potential risk of crowd disasters [1, 2] such as stampedes. One of the initial steps in managing a crowd is estimating the number of people in it. This helps the management decide the amount of resources, such as the number of security officials, emergency vehicles, and first aid kits, that will be required either to prevent a disaster or to deal with one if it occurs, so that casualties can be reduced to a minimum (Fig. 1).

Fig. 1 Number of casualties due to stampede over the years (Source https://www.newslaundry.com/2016/10/18/maharashtra-tops-the-list-of-stampede-deaths-from-2001-to-2014)

Thus, crowd size estimation is an important part of crowd monitoring and surveillance. A significant amount of work has been done in this area. From a survey of the literature, it can be concluded that the work falls into two domains. One approximates the crowd density as one of several categories such as sparse, moderate, dense, and heavily dense. The other counts the exact number of people in the crowd. Both approaches have advantages and disadvantages. The limitation of approximate crowd density estimation is that, even after a complex procedure, it gives only a rough idea of crowd strength rather than the exact number of people in the crowd, which would be more desirable. As for counting the exact number of people, a large number of algorithms have been proposed, but each is limited to a particular domain of crowd strength. That is, a particular crowd counting algorithm is suitable for only one domain of crowd strength, be it sparse, moderate, or dense, and its efficiency decreases when it is applied to another domain.
This presents a hurdle to their deployment in the field, as the same place may be sparsely, moderately, or densely crowded at different points in time. There are two possible approaches to obtaining the exact number of people in the crowd. One is to design a new algorithm that can deal with all types of crowd strength. The other is to combine the different algorithms of each domain in some suitable way so that the number of people can be counted efficiently in each domain. This paper deals with the second approach. We combine both approaches to crowd size estimation, i.e., the approximation method and the exact counting method, as well as different exact counting algorithms, to obtain a single-stage framework for counting the number of people in the crowd efficiently. We first apply the approximate crowd estimation algorithm to categorize the image into one of the domains of crowd strength, i.e., sparse, moderate, dense, or heavily dense. Then a suitable algorithm for that domain is applied to get the exact number of people in the crowd. The rest of the paper is organized as follows: Sect. 2 presents a brief review of related work. Section 3 explains our proposed approach. Section 4 evaluates the approach with experimental results. Section 5 concludes the paper.

2 Related Work

A review of the literature shows that the work in this area falls into two main domains: density approximation and counting people in the crowd. A brief description of some papers from the two domains follows.

2.1 Density Approximation Approach

Different methods have been proposed in this domain. Here is a brief description of methods that are efficient in terms of either complexity or accuracy. Marana et al. [3] proposed a method that classifies the image into five classes: very low density, low density, moderate density, high density, and very high density. Features are extracted from the image using image processing and then used for classification by a neural network classifier. As stated in the paper, the method is very efficient at estimating crowd density, reaching 94% efficiency in the best case and approximately 54% in the worst case, with an average of 81%. Ma et al. [4] describe a method based on the ALBP (advanced local binary pattern) feature descriptor. The image is first divided into fixed-size cells, and the local density within each cell is calculated. The authors label the image cells with densities termed very low, low, moderate, high, and very high. ALBP feature vectors are then extracted from each image cell, and the relationship between the features and the crowd densities is learned by supervised learning.


2.2 People Counting Approach

Hu et al. [5] proposed a deep learning approach to estimate the number of individuals in mid-level or high-level crowd images. It divides the image into sections, or patches, counts the number of people in each patch using feature-count regression, and adds the patch counts to obtain the overall estimate. The paper claims that it outperforms both traditional methods based on head detection [6, 7] and learning-based methods on the tested dataset. Yoon et al. [8] proposed a method for counting people in sparse or moderate crowds. It detects the objects in the image, uses the shape given by the CMP (conditional marked point) process to decide whether each object is a human, and then counts the humans to give the number of persons in the image. Lin et al. [9] proposed a method based on the general idea that the head of an individual person can be distinguished in a crowd. The method uses the Haar wavelet transform (HWT) to extract features that describe the characteristics of a head. Some significant features are then selected and used as input to an SVM-based classifier. According to the results shown in the paper, the method works well for some images, but it is limited on complex images where there is greater overlap and the heads of individuals are not clearly visible.

3 The Proposed Method

The objective is to count the number of people in the crowd. For this, images of the crowd are taken. Depending on the requirement, the images are preprocessed with operations such as scaling, noise reduction, and quality enhancement, after which they are suitable for feature extraction. The extracted features are first used to train the classifier with a sufficient number of images from that particular place. The trained classifier is then used to classify real-time images into the defined classes. After classification, the image is passed to an exact crowd counting algorithm suitable for that class. The whole procedure is summarized as follows (Fig. 2).

Fig. 2 Block diagram of the proposed approach (blocks: Image, Pre-processing, Feature Extraction, Classifier, classes Sparse / Moderate / Dense / Heavily Dense, Algorithm Selection, People Count)

1. Preprocessing: Scaling, noise reduction, RGB to gray conversion, etc.
2. Feature extraction: The following features are extracted from the image:
   ∙ Contrast
   ∙ Entropy
   ∙ Homogeneity
   ∙ Correlation
   ∙ Energy
3. Classification: The classifier is trained with different classes of image. It classifies the image into one of the following classes:
   (a) Sparse
   (b) Moderate
   (c) Dense
   (d) Heavily dense
4. Algorithm selection: Depending on the class identified for the image, the corresponding algorithm is applied to get the count of the number of people in the image.
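The classify-then-dispatch structure of steps 3 and 4 can be sketched as follows. Everything in this sketch is hypothetical: the paper does not name its classifier or the counting algorithm used per class, so `toy_classifier` and the two entries in `counters` are placeholder stand-ins operating on a made-up dictionary representation of an image.

```python
def count_people(image, classify, counters):
    """Classify the image, then run the counter registered for that class."""
    density_class = classify(image)
    if density_class not in counters:
        raise KeyError(f"no counting algorithm registered for {density_class!r}")
    return density_class, counters[density_class](image)

# Hypothetical stand-ins: a real system would plug in the trained classifier
# and one published counting method per density class.
def toy_classifier(image):
    return "dense" if image["approx_density"] > 0.5 else "sparse"

counters = {
    # e.g. a detection-based counter for sparse scenes
    "sparse": lambda image: len(image["detections"]),
    # e.g. a patch-wise regression counter for dense scenes
    "dense": lambda image: round(sum(image["patch_counts"])),
}

sparse_scene = {"approx_density": 0.2, "detections": ["p1", "p2", "p3"], "patch_counts": []}
dense_scene = {"approx_density": 0.9, "detections": [], "patch_counts": [10.4, 20.3]}

sparse_result = count_people(sparse_scene, toy_classifier, counters)
dense_result = count_people(dense_scene, toy_classifier, counters)
```

Keeping the counters in a lookup table means a better algorithm for one density class can be swapped in without touching the rest of the pipeline.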

4 Experiment and Result

For the experiment, a new database was prepared. The database contains five sets of images of a particular place (100 m²) taken from five different angles. The sets are named view 1, view 2, view 3, view 4, and view 5. Each set contains 12 images with a different number of people in the range of 50–600. A snapshot of set view 1 of the database is presented in Fig. 3.

4.1 Approximation

Different classes are defined depending on the number of people and the area. The classes and their numerical representations are given in Table 1. For classification, various features are extracted from each image of the set; the features extracted from the images are shown in Table 2. These features are used to train the machine in order to get a classifier. The classifier is then tested against the test dataset. The classification result for the test data of this particular set of images is shown in Table 3. The results in Table 3 show that the classifier is capable of classifying the images with an MSE of 0.0081 (Table 4).
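The paper does not say how the five features were computed. One common choice, assumed here purely for illustration, is to derive them from a gray-level co-occurrence matrix (GLCM); the entropy in particular could instead be computed from the image histogram. The function names, the horizontal offset, and the toy two-level image are all hypothetical.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]

def texture_features(p):
    """Contrast, entropy, homogeneity, correlation, and energy of a GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "correlation": cov / (sd_i * sd_j) if sd_i > 0 and sd_j > 0 else float("nan"),
        "energy": sum(v * v for _, _, v in cells),
    }

# Toy image with two gray levels, quantized to 0 and 1.
toy = [[0, 0, 1, 1],
       [0, 0, 1, 1]]
features = texture_features(glcm(toy, levels=2))
```

The resulting five-element feature vector is the kind of input that would be fed to the classifier described above.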

Fig. 3 Sample images of one set of database

Table 1 Different classes and their numerical representations

Class                     Numerical representation
Sparse (0–150)            0.0000
Moderate (150–300)        0.3000
Dense (300–450)           0.7000
Heavily dense (450–600)   1.0000

Table 2 Values corresponding to different features of images shown in Fig. 3

Strength   Entropy   Contrast   Homogeneity   Correlation   Energy
50         6.3983    0.1272     0.9600        0.9482        0.2924
100        6.5257    0.1497     0.9536        0.9483        0.2665
150        6.5883    0.1531     0.9511        0.9491        0.2542
200        6.5921    0.1579     0.9490        0.9465        0.2537
250        6.6258    0.1623     0.9480        0.9482        0.2470
300        6.6301    0.1637     0.9472        0.9470        0.2460
350        6.6611    0.1654     0.9465        0.9482        0.2397
400        6.6937    0.1643     0.9473        0.9541        0.2365
450        6.6745    0.1739     0.9443        0.9474        0.2371
500        6.6645    0.1680     0.9448        0.9482        0.2389
550        6.6733    0.1690     0.9449        0.9495        0.2374
600        6.7084    0.1772     0.9435        0.9502        0.2314

Table 3 Test set result for the image set shown in Fig. 3

S. No.   Desired output   Calculated output
1        1.0000           1.0000
2        0.0000           0.1007
4        0.3000           0.4752
5        0.7000           1.0000
MSE                       0.0081

We performed the experiment with all five sets of images of the database and obtained the following results.

Table 4 Result of classification of images of 5 different sets

Image set   View 1   View 2   View 3   View 4   View 5
MSE         0.0081   0.0074   0.0251   0.0095   0.0052

4.2 Exact Counting

This section deals with counting the number of individuals in the crowd. Various algorithms have been designed for counting the number of people in crowds of the different classes mentioned in this paper. We use one technique from each domain, chosen because it was found to be efficient in that domain. Each of the applied algorithms has an accuracy in the range of 85–95% in its class of crowd density. The overall accuracy in giving the exact number of people, taken as the average over all of them, is found to be approximately 90%. A significant point is that the accuracy is not affected by changes in crowd density.

5 Conclusion

The experimental results show that this approach is capable of counting the number of people in the crowd with high efficiency. It can also be concluded that the efficiency of the method is not affected by changes in crowd density. Thus, this approach overcomes the degradation in efficiency that existing methods suffer when crowd density changes. Because two approaches are combined, the time complexity of the procedure is higher, which can be addressed in future work. Overall, this paper presents a novel approach to crowd density estimation that could add to the features of automatic crowd monitoring and surveillance.


Acknowledgements The crowd images used in this research are taken from a site on crowd safety and risk analysis: http://www.gkstill.com.

References

1. Helbing, D., Johansson, A., Al-Abideen, H.Z.: Dynamics of crowd disasters: an empirical study. Phys. Rev. E 75(4), 046109 (2007). https://doi.org/10.1103/PhysRevE.75.046109
2. Soomaroo, L., Murray, V.: Disasters at mass gatherings: lessons from history. PLoS Curr. 4 (2012). https://doi.org/10.1371/currents.RRN1301
3. Marana, A.N., Velastin, S.A., Costa, L.F., Lotufo, R.A.: Estimation of Crowd Density Using Image Processing, pp. 11–11 (1997). https://doi.org/10.1049/ic:19970387
4. Ma, W., Huang, L., Liu, C.: Advanced local binary pattern descriptors for crowd estimation. In: 2008 IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application, PACIIA'08, vol. 2. IEEE (2008). https://doi.org/10.1109/PACIIA.2008.258
5. Hu, Y., Chang, H., Nian, F., Wang, Y., Li, T.: Dense crowd counting from still images with convolutional neural networks. J. Vis. Commun. Image Represent. 38, 530–539 (2016). https://doi.org/10.1016/j.jvcir.2016.03.021
6. Subburaman, V.B., Descamps, A., Carincotte, C.: Counting people in the crowd using a generic head detector. In: 2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance (AVSS), pp. 470–475. IEEE (2012). https://doi.org/10.1109/AVSS.2012.87
7. Chauhan, V., Kumar, S., Singh, S.K.: Human count estimation in high density crowd images and videos. In: 2016 Fourth International Conference on Parallel, Distributed and Grid Computing (PDGC). IEEE (2016). https://doi.org/10.1109/PDGC.2016.7913173
8. Yoon, Y., Gwak, J., Song, J.-I., Jeon, M.: Conditional marked point process-based crowd counting in sparsely and moderately crowded scenes. In: 2016 International Conference on Control, Automation and Information Sciences (ICCAIS), pp. 215–220. IEEE (2016). https://doi.org/10.1109/ICCAIS.2016.7822463
9. Lin, S.-F., Chen, J.-Y., Chao, H.-X.: Estimation of number of people in crowded scenes using perspective transformation. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 31(6), 645–654 (2001). https://doi.org/10.1109/3468.983420
10. Karpagavalli, P., Ramprasad, A.V.: Estimating the density of the people and counting the number of people in a crowd environment for human safety. In: 2013 International Conference on Communications and Signal Processing (ICCSP). IEEE (2013)
11. Saqib, M., Khan, S.D., Blumenstein, M.: Texture-based feature mining for crowd density estimation: a study. In: 2016 International Conference on Image and Vision Computing New Zealand (IVCNZ). IEEE (2016)

Improving Image Quality for Detection of Illegally Parked Vehicle in No Parking Area

Rikita Nagar and Hiteishi Diwanji

Abstract Nowadays, due to the increasing use of automobiles as a means of transportation, people face heavy traffic problems in their day-to-day lives. The most common cause of traffic congestion is unauthorized parking on busy roads. People often cannot find space in an authorized parking lot, or the authorized parking is too far away, so they are tempted to park on the roadside. This kind of illegal parking sometimes causes accidents. As the cost of high-quality video surveillance has fallen, detection and tracking of human activity have become more practical. Still, detecting vehicles parked in no parking areas remains a major task for operators at the surveillance office, so there is a need for an automated traffic management system that can detect such vehicles. Over the last few years, numerous methods and frameworks have been proposed to detect illegally parked vehicles in no parking areas. However, detection becomes more complex under suddenly changing light, as video captured by a static camera is affected by low illumination or poor lighting conditions. Low contrast also reduces image quality. As a result, vehicles parked in no parking areas are not detected with high precision and recall. In this work, we introduce steps that enhance image quality, in terms of higher PSNR and lower MSE, for the purpose of detecting illegally parked vehicles in no parking areas.





Keywords Vehicle detection ⋅ Illegally parked vehicle ⋅ Image quality ⋅ Adaptive histogram equalization ⋅ Histogram equalization ⋅ Contrast enhancement ⋅ PSNR ⋅ MSE









R. Nagar (✉)
Government Polytechnic for Girls, Ahmedabad, Gujarat, India

H. Diwanji
L. D. College of Engineering, Ahmedabad, Gujarat, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_55


1 Introduction

Studies have found that a vehicle parked on a busy road or in a no parking area creates heavy traffic, which may cause accidents or collisions. To prevent such situations, a traffic management system has to detect unauthorized or illegally parked vehicles in no parking areas. Beyond detecting such a vehicle, a notification or alarm system should be implemented so that the traffic regulation authority can take immediate action. Recently, many researchers have used different approaches to detect illegal parking in no parking areas. Generally, two types of cameras are used in surveillance systems: static cameras and moving cameras. For vehicle detection in no parking areas, static cameras are generally used, and most researchers have used the background subtraction method to detect vehicles in the traffic scene. However, this method is affected by sudden changes in lighting and illumination, so it could not achieve high efficiency in detecting illegally parked vehicles. We therefore need to improve the quality of the images captured by the static camera to improve the efficiency of illegally parked vehicle detection. We introduce an approach that improves image quality by applying several image processing methods. First, a median filter is applied to remove noise from the captured image. Then, morphological opening is applied to open up gaps between objects, clearly separating connected components, which helps to identify objects accurately. Images collected in the natural environment are sometimes affected by sudden illumination changes or low lighting, which makes objects in the image hard for the detection algorithm to detect. In such conditions, the contrast of the image needs to be enhanced to improve segmentation accuracy. To improve the contrast, the Adaptive Histogram Equalization method is applied, which also improves the quality of the image for segmentation. The enhanced image can then be segmented using the background subtraction method to detect illegally parked vehicles in the no parking area.

2 Related Work

The literature shows that the efficiency of the different methods used to detect an illegally parked vehicle in a no-parking area is often limited by the very poor quality of the images captured by the surveillance camera. Video captured by a surveillance camera is strongly affected by changing lighting conditions (for example, daytime versus evening), by different weather conditions, and by illumination changes. It is also affected by noise. Because of the poor quality of the video, the efficiency of detecting vehicles parked in

Improving Image Quality for Detection of Illegally Parked …


the no-parking area is also degraded. We therefore first need to improve the quality of the images captured by the surveillance system to increase the efficiency of the detection process. Below, we highlight a few methodologies that researchers have used to detect vehicles parked in no-parking areas, with their pros and cons.

1. The authors proposed a methodology to detect a vehicle parked in a no-parking area by projecting 2D data onto 1D data. The advantage of this methodology is that it reduces the complexity of the segmentation and tracking process from O(n²) to O(n). However, the final result depends entirely on the initial transformation results for static vehicle detection. The method also produces false positives under illumination changes and does not give accurate results on nighttime video frames.

2. The authors propose a novel methodology based on deep learning. It uses the Single Shot MultiBox Detector (SSD) algorithm to locate and classify illegally parked vehicles captured by the camera. To enhance vehicle detection performance, the authors use an optimized SSD: they apply k-means to the actual dataset in order to adjust the aspect ratios of the default boxes. The optimized SSD gives accurate results in complex weather conditions in real time and can detect a variety of vehicles such as cars, motorcycles, and trucks. Another advantage is that this methodology does not use background subtraction, which is highly sensitive to changes in the environment; this results in higher accuracy and lower computation time.

3. The authors proposed a two-stage application framework that provides a real-time, illumination-variation-resistant, and occlusion-tolerant solution. Segmentation History Images (SHI) are used to detect illegally parked vehicles in the restricted parking area. SHI improves foreground segmentation accuracy for detecting stationary vehicles. The advantages of this methodology are that it handles sudden illumination changes and detects objects even when they are occluded. However, it often fails to detect a stationary object when the traffic scene is poorly lit.

4. The authors use dual-background model subtraction to detect illegally parked vehicles. They use an adaptive background model that depends on statistical information about pixel intensity. This method is reported to be highly efficient with respect to lighting conditions. Geometrical property-based analysis is used to remove false regions. A Scalable Histogram of Oriented Gradients (SHOG), trained using a Support Vector Machine (SVM), is then applied to decide whether a detected object is a vehicle. Tracking is then applied to the detected vehicle, and the time for which the vehicle remains static is counted. If the vehicle stays stopped longer than a given time limit, the system generates an alert. This method fails to give efficient results when a sudden or slow change in illumination occurs due to a change in the time of day. The SHOG feature is not easy to design, and it does not give efficient results in changing weather conditions.

5. The authors used a small learning rate and a large learning rate to create long-term and short-term background models for a Gaussian Mixture Model. Both background models identify foreground pixels depending on the consecutive values and temporary positions of pixels over a certain amount of time. The results show that this methodology is robust and efficient. However, the background subtraction method cannot deal with sudden changes in weather and lighting conditions, which decreases the efficiency. Vehicles that are parked close together cannot be detected by this methodology.

3 Proposed Solution for Improving Quality of Image

To improve image quality for the detection of illegally parked vehicles, we propose a solution that uses a median filter, adaptive histogram equalization, and morphological opening. We use the Adaptive Histogram Equalization (AHE) technique to improve contrast in images. In contrast to simple histogram equalization, the adaptive method calculates a separate histogram for each distinct part of the image and then uses these histograms to redistribute the gray values of the image. Because of this advantage, the adaptive histogram technique is used to improve contrast [1]. It enhances edge definition in the image, which in turn helps to improve the precision of detecting a vehicle parked in a no-parking zone.
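The idea of computing a separate histogram for each part of the image can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used in the paper: the function names, the fixed square tiles, and the absence of any blending between neighboring tiles are assumptions made only to keep the example short.

```python
def equalize_tile(tile, levels=256):
    """Histogram-equalize one tile (rows of ints in [0, levels - 1])."""
    hist = [0] * levels
    for row in tile:
        for value in row:
            hist[value] += 1
    # Cumulative distribution of the gray values inside this tile only.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # flat tile: there is nothing to stretch
        return [row[:] for row in tile]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[v] - cdf_min) * scale) for v in row] for row in tile]


def adaptive_equalize(image, tile_size):
    """Equalize every tile independently (assumption: no interpolation)."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            tile = [r[left:left + tile_size] for r in image[top:top + tile_size]]
            for dy, row in enumerate(equalize_tile(tile)):
                out[top + dy][left:left + len(row)] = row
    return out
```

Because each tile is stretched with its own histogram, a dark region and a bright region of the same frame both end up using the full gray-value range, which is the property the section relies on.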

4 Implementation Process

Algorithm for improving the quality of images to detect a vehicle that is illegally parked in a no-parking zone:

Step 1: The video sequence captured by the static camera installed in the no-parking area is converted into frames.
Step 2: The frames, extracted in RGB, are converted to grayscale images for further processing.
Step 3: A median filter is used to remove noise from the image.
Step 4: Adaptive histogram equalization is applied to the grayscale image to enhance the sharpness level of the image.
Step 5: Morphological opening is applied to the image to open up the gaps between objects.
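Steps 3 and 5 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: a fixed 3 × 3 window, border pixels left unchanged by the median filter, and a square 3 × 3 structuring element for the opening. The paper does not state which window or structuring element was used, and the function names are hypothetical.

```python
def median_filter3(img):
    """3x3 median filter; border pixels are copied unchanged (simplification)."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = sorted(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


def _morph(img, pick):
    """Take min (erosion) or max (dilation) over a 3x3 window clipped at borders."""
    height, width = len(img), len(img[0])
    return [[pick(img[yy][xx]
                  for yy in range(max(0, y - 1), min(height, y + 2))
                  for xx in range(max(0, x - 1), min(width, x + 2)))
             for x in range(width)] for y in range(height)]


def opening3(img):
    """Morphological opening: erosion followed by dilation."""
    return _morph(_morph(img, min), max)
```

In this sketch an isolated bright pixel is removed by the opening, while a 3 × 3 bright block survives it, which is how opening separates small specks and thin bridges from larger objects.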


5 Performance Parameter

To evaluate image quality in this paper, we use MSE (Mean Squared Error) and PSNR (Peak Signal-to-Noise Ratio) as performance parameters. PSNR is the ratio of the maximum signal power to the power of the distorting noise, and it is measured on a logarithmic decibel scale. A good quality image has a larger PSNR value and a smaller MSE value. PSNR can be described mathematically as follows (Fig. 1).

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{\mathrm{MAX}_f^{2}}{\mathrm{MSE}}\right) \tag{1}$$

where MSE (Mean Squared Error) is

$$\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ I(i,j) - K(i,j) \right]^{2} \tag{2}$$

where
I: original image matrix data
K: degraded image matrix data
m: number of rows (pixels) of the images
i: index of row
n: number of columns (pixels) of the image
j: index of column
MAX_f: maximum signal value of the original good quality image
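As a small worked illustration of Eqs. (1) and (2), the sketch below computes both values for images stored as nested lists. The default `max_value=255` is an assumption for 8-bit grayscale images, and returning infinity for identical images is a convention chosen here, not something stated in the paper.

```python
import math


def mse(original, degraded):
    """Eq. (2): mean of the squared pixel differences between I and K."""
    m, n = len(original), len(original[0])
    total = sum((original[i][j] - degraded[i][j]) ** 2
                for i in range(m) for j in range(n))
    return total / (m * n)


def psnr(original, degraded, max_value=255):
    """Eq. (1): PSNR in decibels; max_value plays the role of MAX_f."""
    error = mse(original, degraded)
    if error == 0:
        return math.inf  # identical images: no distorting noise
    return 10 * math.log10(max_value ** 2 / error)
```

For example, two 2 × 2 images that differ by 2 gray levels in a single pixel have MSE = 4/4 = 1, so the PSNR is 10·log10(255²), roughly 48.13 dB.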

6 Experimental Results

In this section, we show the experimental outputs of each stage of the flowchart for improving image quality for the detection of illegally parked vehicles. The algorithm was implemented in MATLAB 2017a and tested on Windows 10 with an Intel dual-core i3 CPU. Our method was evaluated using the MIT Street Scenes dataset [2]. The image resolution is 960 × 1280. We ran the proposed solution, following the algorithm described in Sect. 4, on 200 traffic images; the results for one of the images are displayed in Fig. 2a–h.

Fig. 1 Block diagram for improving image quality


Fig. 2 Experimental results on the parked vehicle image: (a) original image, (b) grayscale image, (c) median filter applied on (b), (d) AHE applied on (c), (e) histogram of (c), (f) histogram of (d), (g) opening applied on (c), (h) opening applied on (d)

Figure 2a displays the original image. Figure 2b is the grayscale version of the original image. Figure 2c is the output of the median filter applied to the grayscale image, which removes noise. Figure 2d is the result of applying adaptive histogram equalization (AHE) to the median-filtered image in order to enhance its contrast. Comparing Fig. 2c and d, we can see that the quality of Fig. 2d is better than that of Fig. 2c. Figure 2e displays the histogram of the median-filtered image, and Fig. 2f displays the histogram of the AHE image. Comparing these two histograms, we can say that the adaptive histogram equalized image has enhanced contrast. We then applied morphological opening to the median-filtered image and to the adaptive histogram equalized image, with the results shown in Fig. 2g and h. From the experimental results, we measured the MSE (Mean Squared Error) and PSNR (Peak Signal-to-Noise Ratio) of the 200 traffic images from the MIT Street Scenes dataset; these are shown in Fig. 3.

Fig. 3 Comparison of performance parameters PSNR and MSE: (a) PSNR comparison, (b) MSE comparison, (c) % improvement in PSNR and MSE (axis label in each panel: No. of Images)


7 Conclusion

In this paper, the proposed solution for improving image quality for the detection of illegally parked vehicles in no-parking areas was implemented using the adaptive histogram equalization method. To obtain a higher PSNR, and thus a better quality image, we applied a median filter and adaptive histogram equalization to the frames extracted from video captured by a surveillance camera installed in a no-parking area. Noise in the image can be removed by applying the median filter. Images are generally affected by illumination or lighting changes; to improve the contrast of the image, histogram equalization and adaptive histogram equalization methods are used. Morphological opening is then applied to the resulting image. From the experimental results, we conclude that image contrast and quality improved, as the PSNR and MSE values also improved due to adaptive histogram equalization. The experiment was performed on 200 different images from the MIT Street Scenes dataset [2]. The results show an average improvement of 16.42% in MSE and an average improvement of 13.15% in PSNR. In future work, we will use these enhanced images to implement a methodology that improves the accuracy, in terms of precision and recall, of detecting vehicles parked in no-parking areas under sudden illumination changes.

References

1. Pooja, Jatana, G.S.: Adaptive histogram equalization technique for enhancement of coloured image quality. Int. J. Latest Trends Eng. Technol. 8(2), 010–017. https://doi.org/10.21172/1.82.002. e-ISSN: 2278-621X
2. MIT street scenes. http://cbcl.mit.edu/software-datasets/streetscenes/ (2007)
3. Lee, J.T., Ryoo, M.S., Riley, M., Aggarwal, J.K.: Real-time illegal parking detection in outdoor environments using 1-D transformation. IEEE Trans. Circuits Syst. Video Tech. 19(7) (2009)
4. Xie, X., Wang, C., Chen, S., Shi, G., Zhao, Z.: Time illegal parking detection system based on deep learning. In: Proceedings of the 2017 International Conference on Deep Learning Technologies, pp. 23–27
5. Filonenko, A., Jo, K.H.: Illegally parked vehicle detection using adaptive dual background model. In: Proceedings of IEEE Industrial Electronic Society Conference (IECON). Yokohama, Nov 2015
6. Filonenko, A., Jo, K.H.: Detecting illegally parked vehicle based on cumulative dual foreground difference. In: IEEE 14th International Conference on Industrial Informatics (2016)
7. Hassan, W., Birch, P., Young, R., Chatwin, C.: Real time occlusion tolerant detection of illegally parked vehicles. Int. J. Control Autom. Syst. 10(5), 972–982 (2012)
8. Tiwari, M., Rakesh, S.: A review of detection and tracking of object from image and video sequences. Int. J. Comput. Intell. Res. 13(5), 745–765 (2017). ISSN 0973-1873. Research India Publications
9. Karasulu, B., Korukoglu, S.: Moving Object Detection and Tracking in Videos, XV, 76 p. 11 illus., softcover. ISBN 978-1-4614-6533-1. http://www.springer.com/978-1-4614-6533-1 (2013)
10. Shaikh, S.H., et al.: Moving Object Detection Using Background Subtraction, SpringerBriefs in Computer Science. https://doi.org/10.1007/978-3-319-07386-6_2, © The Author(s) 2014
11. Nurhadiyatna, A., Jatmiko, W., Hardjono, B.: Background subtraction using gaussian mixture model enhanced by hole filling algorithm (GMMHF). In: 2013 IEEE International Conference on Systems, Man, and Cybernetics
12. Piccardi, M.: Background subtraction techniques: a review. In: 2004 IEEE International Conference on Systems, Man and Cybernetics
13. Panahi, S., Sheikhi, S., Hadadan, S., Gheissari, N.: Evaluation of background subtraction methods. 978-0-7695-3456-5/08, © 2008 IEEE. https://doi.org/10.1109/dicta.2008.52
14. Alandkar, L., Gengaj, S.R.: Dealing background issues in object detection using GMM: a survey. Int. J. Comput. Appl. 150(5), 0975–8887 (2016)
15. Ma, Y.L., Chang, Z.C.: Moving vehicles detection based on improved gaussian mixture model. In: International Conference of Electrical, Automation and Mechanical Engineering (EAME) (2015)
16. Chavan, R., Gangej, S.R.: Multiple object detection using GMM technique and tracking using Kalman filter. Int. J. Comput. Appl. 172(3), 0975–8887 (2017)
17. Zhang, L., Zhang, L., Mou, X., Zhang, D.: FSIM: Feature similarity index for image quality assessment. In: IEEE Transactions on Image Processing 20(8) (2011)
18. Zhou, C., Yang, X., Zhang, B., Lin, K., Xu, D., Guo, Q., Sun, C.: An adaptive image enhancement method for a recirculating aquaculture system. Sci. Rep. 7, 6243. Published online 24 July 2017. https://doi.org/10.1038/s41598-017-06538-9 (2017)

Analysis of Image Inconsistency Based on Discrete Cosine Transform (DCT)

Vivek Mahale, Mouad M. H. Ali, Pravin L. Yannawar and Ashok Gaikwad

Abstract The popularity of digital images has increased widely in society. Nowadays, with the easy availability of image editing software, people can manipulate images with malicious intent. Our proposed method detects inconsistency in the exact area of an image. The method involves several steps: preprocessing, feature extraction, and matching. For feature extraction, we apply the Discrete Cosine Transform (DCT). We evaluate our system by calculating the True Positive Rate (TPR), False Positive Rate (FPR), and Area Under the Curve (AUC), obtaining 0.3372, 0.5278, and 0.949, respectively. The results indicate that the method is efficient.

Keywords DCT · FPR · TPR · AUC · Feature extraction

V. Mahale (✉) ⋅ M. M. H. Ali
Department of CS & IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad 431001, Maharashtra, India
e-mail: [email protected]

P. L. Yannawar
Vision and Intelligent System Lab, Department of CS & IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, Maharashtra, India

A. Gaikwad
Institute of Management Studies and Information Technology, Aurangabad, Maharashtra, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_56

1 Introduction

Nowadays, digital images are used in all fields, such as medicine, scientific discovery, magazines, newspapers, courts, and TV news. Because powerful image editing tools and software are easily accessible on the Internet, people can easily manipulate images and present them as negative propaganda in social and electronic media. This is very harmful to society, and it is why images are steadily losing their purity and trustworthiness. It is difficult to tell with the naked eye whether an image is genuine or forged. Manipulated images can hide the reality, which can be important for a criminal investigation. Copy-move manipulation is commonly used and is particularly harmful. In copy-move tampering, a part of the image is copied and pasted at another place in the same image. A human cannot easily discover the suspect artifact, because the copied part comes from the same image and its properties are therefore compatible with the rest of the image [1]. The analysis of image inconsistency follows two approaches: active [2–4] and passive [5–7]. Active inconsistency detection needs prior information about the digital image, whereas passive inconsistency detection needs no prior information. Active methods are of two types: digital signature and digital watermark [8]. Passive methods take two forms: forgery-type dependent and forgery-type independent. The dependent form has two classes, copy-move [9] and splicing. The forgery-type independent form also has two types, retouching and lighting. Figure 1 represents the different classes of image forgery detection.

Fig. 1 Image forgery classes

2 Related Work

Lin et al. [10] proposed combining different methods to detect both copy-move and splicing forgery. They convert the images into the YCbCr color space and then use the SURF method to detect copy-move forgery. For splicing detection, the DCT is used to extract features from each block, and the blocks are matched one by one to detect the forged part. The results are reported to be good for both copy-move and splicing forgery detection. Lin et al. [11] proposed an automatic forgery detection technique using DCT coefficients. The work applies the DCT and a Bayesian approach for feature extraction, while the matching algorithm uses a similarity measure to detect the forged parts of images. In addition, Huang et al. [12] performed copy-move forgery detection by improving the DCT-based approach. In their work, the DCT coefficients are used to extract the features, which are stored as a feature vector. A similarity matching algorithm is then used to identify the duplicated areas of an image. Cao et al. [13] presented a novel method to discover copy-move forgery in digital images. Similarly to the previous work, they apply the DCT to each block, extract features from it, and then check which parts are similar to each other using block-by-block comparison to detect the forged image. Mahale et al. [14] proposed an image inconsistency detection method using the LBP technique. In their study, simple preprocessing steps convert color to grayscale, feature extraction is performed using LBP as a texture method, and the Euclidean distance is then used as a distance measure to determine similarity. The reported accuracy reached 98.58% with a block size of 2 × 2. Furthermore, Hilal et al. [15] worked in the same field, detecting forged images using the Histogram of Orientated Gradient (HOG) method. The dataset used to test their algorithm consisted of 100 images, 50 forged and 50 original. The result achieved was an FAR of 0.82 and an FRR of 0.17, which the authors present as evidence of the efficiency of the system.

3 Methodology of the Proposed System

This section discusses the proposed system in detail, stage by stage. The first stage is preprocessing and the second is feature extraction using the DCT method; afterward, the matching stage and the decision making, which decides whether the image is original or forged, are discussed. Figure 2 shows the general steps of image inconsistency detection based on the DCT method. The proposed system is carried out in several steps, which are discussed in the following subsections, and our algorithm is as follows:


Fig. 2 Diagram of image inconsistency based on DCT

Algorithm 1: Automatic forgery detection
Input: Image to be tested for inconsistencies
Output: Original image / forged image
Begin
Step 1: For each image in the folder do
Step 2: Read the test image.
Step 3: Perform preprocessing on the test image.
Step 4: Divide the test image into overlapping blocks.
Step 5: Calculate the DCT of each block.
Step 6: Perform zigzag scanning to reduce the feature vector of each block.
Step 7: Apply the k-means clustering method.
Step 8: Find the correlation between each pair of blocks.
Step 9: If two blocks are similar, calculate the distance between them.
Step 10: Mark the forged part of the image.
End for
End

3.1 Preprocessing

In this work, the proposed system was applied to two image formats, grayscale and color. The general preprocessing step converts color images into grayscale using Eq. (1).

$$I = 0.299\,R + 0.587\,G + 0.114\,B \tag{1}$$

where R refers to the red channel, G to the green channel, and B to the blue channel. In the second step, the grayscale image I of size M × N is divided into small overlapping blocks of n × n pixels. The resulting number of blocks B is given by Eq. (2); a block size of 2 × 2 is used.

$$B = (M - n + 1) \times (N - n + 1) \tag{2}$$

3.2 Feature Extraction and Matching Process

The results of the previous stage are passed to feature extraction, which extracts the proper features from a particular image. The feature extraction stage is the backbone of the system, because the extracted features decide whether the image is original or forged. In this work, the DCT is applied to all blocks of an image. For each block, the two-dimensional DCT is computed, and each element of the resulting DCT matrix is rounded to the nearest integer. The DCT coefficients are then reshaped into a row vector using zigzag scanning, which reduces the feature vector of each block and the processing time [16]. Figure 3 shows the zigzag scanning order.

Fig. 3 Zigzag order scanning
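The preprocessing and feature extraction stages described above (Eqs. (1) and (2), the block-wise 2D DCT, rounding, and zigzag scanning) can be sketched in plain Python. This is an illustrative sketch rather than the authors' MATLAB code: the orthonormal DCT-II normalization, the JPEG-style zigzag direction, and all function names are assumptions.

```python
import math


def to_gray(rgb):
    """Eq. (1): weighted sum of the R, G and B channels of every pixel."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def overlapping_blocks(gray, n):
    """Yield (top, left, block); Eq. (2) gives (M-n+1)*(N-n+1) such blocks."""
    M, N = len(gray), len(gray[0])
    for top in range(M - n + 1):
        for left in range(N - n + 1):
            yield top, left, [row[left:left + n] for row in gray[top:top + n]]


def dct2(block):
    """Two-dimensional DCT-II of a square block (orthonormal scaling assumed)."""
    n = len(block)

    def alpha(k):
        return math.sqrt((1 if k == 0 else 2) / n)

    return [[alpha(u) * alpha(v) * sum(
        block[x][y]
        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
        for x in range(n) for y in range(n))
        for v in range(n)] for u in range(n)]


def zigzag(matrix):
    """Read a square matrix along anti-diagonals in alternating direction."""
    n = len(matrix)
    order = sorted(((r, c) for r in range(n) for c in range(n)),
                   key=lambda rc: (rc[0] + rc[1],
                                   rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [matrix[r][c] for r, c in order]


def block_features(gray, n):
    """One rounded, zigzag-ordered DCT feature vector per overlapping block."""
    return [((top, left), [round(c) for c in zigzag(dct2(block))])
            for top, left, block in overlapping_blocks(gray, n)]
```

For a 512 × 512 image and n = 2, `overlapping_blocks` yields 511 × 511 = 261,121 blocks, which is why reducing and sorting the feature vectors matters for the matching time.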


Fig. 4 Illustration of blocks clustered and sorted into classes

Finally, the DCT coefficients of each block are stored in matrix form as a feature vector template. DCT coefficients 1 to 9 are used to find the similarities between blocks using the k-means clustering algorithm, which groups the blocks into N classes. The details of the k-means clustering algorithm are discussed in [17]. Afterward, all the classes are sorted using the radix sort method, and the index of each class is kept in a vector. Similar classes are kept together to reduce the matching time. An example of clustering and sorting is shown in Fig. 4; the sorted matrix is then saved for matching. Matching is done by comparing the rows of the sorted matrix: the correlation between two blocks is computed, and if the correlation is greater than a threshold T, the two blocks are declared similar. For similar blocks, the distance between them is calculated, and this process is applied to all the blocks in the image.
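The matching step can be sketched as follows. This sketch makes several assumptions that the paper does not confirm: Pearson correlation is used as the correlation measure, a plain lexicographic sort stands in for the k-means grouping and radix sort, only neighboring rows of the sorted list are compared, and the names `correlation`, `similar_pairs`, and `threshold` are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equally long feature vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                     * sum((y - mean_b) ** 2 for y in b))
    return cov / norm if norm else 0.0


def similar_pairs(features, positions, threshold):
    """Compare neighboring rows of the sorted feature list.

    Returns (position_a, position_b, distance) for every neighboring pair
    whose correlation exceeds the threshold T.
    """
    # Sorting places similar feature vectors next to each other.
    order = sorted(range(len(features)), key=lambda i: features[i])
    pairs = []
    for a, b in zip(order, order[1:]):
        if correlation(features[a], features[b]) > threshold:
            (ya, xa), (yb, xb) = positions[a], positions[b]
            pairs.append((positions[a], positions[b], math.hypot(ya - yb, xa - xb)))
    return pairs
```

In practice the distance returned for each pair would be used to discard pairs of blocks that are merely adjacent, since neighboring blocks in a smooth region are naturally similar; that filtering rule is not specified in the text and is left out here.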

4 Result and Discussion

The proposed method was tested on the CoMoFoD dataset from the University of Zagreb [18], with an image size of 512 × 512. The system configuration was a laptop with an Intel Core i3 processor and 4 GB of RAM, running MATLAB 2013a with the Image Processing Toolbox. Figure 5 shows sample images taken from the dataset. A color image was converted into grayscale and segmented into blocks of size 2 × 2. Feature extraction was then performed using the DCT, and matching with the Euclidean distance measure was used to detect the exact forged area and mark it. Figure 6 shows a visualization of the experiments.

Fig. 5 Samples from the CoMoFoD dataset: a original case, b forged case

Fig. 6 Forged image with the exact forged area detected and colored

For the evaluation of our system, 100 images were taken from the dataset mentioned above, and all the steps of preprocessing and feature extraction were performed; the features were then stored in the database as templates. The images were divided into a training set and a testing set, with 50% in each. The DCT technique was then applied to extract the features, and evaluation metrics such as TPR and FPR were calculated with the help of threshold values. The TPR and FPR formulas are as follows.

$$\mathrm{TPR} = \frac{\text{Number of images detected as forged that are forged}}{\text{Number of forged images}} \tag{3}$$


Fig. 7 Performance of the system shown by ROC curve of DCT method (plot title: TPR and FPR curve of image forgery detection (DCT); axes: True Positive Rate (TPR) and False Positive Rate (FPR), each from 0 to 1)

$$\mathrm{FPR} = \frac{\text{Number of images detected as forged that are original}}{\text{Number of original images}} \tag{4}$$

After evaluating the 100-image dataset, the FPR, TPR, and AUC were 0.3372, 0.5278, and 0.9498, respectively. Figure 7 shows the performance of the system as an ROC curve.
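The two rates defined in Eqs. (3) and (4) can be computed from per-image decisions as in the short sketch below. The function name and the boolean-list representation are assumptions for illustration; the values in the usage note are made up and are not the paper's data.

```python
def tpr_fpr(predicted_forged, actually_forged):
    """Return (TPR, FPR) from per-image boolean decisions and ground truth."""
    pairs = list(zip(predicted_forged, actually_forged))
    true_pos = sum(1 for pred, truth in pairs if pred and truth)
    false_pos = sum(1 for pred, truth in pairs if pred and not truth)
    forged = sum(1 for truth in actually_forged if truth)
    original = len(actually_forged) - forged
    return true_pos / forged, false_pos / original
```

For instance, with four images where the detector flags the first two and the ground truth marks the first and third as forged, one of two forged images is caught and one of two originals is wrongly flagged, so both rates are 0.5. Repeating the computation for a range of threshold values gives the points of an ROC curve.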

5 Conclusion

In this work, image inconsistency detection based on the Discrete Cosine Transform (DCT) was presented. The experiment was evaluated on the popular CoMoFoD dataset. Preprocessing was done by converting a color image into grayscale and dividing it into overlapping blocks. The DCT was then applied, and block matching finally identified the forged area of the image. The efficiency of the system was calculated from the TPR and FPR, which depend on the thresholds. The best result achieved was an FPR of 0.3372, a TPR of 0.5278, and an Area Under the Curve (AUC) of 0.9498, which the authors report as more efficient than the existing system. Future work may extend the method to larger datasets and use different techniques such as CNN and RLM.


References

1. Fridrich, A.J., Soukal, B.D., Lukáš, A.J.: Detection of copy-move forgery in digital images. In: Proceedings of Digital Forensic Research Workshop (2003)
2. Cheddad, A., Condell, J., Curran, K., McKevitt, P.: Digital image steganography: survey and analysis of current methods. Signal Process. 90(3), 727–752 (2010)
3. Rey, C., Dugelay, J.L.: A survey of watermarking algorithms for image authentication. EURASIP J. Adv. Signal Process. 6, 218932 (2002)
4. Yeung, M.M.: Digital watermarking: marking the valuable while probing the invisible. Commun. ACM 41(7), 31–35 (1998)
5. Lee, J.C., Chang, C.P., Chen, W.K.: Detection of copy–move image forgery using histogram of orientated gradients. Inf. Sci. 321, 250–262 (2015)
6. Mahdian, B., Saic, S.: Blind authentication using periodic properties of interpolation. IEEE Trans. Inf. Forensics Secur. 3(3), 529–538 (2008)
7. Zhang, Z., Ren, Y., Ping, X.J., He, Z.Y., Zhang, S.Z.: A survey on passive-blind image forgery by doctor method detection. Int. Conf. Mach. Learn. Cybern. 6, 3463–3467 (2008)
8. Birajdar, G.K., Mankar, V.H.: Digital image forgery detection using passive techniques: a survey. Digit. Investig. 10(3), 226–245 (2013)
9. Mushtaq, S., Mir, A.H.: Digital image forgeries and passive image authentication techniques: a survey. Int. J. Adv. Sci. Technol. 73, 15–32 (2014)
10. Lin, S.D., et al.: An integrated technique for splicing and copy move forgery image detection. IEEE 4th International Congress on Image and Signal Processing (CISP) 2, 1086–1090 (2011)
11. Lin, Z., et al.: Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis. Pattern Recogn. 42, 2492–2501 (2009)
12. Huang, Y., Lu, W., Sun, W., Long, D.: Improved DCT-based detection of copy-move forgery in images. Forensic Sci. Int. 206, 178–184 (2011)
13. Cao, Y., Gao, T., Yang, Q.: A robust detection algorithm for copy-move forgery in digital images. Forensic Sci. Int. 214, 33–43 (2012)
14. Mahale, V.H., Ali, M.M.H., Yannawar, P.L., Gaikwad, A.T.: Image inconsistency detection using local binary pattern (LBP). Procedia Comput. Sci. 115, 501–508 (2017)
15. Hilal, M.V., Yannawar, P., Gaikwad, A.T.: Image inconsistency detection using histogram of orientated gradient (HOG). In: 2017 1st International Conference on Intelligent Systems and Information Management (ICISIM), pp. 22–25. IEEE (2017)
16. Fadl, S.M., Semary, N.A.: A proposed accelerated image copy-move forgery detection. In: Visual Communications and Image Processing Conference, 2014 IEEE, pp. 253–257. IEEE (2014)
17. Elkan, C.: Using the triangle inequality to accelerate k-means. In: ICML, pp. 147–153 (2003)
18. CoMoFoD database. http://www.vcl.fer.hr/comofod

Implementation of Word Sense Disambiguation on Hadoop Using Map-Reduce

Anuja Nair, Kaushik Kyada and Neel Zadafiya

Abstract In Natural Language Processing, it is essential to find the correct sense in which a sentence or document is written; this type of problem is called the word sense disambiguation (WSD) problem. Currently, any machine learning application based on natural language processing needs to solve this type of problem. To identify the correct sense, pywsd (Python implementations of word sense disambiguation technologies) is used, which consists of different Lesk algorithms, maximizing-similarity tools, supervised WSD, and vector space models. Using the simple Lesk algorithm of pywsd, WSD is performed on a given document, and the application is deployed in the Hadoop environment. Implementation in a multinode Hadoop environment helps considerably to reduce the complexity of the application. In addition, Map-Reduce is a parallel programming environment, which reduces the response time of the implemented application.

Keywords Word sense disambiguation · Map-Reduce · Pywsd · Lesk algorithm · Multinode Hadoop

A. Nair (✉) ⋅ K. Kyada ⋅ N. Zadafiya
Institute of Technology, Nirma University, Ahmedabad 382481, India
e-mail: [email protected]
K. Kyada e-mail: [email protected]
N. Zadafiya e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_57

1 Introduction

For word sense disambiguation [1], two parameters are required. The first is a dictionary, which supplies all the sense definitions. The second is a corpus of the language, which provides samples of real-world text. For implementation purposes, we used the Corpora dataset from the NLTK package, which provides both of these parameters. For a larger application, it is essential to divide the disambiguation work into separate tasks and evaluate them individually. For this purpose, Hadoop has been used as a platform. Hadoop is a distributed platform. Using a distributed platform reduces response time, and the task/word is spread across multiple workstations known as clients. Further advantages of Hadoop include its parallel programming environment (Map-Reduce programming), fault tolerance, etc. Hence, instead of a sequential environment, a parallel environment is a much more efficient way to implement such a problem. The more fundamental thing to discuss is the Lesk algorithm, which helps to find the most relevant sense among multiple definition options. There are also different Lesk algorithms available to accomplish the same task, but they differ significantly in accuracy and in the type of application in which they are used.

2 Computational Deterministic Approach for Word Sense Disambiguation To solve computationally determining problem like WSD there three approaches [2]: Namely Knowledge-based approach [3], machine learning based approach, and hybrid approach. In knowledge-based approach, all the key areas depend on knowledge resource like WordNet. It is consists of grammar rules that are used sentence making and it also describes some hand-coded rules for determining the correct sense or disambiguation. Knowledge-based itself contains overlap-based approaches. It consists of the idea of extracted features from sense bag (Synsets) and features from context bag (corpus) and tries to find overlapping features and find the maximum probability for matching. Features can be sense definitions like sentences and hypernyms or it can be weight associated with the word for doing this task and it is essential to require a Machine-Readable Dictionary. To demonstrate how overlapping of features can be identified, lesk algorithm [4] is being used. Before using it, it is essential to identify two terms which were introduced during the previous discussion. (a) Sense Collection: It consists of words that are used in the definition of candidate ambiguous word. (b) Context Collection: It consists of words that are used in definition as well as each and every possible sense context words of the candidate word. Now, we will take an example to understand the working of lesk algorithm. Consider a statement: “On burning coal, we get ash”. There are two verbs taking into consideration coal and ash. Different definitions for both verbs are given as follows. (1) Coal: A piece of glowing carbon or burnt wood, charcoal, a black solid combustible substance formed by the partial decomposition of vegetable matter

Implementation of Word Sense Disambiguation on Hadoop …


without free access to air and under the influence of moisture and often increased pressure and temperature, widely used as a fuel for burning. (2) Ash: Trees of the olive family with pinnate leaves, thin furrowed bark, and gray branches. The solid residue left when combustible material is thoroughly burned or oxidized to convert into ash. From the above senses, it is clear that solid, combustible, and burn overlap. So, the second sense of coal is the correct disambiguation for the above statement. The second type of approach is the machine learning based approach, which consists of supervised, semi-supervised [5], and unsupervised approaches [6]. For supervised WSD, machine learning uses a decision list algorithm [6]. This algorithm is based on the “one sense per collocation” methodology: the neighboring words provide a robust indication of the sense of the target word. It relies on the relatedness of words within each sentence. To do that, it collects a large dataset of ambiguous words and finds the word sense probability distribution over the whole dataset. Other supervised machine learning algorithms are naive Bayes, K-NN (example-based disambiguation), support vector machine, and perceptron-trained HMM. For semi-supervised learning, Yarowsky’s algorithm [6, 7] is useful for WSD. It uses decision lists and modifies them by introducing iteration with a classifier, which dynamically identifies the correct sense for each word relative to its context. At each iteration, it attaches labels and tags with a confidence rank. It gives higher accuracy than the simple supervised learning approach. Unsupervised learning, instead of using “dictionary defined senses”, uses the “sense from corpus” itself. Its advantage is that it creates clusters of similar contexts for a target word. It is also called a broad coverage classifier.
In the third approach, the hybrid approach [8], iteration is based on semantic relations from WordNet. It extracts collections and contextual information from WordNet and a small tagged dataset. Words that have only a single sense definition are used as the context seed set of disambiguated words. At each iteration, a new disambiguated word is added to the words already disambiguated in the previous iteration. Other hybrid approaches are SenseLearner [9] and the SSI (Structural Semantic Interconnections) model. The hybrid approach is reported to provide more accurate results for WSD than the machine learning or knowledge-based approaches alone.
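The overlap idea behind the coal/ash example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the glosses are abridged from the example above, the sense labels (coal_1, ash_2, and so on), the stop-word list, and the crude suffix-stripping stemmer are hypothetical helpers, and a real system would read its definitions from a machine-readable dictionary such as WordNet.

```python
import re
from itertools import product

# Function words ignored when comparing glosses (a small illustrative list).
STOP_WORDS = {"a", "an", "the", "of", "or", "and", "by", "as", "for",
              "is", "when", "used", "left"}

# Glosses abridged from the coal/ash example; the sense labels are hypothetical.
GLOSSES = {
    "coal_1": "a piece of glowing carbon or burnt wood",
    "coal_2": ("a black solid combustible substance formed by the partial "
               "decomposition of vegetable matter, widely used as a fuel "
               "for burning"),
    "ash_1": ("trees of the olive family with pinnate leaves, thin furrowed "
              "bark, and gray branches"),
    "ash_2": ("the solid residue left when combustible material is "
              "thoroughly burned or oxidized"),
}


def stem(word):
    """Crude suffix stripping; adequate for this toy example only."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def bag_of_words(text):
    """Lowercase, tokenize, drop stop words, and stem."""
    return {stem(w) for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS}


def best_sense_pair(senses_a, senses_b):
    """Return the (sense_a, sense_b, overlap) triple with the largest overlap."""
    best = (None, None, set())
    for a, b in product(senses_a, senses_b):
        overlap = bag_of_words(GLOSSES[a]) & bag_of_words(GLOSSES[b])
        if len(overlap) > len(best[2]):
            best = (a, b, overlap)
    return best


if __name__ == "__main__":
    print(best_sense_pair(["coal_1", "coal_2"], ["ash_1", "ash_2"]))
```

With these abridged glosses, the selected pair shares the stems of solid, combustible, and burn, which matches the overlap noted in the example.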

3 PYWSD (Python Implementations of Word Sense Disambiguation Technologies) and Different Lesk Algorithms

pywsd is a tool that provides a wide variety of word sense disambiguation technologies for natural language processing in Python. It is written in Python, is open for contribution, and is still being enhanced. It consists of Lesk algorithms [4],


A. Nair et al.

maximizing similarity (Path Similarity and Information Content), supervised WSD, vector space models, and graph-based models. We have used the disambiguation module and the simple Lesk algorithm for our application. We now look at the basic differences between the Lesk algorithms proposed for WSD. The original Lesk algorithm [4] is designed to simply calculate the matching probability for neighboring words. It is based on the assumption that they all share a common sense for the topic at hand. To share a common sense and to extract the exact meaning of each word in a sentence, a complex context must be defined, which makes original Lesk more complex than other versions. To overcome this complexity, simplified Lesk [10] was introduced. Simplified Lesk works the same as original Lesk, but the basic difference is that it excludes stop words when finding overlapping definitions for the target words. It produces accurate results and is much faster than original Lesk. The following is the simplified Lesk algorithm, which uses an overlap function to count the words shared by the definition of the target word and its sentence.

1. function SIMPLELESK (word, sentence) returns Correct-Sense of word
2. Correct-Sense = most repeated sense for word
3. max-repeat = 0
4. context = set of words in sentence
5. for each sense in senses of word do
   a. tagged = set of words in the gloss and examples of sense
   b. overlap = CALCULATEOVERLAP (tagged, context)
   c. if overlap more than max-repeat then
      i. max-repeat = overlap
      ii. Correct-Sense = sense
6. end return (Correct-Sense)

Adapted Lesk [4] overcomes the biggest drawback of the original algorithm: dictionary definitions are often very short and do not contain enough words for the disambiguation process, so the algorithm simply fails to find overlapping words in the definitions. To overcome this problem, it uses a semantically organized lexical database named WordNet, a set of related words (Synsets) [11]. It not only compares the definitions of the target words but also follows the WordNet relations of the words in those definitions. It achieves higher accuracy than simple Lesk.
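The SIMPLELESK pseudocode above can be written as a short, self-contained Python function. This is a minimal sketch, not the pywsd implementation: the function name simplified_lesk, the toy sense inventory, its sense identifiers, and its example sentences are all hypothetical, and the first sense listed is assumed to be the most frequent one.

```python
import re

# Small illustrative stop-word list (an assumption, not taken from pywsd).
STOP_WORDS = {"a", "an", "the", "of", "on", "we", "is", "when", "with", "from"}

# Hypothetical sense inventory; the first sense is assumed to be the most frequent.
TOY_INVENTORY = {
    "ash": [
        {"id": "ash.tree",
         "gloss": "trees of the olive family with pinnate leaves",
         "examples": ["an ash grew beside the road"]},
        {"id": "ash.residue",
         "gloss": "the solid residue left when combustible material is burned",
         "examples": ["ash from burning coal covered the grate"]},
    ],
}


def tokens(text):
    """Set of lowercase words in text, without stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS}


def simplified_lesk(word, sentence, inventory):
    """Pick the sense whose gloss and examples share most words with the sentence."""
    senses = inventory[word]
    best_sense = senses[0]["id"]            # default: most frequent sense
    max_overlap = 0
    context = tokens(sentence) - {word}     # the target word itself is ignored
    for sense in senses:
        signature = tokens(sense["gloss"] + " " + " ".join(sense["examples"]))
        overlap = len(signature & context)
        if overlap > max_overlap:
            max_overlap = overlap
            best_sense = sense["id"]
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("ash", "On burning coal, we get ash", TOY_INVENTORY))
```

When no sense overlaps with the context, the function falls back to the first sense, which mirrors step 2 of the pseudocode.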

4 Environment Configuration of Hadoop for WSD Application

To implement a WSD application on a distributed computing system, it is essential to use a uniform structure, because in a heterogeneous environment the application may suffer a degradation in performance instead of a speedup. To conduct an experiment, we first configured a multinode Hadoop [12] environment on machines that differ in memory size, processing power, and storage. We noticed that this heterogeneous distributed computing system decreases the performance of the application instead of increasing it, because one of the data nodes works faster than the others and, if the application problem is large, returns its job to the master node quickly. Since Hadoop uses First Come First Serve [13] scheduling by default, it is possible that faster nodes return jobs to the master node sooner than slower nodes, that the entire job is done only by the faster node, or that the faster resource is not utilized properly. Because of these drawbacks, we could not implement the WSD application in a heterogeneous environment [14, 15]. A homogeneous environment with a uniform Hadoop configuration is therefore considered. Care has been taken about memory, storage, processing power, and network bandwidth between the master and data nodes. Based on these points, we have set up the environment in four phases. Every node has the same configuration, as listed below.

i. RAM: 1 GB.
ii. STORAGE: 10 GB HDD.
iii. PROCESSING POWER: 1 core at 2.47 GHz. No other technologies to support the processor.
iv. NETWORK BANDWIDTH: 100 Mbps for one-way communication.
v. BLOCK SIZE: 128 MB per data node.

We increased the number of data nodes step by step and analyzed the resulting performance differences in disambiguation. Let us assume that the execution of one particular WSD application needs x amount of execution time in the single-node environment.

1. One Master Node and One Data Node: It will take more than (x + y) time to execute the same application, because it requires job transactions between the master and the slave node. y is the extra time for communication between master and slave.
2. One Master Node and Two Data Nodes: It will take more than (x/2) + y time to execute the same application; the main execution time is reduced to nearly half.
3. One Master Node and Three Data Nodes: It will take more than (x/3) + y time to execute the same application; the main execution time is reduced to nearly one-third.

From the above observation, y remains nearly constant because the Hadoop driver class manages all data nodes and parallelizes tasks to some extent. Once the limit of parallel processing power is reached, y may increase. Data node management depends on the processing power of the master node.
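The three cases above follow one pattern, a lower bound of x/n + y for n data nodes. The sketch below simply evaluates that expression; the function name and the sample values of x and y are hypothetical and are not measurements from the experiment.

```python
def estimated_time(x, y, data_nodes):
    """Lower-bound estimate x / n + y for n data nodes.

    x: execution time on a single node, y: master/slave communication
    overhead (assumed constant), data_nodes: number of data nodes n.
    """
    if data_nodes < 1:
        raise ValueError("at least one data node is required")
    return x / data_nodes + y


if __name__ == "__main__":
    # Hypothetical values: x = 30 and y = 5 in arbitrary time units.
    for n in (1, 2, 3):
        print(n, estimated_time(30, 5, n))
```

Because y does not shrink with n, the overhead takes up a growing share of the total as nodes are added, so the gain from each extra node becomes smaller.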


5 Map-Reduce Programming on Distributed Hadoop Environment

We have used pywsd technologies for the WSD application using Map-Reduce [16] and applied them to 1069 MB of document data. To perform disambiguation, we have used simple Lesk because the analysis relies on the WordNet dictionary. The following is the Mapper.py program.

    from pywsd import disambiguate
    from pywsd.lesk import simple_lesk

    with open("pg20417.txt") as f:
        for line in f:
            # disambiguate() returns (word, synset) pairs for the line
            listtext = disambiguate(line)
            for i in listtext:
                # skip words for which no synset was found
                if i[1] is not None:
                    print("{0}:{1}".format(i[0], i[1]))

The mapper file simply extracts the Synsets [11] for each word in every sentence of the documents. The following is the Reducer.py program.

    import sys
    from pywsd.lesk import simple_lesk
    from pywsd import disambiguate
    from nltk.corpus import wordnet as wn

    for line in sys.stdin:
        line = line.strip()
        # the original listing does not define word and sysn; splitting the
        # mapper record "word:Synset('...')" at the first colon is assumed here
        word, sysn = line.split(":", 1)
        # drop the leading "Synset('" and the trailing "')"
        singlew = sysn[8:-2]
        print("{0}:{1}".format(word, wn.synset(singlew).definition()))

The reducer file receives each word as the key and its Synset as the value from the mapper, and uses the WordNet dataset to identify the correct sense of the words in each sentence, returning the most likely definition and Synset as output.
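The record passed from mapper to reducer can be checked without Hadoop or WordNet. The sketch below assumes that the mapper prints records of the form word:Synset('name.pos.nn'), which is how a (word, synset) pair would be rendered by the mapper's print call; the helper name parse_mapper_record and the sample record are hypothetical.

```python
def parse_mapper_record(line):
    """Split one mapper output line into (word, synset_name).

    Assumes the layout "word:Synset('name.pos.nn')". Splitting at the first
    colon keeps the synset representation intact.
    """
    word, synset_repr = line.strip().split(":", 1)
    # "Synset('" is 8 characters long and "')" is 2 characters long
    synset_name = synset_repr[8:-2]
    return word, synset_name


if __name__ == "__main__":
    print(parse_mapper_record("coal:Synset('coal.n.01')\n"))
```

The fixed slice is fragile if the record format ever changes, so checking that the value starts with "Synset('" before slicing would be a reasonable safeguard.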

6 Experimental Results

For the above configuration, we measured the execution time required by the system. Execution time here means the total CPU time spent on the Map and Reduce tasks. We used 1069 MB of text document data for the experiment. In Fig. 1, the y-axis denotes the total CPU time spent in minutes.

Fig. 1 Comparison of execution times using different mappers and reducers on WSD application (bar chart; series: Map Task (min), Reduce Task (min), Total (min); configurations: 1-Master, 1-Slave; 1-Master, 2-Slave; 1-Master, 3-Slave; y-axis from 0 to 40 min; bar labels: 36.7, 28.9, 24.3, 20.3, 19.5, 13.5, 12.4, 9.4, 6.8)
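The bar labels of Fig. 1 can be turned into relative speedups. The sketch below rests on an inferred pairing that the text does not state explicitly: 36.7, 28.9, and 20.3 min are taken as the Total values for one, two, and three slaves, since each of them equals the sum of two of the remaining labels. The function name is illustrative.

```python
# Assumed pairing of the Fig. 1 bar labels: total CPU time (min) per number of slaves.
TOTAL_MINUTES = {1: 36.7, 2: 28.9, 3: 20.3}


def speedup(totals, baseline_nodes=1):
    """Return the total-time speedup of each configuration relative to the baseline."""
    base = totals[baseline_nodes]
    return {nodes: base / minutes for nodes, minutes in totals.items()}


if __name__ == "__main__":
    for nodes, factor in sorted(speedup(TOTAL_MINUTES).items()):
        print(f"{nodes} slave(s): {factor:.2f}x")
```

Under this reading, the speedup grows more slowly than the node count, which is consistent with the roughly constant overhead y discussed in Section 4.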

7 Conclusion

The implementation of a word sense disambiguation application on a distributed multinode Hadoop system shows that parallel, near real-time natural language processing is possible. For real-life problems such as training artificial intelligence systems and developing natural language interaction systems on large datasets, it is possible to develop faster word sense disambiguation applications. By processing documents in parallel, it is possible to analyze a large application dataset. Increasing processing power and storage capability can further reduce the execution time of the system.

References

1. Zhong, Z., Ng, H.T.: Word sense disambiguation improves information retrieval. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Jeju, Republic of Korea, pp. 273–282 (2012)
2. Navigli, R.: Word sense disambiguation: a survey. ACM Comput. Surv. 41(2), Article 10, 69 pp. (2009)
3. Navigli, R., Velardi, P.: Structural semantic interconnections: a knowledge-based approach to word sense disambiguation. IEEE Trans. Pattern Anal. Mach. Intell. (2005)
4. Banerjee, S., Pedersen, T.: An adapted Lesk algorithm for word sense disambiguation using WordNet. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing, CICLing. Lecture Notes in Computer Science, vol. 2276. Springer, Berlin, Heidelberg (2002)
5. Niu, Z.Y., Ji, D., Tan, C.L., Yang, L.: Word sense disambiguation by semi-supervised learning. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing, CICLing. Lecture Notes in Computer Science, vol. 3406. Springer, Berlin, Heidelberg (2005)
6. Yarowsky, D.: Unsupervised word sense disambiguation rivaling supervised methods. In: Proceedings of the 33rd Annual Meeting on Association for Computational Linguistics (ACL’95). Association for Computational Linguistics, Stroudsburg, PA, USA, pp. 189–196 (1995)
7. Yarowsky, D.: Word sense disambiguation using statistical models of Roget’s categories trained on large corpora. In: Proceedings of the 14th International Conference on Computational Linguistics (COLING), Nantes, France, pp. 454–460 (1992)
8. Budanitsky, A., Hirst, G.: Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In: Workshop on WordNet and Other Lexical Resources, vol. 2 (2001)
9. Mihalcea, R., Faruque, E.: SenseLearner: minimally supervised word sense disambiguation for all words in open text. In: Proceedings of ACL/SIGLEX Senseval-3, Barcelona, Spain, July 2004
10. Basile, P., Caputo, A., Semeraro, G.: An enhanced Lesk word sense disambiguation algorithm through a distributional semantic model. In: COLING (2014)
11. Fellbaum, C.: WordNet. Wiley, Inc. (1998)
12. Shvachko, K., et al.: The Hadoop distributed file system. In: 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST). IEEE (2010)
13. Pakize, S.R.: A comprehensive view of Hadoop MapReduce scheduling algorithms. Int. J. Comput. Netw. Commun. Secur. 2(9), 308–317 (2014)
14. Zaharia, M., et al.: Improving MapReduce performance in heterogeneous environments. In: OSDI, vol. 8 (2008)
15. Khokhar, A.A., et al.: Heterogeneous computing: challenges and opportunities. Computer 26(6), 18–27 (1993)
16. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)

Low-Power ACSU Design for Trellis Coded Modulation (TCM) Decoder N. N. Thune and S. L. Haridas

Abstract Nowadays, convolutional encoders/decoders with large constraint lengths are used for better output performance. The TCM decoder used for satellite communication has a large constraint length, greater than or equal to 7. For a large constraint length, the number of states is also large, which leads to high power dissipation and system complexity. In the TCM decoder, the VD is one of the important parts. In the VD, the T-algorithm is used to find the optimal path among the existing paths; this process limits the clock speed because of the large number of calculations. It is therefore essential to purge some part of the ACSU computation, which reduces complexity as well as power dissipation. In this paper, an efficient architecture for the ACSU is proposed, based on a modified pre-computation architecture. The suggested design provides reduced complexity and power consumption in comparison with the existing T-algorithm. The design shows acceptable decoding performance with negligible calculations.



Keywords TCM (Trellis coded modulation) · VD (Viterbi decoder) · PM (Path metric) · PSK (Phase-shift keying) · TMU (Transition metric unit) · ACSU (Add-compare-select unit)





1 Introduction

The combination of modulation and convolutional encoding is known as Trellis coded modulation (TCM) [1]. In TCM, phase-shift keying modulation is used together with an encoder that has a finite number of states and generates encoded data systematically.

N. N. Thune (✉) Electronics Engineering, RTM Nagpur University, GHRCE, Nagpur 440016, India e-mail: [email protected] S. L. Haridas Electronics & Communication, RTM Nagpur University, GHRCE, Nagpur 440016, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_58


Fig. 1 Systematic convolution encoder

Fig. 2 Block diagram of VD

The convolution encoder for 4D 8PSK TCM shown in Fig. 1 has constraint length 7 [1, 2]. A large constraint length encoder gives better performance for high-speed wireless applications. For decoding, among Maximum Likelihood (ML) algorithms, the Viterbi Algorithm (VA) [3] achieves better output performance. Figure 2 shows the basic diagram of the Viterbi decoder (VD) for the TCM encoder. It consists of four basic units: the branch metric unit (BMU), the add-compare-select unit (ACSU), the path metric unit (PMU), and the survivor-path memory unit (SMU). Branch metrics (BMs) are calculated from the received symbols; simplified equations for BM calculation are given in [4, 5]. The calculated BMs are fed into the ACSU, which continuously computes path metrics (PMs) and gives the decision bits of every new state transition. Finally, the decision bits from the ACSU are stored in the SMU. The decoding operation is completed from the decision bits stored in the SMU, based on the survivor path. Only the PMs of the current iteration are stored in the PMU [6]. Research on the VD generally focuses on the ACSU and SMU. ACSU implementation is always complex because it consists of recursive operations, which is the major problem for high-speed applications. The computational complexity and power consumption of the system depend on the constraint length K; as K increases, both factors increase exponentially. To deal with these drawbacks, the T-algorithm is useful [7]. In the T-algorithm, a threshold T is set as the acceptable limit, and the difference between each path metric and the optimal PM is calculated recursively. States whose difference is less than T are known as unpurged states, and only these states are used for computations in the next cycle. Searching for the optimal PM is a recursive operation, which degrades the clock speed of the ACS loop [8]. To achieve high speed, a parallel architecture with a 2^(K−1)-input comparator is suitable. This architecture puts an unnecessary hardware burden on the system, which does not suit our design goals of low complexity and low power consumption.
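The purging rule of the T-algorithm described above can be illustrated in software. This is a behavioral sketch only, not the hardware architecture: the function name and the sample path metrics are hypothetical, and the optimal PM is taken to be the minimum path metric of the cycle.

```python
def t_algorithm_purge(path_metrics, threshold):
    """Return (optimal_pm, unpurged_states) for one decoding cycle.

    A state stays unpurged when its path metric differs from the optimal
    (minimum) path metric by less than the threshold T.
    """
    optimal_pm = min(path_metrics)
    unpurged = [state for state, pm in enumerate(path_metrics)
                if pm - optimal_pm < threshold]
    return optimal_pm, unpurged


if __name__ == "__main__":
    # Hypothetical path metrics for six states and threshold T = 5.
    print(t_algorithm_purge([12, 7, 30, 9, 7, 18], 5))
```

Lowering the threshold shrinks the list of unpurged states, which is the mechanism by which the T-algorithm reduces the number of computations in the next cycle.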


Many researchers have suggested new schemes for the T-algorithm, but these give poor BER performance for high-rate applications (rate (R − 1)/R, where R > 2). In [9], a new architecture for the Viterbi decoder using the T-algorithm is given, which is based on pre-computation. In this design, the optimal PM is generated by comparing 64 path metrics, with suitable BER performance. The design proposed in this paper is a modified pre-computation architecture combined with the auxiliary trellis approach, which significantly reduces system complexity and power consumption. The rest of the paper is organized as follows. Sections 2 and 3 present the general idea of different pre-computation schemes. Section 4 gives the details of the proposed design. Section 5 presents simulation and synthesis results, and Sect. 6 concludes the paper.

2 The Basic Idea of Pre-computation

In the 4D 8PSK TCM decoder, the VD has constraint length 7 and 64 states. Every state receives 8 incoming paths. Equation 1 is used to find PMoptimal; PM(m − 1) is calculated in the ACSU prior to the current value PM(m). The 16 BMs are calculated in the BMU in the current cycle. Here, the computation starts q steps earlier, where q should be less than m, so PMoptimal can be calculated from PM(m − q) in q cycles.

$$
\begin{aligned}
\mathrm{PM}_{\mathrm{optimal}}(m) &= \min\{\mathrm{PathM}_{0}(m),\ \mathrm{PathM}_{1}(m),\ \ldots,\ \mathrm{PathM}_{2^{k-1}-1}(m)\} \\
&= \min\{\,\min[\mathrm{PathM}_{0,0}(m-1)+\mathrm{BranchM}_{0,0}(m),\ \mathrm{PathM}_{0,1}(m-1)+\mathrm{BranchM}_{0,1}(m),\ \ldots,\ \mathrm{PathM}_{0,p}(m-1)+\mathrm{BranchM}_{0,p}(m)], \\
&\qquad\ \min[\mathrm{PathM}_{1,0}(m-1)+\mathrm{BranchM}_{1,0}(m),\ \mathrm{PathM}_{1,1}(m-1)+\mathrm{BranchM}_{1,1}(m),\ \ldots,\ \mathrm{PathM}_{1,p}(m-1)+\mathrm{BranchM}_{1,p}(m)], \\
&\qquad\ \ldots, \\
&\qquad\ \min[\mathrm{PathM}_{2^{k-1}-1,0}(m-1)+\mathrm{BranchM}_{2^{k-1}-1,0}(m),\ \mathrm{PathM}_{2^{k-1}-1,1}(m-1)+\mathrm{BranchM}_{2^{k-1}-1,1}(m),\ \ldots,\ \mathrm{PathM}_{2^{k-1}-1,p}(m-1)+\mathrm{BranchM}_{2^{k-1}-1,p}(m)]\,\}
\end{aligned}
\tag{1}
$$

The VD has a symmetric trellis structure. The path metrics of the 64 states are grouped by mod 4 arithmetic, and the branch metrics are grouped similarly. Since each set of path metrics has a fixed set of branch metrics, Eq. 1 can be rewritten as Eq. 2.


$$
\begin{aligned}
\mathrm{PM}_{\mathrm{optimal}}(m) = \min\{\,&\min(\mathrm{PathMs}(m-1)\ \text{in cluster } a) + \min(\mathrm{BranchMs}(m)\ \text{for cluster } a), \\
&\min(\mathrm{PathMs}(m-1)\ \text{in cluster } b) + \min(\mathrm{BranchMs}(m)\ \text{for cluster } b), \\
&\ldots, \\
&\min(\mathrm{PathMs}(m-1)\ \text{in cluster } n) + \min(\mathrm{BranchMs}(m)\ \text{for cluster } n)\,\}
\end{aligned}
\tag{2}
$$

The pre-computation scheme is well suited to systems with rate ¾ and constraint length 7, and the method is efficient for high-rate applications. Computational complexity depends on the average number of path metrics generated by the ACSU in each cycle. As the threshold value of the T-algorithm is reduced, the average number of enabled states is lowered, which helps to reduce power consumption.
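Equation 2 can be checked numerically with a short behavioral sketch. It assumes that clusters are formed by index mod 4, as stated for the path metrics above, and that cluster c of the path metrics is paired with the branch metric group of the same index; both the pairing and the function name are assumptions made for illustration.

```python
def cluster_optimal_pm(path_metrics, branch_metrics, num_clusters=4):
    """Evaluate Eq. 2: min over clusters of (min PM in cluster + min BM for cluster).

    Cluster c holds the entries whose index x satisfies x mod num_clusters == c.
    """
    candidates = []
    for c in range(num_clusters):
        min_pm = min(path_metrics[c::num_clusters])
        min_bm = min(branch_metrics[c::num_clusters])
        candidates.append(min_pm + min_bm)
    return min(candidates)


if __name__ == "__main__":
    # Hypothetical metrics: 64 path metrics and 16 branch metrics.
    pms = [(7 * i) % 64 + 10 for i in range(64)]
    bms = [(5 * j) % 16 for j in range(16)]
    print(cluster_optimal_pm(pms, bms))
```

The saving comes from taking the two minima separately: within a cluster, the minimum of all sums equals the sum of the two minima, so the full set of additions does not have to be evaluated.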

3 Two-Step Pre-computation Method

The encoder shown in Fig. 1 is used for satellite communication [10]. The design specifications are given by CCSDS, with 2.75 bits/symbol channel efficiency and Rm = 11/12. Other design parameters are shown in Table 1 [11]. The decoder for this encoder has 64 states and 16 BMs, and every state receives 8 incoming paths. The large number of states leads to high computational complexity. Applying the T-algorithm with a low threshold level reduces the computational complexity effectively: as the threshold is reduced, the number of enabled states and computations is reduced. As stated in Eq. (4), the PMs are divided into clusters, and each cluster has a specific set of BMs. The whole architecture is divided into two stages (Fig. 3): the part to the left of the red line is stage 1, and the part to the right is stage 2. The time required to find the optimal PM value (min) is decided by the critical path. The MIN16 unit is constructed from 2 stages of 4-input comparators.

Table 1 Design parameters of the encoder system with 2.75 bits/channel symbol spectral efficiency

Parameters                 Symbol   Values
Constellation size         M        8
Number of dimensions       L        4
Number of states           N        64
Constraint length          K        7
Convolution encoder rate   R        ¾
Full TCM encoder rate      Rm       11/12
Spectral efficiency        Rib/W    2.75


Fig. 3 Two-step pre-computation architecture

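The critical path for the left side is given by

$$T_{\mathrm{left}} = T_{\mathrm{adder}} + 2\,T_{\text{4-in-comp}}$$

and

$$T_{\mathrm{right}} = 2\,T_{\mathrm{adder}} + T_{\text{4-in-comp}} + 2\,T_{\text{2-in-comp}}$$

T_left and T_right are almost equal.

$$
\begin{aligned}
\text{CLUSTER } a &= \{\mathrm{PathM}_x \mid 0 \le x \le 63,\ x \bmod 4 = 0\} \\
\text{CLUSTER } b &= \{\mathrm{PathM}_x \mid 0 \le x \le 63,\ x \bmod 4 = 1\} \\
\text{CLUSTER } c &= \{\mathrm{PathM}_x \mid 0 \le x \le 63,\ x \bmod 4 = 2\} \\
\text{CLUSTER } d &= \{\mathrm{PathM}_x \mid 0 \le x \le 63,\ x \bmod 4 = 3\} \\
\text{BMG } a &= \{\mathrm{BranchM}_x \mid 0 \le x \le 15,\ x \bmod 4 = 0\} \\
\text{BMG } b &= \{\mathrm{BranchM}_x \mid 0 \le x \le 15,\ x \bmod 4 = 1\} \\
\text{BMG } c &= \{\mathrm{BranchM}_x \mid 0 \le x \le 15,\ x \bmod 4 = 2\} \\
\text{BMG } d &= \{\mathrm{BranchM}_x \mid 0 \le x \le 15,\ x \bmod 4 = 3\}
\end{aligned}
\tag{4}
$$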
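The MIN16 unit built from two stages of 4-input comparators can be modelled behaviorally as follows. This is a software sketch of the comparator tree only, with hypothetical function names; it says nothing about the timing or the actual hardware description used by the authors.

```python
def min4(a, b, c, d):
    """Model of one 4-input comparator: returns the smallest of four values."""
    return min(a, b, c, d)


def min16(values):
    """Model of MIN16 as two stages of 4-input comparators.

    Stage 1 reduces 16 values to 4 group minima; stage 2 compares those 4.
    """
    if len(values) != 16:
        raise ValueError("MIN16 expects exactly 16 values")
    stage1 = [min4(*values[i:i + 4]) for i in range(0, 16, 4)]
    return min4(*stage1)


if __name__ == "__main__":
    # Hypothetical path metrics of one cluster.
    print(min16([23, 7, 15, 42, 8, 16, 4, 99, 31, 12, 5, 27, 64, 9, 11, 3]))
```

The two comparator stages in this model correspond to the term with two 4-input comparator delays in the left-side critical path.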


4 Proposed Modified 2-Step Pre-computation Method

The modified 2-step pre-computation is the newly suggested pre-computation scheme. As stated in Eq. (4), each set of PMs is associated with a particular set of BMs. In the 4D 8PSK TCM decoder, each state receives 8 incoming paths. There are also sets of states that have similar incoming branches; such states are known as similar-structure states. The trellis butterfly has 8 wings here. The same structure is repeated for states that belong to the similar states, although the decision bit may differ. Repetition of the butterfly structure increases the complexity and power consumption of the system. This problem can be overcome by reusing the trellis structure. In the proposed method, the auxiliary trellis and the cluster division of PMs and BMs are employed. We observed that, owing to symmetry, only 4 PMs from each cluster need to be compared. In the proposed architecture, MIN4 blocks are used instead of MIN16; each block is made up of a single stage of 4-input comparator. The number of comparisons in the first step is reduced to half, which results in low complexity and a shorter computation time. The critical path for the proposed architecture is $T_{\mathrm{left}} = T_{\mathrm{adder}} + T_{\text{4-in-comp}}$ and $T_{\mathrm{right}} = 2\,T_{\mathrm{adder}} + T_{\text{4-in-comp}} + 2\,T_{\text{2-in-comp}}$. Here, 2-step pre-computation means computing the optimal PM in 2 clock cycles. In the suggested method, the optimal PM is calculated in two clock

Fig. 4 Modified 2-step pre-computation architecture


Table 2 Synthesis results for xc6vcx195t-2ff784 (Virtex-6, speed grade -2)

Methods                           Number of slice LUTs   Number of flip-flops   Minimum period (ns)   Number of slice registers
2-step pre-computation            1864                   338 (17%)              3.564                 370
Modified 2-step pre-computation   943                    206 (20%)              2.909                 250

cycles, with some architectural modifications compared with the earlier methods. The optimal PM is used to set the input flags of the SMU for further decoding of the input sequence in the VD (Fig. 4).
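The effect of replacing MIN16 by MIN4 on the critical paths can be tabulated with a small sketch. The delay values used below are hypothetical placeholders in arbitrary units, chosen only so that the two sides are balanced in the original scheme; they are not synthesis figures.

```python
def critical_paths(t_adder, t_cmp4, t_cmp2, modified=False):
    """Return (T_left, T_right) from the critical-path expressions in the text.

    The original scheme has two 4-input comparator stages on the left side
    (MIN16); the modified scheme has a single stage (MIN4).
    """
    stages_of_cmp4 = 1 if modified else 2
    t_left = t_adder + stages_of_cmp4 * t_cmp4
    t_right = 2 * t_adder + t_cmp4 + 2 * t_cmp2
    return t_left, t_right


if __name__ == "__main__":
    # Hypothetical delays (arbitrary units): adder 4, 4-input comparator 10,
    # 2-input comparator 3.
    print(critical_paths(4, 10, 3))
    print(critical_paths(4, 10, 3, modified=True))
```

In this model only the left-side path shortens, so any gain in clock period depends on how the right-side path is balanced against it in the actual design.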

5 Simulation and Results

The proposed scheme provides the optimal PM in each cycle, as the conventional T-algorithm does. For a detailed comparison, the 2-step pre-computation and the modified 2-step pre-computation schemes have been implemented on an FPGA. The synthesis results are presented in Table 2. The speed performance of the system is improved in comparison with that reported in [8].
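The figures in Table 2 can be converted into derived quantities such as maximum clock frequency and relative savings. The sketch below uses only the LUT counts and minimum periods listed in Table 2; the dictionary layout and function names are illustrative.

```python
# Values copied from Table 2 (slice LUTs and minimum period in ns).
SYNTHESIS = {
    "2-step pre-computation": {"luts": 1864, "period_ns": 3.564},
    "modified 2-step pre-computation": {"luts": 943, "period_ns": 2.909},
}


def max_frequency_mhz(period_ns):
    """Maximum clock frequency in MHz for a given minimum period in ns."""
    return 1000.0 / period_ns


def reduction_percent(before, after):
    """Relative reduction from 'before' to 'after', in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    old = SYNTHESIS["2-step pre-computation"]
    new = SYNTHESIS["modified 2-step pre-computation"]
    print(f"LUT reduction: {reduction_percent(old['luts'], new['luts']):.1f}%")
    print(f"Period reduction: "
          f"{reduction_percent(old['period_ns'], new['period_ns']):.1f}%")
    print(f"f_max: {max_frequency_mhz(old['period_ns']):.1f} MHz -> "
          f"{max_frequency_mhz(new['period_ns']):.1f} MHz")
```

Expressing the results as relative reductions makes the hardware and delay improvements of the two schemes directly comparable.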

6 Conclusion

We presented the modified 2-step pre-computation scheme together with its hardware architecture. The scheme is proposed for the TCM decoder used in satellite communication. Compared with earlier schemes, the suggested scheme is more efficient and less complex, and it maintains the same BER as the 2-step pre-computation architecture. The FPGA synthesis results confirm a significant reduction in hardware usage and delay time.

References

1. Ungerboeck, G.: Trellis-coded modulation with redundant signal sets part II: state of the art. IEEE Commun. Mag. 25(2), 12–21 (1987)
2. Ungerboeck, G.: Trellis-coded modulation with redundant signal sets part I: introduction. IEEE Commun. Mag. 25(2), 5–11 (1987)
3. He, J., et al.: An efficient 4-D 8PSK TCM decoder architecture. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 18(5) (2010)
4. Sun, F., Zhang, T.: Parallel high-throughput limited search trellis decoder VLSI design. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 13(9), 1013–1022 (2005)
5. Ungerboeck, G.: Channel coding with multilevel/phase signals. IEEE Trans. Inform. Theory IT-28(1), 55–67 (1982)
6. Pietrobon, S.S., et al.: Trellis-coded multidimensional phase modulation. IEEE Trans. Inf. Theory 36(1) (1990)
7. He, J., et al.: High-speed low-power Viterbi decoder design for TCM decoders. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 20(4) (2012)
8. Servant, J.D.: Design and analysis of a multidimensional trellis coded demodulator. M.S. thesis, Signals, Sens. Syst., KTH, Stockholm
9. He, J., et al.: A fast ACSU architecture for Viterbi decoder using T-Algorithm. IEEE Conf. Signals Syst. Comput. (2009)
10. Bandwidth-Efficient Modulations, CCSDS 401.0-B-23.1 Green Book, Dec 2013
11. Thune, N.N., et al.: 4D 8PSK TCM encoder for high speed satellite application. In: IEEE Conference on ICAECCT, Dec 2016

Enhancing Security of Android-Based Smart Devices: Preventive Approach Nisha Shah and Nilesh Modi

Abstract In the current era of smart devices, mobile phones have emerged rapidly and are increasingly used as primary computing and communication devices with sensing capabilities, running ever more performance-intensive tasks. A secure and healthy working environment must be maintained for these smart devices. Although sufficient peripheral protection mechanisms have been described, authentication and access control alone are not sufficient to provide integral protection against intrusions. This raises the need for smart analysis techniques, particularly of application code. Many detective and preventive security solutions are available in the market, but this research field is still immature. Most solutions in the smartphone area handle a specific issue for a particular device and environment. As prevention is better than detection and cure, we define a framework that aims to identify the resources that applications downloaded for installation on smart devices are going to acquire, and to warn users about them and about the risk behind such resource acquisition/access. By informing users of the resources that application processes will acquire at runtime, directly or indirectly, it can uncover a malicious intention of resource access hidden in the application. In this way, our solution aims to establish a strong foothold in device security in the form of preventive alarm notifications. We focus on Android-based smart devices because of the popularity, availability, and download statistics of Android apps, and because Android’s open philosophy allows benign or malignant applications to be published easily with limited controls, which exposes Android to a very high security risk.



Keywords Access permission · Intrusion · Dynamic analysis · Security risk · Malware · Device security

N. Shah A.P. SVIT-VASAD, Gujarat Technological University, Ahmedabad, Gujarat, India e-mail: [email protected] N. Modi (✉) Dr. Baba Saheb Ambedkar Open University, Ahmedabad, Gujarat, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_59


1 Introduction

Smartphones and other smart devices have emerged rapidly and are increasingly used as primary computing and communication devices with sensing capabilities, running more performance-intensive tasks. Smartphone usage is on the rise, and smart devices are becoming more popular as replacements for laptops and desktop computers for a diversity of needs. A range of smart devices (smartphones, tablets, …) and operating systems is available. Looking at the statistics of smartphone sales by vendor and by operating system, Android smartphones are in the leading position [1]. Statistics show that the Android app store has the highest number of apps [2], with 2.8 million apps in March 2017, as can be seen in Fig. 1. The trend in app downloads is similar: Android users have downloaded almost twice as many apps in a year as iOS users. Gartner’s projection of total mobile app downloads worldwide from 2013 onwards shows a rise of about 25% every year. The comparison of projected free and paid app downloads shows that free app downloads rise in line with total app downloads and are about 85–90% higher than paid app downloads for the given years [3], as shown in Fig. 2. Despite the high overall number of downloads, just 5–10% of the total apps downloaded are paid apps; that is, more than 90% of the total apps are free. These pose the highest risk to the security and safety of smartphones. Moreover, an Android application can easily be published by anybody on the Google Play store by paying a nominal registration fee, the only requirement being that the application is digitally signed. On the other hand, there are no warranties that applications

Fig. 1 Number of apps (in billion) in leading app stores as of March 2017

Fig. 2 Number of mobile app downloads worldwide versus free and paid apps from 2012 to 2017


are not malicious. Moreover, because of Android’s open philosophy, applications can also be published on unofficial markets or distributed through several other channels where no control is performed. A major source of security problems in Android is the ability to incorporate third-party applications from the available online markets. Determining which applications are malignant and which are not is still a formidable challenge, as malignant ones constitute a threat to the user’s security and privacy. Their growth and popularity have exposed mobile devices to an increasing number of security threats. Although sufficient peripheral protection mechanisms have been described, there is a need for more intelligent and sophisticated security controls, for which an intelligent Intrusion Detection and Prevention System is essential. As prevention is better than cure, our proposed framework provides a preventive measure: it warns smart device users about the resources an app is going to access, before the app establishes its footprint on the device through installation. This work is a continuation of our previous work, in which we proposed the design of our preventive framework.

2 Previous Work

Considerable research has been done on smartphone intrusion detection, and researchers have defined several promising approaches. Some approaches use cloud-based techniques to detect attacks; they reduce resource usage on the device at the cost of cloud services, network connectivity, and the communication needed to keep the device synchronized with the cloud in real time [4, 5]. Other approaches use nonhuman, behavioral analysis instead of relying on known signatures for malware detection; they are lightweight and run on the device itself but fail to detect instantaneous and abrupt attacks [6–8]. Rigorous surveys of smartphone security challenges published from 2011 to 2015 discuss the Android security architecture and its issues, malware evolution, and penetration threats; they highlight desirable security features and available security mechanisms and solutions, and offer suggestions for defense, detection, protection, and security [9–12]. The authors of [13] discuss the theft of high-value private user information through on-board sensors. The authors of [14] describe a dynamic analysis tool, "Alterdroid", that detects hidden malware components distributed as part of an app package. The work in [15] shows how extra permissions can be used with malicious intent and suggests a better solution for controlling permissions before an application is installed on the device. The work in [16] assesses hosted Android applications as benign or malicious using a proposed dynamic security analysis framework. The related work conveys that in-depth and thorough work is required to assess the apps installed on a smart device and the permissions they require [14–16]. To contribute in these areas, we propose a framework that assesses applications about to be installed on the device.


N. Shah and N. Modi

3 Android Framework and Security

Google's Android is a Linux-based operating system with a four-layered architecture consisting, from top to bottom, of the application layer, the application framework, the libraries and Android runtime, and the Linux kernel. Because of its open architecture, its Application Programming Interface is popular in the developer community. Self-signed applications can be installed on an Android device very easily. As no central certificate authority is needed, malicious applications can easily be introduced into the market; moreover, owing to Android's open philosophy, applications can also be published on unofficial markets or distributed through several other channels. This strategy does not provide an adequate level of security. Previous research on smartphones has contributed mostly in the areas of intrusion detection, security, and privacy. Researchers have looked for footprints left by intruders, viruses, malware, and Trojans by applying data mining techniques such as classification, clustering, machine learning, neural networks, and pattern recognition, chosen according to the type of analysis, the kind of data, the runtime configuration environment of the smartphone, the input provided, and the output required. Most of this research does not capture the runtime environment context, and most of the time it is demonstrated in an emulated or simulated environment.

4 Proposed Framework to Enhance Security

To enhance the security of Android devices with an intelligent, resource intensive, and robust solution, we proposed a framework. Our goal is to provide a preventive measure: when an app starts establishing its footprint on the device through installation, the framework warns the user about the resources the app will access in the future. Through this alarming notification, the user learns which resources the application actually requires and the risk associated with them, so that any illogical intention to acquire access-control permissions is exposed. In this way, misuse of resources can be reduced and the privacy and security of personal information residing on the device can be achieved, although this still depends on the alertness of the smart device user. Our idea is to reduce the damage that malicious activity by processes running on the device could cause as a result of application execution, rather than to identify the damage and take recovery actions after the device has been harmed. In the latter case, recovery and cure are not 100% certain and require extra effort. We proposed a monitoring and detection framework to maintain and provide a safe working environment on Android-based smart devices (Fig. 3). It resides in the application and application framework layers of the four-layer Android architecture.


[Fig. 3 flowchart. Sources: Google official Play store; other app store repository. The user requests an app download from any (official/unofficial) app store. Monitoring and detection process on the device: continuous (event-based) monitoring for app downloads; if a new download is found, analyze the installable application components; identify resource acquisition (done in future at the time of application running) with risk factor; notify for resource acquisition and associated risk; continue.]

Fig. 3 Work flow of app monitoring and detection framework

Our framework continuously monitors for the app download event; as soon as it finds an app download, it analyzes the components of the installable application. From this analysis, it identifies all the resources the app will require at runtime after installation, together with the associated risk factor. The objective of the resource requirement assessment is to identify needless resource acquisition performed dynamically at runtime, which an app can use for malicious purposes. Our aspiration is to prevent such misuse and thereby provide security and privacy. An example makes the intention clear. Suppose a user wants to install a game on his device, which is downloaded from the app store. If the app code contains a request to acquire the contacts, the camera, or a similar resource that the game does not really need and that is defined for a malicious purpose, it may cause devastation.
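The assessment described above can be illustrated with a small sketch. It is not the authors' implementation (which runs on the device against the Android SDK); the small permission table, the per-category set of expected permissions (`EXPECTED_BY_CATEGORY`), and the function names `assess_permissions` and `needs_warning` are assumptions made for this example.

```python
# Illustrative sketch only. The tables are small hand-picked samples,
# not a complete or authoritative list of Android permissions.
PROTECTION_LEVEL = {
    "android.permission.INTERNET": "normal",
    "android.permission.VIBRATE": "normal",
    "android.permission.READ_CONTACTS": "dangerous",
    "android.permission.CAMERA": "dangerous",
}

# Hypothetical: permissions that an app of a given category plausibly needs.
EXPECTED_BY_CATEGORY = {
    "game": {"android.permission.INTERNET", "android.permission.VIBRATE"},
}


def assess_permissions(category, requested):
    """Return one report entry per requested permission.

    A permission is marked unexpected when it is not in the set assumed
    for the app's category; unknown permissions get the level "unknown".
    """
    expected = EXPECTED_BY_CATEGORY.get(category, set())
    return [
        {
            "permission": name,
            "protection_level": PROTECTION_LEVEL.get(name, "unknown"),
            "unexpected": name not in expected,
        }
        for name in requested
    ]


def needs_warning(report):
    """Warn when any unexpected permission is not of level "normal"."""
    return any(
        entry["unexpected"] and entry["protection_level"] != "normal"
        for entry in report
    )
```

For the game example in the text, a request for the contacts permission would be reported as unexpected with protection level "dangerous", so the user would be warned before installation completes.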


5 Implementation Details and Results

The proposed framework is developed for Android smart devices with minimum SDK version 16 (Android 4.1, Jelly Bean), a widely used minimum version in the market, so our work covers the majority of Android users. This is justified by the statistics of December 2017, according to which 99% of Android users use Android 4.1 (Jelly Bean) or a higher version. To identify risk factors and permissions, we used the PackageInfo, PermissionInfo, and ApplicationInfo classes from the Android SDK. Figure 4 shows the output generated by our framework for the BookMyShow app selected from the Google Play store, and Fig. 5 shows the output for the Sales Master app selected from an unofficial repository. Our framework analyzed the components of the installable application and identified all the resources the app will require at runtime, with the associated risk factor, as shown in Figs. 4 and 5.
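On the device, the SDK classes named above expose the permissions that an installable package requests. As a rough offline analogue, and purely as an assumption for illustration (this is not the authors' code), the requested permissions can also be read from a decoded, plain-text `AndroidManifest.xml`. Inside an APK the manifest is stored in a binary XML format and has to be decoded first. `SAMPLE_MANIFEST` and `requested_permissions` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# XML namespace used for android:* attributes in the manifest.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical, already decoded manifest of a made-up game app.
SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.game">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.READ_CONTACTS"/>
  <application android:label="Example"/>
</manifest>
"""


def requested_permissions(manifest_xml):
    """List the android:name of every <uses-permission> element."""
    root = ET.fromstring(manifest_xml)
    name_attr = "{%s}name" % ANDROID_NS
    return [
        element.get(name_attr)
        for element in root.iter("uses-permission")
        if element.get(name_attr)
    ]
```

The resulting list of permission names is the kind of input that a risk report per permission would start from.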

Fig. 4 O/p for app selected from official source by our proposed framework


Fig. 5 O/p for app selected from unofficial source

We assessed the performance of our app for Android installables of different sizes (2 MB to more than 30 MB) and found that it takes 5 s or less to show the analysis results. Our experiments show that our framework works efficiently with very low performance overhead. We also compared our output with the permissions listed for installed apps under "APP PERMISSIONS" in the device settings; the listing for Sales Master is shown in Fig. 6. Figures 5 and 6 make the difference clear for the Sales Master app: the settings utility (Fig. 6) gives only the names of the permissions the app will use, whereas our output (Fig. 5) gives details of each permission and the risk associated with it. Moreover, our framework shows all the permissions and the associated risk (Protection Level) automatically when an app is selected for installation, so the user need not navigate to the respective options to see the permissions. The user receives the notification as a warning for a specific app before its installation completes. This provides a preventive alarm before the app is used.


Fig. 6 App permissions from settings

6 Conclusion

In this paper, we gave an overview of how an Android-based smart device can be used securely. Most existing work addresses intrusion detection, and very little addresses intrusion prevention; to close this gap, we proposed a dynamic preventive architecture for Android security. We showed the results of our proposed framework and compared them with the similar results provided by the utility available on Android smart devices. In our previous work, we proposed a design for our framework; in continuation, this paper shows the implementation scenario and the results. The framework was tested on apps of different sizes from different official and unofficial sources to gain assurance about the proposed work.

References

1. http://www.gartner.com/newsroom/id/3323017
2. https://www.statista.com/statistics/276623/number-of-apps-available-in-leading-app-stores/
3. http://www.gartner.com/newsroom/id/2592315
4. Houmansadr, A., Zonouz, S.A., Berthier, R.: A cloud-based intrusion detection and response system for mobile phones. In: 2011 IEEE/IFIP 41st International Conference on IEEE Dependable Systems and Networks Workshops (DSN-W) (2011)
5. Zonouz, S., et al.: Secloud: a cloud-based comprehensive and lightweight security solution for smartphones. Comput. Secur. 37, 215–227 (2013)
6. Xie, L., et al.: pBMDS: a behavior-based malware detection system for cell phone devices. In: Proceedings of the Third ACM Conference on Wireless Network Security. ACM (2010)
7. Dini, G., et al.: MADAM: a multi-level anomaly detector for android malware. In: MMM-ACNS, vol. 12 (2012)
8. Shabtai, A., et al.: Andromaly: a behavioral malware detection framework for android devices. J. Intell. Inf. Syst. 38(1), 161–190 (2012)
9. Wang, Y., Streff, K., Raman, S.: Smartphone security challenges. Computer 45(12), 52–58 (2012)
10. La Polla, M., Martinelli, F., Sgandurra, D.: A survey on security for mobile devices. Commun. Surv. Tutor. IEEE 15(1), 446–471 (2013)
11. Suarez-Tangil, G., et al.: Evolution, detection and analysis of malware for smart devices. IEEE Commun. Surv. Tutor. 16(2), 961–987 (2014)
12. Faruki, P., et al.: Android security: a survey of issues, malware penetration, and defenses. IEEE Commun. Surv. Tutor. 17(2), 998–1022 (2015)
13. Schlegel, R., et al.: Soundcomber: a stealthy and context-aware sound trojan for smartphones. In: NDSS, vol. 11 (2011)
14. Suarez-Tangil, G., et al.: Thwarting obfuscated malware via differential fault analysis. IEEE Comput. 47(6), 24–31 (2014)
15. Jain, A.: Android security: permission based attacks. In: 2016 3rd International Conference on IEEE Computing for Sustainable Global Development (INDIACom) (2016)
16. Rastogi, V., Chen, Y., Enck, W.: AppsPlayground: automatic security analysis of smartphone applications. In: Proceedings of the Third ACM Conference on Data and Application Security and Privacy. ACM (2013)

Survey of Techniques Used for Tolerance of Flooding Attacks in DTN

Maitri Shah and Pimal Khanpara

Abstract A delay-tolerant network (DTN) is a network designed to operate efficiently over extreme distances, such as those in space communications. In these environments, latency is the major factor affecting the quality of the network. In a delay-tolerant network, the nodes are not constantly connected to each other, and no specialized network infrastructure is available for managing the network. As a result, DTNs already face major challenges such as communication delay, data dissemination, and routing; another major challenge is protecting the network nodes from attackers. Existing mechanisms provide a good degree of security, but they use complex hashing algorithms that take a significant amount of time, ultimately affecting the limited bandwidth and limited battery life of mobile nodes. In this paper, we present a survey of three techniques for preventing flooding attacks, along with their advantages and disadvantages.

Keywords Delay-tolerant networks ⋅ Flooding attacks ⋅ Security

1 Introduction

A Delay/Disruption Tolerant Network (DTN) is designed to connect two or more mobile nodes that are carried by humans or vehicles. A DTN makes communication possible in the most unstable and remote environments, in which the nodes face frequent disconnections. Because constant connectivity and communication infrastructure are lacking, two DTN nodes can communicate and transfer data only when they move into each other's transmission range.

M. Shah ⋅ P. Khanpara (✉) Institute of Technology, Nirma University, Ahmedabad, India e-mail: [email protected]
M. Shah e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_60


DTN delivers data using an automatic store-and-forward mechanism. When a node receives one or more packets, it stores them in its buffer and forwards them whenever it comes within range of another node that can relay them toward their destination. In a DTN, contacts between nodes are opportunistic, and a contact may be short because the nodes are mobile; the bandwidth available for transmission is also limited. Being mobile, the nodes may also have limited buffer space.
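The store-and-forward behaviour can be sketched as follows. This is a minimal illustration under stated assumptions, not a DTN routing protocol: every contacted node is treated as a suitable relay, packets are `(destination, payload)` tuples, and the class name `DTNNode`, the `buffer_size` parameter, and the per-contact `budget` (standing in for short contacts and limited bandwidth) are hypothetical.

```python
from collections import deque


class DTNNode:
    """Toy node with a bounded buffer (assumed model, for illustration)."""

    def __init__(self, name, buffer_size):
        self.name = name
        self.buffer_size = buffer_size
        self.buffer = deque()   # packets waiting for a contact
        self.delivered = []     # packets addressed to this node

    def store(self, packet):
        """Buffer a packet; refuse it when the buffer is full."""
        if len(self.buffer) >= self.buffer_size:
            return False
        self.buffer.append(packet)
        return True

    def on_contact(self, other, budget):
        """Hand over at most `budget` buffered packets during one contact."""
        sent = 0
        while self.buffer and sent < budget:
            packet = self.buffer[0]
            destination = packet[0]
            if destination == other.name:
                other.delivered.append(packet)   # final delivery
            elif not other.store(packet):
                break                            # relay buffer is full
            self.buffer.popleft()
            sent += 1
        return sent
```

The bounded buffer and the per-contact budget are exactly the resources that a flooding attacker exhausts, which motivates the rate limits discussed in the later sections.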

1.1 Security Issues in DTNs

With restricted buffer space and limited bandwidth, DTNs [1] are exposed to flood attacks. In a flooding attack, an outsider node or a node inside the network intentionally injects as many packets as possible into the network or, instead of inserting different packets, forwards replicas of the same packet to as many nodes as possible. Two types of flooding attack [2] are therefore possible in DTNs: (1) the packet flood attack and (2) the replica flood attack. Many schemes [3] have been proposed to prevent flood attacks in the Internet and in wireless sensor networks. These mechanisms cannot be applied directly to DTNs because they assume persistent connectivity, which is not available in DTNs. Mechanisms that can defend DTNs against flood attacks are therefore needed.

2 Flood Attacks in DTNs

Any number of nodes in the network may launch flood attacks [4] for malicious or selfish purposes. Malicious nodes can be deployed intentionally to bring down the performance of the network by wasting its limited resources or by increasing its traffic. Selfish nodes may launch flood attacks to congest the network for others and increase their own communication throughput. Because contacts in a DTN are opportunistic, the probability of packet delivery is less than 1; by flooding many replicas of its own packet, a selfish node increases the chance of delivery and thus its own throughput.

3 Techniques to Prevent Flooding Attacks

3.1 Claim-Carry and Check Using Rate-Limiting Factor

The authors of [2, 5, 6] present a rate-limiting approach that defends against flood attacks in DTNs using claim-carry-and-check. In this technique, each node has a limit on the number of packets it may send out over a time interval and a limit on the number of copies of a packet it may generate. To find out whether a node has violated its rate limit, they [2] present a distributed approach.

3.1.1 Claim-Carry and Check

Each node must maintain a count of the packets it has sent into the network as a source in a particular time interval, so that a violation of the rate limit, and hence an attacker, can be detected. A claim and a rate limit certificate are sent out with each packet to verify a node's authenticity. An attacker will try to send more packets than its rate limit allows while falsely claiming a count within the limit. Such a count must already have been used for another packet (pigeonhole principle), and this reuse is a clear indication of an attacker. An example for packet flood attacks is given in Fig. 1a [5]. The same mechanism detects an attacker that forwards a buffered packet more than once in a replica flood attack. When a node transmits a packet to its next hop, the packet carries a claim containing the number of times the node has transmitted this packet, including the current transmission. An attacker therefore has to reuse a count if it wants to transmit the packet more often than its limit allows, and it can be detected through the inconsistent claims. Examples for the replica flood attack are given in Fig. 1b, c [4].
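The pigeonhole argument can be made concrete with a short sketch. It assumes, for illustration only, that collected claims are available as `(source, interval, count, packet_hash)` tuples; the function name `find_reused_counts` is hypothetical, and signature verification is left out.

```python
from collections import defaultdict


def find_reused_counts(claims):
    """Return the sources that used one count for two different packets.

    `claims` is an iterable of (source, interval, count, packet_hash)
    tuples. Within one interval an honest source uses each count value
    for exactly one packet, so two different packet hashes under the
    same (source, interval, count) key expose a false claim.
    """
    seen = defaultdict(set)
    attackers = set()
    for source, interval, count, packet_hash in claims:
        hashes = seen[(source, interval, count)]
        hashes.add(packet_hash)
        if len(hashes) > 1:
            attackers.add(source)
    return attackers
```

With a rate limit of 3, a source that sends four packets in one interval can only claim counts 1, 2 and 3, so at least two of its packets must share a count; as soon as both claims reach the same node, the check above flags the source.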

3.1.2 Claim Construction

The P-claim is used to detect packet flood attacks and the T-claim to detect replica flood attacks. The source generates the P-claim and sends it to the next hop along with the original packet. The source also generates a T-claim and appends it to the packet. Whenever the packet is passed to the next hop, that hop peels off the T-claim, checks it for inconsistency, and attaches a newly generated T-claim to the packet. In general, the P-claim of the source node and the T-claim of the previous node are used to identify an attacker in the network. When a source node S sends a new packet that it has produced but not yet sent out, it generates the P-claim and T-claim as follows:

Fig. 1 Basic idea of flood detection


P-claim: $S, C_p, t, H(m), \mathrm{SIG}_S(H(H(m)\,|\,S\,|\,C_p\,|\,t))$

Here, $t$ is the current time stamp and $C_p$ is the packet count ($1 \le C_p \le L$, where $L$ is the rate limit). If a receiving node finds $C_p > L$, it discards the packet; otherwise it stores the packet in its buffer. A T-claim is appended to a packet whenever the packet is transmitted to another node. Suppose that node A transmits a packet $m$ to node B. The T-claim appended to $m$ includes the number of times $C_t$ that node A has transmitted this packet and the current time stamp $t$:

T-claim: $A, B, H(m), C_t, t, \mathrm{SIG}_A(A\,|\,B\,|\,H(H(m)\,|\,C_t\,|\,t))$

On receiving the packet, a node checks $C_t$ against the limit and acts accordingly. Inconsistency at any node can be detected by verifying the P-claim and T-claim. For example, in Fig. 1 the P-claims of m3 and m4 use the count value 3 repeatedly. Likewise, dishonest T-claims result in count reuse.
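The two claim formats can be assembled as in the following sketch. Several choices here are assumptions made only to keep the example self-contained: SHA-256 stands in for $H$, an HMAC with a shared key stands in for the signature SIG (a real deployment would use a public-key signature so that any node can verify it), fields are joined with `|`, and the function names are hypothetical.

```python
import hashlib
import hmac


def H(data: bytes) -> bytes:
    """Hash function H (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).digest()


def sign(key: bytes, data: bytes) -> bytes:
    """Stand-in for SIG. HMAC is NOT a public-key signature."""
    return hmac.new(key, data, hashlib.sha256).digest()


def make_p_claim(key_s, source, count, timestamp, message):
    """P-claim: S, Cp, t, H(m), SIG_S(H(H(m) | S | Cp | t))."""
    hm = H(message)
    body = b"|".join(
        [hm, source.encode(), str(count).encode(), str(timestamp).encode()]
    )
    return {"S": source, "Cp": count, "t": timestamp, "H(m)": hm,
            "SIG": sign(key_s, H(body))}


def make_t_claim(key_a, sender, receiver, count, timestamp, message):
    """T-claim: A, B, H(m), Ct, t, SIG_A(A | B | H(H(m) | Ct | t))."""
    hm = H(message)
    inner = H(b"|".join([hm, str(count).encode(), str(timestamp).encode()]))
    body = b"|".join([sender.encode(), receiver.encode(), inner])
    return {"A": sender, "B": receiver, "H(m)": hm, "Ct": count,
            "t": timestamp, "SIG": sign(key_a, body)}


def accept_packet(p_claim, rate_limit):
    """Receiver-side check: discard the packet when Cp exceeds L."""
    return p_claim["Cp"] <= rate_limit
```

A receiver would call `accept_packet` on the P-claim before buffering the packet; the count carried in the claim is what later makes reuse detectable.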

3.1.3 Detection Strategy

Whenever a node sends or receives a packet, it stores both claims (P-claim and T-claim) in its local buffer. Initially, both are kept as full claims. When a node removes a packet from its buffer (i.e., when the packet has reached its destination or has been dropped because it expired), it keeps only compacted versions of the P-claim and T-claim to reduce storage cost. Whenever a node receives a forwarded packet with a claim, it verifies the claim's signature and checks the claim for inconsistency with its locally stored claims. The claim of a forwarded packet is always a full claim. If the locally stored claim is also a full claim, the node can inform the other nodes in the network by broadcasting a global alarm containing the local claim and the received claim. On receiving an alarm, a node checks the two claims for inconsistency. If it finds an inconsistency, it rebroadcasts the alarm; otherwise it discards the alarm. It does not rebroadcast an alarm that it has already broadcast for the same claim. If the locally stored claim is not a full claim, the node cannot broadcast a global alarm, because a compacted claim does not include the signature, so other nodes could not be convinced of its authenticity. Since the attacker may have sent the same false claim to other nodes as well, the detecting node can send a local alarm to the contacted nodes that may have received it. If any of those nodes holds a full claim, the attacker can be detected and that node broadcasts a global alarm to all other nodes. On receiving a global alarm, a node removes its local alarm.
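The choice between a global and a local alarm can be sketched as a small decision function. The representation is an assumption for illustration: a claim is reduced to source, count, and packet hash, both claims are taken to belong to the same time interval, and a compacted claim is modelled simply as a claim whose signature is missing. The names `Claim`, `inconsistent`, and `alarm_for` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    source: str
    count: int
    packet_hash: str
    signature: Optional[str]  # None models a compacted claim

    @property
    def is_full(self) -> bool:
        return self.signature is not None


def inconsistent(a: Claim, b: Claim) -> bool:
    """Same source and count, but a different packet."""
    return (a.source == b.source and a.count == b.count
            and a.packet_hash != b.packet_hash)


def alarm_for(stored: Claim, received: Claim) -> str:
    """Decide which alarm the detecting node can raise.

    The received claim travels with a forwarded packet and is a full
    claim. Only a full stored claim carries the signature needed to
    convince other nodes, so only then is a global alarm possible.
    """
    if not inconsistent(stored, received):
        return "none"
    return "global" if stored.is_full else "local"
```

A node that can only raise a local alarm relies on one of the contacted nodes still holding the full claim, as the text describes.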

3.1.4 Advantages and Limitations

This scheme uses an efficient claim construction mechanism that keeps communication, computation, and storage costs low. The attack detection probability is also high, which makes the mechanism effective. It works in a distributed manner, so it does not depend on a single node acting as a central authority. However, when the packet generation rate is too high, some packets are dropped because of buffer overflow, and the communication and computation spent on creating their signatures and building their claims are wasted. This overhead matters in heavy-traffic scenarios.

3.2 Using Encounter Records

The authors of [7] propose a detection scheme for flooding attacks that piggybacks on an existing encounter-record-based scheme for detecting blackhole attacks. Nodes are required to exchange their ER (Encounter Record) history, which records the messages sent during their previous contacts. With this mechanism, a malicious node causing a packet flooding or replica flooding attack is detected. Adversary nodes may try to conceal the attack by removing or skipping unfavorable ERs. However, this too leads to detection, because the time stamps and sequence numbers of the ERs become inconsistent.

3.2.1 How It Works?

Suppose two nodes i and j with respective identifiers $ID_i$ and $ID_j$ contact each other. After exchanging messages, each node generates an encounter record and stores it locally. The record $ER_i^*$ stored by node i is as follows:

$ER_i = \langle ID_i, ID_j, sn_i, sn_j, t, SL_i \rangle$

$SL_i = \{ MR_m \mid i \text{ sends message } m \text{ to node } j \}$

$ER_i^* = ER_i, sig_i, sig_j$

where $sn_i$ and $sn_j$ are the sequence numbers of nodes i and j, respectively, $t$ is the encounter time, and $sig_i$ and $sig_j$ are the signatures of the two nodes on $ER_i$, generated with their own private keys. $SL_i$ lists the messages sent by node i to j. When the node contacts another node, it assigns the new ER a sequence number incremented by 1 from that of its latest ER. Each message record is denoted as follows:

$MR_m = \langle ID_m \rangle$ if $SRC_m \ne i$

$MR_m = \langle ID_m, REP_m, GEN_m \rangle$ if $SRC_m = i$


Fig. 2 Encounter record (ER) manipulation strategies

Here, $ID_m$ is the identifier of the message, $REP_m$ is the number of replicas of message m that node i has generated, and $GEN_m$ is the time stamp at which node i generated message m. Because buffer space is limited, a node keeps only a window of its w latest ERs to show to its neighbours. To hide an attack or malicious activity, an attacker may forge its own ER history to obtain a window that is advantageous to itself, and present this forged window to its neighbours. As shown in Fig. 2 [7], suppose an adversary node has an original series of encounter records A, B, C and D with the respective sequence numbers and time stamps (1, 10 min), (2, 20 min), (3, 30 min) and (4, 40 min). If the malicious node wants to delete an unfavorable encounter record, for example C, it may either leave C's sequence number 3 unused or reuse sequence number 3 to generate another ER, e.g., E (3, 50 min).
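The record layout and the window of the w latest ERs can be represented as in the sketch below. It is an assumed data model for illustration: the two signatures are omitted, and the class and field names (`MessageRecord`, `EncounterRecord`, `ERHistory`, `next_sn`) are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MessageRecord:
    msg_id: str                  # ID_m
    rep: Optional[int] = None    # REP_m, set only when the node is the source
    gen: Optional[float] = None  # GEN_m, set only when the node is the source


@dataclass
class EncounterRecord:
    id_i: str
    id_j: str
    sn_i: int                    # sequence number of the recording node
    sn_j: int                    # sequence number reported by the peer
    t: float                     # encounter time
    sent: List[MessageRecord] = field(default_factory=list)  # SL_i


class ERHistory:
    """Keeps only the w latest encounter records of one node."""

    def __init__(self, owner: str, w: int):
        self.owner = owner
        self.window = deque(maxlen=w)  # older records fall out automatically
        self.next_sn = 1

    def record(self, peer, peer_sn, t, sent):
        """Create the ER for a new contact with the next sequence number."""
        er = EncounterRecord(self.owner, peer, self.next_sn, peer_sn, t, sent)
        self.next_sn += 1
        self.window.append(er)
        return er
```

Because an honest node increments the sequence number on every contact, the window it presents always contains consecutive sequence numbers.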

3.2.2 Detection Strategy

A malicious node can misbehave in two ways: it can skip a sequence number, or it can reuse a sequence number for an encounter with another node. Whenever a node manipulates its ERs, its series of sequence numbers or time stamps becomes inconsistent. When an attacker floods messages, the number of packets it sends or the number of replicas of a packet it forwards exceeds the allowed limit. Both effects lead to the detection of the attack. When two nodes i and j are in contact, they exchange their ERs to examine each other's behaviour. If one node finds that the other is suspicious or malicious, it blacklists that node and, once it has properly verified the finding, informs the network about the attacker. The ERs of an honest node carry consecutive sequence numbers, and a higher sequence number comes with a later time stamp. Malicious nodes manipulate their ERs using one of the strategies shown in Fig. 2 [7]. If an attacker has skipped a sequence number, its sequence numbers are nonconsecutive, e.g., 1, 2 and 4. If an attacker has reused a sequence number, its time stamps are no longer in increasing order, e.g., 10, 20, 50 and 40. Thus, by examining the time stamps and sequence numbers of a node, one can detect the attack. Node i uses the ERs of node j to determine how many packets j generates per time interval and how many replicas j forwards for each packet. Then, i compares these values with the predefined limits (L).
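The two consistency checks on an ER window can be sketched as follows. The input format is an assumption: the presented window is reduced to `(sequence_number, timestamp)` pairs listed in sequence-number order, and the function name `check_er_series` is hypothetical.

```python
def check_er_series(records):
    """Report inconsistencies in a presented ER window.

    `records` is a list of (sequence_number, timestamp) pairs in the
    order presented. Honest windows have consecutive sequence numbers
    and strictly increasing time stamps.
    """
    problems = []
    for (sn_prev, t_prev), (sn, t) in zip(records, records[1:]):
        if sn != sn_prev + 1:
            problems.append(("non_consecutive_sequence", sn_prev, sn))
        if t <= t_prev:
            problems.append(("timestamp_out_of_order", t_prev, t))
    return problems
```

For the series from the text, the honest window (1, 10), (2, 20), (3, 30), (4, 40) yields no problems; dropping C leaves the gap between 2 and 4; and replacing C with E (3, 50) puts time stamp 50 before 40.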


If j violates the limits, i blacklists j. From the ER history of node j, node i can extract the set of distinct messages that node j has recently sent to the network. Suppose n messages were sent, with respective time stamps $t_1, t_2, \ldots, t_n$. This means that node j generated n messages during the period $(t_n - t_1)$. On average, node j generated $n^*$ messages in each time interval $T$, where $n^* = n / \lceil (t_n - t_1)/T \rceil$. If $n^* > L$, node j is detected as a flood attacker. By checking the ER history, node i can also obtain the replica numbers appended to each packet that node j has created and transmitted. Let the replica counts recorded for such a packet m be $rep_1, rep_2, \ldots, rep_k$, together with their encounter time stamps. If node j is not an attacker, its series of replica counts is sequential, i.e., $rep_l = rep_{l-1} + 1$ for $l = 2, \ldots, k$, and the highest replica count $rep_k$ does not exceed the replica count limit L. If either condition is violated, node j is detected as a replica flood attacker.
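Both tests translate directly into code. The sketch below makes two assumptions that the text does not state: when all time stamps fall at the same instant, the number of intervals is taken as 1 to avoid a division by zero, and the input lists are non-empty. The function names are hypothetical.

```python
import math


def average_rate(timestamps, T):
    """n* = n / ceil((t_n - t_1) / T) for the generation time stamps."""
    n = len(timestamps)
    intervals = math.ceil((max(timestamps) - min(timestamps)) / T)
    return n / max(intervals, 1)  # assumed guard for a zero-length span


def is_packet_flooder(timestamps, T, L):
    """Packet flood test: average rate per interval exceeds the limit L."""
    return average_rate(timestamps, T) > L


def is_replica_flooder(rep_counts, L):
    """Replica flood test on the counts rep_1..rep_k of one packet.

    An honest series grows by exactly 1 per transmission and never
    exceeds the replica limit L.
    """
    sequential = all(b == a + 1 for a, b in zip(rep_counts, rep_counts[1:]))
    return (not sequential) or max(rep_counts) > L
```

As a worked case, six messages generated at 0, 10, 20, 30, 40 and 50 with T = 60 span less than one interval, so n* = 6 / 1 = 6; with L = 5 the node is flagged. A replica series 1, 2, 2 is flagged because a count is repeated, and 1, 2, 3, 4 is flagged for L = 3 because the last count exceeds the limit.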

3.2.3 Advantages and Limitations

Analysis of this scheme shows that the detection delay decreases and the detection rate increases as EMR increases. The detection rate for flooding attacks is very high (almost 1), and the scheme does not incur any false positives. The storage required for each encounter record is affordable for mobile nodes. However, each node must be capable of decoding the ERs at every contact. Moreover, if the attacker meets a node holding its ER history only after a long time, the attack slows down the network in the meantime through resource over-utilization.

3.3 Using Stream Check Method

This scheme [8] is an intrusion detection mechanism that uses a streaming node (a node with monitoring capabilities) to monitor the network environment. The monitor node maintains three tables: a rate limit table, which holds the rate limits of all nodes in the network; a delivery probability table, which holds the delivery probability of each node in the network; and a table of blacklisted nodes, i.e., attackers. The streaming node compares the estimated and the actual probability of packet delivery. If the difference is greater than the assigned limit, the streaming node lists that node as malicious.

3.3.1 How It Works?

The streaming node [8] is used to detect malicious nodes inside a DTN. At every time interval, whenever two nodes of the network make contact, the streaming node travels along the path of the packet to check the authenticity of the communicating nodes. A rate limit restricts the packet rate of each node; it is obtained in a request-response manner and assigned by a trusted authority based on the node's traffic demand. During transmission, a packet is sent as small data blocks. The rate limit table contains, for each node, the approved rate limit, the node details, and the starting and ending sequence numbers. A node can send packets to the node it contacts. The streaming node does not monitor these activities, so the node monitors them itself: it maintains the updated packet count and updated claims, and it verifies the claims against the rate limit certificates attached to the packets. If an attacker floods the network with new packets or with replicas of the same packet while claiming a false count, the streaming node detects the attacker because it has violated its rate limit, adds it to the blacklist table, and informs all nodes in the network. For example [8], suppose an attacker knows that two nodes A and B never communicate with each other. The attacker can send a packet to node A, invalidly replicate it, and send the replica to node B. Since A and B never communicate, they cannot detect the attacker by themselves. The streaming node, however, holds the three tables with information about all nodes and checks them against all nodes that participate in the communication. It first checks the rate limit, then checks whether any of the nodes is already in the blacklist table, and finally checks the delivery probability that the node has estimated. The streaming node compares the actual probability (calculated from the rate limit table using the starting and ending sequence numbers) with the estimated value. If the difference is greater than the threshold value, the streaming node lists that node as an attacker and adds it to the blacklist table.
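The order of checks performed by the streaming node can be sketched as below. The source describes the tables only informally, so the data model is an assumption: each table is a dictionary keyed by node name, the actual delivery probability is passed in rather than derived from sequence numbers, and the names `StreamingNode` and `check` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StreamingNode:
    rate_limit: dict          # node -> approved packets per interval
    estimated_delivery: dict  # node -> estimated delivery probability
    threshold: float          # allowed gap between actual and estimated
    blacklist: set = field(default_factory=set)

    def check(self, node, packets_sent, actual_delivery):
        """Apply the three checks in the order given in the text."""
        # 1. Rate limit table
        if packets_sent > self.rate_limit[node]:
            self.blacklist.add(node)
            return "rate_limit_violation"
        # 2. Blacklist table
        if node in self.blacklist:
            return "blacklisted"
        # 3. Delivery probability table
        gap = abs(actual_delivery - self.estimated_delivery[node])
        if gap > self.threshold:
            self.blacklist.add(node)
            return "delivery_probability_deviation"
        return "ok"
```

Because all three tables live on one node, the sketch also makes the centralization visible: every decision depends on the state held by the single `StreamingNode` instance.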

3.3.2 Advantages and Limitations

This approach finds malicious nodes efficiently and effectively even when an attacker mounts its attack in a way that no distributed mechanism can detect. The packet delivery ratio increases and the propagation delay decreases with this mechanism. However, the streaming node is a single point of failure: if it fails or is corrupted, the network will collapse.

4 Conclusion

Security is the major issue in delay-tolerant networks. Flooding attacks in particular slow the network down and prevent its resources from being used properly. The rate-limiting technique is used to mitigate flood attacks in DTNs. The claim-carry-check mechanism detects rate violations by any node. The encounter record method detects attacks by examining the ER history of a node. Both of these approaches work in a distributed manner. In the stream check method, a streaming node monitors the traffic and behaviour of the nodes to detect attacks; this approach is centralized. Our future work will be based mainly on distributed mechanisms. There is scope for enhancement in areas such as storage capacity, communication cost, and detection rate. Advances in technology bring many advantages to DTNs, but they also bring many challenges.

References

1. Fall, K.: A delay-tolerant network architecture for challenged internets. In: Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 27–34. ACM (2003)
2. Li, Q., Gao, W., Zhu, S., Cao, G.: To lie or to comply: Defending against flood attacks in disruption tolerant networks. IEEE Trans. Dependable Secur. Comput. 10(3), 168–182 (2013). https://doi.org/10.1109/TDSC.2012.84
3. Burgess, J., Bissias, G.D., Corner, M.D., Levine, B.N.: Surviving attacks on disruption-tolerant networks without authentication. In: Proceedings of the 8th ACM International Symposium on Mobile Ad Hoc Networking and Computing, pp. 61–70. ACM (2007)
4. Natarajan, V., Yang, Y., Zhu, S.: Resource-misuse attack detection in delay-tolerant networks. In: 2011 IEEE 30th International Conference Performance Computing and Communications Conference (IPCCC), pp. 1–8. IEEE (2011)
5. Ramaraj, K., Vellingiri, J., Saravanabhavan, C., Illayarajaa, A.: A delay-tolerant network architecture for challenged internets. In: Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 27–34. ACM (2003)
6. Sailaja, N., Farook, S.M.: Claim-carry-check to defend against flood attacks in disruption tolerant network. In: IEEE International Conference on Trust, Security and Privacy in Computing and Communications IJSETR, pp. 118–125. IEEE (2013)
7. Diep, P.T.N., Yeo, C.K.: Detecting flooding attack in delay tolerant networks by piggybacking encounter records. In: 2015 2nd International Conference on Information Science and Security (ICISS), pp. 1–4. IEEE (2015)
8. Kuriakose, D., Daniel, D.: Effective defending against flood attack using stream-check method in tolerant network. In: 2014 International Conference on Green Computing Communication and Electrical Engineering (ICGCCEE), pp. 1–4. IEEE (2014)
9. Li, Q., Zhu, S., Cao, G.: Routing in socially selfish delay tolerant networks. In: IEEE 2010 Proceedings INFOCOM, pp. 1–9. IEEE (2010)

An In-Depth Survey of Techniques Employed in Construction of Emotional Lexicon Pallavi V. Kulkarni, Meghana B. Nagori and Vivek P. Kshirsagar

Abstract Emotion is a state of mind affected by many external parameters, one of which is text, whether read or spoken by others or by oneself. Recognition of emotion from facial expression, sound intensity, or text is becoming an interesting research area. Extracting emotions from text is a relatively unexplored but important research problem in the natural language processing domain. It requires the construction of an emotional lexicon in the respective natural language for classification of text or documents into emotional classes. In this paper, an overview of the state-of-the-art techniques used to construct emotional lexicons for different languages is given. These methods are in their initial stage of research, as much of the work is still directed at optimizing results, and the field is therefore open to a wide range of innovative contributions. The authors conclude with a proposal for developing a language-independent emotional lexicon. The main challenges in implementing it are discussed, and promising applications in various fields are also elaborated.



Keywords Emotion recognition · Natural language processing · Emotional lexicon · WordNet-Affect · Crowdsourcing





P. V. Kulkarni (✉) ⋅ M. B. Nagori ⋅ V. P. Kshirsagar
Computer Science & Engineering Department, Government Engineering College, Aurangabad, Maharashtra, India
© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_61

1 Introduction

Emotion can be categorized as expressed emotion, perceived emotion, and felt (evoked) emotion. The performer is the person who delivers text through a song, poem, speech, or ordinary talk meant for a specific purpose, and who tries to express his intended emotions to the listener. Which emotion is felt depends on the listener's perception; sometimes it is the same as the expressed emotion and sometimes it is different. The emotion felt by the listener in response to the song and its performance is the felt emotion. Human communication takes place through sound and sentences. The combination of sound and a group of sentences generates a particular emotion, which is the expressed emotion, and the same emotion may be perceived by the listener. Emotions play a very important role in human life. A lot of work has been done to classify emotions by considering audio or sound. Patra et al. classified Hindi songs into different moods using lyrics [1, 2]. Alsharif et al. classified Arabic poetry into different emotions using machine learning [3]. Chuang et al. detected emotions from speech and text for the Chinese language [4]. The area of lexicon-based emotion recognition needs attention, and the scope increases as the language changes. Opinion mining (or sentiment analysis) and emotion recognition work in the same direction, classifying text based on human psychology, but emotion recognition goes deeper and tries to predict the exact emotional state of mind, whereas opinion mining tries to find a positive, negative, or neutral attitude in the text. SentiWordNet, which is used for sentiment analysis, can be constructed using two main approaches: WordNet based and corpus based [5]. SentiWordNet for Hindi and Bengali was developed by WordNet expansion through a dictionary-based approach using synonymy and antonymy relations [6].

2 Need

Affective computing (AC) bridges the gap between computer and human by considering emotional aspects. Learning methods, gaming, and stress management will benefit from affect-based systems [7]. It is necessary to study the relation between natural language and affective information in order to improve human–human and human–computer interaction, and it is becoming crucial for computer systems to handle this information. The basic objective of these systems is to automatically recognize emotion from physiology, face, voice, body language, and text. Textual conversation, or poems set to music, is one of the most used media for expressing emotion. This gives rise to the research area of recognizing emotion from natural language. In order to identify or classify emotion from a speech, a document, paragraph, or sentence, or a musical piece involving lyrics, the system needs a well-constructed emotional lexicon for the respective language. A blog-based emotion corpus that represents emotions can also be constructed [8, 9]. This corpus-based approach requires the assumption that a language community shares similar conceptions of the different emotions. An emotional lexicon, or affect-WordNet, is a lexical database (an extension of WordNet) in which emotional ratings are assigned to words. In [10], Balamurali et al. showed that using sense-based WordNet features improves the overall performance of machine learning algorithms for sentiment analysis.


3 Approaches to Construct Emotional Lexicon

A lexicon is a structured collection of words together with lexical information about those words, such as whether each is a noun, verb, or adjective. Along with the lexical entries, feature information is also included. An affective or emotional lexicon contains a set of affective concepts correlated with affective words. In this dictionary, the main emotions are specified for each word, and the words are clustered according to emotion. It is impractical to build an emotional lexicon for any natural language manually, so it is essential to develop a method to construct emotional lexicons automatically. The following are dictionary-based methods used to automatically construct an emotional lexicon for a specific natural language.

3.1 An Affective Extension of WordNet

Strapparava and Valitutti [11] assigned to words one or more affective labels (a-labels) representing eleven psychological terms, including emotion and mood. WordNet-Affect for the English language was developed in two stages. First, a collection is made of adjectives containing affective information, together with nouns, verbs, and adverbs having an intuitive correlation with those adjectives. For each item, lexical information and affective information are added in a frame. This group of information is considered the core and is named AFFECT; it is developed in the first phase. In the second phase, the core is extended by exploiting WordNet relations. Synsets related to the core group through these relations are included in WordNet-Affect, and the others are checked manually. The recent version of WordNet-Affect contains 2,874 synsets and 4,787 words. It formed a base for developing emotional lexicons in other languages. WordNet-Affect has an additional and independent hierarchy of "affective domain labels" for words and synsets. This lexical resource is the first of its kind and is useful for improving the performance of applications based on natural language processing.

3.2 Graph-Based Algorithm

Xu et al. [12] developed a Chinese emotional lexicon using this approach. Multiple resources from the Chinese language are used to identify emotions. A similarity matrix is constructed as a fusion of similarities based on an unlabeled corpus, a semantic dictionary, and heuristic rules. The algorithm creates a matrix of words in which the cell value (i, j) represents a ranking score. Initially, some seed emotion words are taken as labeled points. The similarity between pairs of words is then considered iteratively, and the matrix expands to other words using the labeled information. Such graph-based transductive learning methods are useful when data are scarce. A separate matrix is constructed for each resource, and finally all four similarity matrices are fused to achieve multiview learning. A seed database is prepared for five basic emotions (happiness, sadness, anger, fear, and surprise), and the results are manually validated by judges using the ranking method. The noise introduced by this method is removed by an iterative feedback approach. Finally, five Chinese emotional lexicons (one for each basic emotion) were constructed by following the guidelines and linguistic considerations for manually identifying emotion words.
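The idea of spreading seed labels over a word-similarity graph can be illustrated with a minimal sketch. This is not the algorithm of Xu et al. [12]; it is a generic label-propagation loop, and the function name `propagate_labels`, the damping parameter `alpha`, and the toy graph are assumptions made only for illustration.

```python
def propagate_labels(similarity, seeds, iterations=50, alpha=0.8):
    """Spread seed scores over a word graph (illustrative sketch only).

    similarity: dict mapping each word to a dict {neighbour: weight >= 0};
                every neighbour must itself be a key of `similarity`.
    seeds:      dict mapping seed emotion words to their initial score.
    alpha:      assumed mixing weight between propagated score and seed prior.
    """
    words = list(similarity)
    scores = {w: seeds.get(w, 0.0) for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            neighbours = similarity[w]
            total = sum(neighbours.values())
            # Weighted average of the neighbours' current scores.
            spread = (
                sum(weight * scores[n] for n, weight in neighbours.items()) / total
                if total > 0
                else 0.0
            )
            # Mix the propagated score with the fixed seed prior.
            updated[w] = alpha * spread + (1 - alpha) * seeds.get(w, 0.0)
        scores = updated
    return scores


# Hypothetical toy graph: one seed word for "happiness".
toy_graph = {
    "happy": {"joyful": 0.9},
    "joyful": {"happy": 0.9, "cheerful": 0.8},
    "cheerful": {"joyful": 0.8},
    "table": {"chair": 0.9},
    "chair": {"table": 0.9},
}
ranking = propagate_labels(toy_graph, {"happy": 1.0})
```

Words connected to the seed receive a score that decreases with graph distance, while words in a component without any seed keep a score of zero, which is the behaviour a ranking-based validation step would rely on.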

3.3 Annotation-Based Approach

Mohammad and Turney [13] constructed a high-quality, moderate-size English emotional lexicon called EmoLex. Amazon Mechanical Turk, an online service of Amazon, is carefully used. Through a smart question/answer interface, a large number of human annotations are obtained. Target words that occur frequently in the Google n-gram corpus are chosen from the Macquarie Thesaurus as a source pool. Selected target words having exactly one sense are made available to annotators through a smartly designed interaction called a word choice problem. The responses from annotators classify each word into six emotional classes based on the Plutchik emotional model. EmoLexWAL, a dictionary based on the WordNet-Affect Lexicon for the target words, is used to validate the annotation-based classification of words into emotional classes. The challenges faced in filtering human annotations are enumerated in [13]; the same authors later improved the crowdsourcing procedure by exploiting the strength and wisdom of the crowd [14].

3.4 Classifier Based Approach

Das et al. [15] proposed fuzzy clustering with SVM-based classification. Words are classified into seven categories of emotion (anger, disgust, fear, guilt, joy, sadness, and shame). Three types of networks are constructed initially:

1. Generalized lexical network (G): constructed from the ISEAR dataset. Nodes are linked if the words appear in a single statement.
2. Co-occurrence-based network (Gco): developed from G based on co-occurrence. The frequency of occurrence and the order of appearance of terms in a statement are used to calculate semantic proximity.
3. WordNet-Affect-based network (GWA): G is transformed into a WordNet-Affect-based network by considering the presence of a word and its stem form in WordNet-Affect. For every match, the word is tagged as an emotion word. Words appearing in the same context are linked if the context contains an emotion word.

The above three networks are further improved by considering the summation of two similarity scores: WordNet-distance-based similarity (WSim) and corpus-distance-based similarity (CSim). The weighted networks G, Gco, and GWA are obtained by adding weights, where each weight is the sum of the two similarity scores. First, fuzzy clustering using c-means is applied; then the words are classified using an SVM classifier. Performance is improved by using PMI (pointwise mutual information) and LGr (universal law of gravitation for words) [15]. Fuzzy c-means clustering followed by an SVM classifier gave 85.92% accuracy for the top 100 words on the WordNet-Affect-based weighted lexical network. Psychological features played a vital role in improving the performance. The model is unable to deal with implicit emotions and idiomatic expressions.
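As a rough illustration of a co-occurrence statistic such as PMI, the following sketch computes the textbook pointwise mutual information, log2(p(a, b) / (p(a) p(b))), over statement-level co-occurrence. It is not taken from Das et al. [15]; the function name `pmi_scores`, the whitespace tokenization, and the use of statements as the co-occurrence window are assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(statements):
    """Return {(word_a, word_b): PMI} for word pairs sharing a statement.

    Probabilities are estimated as the fraction of statements containing
    a word (or both words of a pair). Pairs are stored in sorted order.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for statement in statements:
        words = sorted(set(statement.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))
    n = len(statements)
    scores = {}
    for (a, b), joint in pair_counts.items():
        p_joint = joint / n
        p_a = word_counts[a] / n
        p_b = word_counts[b] / n
        scores[(a, b)] = math.log2(p_joint / (p_a * p_b))
    return scores


# Hypothetical mini-corpus of statements.
example = pmi_scores(["fear dark", "fear dark", "joy sun", "joy sun"])
```

A positive score indicates that two words co-occur more often than chance would predict, which is the kind of evidence that can be used as an edge weight in a lexical network.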

3.5 The Potts Model

Patra et al. [16] used the Potts model (a probabilistic model for lexical networks), which is suitable for multi-label text classification, to develop an emotion lexicon based on Ekman's emotional model. The Potts model is a network in which a variable can take multiple values without any ordering relation between them. Three networks are used:

Gloss network: A lexical network of adjectives, adverbs, nouns, and verbs is constructed in which words are linked in one of two ways, by same-orientation links (SL) or different-orientation links (DL).
Gloss thesaurus network (GT): Synonym, antonym, and hyponym information is used to link the words by either DL or SL.
Gloss thesaurus corpus network (GTC): Co-occurrence information extracted from a corpus is used. Conjunctions between adjectives are used to classify links as DL or SL.

The results are evaluated using WordNet-Affect. The average accuracy achieved is 48.8% for the top 100 words. One important observation is the fuzzy nature of words, where a single word can belong to multiple classes, which gives a direction for future research [16].

3.6 Fuzzy Clustering with Semi-supervised Learning

Poria et al. [17] implemented semi-supervised learning, in which unlabeled examples are allowed to learn from a small set of labeled seed examples from an inventory. A small labeled seed lexicon is used for fuzzy clustering of the large set of unlabeled examples. Fuzzy clustering optimally groups a given set of data points such that each point may belong to one or more clusters. Fuzzy clustering is stated as the minimization of an objective function

$$J(\{\mu_{ik}\}, \{v_i\})$$

where $v_i$, $i = 1, 2, \ldots, c$, is the set of points called the centroids of the clusters, and $\mu_{ik}$, $k = 1, \ldots, N$, is the set of membership values for the $N$ data points. The degree to which a data point $x_k$ belongs to the cluster characterized by the centroid $v_i$ is interpreted as $\mu_i(x_k) = \mu_{ik}$. The total membership of one data point in all clusters must be unity, i.e.,

$$\sum_{i=1}^{c} \mu_{ik} = 1 \quad \text{for } k = 1, \ldots, N. \tag{1}$$

In the construction of an emotional lexicon, words are the data items and the features are co-occurrence-based similarity measures between words [17]. Different sub-corpora, constructed from combinations of feature sets based on co-occurrence in WordNet-Affect and the ISEAR dataset, were used for the experiments. The hybrid approach of combining fuzzy clustering with SVM classifiers improves the accuracy over the traditional approach, reaching up to 95.02%.
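The membership constraint above can be made concrete with a small sketch of the standard fuzzy c-means iteration. The text does not spell out the form of J, so the sketch assumes the usual objective, the sum over i and k of $\mu_{ik}^m \lVert x_k - v_i \rVert^2$ with fuzzifier m = 2; the function name `fuzzy_c_means`, the fixed iteration count, and the toy data are likewise assumptions, not details from Poria et al. [17].

```python
import math


def fuzzy_c_means(points, centroids, m=2.0, iterations=100):
    """Standard fuzzy c-means sketch (assumed objective, see lead-in).

    points, centroids: lists of equal-length tuples of floats.
    Returns (u, centroids) where u[i][k] is the membership of point k
    in cluster i; each column of u sums to 1.
    """
    c = len(centroids)
    n = len(points)
    dim = len(points[0])
    u = [[0.0] * n for _ in range(c)]
    for _ in range(iterations):
        # Membership update from the current centroids.
        u = [[0.0] * n for _ in range(c)]
        for k, x in enumerate(points):
            dists = [math.dist(x, v) for v in centroids]
            zeros = [i for i, d in enumerate(dists) if d == 0.0]
            if zeros:
                # Point coincides with a centroid: give it full membership there.
                for i in zeros:
                    u[i][k] = 1.0 / len(zeros)
                continue
            exponent = 2.0 / (m - 1.0)
            for i in range(c):
                u[i][k] = 1.0 / sum((dists[i] / dists[j]) ** exponent for j in range(c))
        # Centroid update: mean of the points weighted by u^m.
        new_centroids = []
        for i in range(c):
            weights = [u[i][k] ** m for k in range(n)]
            total = sum(weights)
            new_centroids.append(
                tuple(sum(w * x[d] for w, x in zip(weights, points)) / total for d in range(dim))
            )
        centroids = new_centroids
    return u, centroids


# Hypothetical 1-D feature values forming two loose groups.
memberships, centres = fuzzy_c_means(
    [(0.0,), (0.2,), (1.0,), (1.2,)], [(0.0,), (1.0,)]
)
```

In a semi-supervised setting of the kind described above, the initial centroids would plausibly be derived from the labeled seed words rather than chosen arbitrarily, and the resulting memberships could then serve as input features for a classifier.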

3.7 Fuzzy Linguistic Aggregation

Brenga et al. [18] applied fuzzy linguistic modeling to map text onto the hourglass of emotions. Since emotions are strictly related to human feeling and perception, a fuzzy linguistic model is built using linguistic information, which is described by linguistic terms. Emotions and dimensions are mapped as linguistic information. The hourglass of emotions is based on four affective dimensions (Pleasantness, Attention, Sensitivity, and Aptitude) and six emotional levels (−3 to +3) for each dimension, positioned according to the intensity of the emotional state of mind. The LOWA (Linguistic Ordered Weighted Averaging) operator is used to map the dimensions as linguistic variables and the emotions as fuzzy terms. WordNet and WordNet-Affect are used as lexical resources. The LOWA operator is used at both the word and the sentence level. The well-known ISEAR (International Survey on Emotion Antecedents and Reactions) dataset is used for testing.

3.8 Expansion of WordNet-Affect Using SentiWordNet

Torii et al. [19] used this approach to construct a Japanese WordNet-Affect. In this approach, each affect ID of WordNet-Affect is mapped to the corresponding synset ID of WordNet, and the resulting list is translated into the target natural language using that language's WordNet. The translation was possible because the Japanese WordNet is based on the English WordNet. For the large number of entries for which no equivalent match was found, manual translation was carried out. Abdaoui et al. [20] created FEEL, a French Expanded Emotion Lexicon. It is constructed by automatically translating NRC EmoLex (an English lexicon) into French and is then validated manually by the annotation method.


4 Proposed Architecture

From the above-discussed techniques for constructing emotional lexicons, we arrive at a generalized architecture (Fig. 1) that can encompass multiple languages and efficiently construct an emotion-based lexicon. The overview of steps is given as follows.

Fig. 1 Proposed architecture for construction of emotional lexicon


4.1 Algorithm for Mapping and Translation

4.2 Measuring Co-occurrence in Corpus-Based Expansion

4.3 Measuring Agreement Between Two Annotators in Crowd Sourcing

A smartly designed Q/A interaction, through the digital world or traditional methods, can be used to collect data from trained annotators. Agreement between two annotators can be measured by Cohen's kappa (K):

$$K = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)} \tag{2}$$

where Pr(a) is the relative observed agreement between the annotators and Pr(e) is the expected (chance) agreement between the annotators.
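Equation (2) can be computed directly from two annotators' label sequences. The sketch below assumes that Pr(e) is estimated in the usual way, from each annotator's own label frequencies; the function name `cohens_kappa` and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Pr(a): fraction of items on which the annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Pr(e): chance agreement from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of six words.
annotator_1 = ["joy", "joy", "anger", "fear", "joy", "anger"]
annotator_2 = ["joy", "anger", "anger", "fear", "joy", "joy"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

For this example the annotators agree on 4 of 6 items, so Pr(a) = 24/36 and Pr(e) = 14/36, giving K = 10/22, roughly 0.45. Cohen's kappa is defined for exactly two annotators; a crowdsourcing setup with more annotators per word would need a different statistic.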

5 Issues and Challenges for Emotion-Based Lexicons

Natural language is very difficult to analyze. It is not easy to extract the exact emotion from a word, sentence, paragraph, or document. The emotions perceived by a person depend upon many parameters, such as the history of the perceiver, the current psychological condition of the perceiver, knowledge about the domain, and intelligence. Text plays a vital role in creating an emotional situation and directly influences the human brain. Detecting emotions from text is a scientific research goal that is struggling with the following challenges.

• Different Language, Different Emotional Model: There is no universal model available that encompasses the emotional words of all available languages. Language is culture-specific, and there is drastic cultural diversity.
• Word Sense Disambiguation: The emotion inferred from a word or sentence depends upon the sense in which it is used. In annotation-based methods, it is important that the annotator knows the meaning of the word, either by prior knowledge or from a definition [21].
• Fuzzy Nature of Emotional Text: Classification of emotional text is a natural multi-label classification problem.
• Perceiver Dependency: The emotion perceived by a listener or reader is time specific and situation dependent, and it is very hard to compute unless the listener is well trained and familiar with the language.
• Incorporation of Special Symbols and Emoticons: Special symbols such as ? (question mark) and ! (exclamation mark) play a vital role in inserting emotion into text. A combination of seed words and emoticons improves performance [22].

6 Applications

Emotion recognition has a variety of applications in business intelligence, health psychology, smart education, public social life, and entertainment. They can be elaborated as follows:

• Business Intelligence: Emotion recognition will not only support business expansion but also encourage new business opportunities among entrepreneurs. Emotion-based product reviews will prove to be a novel input to business analysis.
• Health Psychology: Listening to music, spiritual elements, or moral speech affects the mental condition in a positive direction. Emotion recognition will motivate the development of systematic approaches to change the psychological condition of children, older adults, and psychologically depressed patients.
• Smart Education: Today's education is undergoing a drastic change from the traditional exam-oriented approach to a skill-based, practice-oriented approach based on the student's affinity. Emotion recognition will help to modify the techniques and tools used to choose courses, deliver lectures, and test the progress of students, leading to smart education.
• Public Social Life and Entertainment: Politicians, lawyers, counselors, and doctors can use models based on emotion recognition to improve the way they communicate with people. TV serials and movies will benefit greatly when emotion is detected automatically. It is also useful in reputation analysis [2, 23].
• Technology: Emotion recognition is a requirement of affective computing, decision-making systems, and intelligent machine learning. Human–computer interaction will reach a turning point on the path of artificial intelligence with the help of automatic emotion recognition systems [24].

7 Conclusion

Although emotions are not directly associated with words, a figurative sense can be extracted. Considering the fuzzy nature of human emotion, it is necessary to adopt a method that supports multi-label classification; hence, fuzzy classification is suitable for constructing an emotion lexicon. There are hundreds of languages spoken by millions of people, which necessitates the construction of emotional lexicons in the future. Current research has led to the development of lexicons for some Indian languages such as Marathi, Hindi, Sanskrit, Bengali, and Tamil. A standard emotional lexicon is still a dream. Efforts have been initiated toward the successful construction of a Hindi emotional lexicon. Future work can focus on developing other emotional lexicons.

References

1. Patra, B.G., Das, D., Bandyopadhyay, S.: Automatic music mood classification of Hindi songs. In: Proceedings of the 3rd Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2013), IJCNLP 2013, pp. 24–28, Nagoya, Japan, 14 October 2013
2. Patra, B.G., Bandyopadhyay, S.: Mood classification of Hindi Songs based on lyrics
3. Alsharif, Q., Alshamaa, D., Ghneim, N.: Emotion Classification in Arabic Poetry using Machine Learning
4. Chuang, Z.-J., Chung-Hsien, W.: Multi-modal emotion recognition from speech and text. Comput. Linguist. Chin. Language Process. 9(2), 45–62 (2004)
5. Patel, S.N., Choksi, J.B.: A survey of sentiment classification techniques. J. Res. 01(01) (2015). ISSN: 2395-7549
6. Das, A., Bandyopadhyay, S.: SentiWordNet for Indian languages. In: Proceedings of the 8th Workshop on Asian Language Resources, pp. 56–63, Beijing, China, 21–22 August 2010
7. Calvo, R.A., D’Mello, S.: Affect detection: an interdisciplinary review of models, methods and their applications. IEEE Trans. Affect. Comput. 1(1) (2010)
8. Quan, C., Ren, F.: Construction of blog emotion corpus for Chinese emotional expression analysis. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp. 1446–1454, Singapore, 6–7 August 2009
9. Das, D., Bandyopadhyay, S.: Labeling emotion in Bengali Corpus-A fine grained tagging at sentence level. In: Proceedings of the 8th Workshop on Asian Language Resources, pp. 47–55, Beijing, China, 21–22 August 2010
10. Balamurali, A.R., Joshi, A., Bhattacharyya, P.: Cost and benefit of using WordNet senses for sentiment analysis
11. Strapparava, C., Valitutti, A.: WordNet affect: an affective extension of WordNet. In: Language Resources and Evaluation (2004)
12. Xu, G., Meng, X., Wang, H.: Build Chinese emotion lexicons using a graph-based algorithm and multiple resources. In: Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pp. 1209–1217, Beijing, August 2010
13. Mohammad, S.M., Turney, P.D.: Emotions evoked by common words and phrases: using mechanical turk to create an emotion lexicon. In: Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, pp. 26–34, Los Angeles, California, June 2010. ©2010. Association for Computational Linguistics
14. Mohammad, S.M., Turney, P.D.: Crowdsourcing a word-emotion association lexicon. In: National Research Council Canada, arXiv:1308.6297v1 [cs.CL]. 28 Aug 2013
15. Das, D., Poria, S., Bandyopadhyay, S.: A Classifier based approach to emotion Lexicon construction. In: Bouma, G., Ittoo, A., Metais, E., Wortmann, H. (eds) Natural Language Processing and Information Systems. NLDB 2012. Lecture Notes in Computer Science, vol. 7337. Springer, Berlin, Heidelberg (2012)
16. Patra, B.G., Takamura, H., Das, D., Okumura, M., Bandyopadhyay, S.: Construction of emotional lexicon using potts model. In: International Joint Conference on Natural Language Processing, pp. 674–679, Nagoya, Japan, 14–18 October 2013
17. Poria, S., Gelbukh, A., Das, D., Bandyopadhyay, S.: Fuzzy clustering for semi-supervised learning-case study: construction of an emotion Lexicon. In: Batyrshin, I., Gonzalez Mendoza, M. (eds) Advances in Artificial Intelligence. MICAI 2012. Lecture Notes in Computer Science, vol. 7629. Springer, Berlin, Heidelberg (2013)
18. Brenga, C., Celotto, A., Loia, V., Senatore, S.: Fuzzy linguistic aggregation to synthesize the hourglass of emotions. In: 2015 IEEE International Conference Fuzzy Systems (FUZZ-IEEE). https://doi.org/10.1109/fuzz-ieee.2015.7338020
19. Torii, Y., Das, D., Bandyopadhyay, S., Okumura, M.: Developing Japanese WordNet affect for analyzing emotions. In: Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis, ACL-HLT 2011, pp. 80–86, 24 June 2011, Portland, Oregon, USA (2011). Association for Computational Linguistics
20. Abdaoui, A., Aze, J., Bringay, S., Poncelet, P.: FEEL: a French expanded emotion lexicon. In: Language Resources and Evaluation. Springer. https://doi.org/10.1007/s10579-016-9364-5
21. Cambria, E., White, B.: Jumping NLP curves: a review of natural language processing research. IEEE Comput. Intell. Mag. (2014). 1556-603X/14
22. Song, K., Feng, S., Gao, W., Wang, D., Chen, L., Zhang, C.: Build emotion lexicon from microblogs by combining effects of seed words and emoticons in a heterogeneous graph. In: TRNC, Cyprus 2015, Guzelyurt, 1–4 September 2015. ACM (2015). ISBN: 978-1-4503-3395-5/15/09
23. Zhang, B., Provost, E.M., Essl, G.: Cross-Corpus Acoustics Emotion Recognition From Singing and Speaking: A Multi-Task Learning Approach. IEEE (2016). ISBN: 978-1-4799-9988-0/16
24. Lee, C., Lee, G.G.: Emotion recognition for affective user interfaces using natural language dialogs. In: IEEE Xplorer Conference Paper, September 2007

DWT-Based Blind Video Watermarking Using Image Scrambling Technique C. N. Sujatha and P. Sathyanarayana

Abstract This paper addresses Arnold-transform-based embedding of a gray image in video using the Discrete Wavelet Transform (DWT). In the proposed scheme, the video is authenticated with different parts of the watermark by using a histogram-based scene change technique. Each frame is separated into three planes. DWT is applied to a selected plane in each frame to decompose it into sub-bands. The chosen secret image is separated into eight bit planes. The bit-plane image is further scrambled using an Arnold transform to give the watermark a high level of protection. In this scheme, embedding is done in the mid- and high-frequency coefficients of the DWT without degrading the perceptual quality of the video. The hidden image is extracted from the marked video by following the inverse processing steps. Robustness is tested by subjecting the marked video to various video processing and image processing attacks. Simulation results show that the proposed scheme is highly resistant to frame averaging, frame dropping, and noise attacks.

Keywords Watermark · CF · Scene change analysis · DWT · PSNR

C. N. Sujatha (✉)
Department of ECE, SNIST, Hyderabad, Telangana, India
P. Sathyanarayana
Department of ECE, AITS, Tirupati, Andhra Pradesh, India
© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_62

1 Introduction

Nowadays, the extensive use of internet technology for transmitting and storing images and videos has made it easy to obtain them illegitimately. Digital watermarking is an effective method for protecting digital multimedia from unauthorized persons. Watermarking is the process of inserting secret information into a host medium. Digital watermarking schemes are generally classified according to embedding domain, extraction process, perceptibility, and ability to resist manipulations. For greater reliability, the watermark is scrambled to add confusion. The reason for incorporating a scrambling method is to encode the watermark image before placing it in the host image. Scrambling is mostly used for encryption and refers to permutations of bits or pixel values [1–3]. The strength of an image encryption algorithm depends on the type of scrambling technique used. A widely used matrix-based scrambling technique is the Arnold transformation [4]; matrix-based scrambling can also be done using the Fibonacci transform, which has a unique property of uniformity [5]. Watermarking in video presents some issues due to the huge amount of data and the redundancies between frames [6]. Video frames are vulnerable to frame dropping, frame averaging, swapping, etc. Inserting the same, video-independent watermark in each frame makes it difficult to maintain imperceptibility, whereas inserting individual watermarks in each frame allows the watermarks to be removed by statistical averaging of consecutive video frames. For better insertion of watermarks, the host video is therefore first divided into different scenes based on scene change analysis [7], and each scene is embedded with an independent watermark. The embedding in the video can be done either in the spatial domain, as in LSB modification [8], or in a transform domain using, for example, the Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), or Discrete Wavelet Transform (DWT) [9–14]. Among these transform schemes, DWT is found to be the best for embedding a secret image in selected frequency bands. DWT is a simple method of representing an image at different resolutions, which is not possible with other schemes such as DFT and DCT. The method of watermark extraction categorizes watermarking schemes into blind and non-blind, as shown in Fig. 1a, b. In a blind scheme, the hidden watermark is extracted without using the original data, whereas in a non-blind scheme the original data is required to extract the watermark from the watermarked media. The stability of the watermark against various attacks classifies watermarking methods into robust, fragile, and semi-fragile. Robust watermarking schemes are designed for copyright protection and for proving ownership of media [15, 16]. Watermarking that makes the embedded data sensitive to any operation is known as fragile watermarking. Fragile watermarking schemes are used to identify small changes made to the host data, whereas semi-fragile schemes are used specifically to verify the content of the host media; they tolerate changes caused by unintentional attacks and focus on the detection of intentional attacks. Watermarking schemes are further divided into visible and invisible according to visual perceptibility.

Fig. 1 a Blind watermark extraction, b non-blind watermark extraction


In this paper, a blind video authentication scheme is presented by watermarking different scenes of video with different parts of scrambled grayscale watermark image. Daubechies wavelet is taken to decompose video frames into four resolution bands. The preprocessed watermark is embedded in decomposed mid- and high-frequency wavelet coefficients to make it perceptibly invisible to all viewers. The detailed steps involved in watermark embedding and extraction process are discussed clearly in the following sections.

2 Embedding Process

This section follows two primitive preprocessing operations before embedding the watermark into host multimedia.

2.1 Preprocessing of Watermark

A gray image of size m × n is chosen as the watermark, in which each pixel is represented with 8 bits. The gray watermark image is first converted into binary images using a bit plane decomposition scheme. This decomposition provides eight bit planes, from the MSB plane to the LSB plane, as shown in Fig. 2. These planes are placed side by side to get an enlarged binary watermark image of size 4m × 2n. Furthermore, the security of the watermark image is improved by using the Arnold transform to scramble the watermark [17]. The Arnold transform is an iterative process that shifts the positions of pixels in an image. The matrix-based Arnold transform is invertible. If the Arnold transform is applied to a square image of size N × N, it is described by the following Eq. (1):
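The bit plane decomposition can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: the image is represented as a list of rows of integers in 0..255, and the helper names `bit_planes` and `from_bit_planes` are hypothetical. Tiling the eight planes into a single 4m × 2n image is omitted.

```python
def bit_planes(image):
    """Split an 8-bit gray image into eight binary planes, MSB plane first.

    image: list of rows, each a list of ints in the range 0..255.
    """
    return [
        [[(pixel >> bit) & 1 for pixel in row] for row in image]
        for bit in range(7, -1, -1)
    ]


def from_bit_planes(planes):
    """Rebuild the gray image from eight binary planes (MSB plane first)."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [
        [sum(planes[p][r][c] << (7 - p) for p in range(8)) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2 x 2 watermark.
watermark = [[200, 3], [0, 255]]
planes = bit_planes(watermark)
restored = from_bit_planes(planes)
```

Because the decomposition is lossless, recombining the extracted planes in the same MSB-to-LSB order recovers the original gray watermark exactly.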

Fig. 2 Bit plane decomposition of one pixel value




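$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \bmod N \tag{1}$$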

where (x, y) are the coordinates of an image pixel with x, y ∈ {0, 1, …, N − 1}, N is the image width or height, and (x′, y′) are the new coordinates of that pixel after the Arnold transform. Applying the transform for a specified number of iterations scrambles the pixel coordinates and converts the watermark image into unreadable noise, so it is highly difficult to identify the watermark directly from its scrambled form. In the proposed method, 10 iterations are used.
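The watermark preprocessing of this subsection can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the eight bit planes are assumed to be tiled on a grid of 4 rows by 2 columns (one layout that gives the stated 4m × 2n size), and the first index of Eq. (1) is taken as the row index.

```python
def bit_planes(gray):
    """Split an 8-bit grayscale image (list of rows) into 8 binary planes, MSB first."""
    return [[[(pixel >> b) & 1 for pixel in row] for row in gray]
            for b in range(7, -1, -1)]


def tile_planes(planes, grid_rows=4, grid_cols=2):
    """Place the planes side by side on a grid_rows x grid_cols grid (assumed layout)."""
    tiled = []
    for r in range(grid_rows):
        block = planes[r * grid_cols:(r + 1) * grid_cols]
        for i in range(len(block[0])):
            tiled.append([bit for plane in block for bit in plane[i]])
    return tiled


def arnold(image, iterations=10):
    """Scramble a square image with Eq. (1): (x, y) -> (x + y, x + 2y) mod N."""
    n = len(image)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + y) % n][(x + 2 * y) % n] = image[x][y]
        image = out
    return image


def arnold_inverse(image, iterations=10):
    """Undo arnold() with the inverse matrix [[2, -1], [-1, 1]] mod N."""
    n = len(image)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(2 * x - y) % n][(y - x) % n] = image[x][y]
        image = out
    return image


watermark = [[0, 255, 128, 1], [2, 3, 4, 5]]   # toy 2 x 4 gray image (m = 2, n = 4)
binary = tile_planes(bit_planes(watermark))     # 8 x 8 binary image (4m x 2n)
scrambled = arnold(binary)                      # 10 iterations, as in the text
restored = arnold_inverse(scrambled)            # equals `binary` again
```

Because the matrix in Eq. (1) has determinant 1, the map is a permutation of pixel positions, which is why the inverse recovers the binary image exactly.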

2.2 Preprocessing of Video

Preprocessing of the video is done at two levels: first, scene change detection in the video, and second, decomposition of the video frames using the discrete wavelet transform.

Detection of Scene change: Inserting the same watermark image in every video frame makes it difficult to maintain invisibility, and embedding independent watermarks in successive frames cannot withstand statistical averaging. A novel video watermarking operation is therefore introduced that uses scene change identification based on histogram analysis. Scene change analysis divides the video into different scenes depending on a threshold: a scene change is detected if the histogram difference between two consecutive frames is more than the threshold. Each scene, with its group of frames, is considered a motionless scene. All frames of a motionless scene are embedded with the same watermark part, while different, independent watermark parts are embedded in different scenes of the video.

Decomposition of Video frames: The input video is partitioned into frames. Each frame is separated into three planes, R, G, and B. Each plane is transformed into the coefficient domain using the discrete wavelet transform, which decomposes the plane into four sub-bands: LL, LH, HL, and HH. Watermark embedding can be done in the LH, HL, and HH sub-bands but not in the LL band, because the human eye is highly sensitive to variations in low-frequency coefficients. As the LL band contains the maximum energy of the image, a small modification of these coefficients causes noticeable degradation in image quality.
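The two preprocessing levels described above can be illustrated as follows. This is a sketch under assumptions rather than the method's actual code: the histogram difference is taken as the sum of absolute bin differences, the threshold value is hypothetical, and the one-level decomposition uses the Haar wavelet (the simplest Daubechies wavelet, db1) because the text does not state which Daubechies order is used. The LH/HL labels follow one common convention and may be swapped in other texts.

```python
def histogram(frame, bins=256):
    """Count how often each gray level occurs in a frame (list of rows)."""
    counts = [0] * bins
    for row in frame:
        for pixel in row:
            counts[pixel] += 1
    return counts


def scene_starts(frames, threshold):
    """Return the indices of frames that start a new scene (frame 0 always does)."""
    starts = [0]
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        # Assumed difference measure: sum of absolute bin differences.
        difference = sum(abs(a - b) for a, b in zip(previous, current))
        if difference > threshold:
            starts.append(index)
        previous = current
    return starts


def haar_dwt2(plane):
    """One-level 2-D Haar DWT of a plane with even height and width."""
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(plane), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for j in range(0, len(plane[0]), 2):
            a, b = plane[i][j], plane[i][j + 1]
            c, d = plane[i + 1][j], plane[i + 1][j + 1]
            row_ll.append((a + b + c + d) / 2)  # local average (low frequency)
            row_lh.append((a + b - c - d) / 2)  # difference between the two rows
            row_hl.append((a - b + c - d) / 2)  # difference between the two columns
            row_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


dark = [[10, 10], [10, 10]]
bright = [[200, 200], [200, 200]]
starts = scene_starts([dark, dark, bright, bright], threshold=4)
ll, lh, hl, hh = haar_dwt2(bright)
```

In this toy video the only histogram jump lies between the second and third frames, so two scenes are found. A flat plane such as `bright` puts all of its energy in LL, which matches the remark that LL carries most of the image energy.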

2.3 Watermark Embedding Steps

The following sequence of steps takes place while embedding the watermark in the proposed method:

1. Host video of N frames is divided into P number of scenes
2. Select the scene (group of frames)
3. Each frame is isolated into R, G, B planes
4. Any of three planes is selected for watermark insertion
5. Middle and high-frequency coefficients are obtained by applying DWT
6. Select the watermark image and convert it into binary image using bit plane decomposition
7. Arnold transform is used to get scrambled watermark
8. Scrambled watermark is divided into P number of parts
9. Individual parts P of watermark are embedded into successive P scenes of video with the following condition:
   i. If watermark bit is 1, exchange the maximum value among the array of 5 frequency coefficients with first value in that array
   ii. If watermark bit is 0, exchange the minimum value among the array of 5 frequency coefficients with first value in that array
   These processing steps continue till the last watermark bit is processed.
10. Inverse DWT is applied on watermarked coefficients and unmarked low-frequency coefficients
11. Watermarked plane is united with unmarked planes to form watermarked frame
12. Steps 3–9 are executed for next scene and process continues until the last frame
13. Watermarked video is obtained by combining all watermarked frames.
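The exchange rule of step 9 can be sketched as below. The function names are hypothetical, and the way coefficients are grouped into arrays of 5 is an assumption (consecutive runs of a flattened sub-band), since the steps above do not specify it.

```python
def embed_bit(coeffs, bit):
    """Embed one watermark bit into an array of 5 coefficients by one exchange."""
    out = list(coeffs)
    # Bit 1: move the maximum to the first position; bit 0: move the minimum there.
    k = out.index(max(out)) if bit == 1 else out.index(min(out))
    out[0], out[k] = out[k], out[0]
    return out


def embed_bits(coeffs, bits, group=5):
    """Embed a bit sequence into consecutive groups of `group` coefficients."""
    if len(coeffs) < group * len(bits):
        raise ValueError("not enough coefficients for the watermark bits")
    out = list(coeffs)
    for n, bit in enumerate(bits):
        start = n * group
        out[start:start + group] = embed_bit(out[start:start + group], bit)
    return out


marked = embed_bits([3, 9, 1, 5, 7, 4, 2, 8, 6, 0], [1, 0])
```

Only the order of the coefficients inside each group changes, so the set of values in the sub-band stays the same. This is consistent with the stated aim of keeping the watermark perceptually invisible.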

3 Watermark Extraction Process

Implementation of the watermark extraction process is given in the following steps.

1. Watermarked video is divided into scenes
2. Select the scene (group of frames)
3. Watermarked frame is isolated into R, G, B planes
4. DWT is applied on marked plane to get watermarked coefficients
5. Watermark bits are detected by using the following condition
   i. If marked coefficient is greater than the median of an array of 5 coefficients, then watermark bit is treated as 1
   ii. If marked coefficient is less than the median of an array of 5 coefficients, then watermark bit is treated as 0
   As the watermark part is detected from each frame of the same scene, multiple copies of the same watermark part are obtained from all frames. By averaging them, the embedded watermark part is recovered.
6. Steps 3–5 are repeated for next scene and process continues until the last frame
7. The parts of watermark from each scene are concatenated to form scrambled watermark
8. Inverse Arnold transform is performed to get bit plane image
9. All bit planes are combined to recover the hidden watermark.
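The detection rule of step 5 and the averaging over frames can be sketched as follows. Two details are assumptions because the steps do not fix them: a marked coefficient equal to the median is read as 0, and averaging is implemented as rounding the per-position mean at 0.5 (a majority vote). All names are hypothetical.

```python
from statistics import median


def extract_bit(coeffs):
    """Read one bit from an array of 5 coefficients; the first one is the marked one."""
    return 1 if coeffs[0] > median(coeffs) else 0


def extract_bits(coeffs, count, group=5):
    """Read `count` bits from consecutive groups of `group` coefficients."""
    return [extract_bit(coeffs[n * group:(n + 1) * group]) for n in range(count)]


def average_copies(copies):
    """Combine the bit sequences recovered from all frames of one scene."""
    frames = len(copies)
    return [1 if sum(column) / frames >= 0.5 else 0 for column in zip(*copies)]


bits = extract_bits([9, 3, 1, 5, 7, 0, 2, 8, 6, 4], count=2)
part = average_copies([[1, 0, 1], [1, 1, 0], [1, 0, 0]])
```

No original frame or original coefficient is needed to read the bits, which is what makes the scheme blind.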

4 Results on Simulation

The efficiency of the proposed scheme is confirmed by simulation on a short video of 121 frames, each of size 240 × 320. A grayscale moon image of size 30 × 60 is taken as the watermark. The bit plane decomposed image is enlarged to a size of 120 × 120 by placing the bit planes side by side, as shown in Fig. 3b, and the scrambled watermark after the Arnold transform is shown in Fig. 3c. In this method, the video is broken into four scenes, so the scrambled watermark is also broken into four parts, each of size 30 × 120. An identical part is embedded in all frames of each scene. The watermark retrieved from the marked video is shown in Fig. 3d. The watermarked video is subjected to some common attacks to prove the strength of the algorithm. The correlation factor (CF) and PSNR (peak signal-to-noise ratio) are computed between the original and retrieved watermarks to assess the resemblance and perceptual quality under attacks, as shown in Table 1. In the frame averaging case, watermarked frames 1–5, 31–35, 62–66, and 96–100 are averaged separately, which reduces the video to 105 frames. The retrieved watermark after frame averaging is shown in Fig. 3e. After dropping frames 1–4, 31–34, 62–65, and 96–99, the video is reduced to 105 frames, and the corresponding detected watermark is shown in Fig. 3f. Frame swapping is done randomly between frames, so the values of CF and PSNR change with each execution. The identified watermark is shown in Fig. 3g. Watermarked

Fig. 3 a Watermark, b bit plane decomposed image, c scrambled watermark, d detected watermark; extracted watermarks (e) from frame averaged video (f) from frame dropped video (g) from frame swapped video (h) from Salt and Pepper noise attacked video


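Table 1 Performance metrics of a scheme for three planes

| Attacks | Metrics | LH and HL embedding, R | LH and HL embedding, G | LH and HL embedding, B | LH, HL and HH embedding, R | LH, HL and HH embedding, G | LH, HL and HH embedding, B |
|---|---|---|---|---|---|---|---|
| Frame averaging | CF | 0.9571 | 0.9470 | 0.9611 | 0.9623 | 0.9436 | 0.9474 |
| Frame averaging | PSNR | 41.0245 | 37.4489 | 41.0707 | 40.9504 | 38.2479 | 37.4991 |
| Frame loss | CF | 0.9563 | 0.9475 | 0.9615 | 0.9625 | 0.9443 | 0.9496 |
| Frame loss | PSNR | 41.2759 | 37.7243 | 41.0831 | 40.7210 | 38.4623 | 37.5964 |
| Frame swapping | CF | 0.5377 | 0.8044 | 0.5332 | 0.8384 | 0.6616 | 0.6704 |
| Frame swapping | PSNR | 31.1181 | 35.0980 | 31.2506 | 34.8670 | 31.9374 | 31.8923 |
| Salt and Pepper noise | CF | 0.8291 | 0.8210 | 0.8506 | 0.8326 | 0.8332 | 0.8279 |
| Salt and Pepper noise | PSNR | 30.4411 | 30.0487 | 30.3321 | 30.4198 | 30.2137 | 30.2659 |
| Without attack | CF | 0.9600 | 0.9507 | 0.9651 | 0.9640 | 0.9426 | 0.9477 |
| Without attack | PSNR | 40.4841 | 37.1708 | 40.3715 | 39.9991 | 37.5457 | 36.8834 |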

video is subjected to salt and pepper noise at a density of 0.02, and the detected watermark is shown in Fig. 3h. The present scheme is analyzed by embedding the watermark in each of the three planes, using the mid-frequency coefficients alone as well as the mid- and high-frequency coefficients together.
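The two metrics reported in Table 1 can be computed as in the sketch below. The PSNR follows the standard definition for 8-bit images. The text does not define the correlation factor, so the normalized cross-correlation used here is an assumption, and the function names are hypothetical.

```python
import math


def _pairs(original, retrieved):
    """Yield corresponding pixel pairs of two equally sized images."""
    return [(a, b) for row_a, row_b in zip(original, retrieved)
            for a, b in zip(row_a, row_b)]


def psnr(original, retrieved, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    pairs = _pairs(original, retrieved)
    mse = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def correlation_factor(original, retrieved):
    """Normalized cross-correlation between two images (assumed CF definition)."""
    pairs = _pairs(original, retrieved)
    numerator = sum(a * b for a, b in pairs)
    denominator = math.sqrt(sum(a * a for a, _ in pairs) *
                            sum(b * b for _, b in pairs))
    return numerator / denominator if denominator else 0.0


reference = [[10, 200], [30, 40]]
noisy = [[12, 198], [30, 41]]
quality = psnr(reference, noisy)
similarity = correlation_factor(reference, noisy)
```

Under this definition a CF close to 1 means the retrieved watermark closely resembles the original, which is how the values in Table 1 are read.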

5 Conclusion

In this paper, a novel blind video watermarking scheme based on the DWT and the Arnold transform is presented. Watermark parts are embedded into different scenes of the video. The watermark is preprocessed using bit plane decomposition and the Arnold transform. The processed watermark is embedded in the mid- and high-frequency DWT coefficients of the frames. The proposed scheme is blind, as the watermark is retrieved without using the original video, which is important for video watermarking. Simulation results demonstrate the strength of the proposed method against frame averaging, frame loss, frame swapping, and salt and pepper noise attacks.

References

1. Mondal, B., Sinha, N., Mandal, T.: A secure image encryption algorithm using LFSR and RC4 key stream generator. In: Proceedings of 3rd International Conference on Advanced Computing, Networking and Informatics, SIST, vol. 43, pp. 227–237. Springer, India (2016)
2. Zhou, R.-G., Sun, Y.-J., Fan, P.: Quantum image gray-code and bitplane scrambling. Quantum Inf. Process. 14(5), 1717–1734 (2015)
3. Li, Y., Wang, C., Chen, H.: A hyper-chaos-based image encryption algorithm using pixel-level permutation and bit-level permutation. Opt. Lasers Eng. 90, 238–246 (2017)
4. Abbas, N.A.M.: Image encryption based on independent component analysis and Arnolds cat map. Egypt. Inf. J. 17(1), 139–146 (2016)
5. Zou, J., Ward, R.K., Qi, D.: A new digital image scrambling method based on Fibonacci numbers. In: Proceedings of the 2004 International Symposium on Circuits and Systems, ISCAS’04, vol. 3, pp. III-965 (2004)
6. Inoue, H., Miyazaki, A., Araki, T., Kastura, T.: A digital watermark method using the wavelet transform for video data. IEICE Trans. Fundam. E38-A(1) (2000)
7. Swanson, M.D., Zhu, B., Tewfik, A.H.: Multiresolution scene-based video watermarking using perceptual models. IEEE J. Select. Areas Commun. 16(4) (1998)
8. Parah, S.A., Sheikh, J.A., Assad, U.I., Bhat, G.M.: Realisation and robustness evaluation of a blind spatial domain watermarking technique. Int. J. Electron. (2016)
9. Zhao, X., Ho, A.T.S.: An introduction to robust transform based image watermarking techniques. In: Intelligent Multimedia Analysis for Security Applications. Studies in Computational Intelligence, vol. 282. Springer, Berlin, Heidelberg (2010)
10. Pradhan, C., Saha, B.J., Kabi, K.K., Bisoi, A.K.: Blind watermarking techniques using DCT and Arnold 2D cat map for color images. In: International Conference on Communication and Signal Processing, pp. 027–030 (2014)
11. Priya, P., Tanvi, G., Nikita, P., Ankita, T.: Digital video watermarking using modified LSB and DCT technique. Int. J. Res. Eng. Technol. (IJERT) 3(4), 630–634 (2014)
12. Essaouabi, A., Regragui, F., Ibnelhaj, E.: A blind wavelet-based digital watermarking for video. Int. J. Video Image Process. Netw. Secur. (IJVIPNS) 9(9), 37–41 (2009)
13. Hussein, J., Mohammed, A.: Robust video watermarking using multi band wavelet transform. Int. J. Comput. Sci. Issues (IJCSI) 6(1), 44–59 (2009)
14. Lin, W.-H., Wang, Y.-R., Horng, S.-J., Kao, T.-W., Pan, Y.: A blind watermarking method using maximum wavelet coefficient quantization. Expert Syst. Appl. (Elsevier) 36, 11509–11516 (2009)
15. Qian, H., Tian, L., Li, C.: Robust blind image watermarking algorithm based on singular value quantization. In: ICIMCS’16. ACM (2016)
16. Chang, C.-S., Shen, J.-J.: Features classification forest: a novel development that is adaptable to robust blind watermarking techniques. IEEE Trans. Image Process. IEEE Signal Process. Soc. (2017)
17. Gaobo, Y., Xingming, S., Xiaojing, W.: A genetic algorithm based video watermarking in the DWT domain. In: International Conference on Computational Intelligence and Security. IEEE (2006)

Fractals: A Novel Method in the Miniaturization of a Patch Antenna with Bandwidth Improvement

Geeta Kalkhambkar, Rajashri Khanai and Pradeep Chindhi

Abstract The latest technology demands miniaturized antennas with wider bandwidth to allow higher data rates. A fractal antenna with reduced size and enhanced bandwidth is presented in this paper. A line-fed hexagonal fractal antenna and an inset-fed Minkowski fractal antenna operating at 2.4 GHz for Wireless LAN (WLAN) applications are presented, with an improvement in bandwidth from 14.58% to 112.5% in the case of the hexagonal fractal antenna. An issue with fractal antennas is choosing the feed location; this paper also illustrates how it can be solved using complementary structures.

Keywords Bandwidth (B.W) fractal ⋅ Fractal ⋅ Feed location ⋅ Complementary

G. Kalkhambkar (✉) ⋅ P. Chindhi
S.G.M.C.O.E Kolhapur, Kolhapur, India
e-mail: [email protected]

P. Chindhi
e-mail: [email protected]

R. Khanai
KLES, Dr. M.S.S.C.E.T, Belgaum, India
e-mail: [email protected]

G. Kalkhambkar ⋅ P. Chindhi
Shivaji University, Kolhapur, India

R. Khanai
V.T.U, Belgaum, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_63

1 Introduction

The antenna is the most important component in all areas of communication. Reducing antenna size while retaining optimal performance has become a fundamental issue in the latest emerging technologies. Although highly efficient antennas already exist, patch antennas are gaining tremendous importance. Because of their tiny size, planar geometry, conformability, and ease of installation, many researchers are attracted to research on the patch antenna. Patch antennas are limited by low gain, limited bandwidth, and low directionality, and overcoming these limitations has become an attractive area of research. Improving gain without compromising features like bandwidth and directivity is of major concern. Nowadays, ultra-wideband (UWB) antennas are gaining importance in low-cost, high-data-rate transmission applications. With changing technology, devices are becoming smart and compact; therefore, miniaturization of the antenna is also a major concern. A fractal antenna is one such candidate, providing improvement in bandwidth in an iterative manner. Fractal patch antennas in particular are very easy to fabricate and are popular because of their miniaturized size, bandwidth-enhancing capability, and multiband operation. In [1], the final iteration of triangular fractals is tested with different substrate thicknesses; the return loss improved with increasing substrate thickness. In [2], unique fractal shapes are inserted on a hexagonal patch, which provided ultra-wideband performance. In [3], a multiband fractal antenna with plus-shaped fractals is presented. A fractal antenna with a Frequency Selective Surface (FSS) gives better gain compared to normal surfaces [4]. Millimeter-wave antennas for 5G communication are gaining importance. In [5], a fractal antenna with dual-band and dual-polarized operation is presented; the antenna exhibits good gain and isolation. A Minkowski fractal antenna with a square-shaped initiator operating in the X and Ku bands is designed in [6]; the gain is reduced at each iteration. In [7], only a single iteration of the Minkowski fractal curve is carried out, stating that the other performance parameters of the antenna were reduced in further iterations.
In [8], three types of fractal antennas, namely rectangular carpet, diamond carpet, and Apollonian fractals, are discussed, giving multiband operation. Locating the feed point in a probe-fed fractal antenna is a challenging task; characteristic mode theory helps in determining the precise location of the feed [9]. A multiband meandered fractal antenna working in six different bands is illustrated in [10], and the gain with an inset line feed is better compared to a simple line feed. A snowflake-shaped fractal structure is optimized using PSO (Particle Swarm Optimization) and NMS (Nelder-Mead Simplex) techniques [11]. Fractal-shaped metamaterial absorbers are illustrated in [12], which facilitate multipath reduction with miniaturized absorbers. T-square fractal geometry gives an ultra-broadband characteristic [13] with an increase in gain. In [14], the authors state that the bandwidth of a fractal antenna decreases with further iterations, but in our paper we have seen an increase in bandwidth with increasing iterations. A Minkowski antenna that exhibits better gain as well as bandwidth in the 2.4 GHz band is presented. In the papers referred to so far, some authors state that fractal antennas help in improving bandwidth, but from our study we found that fractal antennas improve bandwidth only up to a certain number of iterations; beyond that limit, the gain and directivity are reduced. This paper also explains how the problem of locating the feed in a complex fractal geometry can be solved using its complementary structure.

2 Fractal Antennas: A Novel Method to Increase the Bandwidth

Fractals are self-similar, repeating structures. Utilizing fractal geometries in a patch antenna improves the bandwidth. As per the basic radiation mechanism of an antenna, radiation emerges from discontinuities in the radiating structure; fractal antennas utilize the same concept. The fractal antennas given in the next sections are our original work.

2.1 Hexagonal Patch Antenna with Triangular Fractals

The fractal antennas iteratively increase the bandwidth. A simple hexagonal patch antenna is first constructed in ZELAND IE3D software. The fractal antennas shown in Table 2 are designed for WLAN at 2.4 GHz by taking a square patch with the dimensions and specifications shown in Table 1, with a scaling factor K = 1.77 for the next fractal iterations. A ring of triangular fractals is inserted at each successive iteration, and a noticeable improvement in bandwidth is observed at each higher step. Table 2 shows the results at each stage with the subsequent increase in bandwidth. At iteration 1, the bandwidth is very low, as shown in Table 2; as the number of fractals increases, the bandwidth improves. At stage 4, the −10 dB bandwidth ranges from 2.2 to 4.9 GHz, i.e., 112.5%. Figure 1 shows the bandwidth comparison of all the iterations.

Table 1 Antenna design parameters

| Parameters | fr | εr | h | Lg | Wg | L | W | K |
|---|---|---|---|---|---|---|---|---|
| Values | 2.4 GHz | 2.4 | 1.56 mm | 40 mm | 50 mm | 28 mm | 38 mm | 1.77 |
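The 112.5% figure quoted for the 2.2 to 4.9 GHz band can be checked with a short calculation. It is reproduced when the width of the −10 dB band is divided by the 2.4 GHz design frequency fr of Table 1; this normalization is inferred from the numbers and is not stated in the text. The sketch below also shows how band edges could be read from sampled S11 data. The function names and the sample S11 values are hypothetical, and a single contiguous band is assumed.

```python
def band_edges(freqs_ghz, s11_db, level_db=-10.0):
    """Return the lowest and highest sampled frequency with S11 <= level_db."""
    inside = [f for f, s in zip(freqs_ghz, s11_db) if s <= level_db]
    return (min(inside), max(inside)) if inside else None


def percent_bandwidth(f_low, f_high, f_ref):
    """Bandwidth as a percentage of a reference frequency."""
    return (f_high - f_low) / f_ref * 100.0


# Made-up S11 samples, used only to exercise band_edges().
freqs = [2.0, 2.2, 3.0, 4.9, 5.2]
s11 = [-5.0, -10.0, -20.0, -10.0, -4.0]
f_low, f_high = band_edges(freqs, s11)

relative_to_design = percent_bandwidth(f_low, f_high, 2.4)
relative_to_centre = percent_bandwidth(f_low, f_high, (f_low + f_high) / 2)
```

Normalizing the same band by its centre frequency instead gives a noticeably smaller percentage, so the reference frequency should be kept in mind when comparing these values with other work.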

Table 2 Iterations of hexagonal patch antenna with triangular fractals

| Output parameters | Initiator | Iteration 1 | Iteration 2 | Final iteration |
|---|---|---|---|---|
| Geometry | (image) | (image) | (image) | (image) |
| −10 dB Bandwidth | 14.58% | 50% | 50% | 112.5% |


Fig. 1 Bandwidth comparison of hexagonal fractal antenna

Comment on Other Performance Parameters: Gain and Directivity of Hexagonal Patch Antenna with Triangular Fractals. The hexagonal fractal antenna with triangular fractals shown in Table 2 has an advantage in terms of bandwidth enhancement, but the fractals adversely affect the gain. With each iteration, part of the metal layer is removed when the fractals are inserted; as a result, the radiation from the metal patch layer is reduced, and so are the gain and directivity. Figure 1 shows the improvement in bandwidth.

2.2 Modified Minkowski Island Fractal Antenna

A novel modified complementary Minkowski fractal antenna is presented in Table 4. A square fractal is taken as the basic geometry to construct the initiator. As the number of fractals increases, the bandwidth increases, and at the same time the antenna size is miniaturized. Table 4 shows the initiator and subsequent fractal iterations with a scaling factor of K = 1.83. A scaled version of the initiator is used in the successive iterations. The bandwidth of the initiator itself is almost doubled, with multi-frequency operation, but in iteration 3 the bandwidth decreases due to the unpredictable location of the feed point and less scope for optimization in the geometry. To address the difficulty of locating the feed in iteration 3, we have modified the Minkowski fractal antenna to its complementary mirror image, as shown in iteration 4 in Table 4. The original modified Minkowski island fractal antenna and the generation of its complementary structure are shown in Fig. 2. Iterations 4 and 5 are more effective compared to the first 3 iterations. Further improvement can be done by


Fig. 2 Steps for complementary Minkowski Island fractal antenna

Table 3 Antenna design parameters

| Parameters | fr | εr | h | Lg | Wg | L | W | K |
|---|---|---|---|---|---|---|---|---|
| Values | 2.4 GHz | 2.4 | 1.56 mm | 40 mm | 40 mm | 30 mm | 30 mm | 1.83 |

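Table 4 Iterations of Minkowski patch antenna with square fractals

| Output parameters | Basic patch | Initiator | Iteration 3 | Iteration 4 | Iteration 5 |
|---|---|---|---|---|---|
| Geometry | (image) | (image) | (image) | (image) | (image) |
| Bandwidth | 10.28% | 20% | 15.27% | 28.63% | 32.44% |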

inserting more fractal slots in iteration 5. In Table 3, instead of multiple replicas of the initiator as shown in iterations 1–3, we can insert complementary slots in the shape of the fractals, designed as shown in Fig. 2. This gives a larger increment in the bandwidth, as illustrated in iterations 4 and 5 in Table 4. The difference between the fractal geometries of Tables 2 and 4 is that, in Table 2, we have created the fractals from the conducting metal plane as shown in Table 1, whereas in Table 4, subtracting iteration 3 from the initiator gives iteration 4, as shown in Fig. 2, which is the complementary patch of iteration 3. The main purpose of iteration 4 is to reduce the uncertainty of the feed point location in iteration 3. In iterations 4 and 5, we have sufficient space to choose the feed point location. The bandwidth is noticeably enhanced with a good choice of optimization variables. Iteration 5 is the optimized version of iteration 4 with more fractals and shows a further improvement in bandwidth, as shown in Fig. 3.

Discussion on Other Performance Parameters: Effect of Modified Minkowski Island Fractal Antenna on its Gain and Directivity. The gain and directivity are affected by the increase in the number of fractals. As with the traditional Minkowski island fractal antenna in iterations 2 and 3, the modified Minkowski island fractal antenna of iterations 4 and 5 also affects gain and directivity, which can be improved using the different optimization schemes available in the software. This would otherwise be impossible with first 3


Fig. 3 Bandwidth comparison of Minkowski island fractal antenna

iterations due to insufficient area for optimization. In iteration 4, the gain is low, at 3.2 dB, and the directivity is below 1 dB; both are improved using the Powell optimization scheme available in ZELAND IE3D software. The optimization is performed in iteration 5. From Table 4, the gain improves from 3.2 to 3.9 dB, and there is a noticeable enhancement in the directivity, from less than 1 dB to 5.89 dB, from iteration 4 to iteration 5. The geometry and outputs of iterations 4 and 5 are shown in Table 4.

Iteration 5: In iteration 5, we have increased the number of fractals. We found a further improvement in bandwidth, and the problem of the reduction in gain and directivity is overcome in iteration 5, as seen in Table 4. By optimizing the length of the patch antenna and the depth of the inset feed point, the gain, directivity, and bandwidth can all be improved. The final geometry and its associated results are shown in Table 5.

Limits on Fractal Iterations: Subsequent iterations show degradation in the antenna performance. The gain of the antenna decreases in further iterations: due to the removal of metal area in the next iteration, the radiation also reduces. This shows that there is a limit on the number of fractal iterations beyond which no improvement can be found, and the fabrication process also becomes more complex.

Table 5 Improvements from iteration 4 to iteration 5

| Geometry | Bandwidth (%) | Gain (dB) | Directivity (dB) |
|---|---|---|---|
| Iteration 4 (image) | 28.63 | 3.2 | <1 |
| Iteration 5 (image) | 32.44 | 3.9 | 5.89 |

Fig. 4 Fabricated antennas: (i) first iteration, (ii) last iteration

3 Antenna Fabrication and Testing

A modified Minkowski island fractal patch antenna is fabricated manually using a radium mask cut to precise dimensions with a radium cutting machine. An FR4 glass epoxy substrate with a dielectric constant of 2.4 and a thickness of 1.56 mm is used. The mask is placed on a substrate of the proper dimensions, which is then kept in a ferric chloride solution for etching. The fabricated antennas are shown in Fig. 4a, b. The antenna was designed for 2.4 GHz, but due to fabrication imperfections the band is shifted from 2.4 to 3 GHz. Figure 5 shows the simulated and tested S-parameter plots. Further improvement can be achieved by utilizing impedance matching techniques.


Fig. 5 a Testing result: S11 plot at 3 GHz, b simulated S11 plot at 2.4 GHz

4 Conclusion

The fractal geometries that we have designed show an improvement in bandwidth. With the modified Minkowski fractal antenna, a new method for designing fractal antennas is introduced, which can be taken as a solution to the problem of feed point selection in fractal geometries while also improving the bandwidth.

References

1. Bai, X., Zhang, J., Xu, L.: A broadband CPW fractal antenna for RF energy harvesting. IEEE (2018)
2. Chaouche, Y.B., Bouttout, F., et al.: Compact CPW-Fed hexagonal antenna with a new fractal shaped slot for UWB communications. 978-1-5090-4372-9/17/$31.00 ©2017. IEEE (2017)
3. Jibhkate, N.S., Zade, P.L.: A compact multiband plus shape CPW fed fractal antenna for wireless application. 978-1-5090-4556-3/16/$31.00 ©2016. IEEE (2016)
4. Kavya, K., Dwivedi, R.P.: Study on CPW antenna using fractal geometry for WiMax application. 978-1-5090-5913-3/17/$31.00 ©2017. IEEE (2017)
5. Haider, S.S., Wali, M.R., Tahir, F.A., Khan, M.U.: A fractal dual-band polarization diversity antenna for 5G applications. 978-1-5386-3284-0/17/$31.00 ©2017. IEEE (2017)
6. Dalmiya, A., Sharma, O.P.: A novel design of multiband Minkowski fractal patch antenna with square patch element for X and Ku band applications. 978-1-5090-2807-8/16/$31.00 ©2016. IEEE (2016)
7. Chuma, E.L., Iano, Y., Bravo Roger, L.L.: Compact antenna based on fractal for IoT Sub-GHz wireless communications. 978-1-5090-6241-6/17/$31.00 ©2017. IEEE (2017)
8. Dhana Raj, V., Prasad, A.M., et al.: Comparison of radiation attributes of printed microstrip coplanar waveguide fed fractal antennas. 978-1-5090-5350-6/15/$31.00 ©2018. IEEE (2018)
9. Oraizi, H., Poordaraee, M.: Design of circularly polarized Giuseppe Peano fractal patch antenna by the theory of characteristic mode. 978-1-5090-5963-8/17/$31.00 ©2017. IEEE (2017)
10. Sharma, N., Singh, J., Sharma, V.: Design of hexagonal meander fractal antenna for multiband applications. 978-1-5090-3411-6/16$31.00 ©2017. IEEE (2017). https://doi.org/10.1109/icmete.206.17
11. Khadhraoui, I., Salah, T.B., Aguili, T.: Dual-band pre-fractal antenna multi-objective optimization design. 978-1-5090-4372-9/17/$31.00 ©2017. IEEE (2017)
12. Venneri, F., Costanzo, S., Di Massa, G.: Fractal-shaped metamaterial absorbers for multi-reflections mitigation in the UHF-band. 1536-1225 ©2017. IEEE (2017)
13. Safia, O.A., Nedil, M.: Ultra-broadband V-band fractal T-square antenna. 978-1-5386-3284-0/17/$31.00 ©2017. IEEE (2017)
14. Luciani, G., Mamedes, D.F., et al.: H-shaped fractal antennas for dual-band applications. 978-1-5090-6241-6/17/$31.00 ©2017. IEEE (2017)

Adaptive Live Task Migration in Cloud Environment for Significant Disaster Prevention and Cost Reduction

Namra Bhadreshkumar Shah, Tirth Chetankumar Thakkar, Shrey Manish Raval and Harshal Trivedi

Abstract In IT terms, the “Cloud” is essentially a store of data. Cloud computing is one of the fastest emerging technologies in the IT industry of late; it means storing and managing data persistently over the cloud (the Internet) at a very low cost. Migration to and among cloud servers helps IT professionals protect their data, guard it against disasters, and provide their resources efficiently without delay or problems. Auto-scaling provides agility in managing virtual machines, whether increasing or decreasing their number. Successful disaster prevention necessarily depends on the migration of certain tasks from one virtual machine to another. Most data recovery approaches suffer from high recovery time and struggle to balance load and cut cost. In this work, we incorporate an adaptive live task migration technique to prevent as many disasters as possible and to significantly reduce cost, which is presented in the form of a graph in the performance evaluation section. The experimental outcome shows that the proposed algorithm outperforms other approaches by 15–25% in terms of reducing cost and balancing the load among available nodes. It also diminishes the prospect of disaster.





Keywords Cloud computing ⋅ Live migration ⋅ Adaptive task migration ⋅ Disaster prevention ⋅ Cost reduction ⋅ Auto-scaling





N. B. Shah (✉) ⋅ T. C. Thakkar
Department of Computer Engineering, Vishwakarma Government Engineering College, Nr. Visat Three Roads, Sabarmati-Koba Highway, Chandkheda, Ahmedabad 382424, Gujarat, India
e-mail: [email protected]

T. C. Thakkar
e-mail: [email protected]

S. M. Raval
Silver Oak College of Engineering & Technology, Opp. Bhagwat Vidyapith, Nr. Gota Cross Road, Ahmedabad 382481, India
e-mail: [email protected]

H. Trivedi
SoftVan, IIM Road, Ahmedabad 390015, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_64

1 Introduction

A Cloud is a data center, or multiple data centers, made up of compute and storage resources connected through a network [1]. The Cloud is an online platform that can be used for the basic services a normal machine provides, and it allows its users to utilize those services at a very feasible cost [2]. The Cloud is mainly useful to start-ups and developing business entities. Cloud computing is the delivery of varied services over the Internet, which here is considered “the Cloud”. These services comprise databases, servers, software, networking, storage, and many more [3], and they are delivered on pay-per-use terms. Moreover, maintaining servers and handling them single-handedly is becoming more complex nowadays. Therefore, it is advantageous to use server migration with cloud computing technologies for a better and more secure result. A Cloud network is a combination of multiple computers working together in such a way that they appear to a casual observer to be just one vast computer resource. Some of the many advantages of cloud computing are its flexible costs, abstracted IT management, improved security, and compliance. Cloud migration is defined as the transfer of data or applications from one server to another within a cloud environment, or from a physical server to a cloud server, without altering the data. In the first case, the migration is called cloud-to-cloud migration. Cloud migration provides flexible cloud computing. Data migration is mainly used in large or medium-sized businesses, as the loss of their data during server downtime may cost far more than they could imagine. Transferring their data to cloud servers not only proves reliable for them but also relieves these businesses of server management and security issues [4].
Also, recovery from their own server after unfortunate events may take more time than they can afford; it may therefore be easier for them to migrate their data to cloud servers (Fig. 1). According to a survey, the causes of server failures can be divided, in approximate percentages, as shown in Fig. 2. Some of the most important challenges in data migration are data security and data integrity during migration. Also, data may be vulnerable to attacks from intruders. Some of the causes that may lead to migration [5] are:

• A large number of requests pending from a single virtual machine, thus increasing the load.

Fig. 1 Cloud computing architecture [15]

Adaptive Live Task Migration in Cloud Environment … Fig. 2 Server failure causes [16]

(Bar chart titled “Issues in server failures in percentage”; categories: employee mistakes, external factors and operating system failures, failure of application.)

• Degradation of the server.
• Overload on the server because the number of requests pending in the queue exceeds the threshold limit.
• Poor data recovery strategies used by the current servers/physical machines.
• Higher efficiency of the new server, and hence better performance than the old one.
Auto-scaling is a term closely related to load balancing; it helps users keep their applications maintainable. Auto-scaling is essentially the dynamic allocation of computing resources based on the needs of the application [6]. It is an essential feature of cloud computing and can be used effectively in data migration strategies. However, it is difficult to monitor the resource statistics of every host and virtual machine, and this is where live task migration comes into the picture. In live task migration, a third-party application monitors all servers hosted by the cloud and performs live migration when needed. The third party determines the right time for migration, the node from which a task must be migrated, and the most suitable node to which the task(s) must be migrated.
In this paper, we survey several server migration strategies, which are described in the Related Work section, and compare them briefly in two tables. Shortcomings of these methods are highlighted in the Research Gap section. Subsequently, the Proposed Model section presents a model that overcomes those flaws by taking into consideration parameters such as CPU utilization, memory utilization, network usage, IOPS, and average response time. By working on those parameters, the proposed model minimizes the chances of disaster and thereby increases the Quality of Service (QoS).

2 Related Work
The virtual machine provides the technical support required for building a cloud environment. In an environment based on multiple virtual machines, live migration of virtual machines is very effective. Here, there


N. B. Shah et al.

are two noteworthy questions: first, which VM will be migrated, and second, to which VM the task/application will be migrated. Choosing a suitable target destination for the virtual machine is vital to the process. For this purpose, host selection algorithms are classified as follows.

2.1 Minimum Load Priority

As the name suggests, migration is performed to the virtual machine whose utilization is lowest among the available servers in the cloud environment at that specific time. This method is hence called the minimum load priority method [7]. Using this technique would not necessarily overload the nodes. Its limitation is that it considers only the load on a particular server; it does not take into account the dynamic differences in memory consumption among virtual machines or any other parameter.
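The selection rule described above can be sketched in a few lines. This is only an illustration, assuming that each candidate server reports a single utilization figure; the function name `pick_min_load_host` and the sample host names are hypothetical and do not come from [7].

```python
from typing import Dict


def pick_min_load_host(utilization: Dict[str, float]) -> str:
    """Return the name of the host whose current utilization is lowest.

    `utilization` maps a host name to its load (any consistent unit).
    Only the load is inspected, which mirrors the limitation noted above:
    memory consumption and other parameters are ignored.
    """
    if not utilization:
        raise ValueError("no candidate hosts available")
    return min(utilization, key=utilization.get)


# Hypothetical readings, for illustration only.
hosts = {"vm-a": 0.72, "vm-b": 0.35, "vm-c": 0.58}
print(pick_min_load_host(hosts))
```

Because the rule looks at one number per host, two hosts with equal load but very different memory pressure are indistinguishable to it.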

2.2 Load Balancing

This technique builds on the logic of load balancing. Its aim is to perform migration among the available nodes so as to bring the whole cloud environment into a balanced state efficiently [8]. A task is directed to a particular node in such a way that the load among the virtual machines is balanced as evenly as possible. The load balancing technique provides better performance than the minimum load priority method.

2.3 Post-copy Approach

In this approach, the source machine is stopped, the processor state is copied from the source, and the VM is then resumed on the target machine at its usual rate. The approach was designed by Michael et al. [9]. It reduces the total migration time and the number of resources to be transferred. One example is the bubbling algorithm for pre-paging (Table 1). The strategies used to choose a suitable virtual machine for relocation from the whole list of machines are as follows.

2.4 Random Migration (Random Choose)

This strategy is comparatively quick and efficient. Here, the virtual machines that are awaiting dynamic migration are selected from the list [10]. Hence,


Table 1 Comparison of methods
Method name | Advantages | Disadvantages | Consideration of memory blocks | Performance
Minimum load priority | Simple and very easy to implement; no overload of data | Does not consider varying differences in memory consumption among applications | No | Moderate
Load balancing | Better performance than the previous one | Security flaws in the destination server cannot be identified | Yes | Good
Post-copy approach | Reduction in total migration time and in the number of resources to be used | Target machine may be overloaded with all the data | Yes | Good

random strategy selects the destination virtual machine with greater equity. When applied, this strategy will eventually raise the number of virtual machine migrations in the cloud environment. Its disadvantage is also the randomness with which it operates, because we cannot be sure whether the target machine is suitable for the migration.

2.5 Dynamic Prediction Migration

This strategy is built entirely upon the dynamic memory consumption of virtual machines, using prediction techniques to forecast each virtual machine's memory consumption for the next scheduling period. The algorithm analyzes the historical data accumulated by the Monitor Server Module and, after processing it, produces a forecast for all virtual machines on the HotPM node [7].

2.6 Workload Prediction Method

This method predicts the processor workload and secures the migration using the MD5 algorithm. It calculates an exponentially weighted moving average (EWMA) of the load [11]. The server measures the load periodically and then calculates an attribute called “temperature”; a value greater than 0 indicates the need for migration. Other processes, such as authentication and authorization, are also carried out.
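The EWMA step mentioned above can be illustrated as follows. This is a minimal sketch: the smoothing weight `alpha` and, in particular, the definition of `temperature` as "predicted load minus a threshold" are assumptions made here for illustration, since the text only states that a temperature greater than 0 signals the need for migration; the exact formula used in [11] is not reproduced.

```python
from typing import Iterable, List


def ewma(samples: Iterable[float], alpha: float = 0.5) -> List[float]:
    """Exponentially weighted moving average of periodic load samples.

    s[0] = x[0];  s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for x in samples:
        if not smoothed:
            smoothed.append(float(x))
        else:
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def temperature(predicted_load: float, threshold: float) -> float:
    """Hypothetical 'temperature': positive when the predicted load exceeds the threshold."""
    return predicted_load - threshold


# Illustrative load samples (percent) and threshold; not taken from the paper.
history = ewma([10, 20, 30], alpha=0.5)
print(history, temperature(history[-1], threshold=20.0) > 0)
```

A larger `alpha` makes the prediction follow recent samples more closely, while a smaller one smooths out short spikes.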

2.7 Pre-copy Approach

This approach has two phases: a warm-up phase and a stop-and-copy phase [12]. The warm-up phase copies all the VM pages from the source machine, and pages modified in the meantime are recopied until the rate at which pages are modified falls below the rate at which they are copied. In the stop-and-copy phase, the VM is stopped and the remaining pages are copied from the source machine. Some of the pre-copy approach methods are:
Combined technologies: Here, a combination of recovery system technologies and CPU scheduling is used for faster migration. Log files generated by the source VM are executed by the target VM [13]. The advantages are a short downtime and a reasonable total migration time. However, when CPU- and memory-intensive VMs are migrated, downtime is extended, which may cause failure of the system. Migrating large VMs can be an issue; therefore, a combination of VM replication and VM scheduling can be used.
Improved pre-copy approach: These are improvements made over the pre-copy approach. One of them is the memory compression based VM migration approach (MECOM), defined by Hai et al. [14]. Other improvements over the generic pre-copy approach include the addition of a bitmap page and the delta compression live migration algorithm, to name a few.
Table 2 summarizes each method with respect to several parameters.
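Returning to the two pre-copy phases described at the start of this subsection, the warm-up loop can be illustrated with a toy simulation. It is a simplified model under stated assumptions, not the algorithm of [12]: pages are modified at a constant `dirty_rate`, copied at a constant `copy_rate` (both in pages per second), and the warm-up ends once the number of pages left is at most `stop_threshold`. All names and numbers are hypothetical.

```python
from typing import List, Tuple


def precopy_rounds(total_pages: int, dirty_rate: int, copy_rate: int,
                   stop_threshold: int, max_rounds: int = 30) -> Tuple[List[int], int]:
    """Simulate the warm-up phase of pre-copy migration.

    Returns (pages sent in each warm-up round, pages left for stop-and-copy).
    Integer arithmetic is used so the result is exact.
    """
    if copy_rate <= 0:
        raise ValueError("copy_rate must be positive")
    to_send = total_pages
    sent_per_round: List[int] = []
    for _ in range(max_rounds):
        if to_send <= stop_threshold:
            break  # few enough pages left: stop the VM and copy the rest
        sent_per_round.append(to_send)
        # Pages modified while this round was being copied must be sent again.
        to_send = min(total_pages, to_send * dirty_rate // copy_rate)
    return sent_per_round, to_send


# Illustrative values: 1000 pages, 100 pages/s modified, 1000 pages/s copied.
print(precopy_rounds(1000, dirty_rate=100, copy_rate=1000, stop_threshold=5))
```

When `dirty_rate` is not smaller than `copy_rate`, the loop never shrinks the set of pages and only `max_rounds` ends it, which corresponds to the extended downtime noted above for CPU- and memory-intensive VMs.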

3 Research Gap
In cloud computing, one of the prime issues is controlling a state of disaster. After studying many approaches, we recognized that almost none of the methods prevents disaster in the first place. Rather, they let disaster happen and then suggest a recovery approach. Also, in most of the methods, the recovery approach is slow or moderate, as shown in Table 2, while the methods providing fast recovery cost more. Moreover, almost all of the mentioned approaches suffer from poor workload balancing: the load balancing among servers is not proficient in any case. Hence, some tasks collapse without any indication of failure, which in turn degrades the Quality of Service (QoS). All in all, the challenges in the present scenario are
• Disaster prevention
• Cost curtailment
• Workload balancing.


Table 2 Performance of various methods by parameter
Method/Parameters | Complexity | Cost | Security | Flexibility | Recovery approach performance

Minimum load priority Load balancing

Very low

Less

More

Low

Medium

No security Low

Post copy approach

Moderate

More

Moderate

Less

Random migration

Low Moderate

No security Low

More

Dynamic prediction migration Disaster recovery planning (DRP) and business resumption services (BRP) Workload prediction method Pre-Copy approach

Very less Medium

Moderate

Slow/No recovery Slow recovery Moderate recovery Slow recovery Moderate

High

More

Low

Less

Slow recovery

Moderate

Medium

High

Less

Varies

More

Moderate

More

Fast recovery Fast recovery

Moderate

4 Proposed Model
4.1 Architecture

The proposed model, shown in Fig. 3, runs essentially to prevent any disaster state from occurring. The Migration Controller contacts the Task Migration Algorithm whenever a task is to be migrated from one node to another. The algorithm decides which task should be migrated and to which server. It also calls the VM Controller to add a VM to the VM pool if it does not deem any available server suitable to receive the task. In Fig. 3, Task 11 is migrated from VM 1 to VM 3, where it will be known as Task 33.

4.2 Flowchart

The process starts by determining the threshold value of the migration factor that is used in the rest of the method. The first server node is then considered: its migration factor is calculated and compared with the predefined threshold value. The migration controller evaluates this comparison, and if the value does not exceed the threshold limit, the next node is considered. If the limit is exceeded, all the available VMs are sorted in ascending order for the next step. The task factor is then calculated for all the tasks on the current VM.


Fig. 3 Proposed architecture

The task with the least task factor is chosen for migration and an appropriate VM is selected. If the migration factor of that VM surpasses 50% of the threshold limit, a new VM is added and the task is migrated to the new VM. If, on the other hand, the value is below that limit, the task is migrated directly to that VM and the addition of a new VM is skipped (Fig. 4).

4.3 Algorithm

The algorithm runs mainly to prevent disaster on the server side and latency issues on the client side. The “migration_factor” helps to determine the virtual machine to which the task will be migrated. The “MigrationController” procedure determines the node from which task(s) should be migrated in order to prevent any disasters. Finally, the task that should be migrated is determined by the TaskMigrationAlgo procedure. Basically, the system runs when the Migration Controller catches any virtual machine that is running high on the “migration_factor”. The migration_factor is a combination of several parameters, namely cpuUtilization, IOPS, memoryUtilized, avgResponseTime, and networkUsage. If the migration_factor of a server is greater than the combined threshold value of all these parameters, a task is migrated from that server. The Migration Controller keeps track of the migration_factor of every node, and if any node misbehaves, it immediately contacts the TaskMigrationAlgorithm. Mathematically, the migration_factor of each node is calculated as
migration_factor = cpuUtilization (in percentage) + memoryUtilization (in percentage) + networkUsage (in percentage)
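The migration_factor formula and the comparison against a combined threshold can be sketched as below. This is a minimal illustration, assuming that all three readings are already available as percentages; the `VmStats` structure, the function names, and the sample numbers (including the threshold of 240) are hypothetical and are not taken from the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VmStats:
    """Resource readings for one VM, all in percent (hypothetical structure)."""
    name: str
    cpu_utilization: float
    memory_utilization: float
    network_usage: float


def migration_factor(vm: VmStats) -> float:
    """Sum of the three percentage readings, following the formula in the text."""
    return vm.cpu_utilization + vm.memory_utilization + vm.network_usage


def overloaded_vms(vms: List[VmStats], thresh_value: float) -> List[str]:
    """Names of the VMs whose migration_factor exceeds the combined threshold."""
    return [vm.name for vm in vms if migration_factor(vm) > thresh_value]


# Illustrative readings and threshold only; not data from the experiment.
fleet = [
    VmStats("vm-1", cpu_utilization=50.0, memory_utilization=40.0, network_usage=30.0),
    VmStats("vm-2", cpu_utilization=90.0, memory_utilization=85.0, network_usage=80.0),
]
print(overloaded_vms(fleet, thresh_value=240.0))
```

Summing percentages treats the three resources as equally important; a weighted sum would be a natural variation, but the text does not describe one.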


Fig. 4 Proposed flowchart

The “node from which the task is to be migrated” is decided by the following block of code, which is carried out by the Migration Controller.
Pseudo Code 1. Procedure MigrationController
Input: CPU_THRESH, MEMORY_THRESH, IOPS_THRESH, RESPONSE_TIME_THRESH
Output: VM from which a task needs to be migrated.
For i = 0 to n VM(s) do
    If ( ) Then    // checking various parameters for available VM(s)
        If ( ) Then
            If ( ) Then
                Call procedure TaskMigrationAlgo    // for migration of task from a node to another
            End If
        End If
    End If
End For


The overall threshold value is the sum of the threshold values of all the parameters used for migration. It is calculated as
THRESH_VALUE = CPU_THRESH + MEMORY_THRESH + NETWORK_THRESH
Pseudo Code 2. Procedure TaskMigrationAlgo
Input: migration_factor, p (pth VM), THRESH_VALUE
Output: migration of a task tx from VMa to VMb where a ≠ b
Sort all VM(s) in the ascending order of their respective migration_factor
For i = 1 to m task(s) do    // m number of tasks in a particular VM that has been chosen for migration
    // Calculate task_factor as per the following formula
End For
Sort all task(s) in ascending order of their task_factor
// Migrating first task from the queue as it would be the easiest to migrate
Get the data of task[0]
// Finding an appropriate VM to migrate the task in
If ( migration_factor of VM[0] > 50% of THRESH_VALUE ) Then    // checking whether 1st VM's migration_factor is greater than 50% of THRESH_VALUE from the queue of VM(s)
    Call Scale_up()    // adds a new VM
    Migrate task[0] to the newly added VM by cutting and pasting the required data
Else
    Migrate task[0] to VM[0]    // as VM[0] would be less likely to surpass threshold value
End If
End Procedure
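The control flow of the TaskMigrationAlgo pseudocode can be sketched in Python as follows. This is an illustration under explicit assumptions rather than the authors' code: the formula for task_factor is not given in the text, so each task simply carries a precomputed `task_factor` value; `scale_up` is a hypothetical stand-in for Scale_up() that appends an empty VM to an in-memory pool; and migration factors are not recomputed after the move.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Vm:
    name: str
    migration_factor: float
    # task name -> task_factor (assumed to be precomputed; its formula is not given)
    tasks: Dict[str, float] = field(default_factory=dict)


def scale_up(pool: List[Vm]) -> Vm:
    """Hypothetical stand-in for Scale_up(): add an empty VM to the pool."""
    new_vm = Vm(name=f"vm-new-{len(pool) + 1}", migration_factor=0.0)
    pool.append(new_vm)
    return new_vm


def task_migration_algo(source: Vm, pool: List[Vm], thresh_value: float) -> Tuple[str, str]:
    """Move the task with the smallest task_factor off `source`.

    Returns (task name, destination VM name).
    """
    if not source.tasks:
        raise ValueError("source VM has no task to migrate")
    # Candidate destinations sorted by ascending migration_factor, source excluded.
    candidates = sorted((vm for vm in pool if vm is not source),
                        key=lambda vm: vm.migration_factor)
    # task[0]: the task with the smallest task_factor.
    task = min(source.tasks, key=source.tasks.get)
    # Add a new VM when even the least loaded candidate is above 50% of the threshold.
    if not candidates or candidates[0].migration_factor > 0.5 * thresh_value:
        destination = scale_up(pool)
    else:
        destination = candidates[0]
    destination.tasks[task] = source.tasks.pop(task)
    return task, destination.name


# Illustrative values only.
busy = Vm("vm-2", 255.0, {"t1": 30.0, "t2": 10.0, "t3": 20.0})
idle = Vm("vm-1", 100.0)
print(task_migration_algo(busy, [idle, busy], thresh_value=240.0))
```

Calling the function repeatedly until no VM exceeds the threshold would mirror the repeat-until-safe behaviour described for the model, provided the migration factors are refreshed between calls.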

5 Performance Evaluation
In the experiment performed with the algorithm described above, the proposed algorithm improved performance considerably and balanced the load evenly, so that the chances of a disaster occurring were minimized. Moreover, the tasks were carried out efficiently and the Quality of Service (QoS) was boosted. As shown in Fig. 5, two AWS EC2 instances, named Task-Migration-1 and Task-Migration-2, were kept running, with multiple tasks running on each of them. A task from a particular server was migrated


Fig. 5 EC2 dashboard

using the abovementioned algorithm when one of the servers exceeded the specified threshold value. A screenshot of the AWS EC2 dashboard is shown in Fig. 5. Results show that the TaskMigrationAlgo algorithm outperforms existing approaches as concerns preventing disasters. The approximate graph in Fig. 6

Fig. 6 Cost comparison graph


Fig. 7 Execution snap-1

clearly shows that the overall cost of a VM can be significantly reduced by using the proposed model rather than the conventional approach. The proposed model migrates tasks running on a node rather than the node itself, and hence its cost is lower than that of approaches that migrate entire servers for the sake of one or more tasks. In the graph, the threshold value of CPU utilization is assumed to be 80%. The CPU utilization, memory utilization, and network usage of both VMs are calculated. If a server is found to be vulnerable to disaster, one task is migrated from it to another server. The process is repeated until all the servers are safe from any kind of disaster. The outputs of the experiment are shown in Figs. 7, 8, and 9.


Fig. 8 Execution snap-2

As shown in the above snapshots, the task information of both VMs was first calculated, and it was deemed that migration was not required, as shown in Fig. 7. Later, Task-Migration-2 carried more load than the threshold. Hence, the Migration Controller was triggered and a specific task was migrated from the Task-Migration-2 server to the Task-Migration-1 server, as shown in Fig. 8. Finally, in Fig. 9, it was calculated that the load was balanced and that migration of another task was not required. The experiment is summarized in tabular form in Table 3.


Fig. 9 Execution snap-3

Table 3 Summary
Parameter | Value
Number of AWS EC2 instances | 5
Number of AWS EC2 running instances | 2
Number of tasks running in each instance | 3–5
Number of tasks migrated | 1
Migrated from | Task-Migration-2 (18.219.57.3)
Migrated to | Task-Migration-1 (18.221.86.16)
Region of available servers | us-east-2c
Instance type | Micro

6 Conclusion and Future Work
In this work, we surveyed various migration techniques for cloud computing. The proposed model does something unprecedented: it prevents disasters from taking place. By implementing the proposed algorithms, task migration can be performed, which helps in preventing disasters. As the proposed model migrates tasks as soon as a VM reaches the threshold value, there are even fewer chances for a


disaster to occur. This in turn will benefit business models, as they will not have to deal with recovery strategies after a disaster has taken place. Results show that the proposed algorithm outperforms traditional methods in that it works to reduce cost and prevent disaster rather than waiting for disaster to take place and then taking action. The proposed adaptive model and algorithms can be extended to solve latency issues, which will be considered in future work.

References
1. Reese, G.: Cloud Application Architectures: Building Applications and Infrastructure in the Cloud. O’Reilly Media
2. Mell, P., Grance, T.: The NIST definition of cloud computing. In: National Institute of Standards and Technology, vol. 53, pp. 1–50. NIST, Gaithersburg (2011)
3. Moharana, S.S., Ramesh, R.D., Powar, D.: Analysis of load balancers in cloud computing. Int. J. Comput. Sci. Eng. (IASET) 2, 101–108 (2013)
4. Ahmad, R.W., Gani, A., Ab Hamid, S.H., Shiraz, M., Yousafzai, A., Xia, F.: A survey on virtual machine migration and server consolidation frameworks for cloud data centers. J. Netw. Comput. Appl. 52, 11–25 (2015)
5. Jamshidi, P., Ahmad, A., Pahl, C.: Cloud migration research: a systematic review. IEEE Trans. Cloud Comput. 1(2), 142–157 (2013)
6. Foster, I., Zhao, Y., Raicu, I., Lu, S.: Cloud computing and grid computing 360-degree compared. In: Proceedings of the Grid Computing Environments Workshop, pp. 99–106. IJSTR (2008)
7. Tang, Z., Mo, Y., Li, K.: Dynamic forecast scheduling algorithm for virtual machine placement in cloud computing environment. J. Super-Comput. 70(3), 1279–1296 (2014)
8. Forsman, M., Glad, A., Lundberg, L., Ilie, D.: Algorithms for automated live migration of virtual machines. J. Syst. Softw. 101, 110–126 (2015). ISSN: 0164-1212
9. Michael, R.H., Umesh, D., Kartik, G.: Post-copy live migration of virtual machines. SIGOPS Oper. Syst. Rev. 43, 14–26 (2009)
10. Chen, J., Qin, Y., Ye, Y., Tang, Z.: A live migration algorithm for virtual machine in a cloud computing environment. In: 2015 IEEE 12th International Conference on Ubiquitous Intelligence and Computing and 2015 IEEE 12th International Conference on Autonomic and Trusted Computing and 2015 IEEE 15th International Conference on Scalable Computing and Communications and Its Associated Workshops (UIC-ATC-ScalCom), Beijing, pp. 1319–1326 (2015)
11. Reeba, P.J., Shaji, R.S., Jayan, J.P.: A secure virtual machine migration using processor workload prediction method for cloud environment. In: 2016 International Conference on Circuit, Power and Computing Technologies (ICCPCT), Nagercoil, pp. 1–6 (2016)
12. Akram, S., Ghaleb, S., Ba, S., Siva, V.: Survey study of virtual machine migration techniques in cloud computing. Int. J. Comput. Appl. 177, 18–22 (2017)
13. Weining, L., Tao, F.: Live migration of virtual machine based on recovering system and CPU scheduling. In: 6th IEEE Joint International Information Technology and Artificial Intelligence Conference, Piscataway, NJ, USA, pp. 303–307, May 2009
14. Hai, J., Li, D., Song, W., Xuanhua, S., Xiaodong, P.: Live virtual machine migration with adaptive, memory compression. In: IEEE International Conference on Cluster Computing and Workshops, CLUSTER’09, pp. 1–10
15. Cloud computing architecture and its vulnerabilities. https://www.slideshare.net/VinayDwivedi3/cloud-computing-architecture-and-vulnerabilies. Accessed 13 Oct 2017
16. Kumar, A., Mishra, S., Mishra, A.: Priority with adoptive data migration in case of disaster using cloud computing use style. In: 2015 International Conference on Communication, Information and Computing Technology (ICCICT), Mumbai, pp. 1–6 (2015)

Lip Tracking Using Deformable Models and Geometric Approaches Sumita Nainan and Vaishali Kulkarni

Abstract Multimodal biometrics addresses the issue of recognizing and validating the identity of a person; the challenge, however, is for a single modality to be robust enough. Voice, being a simple biometric feature to acquire, together with the accompanying movement of the lips, which is distinct for every person, can stand up to this challenge. Tracking lip movement in real time can be an important biometric trait for person recognition and authentication. A biometric system should be robust and secure, especially when it must be deployed in vital domains. In this paper, we have used three different methods to draw contours around the lips and detect the lip edges in order to establish lip movement. In the first method, dynamic lip edge patterns were drawn and simultaneously saved in a database created for each person. In the second, lip contours were created using Active Snakes models, while in the third, the motion extraction of the lips was implemented using an edge detection algorithm. In the work presented here, we have segmented the lip region from the facial images, and have implemented and compared three different approaches to contour the lip region. However, no single model fits every application, owing to the various face poses and unpredictable lip movements. The target application should be the deciding factor in considering lip movement as a biometric modality alongside the voice biometric trait.
Keywords ASR · Edge detection · Lip movement · Active contours · Segmentation

S. Nainan (✉) ⋅ V. Kulkarni Department of Electronics and Telecommunication Engineering, MPSTME (SVKM’s NMIMS), Mumbai, India e-mail: [email protected] V. Kulkarni e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_65


S. Nainan and V. Kulkarni

1 Introduction
Automatic Speaker Recognition (ASR) systems for identifying or verifying a speaker have taken huge strides over the years. ASR is significant in validating the identity of a person in banking, high-security surveillance systems, voice dialing, computer access to remote information, various database access services, online shopping, telemarketing, and attendance systems [1]. The biggest challenge here is for the system to emulate the precise, reliable, and prompt human response. Biometrics measures, analyzes, recognizes, and matches the biological parameters of a person by acquiring prominent attributes and matching these features against a model created from a previously established database of the same features. The availability of state-of-the-art mobile phones, advances in digital imaging and sensing platforms, and the development of innumerable sensors have made human parameters such as voice, temperature, fingerprints, facial images, ECG/EEG, gait, and other personal information easy to acquire. The challenge today lies in maintaining privacy, safeguarding personal information, offering identity protection, and preventing unauthorized use [2]. It is therefore imperative to select a dedicated biometric modality for a biometric system depending on the target application. Voice is a simple, distinct single modality that can be easily extracted, and the movement of the lips accompanying speech is different for every person. Single-modality biometric systems, however, have their limitations. The accuracy of authentication and recognition depends on the conditions under which the modalities were acquired: the features are highly susceptible to noise, illumination, the angle and intensity of light, and the age and health of the person; they cannot withstand the test of time; and they can be vulnerable to spoof and replay attacks.
The challenge hence lies in selecting the single modality that will ensure a flawless biometric system. Lip movement is intrinsic to speech production and, being dynamic, provides proof of liveness. The acoustics of speech production and the simultaneous lip movement are correlated [3]. The articulatory movement of the lips, tongue, and teeth is proof of real-time speech and can also represent a live face/person [4]. Real-time tracking of lip movement is the task that this paper attempts to address. This work extracts lip movement dynamically and explores different methods of extracting lip features leading to person identification and authentication. As the focus is on detecting a speaker in real time and implementing an authentic person recognition system, the primary step is to detect the face. The mouth region is then the region of interest (ROI). The images in the ViDTIMIT database are stored as frames of the video recording of each spoken sentence. Each of the 43 speakers in the database has spoken ten sentences, which have been recorded over three different sessions. For an average sentence of 4 s duration recorded at 25 fps, approximately 100–125 frames are available per sentence. In one of the methods used for experimentation to draw


contours around the lip region, the lip boundaries were formed for motion extraction in every frame using the Active Snakes method. Canny edge detection was another method tried on the same images of the ViDTIMIT database. For real-time mouth detection and contour creation, we also used the system's webcam and captured the image of the author in real time. Using computer vision, with the algorithm written in Python, the bounding box around the mouth was used as the reference. We calculated the distance from the center of the mouth to random points on the upper and lower lip regions, creating contours simultaneously while the author speaks. Matplotlib was used for pattern generation, and a dynamic database was created for each person. The paper is organized in four sections: Sect. 1 highlights the related work cited by various authors. Section 2 discusses the various techniques available for face and lip movement detection. Section 3 elaborates on the methods of implementation and the work done, while results are discussed in Sect. 4, followed by the conclusion and references.
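The distance measurement described above, from the center of the mouth bounding box to points on the lips, can be sketched as follows. This is a minimal illustration that assumes the bounding box is given as (x, y, width, height) in pixels and that the lip points have already been located; the function names and sample coordinates are hypothetical and are not the authors' code.

```python
import math
from typing import List, Sequence, Tuple

BBox = Tuple[float, float, float, float]   # (x, y, width, height) of the mouth box
Point = Tuple[float, float]                # (x, y) pixel coordinates


def mouth_center(bbox: BBox) -> Point:
    """Center of the mouth bounding box, used as the reference point."""
    x, y, w, h = bbox
    return (x + w / 2.0, y + h / 2.0)


def lip_distances(bbox: BBox, lip_points: Sequence[Point]) -> List[float]:
    """Euclidean distance from the mouth center to each lip point."""
    cx, cy = mouth_center(bbox)
    return [math.hypot(px - cx, py - cy) for px, py in lip_points]


# Hypothetical box and lip points for a single frame.
frame_bbox = (10.0, 20.0, 40.0, 20.0)
upper_and_lower = [(30.0, 25.0), (30.0, 36.0), (33.0, 34.0)]
print(lip_distances(frame_bbox, upper_and_lower))
```

Collecting one such list of distances per frame gives a time series that changes as the mouth opens and closes, which is the kind of per-person pattern that could then be plotted and stored.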

1.1 Literature Review

Face recognition is one of the most sought-after areas of research, especially with the advancements in human–machine interfaces. Besides the obvious facial features such as the eyes, nose, and mouth, the lips are an emerging feature to consider, as they are intrinsic to speech production and hence can be unique to every person [5]. MATLAB has been used to position and segment the lips for extracting features. The RGB color space algorithm has traditionally been employed, but it has now been replaced by the YCbCr color space method, as the latter leads to higher accuracy in less time. Dynamic thresholding is used in the algorithm for lip localization, segmentation, and edge detection, after which features can be extracted from the lips. The significant challenge in using this feature on its own is uneven illumination, as it influences the computational complexity of person recognition. Lip reading using optical flow has been proposed by Shiraishi and Saitoh [6], where the motion of the head and the features of the face are considered in place of the regular rectangular ROIs. Another aspect considers the movement of the skin surface as the person speaks; the optical flow is evaluated from this skin surface, using the CUAVE database. In a multimodal human–computer interface [7], a Lip Mouse was created mimicking the motion of the lips, and interaction with the computer was established using the movement of the mouth. In yet another approach, the center of the lips [8] was taken as the starting point from which to plot points along the lip edges, making the method significantly precise. Once obtained, mouth region contours need to be tracked for dynamic person authentication. Lip reading is another application in which this is widely used. Character recognition has been established [9] from feature vectors deduced from the tracked lip contour velocity. For the hearing impaired, visual character


recognition becomes an important tool for communication. A distance-weighted k-nearest neighbor algorithm was used for recognition. Kalman filters together with Support Vector Machines (SVM) have been used for training and classifying data to demonstrate real-time lip movement detection and tracking. A departure from the regular Haar-like features is the variance-based Haar-like feature, which proves to be much more efficient when used with an SVM as the classifier [10]. Color-based techniques use hue and saturation to distinguish the mouth region from the nearby skin region; they prove to be more efficient and lead to improvements in the design of a biometric system [11]. Images were analyzed using the hue and chrominance levels, and segmentation of the mouth region was carried out accordingly. A novel composite perspective has been discussed in [12]. Principal Component Analysis (PCA) was applied to discriminate between the colors of the lip region and the rest of the face, and, utilizing the intensity values, a new deformable model was proposed to effectively obtain and track the lips. PCA has also maximized the amalgamation of auditory and imaged parameters and has shown the prominent development of the Automatic Speaker Recognition (ASR) approaches. The Euler–Lagrange algorithm is another mathematical technique, which uses active contours to perform segmentation [13]. However, this method does not consider the contour dilatation aspect, which was originally proposed by Kass et al. Gaussian filtering, thresholding, and the Hough transform have also traditionally been used to create binary images and to implement edge detection and object segmentation.
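The move from RGB to the YCbCr color space mentioned in this review can be illustrated with a per-pixel conversion. The sketch below uses the full-range BT.601 coefficients employed by JPEG/JFIF; whether the cited work used exactly these coefficients is not stated in the text. The helper `is_lip_candidate` and its `cr_min` value are purely hypothetical, shown only to indicate how a chrominance threshold might separate reddish lip pixels from skin.

```python
from typing import Tuple


def rgb_to_ycbcr(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one 8-bit RGB pixel to YCbCr (full-range BT.601, as in JPEG/JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_lip_candidate(r: float, g: float, b: float, cr_min: float = 150.0) -> bool:
    """Hypothetical rule: flag pixels whose red-difference chroma (Cr) is high.

    The threshold is an illustrative assumption, not a value from the literature.
    """
    _, _, cr = rgb_to_ycbcr(r, g, b)
    return cr >= cr_min


print(rgb_to_ycbcr(128, 128, 128))
print(is_lip_candidate(200, 80, 90), is_lip_candidate(128, 128, 128))
```

Separating luminance (Y) from chrominance (Cb, Cr) is what makes this representation less sensitive to brightness changes than thresholding RGB values directly, although uneven illumination remains a difficulty, as noted above.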

2 Lip Movement Detection
Features extracted from the unique voice, when combined with the additional information available from lip movement, give a robust multimodal biometric system. For applications where the safety of personal data must always be ensured, lip motion tracking as a prime biometric trait can simplify and address major hindrances. The precision of recognition in any biometric system is limited by the image acquisition device, the ambient conditions, and insufficient distinction between the facial and mouth colors. The techniques for extracting features of the mouth area from the sequence of frames extracted from videos fall into two prominent categories: image-dependent and active contour dependent.

2.1 Image-Dependent Approach

The lip model in this approach is obtained by estimating the dimensions of the lips in terms of their height and width, usually calculated from the center of the mouth. The active contour model (also known as the snake model), the deformable template approach, and geometric modeling are some of the methods of implementation. Neural networks, window classifiers, and auto-correlation are the classifiers that can be employed. This method is usually deployed for tracking the lips, as it reduces processing time and offers improved precision. The Localized Active Contour Model (LACM) and Region-Scalable Fitting Energy (RSFE) are other models prominently used for lip tracking [13]. Detecting the lip contours in every frame of the acquired face image sequence can verify and authenticate the speaking person, as it is an indication of liveness.

2.2 Active Contour Model

The Active Contour Model, suggested and first implemented by Kass et al., works by minimizing an energy defined on the image, thereby allowing the contour to align automatically along the edges of the region of interest. Another important consideration in this deformable model is to have a continuous curve that matches the shape of the object. However, in this method the snake must be initialized close to the desired object, after which it eventually follows the edge line. The difficulty lies in deciding the initial contour, as the lip shape deforms with every utterance. The snake algorithm calculates the external energy, which describes how well the changing curve matches the object. The internal energy describes the smoothness and continuity of the curve, and lastly the global contour energy to be minimized [13] is given in Eq. (1).

$$E = \int \left( \alpha(s)\,E_{cont} + \beta(s)\,E_{image} \right) ds \qquad (1)$$

α, β and γ are the weighting factors of the spatial positions.
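As a minimal sketch of how the contour energy of Eq. (1) could be evaluated on a discrete snake, the Python function below sums a continuity term and an image term over the contour points. Several things here are assumptions made only for illustration: the weights α and β are taken as constants rather than functions of s, E_cont is discretized as the squared distance between neighboring points, and the image term is supplied by the caller as a hypothetical function `image_energy(x, y)` (for example, a negative edge strength).

```python
def snake_energy(points, image_energy, alpha=1.0, beta=1.0):
    """Discrete version of E = sum(alpha * E_cont + beta * E_image).

    points       : list of (x, y) tuples describing a closed contour
    image_energy : callable (x, y) -> float, assumed to be low on edges
    alpha, beta  : constant weights (an assumption; Eq. (1) allows them
                   to vary along the contour)
    """
    total = 0.0
    for i in range(len(points)):
        x, y = points[i]
        # points[i - 1] wraps around at i == 0, which closes the contour
        px, py = points[i - 1]
        e_cont = (x - px) ** 2 + (y - py) ** 2
        total += alpha * e_cont + beta * image_energy(x, y)
    return total


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    # A constant image term stands in for a real edge map here
    print(snake_energy(square, lambda x, y: -1.0, alpha=0.5, beta=1.0))
```

A snake algorithm would move the points iteratively so that this total decreases; the sketch only evaluates the energy for a given contour.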

2.3 Region-Scalable Fitting Energy and Localized Active Contour Methods

The RSFE and LACM were introduced for processing biomedical images, particularly for segmenting the ROI, but have proven to be equally effective for lip movement tracking. The active contour here is driven by the data fitting energy towards the edges of the object to be segmented. Equations (2) and (3) explain the lip tracking implementation.


S. Nainan and V. Kulkarni

$$\frac{d\phi}{dt} = -\delta_{\varepsilon}(\phi)\,(\lambda_1 e_1 - \lambda_2 e_2) \qquad (2)$$

The lip contours represented by the zero level set decide the domain to which the contour belongs. This is also called the level set segmentation method.

$$e_i(x) = \int K_{\sigma}(y - x)\,|I(x) - f_i(x)|^2\,dy \qquad (3)$$

where i = 1, 2, K_σ is a Gaussian kernel and I(x) is the image intensity. In the LACM method, local image statistics are employed for lip motion tracking. The contours created can be used for segmentation of the desired objects from the whole image. Equation (4) is used for implementing this method [14].

$$\frac{d\phi}{dt}(x) = \delta\phi(x)\int_{\sigma} D(x, y)\,F(I(y), \phi(y))\,dy + K(\phi(x)) \qquad (4)$$

x and y are independent variables. They represent a point in the image domain. I is a 2D image from amongst the many frames.

3 Work Done

The VidTIMIT database [15] as well as real-time images have been used for the experimentation. We used three different methods to draw contours around the lips or to detect the lip edges. In the first method, a USB webcam was used for real-time edge detection of the mouth. The experimentation was carried out in an office environment, and the image of the author was captured as shown in Fig. 1. The face was detected first but, as the mouth region is our area of interest, we segmented that region to reduce the image size. Face detection as shown in Fig. 1 was done using the VideoCapture option in OpenCV. Contours were then drawn along the lip edges to define the lip movement. To create the contours, points were obtained mathematically along the upper and lower lip edges. The distance of the lip edges from the center of the mouth was calculated to locate the lip movement for the spoken word. The lip edges were determined by calculating the lip dimensions along the length, breadth, width, and height (x, y, w, h) from the center of the mouth. The lips were thus tracked at 20–25 frames per second, with contours created for every frame as seen in Fig. 2 and saved, creating a database of the detected mouth images of the author for further processing. The advantage of this method over the other methods tried is that the database of the segmented lip region was created within 3–4 s, which is essential in real-time applications.

Lip Tracking Using Deformable Models and Geometric Approaches


Fig. 1 Face detection using ROI

Fig. 2 Lip contours

The lip contours obtained, however, do not completely define the lip edges, as the illumination conditions under which the images were acquired were far from ideal. In the active snake method, the VidTIMIT database was used, which has videos of 43 volunteers reciting short sentences. For our experimentation, the video recordings of 10 different people were used, where each person had 10 videos for 10 different sentences. The limitation, as can be seen from Fig. 3, is that the images are influenced by the intensity of light and the contours created depend on the positioning of the frontal face. Figure 3 shows the contour drawn by the active snake.


Fig. 3 Extraction of mouth/lips

Fig. 4 Bounding box around the mouth/segmented mouth/edge-detected lips

The four images in Fig. 3 show the initial contours, the external energy, the external force field, and the snake movement. In the final method, the Computer Vision Toolbox was used to detect the face, and the mouth was detected by drawing a rectangular bounding box around it. Once the bounding box was available, its contents were treated as an image. Canny edge detection was subsequently applied to extract the lip contours as shown in Fig. 4.

4 Conclusion

The lip contours were thus extracted using three different methods for further processing. Although dynamic lip movement tracking was achieved in one of the methods, the ambient conditions under which the images were acquired play a very important part in the authentication process. Person recognition can be done in real time, i.e., dynamically, if exact contours can be traced in minimum time. In spite of current research devoted to mouth localization and lip region extraction for person recognition, the difficulty lies in evaluating the frontal and the profile faces. It is difficult to describe the lip region by a single model due to the various face poses and unpredictable lip movements. The target application should be the deciding factor for the consideration of lip movement as the biometric modality along with the voice biometric trait. We propose to use the contours created to further extract features for person verification and authentication.

Acknowledgements The consent to use the image for this work has been obtained as the image used is of the author and there is no objection in publishing this work.

References

1. Nainan, S., Kulkarni, V.: A comparison of performance evaluation of ASR for noisy and enhanced signal using GMM. In: International Conference on Computing and Security Trends. IEEE, Pune (2016)
2. Wang, Liu: A new spectral image assessment based on energy of structural distortion. In: International Conference on Image Analysis and Signal Processing. IEEE (2009)
3. Matsui, T., Furui, S.: Concatenated phoneme models for text-variable speaker recognition. In: Proceedings International Conference on Acoustics, Speech and Signal Processing, ICSLP, pp. 391–394 (1993)
4. Chetty, G., Wagner, M.: Automated lip feature extraction for liveness verification in audio-video authentication. HCC Laboratory University of Canberra, Australia (2004)
5. Shen, X., Wei, W.: An algorithm of lips secondary positioning and feature extraction based on YCbCr color space. In: International Conference on advances in Mechanical Engineering and Industrial Informatics AMEII (2015)
6. Shiraishi, J., Saitoh, T.: Optical Flow based Lip Reading using Non Rectangular ROI and Head Motion Reduction. IEEE (2015). ISBN: 971-1-4799-6026-2
7. John Hubert, P., Sheeba, M.S.: Lip and head gesture recognition based PC interface using image processing. Biomed. J. Pharm. J. 8(1), 77–82 (2015)
8. Gurumurthy, S., Tripathy, B.K.: Design and implementation of face recognition system in Matlab using the features of lips. IJISA 4(8–4) (2012)
9. Mehrotra, H., Agrawal, G.: Automatic lip contour tracking and visual character recognition for computerized lip reading. Int. J. Electr. Comput. Energ. Electron. Commun. Eng. 3(4) (2009)
10. Wang, L., Wang, X.: Lip detection and tracking using variance based Haar-like features and Kalman filter. In: Fifth International Conference on Frontier of computer Science and Technology. IEEE (2010). ISBN: 978-0-7695-4139-6/10
11. Craig, B., Harte, N.: Region of Interest Extraction using Colour Based Methods on the CUAVE Database ISSC, Dublin (2009)
12. Ooi, W.C., Jeon, C.: Effective lip localization and tracking and achieving multimodal speech recognition. In: International Conference on Multisensor Fusion and Integration for Intelligent Systems, Seoul Korea (2008)
13. Morade, S., Patnaik, S.: Automatic lip tracking and extraction of lip geometric features for lip reading. Int. J. Mach. Learn. Comput. 3(2) (2013)
14. Arthur, C.: Image processing using active contours model. Thesis (2012)
15. Sanderson, C., Lovell, B.C.: Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference. Lecture notes in Computer Science (LNCS), vol. 5558, pp. 199–208 (2009)

Highly Secure DWT Steganography Scheme for Encrypted Data Hiding Vijay Kumar Sharma, Pratistha Mathur and Devesh Kumar Srivastava

Abstract Steganography is an application for concealed writing. Nowadays, developments in electronic media demand highly secure data transfer between computers across the globe. There is a pressing demand to send secret information in hidden form. Concealed writing is the key feature of steganography that fulfills this demand. In general, a robust steganography scheme should offer good visual imperceptibility of the hidden information together with sufficient payload. In this paper, a steganography method based on cryptography is proposed. The proposed scheme uses two processes: first, the secret plaintext (PT) message is encrypted and a secret image is generated from the encrypted message; second, this secret image is hidden in the cover image using the Daubechies discrete wavelet transform (DWT) followed by a mixing operation. After performing the Daubechies inverse discrete wavelet transform (IDWT), we get the stego image. The visual quality of the stego image is increased by using the Daubechies wavelet. The effectiveness of the proposed scheme is shown by experimental results.

Keywords Wavelet transforms ⋅ Steganalysis ⋅ Steganography ⋅ Cryptography

V. K. Sharma (✉) ⋅ P. Mathur ⋅ D. K. Srivastava SCIT, Manipal University Jaipur, Jaipur, Rajasthan, India e-mail: [email protected] P. Mathur e-mail: [email protected] D. K. Srivastava e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_66


1 Introduction

Today, many techniques are available to transfer information. Security is essential when the information is secret (e.g., intelligence agency information). Cryptography and steganography are two techniques for concealing information. Cryptography conceals the meaning of the message, whereas steganography hides the existence of the message. Cryptography performs an encryption operation to convert a meaningful plaintext message into a ciphertext message. The existence of the message is always visible in cryptography, whereas steganography hides the existence of the secret information [1, 2]. Steganalysis is a type of attack on a steganography technique [1, 3]. Combining steganography with cryptography makes it more robust [4]. Steganography schemes can be classified into two classes:

Class A: spatial domain techniques
Class B: transform domain techniques

Class A steganographic techniques are less secure because they embed the data directly into the image pixels, whereas class B techniques first transform the image information into another form, such as wavelets, DCT, or FFT, and then perform the embedding operation. We therefore use a transform domain technique because of its robustness. Many kinds of wavelets exist [3, 5]; in the current work, the Daubechies (db8) wavelet is used. It provides better image quality for the stego image and the secret image [6].

2 Related Work

Much work has been done in the past on steganography and cryptography. Comprehensive data hiding [5] uses both steganography and cryptography; the technique uses AES-based encryption followed by image steganography. In [5–7], a transform domain technique is used. Schemes with high embedding capacity are presented in [7–10]. In [4], a substitution-based cryptography technique for embedding data in the image is presented.

3 The Proposed Steganography Scheme Using Cryptography

The proposed steganographic scheme is shown in Figs. 1 and 2, which give the basic building block diagram and a detailed view of data hiding, respectively. The proposed model is a combination of two schemes: a cryptography scheme and wavelet-based steganography.


Fig. 1 Block diagram of cryptography followed by steganography

Fig. 2 Block diagram of encoding process of the proposed scheme


Fig. 3 Block diagram of decoding process of the proposed scheme

Figure 3 shows the overall extraction process of the plaintext message from stego image.

3.1 Cryptography System of Proposed Scheme

To transform the plaintext message into the corresponding ciphertext message, two types of techniques are available [4]: substitution techniques and transposition techniques. Here, we use the substitution technique. This technique has two processes: an encryption process and a decryption process.

Encryption Process: The cryptographic system of the proposed scheme is based on the following encryption steps.

(1) Perform the following steps as one-time initialization processes.
    (a) Divide the plaintext message into blocks of 15 bytes.
    (b) Perform the initialization process (one-time initialization) of the 15-byte plaintext (PT) block (named the state).
(2) For each round, do the following:
    (a) PT = PT − 32.
    (b) Multiply the key matrix and the PT matrix.
    (c) Compute a new matrix by applying mod 95 to the matrix generated above.
    (d) Add 32 to the matrix obtained from the above step.
(3) Reshape the matrix into a row.
(4) Translate the numbers into characters to obtain the ciphertext message.

Any number n of rounds can be performed; the number of rounds is chosen by the user. The same key is used for each round; here we perform the technique for one round (n = 1). In the first phase, choosing the key (K) is an important task. The size of K (i.e., 3 × 3 or 5 × 3) depends on the size of the message matrix (matrix multiplication rule). The key K is valid if and only if its inverse is an integer matrix; this property provides the security and robustness of the proposed cryptography technique. The one-time initialization process is quite simple: the 15-byte PT block is just copied into a 2D array of size 3 × 5. The ASCII codes of the printable characters lie in the range 32–126. Before computing the matrix multiplication (i.e., step 2(b)), we adjust the PT matrix by subtracting 32 from every number, which results in a matrix with values in the range 0–94 (i.e., 32 − 32 = 0 and 126 − 32 = 94). We then apply the mod 95 operation followed by the addition of 32, and then steps 3–4. The one-time initialization process is shown in Fig. 4, in which B1, B2, …, Bn represent the data bytes of the characters.

Fig. 4 One time initialization process

For example, let us take the 3 × 3 key matrix

$$K = \begin{bmatrix} 1 & 5 & 3 \\ 2 & 11 & 8 \\ 4 & 24 & 21 \end{bmatrix}$$

and PT = “vijay how r u?”

Step (1): The 15 bytes of the PT are represented as 118 105 106 97 121 32 104 111 119 32 114 32 117 63 32. The state array, filled column by column, is as follows:

$$\text{State array} = \begin{bmatrix} 118 & 97 & 104 & 32 & 117 \\ 105 & 121 & 111 & 114 & 63 \\ 106 & 32 & 119 & 32 & 32 \end{bmatrix}$$

Step (2): (a) PT = PT − 32:

$$PT = \begin{bmatrix} 86 & 65 & 72 & 0 & 85 \\ 73 & 89 & 79 & 82 & 31 \\ 74 & 0 & 87 & 0 & 0 \end{bmatrix}$$

(b) The value of K * PT is as follows:

$$K * PT = \begin{bmatrix} 673 & 510 & 728 & 410 & 240 \\ 1567 & 1109 & 1709 & 902 & 511 \\ 3650 & 2396 & 4011 & 1968 & 1084 \end{bmatrix}$$

(c) Now calculate the new matrix as New matrix = (K * PT) mod 95:

$$\text{New matrix} = \begin{bmatrix} 8 & 35 & 63 & 30 & 50 \\ 47 & 64 & 94 & 47 & 36 \\ 40 & 21 & 21 & 68 & 39 \end{bmatrix}$$

(d) Now add 32 to the result obtained in step (c). We get the following values:

$$\begin{bmatrix} 40 & 67 & 95 & 62 & 82 \\ 79 & 96 & 126 & 79 & 68 \\ 72 & 53 & 53 & 100 & 71 \end{bmatrix}$$

Step (3): 40 79 72 67 96 53 95 126 53 62 79 100 82 68 71

Step (4): The ciphertext message obtained is (OHC`5_~5>OdRDG

Decryption Process. The decryption process of the cryptographic system of the proposed scheme is just the reverse of the encryption process.
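The encryption steps and the worked example can be reproduced with a short Python sketch. It uses the key K and the plaintext from the example and a single round (n = 1). Two details are assumptions of this sketch rather than statements from the text: short blocks are padded with spaces (suggested by the trailing byte 32 in the example), and decryption is done by multiplying with the integer inverse of K, which was computed by hand for this sketch (det K = 1) and is not given in the paper.

```python
# Key from the worked example; KEY_INV is its integer inverse, derived
# for this sketch only (KEY * KEY_INV is the identity matrix).
KEY = [[1, 5, 3], [2, 11, 8], [4, 24, 21]]
KEY_INV = [[39, -33, 7], [-10, 9, -2], [4, -4, 1]]
BLOCK = 15  # bytes per block, arranged as a 3 x 5 state


def _to_state(block):
    # Step 1(b) and 2(a): fill the 3 x 5 state column by column, minus 32
    codes = [ord(ch) - 32 for ch in block]
    return [[codes[3 * c + r] for c in range(5)] for r in range(3)]


def _from_state(state):
    # Steps 2(d), 3, 4: add 32, read column by column, map to characters
    return "".join(chr(state[r][c] + 32) for c in range(5) for r in range(3))


def _multiply_mod95(key, state):
    # Steps 2(b) and 2(c): matrix product followed by mod 95
    return [[sum(key[r][k] * state[k][c] for k in range(3)) % 95
             for c in range(5)] for r in range(3)]


def _apply(key, text):
    # Pad with spaces to a multiple of 15 characters (assumed convention)
    padded = text.ljust(-(-len(text) // BLOCK) * BLOCK)
    blocks = [padded[i:i + BLOCK] for i in range(0, len(padded), BLOCK)]
    return "".join(_from_state(_multiply_mod95(key, _to_state(b)))
                   for b in blocks)


def encrypt(plaintext):
    return _apply(KEY, plaintext)


def decrypt(ciphertext):
    return _apply(KEY_INV, ciphertext)


if __name__ == "__main__":
    ct = encrypt("vijay how r u?")
    print(ct)           # (OHC`5_~5>OdRDG
    print(decrypt(ct))  # the plaintext, padded with one trailing space
```

The integer inverse is what makes the validity condition on K matter: because K times its inverse is exactly the identity over the integers, the product is also the identity modulo 95, so the subtraction, multiplication, and modulo steps can be undone exactly.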

3.2 Proposed Steganography Scheme

Every steganography technique has two processes, an encoding process and a decoding process, which are shown in Figs. 2 and 3.

Encoding Process. The main encoding steps of the proposed scheme are as follows:
Step 1: Obtain the cover image.
Step 2: Generate the secret image from the ciphertext message (CT) using the text message to image message generator.
Step 3: Perform the DWT on the cover image (C).
Step 4: Perform the DWT on the secret image (S).
Step 5: Add the two transformed images obtained from steps 3 and 4 and generate the stego image.

Decoding Process. The decoding process of the proposed wavelet-based steganography scheme is shown in Fig. 3.
Step 1: Obtain the stego image (SO).
Step 2: Obtain the cover image (C).
Step 3: Perform the DWT on the SO image and the C image.
Step 4: Subtract the cover image from the stego image to acquire the secret image coefficients.
Step 5: Apply the IDWT to generate the message image (secret image message).
Step 6: Regenerate the ciphertext from the secret image characters using the image message to text message generator. The message obtained is the ciphertext message.
Step 7: Apply the decryption process to recover the plaintext message.
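The encoding and decoding steps can be illustrated with a small, self-contained Python sketch. To keep it short it makes several substitutions that are not part of the proposed scheme: a one-level Haar transform on a one-dimensional signal stands in for the two-dimensional Daubechies (db8) DWT, and the mixing operation is assumed to be an addition weighted by a hypothetical factor `alpha`. The function names are illustrative only.

```python
import math

_S = math.sqrt(2.0)


def haar_dwt(signal):
    """One-level orthonormal Haar DWT of an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) / _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / _S, (a - d) / _S])
    return out


def embed(cover, secret, alpha=0.05):
    """Encoding steps 3-5: transform both signals, mix, transform back."""
    ca, cd = haar_dwt(cover)
    sa, sd = haar_dwt(secret)
    mixed_a = [c + alpha * s for c, s in zip(ca, sa)]
    mixed_d = [c + alpha * s for c, s in zip(cd, sd)]
    return haar_idwt(mixed_a, mixed_d)


def extract(stego, cover, alpha=0.05):
    """Decoding steps 3-5: transform, subtract the cover, transform back."""
    ta, td = haar_dwt(stego)
    ca, cd = haar_dwt(cover)
    sa = [(t - c) / alpha for t, c in zip(ta, ca)]
    sd = [(t - c) / alpha for t, c in zip(td, cd)]
    return haar_idwt(sa, sd)


if __name__ == "__main__":
    cover = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
    secret = [0.0, 255.0, 255.0, 0.0, 0.0, 0.0, 255.0, 255.0]
    stego = embed(cover, secret)
    recovered = extract(stego, cover)
    print([round(v) for v in recovered])
```

As in the decoding steps above, extraction needs the original cover image; the sketch makes that dependency explicit through the `cover` argument of `extract`.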

4 Results and Analysis

The proposed steganography scheme was tested on different cover images of size 256 × 256 with a secret message (i.e., an encrypted message image of size 256 × 256), as shown in Table 1. Figure 5 shows the results of the proposed scheme. The scheme is evaluated on different image quality parameters [7, 11]: Peak Signal-to-Noise Ratio (PSNR), Normalized Cross-Correlation (NCC), Average Difference (AD), and Structural Content (SC). From the table, it can be observed that the PSNR is above 41 dB for all cover images. A PSNR of more than 30 dB indicates good visual quality of the image [12]; thus, an attacker cannot visually identify whether an image is a stego image or not.
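The four quality measures can be computed as in the following Python sketch. The formulas are the definitions commonly used in the image-quality literature and are assumed here, since the text does not state the exact expressions it uses; the function name and the 8-bit peak value of 255 are likewise assumptions.

```python
import math


def quality_metrics(cover, stego, peak=255.0):
    """Return PSNR (dB), NCC, AD and SC for two equally sized images.

    cover, stego : 2D lists of pixel values
    peak         : maximum pixel value (255 assumed for 8-bit images)
    """
    x = [float(v) for row in cover for v in row]
    y = [float(v) for row in stego for v in row]
    n = len(x)

    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / n
    psnr = float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
    ncc = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    ad = sum(a - b for a, b in zip(x, y)) / n
    sc = sum(a * a for a in x) / sum(b * b for b in y)
    return {"PSNR": psnr, "NCC": ncc, "AD": ad, "SC": sc}


if __name__ == "__main__":
    cover = [[100, 150], [200, 250]]
    stego = [[101, 149], [202, 250]]
    for name, value in quality_metrics(cover, stego).items():
        print(name, round(value, 4))
```

With these definitions, NCC and SC close to 1 and AD close to 0 indicate that the stego image is close to the cover image.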


Table 1 Performance of proposed technique at different image quality parameters

                                Proposed technique
Cover image     Secret image    PSNR       NCC        AD        SC
Lenna.Tiff      Message.jpg     41.6892    −0.4215    0.9939    0.9969
Flower.jpg      Message.jpg     41.8758    −0.4216    0.9940    0.9969
Peppers.Tiff    Message.jpg     41.4321    −0.4215    0.9940    0.9969
Goldhill.jpg    Message.jpg     41.9740    −0.4215    0.9935    0.9966

Fig. 5 Results of proposed steganography technique

5 Conclusion

The use of cryptography in the proposed steganography technique adds security and robustness. The secret information is invisible inside the stego image because the Daubechies wavelet yields good visual quality in the stego image. From the results, it can be clearly observed that the AD is very small and that the NCC and SC are also excellent. Future work will focus on advanced transforms such as the contourlet transform and the curvelet transform.


References

1. Feng, B., Wei, L., Sun, W.: Secure binary image steganography based on minimizing the distortion on the texture. IEEE Trans. Inf. Forensics Secur. 10(2), 243–255 (2015)
2. Sharma, V.K., Srivastava, D.K., Mathur, P.: A study of steganography based data hiding techniques. Int. J. Emerg. Res. Manag. Technol. 145–150 (2017)
3. Shejul, A.A., Kulkarni, U.L.: A DWT based approach for steganography using biometrics. In: IEEE International Conference on Data Storage and Data Engineering, pp. 39–43, Feb 2010
4. Swain, G., Lenka, S.K.: Steganography using the twelve square substitution cipher and an index variable. In: 2011 3rd IEEE International Conference on Electronics Computer Technology (ICECT), pp. 84–88 (2011)
5. Sharma, V.K., Srivastava, D.K.: Comprehensive data hiding technique for discrete wavelet transform-based image steganography using advance encryption standard. In: Computing and Network Sustainability, pp. 353–360. Springer (2017)
6. Shrestha, A., Timalsina, A.: Color image steganography technique using Daubechies discrete wavelet transform. In: 9th International Conference on Software, Knowledge, Information Management and Applications (SKIMA), pp. 9–15 (2015)
7. Prabakaran, G., Bhavani, R.: A modified secure digital image steganography based on discrete wavelet transform. In: International Conference on Computing, Electronics and Electrical Technologies, pp. 1096–1100, March 2012
8. Archana, S., Antony Judice, A., Kaliyamurthie, K.P.: A novel approach on image steganographic methods for optimum hiding capacity. Int. J. Eng. Comput. Sci. 2, 378–385 (2013)
9. Khalili, M., Asatryan, D.: Colour spaces effects on improved discrete wavelet transform-based digital image watermarking using Arnold transform map. IET Signal Proc. 7, 177–187 (2013)
10. Wang, R.-Z., Chen, Y.-S.: High payload image steganography using two-way block matching. IEEE Signal Process. Lett. 13(3), 161–164 (2006)
11. Sehgal, P., Sharma, V.K.: Eliminating cover image requirement in discrete wavelet transform based digital image steganography. Int. J. Comput. Appl. 68(3), 37–42 (2013)
12. Moreno, J., Jaime, B., Saucedo, S.: Towards no-reference of peak signal to noise ratio. Int. J. Adv. Comput. Sci. Appl. 4(1), 123–130 (2013)

A Novel Approach to the ROI Extraction in Palmprint Classification Swati R. Zambre and Abhilasha Mishra

Abstract Biometric Person Identification (BPI) plays an important role in security for the purpose of authentication, as PINs and passwords are not always reliable for verification. Recently, touchless palmprint recognition systems have attracted attention in biometrics because of their flexibility, better personal hygiene, and lower time consumption. However, identification using touchless or pegless images also faces several severe challenges in locating the palm area, such as variations in rotation, shift/size, and complex backgrounds. In this paper, a robust rotation-invariant and size/scale-invariant preprocessing method for touchless palmprints is proposed. The method has been implemented on the standard CASIA and IITD databases, where images are captured in the pegless/touchless scenario with many variations in the rotation as well as the size of the palm.

Keywords Palmprint ⋅ Preprocessing ⋅ Pegless databases ⋅ Touchless database ⋅ Rotation invariant

1 Introduction

Security is made up of three major parts: authentication, authorization, and accountability. Of these three, authentication is very important. Authentication is the process of verifying the identities of communicating equipment or of users. A personal identification system is essential to enable authentication, and biometric systems are considered the most efficient for this purpose.

S. R. Zambre (✉) ⋅ A. Mishra Department of Electronics and Telecommunication, Maharashtra Institute of Technology, Aurangabad, India e-mail: [email protected] A. Mishra e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_67


Examples of existing biometric technologies include fingerprint, DNA, face, iris, retina, voice, and palmprint recognition. All of these provide secure and convenient identification systems, but each has its strengths and weaknesses. Considering the number of users in a particular organization, biometric systems based on fingerprint and palmprint recognition are widely used. Manual workers and older people may not provide clear fingerprints because of the condition of their skin; hence, more research has been carried out on palmprint recognition. The palmprint is relatively more reliable than the other available traits. It uses a much lower resolution, making computation faster in both preprocessing and feature extraction. Its advantages over the others include high distinctiveness, high permanence, high performance, medium collectability, medium acceptability, and medium universality [1]. The palmprint refers to the inner surface of the hand between the fingers and the wrist. Palmprint images can be captured at low resolution and are very distinctive; some of the important features of palmprints include principal lines, wrinkles, ridges, and delta points [2]. Palmprints have therefore gained a lot of interest because of their uniqueness. Although most of the features can be measured accurately in terms of their structure, challenges remain, such as the need for high acquisition flexibility, high recognition accuracy, high computational efficiency, low time and space consumption, and better personal hygiene, together with large variations in illumination conditions, rotation, and the effect of noise. Most of these challenges depend on ROI extraction. Existing algorithms are based on finding key points as references to extract the ROI, which is very complex to implement. Hence, in this paper, a novel preprocessing method is proposed for palmprint images captured in the touchless/pegless scenario. The method is geometry based, which makes it very simple to implement. A combination of transform-based and spatial-domain techniques has been used for feature extraction, which reduces storage space by selecting only a few significant features at every stage and reduces the effect of illumination, and a simple kNN is used for classification.

2 Literature Reviews

Very few researchers have worked on the preprocessing of touchless palmprints. Han et al. [3] proposed a novel method for extracting the region of interest under unconstrained scenes by detecting the shape and color of the palm. Feng et al. [4] designed a palmprint preprocessing method for their own real-time touchless palmprint database by introducing various key points, with an accuracy of 93.8%. Ito and Aoki [5] surveyed recent advances in biometrics, including all existing preprocessing methods, and also developed a new preprocessing method for images with no gap between the fingers based on the extraction of contours and detection of key points. Mokni and Kherallah [6] extracted the region of interest by applying various image processing techniques, including Otsu’s method to binarize the original image, boundary extraction, noise reduction and hole elimination using smoothing filters, and locating the centroid to extract the key points using the Euclidean distance. Li et al. [7] proposed a new ROI extraction method that considers the distance between corner points and the contour centroid. Tamrakar and Khanna [8] applied a competitive coding scheme on the wavelet approximation and classified with a kNN classifier using a variable k, concluding that the accuracy for k = 1 is higher than for k greater than 1.

3 Palmprint Classification

Palmprint classification consists of four main steps, as shown in Fig. 1.

3.1 Image Acquisition

Images are acquired using various methods and various devices like high resolution/low resolution, online/offline, and using fixed/variable positions of subjects. Once images are obtained, the variety of methods of processing is applied to reduce noise and enhance clarity required. In this paper, images are taken from two standard palmprint databases IIT Delhi Touchless Palm Print Database (V1) [9] including around 1645 images from 235 subjects and CASIA Palmprint Database (V1) [10] including 5502 images from 312 subjects. Images in these databases are captured using offline high resolution and pegless image acquisition modes as shown in Fig. 2.

Fig. 1 Block diagram of palmprint recognition


Fig. 2 a Sample images from IITD touchless palmprint database, b sample images from CASIA palmprint database

4 Proposed Preprocessing Method

Preprocessing is a very important step in pattern recognition; its main focus is to reduce noise in the image and to obtain the region of interest for further processing. The region of interest extracted by this block is a 128 × 128 or 150 × 150 square region at the center of the inner palm. In both databases, images are captured in varying positions: the IITD images are collected using a touchless device, so their orientations vary widely, and in the CASIA database the palmprint images are captured using a pegless device, so there are no fixed positions. Very large changes in image scale and rotation occur between the different captured samples of each subject, and it is difficult to determine the key points for ROI extraction in every sample. To overcome these challenges, a new method is proposed in this paper, which consists of the following steps:

• Image binarization
• Extraction of palm region
• Establishing a coordinate system
• Rotation of palm
• Extraction of ROI.

4.1 Image Binarization

Binarization transforms the 800 × 600 grayscale images into binary hand-shaped images by thresholding. Several thresholding methods can be used for this purpose, such as the Poisson distribution method or the Otsu method. Here, a Gaussian low-pass filter is first applied to the input grayscale image I(x, y), and thresholding is then performed with the Poisson distribution based minimum error thresholding method [11]. First, the normalized image histogram h(i) is computed, where i denotes the intensity of a pixel in the range 0, …, Imax. The normalized image histogram for the mixture of Poisson distributions is written as

$$h(i) = P_0 \, p(i \mid 0) + P_1 \, p(i \mid 1) \tag{1}$$

where P0 and P1 are the prior probabilities of the background and foreground regions, respectively, and p(i | j), j = 0, 1, are Poisson distributions with means µj. The Poisson mixture parameters for a threshold t are given by

$$P_0(t) = \sum_{i=0}^{t} h(i), \qquad \mu_0(t) = \frac{1}{P_0(t)} \sum_{i=0}^{t} i \, h(i) \tag{2}$$

$$P_1(t) = \sum_{i=t+1}^{I_{max}} h(i), \qquad \mu_1(t) = \frac{1}{P_1(t)} \sum_{i=t+1}^{I_{max}} i \, h(i) \tag{3}$$

The optimal threshold t* is chosen to minimize the following error criterion:

$$t^* = \arg\min_t \left\{ \mu - P_0(t)\left(\ln P_0(t) + \mu_0(t) \ln \mu_0(t)\right) - P_1(t)\left(\ln P_1(t) + \mu_1(t) \ln \mu_1(t)\right) \right\} \tag{4}$$

where µ is the mean intensity of the complete image. The binary image ITh is then obtained using the threshold t*, as shown in Eq. (5) and Fig. 3.

$$I_{Th}(x, y) = \begin{cases} 1 & \text{if } I(x, y) > t^* \\ 0 & \text{if } I(x, y) \le t^* \end{cases} \tag{5}$$

Fig. 3 Binary palmprint image
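A minimal Python sketch of the threshold search in Eqs. (2)–(4) and of the binarization in Eq. (5) is given below. It assumes an unnormalized histogram (pixel counts per intensity) as input, omits the Gaussian low-pass filter, and skips thresholds for which one class is empty or has zero mean, because the logarithms are undefined there. The function names are illustrative and are not taken from the paper.

```python
import math


def poisson_min_error_threshold(hist):
    """Return the threshold t* that minimizes the criterion of Eq. (4).

    hist: list of pixel counts for intensities 0..Imax (assumed input format).
    """
    total = sum(hist)
    weighted = sum(i * c for i, c in enumerate(hist))
    mu = weighted / total                      # mean intensity of the whole image
    best_t, best_cost = None, math.inf
    n0 = w0 = 0                                # running count and intensity sum for i <= t
    for t in range(len(hist) - 1):
        n0 += hist[t]
        w0 += t * hist[t]
        n1, w1 = total - n0, weighted - w0     # same quantities for i > t
        if min(n0, n1, w0, w1) <= 0:
            continue                           # empty class or zero mean: log undefined
        p0, p1 = n0 / total, n1 / total        # P0(t), P1(t)
        mu0, mu1 = w0 / n0, w1 / n1            # mu0(t), mu1(t)
        cost = (mu
                - p0 * (math.log(p0) + mu0 * math.log(mu0))
                - p1 * (math.log(p1) + mu1 * math.log(mu1)))
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


def binarize(image, t):
    """Eq. (5): 1 where the intensity exceeds t, 0 otherwise."""
    return [[1 if v > t else 0 for v in row] for row in image]
```

For a histogram with two well-separated modes, the selected threshold falls in the gap between them, so `binarize` maps the darker mode to 0 and the brighter mode to 1.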

4.2 Extraction of Palm Region

To extract the palm region, the fingers have to be separated from the palm. A circle whose radius is set to 25% of the total palm area is moved column-wise over the whole image, starting from the top-left corner. At each position, it is checked whether the perimeter of the circle lies within the palm area (Fig. 4).


Fig. 4 a Extraction of palm area, b corresponding original palm area

4.3 Establishing the Coordinate System

• Find centroid: Once the palm area is extracted, find its xmin, xmax, ymin, and ymax. Draw axes through xmin and xmax parallel to the x-axis and through ymin and ymax parallel to the y-axis. Find the midpoints xmid and ymid of these axes and join them to get the center point (i, j) of the palm area.
• Find maximum possible inscribed circle region: Draw a circle of initial radius r centered at the palm center point (i, j), using the transformation from rectangular to polar coordinates. By varying the radius, find the maximum possible inscribed circle region, as shown in Fig. 5. The final radius of this circle is denoted "Crad".
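The two steps above can be sketched in Python as follows. This is only an illustration, assuming the palm region is available as a binary mask (a list of rows holding 0/1 values). The function names, the radius step of one pixel, and the number of sampled angles are assumptions and are not specified in the paper.

```python
import math


def palm_center(mask):
    """Center (x, y) as the midpoint of the bounding box of the foreground pixels."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return (min(cols) + max(cols)) // 2, (min(rows) + max(rows)) // 2


def max_inscribed_radius(mask, cx, cy, n_angles=360):
    """Largest integer radius whose sampled circle stays inside the foreground."""
    height, width = len(mask), len(mask[0])
    radius = 0
    while True:
        candidate = radius + 1
        for k in range(n_angles):
            theta = 2 * math.pi * k / n_angles          # polar angle of the sample
            x = int(round(cx + candidate * math.cos(theta)))
            y = int(round(cy + candidate * math.sin(theta)))
            outside = not (0 <= x < width and 0 <= y < height)
            if outside or mask[y][x] == 0:
                return radius                            # candidate leaves the palm
        radius = candidate
```

On a mask containing a 15 × 15 foreground square, the center is the middle of the square and the returned radius is 7, which plays the role of "Crad" in the text.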

Fig. 5 Maximum possible inscribed circle region from centroid


Fig. 6 a Binary image with a sampled circle, b plotted gradient vector of a sampled circle to compare with match vector (figure labels: f1, f2)

Fig. 7 Rotated palm region (figure label: Amid)

4.4 Rotation of Palm

To find the rotation of the palm, draw a sampled circle with a radius approximately 100 pixels greater than "Crad" on the binary image, with an intensity of 150 for each sample, as shown in Fig. 6a. Compute the gradient vector of the sampled points and plot it as shown in Fig. 6b. Set a threshold count of 4 for white pixels and 3 for black pixels. Match the gradient vector pattern against the match vector [0 1 0 1 0]. If a match is found, obtain the finger points f1 and f2. Then find the mid angle Amid between f1 and f2, measured from the center. Rotate the image by Adiff = Amid − π, as shown in Fig. 7.
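The last two steps (mid angle and rotation amount) can be illustrated with a short Python sketch. It assumes that the finger points f1 and f2 and the palm center are already known as (x, y) pairs and that angles are measured with `atan2` in radians. The function name and the bisector computation via summed unit vectors are choices made for this example, not details from the paper.

```python
import math


def rotation_angle(center, f1, f2):
    """Return (a_mid, a_diff): the mid angle between f1 and f2 and Amid - pi."""
    a1 = math.atan2(f1[1] - center[1], f1[0] - center[0])
    a2 = math.atan2(f2[1] - center[1], f2[0] - center[0])
    # Bisector of the two directions; summing unit vectors avoids wrap-around at +/- pi.
    a_mid = math.atan2(math.sin(a1) + math.sin(a2), math.cos(a1) + math.cos(a2))
    return a_mid, a_mid - math.pi
```

For example, finger points at 45° and 135° around the center give a mid angle of 90°, so the rotation amount is −90°. The bisector is undefined when the two points lie in exactly opposite directions.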

4.5 Extraction of ROI

Find the square region from the maximum possible inscribed circle and resize it to 150 × 150 as shown in Fig. 8.
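A possible reading of this step is sketched below in Python. It assumes that the square is the one inscribed in the circle of radius Crad (side Crad·√2) and that resizing uses nearest-neighbour sampling. Neither choice is stated in the paper, and the function name is illustrative.

```python
import math


def extract_roi(image, cx, cy, c_rad, out_size=150):
    """Crop the square inscribed in the circle (cx, cy, c_rad) and resize it."""
    side = int(c_rad * math.sqrt(2))              # side of the inscribed square (assumption)
    x0, y0 = cx - side // 2, cy - side // 2       # top-left corner of the crop
    crop = [row[x0:x0 + side] for row in image[y0:y0 + side]]
    # Nearest-neighbour resize to out_size x out_size.
    return [[crop[r * side // out_size][c * side // out_size]
             for c in range(out_size)]
            for r in range(out_size)]
```

Calling `extract_roi(image, i, j, Crad)` with the default `out_size` yields the 150 × 150 region described above.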


Fig. 8 150 × 150 square ROI region

5 Feature Extraction and Classification

To obtain traits that distinguish different persons, the images are analyzed at multiple resolutions with different translation and scaling factors using the DWT, which captures precise variations. The result is then segmented into several modules and compressed using the DCT. The DC coefficient is discarded to reduce the effect of illumination, and, for optimization, only 32 × 32 image features are fed into PCA. PCA further reduces the feature dimension and hence the storage space required; only 200 PCA features are considered for classification. Various DWT filters have been tested, and the Db2 filter shows the highest accuracy. Hence, features are extracted using DWT, DCT, and PCA [12]. Classification is done using a simple kNN classifier, which finds the nearest neighbors of a sample to classify the subject.
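The final classification step can be illustrated with a small k-nearest-neighbour sketch in Python. It covers only the kNN stage and assumes that feature vectors (for instance the 200 PCA features) have already been computed. Euclidean distance and majority voting are assumptions, since the paper does not state the distance measure.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Label of `query` by majority vote among its k nearest training samples."""
    # Sort training samples by Euclidean distance to the query.
    neighbours = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]
```

With k = 1, as used for Table 3, the query simply receives the label of its closest training sample.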

6 Results

The simulation was carried out in MATLAB R2012a on a Compaq Presario C700 laptop with an Intel(R) Pentium(R) Dual CPU T2370 at 1.73 GHz, 2.00 GB of RAM, and the 64-bit Windows 7 operating system. The proposed preprocessing method has been applied to around 2000 images from two databases: database D1 contains IITD palmprint images and D2 contains CASIA palmprint images. Each ROI extraction takes 3.848782 s. A few of the extracted palm regions of interest are shown in Fig. 9. The overall accuracy of the proposed algorithm for both databases is shown in Table 1, and a comparative analysis with the existing state of the art for the two databases is shown in Tables 2 and 3.


Fig. 9 150 × 150 square ROI region

Table 1 Results for all databases

Database   Accuracy (%)   Time (s)   Memory (Kb)
D1         98             8.27       26,877
D2         94.66          10.7638    2217

Table 2 A comparative analysis of the existing state of art and the proposed method for IITD Database

Database   Method                                       Accuracy (%)
D1         Proposed preprocessing method: DWT level 3   97
D1         DWT level 3 [12]                             80.00

Table 3 A comparative analysis of the existing state of art and proposed method for CASIA Database when kNN Classifier is used for k = 1

Database   Method                         Accuracy (%)
D2         Proposed method                94.66
D2         Method by Raouia Mokni [13]    93.20


7 Conclusion

In this work on palmprint classification, a new, simple preprocessing method is proposed for the IITD and CASIA palmprint databases. The ROI is detected using very few key points, and additional key points are used to make the method rotation invariant. Features are extracted using a combination of DWT (Db2), DCT, and PCA to save space and reduce the effect of illumination. A comparative analysis with the existing state of the art shows that the computational complexity of this method is lower than that of the other methods. For both the touchless IITD database and the pegless CASIA database, the method achieves higher accuracy than the compared state-of-the-art methods (Tables 2 and 3). The accuracy can be improved by further modification, and the method can also be generalized to palms with closed fingers. Hence, a robust, simple, and less time-consuming system has been proposed in this paper.

References

1. Jain, A.K., Ross, A., Prabhakar, S.: An introduction to biometric recognition. IEEE Trans. Circuits Syst. Video Technol. 14(1), 4–20 (2004). https://doi.org/10.1109/tcsvt.2003.818349
2. Zhang, D.: Palmprint Authentication. Kluwer Academic Publishers, USA (2004)
3. Han, Y., Sun, Z., Wang, F., Tan, T.: Palmprint recognition under unconstrained scenes. In: Yagi, Y., Kang, S.B., Kweon, I.S., Zha, H. (eds.) Computer Vision – ACCV 2007. Lecture Notes in Computer Science, vol. 4844. Springer, Berlin, Heidelberg (2007). https://doi.org/10.1007/978-3-540-76390-1_1
4. Feng, Y., Li, J., Huang, L., Liu, C.: Real-time ROI acquisition for unsupervised and touch-less palmprint. World Acad. Sci. Eng. Technol. Int. J. Comput. Inf. Eng. 5(6) (2011). https://doi.org/10.1999/1307-6892/8883
5. Ito, K., Aoki, T.: Recent advances in biometric recognition. Inst. Image Inf. Telev. Eng. 6(1), 64–80 (2018). https://doi.org/10.3169/mta.6.64
6. Mokni, R., Kherallah, M.: Lecture Notes in Computer Science, vol. 9887, p. 259 (2016). ISSN: 0302-9743, ISBN: 978-3-319-44780
7. Li, H., Guo, Z., Ma, S., Luo, N.: A new touchless palmprint location method based on contour centroid. In: 2011 International Conference on Hand-Based Biometrics, Bandung, Indonesia (2011). https://doi.org/10.1109/ichb.2011.6094306
8. Tamrakar, D., Khanna, P.: Analysis of palmprint verification using wavelet filter and competitive code. In: 2010 International Conference on Computational Intelligence and Communication Systems (2010). https://doi.org/10.13140/rg.2.1.4393.1124
9. IITD Touchless palmprint Database (v1). http://www.comp.polyu.edu.hk/~csajaykr/IITD/Database_Palm.htm
10. CASIA Palmprint Database. http://www.idealtest.org/dbDetailForUser.do?id=5
11. Al-Kofahi, Y., Lassoued, W., Lee, W., Roysam, B.: Improved automatic detection and segmentation of cell nuclei in histopathology images. IEEE Senior Member 841–852 (2010). https://doi.org/10.1109/tbme.2009.2035102
12. Vaidehi, K., Subashini, T.S.: Transform based approaches for palmprint identification. Int. J. Comput. Appl. (0975 8887) 41(1), 1 (2012). https://doi.org/10.5120/5502-7496
13. Mokni, R., Kherallah, M.: Novel palmprint biometric system combining several fractal methods for texture information extraction. In: Systems, Man and Cybernetics (SMC), 2016 IEEE International Conference on, pp. 002267–002272 (2016). https://doi.org/10.1109/SMC.2016.7844576

A Novel Video Genre Classification Algorithm by Keyframe Relevance

Jina Varghese and K. N. Ramachandran Nair

Abstract Video classification is one of the challenging areas of current research. It is a necessary tool for the systematic organization and efficient retrieval of videos from repositories. Video classification is generally a complex operation, since video is a composite medium with different components. Here, we propose a novel and simple probabilistic approach to classify videos broadly into three major domains: news, sports, and entertainment. The existence measures of the respective scene types in the video genres are the prominent factor used in the proposed approach. We have tested our work on test videos such as football, wedding, and news discussion videos, and the results are good.

Keywords SBD (shot boundary detection) ⋅ Key frame extraction ⋅ Scene classification ⋅ Video classification

J. Varghese (✉)
Department of Computer Science, Mahatma Gandhi University, Kottayam, Kerala, India
e-mail: [email protected]

K. N. Ramachandran Nair
ViswaJyothi College of Engineering and Technology, Vazhakulam, Kerala, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_68

1 Introduction

Multimedia is a relatively recent field of communication, and several tools for processing it have emerged in the last decade. Videos are the most popular and useful type of multimedia. Online and offline repositories hold massive amounts of video in their archives, and the amount of digital video is growing exponentially. These repositories struggle to manage such volumes of data. The difficulty arises because any single video may contain composite types of events. No authoritative rule about the event types for a specific video category is available, because the broad categories of events, their combinations within and across categories, and their relations make it infeasible to produce such rules. Proper classification of videos is a preliminary step toward the effective management of video archives.

A video can be treated as a collection of related events together with the set of objects performing those events. Video classification procedures generally start by classifying the objects, then the actions performed by those objects, and finally the associated events. This series of procedures involves many low-level operations (such as low-level feature extraction, classification at each stage, and final decision-making), so it is a complex operation. Moreover, these operations do not reflect the way a human classifies a video. Here, we propose a model that classifies a video from the perspective of a human being. Initially, a scene model is created, which is used in the subsequent stages of the procedure. The presence of each scene type in each video category is measured and is used at a later stage for the final video classification. The scene model is created using a high-level feature extracted from the video frames.

This paper is organized as follows. Previous work on video classification is reviewed in Sect. 2. The proposed method of video classification is given in Sect. 3. Results obtained in the training and testing phases are presented in Sect. 4. The conclusion of the work is presented in Sect. 5.

2 Literature Works

A survey of the literature on automatic video classification is reported in [1], which explains several text-based, audio-based, and visual-based techniques. Papers [2, 3] also describe some video classification procedures. Human action recognition is a major part of video event classification systems, and a systematic review of human activity analysis is given in [4]. As any classification procedure relies on its feature set, the features extracted from video play a significant role in video classification. Both frame-level and temporal features are available in the literature. SIFT [5], HOG [6], and LBP [7] are some of the frame-level features used. Since video is a time-varying signal, temporal features have attracted more attention than frame-level features. One major temporal feature is Spatiotemporal Interest Points (STIP), which is used by Columbia Consumer Video Analysis (CCV).1 Other temporal features for video classification are Cuboid [8] and Hessian [9].

Paper [10] uses a semantic-visual knowledge base for video event recognition. In [11], histograms of motion gradients are integrated into a video classification framework, which allows real-time performance. The authors of [12] propose a learning framework for multimedia event detection that generates a specific intermediate representation of videos from low-level features; this intermediate representation is automatically optimized together with the classifier. Concept classifiers are used in [13] to evaluate the semantic correlation of each concept with respect to the event of interest. In [14], a codebook is generated from the training set of videos with STIP computed across frames. A bag of action words is then generated by encoding the first-order statistics of the visual words, and one-against-all support vector machine classifiers are trained using these codebooks. The concept-based video classification system [15] first classifies the concepts and then classifies the video according to the concepts present in it. Concepts such as handshake, running, earthquake, and parade have been defined in the literature. If prior knowledge about the domain of the video categories is available, the classification procedure [16, 17] is more manageable.

Many authors have developed algorithms for video genre classification. The authors of [18] use low-level audiovisual cues together with cognitive and structural information to classify types of TV programs and YouTube videos. A movie genre classification system is proposed in [19]. It uses Self-Adaptive Harmony Search (SAHS) to find local features for the corresponding movie genres, and the majority voting method is used to predict the genre of each movie.

1 http://www.ee.columbia.edu/ln/dvmm/CCV/.

3 Proposed Method

Here, we propose an algorithm that takes a different approach. Its logic is based on the general characteristics of video types. A football event will never occur against a wedding scene background. Similarly, a news discussion will not contain more wedding scenes than studio scenes. This behavior of videos is exploited here for video genre classification. The significance of each background scene type (used interchangeably with background image or frame) for each video category is evaluated for video classification. The proposed work has two phases: a preprocessing phase and a video classification phase. The support vectors developed in the first phase are used in the second phase. For ease of illustration, we consider only sports, news, and entertainment videos. The background scene categories are sports ground, sports gallery, studio, and wedding scenes. The diagrammatic representation of the entire procedure is shown in Figs. 1 and 2.

Fig. 1 Diagrammatic representation of the overall work

Fig. 2 Suboperations in processing videos (training/input video(s))

A trained model (S) for scene classification is created in the preprocessing phase using a training dataset. This model is then used by the video classification phase to classify the frames of the video into their proper categories. During the second phase, a set of training videos is processed to estimate the Measure of Existence (ME). The given input video is then processed in the same way to estimate the presence of each scene type in it (called Relevance). ME and Relevance are computed identically; the output of the video processing module is termed ME for training videos and Relevance for the input video. The operations performed on each video are shown in Fig. 2. The shots in the video are extracted, key frames are selected from each shot, and the key frames are classified into their proper class using the scene model. The result of the video processing operations is the ratio of the count of each scene type to the total number of frames in the video. The similarity between the training videos and the input video is measured as the cosine similarity between their ME and Relevance.

3.1 Preprocessing

Here, we create a trained model for scene classification. This model is then used to classify the frames of a video into their appropriate scene types. The scene (background image) types associated with each video category are predefined. In this phase, the features for scene classification are extracted and used for training.

3.1.1 Extraction and Training of Scene Features

Scene recognition/classification is a growing field of image processing. Several features, such as SIFT, gist, color, texture, and interest points, are available in the literature for scene classification. We use the gist feature because it gave better results than the other features in our approach.

Description of Gist: The gist feature is reported in [20]. Its authors treat a scene as an individual object. The feature eliminates the need for object and region segmentation in image recognition, which is a desirable quality for our approach. The gist feature represents the spatial structure of the scene (image background), to which human observers give more importance. The perceptual properties of an image, namely openness, naturalness, roughness, expansion, and ruggedness, represent this spatial structure, and they can be described with simple computations. A scene with a long boundary line and a lack of visual references may be an open scene such as a coast or a highway. The edges of natural scenes are more widely distributed than the vertical edges of man-made scenes. Roughness refers to the complexity of the image. Expansion is the depth gradient of the space; a street with long vanishing lines has high expansion. Ruggedness is the deviation from the ground; for natural environments, the degree of ruggedness is high. These properties are measured using filters to create the feature vector of the frame. In short, the gist summarizes the gradients of parts of the image.

The gist feature of an input image I is calculated as follows. The image is first convolved with 32 Gabor filters (4 scales × 8 orientations) to produce 32 feature maps. The gist feature is then the concatenation of these 32 feature maps computed over 16 regions of the image. Next, a model for scene classification is trained with the extracted gist features. Specifically, the gist features of 500 images of studio, sports gallery, sports ground, and wedding scenes are extracted and used for training to create the scene model S. We use the SUN dataset [21], one of the most popular datasets for scene classification.
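The text does not spell out how the 16 regions are combined with the 32 filter responses. A common reading, assumed in the following Python sketch, is that each feature map is averaged over a 4 × 4 grid of regions and the averages are concatenated, which gives 32 × 16 = 512 values. The Gabor filtering itself is omitted, and the function names are illustrative.

```python
def grid_means(feature_map, grid=4):
    """Average a 2-D feature map over a grid x grid layout of regions."""
    height, width = len(feature_map), len(feature_map[0])
    means = []
    for gy in range(grid):
        for gx in range(grid):
            ys = range(gy * height // grid, (gy + 1) * height // grid)
            xs = range(gx * width // grid, (gx + 1) * width // grid)
            cell = [feature_map[y][x] for y in ys for x in xs]
            means.append(sum(cell) / len(cell))
    return means


def gist_descriptor(feature_maps, grid=4):
    """Concatenate the region averages of all feature maps into one vector."""
    return [m for fmap in feature_maps for m in grid_means(fmap, grid)]
```

With 32 feature maps and the default 4 × 4 grid, the descriptor has 512 entries, 16 per filter response.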

3.1.2 Support Vector Machine

The gist features extracted from the SUN database in the previous step are given to a support vector machine to build a scene model for background image (scene) classification. This model is used in both phases of the algorithm to classify the key frames of video shots. Training is done with LIBSVM (libsvm 3.21, MATLAB version) in a fivefold scheme. The model uses a quadratic kernel function, which gives good accuracy for scene classification. Training uses only the 150 principal components extracted from the training set by Principal Component Analysis (PCA).
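For reference, a quadratic kernel is a polynomial kernel of degree 2. LIBSVM parameterizes its polynomial kernel as (gamma · uᵀv + coef0)^degree. The sketch below evaluates this expression for degree 2 in plain Python; the values of gamma and coef0 are placeholders, since the paper does not report them.

```python
def quadratic_kernel(u, v, gamma=1.0, coef0=1.0):
    """Polynomial kernel of degree 2: (gamma * <u, v> + coef0) ** 2."""
    dot = sum(a * b for a, b in zip(u, v))   # inner product of the two feature vectors
    return (gamma * dot + coef0) ** 2
```

In the setting described above, u and v would be the 150-dimensional PCA projections of two gist descriptors.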

3.1.3 Measure of Existence Calculation

Once the scene types are trained, the existence of each scene type in each video genre is measured. This is done by first classifying each frame of the video into the appropriate scene type and then computing the fraction of frames classified into that scene type over the total number of frames. For example, the percentages of ground, gallery, wedding, and studio scenes in football videos are calculated, and likewise the percentage of every scene type in every other video category. The Measure of Existence is calculated over a training dataset of 25 videos per video category. Let ME denote the Measure of Existence. The ME of the ith scene category SCi in the jth video Vj is written ME(SCi, Vj). To reduce cost, calculations are done only for the key frames identified in each shot. Key frames are the frames where major changes occur in the shot. The procedure for key frame extraction is described below. ME is calculated as per Eq. (4). It is calculated over each shot of the video; that is, the ME of a video is the sum of the ME of all its shots, as per Eq. (1), where VShotj,k denotes the kth shot of video Vj. Video shots are extracted using the procedure described in [22].

$$ME(SC_i, V_j) = \sum_{k=1}^{\text{No. of shots}} ME(SC_i, VShot_{j,k}) \tag{1}$$

For shot extraction and key frame selection, we use the approach explained in [22]. It detects the boundary between shots by evaluating the similarity among frames through the Frame Similarity Measure (FSM) feature. This feature estimates the similarity between two consecutive frames by checking whether the pixels of one frame have moved to their respective neighborhood positions in the next frame. If most of the pixels are not found in their neighborhood positions in the next frame, the two frames are candidates for boundary frames. Movements of pixels to their neighborhood positions are mostly due to camera and/or object motion and are therefore not treated as shot boundaries. The count of pixels moved to their neighborhood positions is then checked against a threshold (determined dynamically from the FSM feature) to declare the shot boundaries. From each shot, a predefined number of frames (three) with the maximum FSM values are selected as key frames. Three frames are sometimes not sufficient to represent a shot with more than 50 frames. Therefore, we redefine the number of key frames for such shots according to Eqs. (2) and (3) to ensure adequate participation from the whole video. Shots with up to 3 frames are taken entirely.

$$n_1 = \frac{\text{Length of the shot}}{50} \tag{2}$$

$$n = n_1 \times \text{Predefined number of key frames} \tag{3}$$
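The key frame budget of Eqs. (2) and (3) can be sketched in Python as follows. The text does not say how a fractional n1 is handled; rounding up is an assumption of this example, as are the function and parameter names.

```python
def key_frame_count(shot_length, base=3, block=50):
    """Number of key frames to take from a shot of `shot_length` frames."""
    if shot_length <= base:
        return shot_length            # very short shots are taken entirely
    if shot_length <= block:
        return base                   # default: three key frames per shot
    n1 = -(-shot_length // block)     # Eq. (2), rounded up (assumption)
    return n1 * base                  # Eq. (3)
```

Under these assumptions, a shot of 120 frames gives n1 = 3 and therefore 9 key frames, while a shot of 40 frames keeps the default of 3.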

Once the key frames are extracted from the shots, they are classified into their appropriate scene type by the trained scene model. When all the key frames are classified, the Measure of Existence of each scene in the particular video is calculated according to Eq. (4).

$$ME(SC_i, V_j) = \frac{\text{Number of key frames classified as } SC_i}{\text{Total number of key frames in } V_j} \tag{4}$$

This gives the measure of existence of a scene type SCi in the video Vj. The average value of ME over the 25 videos is taken and used in the subsequent procedures. The ME values of all scene categories in a video category add up to 1. The Measure of Existence of each scene category is shown in Table 1.

Table 1 Measure of existence calculated in the preprocessing stage

Video type             Shots   Key frames   Scene     Count   ME
News videos            13      552          Studio    254     0.4547
                                            Wedding   64      0.1176
                                            Gallery   185     0.3370
                                            Ground    49      0.0907
Entertainment videos   70      550          Studio    2       0.0045
                                            Wedding   386     0.6430
                                            Gallery   39      0.0844
                                            Ground    123     0.2680
Sports videos          114     1015         Studio    0       0
                                            Wedding   0       0
                                            Gallery   276     0.2637
                                            Ground    739     0.7367

The last column in Table 1 is the average value obtained over the 25 videos, and it states the ME of each scene per video genre. Each segment of Table 1 (4 rows) holds the values obtained during preprocessing. According to Table 1, 25 news videos are taken for preprocessing, and 13 shots and 552 key frames are extracted from them. The approach classified 254 key frames as studio scenes, 64 as wedding scenes, 185 as sports gallery scenes, and 49 as sports ground scenes. A news video may report many events of the day, including sports, which explains the large values for the sports scene types. The ME of each scene category for the news video type is given in the last column of the table. According to Table 1, a sports video contains about 73.7% ground scenes and 26.4% gallery scenes, and no studio or wedding scenes. Entertainment videos have a negligible percentage of studio scenes.
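Equation (4) and the averaging over training videos can be illustrated with a few lines of Python. The sketch assumes that the scene model has already assigned one label to every key frame; the label strings and function names are illustrative only.

```python
from collections import Counter

SCENES = ("studio", "wedding", "gallery", "ground")


def measure_of_existence(key_frame_labels):
    """Eq. (4): fraction of key frames assigned to each scene type."""
    counts = Counter(key_frame_labels)
    total = len(key_frame_labels)
    return [counts[scene] / total for scene in SCENES]


def average_me(videos):
    """Average the per-video ME vectors over a set of training videos."""
    per_video = [measure_of_existence(labels) for labels in videos]
    return [sum(column) / len(per_video) for column in zip(*per_video)]
```

For a video with 27 gallery and 42 ground key frames, the result is approximately [0, 0, 0.3913, 0.6087]. The same computation applies to an input video, in which case the output is the relevance vector.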

3.2 Video Classification

The previous operations calculate the measure of existence of each scene type in each video genre from a set of training videos. This section classifies a given input video into the appropriate video genre. The procedure includes shot boundary detection [22], key frame selection, classification of key frames into scene types, relevance estimation of each scene type in the input video, and final decision-making. The scene model and ME are the parameters used in this phase. Once the key frames of each shot in the given input video are classified, the relevance of each scene type in the input video is calculated according to Eq. (5).


Table 2 Relevances of each input video category

Input video   Shots   Key frames   Scene     Count   Relevance
Football      8       69           Studio    0       0
                                   Wedding   0       0
                                   Gallery   27      0.3913
                                   Ground    42      0.6087
Football1     16      81           Studio    0       0
                                   Wedding   0       0
                                   Gallery   0       0
                                   Ground    81      1
KWW           13      75           Studio    0       0
                                   Wedding   75      1
                                   Gallery   0       0
                                   Ground    0       0
News          1       60           Studio    51      0.8500
                                   Wedding   0       0
                                   Gallery   9       0.1500
                                   Ground    0       0.0001

Let the relevance of scene SCi in the input video Vin be denoted R(Vin, SCi). It is computed in the same way as Eq. (4):

$$R(V_{in}, SC_i) = \frac{\text{Number of key frames classified as } SC_i}{\text{Total number of key frames in } V_{in}} \tag{5}$$

In other words, it is the measure of existence of the scene SCi in the input video Vin. Four input videos are tested, and the obtained relevance measures are detailed in Table 2, which gives the proportion of each scene category in the four input videos. Most key frames in the football videos are classified correctly as either ground or gallery, and the presence of other scene types is negligible there. The input video of the Kate William Wedding (KWW) has more wedding scenes than any other scene type. The news video deviates slightly because it contains sports clips. Because the key frames are classified reliably, all four input videos are classified correctly, as shown in Table 3.

3.2.1 Similarity Estimation

We use the Measure of existence from the preprocessing phase and the relevance measure from the current phase as inputs for final decision making.

Table 3 Similarity estimates of input video

Input video            News     Entertainment   Sports
Football               0.1566   0.3829          0.9418
Football1              0.4432   0.3875          0.9742
Kate William Wedding   0.2024   0.9168          0.0012
News                   0.8657   0.0287          0.0596

Here, we use the cosine similarity measure to decide the category into which the input video is classified. We consider the two parameters, ME and Relevance, as t-dimensional points in a vector space, where t is the number of scene types used in preprocessing. Cosine similarity measures the cosine of the angle between two vectors (here, ME and Relevance). Its main advantage over other comparison metrics is that it measures the orientation of the vectors and not their magnitude; the metric tells how related two videos are by looking at the angle instead of the magnitude. The cosine similarity of two vectors 𝐚 and 𝐛 is calculated by Eq. (6). The cosine of a small angle is close to one. In the context of video classification, a small angle between the ME of a video category Vcat and the Relevance of the input video Vin implies that the two vectors point in nearly the same direction.

$$\cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\| \, \|\mathbf{b}\|} \tag{6}$$

The similarity of each input video to the different video categories is shown in Table 3. Rows in the table represent input videos, and the columns give the similarity with each video category. The similarity of the Kate William Wedding video to the Entertainment category is 0.9168, which is the maximum among its similarities to all categories. The Football and Football1 videos are classified as sports videos, which they actually are. The news discussion video is classified as a News video. Each video is assigned to the category with the maximum similarity value in its row of the table, and all four assignments are correct.
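The decision rule can be sketched in Python as follows. The ME vectors are copied from Table 1 in the order (studio, wedding, gallery, ground); the function names and the dictionary layout are assumptions made for this illustration.

```python
import math

# ME per genre from Table 1, ordered as (studio, wedding, gallery, ground).
ME = {
    "News":          [0.4547, 0.1176, 0.3370, 0.0907],
    "Entertainment": [0.0045, 0.6430, 0.0844, 0.2680],
    "Sports":        [0.0,    0.0,    0.2637, 0.7367],
}


def cosine_similarity(a, b):
    """Eq. (6): cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def classify(relevance):
    """Return the genre with the highest cosine similarity, plus all similarities."""
    sims = {genre: cosine_similarity(me, relevance) for genre, me in ME.items()}
    return max(sims, key=sims.get), sims
```

For a relevance vector that contains only wedding scenes, such as the KWW video in Table 2, `classify([0, 1, 0, 0])` selects Entertainment with a similarity of about 0.916, close to the value reported in Table 3.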


Table 4 Performance against referred work [23] (all values in %)

              Precision                  Sensitivity/Recall         Specificity
Video genre   Ref. [23]   Proposed work  Ref. [23]   Proposed work  Ref. [23]   Proposed work
News          88          99.2           88          46.01          92          99.8
Sports        84          88.7           79          100            92          95.7
Movies        77          85.7           72          100            93          95.9

Fig. 3 Comparison with the referred work

4 Results

We compared the proposed video classification approach with the work in [23], which classifies three major genres of videos, namely news, sports, and movies. We classified the same three genres using the proposed approach. The comparison is shown in Table 4 and plotted in Fig. 3; the values for the referred work are taken directly from [23]. The proposed approach gives higher precision, recall, and specificity than [23] in every case except recall for news videos (46.01 versus 88). The quality of the video genre classification is measured by precision, sensitivity/recall, and specificity. These parameters are calculated from four basic quantities: TP (True Positive), FP (False Positive), FN (False Negative), and TN (True Negative). The equations are described in [23].
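For convenience, the standard definitions of the three measures are sketched below in Python. These are the usual textbook formulas and are assumed to match the equations in [23]; the counts in the usage note are hypothetical and do not come from the experiments.

```python
def precision(tp, fp):
    """Share of predicted positives that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Sensitivity: share of actual positives that are detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives that are correctly rejected."""
    return tn / (tn + fp)
```

With hypothetical counts TP = 8, FP = 2, FN = 2, and TN = 88, precision and recall are both 0.8 and specificity is 88/90.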


5 Conclusion

A video can be classified into its proper category from the proportions of the background scene categories it contains. Combinations of scene categories can also occur in videos and can be incorporated into this method in future work. Evaluation on a large-scale database is needed for a better understanding of the method. We have obtained good results with this approach.

Acknowledgements This work has been done as a part of the Ph.D. programme of the author, funded by the UGC MANF fellowship, Government of India.

References 1. Brezeale, D., Cook, D.J.: Automatic video classification: a survey of the literature. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 38(3), 416–430 (2008) 2. Jiang, Y.-G., Bhattacharya, S., Chang, S.-F., Shah, M.: High-level event recognition in unconstrained videos. Int. J. Multimed. Inf. Retr. 2(2), 73–101 (2013) 3. Jiang, Y.-G., Ngo, C.-W., Yang, J.: Towards optimal bag-of-features for object categorization and semantic video retrieval. In: Proceedings of the 6th ACM International Conference on Image and Video Retrieval, CIVR’07, pp. 494–501, New York, NY, USA. ACM (2007) 4. Aggarwal, J.K., Ryoo, M.S.: Human activity analysis: a review. ACM Comput. Surv. 43(3), 16:1–16:43 (2011) 5. Azhar, R., Tuwohingide, D., Kamudi, D., Sarimuddin, Suciati, N.: Batik image classification using sift feature extraction, bag of features and support vector machine. Proced. Comput. Sci. 72, 24–30 (2015). The Third Information Systems International Conference (2015) 6. Uijlings, J.R.R., Duta, I.C., Rostamzadeh, N., Sebe, N.: Realtime video classification using dense HOF/HOG. In: Proceedings of International Conference on Multimedia Retrieval, ICMR’14, pp. 145:145–145:152, New York, NY, USA. ACM (2014) 7. Banerji, S., Verma, A., Liu, C.: LBP and Color Descriptors for Image Classification, pp. 205– 225. Springer, Berlin, Heidelberg (2012) 8. Dollar, P., Rabaud, V., Cottrell, G., Belongie, S.: Behavior recognition via sparse spatiotemporal features. In: 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp. 65–72 (2005) 9. Willems, G., Tuytelaars, T., Gool, L.: An efficient dense and scale-invariant spatio-temporal interest point detector. In: Proceedings of the 10th European Conference on Computer Vision: Part II, ECCV’08, pp. 650–663. Springer, Berlin, Heidelberg (2008) 10. 
Zhang, X., Yang, Y., Zhang, Y., Luan, H., Li, J., Zhang, H., Chua, T.S.: Enhancing video event recognition using automatically constructed semantic-visual knowledge base. IEEE Trans. Multimed. 17(9), 1562–1575 (2015) 11. Duta, I.C., Uijlings, J.R.R., Nguyen, T.A., Aizawa, K., Hauptmann, A.G., Ionescu, B., Sebe, N.: Histograms of motion gradients for real-time video classification. In: 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI), pp. 1–6 (2016) 12. Ma, Z., Yang, Y., Sebe, N., Zheng, K., Hauptmann, A.G.: Multimedia event detection using a classifier-specific intermediate representation. IEEE Trans. Multimed. 15(7), 1628–1637 (2013) 13. Chang, X., Yang, Y., Hauptmann, A.G., Xing, E.P., Yu, Y.-L.: Semantic concept discovery for large-scale zero-shot event detection. In: Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI’15, pp. 2234–2240. AAAI Press (2015)


J. Varghese and K. N. Ramachandran Nair

14. Chattopadhyay, C., Das, S.: Supervised framework for automatic recognition and retrieval of interaction: a framework for classification and retrieving videos with similar human interactions. IET Comput. Vis. 10(3), 220–227 (2016) 15. Naphade, M., Smith, J.R., Tesic, J., Chang, S.-F., Hsu, W., Kennedy, L., Hauptmann, A., Curtis, J.: Large-scale concept ontology for multimedia. IEEE MultiMed. 13(3), 86–91 (2006) 16. Huang, J., Schonfeld, D.: A distributed context-free grammars learning algorithm and its application in video classification. In: 2012 Visual Communications and Image Processing, pp. 1–6 (2012) 17. Si, Z., Pei, M., Yao, B., Zhu, S.C.: Unsupervised learning of event and-or grammar and semantics from video. In: 2011 International Conference on Computer Vision, pp. 41–48 (2011) 18. Ekenel, H.K., Semela, T., Stiefelhagen, R.: Content-based video genre classification using multiple cues. In: Proceedings of the 3rd International Workshop on Automated Information Extraction in Media Production, AIEMPro’10, pp. 21–26, New York, NY, USA. ACM (2010) 19. Huang, Y.-F., Wang, S.-H.: Movie Genre Classification Using SVM with Audio and Video Features, pp. 1–10. Springer, Berlin, Heidelberg (2012) 20. Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145–175 (2001) 21. Xiao, J., Hays, J., Ehinger, K.A., Oliva, A., Torralba, A.: Sun database: large-scale scene recognition from abbey to zoo. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 3485–3492 (2010) 22. Varghese, J., Ramachandran Nair, K.N.: Detecting video shot boundaries by modified tomography. In: Proceedings of the Third International Symposium on Computer Vision and the Internet, VisionNet’16, pp. 131–135, New York, NY, USA. ACM (2016) 23. Hamed, A.A.M., Li, R., Xiaoming, Z., Xu, C.: Video genre classification using weighted kernel logistic regression. Adv. MultiMed. 
2013, 2:2–2:2 (2013)

Reduction of Hardware Complexity of Digital Circuits by Threshold Logic Gates Using RTDs Muhammad Khalid, Shubhankar Majumdar and Mohammad Jawaid Siddiqui

Abstract Logic gates play an important role in digital circuit design, and hardware complexity is a major issue when designing digital circuits. A digital circuit built from conventional logic gates faces challenges such as a large number of components. New logic design concepts that reduce hardware complexity are therefore needed. This paper presents threshold logic gates (TLGs) using resonant tunneling diodes (RTDs) for the reduction of hardware complexity. The proposed TLG is designed using RTD/FET with the help of the MOnostable BIstable Latch Enable (MOBILE) principle. TLGs implemented with RTDs offer electronic features such as high-speed switching capability and the functional versatility to implement functions such as AND, MAJORITY, OR, BUFFER, and X1 * X̄2. However, RTDs are more suitable for implementing threshold logic than Boolean logic using CMOS technology. The RTD model and simulation results are verified using SPICE and are in close agreement with the theoretical conclusions. Further, we compare the transistor count of the proposed and conventional circuits for all functions, and calculate the delay and average power of all functions.

Keywords MOnostable BIstable Latch Enable (MOBILE) · Resonant tunneling diode (RTD) · TLG

M. Khalid (✉) Central Instrumentation Facility (CIF), Jamia Millia Islamia University, Jamia Nagar, New Delhi 110025, India e-mail: [email protected] S. Majumdar Department of Electronics and Communication Engineering, National Institute of Technology (NIT) Meghalaya, Shillong, Meghalaya, India e-mail: [email protected]; [email protected] M. J. Siddiqui Department of Engineering, Aligarh Muslim University, AMU, Aligarh, Uttar Pradesh, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_69


1 Introduction

The basic working principle of TLGs is based on the conventional McCulloch-Pitts (MCP) artificial neuron model [1–4]. TLGs are electronic circuits that combine RTD and FET models. The RTD model has I-V characteristics similar to those in [5]. The basic monostable bistable latch enable (MOBILE) technique is used in the generalized TLG circuit using RTDs [6, 7]. A two-input AND gate and a majority gate have been designed and tested [8, 9], but performance has not been evaluated for all types of RTD-based gates, such as AND, OR, and MAJORITY. Therefore, this paper focuses on the design and performance of all these gates. The rest of the paper is organized as follows: the SPICE circuit model of the RTD is presented in Sect. 2. The generic MOBILE threshold logic gate (TG) is introduced in Sect. 3. The implementation of threshold logic gates is described in Sect. 4. A comparison between RTD-based logic functions and conventional functions is given in Sect. 5. Simulation results and discussion are presented in Sect. 6. The conclusion is reported in Sect. 7.

1.1 Threshold Logic

Threshold logic is the simplest artificial neuron, which computes the weighted sum of its inputs and compares the sum with a threshold value.

$$F(x) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} W_i X_i \ge T \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

where Wi = weights, Xi = inputs, T = threshold, and i = 1, 2, 3, …, n. The TLG/neuron symbol is shown in Fig. 1b.
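Equation (1) can be read directly as a small function. The following is a minimal sketch, not the authors' implementation; the function name `threshold_gate` and the example weights and threshold are assumptions chosen only for illustration.

```python
from typing import Sequence


def threshold_gate(weights: Sequence[float], inputs: Sequence[int], threshold: float) -> int:
    """Eq. (1): return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the paper):
    # two unit weights and a threshold of 2.
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, threshold_gate([1.0, 1.0], [x1, x2], 2.0))
```

With these assumed values the gate outputs 1 only when both inputs are 1, since that is the only case in which the weighted sum reaches the threshold.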

Fig. 1 RTD a symbolic diagram, and b I-V characteristics

1.2 Resonant Tunneling Diode

The schematic symbol of the RTD model and its I-V characteristic are shown in Fig. 1a, b, respectively. The RTD has a nonlinear I-V characteristic that includes a region of negative differential resistance (NDR). The characteristic works as follows. As the voltage across the RTD (Vrtd) increases up to the peak voltage (Vp), the current through the RTD (Irtd) increases up to its peak value IP. As Vrtd increases further, Irtd decreases; this is the NDR region. Beyond the NDR region, Irtd increases again with Vrtd. In other words, the RTD is in a low-resistance state until Irtd reaches IP, then enters the NDR region, and finally reaches a high-resistance state. Useful circuit functionality can be obtained by exploiting this NDR characteristic [7].
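The shape of the characteristic described above can be sketched with a piecewise-linear curve. This is an illustrative toy model only: the peak and valley values below are hypothetical, it is not the SPICE model used in the paper, and it assumes a non-negative voltage.

```python
def rtd_current(v: float, v_peak: float = 0.2, i_peak: float = 1.0,
                v_valley: float = 0.5, i_valley: float = 0.2,
                slope_after: float = 2.0) -> float:
    """Piecewise-linear sketch of an RTD I-V curve (all parameter values are hypothetical)."""
    if v <= v_peak:
        # First positive-resistance region: current rises to the peak.
        return i_peak * v / v_peak
    if v <= v_valley:
        # NDR region: current falls from the peak to the valley.
        return i_peak + (i_valley - i_peak) * (v - v_peak) / (v_valley - v_peak)
    # Second positive-resistance region: current rises again.
    return i_valley + slope_after * (v - v_valley)


def in_ndr_region(v: float, dv: float = 1e-6) -> bool:
    """True where a small increase in voltage lowers the current."""
    return rtd_current(v + dv) < rtd_current(v - dv)


if __name__ == "__main__":
    for v in (0.1, 0.2, 0.35, 0.5, 0.7):
        print(f"V = {v:.2f}  I = {rtd_current(v):.3f}  NDR = {in_ndr_region(v)}")
```

Only the region between the assumed peak and valley voltages reports a negative slope, which is the property the MOBILE circuit relies on.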

1.3 MOnostable BIstable Logic Element (MOBILE)

The MOBILE is a rising-edge-triggered, current-controlled gate that consists of two RTDs driven by a bias voltage Vbias (Fig. 2). Circuit applications of RTDs are mainly based on the MOBILE [8].

1.4 Working of MOBILE

When Vbias is low, both RTDs are in the ON state. As Vbias increases, one RTD switches from the ON state to the OFF state. The output is HIGH if the driver RTD switches and LOW if the load RTD switches. Which RTD switches depends on the peak currents, which are proportional to the RTD areas λ1 and λ2.

Fig. 2 a Basic MOBILE structure, b I-V characteristics of RTD

$$V_{out} = \begin{cases} 0 & \text{if } \lambda_1 < \lambda_2 \\ 1 & \text{if } \lambda_1 > \lambda_2 \end{cases} \qquad (2)$$

Logic functionality can be implemented by the peak current of one of the RTDs, which is controlled by an input.
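Equation (2) and the idea of an input controlling a peak current can be sketched as follows. The function names and the area values are hypothetical; the sketch also assumes that an active input adds its RTD area to the λ1 side, which is consistent with Eq. (2) and with the threshold T = λ2 − λ1 used in Sect. 3.

```python
def mobile_output(lambda1: float, lambda2: float) -> int:
    """Eq. (2): output 0 if lambda1 < lambda2, output 1 if lambda1 > lambda2."""
    if lambda1 == lambda2:
        raise ValueError("Eq. (2) does not define the output for equal areas")
    return 1 if lambda1 > lambda2 else 0


def gated_mobile_output(lambda1: float, lambda2: float, lambda_in: float, x: int) -> int:
    """Assumption: an input x = 1 adds the area lambda_in to the lambda1 side."""
    return mobile_output(lambda1 + x * lambda_in, lambda2)


if __name__ == "__main__":
    # Hypothetical areas chosen only to show the switching behaviour.
    for x in (0, 1):
        print(x, gated_mobile_output(lambda1=1.0, lambda2=2.0, lambda_in=1.5, x=x))
```

With these assumed areas the output follows the input, because the extra area tips the comparison of Eq. (2) only when the input is active.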

2 SPICE Circuit Model of RTD

The RTD model is shown in Fig. 3. In this model, the I-V characteristic is obtained with the help of a tunnel diode (T.D) and is adjusted by a resistor (R1) at the resonance condition set by an inductor (L1) and a capacitor (C1).

Fig. 3 SPICE circuit model of RTD

3 Generic MOBILE Threshold Logic Gates (TG)

The proposed circuit, the generic MOBILE threshold logic gate, is shown in Fig. 4. This circuit is used to implement all logic functions such as AND, MAJORITY, OR, BUFFER, and X1 * X̄2. It acts as any of these gates by adjusting the values of the weights (Wi) and the threshold (T), which are represented in terms of RTD areas. The weights and threshold may be positive or negative: for a positive weight, the RTD-FET branch is placed in the upper part of the proposed circuit, and for a negative weight it is placed in the lower part. The upper part involves MOSFETs (M1 and M2) and RTDs (λ3 and λ4); the lower part involves MOSFETs (M3 and M4) and RTDs (λ5 and λ6). For the threshold value, the RTDs λ1 and λ2 are used. The total current of the upper and lower parts of the proposed circuit is Id, and the total current It flows through the RTDs λ1 and λ2. The proposed circuit has inputs X1 to X4, weights (W1 = λ3; W2 = λ4; W3 = λ5; W4 = λ6), and threshold (T = λ2 − λ1). The current equation for this circuit can be written as

$$I_d - I_t \ge 0, \qquad (W_1 X_1 + W_2 X_2 - W_3 X_3 - W_4 X_4)\, A I_d \ge T\, A I_d \qquad (3)$$


Fig. 4 Circuit of generic mobile threshold logic

The above equation reduces to

$$V_{out} = \begin{cases} 1 & \text{if } W_1 X_1 + W_2 X_2 - W_3 X_3 - W_4 X_4 \ge T \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

In other words, if the weighted sum of all inputs is greater than or equal to the threshold, the output is one; otherwise it is zero.
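The step from comparing branch areas to the weighted-sum form of Eq. (4) can be checked numerically. The sketch below is an illustration under stated assumptions: positive-weight areas are added to the λ1 side and negative-weight areas to the λ2 side, ties are treated as in Eq. (4), and the branch areas of 1.0 are hypothetical (λ1 and λ2 borrow the AND-gate values of Sect. 4.1).

```python
from itertools import product
from typing import Sequence


def output_by_areas(x: Sequence[int], pos_areas: Sequence[float], neg_areas: Sequence[float],
                    lambda1: float, lambda2: float) -> int:
    """Compare the total area on each side (assumed placement of the branches)."""
    n_pos = len(pos_areas)
    side1 = lambda1 + sum(a * xi for a, xi in zip(pos_areas, x[:n_pos]))
    side2 = lambda2 + sum(a * xi for a, xi in zip(neg_areas, x[n_pos:]))
    return 1 if side1 >= side2 else 0


def output_by_threshold(x: Sequence[int], pos_areas: Sequence[float], neg_areas: Sequence[float],
                        lambda1: float, lambda2: float) -> int:
    """Eq. (4): weighted sum with signed weights compared against T = lambda2 - lambda1."""
    weights = list(pos_areas) + [-a for a in neg_areas]
    threshold = lambda2 - lambda1
    weighted_sum = sum(w * xi for w, xi in zip(weights, x))
    return 1 if weighted_sum >= threshold else 0


if __name__ == "__main__":
    pos, neg = [1.0, 1.0], [1.0, 1.0]   # hypothetical areas for lambda3..lambda6
    lam1, lam2 = 1.0, 2.15              # values quoted for the AND gate in Sect. 4.1
    for x in product((0, 1), repeat=4):
        print(x, output_by_areas(x, pos, neg, lam1, lam2),
              output_by_threshold(x, pos, neg, lam1, lam2))
```

Both forms give the same output for every input combination in this example, which is what the rearrangement from Eq. (3) to Eq. (4) expresses.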

4 Implementation of Threshold Logic Gates

In this section, we implement all logic functions using RTDs. These functions are AND, MAJORITY, OR, BUFFER, and X1 * X̄2.

4.1 AND Function

The circuit of the two-input AND function is shown in Fig. 5; it is derived from Fig. 4. Only the upper and threshold parts of the proposed circuit are included to implement the AND function. In the upper part, weights (W1 and W2) and threshold RTDs (λ1 and λ2) are used, and the circuit is equivalent to the AND function when the following conditions hold:

$$\begin{cases} \lambda_1 < \lambda_2 \\ \lambda_2 > \lambda_1 + \lambda_3 \\ \lambda_2 < \lambda_1 + 2\lambda_3 \end{cases} \qquad (5)$$

By trial and error, we solved the above three conditions and obtained these values: W1 = W2 = λ3 = 1 μm², T = 1.15 μm², λ2 = 2.15 μm², λ1 = 1 μm². We put all values of the weights and threshold into the neuron model as shown in Fig. 6. The output is verified against the two-input AND function.

Fig. 5 Circuit of AND function

Fig. 6 Implementation of threshold logic gate
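The three conditions of Eq. (5) correspond to zero, one, and two active inputs. A minimal sketch, with hypothetical function names, checks the quoted areas against Eq. (5) and counts how many active inputs are needed before λ1 plus the added input area exceeds λ2.

```python
from typing import Optional


def satisfies_and_conditions(lambda1: float, lambda2: float, lambda3: float) -> bool:
    """Eq. (5): the three inequalities for the two-input AND gate."""
    return (lambda1 < lambda2
            and lambda2 > lambda1 + lambda3
            and lambda2 < lambda1 + 2 * lambda3)


def inputs_needed_to_switch(lambda1: float, lambda2: float, lambda3: float,
                            max_inputs: int) -> Optional[int]:
    """Smallest number k of active inputs with lambda1 + k * lambda3 > lambda2, if any."""
    for k in range(max_inputs + 1):
        if lambda1 + k * lambda3 > lambda2:
            return k
    return None


if __name__ == "__main__":
    # Areas quoted in this section, in square micrometres.
    lam1, lam2, lam3 = 1.0, 2.15, 1.0
    print(satisfies_and_conditions(lam1, lam2, lam3))
    print(inputs_needed_to_switch(lam1, lam2, lam3, max_inputs=2))
```

For the quoted areas, both inputs must be active before the comparison flips, which is the AND behaviour.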

4.2 Majority Function

The circuit of the three-input majority function is shown in Fig. 7; it is derived from Fig. 4. Only the upper and threshold parts of the proposed circuit are included to implement the majority function. In the upper part, weights (W1, W2 and W3) and threshold RTDs (λ1 and λ2) are used, and the circuit is equivalent to the majority function when the following conditions hold:

$$\begin{cases} \lambda_1 < \lambda_2 \\ \lambda_2 > \lambda_1 + \lambda_3 \\ \lambda_2 < \lambda_1 + 2\lambda_3 \\ \lambda_2 < \lambda_1 + 3\lambda_3 \end{cases} \qquad (6)$$

By trial and error, we solved the above conditions and obtained these values: W1 = W2 = W3 = λ3 = 1 μm², T = 1.15 μm², λ2 = 2.15 μm², λ1 = 1 μm². We put all values of the weights and threshold into the neuron model as shown in Fig. 8. The output is verified against the three-input majority function.

Fig. 7 Circuit of majority function

Fig. 8 Implementation of threshold logic gate

4.3 OR Function

The circuit of the two-input OR function is shown in Fig. 9; it is derived from Fig. 4. Only the upper and threshold parts of the proposed circuit are included to implement the OR function. In the upper part, weights (W1 and W2) and threshold RTDs (λ1 and λ2) are used, and the circuit is equivalent to the OR function when the following conditions hold:

$$\begin{cases} \lambda_1 < \lambda_2 \\ \lambda_2 < \lambda_1 + \lambda_3 \\ \lambda_2 < \lambda_1 + 2\lambda_3 \end{cases} \qquad (7)$$

By trial and error, we solved the above three conditions and obtained these values: W1 = W2 = λ3 = 2.15 μm², T = 1 μm², λ2 = 2 μm², λ1 = 1 μm². We put all values of the weights and threshold into the neuron model as shown in Fig. 10. The output is verified against the two-input OR function.

Fig. 9 Circuit of OR function

Fig. 10 Implementation of threshold logic gate

4.4 Buffer Function

The circuit of the one-input buffer function is shown in Fig. 11; it is derived from Fig. 4. Only the upper and threshold parts of the proposed circuit are included to implement the buffer function. In the upper part, weight (W1) and threshold RTDs (λ1 and λ2) are used, and the circuit is equivalent to the buffer function when the following conditions hold:

$$\begin{cases} \lambda_1 < \lambda_2 \\ \lambda_2 < \lambda_1 + \lambda_3 \end{cases} \qquad (8)$$

By trial and error, we solved the above two conditions and obtained these values: W1 = λ3 = 2.15 μm², T = 1 μm², λ2 = 2 μm², λ1 = 1 μm². We put all values of the weight and threshold into the neuron model as shown in Fig. 12. The output is verified against the one-input buffer function.

4.5 X1 * X̄2 Function

The circuit of the two-input X1 * X̄2 function is shown in Fig. 13; it is derived from Fig. 4. The upper, lower, and threshold parts of the proposed circuit are all included to implement the X1 * X̄2 function. Weight W1 for the upper part, weight W2 for the lower part, and threshold RTDs (λ1 and λ2) are used, and the circuit is equivalent to the X1 * X̄2 function when these conditions hold:

Fig. 11 Circuit of buffer function

Fig. 12 Implementation of threshold logic gate


Fig. 13 Circuit of X1 * X̄2 function

Fig. 14 Implementation of threshold logic gate

8 < λ 1 < λ2 λ > 3λ1 : 2 λ2 < 2λ1

ð9Þ

By trial and error, we solved the above conditions and obtained these values: W1 = W2 = λ1 = 1.5 μm², T = 0.5 μm², λ2 = 2 μm². We put all values of the weights and threshold into the neuron model as shown in Fig. 14. The output is verified against the two-input X1 * X̄2 function.
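The weight and threshold values quoted in Sects. 4.1 to 4.5 can be checked against the Boolean definitions of the five functions using the neuron model of Eq. (1). This is a sketch rather than the authors' verification procedure; the names are hypothetical, and the negative sign given to W2 of X1 * X̄2 is an assumption that follows the rule that lower-part branches carry negative weights.

```python
from itertools import product
from typing import Sequence

# name: (weights, threshold, Boolean reference); values in square micrometres as quoted.
GATES = {
    "AND": ([1.0, 1.0], 1.15, lambda x: x[0] & x[1]),
    "MAJORITY": ([1.0, 1.0, 1.0], 1.15, lambda x: int(sum(x) >= 2)),
    "OR": ([2.15, 2.15], 1.0, lambda x: x[0] | x[1]),
    "BUFFER": ([2.15], 1.0, lambda x: x[0]),
    # Assumption: W2 acts with a negative sign because it sits in the lower part.
    "X1*NOT(X2)": ([1.5, -1.5], 0.5, lambda x: x[0] & (1 - x[1])),
}


def threshold_gate(weights: Sequence[float], inputs: Sequence[int], threshold: float) -> int:
    """Eq. (1): 1 if the weighted sum reaches the threshold, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


def check_gate(name: str) -> bool:
    """True if the neuron model matches the Boolean function on every input combination."""
    weights, threshold, reference = GATES[name]
    return all(threshold_gate(weights, x, threshold) == reference(x)
               for x in product((0, 1), repeat=len(weights)))


if __name__ == "__main__":
    for name in GATES:
        print(name, check_gate(name))
```

Under these assumptions every quoted weight and threshold set reproduces its truth table.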

5 Comparison Between RTD-Based Logic Functions and Conventional Functions

The transistor counts of the RTD-based and conventional implementations of the AND, MAJORITY, OR, BUFFER, and X1 * X̄2 functions are given in Table 1. The performance of all functions in terms of delay and average power is shown in Table 2.
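The MOSFET counts of Table 1 can be turned into a per-function saving with a few lines of arithmetic. The sketch below only restates the table; the dictionary layout and function name are hypothetical.

```python
from typing import Tuple

# function: (conventional MOSFETs, proposed MOSFETs, proposed RTDs), as listed in Table 1.
TRANSISTOR_COUNTS = {
    "AND": (6, 2, 4),
    "MAJORITY": (10, 3, 5),
    "OR": (6, 2, 4),
    "BUFFER": (4, 1, 3),
    "X1*NOT(X2)": (8, 2, 4),
}


def mosfet_reduction(name: str) -> Tuple[int, float]:
    """Return (MOSFETs saved, percentage saved) relative to the conventional circuit."""
    conventional, proposed_mos, _rtds = TRANSISTOR_COUNTS[name]
    saved = conventional - proposed_mos
    return saved, 100.0 * saved / conventional


if __name__ == "__main__":
    for name, (conv, mos, rtds) in TRANSISTOR_COUNTS.items():
        saved, percent = mosfet_reduction(name)
        print(f"{name}: {conv} -> {mos} MOSFETs (+{rtds} RTDs), saved {saved} ({percent:.1f}%)")
```

The percentage refers to MOSFETs only; the RTDs of the proposed circuits are listed separately, as in Table 1.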

Table 1 Transistor count for all functions

Functions    Number of conventional MOSFET transistors    Number of MOSFETs and NDR devices (RTD-based threshold logic)
AND          6                                            2 (MOS) + 4 (RTD)
MAJORITY     10                                           3 (MOS) + 5 (RTD)
OR           6                                            2 (MOS) + 4 (RTD)
BUFFER       4                                            1 (MOS) + 3 (RTD)
X1 * X̄2      8                                            2 (MOS) + 4 (RTD)

Table 2 Performance for all functions

Functions    Clock (V)    Logic high (V)    Logic low (V)    Rise time    Fall time    Delay (ps)    Average power (μW)
AND          0.8          0.7               0.2              0            0            10            156
MAJORITY     0.8          0.7               0.2              0            0            10            185
OR           0.8          0.7               0.2              0            0            10            150
BUFFER       0.8          0.7               0.2              0            0            10            90
X1 * X̄2      0.8          0.7               0.2              0            0            10            100

6 Simulation Results and Discussion

We have successfully implemented threshold logic gates with reduced hardware using RTDs. The main part of the proposed circuit is the MOBILE module, which is implemented with RTDs. In the MOBILE part, the threshold value is set by two series-connected RTDs. The other parts are the upper and lower branches; by adjusting the values of the weights, the general proposed circuit becomes the AND, MAJORITY, OR, BUFFER, or X1 * X̄2 function. Simulation results for the RTD characteristic and for all logic functions are shown in this section.

6.1 RTD Characteristic

We simulated the RTD model in SPICE; the simulation result is shown in Fig. 15. It exhibits negative differential resistance (NDR).

6.2 Majority Function

For the three-input majority function, the output is logic one when two or more inputs are logic one; otherwise it is zero. These logical outputs have been verified by the simulation result shown in Fig. 16.


Fig. 15 Simulation result for characteristic of RTD

Fig. 16 Simulation result of majority function

6.3 AND Function

For the two-input AND function, the output is logic zero if any input is logic zero; otherwise it is one. These logical outputs have been verified by the simulation result shown in Fig. 17.


Fig. 17 Simulation result of AND function

6.4 OR Function

For the two-input OR function, the output is logic one if any input is logic one; otherwise it is zero. These logical outputs have been verified by the simulation result shown in Fig. 18.

6.5 Buffer Function

For the one-input buffer function, the output is the same as the input, logic zero or one. These logical outputs have been verified by the simulation result shown in Fig. 19.

Fig. 18 Simulation result of OR function


Fig. 19 Simulation result of buffer function

6.6 X1 * X̄2 Function

For the X1 * X̄2 function, the output is logic one only when X1 is logic one and X2 is logic zero; otherwise it is zero. These logical outputs have been verified by the simulation result shown in Fig. 20.

Fig. 20 Simulation result of X1 * X̄2 function


7 Conclusion

In this work, TLG-based circuits have been implemented using RTDs. It has been found that TLG-based circuits exhibit correct functionality and reduce the hardware complexity of the device. The reduced complexity of RTD-based threshold logic gates allows gains in function density. RTD/FET threshold logic gates allow conventional Boolean design methods to be applied while increasing the function density through the low complexity of the circuits.

References 1. McCulloch, W.S., Pitts, W.: A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115–133 (1943) 2. Khalid, M., Singh, J.: Memristive crossbar circuits-based combinational logic classification using single layer perceptron learning rule. J. Nanoelectron. Optoelectron. Am. Sci. Publ. 12, 47–58 (2017) 3. Khalid, M., Singh, J.: Memristor crossbar-based pattern recognition circuit using perceptron learning rule. In: 2016 IEEE International Symposium on Nanoelectronic and Information Systems (iNIS), pp. 236–239, Dec 2016 4. Khalid, M., Siddiqui, M., Rahman, S., Singh, J.: Implementation of threshold logic gates using RTDs. J. Electron. Electr. Eng. 1(1), 13–17 (2010) 5. Broekaert, T.P.E., Brar, B., van der Wagt, J.P.A., Seabaugh, A.C., Morris, F.J., Moise, T.S., Beam, E.A., Frazier, G.A.: A monolithic 4-bit 2-GSPS resonant tunneling analog-to-digital converter. IEEE J. Solid-State Circuits 33, 1342–1349 (1998) 6. Zhang, R., Gupta, P., Zhong, L., Jha, N.K.: Threshold network synthesis and optimization and its application to nanotechnologies. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 24, 107–118 (2005) 7. Nikodem, M., Bawiec, M.A., Biernat, J.: Synthesis of generalised threshold gates and multi threshold gates. In: 2011 21st International Conference on Systems Engineering, pp. 463–464, Aug 2011 8. Pettenghi, H., Avedillo, M.J., Quintana, J.M.: A novel contribution to the RTD-based threshold logic family. In: 2008 IEEE International Symposium on Circuits and Systems, pp. 2350–2353, May 2008 9. Kelly, P.M., Thompson, C.J., Mcginnity, T.M., Maguire, L.P.: A binary multiplier using RTD based threshold logic gates. In: Proceedings of the 7th International Work-Conference on Artificial and Natural Neural Networks: Part II: Artificial Neural Nets Problem Solving Methods, IWANN’03, pp. 41–48. Springer-Verlag, Berlin, Heidelberg (2003)

Analysis of High-Power Bidirectional Multilevel Converters for High-Speed WAP-7D Locomotives J. Suganthi Vinodhini and R. Samuel Rajesh Babu

Abstract This paper presents a new multilevel converter topology for high-speed trains. Three-phase induction motors are driven by voltage source inverters. This chapter investigates different multilevel converters. The speed and distortion in three-phase locomotives were controlled by variable voltage, variable frequency (VVVF) control, and the modulation index was increased to reduce the distortion in the output voltage and current. In this chapter the modulation index was increased with respect to the frequency of the three-phase induction motor. Voltage stresses exist in current insulated gate bipolar transistor (IGBT) trains. These are investigated and reduced using a proposed framework for WAP-7D locomotives. Simulations investigating speed, torque, voltage, and braking effort in the traction motors were carried out across an entire district using MATLAB.



Keywords Multilevel inverter · Metal-oxide–semiconductor field-effect transistor (MOSFET) · Sinusoidal pulse width modulation (SPWM) · Locomotive



J. S. Vinodhini (✉) ⋅ R. S. R. Babu Department of Electronics & Instrumentation, Sathyabama University, Chennai, India e-mail: [email protected] R. S. R. Babu e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_70

1 Introduction

Indian railway locomotives are driven by three-phase asynchronous squirrel cage induction motors using insulated gate bipolar transistor (IGBT) converters. However, because of the limited voltage and current characteristics of the standard converter, the motor drive faces noise problems under higher demand. The train should therefore be made efficient using high-power semiconductors. The one system that addresses all of the drawbacks faced by the locomotive is the multilevel inverter. Multilevel inverters are commonly classified as diode-clamped multilevel inverters (DCMIs), flying capacitor multilevel inverters (FCMIs), or cascaded H-bridge (CHB) multilevel inverters (CHBMIs). These inverters are exceptionally well suited for use with balanced motors in locomotives. CHB inverters with separate DC sources are a particularly good choice for fixed-speed traction motors. However, variable voltage variable frequency operation is generally required for regenerative braking. Multilevel inverters are mainly aimed at electric motor drives in various industries. Adjustable-speed AC motors are typically controlled by inverters that use pulse width modulation strategies. Recent converter developments for these motors still face difficulties such as current problems and switching methods. These problems can be overcome using semiconductor switches such as metal-oxide–semiconductor field-effect transistors (MOSFETs), IGBTs, and metal–semiconductor field-effect transistors (MESFETs). The power semiconductor switches have reduced voltage; this can induce corona discharge in the layers [1–3]. Presently, converters are designed using IGBTs to decrease the switching stresses. Typically, the present generation of converters in trains faces short-circuit problems while converting DC to AC. These drawbacks can be easily overcome by using multilevel inverters.

2 Multilevel Converters

Multilevel converters generally have the capability to act as rectifiers and inverters. These converters are being developed for high-power electrical drives, and the technology is growing rapidly in all areas of drive applications. In Indian railways, power quality measurements have been carried out for traction drives. In this context, all power devices need proper cooling and control devices for both linear and non-linear modes of operation. For this purpose, all the multilevel inverters were simulated in this chapter to maintain the speed under balanced and unbalanced loads over a time interval [4, 5]. Three topologies are compared in this chapter for constant speed drive applications. In these topologies the power quality is improved by the use of a multilevel output voltage (Fig. 1).

Fig. 1 Basic three-phase induction drive using a power converter

3 Control Strategy for the Bidirectional Multilevel Inverter: Sinusoidal Pulse Width Modulation

For the inverter to work, the power switches that form the inverter must be turned on and off so that current can flow. Several standard modulation methods are available for this switching scheme. One very popular method in industrial applications is the sinusoidal pulse width modulation (SPWM) method. SPWM is realized by comparing the desired sinusoidal reference signal with a high-frequency triangular carrier waveform. Classical SPWM is a sub-group of PWM that uses a triangular carrier to compare against the reference (control) waveform. The ratio of the amplitude of the control waveform to that of the carrier wave is called the "modulation ratio." If the reference wave is higher than the carrier, the corresponding inverter cell outputs a positive voltage; otherwise, the inverter cell outputs a negative voltage. One significant limitation of this naturally sampled PWM is the difficulty of implementing it in a digital modulation system, because the intersection between the reference sinusoid and the triangular carrier is defined by a transcendental equation that is complex to solve. This limitation can be overcome by using a regularly sampled PWM technique, in which the low-frequency reference waveforms are sampled and then held constant during every carrier interval. These sampled values are compared against the triangular carrier waveform to control the switching process of each phase leg, instead of the sinusoidally varying reference.
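The comparison between reference and carrier described above can be sketched in a few lines. This is an illustration under stated assumptions, not the MATLAB model used in the chapter: the 50 Hz reference, 2 kHz carrier, and modulation ratio of 0.8 are hypothetical, the carrier has unit amplitude, and all function names are invented for the example.

```python
import math


def triangle_carrier(t: float, f_carrier: float) -> float:
    """Unit-amplitude triangular carrier in [-1, 1]."""
    phase = (t * f_carrier) % 1.0
    return 4.0 * abs(phase - 0.5) - 1.0


def natural_spwm(t: float, f_ref: float, f_carrier: float, m: float) -> int:
    """Naturally sampled SPWM: compare the continuous sine reference with the carrier."""
    reference = m * math.sin(2.0 * math.pi * f_ref * t)
    return 1 if reference > triangle_carrier(t, f_carrier) else -1


def regular_spwm(t: float, f_ref: float, f_carrier: float, m: float) -> int:
    """Regularly sampled SPWM: the reference is sampled once per carrier interval and held."""
    t_sample = math.floor(t * f_carrier) / f_carrier
    reference = m * math.sin(2.0 * math.pi * f_ref * t_sample)
    return 1 if reference > triangle_carrier(t, f_carrier) else -1


def mean_output(gate, t_start: float, t_end: float, n: int = 20000, **params) -> float:
    """Average of the +1/-1 cell output over an interval, using midpoint samples."""
    dt = (t_end - t_start) / n
    return sum(gate(t_start + (k + 0.5) * dt, **params) for k in range(n)) / n


if __name__ == "__main__":
    params = dict(f_ref=50.0, f_carrier=2000.0, m=0.8)  # hypothetical values
    print(mean_output(natural_spwm, 0.0, 0.01, **params))   # positive half cycle
    print(mean_output(regular_spwm, 0.0, 0.01, **params))
    print(mean_output(natural_spwm, 0.0, 0.02, **params))   # full cycle
```

The averaged output over the positive half cycle is positive and scales with the modulation ratio, while the average over a full cycle is close to zero; the regularly sampled variant needs only one reference value per carrier interval, which is what makes it convenient for digital implementation.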

4 Regenerative Braking in WAP-7D Locomotives

In regenerative braking, the induction motor operates as a generator when it runs above synchronous speed and feeds energy back to the supply. The difficulty is that the generated voltage must be higher than the supply voltage, and the supply frequency must be matched and free of noise. Consequently, regenerative braking requires complex power electronics to remove the noise and match the frequencies. Regenerative braking is ideally used when trains are running in both the up-line and down-line directions of a rail section, so that the voltage generated by one locomotive is consumed by another at that moment, without feeding the transmission lines. In slip ring motors, the slip power recovered through speed control can be used for charging batteries, running blower fans, working lights, and so on, instead of being fed back to the mains supply. This would effectively brake the traction motors and also save the energy drawn from the mains to drive the locomotive auxiliaries, which have lower voltage and current ratings.
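The condition of running above synchronous speed can be made concrete with the standard relations for synchronous speed and slip, n_s = 120 f / P and s = (n_s − n) / n_s. The sketch below uses a hypothetical 50 Hz, 6-pole machine; these values and the function names are assumptions and are not WAP-7D data.

```python
def synchronous_speed_rpm(frequency_hz: float, poles: int) -> float:
    """Synchronous speed n_s = 120 f / P in revolutions per minute."""
    return 120.0 * frequency_hz / poles


def slip(rotor_rpm: float, frequency_hz: float, poles: int) -> float:
    """Slip s = (n_s - n) / n_s; negative slip means the rotor runs above n_s."""
    n_s = synchronous_speed_rpm(frequency_hz, poles)
    return (n_s - rotor_rpm) / n_s


def is_regenerating(rotor_rpm: float, frequency_hz: float, poles: int) -> bool:
    """An induction machine acts as a generator when its slip is negative."""
    return slip(rotor_rpm, frequency_hz, poles) < 0.0


if __name__ == "__main__":
    # Hypothetical machine: 50 Hz supply, 6 poles.
    for rpm in (950.0, 1000.0, 1050.0):
        print(rpm, slip(rpm, 50.0, 6), is_regenerating(rpm, 50.0, 6))
```

Lowering the supply frequency during braking lowers the synchronous speed, so a rotor at unchanged speed ends up with negative slip; this is one reason variable-frequency control is tied to regenerative braking.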

5 Bidirectional Multilevel Inverters for WAP-7D Locomotives

These inverters were designed to use the same hardware in two modes of operation and thus provide bidirectional power flow. The discharge mode extracts energy from the battery bank and uses it as a supplementary source. This is accomplished by boosting the battery bank voltage to the required level and then converting it to AC with the correct frequency and phase, in order to inject current into the AC grid [6]. To minimize reactive power, the designed bidirectional multilevel inverter can achieve a power factor close to unity when connected to the grid. In addition, a charge mode of operation uses the grid to recharge the battery bank, storing energy. This is accomplished by converting the grid voltage and regulating the required current to the batteries.

6 Proposed Drive System for WAP-7D Locomotives

6.1 Experimental Analysis for 21-level Bidirectional Cascaded H-Bridge Multilevel Inverters

In the cascaded H-bridge multilevel inverter (CHBMI), 3 H-bridges with a total of 12 power MOSFET switches were used to produce an AC voltage with low harmonic distortion for a three-phase induction motor. Three voltages V1, V2, and V3 were chosen as inputs to the three CHB inverters. For each level the 12 switches were switched on and off digitally, with 0 representing off and 1 representing on. For level 8, the switches S2, S3, S5, S8, S9, and S12 are represented digitally as 1s, that is, these MOSFET switches are in the ON state. Therefore, the level of +Vdc increases. So for every level each switch goes ON and OFF. Short circuits may occur during these transitions; this was fully avoided in the proposed system (Figs. 2, 3, 4, 5, and 6 and Table 1).
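The digital representation of the switch states can be decoded mechanically. As a small sketch with a hypothetical helper name, the row for level 8 of Table 1 is read as twelve 0/1 states for S1 to S12 and the switches in the ON state are listed.

```python
from typing import List

# States of S1..S12 for the 8 V row, as listed in Table 1.
LEVEL_8_ROW = "0 1 1 0 1 0 0 1 1 0 0 1"


def on_switches(states: str) -> List[str]:
    """Return the names of the switches whose state is 1 in a space-separated row."""
    bits = states.split()
    return [f"S{index}" for index, bit in enumerate(bits, start=1) if bit == "1"]


if __name__ == "__main__":
    print(on_switches(LEVEL_8_ROW))
```

The decoded list agrees with the switches named in the text for level 8.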


Fig. 2 Simulation for proposed 21-level CHBMI

Fig. 3 Output voltage and current for the proposed CHBI

7 Experimental Analysis for Diode-Clamped Multilevel Inverters

The simulation of the three-level diode-clamped multilevel inverter (DCMI) was carried out in MATLAB Simulink with MOSFET power switches using the SPWM technique. The whole system was analyzed in its stable state [7–9]. In the proposed DCMI, 24 power switches were used for the whole system, and inductors were used for energy purposes. Total harmonic distortion (THD) analysis of the voltage and current was carried out (Fig. 7).


Fig. 4 Speed and torque characteristics for the proposed CHB

Fig. 5 THD analysis for voltage in a CHBMI

Fig. 6 THD analysis for current in a CHBMI

Table 1 Switching sequence for a CHBMI

Vol.     S1  S2  S3  S4  S5  S6  S7  S8  S9  S10  S11  S12
0 V      0   0   1   1   0   0   1   1   0   0    1    1
1 V      1   0   0   1   0   0   1   1   0   0    1    1
2 V      0   1   1   0   1   0   0   1   0   0    1    1
3 V      0   0   1   1   1   0   0   1   0   0    1    1
4 V      1   0   0   1   1   0   0   1   0   0    1    1
5 V      0   1   1   0   0   0   1   1   0   0    1    1
6 V      0   0   1   1   0   0   1   1   1   0    0    1
7 V      1   0   0   1   0   0   1   1   1   0    0    1
8 V      0   1   1   0   1   0   0   1   1   0    0    1
9 V      0   0   1   1   1   0   0   1   1   0    0    1
10 V     1   0   0   1   1   0   0   1   1   0    0    1
−1 V     0   1   1   0   0   0   1   1   0   0    1    1
−2 V     1   0   0   1   0   1   1   0   0   0    1    1
−3 V     0   0   1   1   0   1   1   0   0   0    1    1
−4 V     0   1   1   0   0   1   1   0   0   0    1    1
−5 V     1   0   0   1   0   0   1   1   0   1    1    0
−6 V     0   0   1   1   0   0   1   1   0   1    1    0
−7 V     0   1   1   0   0   0   1   1   0   1    1    0
−8 V     1   0   0   1   0   1   1   0   0   1    1    0
−9 V     0   0   1   1   0   1   1   0   0   1    1    0
−10 V    0   1   1   0   0   1   1   0   0   1    1    0

Fig. 7 MATLAB Simulink for a DCMI

7.1 Switching Operation for a Diode-Clamped Multilevel Inverter

In this chapter, 8 switches per phase were used for designing the DCMI to produce the output voltage. For the level +Vdc/2, switches S1, S2, S3, and S4 are set digitally to 1, so that the output rises to that level, and the remaining four switches are digitally zero. For level 0, switches S3, S4, S5, and S6 are digitally 1, with the remaining switches in the OFF position. This is for a single line voltage; for the other two phases 16 switches were used (Figs. 8, 9, and 10 and Table 2).
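The single-phase switching pattern listed in Table 2 follows a regular rule: each step down in level turns off one more of S1 to S4, and S5 to S8 are the complements of S1 to S4. The sketch below reproduces that pattern; the rule is read off the table rather than stated by the authors, and the function names are hypothetical.

```python
from typing import List

LEVEL_LABELS = ["+Vdc/2", "+Vdc/4", "0", "-Vdc/4", "-Vdc/2"]


def dcmi_states(level_index: int, n_upper: int = 4) -> List[int]:
    """States of S1..S8 for level_index 0 (+Vdc/2) through 4 (-Vdc/2)."""
    if not 0 <= level_index <= n_upper:
        raise ValueError("level_index must be between 0 and n_upper")
    upper = [0] * level_index + [1] * (n_upper - level_index)   # S1..S4
    lower = [1 - state for state in upper]                      # S5..S8 are complements
    return upper + lower


def level_fraction_of_vdc(level_index: int) -> float:
    """Output level as a fraction of Vdc, in steps of Vdc/4."""
    return (2 - level_index) / 4.0


if __name__ == "__main__":
    for index, label in enumerate(LEVEL_LABELS):
        print(f"{label:>7}", dcmi_states(index), level_fraction_of_vdc(index))
```

Because each switch in S5 to S8 is the complement of its partner in S1 to S4, exactly four of the eight switches are ON at every level in this pattern.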

8 Experimental Analysis of the Proposed Flying Capacitor Multilevel Inverter Using MATLAB The proposed flying capacitor multilevel inverter (FCMI) was simulated using MATLAB. In this simulation 24 power switches and passive devices were connected for all the phases. In order to improve the performance of the FCMI the operation was considered with a very low modulation index which still produced higher order harmonics in the output voltage. During loaded conditions the


Fig. 8 Output voltage and current for a DCMI

Fig. 9 Voltage THD analysis for a DCMI

Fig. 10 Current THD analysis for a DCMI


Table 2 Switching sequence for a DCMLI for a single-phase line

Vol.      S1  S2  S3  S4  S5  S6  S7  S8
+Vdc/2    1   1   1   1   0   0   0   0
+Vdc/4    0   1   1   1   1   0   0   0
0         0   0   1   1   1   1   0   0
−Vdc/4    0   0   0   1   1   1   1   0
−Vdc/2    0   0   0   0   1   1   1   1

Cv1 = Cv2 = Cv3 = Cv4 = Vdc/4
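The pattern in Table 2 can be generated programmatically. The following is a minimal sketch, assuming the usual complementary pairing in which S5 to S8 are the logical complements of S1 to S4; the function and variable names are illustrative and do not come from the chapter.

```python
# Illustrative sketch of the five-level switching table (Table 2).
# Assumption: lower switches S5..S8 are the complements of S1..S4.
LEVELS = ["+Vdc/2", "+Vdc/4", "0", "-Vdc/4", "-Vdc/2"]


def switch_states(level_index, n_pairs=4):
    """Return the states of S1..S8 for level index 0 (+Vdc/2) to 4 (-Vdc/2)."""
    # Each step down in output level turns off one more upper switch.
    upper = [1 if k >= level_index else 0 for k in range(n_pairs)]  # S1..S4
    lower = [1 - s for s in upper]                                   # S5..S8
    return upper + lower


def on_switches(level_index):
    """Names of the switches that are ON for the given level index."""
    return [f"S{i + 1}" for i, s in enumerate(switch_states(level_index)) if s]


if __name__ == "__main__":
    for i, name in enumerate(LEVELS):
        print(f"{name:>7}: {switch_states(i)}")
```

For level 0 this sketch returns S3, S4, S5, and S6 as the ON switches, which agrees with the description in the text.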

Fig. 11 MATLAB simulink for an FCMI

capacitors should maintain a balanced voltage during charging and discharging [8]. One main advantage of this system is that, during fault conditions, each branch can easily be analyzed separately. In terms of the speed–torque characteristics, the motors have an initial speed that takes time to settle. In this FCMI the settling time was increased, meaning that it takes longer to move from the reference speed to the rated speed (Figs. 11, 12, and 13 and Table 3).


Fig. 12 Output voltage and current for an FCMI

Fig. 13 Speed and torque characteristics for an FCMI

Table 3 Switching sequence for an FCMI for a single-phase line

Vol.      S1  S2  S3  S4  S5  S6  S7  S8
+Vdc/2    1   1   1   1   0   0   0   0
+Vdc/4    0   1   1   1   1   0   0   0
0         0   0   1   1   1   1   0   0
−Vdc/4    0   0   0   1   1   1   1   0
−Vdc/2    0   0   0   0   1   1   1   1

Cv1 = Cv2 = Cv3 = Cv4 = Vdc/4


9 Total Harmonic Distortion Analysis of the Proposed Cascaded H-Bridge Multilevel Inverter and Flying Capacitor Multilevel Inverter

The line voltage and current for the CHBMI and FCMI were analyzed using MATLAB. The THD values for the CHBMI decreased rapidly compared with those for the FCMI. Figure 14 shows that, moving from lower to higher frequency, the THD decreases in the CHBMI compared with the FCMI. From the experimental results it was found that a 21-level cascaded H-bridge inverter was the best choice for WAP-7D locomotives (Fig. 15). In Table 4 the THD values are compared for the proposed systems. The analysis shows that the CHBMI is the best for WAP-7D locomotives; it also has fewer switches and low harmonics on the output side of the inverter at a low modulation index.

10 Comparison of Cascaded H-Bridge Multilevel Inverters, Diode-Clamped Multilevel Inverters, and Flying Capacitor Multilevel Inverters

This comparison shows the accuracy of all the multilevel inverters that are suitable for WAP-7D locomotives. The 21-level multilevel inverter was designed and simulated to produce fewer harmonics as well as a reduced switching loss. Any type of multilevel inverter requires more components, but for this proposed

Fig. 14 THD analysis for voltage in an FCMI


Fig. 15 THD analysis for current in an FCMI

Table 4 Comparison of CHBMI, DCMI, and FCMI

Proposed systems    No. of switches    THD (%)
CHBMI               12                 7.42
DCMI                24                 42.9
FCMI                24                 152.17

CHBMI only 12 switches were used. It is therefore the best choice compared with the other two multilevel inverters: the FCMI and DCMI each need 24 power switches and also a passive device to rectify the distortion in the circuit, making such systems bulkier and more expensive. For all the proposed systems, fast Fourier transform (FFT) analysis was carried out to find the total harmonic distortion of the waveforms [10, 11]. The THD was lowest for the cascaded H-bridge inverter at 7.42%, whereas for the DCMI the voltage THD is 42.9% and the current THD is 15.77%. For the FCMI the THD value is 152.17%. According to this analysis, the cascaded H-bridge inverter will likely be used for WAP-7D locomotives in the future.
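To make the THD figures above concrete, the sketch below computes THD as the root-sum-square of the harmonic amplitudes divided by the fundamental amplitude. It is not the MATLAB FFT tool used in the chapter: it evaluates a single-bin DFT at each harmonic using only the Python standard library, and the test waveform, the function names, and the choice of 25 harmonics are illustrative assumptions.

```python
import cmath
import math


def harmonic_amplitude(samples, harmonic, cycles=1):
    """Amplitude of one harmonic from a single-bin DFT.

    Assumes `samples` spans exactly `cycles` whole fundamental periods,
    so each harmonic falls on an exact DFT bin (no spectral leakage).
    """
    n = len(samples)
    k = harmonic * cycles  # DFT bin index of this harmonic
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(samples))
    return 2.0 * abs(acc) / n


def thd_percent(samples, cycles=1, max_harmonic=25):
    """THD in percent: RMS sum of harmonics 2..max_harmonic over the fundamental."""
    fundamental = harmonic_amplitude(samples, 1, cycles)
    distortion = math.sqrt(sum(harmonic_amplitude(samples, h, cycles) ** 2
                               for h in range(2, max_harmonic + 1)))
    return 100.0 * distortion / fundamental


# Hypothetical waveform: unit fundamental plus a 10% third harmonic.
N = 1000
wave = [math.sin(2 * math.pi * i / N) + 0.1 * math.sin(2 * math.pi * 3 * i / N)
        for i in range(N)]

if __name__ == "__main__":
    print(f"THD = {thd_percent(wave):.2f}%")
```

With this definition a waveform can exceed 100% THD when the combined harmonic content is larger than the fundamental, which is how a value such as the 152.17% reported for the FCMI can arise.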

11 Conclusion

In this chapter, the 21-level CHBMI, the 5-level DCMI, and the FCMI were outlined and examined for use in WAP-7D locomotives. The SPWM scheme supports reliable operation of the power MOSFET switches, and the switching stresses were reduced in this investigation. Across the various assessments of the multilevel inverters, this study shows high efficiency and lower THD for the CHBMI, so this inverter suits the WAP-7D locomotive. The cost, size, EMI, and EMC issues are also reduced in the 21-level CHBMI. The 5-level DCMI has higher total harmonics in the output voltage; increasing the number of levels requires additional switches, which increases size and cost. The primary drawback of the DCMI and FCMI is their size; in addition, the devices may be affected during dynamic-state operation, so it is hard to switch over while the locomotive is in operation.

References

1. Bose, B.K.: Power Electronics and Motor Drives. Academic Press, an imprint of Elsevier, Massachusetts (2006)
2. Wu, B.: High Power Converters and AC Drives. IEEE Press (2006)
3. Khomfoi, S., Tolbert, L.M.: Multilevel Power Converters. In: Power Electronics Handbook, 2nd edn., Chapter 17, pp. 451–482. Elsevier (2007). ISBN 978-0-12-088479-7
4. Sepahvand, H., Khazraei, M., Ferdowsi, M., Corzine, K.A.: Feasibility of capacitor voltage regulation and output voltage harmonic minimization in cascaded H-bridge converters. In: Proceedings of IEEE Applied Power Electronics Conference and Exposition, pp. 452–457 (2010)
5. Aneesh, M.A.S., Gopinath, A., Baiju, M.R.: A simple space vector PWM generation scheme for any general n-Level inverter. IEEE Trans. Ind. Electron. 56(5), 1649–1656 (2009)
6. Vinodhini, J.S., Babu, R.S.R.: Analysis of new high power 21 level cascaded H-Bridge multilevel inverter with regenerative braking comparing with DCMI for Indian railway locomotive WAP-7D. In: IEEE TPEC Conference, TEXAS, Feb 2018
7. Yuan, X., Barbi, I.: Fundamentals of a new diode clamping multilevel inverter. IEEE Trans. Power Electron. 15(4), 711–718 (2000)
8. Su, G.-J.: Multilevel DC-Link inverter. IEEE Trans. Ind. Appl. 41(3) (2005)
9. Fujita, H., Yamashita, N.: Performance of a diode-clamped linear amplifier. IEEE Trans. Power Electron. 23(2), 824–831 (2008)
10. Ahmed, M.E., Mekhilef, S.: Design and implementation of a multilevel three-phase inverter with less switches and low output voltage distortion. J. Power Electron. 9(4), 593–603 (2009)
11. Jiang, Y.-H., Cao, Y.-L., Gong, Y.-M.: A novel topology of hybrid multilevel inverter with minimum number of separate DC sources. J. Shanghai Univ. (English Edition) (2004)

A Study on Different Types of Base Isolation System over Fixed Based

M. Tamim Tanwer, Tanveer Ahmed Kazi and Mayank Desai

Abstract Base isolation is a technique used to prevent or reduce damage to a structure during an earthquake. It is a design principle by which flexible supports (isolators) are installed under every supporting point of a structure, generally between the foundation (substructure) and the superstructure. Seismic hazards are a key concern for earthquake-prone areas of the world. Performance-based earthquake design has brought recent technological advances that have established a new approach to constructing earthquake-resistant structures. Base isolation systems are increasingly used for advanced earthquake-resistant structures. The effect of different types of base isolators on earthquake-resistant structures is studied in this paper. The work focuses on a comparative study of different types of base isolators, such as lead rubber bearings (LRB), friction pendulum bearings (FPB), elastomeric rubber bearings (ERB), high damping rubber bearings (HDRB), and low damping rubber bearings (LDRB), compared with the fixed base in terms of time period, base shear, fundamental period, frequency, storey drift, time history analysis, and displacement.

Keywords Base isolation system · Lead rubber bearings (LRB) · Friction pendulum bearings (FPB) · Elastomeric rubber bearing (ERB) · High damping rubber bearings (HDRB) · Low damping rubber bearing (LDRB) · Earthquake



M. Tamim Tanwer (✉)
Pacific University, 76/77 Amber Colony Harinagar-1, Udhna, Surat Gujarat 394210, India

T. A. Kazi
Pacific University, Pacific Hills, Pratapnagar Extension, Airport Road, Debari, Udaipur 313003, India

M. Desai
SVNIT Surat, Keval Chowk, SVNIT Campus, Athwa, Surat, Gujarat 395007, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_71


1 Introduction

An earthquake is one of nature’s most dangerous disasters, resulting in significant loss of life and severe damage to property, especially man-made structures. An earthquake is a shaking of the earth’s surface that results from the sudden release of accumulated energy in the tectonic plates of the earth’s lithosphere, which generates seismic waves. Earthquakes are natural calamities that have claimed millions of lives throughout history. During an earthquake, a force is released from the earth’s lithosphere that lasts for a short duration. A base isolation system is a technique that separates the structure from the damage induced by seismic waves and prevents the superstructure from absorbing the earthquake force. The base isolator mechanism increases the natural time period of the structure and decreases its acceleration response to the earthquake. The base isolation system rests on structural bearings, which lie between the superstructure and substructure and accommodate horizontal displacement, rotation, or translation. A bearing that prevents translation is known as a fixed bearing or fix-point bearing, and a bearing that permits movement in one direction only is known as a guided bearing or unidirectional movable bearing. The study of earthquakes provides architects and engineers with a number of important design criteria foreign to the normal design process. Among the well-established methods reviewed by many researchers, base isolation proves to be the most effective solution for a broad range of earthquake design problems, and the effect of these systems on the seismic response of structures is studied in this paper.
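The period-lengthening effect described above can be illustrated with a single-degree-of-freedom idealization, T = 2π√(m/k). The sketch below is a simplified assumption, not the analysis used in the reviewed studies: it treats the isolator as a spring in series with the structure's lateral stiffness, and all numerical values are hypothetical.

```python
import math


def natural_period(mass_kg, stiffness_n_per_m):
    """Natural period T = 2*pi*sqrt(m/k) of a single-degree-of-freedom system."""
    return 2.0 * math.pi * math.sqrt(mass_kg / stiffness_n_per_m)


def series_stiffness(k_structure, k_isolator):
    """Effective lateral stiffness of structure and isolator acting in series."""
    return k_structure * k_isolator / (k_structure + k_isolator)


# Hypothetical values chosen only for illustration.
mass = 5.0e5          # kg
k_structure = 2.0e8   # N/m, lateral stiffness of the fixed-base structure
k_isolator = 2.0e7    # N/m, much softer isolation layer

t_fixed = natural_period(mass, k_structure)
t_isolated = natural_period(mass, series_stiffness(k_structure, k_isolator))

if __name__ == "__main__":
    print(f"fixed base:    T = {t_fixed:.3f} s")
    print(f"base isolated: T = {t_isolated:.3f} s")
```

Because the soft isolation layer dominates the series stiffness, the isolated period is longer than the fixed-base period, which shifts the structure away from the high-acceleration region of a typical response spectrum.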

2 Objective of Study

The key objective of a base isolation system is to protect the structure from the effects of an earthquake or to minimize them. Many comparative studies have shown that the response of an isolated structure is remarkably lower than that of a fixed (regular) base structure. The main objective of this study is to compare different types of base isolators, such as lead rubber bearings (LRB), friction pendulum bearings (FPB), elastomeric rubber bearings (ERB), high damping rubber bearings (HDRB), and low damping rubber bearings (LDRB), with the fixed base in terms of time period, base shear, fundamental period, frequency, storey drift, time history analysis, and displacement.


3 Literature Study

Nitya and Arathi [1] published the research paper “Study of earthquake response of a RC building with base isolation” in the International Journal of Science and Research (IJSR). In this research, an RCMR frame structure of G+6 storeys with a fixed base and with a base isolation system is considered. The analysis is performed using SAP 2000. They conclude that the base isolation system substantially increases the time period of the structure and correspondingly reduces the base shear by up to 75% compared with the fixed base. With the increase in fundamental period, the RCMR frame with base isolation is moved completely out of the resonance range of the seismic waves. The analysis shows that the fundamental period of the isolated structure is approximately twice that of the fixed base structure. The increase in fundamental period reduces the maximum acceleration and hence the earthquake force on the structure. The tables and graphs make clear that the storey displacements are much higher for isolated buildings, and that the displacements of all the storeys are almost the same. The rubber isolator shows more displacement than the friction isolator (Fig. 1).

Thomas and Mathai [2] published the research paper “Study of base isolation using friction pendulum bearing system” in the Journal of Mechanical and Civil Engineering. They created an FEM model of the base isolator in ANSYS 14.5 and compared the behavior of the friction pendulum bearing with that of a rubber base isolator. A nonlinear static analysis of the base isolator is performed for different numbers of storeys under different load values. They conclude that, as the load increases with the number of storeys, the stress intensity also increases. The stress intensity was found to be within permissible limits up to 30 storeys, so base isolators can be designed for 22-storey to 30-storey buildings. The analysis makes clear that the slider movement produces a dynamic friction force, which provides the damping required to absorb the earthquake energy (Fig. 2).

Fig. 1 a Perspective view of model, b comparison of base shear


Fig. 2 a Model of the base isolator, b mesh configuration of the base isolator, c boundary condition of base isolator

Vijaykumar et al. [3] published the research paper “A Study on Seismic Performance of RCC Frame with Various Bracing Systems using Base Isolation Technique” in the International Journal of Applied Engineering Research. In this paper, a G+25 storey building, square in plan, is analyzed using the design software SAP 2000. They conclude that the structure with base isolation performs more effectively than one with a fixed base. The structure is analyzed for displacement and drift, and they note that displacement in the base-isolated structure is high compared with the fixed base. The main factor responsible for the collapse of a structure is its storey drift. The research shows that storey drift in the base-isolated structure is greatly reduced compared with the regular base structure. Although the cost of installation is a drawback of base isolation, its performance proves its necessity in hospitals, public places, and essential buildings. Hence, the study shows that the bracing system performs better with base isolation in seismic-prone areas (Fig. 3).

Desai and John [4] published the research paper “Seismic Performance of Base Isolated Multi-Storey Building” in the International Journal of Scientific and Engineering Research. In this paper, dynamic response spectrum analysis is carried out for an 8-storey office building. The structure is analyzed with a fixed

Fig. 3 Comparison of base shear


base structure and with different types of base isolators. A comparative study of parameters such as frequency, spectral acceleration, base shear, displacement, and storey drift is carried out without base isolators and with different base isolators. The summary of results shows that, in the base-isolated structure, frequency is reduced compared with the fixed base structure. The fundamental mode is more effective in seismic analysis. In the fundamental mode, frequency is lowest in the LRB structure compared with HDRB and LDRB. Acceleration is reduced when isolators are provided, and the LRB structure gives the least acceleration compared with the HDRB and LDRB isolators. Base shear reduces considerably in the base-isolated structure: it is reduced to 47% in the LRB structure, 33% in the HDRB structure, and 34% in the LDRB structure, compared with the fixed base structure. Displacement is very high in LRB, HDRB, and LDRB compared with the fixed base structure, and the average displacement is greatest in LRB. Storey drift is reduced considerably by the provision of isolators: the reductions in storey drift at 9 m height are 13%, 13%, and 15% for the HDRB, LDRB, and LRB structures, respectively, compared with the fixed base structure. It can be concluded that the structure with base isolation performs more effectively than one with a fixed base, and that LRB is more effective than HDRB and LDRB (Fig. 4).

Fig. 4 Perspective view of 8-storey office building model

Naveen et al. [5] published the research paper “Base Isolation of Mass Irregular RC Multi-Storey Building” in the International Research Journal of Engineering and Technology (IRJET). In this paper, a square G+9 storey building is analyzed using the design software SAP 2000. They conclude that the reduction in lateral displacement at the top storey of the regular structure was found to


Fig. 5 a Graph showing displacement (mm) of all IV module, b graph showing drift (mm) of all IV module

be 35%, whereas in the mass irregular structure the reduction in lateral displacement at the top storey was 36%, from the time history analysis of the El Centro earthquake. The analysis of lateral displacements in both directions showed that torsion occurs due to mass irregularity in a structure. No inter-storey drift was found in the base-isolated structure, whereas a large amount of inter-storey drift was found in the mass irregular structure, which means that the base-isolated structure undergoes rigid body movement, unlike a fixed base structure (Fig. 5).

Noorzai et al. [6] published the research paper “Study Response of Fixed Base and Isolation Base” in the International Journal of Innovative Research in Science Engineering and Technology. In their research, a G+25 RCC frame structure with a fixed base and with an isolated LRB base was analyzed and designed using the design software ETABS. They conclude that the structure with an isolated base shows less lateral deflection. The lateral displacement at the base of the base-isolated structure is never zero, and less moment is generated than in the fixed base structure. Base isolation systems separate the structure from the earthquake-induced load and maintain a larger fundamental lateral period than a fixed base structure. Base isolation, also known as seismic base isolation, is one of the most recent techniques for protecting structures against seismic forces. It also provides passive vibration control for the structure. An isolated base separates the substructure from the superstructure during an earthquake; as a result, the substructure moves with the ground while the superstructure remains largely at rest. LRB proves to be the most effective base isolator compared with a fixed base and the other types of isolators (Fig. 6).

Ghodke and Admane [7] published the research paper “Effect Of Base-Isolation for Building Structures” in the International Journal of Science, Engineering and Technology Research (IJSETR). In this paper, a G+5 storey building is analyzed using the design software SAP 2000. They conclude that, as the height of the structure increases, displacement decreases in the


Fig. 6 a G+25 storey plan, b analytical models

base-isolated structure. The displacement is lower in the base-isolated structure than in the fixed base structure (Fig. 7).

Nassani and Wassef [8] published the research paper “Seismic Base Isolation in Reinforced Concrete Structures” in the International Journal of Research Studies in Science, Engineering and Technology. G+4 storey structures are analyzed with and without an isolated base. The analysis was performed in the design software SAP2000. They conclude that the structure with an isolated base


Fig. 7 a Model generation in SAP 2000, b displacement curve, c deformed shape of building

Fig. 8 a Perspective view of symmetric building, b perspective views of nonsymmetric building


reduces the base shear and storey drifts but increases the displacement, compared with fixed base systems, where base shear and storey drift are high and the displacement of the structure is lower (Fig. 8).

4 Conclusion

The base isolation system substantially increases the time period of the structure and correspondingly reduces the base shear by up to 75% compared with the fixed base. The fundamental period of the isolated structure is approximately twice that of the fixed base structure. Fundamental modes prove more effective in seismic analysis. The structure with base isolation performs more effectively than one with a fixed base. In the base-isolated structure, frequency is reduced compared with the fixed base structure. Storey drift is considerably reduced by the provision of a base isolator: the reductions in storey drift at 9 m height are 13%, 13%, and 15% for the HDRB, LDRB, and LRB structures, respectively, compared with the non-isolated structure. The lead rubber bearing performs better than the HDR and LDR bearings. In a time history analysis for the El Centro earthquake, the reduction in top-storey lateral displacement is 35% in 10-storey regular structures and 36% in 10-storey mass irregular structures. No inter-storey drift was found in the base-isolated structure, whereas a large amount of inter-storey drift was found in the mass irregular structure, which means that the base-isolated structure undergoes rigid body movement, unlike a fixed base structure. Base isolation systems separate the structure from the earthquake-induced load and maintain a larger fundamental lateral period than a fixed base structure. It is concluded that, as the height of the structure increases, displacement decreases in the base-isolated structure.
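The distinction drawn above between storey displacement and inter-storey drift can be shown with a short calculation. This is a minimal sketch: drift is taken as the difference in lateral displacement between consecutive floor levels, and the displacement profiles are hypothetical numbers chosen for illustration, not results from the reviewed papers.

```python
def storey_drifts(displacements_mm):
    """Inter-storey drift: displacement difference between consecutive levels.

    `displacements_mm` lists lateral displacements from the base upward.
    """
    return [upper - lower
            for lower, upper in zip(displacements_mm, displacements_mm[1:])]


# Hypothetical lateral displacement profiles (mm), base level first.
fixed_base = [0.0, 4.0, 9.0, 15.0, 20.0]      # base does not move
isolated = [30.0, 31.0, 32.0, 33.0, 34.0]     # near rigid-body translation

if __name__ == "__main__":
    print("fixed base drifts:", storey_drifts(fixed_base))
    print("isolated drifts:  ", storey_drifts(isolated))
```

In this illustration the isolated profile has the larger absolute displacement at every level but the smaller drift between levels, which is the rigid-body behaviour the reviewed studies describe.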

References

1. Nitya, M., Arathi, S.: Study on the earthquake response of a RC building with base isolation. Int. J. Sci. Res. 5, 1002–1005 (2016)
2. Thomas, T., Mathai, A.: Study of base isolation using friction pendulum bearing system. J. Mech. Civil Eng. 19–23 (2016)
3. Vijaykumar, M., Manivel, S., Arokiaprakash, A.: A study on seismic performance of RCC frame with various bracing systems using base isolation technique. Int. J. Appl. Eng. Res. 11, 7030–7033 (2016)
4. Desai, M., John, R.: Seismic performance of base isolated multi-storey building. Int. J. Sci. Eng. Res. 6, 84–89 (2015)
5. Naveen, K., Prabhakara, H.R., Eramma, H.: Base isolation of mass irregular RC multi-storey building. Int. Res. J. Eng. Technol. 2, 902–906 (2015)
6. Noorzai, M., Bajad, M.N., Dodal, N.: Study response of fixed base and isolation base. Int. J. Innov. Res. Sci. Eng. Technol. 4, 3674–3681 (2015)
7. Ghodke, R.B., Admane, S.V.: Effect of base-isolation for building structures. Int. J. Sci. Eng. Technol. Res. 4, 971–974 (2015)
8. Nassani, D.E., Wassef, M.A.: Seismic base isolation in reinforced concrete structures. Int. J. Res. Stud. Sci. Eng. Technol. 2, 1–13 (2015)
9. Kravchuk, N., Colquhoun, R., Porbaha, A.: Development of a friction pendulum bearing base isolation system for earthquake engineering education. In: Proceedings of the 2008 American Society for Engineering Education Pacific Southwest Annual Conference, Pittsburg, Pennsylvania, June (2008)
10. Keerthana, S., Sathish Kumar, K., Balamonica, K.: Seismic response reduction of structures using base isolation
11. Taywade, P.W., Savale, M.N.: Sustainability of structure using base isolation techniques for seismic protection. Int. J. Innov. Res. Sci. Eng. Technol. 4(3) (2015)
12. Kooshki, M.E., Shahri, A.A.: Seismic response controlling of structures with a new semi base isolation device (2015)
13. Deb, S.K.: Seismic base isolation – an overview. Current Sci. 1426–1430 (2004)
14. Kelly, J.M.: Earthquake-Resistant Design with Rubber (1993)
15. Moretti, S., et al.: Utilizing base-isolation systems to increase earthquake resiliency of healthcare and school buildings. Proced. Econ. Finance 18, 969–976 (2014)
16. Nitya, M., Arathi, S.: Design and study of seismic base isolators. J. Basic Appl. Eng. Res. 2(9), 734–739 (2015)
17. Hussain, S., Lee, D., Retamal, E.: Viscous damping for base isolated structures. Taylor Devices (1998). http://www.taylordevices.com/Tech-Paper-archives/literature-pdf/36-Viscous Damping.pdf
18. Kubo, T., et al.: A seismic design of nuclear reactor building structures applying seismic isolation system in a high seismicity region – a feasibility case study in Japan. Nucl. Eng. Technol. 46(5), 581–594 (2014)
19. Arya, G., Alice, T.V., Alice, M.: Seismic analysis of high damping rubber bearings for base isolation

Suppression of Speckle Noise in Ultrasound Images Using Bilateral Filter

Ananya Gupta, Vikrant Bhateja, Avantika Srivastava and Aditi Gupta

Abstract The suppression of speckle noise is necessary for clear viewing of ultrasound images, whose quality is degraded by speckle noise. In this work, a bilateral filter is used to suppress speckle noise. Conventionally, this filter is used to suppress Gaussian noise in images. A bilateral filter performs well at edge preservation, noise suppression, and smoothing of gray as well as color images. The bilateral filter tends to improve image quality because it replaces the intensity of each pixel with a weighted average of the intensity values of nearby pixels. These weights are based on a Gaussian distribution function. Three parameters are used to analyze the performance of the bilateral filter: PSNR, SSIM, and SSI.

Keywords Speckle suppression · Bilateral filter · PSNR · SSIM · SSI

A. Gupta · V. Bhateja (✉) · A. Srivastava · A. Gupta
Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC), Lucknow 226028, Uttar Pradesh, India

© Springer Nature Singapore Pte Ltd. 2019
S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2_72

1 Introduction

In the medical field, ultrasound imaging is used to view internal body structures such as joints and internal organs. Ultrasound imaging works by the application of ultrasonic waves. The ultrasound equipment used for image acquisition consists of a transducer, scanner, CPU, and display device. During the acquisition process of an ultrasound

image, speckle noise is introduced. Speckle noise is multiplicative in nature and appears perceptually as a variation in the contrast of the image. Images with speckle noise have reduced contrast and hence appear blurred [1]. Speckle noise makes it difficult to perform image processing operations such as edge detection, segmentation, and feature extraction. Speckle noise has a granular pattern, which is an intrinsic property of ultrasound images. A speckle suppression filter can improve image quality by improving contrast and retaining fine structure. Therefore, for clinical and quantitative analysis, it is important to suppress speckle noise [2, 3]. Speckle suppression filters are commonly classified as local statistics filters and anisotropic diffusion filters. Local statistics filters include Mean [4], Median [5], Lee [6], Kuan [7], Frost [8], Wiener [8], and LSMV [9]. Anisotropic diffusion filters include SRAD [10, 11]. The major drawback of anisotropic diffusion filters is that the image becomes blurred as the number of iterations increases, and the computational complexity also increases with the number of iterations. The major drawback of local statistics filters is that they cannot differentiate between edges and noise present in the ultrasound image. Hence, noise is not effectively reduced by local statistics and anisotropic diffusion filters.

Buades et al. performed an asymptotic analysis of neighborhood filters as the size of the neighborhood shrinks to zero, proving that these filters are asymptotically equivalent to the Perona–Malik equation [12, 13]. Bai et al. proposed a new class of fractional-order anisotropic diffusion equations for image denoising [14]. Dabov et al. proposed a novel image denoising strategy based on an enhanced sparse representation. The filter is designed with three successive steps: 3-D transformation of a group, shrinkage of the transform spectrum, and inverse 3-D transformation [15]. Osher et al. proposed an iterative filter known as the total variation filter. Later, a non-iterative scheme for edge-preserving smoothing, known as the bilateral filter, was introduced by Tomasi and Manduchi [16]. Bilateral filters preserve edges while smoothing the image and work for both gray and color images. Bilateral filtering combines range and domain filtering, produces no phantom colors along edges in color images, extracts the texture of an image, and preserves image detail [17]. Finally, the performance of bilateral filters can be analyzed using image quality parameters, which include the peak signal-to-noise ratio (PSNR), speckle suppression index (SSI), and structural similarity index (SSIM).

The remainder of the paper is organized as follows: the second part explains the theory and algorithm of the bilateral filter, and the third and fourth parts present the results and discussion and the conclusion, respectively.
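The multiplicative nature of speckle noise mentioned above can be sketched as g = f·(1 + n), where n is zero-mean noise of a chosen variance. The example below is an illustration under stated assumptions: the paper does not specify the noise distribution, so uniform noise is assumed here, and the function name, the list-of-lists image format, and the clipping to [0, 1] are hypothetical choices.

```python
import math
import random


def add_speckle(image, variance, seed=0):
    """Return image * (1 + n), with n zero-mean uniform noise of given variance.

    `image` is a list of rows of intensities in [0, 1]. A uniform
    distribution on [-a, a] has variance a**2 / 3, hence a = sqrt(3 * variance).
    """
    rng = random.Random(seed)
    half_width = math.sqrt(3.0 * variance)
    noisy = []
    for row in image:
        noisy_row = []
        for pixel in row:
            value = pixel * (1.0 + rng.uniform(-half_width, half_width))
            noisy_row.append(min(1.0, max(0.0, value)))  # clip to valid range
        noisy.append(noisy_row)
    return noisy


if __name__ == "__main__":
    flat = [[0.0, 0.25, 0.5, 0.75]] * 2
    for row in add_speckle(flat, variance=0.04):
        print([round(p, 3) for p in row])
```

Because the noise scales with the pixel value, dark pixels are barely disturbed while bright pixels fluctuate strongly, which is what distinguishes speckle from additive Gaussian noise.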


2 Bilateral Filter for Suppression of Speckle Noise

Bilateral filtering is a simple, non-iterative method that preserves edges while smoothing an image. It is a nonlinear technique in which each output pixel is computed as a weighted average of nearby pixels. Each neighbor is weighted by a spatial component and by its difference in value from the center pixel, so that edges are preserved during smoothing. The filter is applicable to both grayscale and color images. Two types of weights are computed: spatial domain weights and intensity domain weights. The two weights are multiplied to give the contribution of each neighbor to the output pixel. The filter depends on two parameters: the size and the contrast of the features to preserve. It is also used for image segmentation, and it reconstructs edges.

Bilateral filtering combines range and domain filtering. In range filtering, image values are averaged with weights that depend on image intensity; hence these filters are nonlinear. With the photometric distance parameter, color images can be processed easily, and because the filter operates on the range of the image, it reduces phantom colors.

A bilateral filter can also split an image into two parts: the filtered image and its residual image. The filtered image holds only the large-scale features, with the edges left unaffected. The residual image is formed by subtracting the filtered image from the original image. The image is thus separated into a large-scale component and a small-scale component, and these components can be processed separately. The small-scale component isolates the small-scale signal variations, which include the texture and fine detail of the image. In this way the filter separates image structure at different scales, extracting the texture of an image while preserving its detail. The weights can be adjusted based on the difference between two pixels. Thus, bilateral filtering is an effective way to smooth an image while preserving its discontinuities and keeping its edges intact.

The performance of bilateral filters can be analyzed using the image quality assessment parameters Peak Signal-to-Noise Ratio (PSNR), Speckle Suppression Index (SSI), and Structural Similarity Index (SSIM) [18].
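The weighting scheme described in this section can be sketched in a few lines of Python. This is a minimal, unoptimized illustration and not the authors' implementation. It assumes a grayscale image stored as a list of rows with intensities in [0, 1], Gaussian kernels for both the spatial and the intensity weights, and illustrative defaults (`sigma_d` in pixel units, `sigma_r` in intensity units). The function name `bilateral_filter` and its parameters are hypothetical.

```python
import math


def bilateral_filter(img, window=3, sigma_d=1.0, sigma_r=0.16):
    """Bilateral filter for a grayscale image (list of rows, values in [0, 1]).

    Assumptions of this sketch: Gaussian spatial and intensity kernels,
    sigma_d measured in pixels, sigma_r measured in intensity units.
    """
    height, width = len(img), len(img[0])
    radius = window // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = img[y][x]
            weighted_sum = 0.0
            weight_total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < height and 0 <= xx < width):
                        continue  # neighbors outside the image are skipped
                    # Spatial (domain) weight: decays with geometric distance.
                    w_d = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_d ** 2))
                    # Intensity (range) weight: decays with difference in value.
                    diff = img[yy][xx] - center
                    w_r = math.exp(-(diff * diff) / (2.0 * sigma_r ** 2))
                    # The two weights are multiplied for each neighbor.
                    weighted_sum += w_d * w_r * img[yy][xx]
                    weight_total += w_d * w_r
            # weight_total >= 1 because the center pixel always has weight 1.
            out[y][x] = weighted_sum / weight_total
    return out


# A vertical step edge: dark on the left, bright on the right.
step_edge = [[0.0] * 4 + [1.0] * 4 for _ in range(6)]
smoothed = bilateral_filter(step_edge)
```

On the step edge, pixels on opposite sides differ by far more than `sigma_r`, so their intensity weights are negligible and the edge is left essentially unchanged, while small fluctuations inside a flat region are averaged out. The residual image mentioned above would be the pixel-wise difference between `step_edge` and `smoothed`.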

3 Results and Discussions

The bilateral filter involves certain fixed parameters: the window size, the domain value σd, and the range value σr. The window size is fixed to 3, σd to 0.06, and σr to 0.16. The original ultrasound image is of a fetus in the seventh week [19]. The results on the ultrasound image are shown in Tables 2 and 3. The bilateral filter is applied to the ultrasound image at five levels of noise variance: 0.001, 0.02, 0.04, 0.06 and 0.08. The value of 0.001 corresponds to a very low amount of noise in the ultrasound image, and the result for this low noise variance is shown in Fig. 1b, c. The noise variance is then increased up to the maximum level of 0.08, and the result for the maximum noise variance is shown in Fig. 1j, k. It can be seen that the performance of the bilateral filter declines as the noise variance increases, although its performance at low noise variance is good.

Table 1 Algorithm of Bilateral filter

The greater the value of PSNR, the better the quality of the image. As shown in Table 2, the PSNR is highest for the image with the lowest amount of noise (46.6179 dB at variance 0.001) and decreases as the amount of noise in the image increases (28.1777 dB at variance 0.08). Hence, it can be concluded that the bilateral filter performs better at lower noise levels, as it does not suppress noise of higher variance as effectively. The SSI is calculated for both the noisy and the filtered image; it quantifies the amount of speckle noise present in each. The value of SSI is greater for the noisy image and smaller for the filtered image (Table 3).
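The evaluation procedure described here (add noise of a given variance, then score the result) can be illustrated with a short Python sketch. Several points are assumptions, since the text does not specify them: speckle is modeled as multiplicative noise J = I + n·I with n uniform and zero-mean, intensities lie in [0, 1], and the speckle level of a single image is measured by its coefficient of variation (standard deviation divided by mean). The function names are hypothetical, and SSIM is omitted for brevity.

```python
import math
import random


def add_speckle(img, variance, seed=0):
    """Multiplicative noise J = I + n*I, n uniform with zero mean (assumed model)."""
    rng = random.Random(seed)
    half_width = math.sqrt(3.0 * variance)  # uniform on [-a, a] has variance a**2 / 3
    return [[min(1.0, max(0.0, p + rng.uniform(-half_width, half_width) * p))
             for p in row] for row in img]


def psnr(reference, test, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB between two images of equal size."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(reference, test)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def coefficient_of_variation(img):
    """Standard deviation over mean; assumes the image mean is non-zero."""
    values = [p for row in img for p in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


# Hypothetical flat test image, corrupted at the lowest and highest variance used above.
flat = [[0.5] * 20 for _ in range(20)]
low_noise = add_speckle(flat, 0.001)
high_noise = add_speckle(flat, 0.08)
print(psnr(flat, low_noise) > psnr(flat, high_noise))  # PSNR drops as the variance grows
```

The same two measures can be applied to a filtered image to compare it with its noisy input, which mirrors how the noisy and filtered columns are reported in the tables.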

Fig. 1 a Original ultrasound image, b noisy image with variance = 0.001, c filtered image, d noisy image with variance = 0.02, e filtered image, f noisy image with variance = 0.04, g filtered image, h noisy image with variance = 0.06, i filtered image, j noisy image with variance = 0.08, k filtered image

Table 2 Simulation result of PSNR and SSIM for ultrasound image

Noise variance    PSNR (in dB)    SSIM
0.001             46.6179         0.9844
0.02              33.7892         0.7808
0.04              31.0187         0.6528
0.06              29.3202         0.5648
0.08              28.1777         0.5012

Table 3 Simulation result of SSI for ultrasound image

Noise variance    SSI (noisy)    SSI (filtered)
0.001             0.4980         0.1669
0.02              0.5193         0.1702
0.04              0.5359         0.1731
0.06              0.5551         0.1733
0.08              0.5714         0.1777
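As a small worked example, the two SSI columns of Table 3 can be compared directly by taking the ratio of the filtered value to the noisy value at each noise level. This ratio is not reported in the paper; it is computed here only as an illustration, under the assumption that both columns measure the same quantity on the noisy and filtered versions of the same image.

```python
# Values copied from Table 3.
noise_variance = [0.001, 0.02, 0.04, 0.06, 0.08]
ssi_noisy = [0.4980, 0.5193, 0.5359, 0.5551, 0.5714]
ssi_filtered = [0.1669, 0.1702, 0.1731, 0.1733, 0.1777]

# A ratio below 1 means the filtered image scores lower (less speckle) than the noisy one.
ratios = [f / n for f, n in zip(ssi_filtered, ssi_noisy)]

for v, r in zip(noise_variance, ratios):
    print(f"variance={v:<6} filtered/noisy SSI ratio={r:.3f}")
```

For the tabulated values the ratio lies between roughly 0.31 and 0.34 at every noise level, that is, the filtered SSI is about one third of the noisy SSI throughout.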

4 Conclusion

In this paper, the bilateral filter is simulated by adding noise of different variances to the original ultrasound image. The results are obtained by calculating image quality assessment parameters, namely Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Speckle Suppression Index (SSI), as shown in Tables 2 and 3. The main aim of this paper is to implement a bilateral filter for speckle noise suppression that is capable of retaining the fine structure of the original image without sacrificing fine details, while also restoring the edges and enhancing the contrast of the image.

References

1. Bhateja, V., Tripathi, A., Gupta, A., Lay-Ekuakille, A.: Speckle suppression in SAR images employing modified anisotropic diffusion filtering in wavelet domain for environment monitoring. Measurement 74, 246–254 (2015)
2. Bhateja, V., Singh, G., Srivastava, A., Singh, J.: Speckle reduction in ultrasound images using an improved conductance function based on anisotropic diffusion. In: Proceedings of International Conference of Computing for Sustainable Global Development (INDIACom), pp. 619–624. IEEE (March, 2014)
3. Bhateja, V., Tiwari, H., Srivastava, A.: A non local means filtering algorithm for restoration of Rician distributed MRI. In: Emerging ICT for Bridging the Future-Proceeding of the 49th Annual Convention of the Computer Society of India CSI, vol. 2, pp. 1–8. Springer, Cham (2015)
4. Zhang, P., Li, F.: A new adaptive weighted mean filter for removing salt and pepper noise. IEEE J. Signal Process. Lett. 21(10), 1280–1283 (2014)

5. Loupas, T., McDicken, N.W., Allan, L.P.: An adaptive weighted median filter for speckle suppression in medical ultrasound images. IEEE Trans. Circuits Syst. 36(1), 129–135 (1989)
6. Lee, S.J.: Digital image enhancement and noise filtering by use of local statistics. IEEE Trans. Pattern Anal. Mach. Intell. 2(2), 165–168 (1980)
7. Loizou, P.C., Pattichis, S.C., Christodoulou, I.C., Istepanian, S.R., Pantziaris, M., Nicolaides, A.: Comparative evaluation of despeckle filtering in ultrasound imaging of the carotid artery. IEEE Trans. Ultrason. Ferroelectr. Freq. 52(10), 1653–1669 (2005)
8. Finn, S., Glavin, M., Jones, E.: Echocardiographic speckle reduction comparison. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 58(1), 82–101 (2011)
9. Sivakumar, J., Thangavel, K., Saravanan, P.: Computed radiography skull image, enhancement using Wiener filter. In: Proceedings of International Conference on Pattern Recognition, Informatics and Medical Engineering, Henan, China, pp. 307–311. IEEE (2012)
10. Tripathi, A., Bhateja, V., Sharma, A.: Kuan modified anisotropic diffusion approach for speckle filtering. In: Proceedings of the First International Conference on Intelligent Computing and Communication, pp. 537–545. Springer, Singapore (2017)
11. Bhateja, V., Sharma, A., Tripathi, A., Satapathy, S.C., Le, D.N.: An optimized anisotropic diffusion approach for despeckling of SAR images. In: Digital Connectivity Social Impact, pp. 134–140. Springer, Singapore (2016)
12. Buades, A., Coll, B., Morel, J.: Neighborhood filters and PDE's. Technical report (2005–04)
13. Perona, P., Malik, J.: Scale-space and edge detection using anisotropic diffusion. IEEE Trans. Pattern Anal. Mach. Intell. 12(7), 629–639 (1990)
14. Bai, J., Feng, X.: Fractional-order anisotropic diffusion for image denoising. IEEE Trans. Image Process. 16(9) (2007)
15. Dabov, K., Foi, A., Katkovnik, V.: Image denoising by sparse 3-D transform domain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007)
16. Tomasi, C., Manduchi, R.: Bilateral filter for gray and color images. In: Proceedings of IEEE International Conference on Computer Vision, pp. 1–8. IEEE, Bombay, India (1998)
17. Anh, N.D.: Image denoising by adaptive non-local bilateral filter. IEEE Int. J. Comput. Appl. 99(12) (2014)
18. Bhateja, V., Mishra, M., Urooj, S., Lay-Ekuakille, A.: Bilateral despeckling filter in homogeneity domain for breast ultrasound images. In: Proceedings of International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 1027–1032. IEEE (2014)
19. Ultrasound Pictures. www.ob-ultrasound.net/frames.htm

Author Index

A
Agarwal, S., 515
Aher, Shweta Nitin, 125
Ali, Mouad M.H., 563
Arias, Susana A.T., 251
Arolkar, Harshal, 455
Aswale, Pramod, 287

B
Babu, R. Samuel Rajesh, 711
Bakal, Jagdish, 341
Banerjee, Soumya, 409
Bansal, Divya, 391
Bharambe, Shubham, 287
Bharati, Pritam, 287
Bharti, Yashna, 533
Bhaskar, Anand, 165
Bhaskar, K., 185
Bhateja, Vikrant, 735
Bhattacharya, Mahua, 219
Bhatt, Nirav, 155
Bilenia, Aniket, 219

C
Castillo, David, 251
Chakraborty, Sonali, 31
Chanana, Lovneesh, 193
Chattopadhyay, Samiran, 409
Chernyshova, G. Yu., 107
Chindhi, Pradeep, 629
Choudhury, B.B., 515

D
Demba, Hal Ahassanne, 351
Desai, Mayank, 725
Desai, Shrinivas D., 275
Deshapande, Anupama S., 275
Deshpande, Shraddha S., 193
Devi, Duvvuri Vijaya Nagarjuna, 309
Dhalmahapatra, Krantiraditya, 175
Dhami, Nilam, 53
Dhingra, Arjun, 219
Dilip, Sundharamoorthy, 409
Diwanji, Hiteishi, 553
Doshi, Jyotika, 31
D’silva, Godson Michael, 115

F
Franklin Vinod, D., 435

G
Gaikwad, A.A., 363
Gaikwad, Ashok, 563
Gajjar, Sachin, 473
Ganeshmoorthy, Kaliappan, 409
Gebremichael, Tsadkan, 287
Gharpure, Praful, 41
Giraddi, Shantala G., 275
Gogate, Uttara, 341
Gomez, Hector F.A., 251
Gunasheela, K. S., 495
Gupta, Aditi, 735
Gupta, Ananya, 735
Gupta, Anuj Kumar, 193
Gusyatnikov, V. N., 107

H
Haridas, S.L., 581
Henry, Joseph, 143
Hussain, Sadiq, 67

© Springer Nature Singapore Pte Ltd. 2019 S. C. Satapathy and A. Joshi (eds.), Information and Communication Technology for Intelligent Systems, Smart Innovation, Systems and Technologies 106, https://doi.org/10.1007/978-981-13-1742-2

I
Ilavarasan, P. Vigneswara, 465
Ishaq, F.S., 67

J
Jagdale, Rohita, 533
Jain, Yash, 175
Jain, Swati, 309
Joshi, Dhwanika, 219

K
Kadvani, Smit, 87
Kalkhambkar, Geeta, 629
Kamble, Vijaya, 309
Kapadia, Harsh K., 505
Kapur, Avnip, 219
Karibasappa, K.G., 275
Kar, Pragma, 409
Kathiria, Preeti, 455
Kaur, Kuljeet, 391
Kaur, Puneet Jai, 263
Kaushal, Sakshi, 263
Kazi, Tanveer Ahmed, 725
Keswani, Kapil, 165
Khalid, Muhammad, 697
Khanai, Rajashri, 629
Khan, Azharuddin, 115
Khanpara, Pimal, 589
Kiran, V., 1
Kshirsagar, Vivek P., 609
Kulkarni, Pallavi V., 609
Kulkarni, S.B., 363
Kulkarni, Vaishali, 655
Kumar, Sanjay, 11
Kunjumon, Anoop, 115
Korde, Pallavi, 309
Kyada, Kaushik, 573

L
Lozada, T. Edwin Fabricio, 251
Luz M. Aguirre, P., 251

M
Mahalakshmi, Ramkumar, 401
Mahale, Vivek, 563
Mahapatra, Gautam, 409
Maiti, J., 175
Majumdar, Shubhankar, 697
Mareswara Rao, P., 239
Martínez, C. Carlos Eduardo, 251
Mathur, Pratistha, 665
Meenakshi, V., 133
Mehta, Bhavana, 473
Mishra, Abhilasha, 675
Mishra, Vaishali, 505
Modi, Nilesh, 589
Mohammed, I.A., 67
Mondal, Kartick Chandra, 409
Mounika, Kakollu, 309
Muhammad, L.J., 67
Muhil, Atchutha, 409
Mulla, Sajeed S., 193
Mundra, Shikha, 219

N
Naganjaneyulu, S., 21
Nagaraj, Vinila, 1
Nagar, Rikita, 553
Nagori, Meghana B., 609
Nainan, Sumita, 655
Nair, Anuja, 573
Narasimha Reddy, G.K.V., 21
Nedungadi, Prema, 351
Negi, Ashish, 11

P
Pal, Utpal, 523
Palve, Shekhar, 287
Parakh, Arun, 371
Parekh, Sameer, 53
Patel, Aditya, 87
Patel, Axita, 155
Patel, Samir, 87
Patel, Sandip, 53
Pattanayak, S., 515
Pendse, Ankita, 371
Pinnamaneni, Bhanu Prasad, 505
Popat, Param, 309
Prajapati, Priteshkumar, 79, 87
Prajapati, Purvi, 155
Prakash, S., 143
Prasantha, H. S., 495

R
Rajashekara Rao, K., 239
Raj, Himanshu, 219
Ramachandran Nair, K. N., 675
Raman, Raghu, 351
Raman, Rahul, 219
Raval, Shrey Manish, 639
Robalino, Freddy, 251
Roy, Lukose, 115

S
Saharan, Ravi, 427, 533
Sahoo, S.C., 515
Saivishnu, Sundarasrinivasan, 409
Sajja, Priti S., 485
Sameera, Khan, 381
Saravanan, Poomalai, 409
Sathyanarayana, P., 621
Saxena, Ashutosh, 533
Shah, Namra Bhadreshkumar, 639
Shah, Parth, 79, 87
Shah, Sanjeevani, 533
Shaikh, Ayesha, 79
Sharma, Daksh, 219
Sharma, Vijay Kumar, 665
Sharmistha Bhattacharya (Halder), 523
Sheshasaayee, Ananthi, 133
Sheth, Nakul, 79
Sheth, Prasham, 309
Shukla, Aditi, 287
Siddiqui, Mohammad Jawaid, 697
Singh, Amandeep, 465
Singh, J. N., 11
Singh, Kritika, 175
Singh, Sukhwinder, 193
Sivapragasam, Chandrasekaran, 401, 409
Shah, Nisha, 589
Shah, Maitri, 589
Somkunwar, Rachna, 445
Sowjanya Swathi, Nambhatla, 309
Srivasa Rao, P. V., 21
Srivastava, Avantika, 735
Srivastava, Devesh Kumar, 665
Subramanian, Karpaga Selvi, 287
Sudha, K.L., 1
Sujatha, C.N., 621
Suresh Babu, E., 21

T
Talukdar, Jonti, 473
Tamim Tanwer, M., 725
Thakkar, Amit, 155
Thakkar, Tirth Chetankumar, 639
Thune, N.N., 581
Tilala, Mansi, 87
Trivedi, Harshal, 639

U
Undavia, Jaimin, 53
Upadhyay, Tejal, 87

V
Vairachilai, S., 287
Vaishnav, Zankhana B., 485
Vanitha, Sankararajan, 401
Varghese, Jina, 675
Vasudevan, V., 435
Vaze, Vinod M., 445
Venkata Rajini Kanth, Thatiparti, 309
Verma, H.K., 371
Vijayan, Parvathi, 185
Vinodhini, J. Suganthi, 711
Vishwakarma, Pinki, 381
Vygodchikova, I. Yu., 107

W
Walunj, Sandip M., 125

Y
Yadav, Sadanand, 427
Yakubu, Atomsa, 67
Yannawar, Pravin L., 563

Z
Zadafiya, Neel, 573
Zambre, Swati R., 675
Zaveri, Tanish H., 505
