Advances in Data Science and Information Engineering: Proceedings from ICDATA 2020 and IKE 2020 (Transactions on Computational Science and Computational Intelligence)

✍ Scribed by Robert Stahlbock (editor), Gary M. Weiss (editor), Mahmoud Abou-Nasr (editor), Cheng-Ying Yang (editor), Hamid R. Arabnia (editor), Leonidas Deligiannidis (editor)

Publisher: Springer
Year: 2021
Tongue: English
Leaves: 965
Edition: 1st ed. 2021
Category: Library

✦ Synopsis

The book presents the proceedings of two conferences: the 16th International Conference on Data Science (ICDATA 2020) and the 19th International Conference on Information & Knowledge Engineering (IKE 2020), which took place in Las Vegas, NV, USA, July 27-30, 2020. The conferences are part of the larger 2020 World Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE'20), which features 20 major tracks. Papers cover all aspects of Data Science, Data Mining, Machine Learning, Artificial and Computational Intelligence (ICDATA) and Information Retrieval Systems, Information & Knowledge Engineering, Management and Cyber-Learning (IKE). Authors include academics, researchers, professionals, and students.

Presents the proceedings of the 16th International Conference on Data Science (ICDATA 2020) and the 19th International Conference on Information & Knowledge Engineering (IKE 2020);
Includes papers on topics from data mining to machine learning to information retrieval systems;
Authors include academics, researchers, professionals and students.

✦ Table of Contents

Preface
Data Science: ICDATA 2020 – Organizing Committee (Leadership)
Information & Knowledge Engineering: IKE 2020 – Program Committee
Contents
Part I Graph Algorithms, Clustering, and Applications
Phoenix: A Scalable Streaming Hypergraph Analysis Framework
1 Introduction
2 Related Work
3 Framework for Analyzing Streaming Hypergraphs
3.1 Hypergraph Sources and Generation
3.2 Hypergraph Streaming and Consumption
3.3 Distributed and Scalable in-Memory rePresentation of HypERgraph (DiSciPHER)
3.3.1 In-Memory Representation of Hypergraph
3.3.2 Hypergraph Service Nodes (HSNs) and Hypergraph Data Nodes (HDNs)
3.4 Hypergraph Clustering
4 Performance
4.1 Approach for Evaluating the Performance
4.1.1 Dataset
4.1.2 Computational Environment
4.1.3 Software Environment
4.1.4 Performance Metrics
4.2 Results
4.3 Observations
5 Conclusion
References
Revealing the Relation Between Students' Reading Notes and Scores Examination with NLP Features
1 Introduction
2 Related Work
2.1 Students' Performance Prediction with Learning Behaviors
2.2 Students' Performance Prediction with Exercise Records
3 Experimental Design and Data Collection
3.1 Homework Design
3.2 Examination Design
3.3 Data Collection
4 Language Model and Prediction Model
4.1 Two-Step LDA Model
4.1.1 Pipeline of Two-Step LDA
4.1.2 Cross-Validation Result
4.1.3 Discussion
4.2 Topic-Based Latent Variable Model
4.2.1 Probabilistic Generative Model
4.2.2 Probabilistic Discriminative Model
4.2.3 Cross-Validation Result
4.3 Knowledge Graph Model
4.3.1 Learning Behavior LVM
4.3.2 Graph Embedding
4.3.3 Cross-Validation Result
5 Conclusions
References
Deep Metric Similarity Clustering
1 Introduction
2 Related Works
2.1 Clustering Based on Affinity Space
2.2 Deep Metric Learning
3 Deep Metric Similarity Clustering
4 Optimizations
4.1 A Large-Scale Network Co-training Approach
5 Connection to Spectral Clustering
6 Experiments
6.1 Experiment Setups
6.2 Evaluations on Synthetic Data
6.3 Evaluations on Real-World Benchmark Data
6.4 Evaluations on Visual Data
7 Discussions
8 Theoretical Analysis
8.1 Optimization Analysis of the Sparsity Penalty
8.2 Solution to the Optimizations
8.2.1 Solution to S
8.2.2 Solution to Yc
References
Estimating the Effective Topics of Articles and Journals Abstract Using LDA and K-Means Clustering Algorithm
1 Introduction
2 Related Work
3 Research Methodology
3.1 Text Preprocessing
3.1.1 Text Tokenization Process
3.1.2 Stop Words Removing
3.1.3 POS Tag Process
3.1.4 Lemmatizing Process
3.2 Chunking and N-Gram Process
3.2.1 Chunking Process
3.2.2 N-Gram Model Process
3.3 Word2vec Model Process
3.4 Train Model
3.4.1 Latent Dirichlet Allocation (LDA)
3.4.2 K-Means Cluster
3.5 WordNet Process
3.5.1 Noun Phrase Process
3.5.2 WUP Similarity Labels Process
4 Result and Discussion
5 Conclusion
References
Part II Data Science, Social Science, Social Media, and Social Networks
Modelling and Analysis of Network Information Data for Product Purchasing Decisions
1 Introduction
2 Literature Review
3 Methodology
3.1 The Information Network Model Formulation
3.2 The Discrete Choice Model
3.2.1 Model Fitting and Parameter Estimation
4 Simulation Experiment and Discussion
5 Conclusion
References
Novel Community Detection and Ranking Approaches for Social Network Analysis
1 Introduction
2 Literature Review
3 Methodology
4 Results
5 Conclusion
References
How Is Twitter Talking About COVID-19?
1 Introduction
2 Methodology
2.1 Business Understanding
2.2 Data Understanding and Preparation
2.3 Modelling
3 Discussion
4 Conclusion
References
Detecting Asian Values in Asian News via Machine Learning Text Classification
1 Introduction
2 Literature Review
3 Methodology
4 Results
5 Conclusion
Bibliography
Part III Recommendation Systems, Prediction Methods, and Applications
The Evaluation of Rating Systems in Online Free-for-All Games
1 Introduction
2 Related Work
3 Predicting Rank
3.1 Elo
3.2 Glicko
3.3 TrueSkill
3.4 PreviousRank
3.5 Calculating Predicted Ranks
4 Metrics
4.1 Accuracy
4.2 Mean Absolute Error
4.3 Kendall's Rank Correlation Coefficient
4.4 Mean Reciprocal Rank
4.5 Average Precision
4.6 Normalized Discounted Cumulative Gain
5 Methodology
6 Results and Discussions
6.1 Accuracy
6.2 MAE
6.3 Kendall's Tau
6.4 MRR
6.5 Average Precision
6.6 NDCG
7 Conclusion and Future Works
References
A Holistic Analytics Approach for Determining Effective Promotional Product Groupings
1 Introduction
2 Literature Review
3 Data
4 Methodology
4.1 Process Flow of Methodology (Fig. 2)
4.2 Data Preprocessing
4.3 Feature Engineering
5 Modeling and Results
5.1 Logistic Regression
5.1.1 Results
5.2 Optimization
5.2.1 Goal
5.2.2 Solving Method
5.2.3 Advantages of Evolutionary Algorithm
5.2.4 Disadvantages of Evolutionary Algorithm
5.2.5 Objective Function
5.2.6 Decision Variables
5.2.7 Constraints
5.2.8 Tool Used for Optimization
5.2.9 Results (Fig. 7)
6 Conclusions and Future Scope
6.1 Manufacturing Cost of Product
6.2 Store-Wise Analysis
6.3 Customer Transaction Analysis
References
Hierarchical POI Attention Model for Successive POI Recommendation
1 Introduction
2 Related Work
2.1 Successive POI Recommendation
2.2 Temporal Characteristic Modeling
2.3 Textual Content Influence Modeling
2.4 Hybrid Characteristic Modeling
3 Model Design
3.1 Task Definition
3.2 POI Representation Layer
3.3 POI LSTM Layer
3.4 Contextual Sequence Layer
3.5 Output Layer
3.6 Model Training
4 Experimental Setting
4.1 Performance Metric
4.2 Dataset Preparation
4.3 Compared Methods
4.4 Variants of HPAM Model
4.5 Hyper-parameter Set
5 Evaluation
5.1 Performance Comparison
5.2 Performance Analysis
6 Discussion
6.1 Case Study
6.2 Hyper-parameter Optimization
6.3 Effect of Temporal Characteristic
6.4 Effect of Word Attention
7 Conclusion
References
A Comparison of Important Features for Predicting Polish and Chinese Corporate Bankruptcies
1 Introduction
2 Background
3 Predicting Corporate Bankruptcies in Poland
3.1 Data Description
3.2 Modeling
3.3 Prediction Results
3.4 Feature Importance
4 Predicting Chinese Corporate Bankruptcies
4.1 Data Description
4.2 Modeling
4.3 Prediction Results
4.4 Feature Importance
5 Comparison of Important Features
6 Related Work
7 Conclusion
References
Using Matrix Factorization and Evolutionary Strategy to Develop a Latent Factor Recommendation System for an Offline Retailer
1 Introduction
2 Literature Review
2.1 Recommendation System
2.2 Evolutionary Strategy
3 Methodology
3.1 Data Collection
3.2 Data Sampling and Data Splitting
3.3 Data Preparation and Duration Adjustment Function
3.4 Model Measure
3.5 Matrix Factorization
3.6 Evolutionary Algorithm
4 Results
4.1 Duration Adjustment Function Results
4.2 Model Evolution Results
5 Conclusions, Contributions, and Future Work
5.1 Conclusions
5.2 Contributions
5.3 Limitations and Future Work
References
Dynamic Pricing for Sports Tickets
1 Introduction
2 Literature Review
3 Data
4 Methodology
4.1 Data Cleaning
4.2 Feature Engineering
4.3 Data Imputation
5 Models
5.1 Logistic Regression
5.2 Gradient Boosting/XGBoosting
5.3 Random Forest
5.4 Optimization
6 Results
6.1 Demand Forecasting Results from Predictive Models
6.2 Dynamic Pricing Results from Optimization Models
7 Conclusion
7.1 Limitation
7.2 Future Study Direction
7.2.1 Parallel Computing
7.2.2 Database Management
7.2.3 Attendance Rates
References
Virtual Machine Performance Prediction Based on Transfer Learning of Bayesian Network
1 Introduction
2 VMP-PBN Construction
2.1 Definition and Constraints
2.2 Parameter Learning and Structure Learning
3 VMP-PBN Transfer Learning
3.1 Node Variation Degree and BIC Average Score
3.2 Get T-VMP-PBN with Transfer Learning
4 Experimental Results
4.1 Dataset
4.2 Data Pre-processing
4.3 The Constraints of VMP-PBN in Cloud
4.4 Performance Prediction of VM Based on S-VMP-PBN
4.5 Performance Prediction of VM Based on T-VMP-PBN
5 Conclusion
References
A Personalized Recommender System Using Real-Time Search Data Integrated with Historical Data
1 Introduction
2 Data
3 Methodology
3.1 Exploratory Data Analysis (EDA)
3.2 Data Preprocessing
3.3 Modeling and Validation
4 Models
4.1 Model 1
4.2 Model 2
5 Results
6 Conclusion
References
Automated Prediction of Voter's Party Affiliation Using AI
1 Introduction
2 Paper Organization
3 Background and Related Works
3.1 Evaluation of Reviewed Published Works and Apps
3.1.1 Big Data-Driven Classification Approaches
3.1.2 ML Algorithms and Prediction Models
3.1.3 Reviewed Existing Proprietary Apps
4 Data and Methodology
4.1 Pv1.0 Problem Scope Definition
4.2 Data Collection, Feature Selection and Engineering
4.2.1 Data Selection Categories and Rationale
4.2.2 Feature Selection and Engineering
4.2.3 Target Variable for Prediction
4.3 Data Preprocessing and Splitting
5 PV1.0 Model
5.1 Decision Tree Classifier
5.2 Random Forest Classifier
5.3 Gradient Boosting Classifier Using XGBoost
5.4 Performance Measures
6 PV1.0 Model Evaluation
6.1 Performance Results
6.1.1 Decision Tree Classification Model
6.1.2 Random Forest Classification Model
6.1.3 Gradient Boosting Classification Model with XGBoost Using Grid Search for Tuning
6.2 Feature Importance
6.3 Testing XGBoost Model with Blind Test Dataset
7 Conclusion
8 Future Work
References
Part IV Data Science, Deep Learning, and CNN
Deep Ensemble Learning for Early-Stage Churn Management in Subscription-Based Business
1 Introduction
2 Related Work
3 Models
3.1 Standard Meta Stacking Model
3.2 Deep Stacking Model
4 Case Study with Ancestry Data
4.1 Problem Formulation
4.2 Features Generation
4.3 Data
4.4 Evaluation Metrics
5 Experiment Results
5.1 Performance of Single Models
5.2 Performance of Stacking Models
5.3 Insights on Model Performance
6 System Implementation
7 Conclusions
References
Extending Micromobility Deployments: A Concept and Local Case Study
1 Introduction
2 History
3 Benefits and Deficiencies
4 Factors in Adoption
5 Method
6 Data for the Case Study
7 Results
8 Conclusions
References
Real-Time Spatiotemporal Air Pollution Prediction with Deep Convolutional LSTM Through Satellite Image Analysis
1 Introduction
2 Methods
2.1 Dataset
2.2 Data Preprocessing
2.3 Input Data Labeling
3 Results
3.1 Aerial Satellite Image Model
3.2 Error Analysis
4 Conclusion
5 Future Work
Bibliography
Performance Analysis of Deep Neural Maps
1 Introduction
2 Self-Organizing Maps
3 Autoencoders
4 Deep Neural Maps
5 Experiments
6 Results
7 Conclusions and Further Work
References
Implicit Dedupe Learning Method on Contextual Data Quality Problems
1 Introduction
2 Related Works
3 Background
3.1 Data Quality Fundamentals
3.2 Data Quality Problems Detection Techniques
4 Materials and Methods
4.1 Dedupe Learning for Detecting and Correcting Contextual Data Quality Problems
4.2 Dedupe Learning Method
4.3 Dedupe Learning (DDL) Setup
5 Experimental Evaluation
5.1 Experiment Setup
5.2 Experiment for Matching Algorithm on the Datasets
6 Discussions
7 Conclusions
References
Deep Learning Approach to Extract Geometric Features of Bacterial Cells in Biofilms
1 Introduction
1.1 Key Contributions
2 Methods and Results
2.1 Data Collection
2.2 DCNN Training
2.3 Cell Cluster Segmentation
2.4 Qualitative Comparison of the Proposed Model with ImageJ
3 Conclusions
References
GFDLECG: PAC Classification for ECG Signals Using Gradient Features and Deep Learning
1 Introduction
2 Problem Formulation
2.1 Multivariate Time Series (MTS)
2.2 The Input
2.3 Problem Definition
3 Related Work
3.1 Feature Generation and Selection from Multivariate Time Series of ECG
3.2 Traditional Classification Algorithms Using Logistic Regression, Decision Tree, and KNN
3.3 Deep Learning for Time Series Classification
4 ECG Binary Classification Framework
4.1 Dataset Preprocessing
4.2 Gradient Feature Generation (GFG)
4.3 The Proposed Neural Network
4.3.1 Gated Recurrent Units Network (GRU)
4.3.2 Residual Fully Convolutional Networks (ResFCN) with GRU
4.3.3 Combining GRU and ResFCNGRU
5 Experiments and Results
5.1 Dataset Description
5.2 Overall Performance Summary
5.3 The Effect of GRU in GFDLECG
5.4 GFDLECG vs Other Neural Networks Using Gradient Filter
5.5 The Effect of Gradient Feature
6 Conclusion
References
Tornado Storm Data Synthesization Using Deep Convolutional Generative Adversarial Network
1 Introduction
2 GAN-Based Data Augmentation
3 Data
4 Results: DCGAN-Based Weather Data Synthesization
5 Conclusions
References
Integrated Plant Growth and Disease Monitoring with IoT and Deep Learning Technology
1 Introduction
2 Literature Review
3 Problem Statement
4 Architecture and Methodology
4.1 Analytics Server
4.2 Activity Log
4.3 Sensors
4.4 Image Intake and Analysis
4.5 Alerting System
5 Discussion and Further Research
References
Part V Data Analytics, Mining, Machine Learning, Information Retrieval, and Applications
Meta-Learning for Industrial System Monitoring via Multi-Objective Optimization
1 Introduction
2 Prior Work
3 Methodology
3.1 Feature Design
3.2 Feature Reduction
3.3 Unsupervised Learning via Clustering
3.4 Metrics for Evaluating Quality of Clusters
3.5 Approaches for Meta-Learning over System Space
3.5.1 Single-Objective Optimization
3.5.2 Single-Objective Optimization by Aggregating Metrics of Interest
3.5.3 Multi-Objective Optimization
4 Results
5 Conclusion
References
Leveraging Insights from "Buy-Online Pickup-in-Store" Data to Improve On-Shelf Availability
1 Introduction
2 Literature Review
3 Data
4 Methodology
4.1 Data Preprocessing
4.2 Feature Engineering
4.3 Upsampling
4.4 Target Transformation
4.5 Parallel Computation
5 Model
5.1 Model Evaluation Criteria
5.2 Model Experiments
5.3 Logistic Regression
5.4 Random Forest
5.5 AdaBoost
6 Results
7 Conclusions
References
Analyzing the Impact of Foursquare and Streetlight Data with Human Demographics on Future Crime Prediction
1 Introduction
2 Related Work
3 Feature Extraction
3.1 Temporal and Historical Features
3.2 Demographic and Streetlight Features
3.3 POI Features
3.4 Human Mobility Dynamic Features
4 Experiments
4.1 Datasets
4.2 Experimental Setup
4.3 Results for Our Proposed Features
4.4 Comparison with a Baseline
5 Conclusions and Future Work
References
Nested Named Sets in Information Retrieval
1 Introduction
2 Basic Definitions and Constructions
2.1 General Named Sets
2.2 Set-Based Named Sets
2.3 Nested Named Sets
3 Relations Between Nested Named Sets
4 Application to Databases
4.1 The Branching Data Model
4.2 The Nested Branching Data Model
5 Conclusion
References
Obstacle Detection via Air Disturbance in Autonomous Quadcopters
1 Introduction
2 Literature Review
3 Modeling
3.1 Mathematical Modeling
3.2 Airflow Testing
4 Experimentation
4.1 Wall Data Collection
4.2 Classifier Building
5 Results
6 Discussion
6.1 Classifier Accuracy
6.2 Artificial Ground Effect
6.3 Lab Variances
6.4 Future Work
References
Comprehensive Performance Comparison Between Flink and Spark Streaming for Real-Time Health Score Service in Manufacturing
1 Introduction
2 Goals and Background
2.1 Goals
2.2 Characteristics of Streaming Process Engines
2.2.1 Flink
2.2.2 Spark Streaming
3 Methods
3.1 System Architecture
3.2 Stream APIs
3.3 Optimizations for Stream Processing
3.3.1 Number of Parallelisms
3.3.2 Checkpointing
3.4 Models for Health Score
3.4.1 Long Short-Term Memory (LSTM) Model
3.4.2 Multivariate Analysis Model
4 Datasets
5 Results
5.1 Performance of the Health Score Service
5.1.1 Processing Time for Different Time Interval
5.1.2 Maximum Number of Assets
5.1.3 Varying the Number of Parallelisms
6 Discussion
7 Conclusions
References
Discovery of Urban Mobility Patterns
1 Introduction
2 Related Works
3 General Aspects
4 Methodology
5 Experimental Results and Discussion
5.1 Clustering
5.2 Huff Model
5.3 Consumption Patterns
6 Conclusions and Recommendations
References
Improving Model Accuracy with Probability Scoring Machine Learning Models
1 Introduction
2 Literature Review
2.1 Model Selection
2.2 Probability Scoring
3 Data
4 Methodology
4.1 Data Exploration and Pre-Processing
4.2 Feature Engineering and Data Partitioning
4.3 Model Building, Evaluation, and Ensemble Creation
5 Models
5.1 Logistic Regression
5.2 Random Forest
5.3 Artificial Neural Network
5.4 Boosting Models
5.5 Ensemble Model
6 Results
6.1 Feature Importance
6.2 Model
7 Conclusion
References
Ensemble Learning for Early Identification of Students at Risk from Online Learning Platforms
1 Introduction
2 Related Works
3 Method
3.1 Data Acquisition and Analysis
3.2 Information Fusion and Ratio Formulation
3.3 Base Classifiers
3.4 Stacking Ensemble
4 Experiment and Result
4.1 Deployment and Experimental Settings
4.2 Performance vs. Percentage of Data Used
4.3 Base Classifiers vs. Stacking Ensemble
4.4 Comparison with Related Works
5 Conclusion and Discussion
References
An Improved Oversampling Method Based on Neighborhood Kernel Density Estimation for Imbalanced Emotion Dataset
1 Introduction
2 Related Works
2.1 Random Oversampling
2.2 SMOTE (Synthetic Minority Oversampling Technique)
2.3 Borderline-SMOTE
2.4 ADASYN (Adaptive Synthetic Sampling)
3 The Proposed Oversampling Method
4 Experimental Setup
4.1 HRV Dataset for Emotion Classification
4.2 Performance Measures for the Evaluation of Classification
5 Results and Discussion
6 Conclusion
References
Time Series Modelling Strategies for Road Traffic Accident and Injury Data: A Case Study
1 Introduction
2 Methodology
3 Data
4 Results and Discussion
5 Conclusion
References
Towards a Reference Model for Artificial Intelligence Supporting Big Data Analysis
1 Introduction and Motivation
2 Artificial Intelligence
3 Methodology
4 Conceptual Modeling
4.1 AI Models for Supporting Big Data Analysis
4.2 AI Input and Output Data
4.3 AI User Stereotypes
4.4 Reference Model
5 Remaining Challenges and Outlook
6 Conclusion
References
Improving Physician Decision-Making and Patient Outcomes Using Analytics: A Case Study with the World's Leading Knee Replacement Surgeon
1 Introduction
2 Literature Review
3 Data
4 Methodology
5 Model
6 Results
7 Conclusions
Bibliography
Optimizing Network Intrusion Detection Using Machine Learning
1 Introduction
2 Literature Review
3 Methodology
4 Results
5 Conclusion
References
Hyperparameter Optimization Algorithms for Gaussian Process Regression of Brain Tissue Compressive Stress
1 Introduction
2 Materials and Methods
2.1 Dataset
2.2 Gaussian Process Regression
2.3 Search Algorithms
2.3.1 Bayesian Optimization
2.3.2 Grid Search
2.3.3 Random Search
2.4 Assessment
3 Results
4 Discussion
References
Competitive Pokémon Usage Tier Classification
1 Introduction
2 Background
2.1 Pokémon Typing
2.1.1 Base Stats
3 Experiment Methodology
3.1 Data Sets
3.2 Algorithms
3.3 Setup Methodology
3.4 Evaluation Metrics
4 Results
4.1 Interpreting Misclassifications
5 Conclusion
Reference
Mining Modern Music: The Classification of Popular Songs
1 Introduction
2 Attribute Definitions [2]
3 Pre-Processing
4 Experimentation
4.1 Testing Phase I
4.2 Testing Phase II
5 Feature Relations
6 Results
7 Conclusion
References
The Effectiveness of Pre-trained Code Embeddings
1 Introduction
2 Related Work
3 Methodology
4 Results
5 Discussion
6 Conclusions
References
An Analysis of Flight Delays at Taoyuan Airport
1 Introduction
2 Literature Review
2.1 Importance, Cost Effects, and Models of Flight Delays
2.2 Influential Factors of Flight Delays
2.3 FSC and LCC Behavior
3 Methodology
3.1 Variable Definitions
3.2 ANOVA and Regression
3.3 Data-Mining Process
4 Results
4.1 Descriptive Analysis
4.2 Regression Results
4.3 Data-Mining Results
5 Discussion and Conclusions
5.1 Discussion
5.2 Conclusions
References
Data Analysis for Supporting Cleaning Schedule of Photovoltaic Power Plants
1 Introduction
2 Methods
3 Preliminary Result
4 Remarks and Future Tasks
References
Part VI Information & Knowledge Engineering Methodologies, Frameworks, and Applications
Concept into Architecture: A Pragmatic Modeling Method for the Acquisition and Representation of Information
1 Introduction
2 Research Methodology
3 Problem Formulation: Information Acquisition and Representation
4 Concept Design
4.1 Operational Context
4.2 User Requirements
5 Implementation of Requirements and Development of an Algorithm
6 Evaluation
7 Conclusion
References
Improving Knowledge Engineering Through Inter-Organisational Architecture, Culture, Agility and Change in E-Learning Project Teams
1 Introduction
1.1 Effective Project Management
1.2 Selection of Methodology for Project Management
1.3 Project Management and Organisational Value Creation
1.4 Project Failure and Recovery in the Project Management Process
2 Background
2.1 Project, Portfolio and Programme Management
2.2 Organisational Architecture
2.3 Organisational Agility
2.4 Organisational Culture
2.5 Organisational Change
3 Problem Statement
4 A Revised Model for Integrating Organisational Architecture and Culture into Standardised Project, Portfolio and Programme Management Methodologies for E-Learning Product Development
4.1 Applying the Cornerstone-TS 3.0 Model
4.1.1 Organisational Culture
4.1.2 Organisational Change
4.1.3 Organisational Agility
4.1.4 Organisational Architecture
5 Discussion
References
Do Sarcastic News and Online Comments Make Readers Happier?
1 Introduction
1.1 Incivility in News Comments
2 Research Hypothesis
3 Research Method
3.1 Research Design
3.2 Experimental Materials
3.3 Emotion Scale
3.4 Control Variables
3.5 Formal Experiment
4 Research Results
4.1 Subjects
4.2 Control Variables
4.3 Emotion Difference
5 Discussion
References
GeoDataLinks: A Suggestion for a Replacement for the ESRI Shapefile
1 Introduction
2 About the Shapefile
3 Previously Proposed Replacements for Shapefile
4 GeoDataLinks, the New Storage Scheme for Geographic Data
5 GeoDataLinks: Using ILE as a Replacement for Shapefile
6 Conclusions
References
Nutrition Intake and Emotions Logging System
1 Introduction
2 Problem Statement
3 Establishing Requirements and Designing a Simple Interactive System
3.1 Usability and User Experience Goals
3.2 Questions Using Design Goals
3.3 Users' Needs, Requirements, and Main Tasks
3.4 Scenarios and Use Cases
3.5 Requirements Using Volere shell
3.6 Conceptual Model
3.7 Mental Model
3.8 Analyzed Findings and Enhanced Conceptual Model
3.9 Interface Design Issues
3.10 Initial Designs and Evaluation of Designs
4 Implementing a Simple Interactive System
5 Data Analysis and Evaluation of the Simple Interactive System
5.1 Goals
5.2 Exploring the Questions
5.3 Evaluation Method Choice
5.4 Practical Issues Identified
5.5 Collected Data
5.6 Evaluation of Data
5.7 Data Analysis
5.8 Interpretation of Data
5.9 Ethical Issues
6 Conclusion
References
Geographical Labeling of Web Objects Through Maximum Marginal Classification
1 Introduction
1.1 Overview on Web Object Search Engine
1.2 Research Issues
1.3 Contributions
2 Related Work
3 Geographical Labeling Technique for Web Objects
3.1 Problem Statement
3.2 Maximum Marginal Classification Model
3.3 Algorithm
4 Results and Discussion
4.1 System Design
4.2 Empirical Results Discussions
5 Conclusion
References
Automatic Brand Name Translation Based on Hexagonal Pyramid Model
1 Introduction
2 Translation Strategies and Hexagonal Pyramid Model
2.1 Phonetic Strategy (P)
2.2 Semantic Strategy (S)
2.3 Phono-semantic Strategy (P, S)
2.4 Commercial Strategy (C)
2.4.1 Re-creation
2.4.2 No-Translation
2.5 Phono-Commercial Strategy (P, C)
3 Application of Hexagonal Pyramid Model
4 Conclusion
References
A Human Resources Competence Actualization Approach for Expert Networks
1 Introduction
2 Related Work
3 Reference Model of Human Resources Competence Actualization
4 Human Resources Competence Actualization
5 Algorithm Evaluation
6 Conclusion
References
Smart Health Emotions Tracking System
1 Introduction
2 Requirements
3 Establish Requirements and Design a Simple Interactive System
3.1 Usability Goals with Respect to Design Goals
3.2 Developer Concerned Questions
3.3 Use Case Diagram for the Smart Health Tracking System
3.4 Conceptual Model for Smart Health Tracking System
3.5 Mental Model from People
3.6 Interface Design Issues
4 Implement a Simple Interactive System
5 Data Analysis and Evaluation
5.1 Goals to Be Attained
5.2 Evaluation Method
5.3 Practical Issues Identified
5.4 Ethical Issues
5.5 Collection of Data
5.6 Evaluation of Data
6 Conclusion
References
Part VII Video Processing, Imaging Science, and Applications
Content-Based Image Retrieval Using Deep Learning
1 Introduction
2 Related Work
3 Proposed Method
3.1 Preprocessing
3.2 Feature Extraction for Image Retrieval
3.3 Deep Neural Networks
4 Experimental Results
4.1 Dataset
4.2 Discussion of Results
5 Conclusion
References
Human–Computer Interaction Interface for Driver Suspicious Action Analysis in Vehicle Cabin
1 Introduction
2 Related Work
3 Reference Model of Driver Monitoring System
4 Case Study Using a Smartphone
5 Conclusion
References
Image Resizing in DCT Domain
1 Introduction
2 Review of SHE and SHE-SC
3 Proposed Algorithm
4 Experimental Results
5 Conclusions
References
Part VIII Data Science and Information & Knowledge Engineering
Comparative Analysis of Sampling Methods for Data Quality Assessment
1 Introduction
2 Literature Review
2.1 Data Quality Assessment
2.2 Data Quality Dimensions
2.3 Sample Techniques
3 Methodology
4 Experiment and Results
4.1 Experimental Setting and Datasets
4.2 Experimental Results
5 Conclusions and Future Work
References
A Resampling Based Semi-supervised Learning Analysis for Identifying School Needs of Backpack Programs
1 Introduction
2 Methodology
2.1 Supervised Learning and Selected Algorithms
2.1.1 Logistic Regression (LG)
2.1.2 Naïve Bayes (NB)
2.1.3 Decision Tree (DT)
2.1.4 Random Forest (RF)
2.1.5 Support Vector Machines (SVM)
2.2 Unsupervised Learning
2.3 Resampling and Semi-supervised Learning
2.4 Proposed RSSL Algorithm
2.5 The Ensemble of the SSL Results
3 Data Description
3.1 Data Collection
3.2 Descriptive Analysis
4 Results and Discussions
4.1 RSSL Result and Impacts of Selected Classifiers
4.2 Ensemble Results
5 Conclusion and Future Research
References
Data-Driven Environmental Management: A Digital Prototype Dashboard to Analyze and Monitor the Precipitation on Susquehanna River Basin
1 Introduction
2 Machine Learning Approaches
2.1 Precipitation and Streamflow
3 Methodology and Datasets
4 Data Exploratory Analysis
5 Operational Dashboard
6 Conclusion
References
Viability of Water Making from Air in Jazan, Saudi Arabia
1 Introduction
1.1 Solar Energy
1.2 Wind Energy
1.3 Water Making Device
2 Water Generation from Air
3 Comparison between Water Desalination and Water from Air
4 Analysis and Simulation
4.1 Analysis
4.2 Simulation
4.3 Simulation Results
5 Conclusion
References
A Dynamic Data and Information Processing Model for Unmanned Aircraft Systems
1 Introduction
2 Background
3 Methodology
4 Anticipated Outcomes
References
Utilizing Economic Activity and Data Science to Predict and Mediate Global Conflict
1 Introduction
2 Related Work
3 Moving Towards Efficiency
4 Thought Process
5 Real Application
6 First Approach
7 Current Approach
8 Discussion and Conclusion
References
Part IX Machine Learning, Information & Knowledge Engineering, and Pattern Recognition
A Brief Review of Domain Adaptation
1 Introduction
2 Notations and Definitions
3 Categorization of Domain Adaptation
4 Approaches
4.1 Instance-Based Adaptation
4.2 Feature-Based Adaptation
4.3 Deep Domain Adaptation
References
Fake News Detection Through Topic Modeling and Optimized Deep Learning with Multi-Domain Knowledge Sources
1 Introduction
2 Related Work
2.1 Fake News Detection and Classification Approaches
2.2 Stance Detection-Based Fake News Assessment Approaches
3 Proposed Methodology
3.1 Pre-training the BERT Model
3.2 Topic Extraction from News Using LDA
3.3 Deep Learning-Assisted Fine-Tuning
3.4 Fake News Detection Through Intelligent Decision-Making
4 Experimental Evaluation
4.1 Dataset
4.2 Evaluation Metric
5 Conclusion
References
Accuracy Evaluation: Applying Different Classification Methods for COVID-19 Data
1 Introduction
2 Literature Review
3 Methodology
3.1 KNN
3.2 K-Means
3.3 Decision Tree Algorithm
4 Experiment and Results
4.1 Datasets and Preprocessing
4.2 Experimental Setting
4.3 Experimental Results
5 Conclusions
References
Clearview, an Improved Temporal GIS Viewer and Its Use in Discovering Spatiotemporal Patterns
1 Introduction
2 Extant Temporal GIS Software Packages with Display Capabilities
3 Previous Ways to Represent Data in Historical and Temporal GIS Databases
4 Clearview, a Temporal GIS Viewer for Spatiotemporal History
5 Previously Unseen Patterns in the CHGIS V.4 Time Series Database Discovered Using Clearview
6 Conclusions
References
Using Entropy Measures for Evaluating the Quality of Entity Resolution
1 Introduction
2 Problem Statement
3 Proposed Method for Entropy Evaluation
4 Research Method
4.1 Reference Sets
4.2 Blocking and Stop Word Removal
4.3 Pairwise Matching with the Monge-Elkan Comparator
4.4 Comparing the Entropy Value to F-Measure
5 Results
6 Conclusion and Future Research
References
Improving Performance of Machine Learning on Prediction of Breast Cancer Over a Small Sample Dataset
1 Introduction
1.1 Background
1.2 Related Work
1.3 Paper Organization
2 Problem, Hypothesis, and Research Questions
2.1 Problem Statement
2.2 Hypothesis Statement
2.3 Research Questions
3 Works That Answer the 1st Research Question
3.1 Stratified Sampling
3.2 Data Augmentation
3.3 Distribution Comparison
3.4 Summary
4 Works That Answer the 2nd Research Question
4.1 Support Vector Machine with Radial Basis (SVMRB): Original vs. Augmented Data
4.2 Gradient Boosting (GB): Original vs. Augmented Data
4.3 Random Forest (RF): Original vs. Augmented Data
4.4 Summary
5 Conclusion
References
Development and Evaluation of a Machine Learning-Based Value Investing Methodology
1 Introduction
2 Experimental Setup
2.1 Dataset Description
2.2 Testing Criteria and Algorithms
3 Experimental Results
3.1 Experiment 1: Determining Optimal Context Years
3.2 Experiment 2: 15 Years (1990–2005), 11 Years (2005–2016)
4 Conclusion
References
Index

📜 SIMILAR VOLUMES

📁 Proceedings of International Conference on Computational Intelligence, Data Science and Cloud Computing: IEM-ICDC 2020 (Lecture Notes on Data Engineering and Communications Technologies, 62)

✍ Valentina E. Balas (editor), Aboul Ella Hassanien (editor), Satyajit Chakrabarti 📂 Library 📅 2021 🏛 Springer 🌐 English

This book includes selected papers presented at International Conference on Computational Intelligence, Data Science and Cloud Computing (IEM-ICDC) 2020, organized by the Department of Information Technology, Institute of Engineering & Management, Kolkata, India, during 25–27 September

📁 Advances in Intelligent Computing and Communication: Proceedings of ICAC 2020

✍ Swagatam Das (editor), Mihir Narayan Mohanty (editor) 📂 Library 📅 2021 🏛 Springer 🌐 English

This book presents high-quality research papers presented at the 3rd International Conference on Intelligent Computing and Advances in Communication (ICAC 2020) organized by Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, Odisha, India, in November 2020. This book brings out the new adva

📁 Recent Advances in Artificial Intelligence and Data Engineering: Select Proceedings of AIDE 2020 (Advances in Intelligent Systems and Computing, 1386)

✍ Pushparaj Shetty D. (editor), Surendra Shetty (editor) 📂 Library 📅 2021 🏛 Springer 🌐 English

This book presents select proceedings of the International Conference on Artificial Intelligence and Data Engineering (AIDE 2020). Various topics covered in this book include deep learning, neural networks, machine learning, computational intelligence, cognitive computing, fuzzy logic, expert

📁 Principles of Data Science (Transactions on Computational Science and Computational Intelligence)

✍ Hamid R. Arabnia (editor), Kevin Daimi (editor), Robert Stahlbock (editor), Cris 📂 Library 📅 2020 🏛 Springer 🌐 English

This book provides readers with a thorough understanding of various research areas within the field of data science. The book introduces readers to various techniques for data acquisition, extraction, and cleaning, data summarizing and modeling, data analysis and communication techniques, data scien

📁 Advances on Intelligent Informatics and Computing: Health Informatics, Intelligent Systems, Data Science and Smart Computing (Lecture Notes on Data Engineering and Communications Technologies, 127)

✍ Faisal Saeed (editor), Fathey Mohammed (editor), Fuad Ghaleb (editor) 📂 Library 📅 2022 🏛 Springer 🌐 English

This book presents emerging trends in intelligent computing and informatics. This book presents the papers included in the proceedings of the 6th International Conference of Reliable Information and Communication Technology 2021 (IRICT 2021) that was held virtually, on Dec. 22-23, 2021. The