Data Analysis and Classification: Methods and Applications
✍ Edited by Krzysztof Jajuga, Krzysztof Najman, Marek Walesiak
- Publisher
- Springer
- Year
- 2021
- Language
- English
- Pages
- 346
- Series
- Studies in Classification, Data Analysis, and Knowledge Organization
- Edition
- 1
- Category
- Library
✦ Synopsis
This volume gathers peer-reviewed contributions that address a wide range of recent developments in the methodology and applications of data analysis and classification tools in micro- and macroeconomic problems. The papers were originally presented at the 29th Conference of the Section on Classification and Data Analysis of the Polish Statistical Association, SKAD 2020, held in Sopot, Poland, September 7–9, 2020. Providing a balance between methodological contributions and empirical papers, the book is divided into five parts focusing on methodology, finance, economics, social issues and applications dealing with COVID-19 data. It is aimed at a wide audience, including researchers at universities and research institutions, graduate and doctoral students, practitioners, data scientists and employees in public statistical institutions.
✦ Table of Contents
Preface
Contents
About the Editors
Methodology
1 Evaluation of Two-Step Spectral Clustering Algorithm for Large Untypical Data Sets
Abstract
1 Introduction
2 Limitations of Large Data Sets Classification
3 Proposal of New Algorithm
4 Simulation Experiment Results
5 Final Remarks and Conclusions
References
2 Determining the Number of Groups in Cluster Analysis Using Classical Indexes and Stability Measures—Comparison of Results
Abstract
1 Introduction
2 Measures of Cluster Stability
2.1 Ben-Hur and Guyon Stability Measure
2.2 Brock, Pihur, Datta, and Datta Stability Measure
2.3 Fang and Wang Stability Measure
3 A Data Set and the Scheme of Research
4 Empirical Results
4.1 Results for the Social Domain
4.2 Results for the Economic Domain
4.3 Results for the Environmental Domain
4.4 Results for the Institutional and Political Domain
5 Conclusions
References
3 Identification of the Words Most Frequently Used by Different Generations of Twitter Users
Abstract
1 Theory of Generations
2 Analysis of the Textual Data from the Social Network
2.1 Preparation of Text Data
2.2 Word Frequency Analysis
2.3 N-Gram Analysis
2.4 Agglomeration Methods of Hierarchical Clustering and Quality Assessment of Group Structure
3 Applications and Results
3.1 Twitter User Analysis
3.2 Analysis of the Words Occurring Most Commonly
3.3 Bigrams and Trigrams
4 Conclusion
References
4 Classification Algorithms Applications for Information Security on the Internet: A Review
Abstract
1 Introduction
2 Information Security
2.1 Cybersecurity Incidents Classification Taxonomy
2.2 Application on the Real Data
3 Methodology
4 Application of Classification Algorithms to Information Security
4.1 Popular Classification Algorithms
4.2 Classification Algorithms Used Per Study
4.3 Cybersecurity Incidents Examined
4.4 Highly Cited Studies
4.5 Challenges in Classification Algorithms Application to the Information Security
5 Conclusions and Future Research Directions
References
5 Outlier Detection with the Use of Isolation Forests
Abstract
1 The Essence of Outliers in Cluster Analysis
2 Introduction to Isolation Forests and Extended Isolation Forests
3 The Impact of Algorithm Parameters on the Method’s Effectiveness
4 The Impact of Dataset Characteristics on the Anomaly Score Values
5 Discussion of the Empirical Research Results
6 Final Conclusions
References
Application in Finance
6 Propositions of Transformations of Asymmetrical Nominants into Stimulants on the Example of Chosen Financial Ratios
Abstract
1 Introduction
2 Previous Proposition of Modification of Minimum and Maximum
3 Proposals of Nonlinear Transformation of Nominant into Stimulants Normalized in the Range [0; 1]
4 Data and Empirical Results
5 Conclusions
References
7 Gini Regression in the Capital Investment Risk Assessment—Sensitivity Risk Measures in Portfolio Analysis
Abstract
1 Introduction
2 Systematic Risk—Estimation Beta
3 Gini Regression—Multiple Regressions Model
4 Application of Gini Regression in Portfolio Analysis
5 Discussion and Conclusion
References
Application in Economics
8 Enterprise Dark Data
Abstract
1 Introduction
2 Data Classification in Enterprises
2.1 Data Visibility
2.2 Data Quality
2.3 Data Availability
3 Dark Data Definitions—Literature Overview
4 Propositions and Results
4.1 Location of Dark Data in Enterprise
5 Conclusions
References
9 The Significance of Medical Science Issues in Research Papers Published in the Field of Economics
Abstract
1 Introduction
2 Interaction of Economic and Medical Sciences
3 Description of Classifications
4 Research Methodology
4.1 Research Scope and Goals
4.2 Identification of Topics Occurring in Abstracts and Related to Main Subareas of Economics and Medical Science
4.3 Projection of Identified Topics into Main Concepts from JEL and MeSH Ontologies
4.4 Analysis of Relationships Between Concepts Related to Economics and Medical Science
5 The Analysis of the Contribution of Medical Science Issues in Research Papers Published in the Field of Economics
6 Conclusions
Acknowledgements
References
10 Application of Duration Analysis Methods in the Study of the Exit of a Real Estate Sale Offer from the Offer Database System
Abstract
1 Introduction
2 Data Used in the Study
3 Time on the Market—Censored Data
4 Duration Analysis of the Real Estate Offer
5 Empirical Research
6 Conclusions
References
11 Is Society Ready for Long-Term Investments?—Profiles of Electricity Users in Silesia
Abstract
1 Introduction
2 Study of Energy Consumers’ Behaviours and Their Impact on Shaping Pro-ecological Attitudes
3 Characteristics of the Surveyed Electricity Users and Description of the Methods Applied in the Study
4 Results and Discussion
4.1 Characteristics of the Short-Term and Long-Term Investor Classes
4.2 Profiling of the Short-Term and Long-Term Investor Classes
5 Conclusions
Acknowledgments
References
12 The Use of the Spatial Taxonomic Measure of Development to Assess the Tourist Attractiveness of Districts of the Lesser Poland Province
Abstract
1 Introduction
2 Methods
2.1 Linear Ordering
2.2 Hellwig’s Method
2.3 Spatial Taxonomy Measure
3 Dataset and Results
3.1 Dataset
3.2 Empirical Study
4 Conclusions
References
Application in Social Issues
13 Models of Competing Events in Assessing the Effects of the Transition of Unemployed People Between the States of Registration and De-Registration
Abstract
1 Introduction
2 Literature Review
3 Research Methodology
4 Data Used in the Study
5 Empirical Results
6 Conclusions
References
14 Direct Adjusted Survival Probabilities in the Analysis of Finding a Job by the Unemployed Depending on Their Individual Characteristics
Abstract
1 Introduction
2 Research Method
3 Empirical Data and the Estimation of the Models
4 Conclusions
References
15 Europe 2020 Strategy—Objective Evaluation of Realization and Subjective Assessment by Seniors as Beneficiaries of Social Assumptions
Abstract
1 Introduction
2 Research Background and Literature Review
3 Data and Methods
4 Taxonomic Measure of Good Oldness
5 Europe 2020 Strategy in the Seniors’ Opinion
6 Conclusions and Discussion
References
16 Do Seniors Get to the Disco by Bike or in a Taxi?—Classification of Seniors According to Their Preferred Means of Transport
Abstract
1 Introduction
2 Literature Review: Seniors and Their Transport Needs. Attempts at Segmentation
3 Methodology: Segmentations Rationale and Data Collection
3.1 Expert Segmentation
3.2 Segmentation Using Taxonomic Methodology
4 Results and Discussion
4.1 Main Findings of the Expert Segmentation
4.2 Main Findings of the Segmentation Using Taxonomic Methodology
4.3 Comparison of the Two Segmentations
5 Conclusions
References
Application with COVID-19 Data
17 The Impact of the COVID-19 Pandemic on the Economies of European Countries in the Period January–September 2020 Based on Economic Indicators
Abstract
1 Introduction
2 SARS-CoV-2 and Economies
3 Literature Review
4 Economic Indicators
5 Methodology
6 Economic Situation in European Countries and COVID-19
6.1 First Quarter
6.2 Second Quarter
6.3 Third Quarter
6.4 January–September 2020
7 Conclusions
References
18 Modelling the Risk of Foreign Divestment in the Visegrad Group Countries During the COVID-19 Pandemic
Abstract
1 Introduction
2 Review of Literature
3 Research Methodology
4 Results of Empirical Research
4.1 Modelling of the Risk of Insignificant Foreign Divestment
4.2 Modelling of the Risk of Moderate Foreign Divestment
4.3 Modelling of the Risk of Considerable Foreign Divestment
5 Conclusions
Acknowledgements
References
19 Analysis of COVID-19 Dynamics in EU Countries Using the Dynamic Time Warping Method and ARIMA Models
Abstract
1 Introduction
2 Research Methodology
3 Empirical Data
4 The Results of the DTW Method
5 The Results of the ARIMA Modeling
6 Conclusions
References
📜 SIMILAR VOLUMES
Data analysis as an area of importance has grown exponentially, especially during the past couple of decades. This can be attributed to a rapidly growing computer industry and the wide applicability of computational techniques, in conjunction with new advances of analytic tools. This being the case,
The contributions gathered in this book focus on modern methods for statistical learning and modeling in data analysis and present a series of engaging real-world applications. The book covers numerous research topics, ranging from statistical inference and modeling to clustering and factorial me
This volume gathers peer-reviewed contributions on data analysis, classification and related areas presented at the 28th Conference of the Section on Classification and Data Analysis of the Polish Statistical Association, SKAD 2019, held in Szczecin, Poland, on September 18–20, 2019. Pr
This volume contains a selection of papers presented at the Seventh Conference of the International Federation of Classification Societies (IFCS-2000), which was held in Namur, Belgium, July 11-14, 2000. From the originally submitted papers, a careful review process involving two reviewers per