

Big Data: A Tutorial-Based Approach

✍ Scribed by Nasir Raheem


Publisher: CRC Press
Year: 2019
Tongue: English
Leaves: 203
Category: Library


✦ Synopsis


Big Data: A Tutorial-Based Approach explores the tools and techniques used to bring about the marriage of structured and unstructured data. It focuses on Hadoop Distributed Storage and MapReduce Processing by implementing (i) Tools and Techniques of Hadoop Eco System, (ii) Hadoop Distributed File System Infrastructure, and (iii) efficient MapReduce processing. The book includes Use Cases and Tutorials to provide an integrated approach that answers the ‘What’, ‘How’, and ‘Why’ of Big Data.

Features

Identifies the primary drivers of Big Data
Walks readers through the theory, methods and technology of Big Data
Explains how to handle the 4 V's of Big Data in order to extract value for better business decision making
Shows how and why data connectors are critical and necessary for Agile text analytics
Includes in-depth tutorials to perform necessary set-ups, installation, configuration and execution of important tasks
Explains the command-line and GUI interfaces to a powerful tool for exchanging data between Hadoop and legacy relational databases (RDBMS)

✦ Table of Contents


Cover
Half Title
Title Page
Copyright Page
Dedication
Contents
List of Tutorials
List of Figures/Illustrations
Foreword
Preface
Acknowledgements
Author
RAPID GROWTH OF BIG DATA
BIG DATA DEFINITION
BIG DATA PROJECTS
BUSINESS VALUE OF BIG DATA
OVERVIEW
HIGH-LEVEL TASKS TO IMPLEMENT INFORMATICA BDM, CLOUDERA HIVE, AND TABLEAU
BIG DATA TRIGGERS DIGITAL TRANSFORMATION OF THE PRODUCTION MODEL
BIG DATA CHALLENGES AND ASSOCIATED USE CASES
HADOOP INFRASTRUCTURE: OVERVIEW
Hyperconverged Hadoop Infrastructure
Compute Hardware Components
Network Hardware Components
Storage Hardware Architecture and Components
HADOOP ECO SYSTEM
HADOOP DISTRIBUTED FILE PROCESSING
MAPREDUCE SOFTWARE
MAPREDUCE SOFTWARE INSTALLATION
MAPREDUCE PROCESSING
BIG DATA USE CASE: HEALTH
BIG DATA USE CASE: MANUFACTURING
BIG DATA USE CASE: INSURANCE
OVERVIEW
WHERE IS SQOOP USED?
SQOOP COMMANDS
HIVE ARGUMENTS USED BY SQOOP
APACHE SQOOP ARCHITECTURE
APACHE SQOOP COMMAND LINE INTERFACE
OVERVIEW
INFORMATICA: MATURE AND COMPREHENSIVE BIG DATA SOLUTION
INFORMATICA DATA INTEGRATION
OVERVIEW
DATA REPOSITORY LAYER
HIVE BIG DATA WAREHOUSE
SLOWLY CHANGING DIMENSION IN HIVE
HIVE METADATA: DEFINITIONS
INTEGRATED USE OF DATA INTEGRATION, DATA MANAGEMENT, AND DATA VISUALIZATION TOOLS
OVERVIEW
Numbers
Strings
Factors
SUCCESS FACTORS FOR TABLEAU
TABLEAU: STEP FORWARD IN DATA ANALYTICS
TABLEAU DATA ENGINE TUNING
Curate Data from the Data Lake
Optimize Data Extracts
Customize Tableau Connection Performance
OVERVIEW
TEXT ANALYTICS AS MEANS TO EXTRACT VALUE FROM UN-STRUCTURED DATA
Decision Maker
Data Scientists
FROM DATA TO ACTION
CONCLUSION
OVERVIEW
Conclusion: Flexibility and Agility
Pre-Installation Steps to Set Up Denodo Development Environment
CONCLUSION
OVERVIEW
Platform as a Service (PaaS)
Infrastructure as a Service (IaaS)
CLOUD COMPUTING VERSUS HADOOP PROCESSING
Infrastructure as a Service (IaaS)
CONCLUSION
SELF-ASSESSMENT QUIZ
ANSWERS TO THE SELF-ASSESSMENT QUIZ
REFERENCES
INDEX


📜 SIMILAR VOLUMES


Granular Computing Based Machine Learnin
✍ Han Liu, Mihaela Cocea 📂 Library 📅 2018 🏛 Springer International Publishing 🌐 English

This book explores the significant role of granular computing in advancing machine learning towards in-depth processing of big data. It begins by introducing the main characteristics of big data, i.e., the five Vs: Volume, Velocity, Variety, Veracity and Variability. The book explores granular com

Rule Based Systems for Big Data: A Machi
✍ Han Liu, Alexander Gegov, Mihaela Cocea 📂 Library 📅 2015 🏛 Springer International Publishing 🌐 English

The ideas introduced in this book explore the relationships among rule based systems, machine learning and big data. Rule based systems are seen as a special type of expert systems, which can be built by using expert knowledge or learning from real data. The book focuses on the developm

A Practitioner's Guide to Data Governanc
✍ Uma Gupta; San Cannon 📂 Library 📅 2020 🏛 Emerald Publishing Limited 🌐 English

Data governance looks deceptively simple on paper. In reality, it is complex. And it is increasingly recognized as a key foundational element necessary to advance analytics and improve operations for organizations of all types across industry. In this practical guide, data experts Uma Gupta and San

Data Mining: A Tutorial-Based Primer, Se
✍ Roiger, Richard J 📂 Library 📅 2017 🏛 Taylor & Francis; Chapman and Hall/CRC 🌐 English

"Dr. Roiger does an excellent job of describing in step by step detail formulae involved in various data mining algorithms, along with illustrations. In addition, his tutorials in Weka software provide excellent grounding for students in comprehending the underpinnings of Machine Learning as

Educational Data Science: Essentials, Ap
✍ Alejandro Peña-Ayala (editor) 📂 Library 📅 2023 🏛 Springer 🌐 English

This book describes theoretical elements, practical approaches, and specialized tools that systematically organize, characterize, and analyze big data gathered from educational affairs and settings. Moreover, the book shows several inference criteria to leverage and produce descriptive, exp