Beginning Apache Pig: Big Data Processing Made Easy
Scribed by Balaswamy Vaddeman (auth.)
- Publisher
- Apress
- Year
- 2016
- Tongue
- English
- Leaves
- 285
- Edition
- 1
- Category
- Library
Synopsis
Learn to use Apache Pig to develop lightweight big data applications easily and quickly. This book shows you many optimization techniques and covers every context where Pig is used in big data analytics. Beginning Apache Pig shows you how Pig is easy to learn and requires relatively little time to develop big data applications. The book is divided into four parts: the complete features of Apache Pig; integration with other tools; how to solve complex business problems; and optimization of tools. You'll discover topics such as MapReduce and why it cannot meet every business need; the features of Pig Latin such as data types for each load, store, joins, groups, and ordering; how Pig workflows can be created; submitting Pig jobs using Hue; and working with Oozie. You'll also see how to extend the framework by writing UDFs and custom load, store, and filter functions. Finally, you'll cover different optimization techniques such as gathering statistics about a Pig script, joining strategies, parallelism, and the role of data formats in good performance.
What You Will Learn
- Use all the features of Apache Pig
- Integrate Apache Pig with other tools
- Extend Apache Pig
- Optimize Pig Latin code
- Solve different use cases for Pig Latin
Who This Book Is For
All levels of IT professionals: architects, big data enthusiasts, engineers, developers, and big data administrators
Table of Contents
Front Matter....Pages i-xxiii
MapReduce and Its Abstractions....Pages 1-20
Data Types....Pages 21-31
Grunt....Pages 33-40
Pig Latin Fundamentals....Pages 41-67
Joins and Functions....Pages 69-87
Creating and Scheduling Workflows Using Apache Oozie....Pages 89-101
HCatalog....Pages 103-113
Pig Latin in Hue....Pages 115-122
Pig Latin Scripts in Apache Falcon....Pages 123-136
Macros....Pages 137-145
User-Defined Functions....Pages 147-155
Writing Eval Functions....Pages 157-169
Writing Load and Store Functions....Pages 171-186
Troubleshooting....Pages 187-199
Data Formats....Pages 201-208
Optimization....Pages 209-223
Hadoop Ecosystem Tools....Pages 225-248
Back Matter....Pages 249-274
Subjects
Open Source; Database Management; Data Storage Representation; Data Mining and Knowledge Discovery; Information Storage and Retrieval
SIMILAR VOLUMES
Big Data Analytics Made Easy is a must-read for everybody as it explains the power of Analytics in a simple and logical way along with an end to end code in R. Even if you are a novice in Big Data Analytics, you will still be able to understand the concepts explained in this book. If you are alread
Build efficient, high-performance & scalable systems to process large volumes of data with Apache Ignite. Key Features: Understand Apache Ignite's in-memory technology; Create High-Performance app components with Ignite; Build a real-time data streaming and complex event processing system. Book Description: A