Hadoop: The Definitive Guide
Scribed by Tom White
- Publisher: Yahoo Press
- Year: 2010
- Tongue: English
- Leaves: 625
- Edition: Second Edition
- Category: Library
✦ Synopsis
Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework -- an open source implementation of MapReduce, the algorithm on which Google built its empire. Programmers will find details for analyzing datasets of any size, and administrators will learn how to set up and run Hadoop clusters.
This revised edition covers recent changes to Hadoop, including new features such as Hive, Sqoop, and Avro. It also provides illuminating case studies that illustrate how Hadoop is used to solve specific problems. Looking to get the most out of your data? This is your book.
- Use the Hadoop Distributed File System (HDFS) for storing large datasets, then run distributed computations over those datasets with MapReduce
- Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence
- Discover common pitfalls and advanced features for writing real-world MapReduce programs
- Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
- Use Pig, a high-level query language for large-scale data processing
- Analyze datasets with Hive, Hadoop's data warehousing system
- Take advantage of HBase, Hadoop's database for structured and semi-structured data
- Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems
"Now you have the opportunity to learn about Hadoop from a master -- not only of the technology, but also of common sense and plain talk." --Doug Cutting, Cloudera
SIMILAR VOLUMES
What I really liked most about this book was that I could read the vast majority of it straight through and enjoyed the process. Very well structured, and the example surrounding weather station data was an appropriate choice to give a good perspective on most of the problems. A good mix of practica
Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework -- an open source implementation of MapReduce, the algorithm on which Google built its empire. Prog
Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework -- an open source implementation of MapReduce, the algorithm on which Google built its empire. Programmers
Ready to unlock the power of your data? With this comprehensive guide, you'll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and