The Internet in everyday life
Scribed by Pramod K. Nayar
- Publisher
- John Wiley and Sons
- Year
- 2004
- Tongue
- English
- Weight
- 70 KB
- Volume
- 55
- Category
- Article
- ISSN
- 1532-2882
Synopsis
This is a book about finding significant statistical patterns on the Web, in particular patterns associated with hypertext documents, topics, hyperlinks, and queries. The term "pattern" in this book refers to dependencies among such items. On the one hand, the Web contains useful information on just about every topic under the sun. On the other hand, locating that information is like searching for a needle in a haystack, and powerful tools are needed to find it across the vast expanse of the Web. Soumen Chakrabarti's book covers a wide range of techniques for machine learning and data mining on the Web.
The goal of the book is to provide both the technical background and the tools and tricks of the trade for Web content mining. Much of the technical content reflects the state of the art between 1995 and 2002. The target audience is researchers and innovative developers in this area, as well as newcomers who intend to enter it.
The book begins with an introductory chapter that explains fundamental concepts such as crawling and indexing, as well as clustering and classification. The remaining eight chapters are organized into three parts: I) infrastructure, II) learning, and III) applications.
Part I, Infrastructure, has two chapters: Chapter 2 on crawling the Web and Chapter 3 on Web search and information retrieval. The author intends to give the reader "some basic know-how for efficiently representing, manipulating, and analyzing hypertext documents with computer programs." In Chapter 2, the reader can find topics such as crawling basics, the design of large-scale crawlers, and practical issues. Chapter 3 focuses on basic concepts of information retrieval, including high-level concepts such as Boolean queries and relevance ranking as well as lower-level concepts such as stemming, recall and precision, the Vector-Space model, relevance feedback, and probabilistic relevance feedback models. Advanced issues specifically related to Web mining are also discussed in this chapter.
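Two of the retrieval concepts named above, the vector-space model and recall and precision, are easy to illustrate. The following is a minimal sketch and is not taken from the book: it assumes whitespace tokenization and raw term counts (no stemming or term weighting), and the function names `cosine_similarity` and `precision_recall` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine of the angle between the raw term-count vectors of two texts."""
    # Assumption: lowercase whitespace tokens stand in for real indexing terms.
    vec_a = Counter(doc_a.lower().split())
    vec_b = Counter(doc_b.lower().split())
    # Only terms present in both documents contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Precision: share of retrieved items that are relevant.
    Recall: share of relevant items that were retrieved."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a query returns documents {1, 2, 3, 4} and the relevant set is {2, 4, 6}, two of the four results are relevant (precision 0.5) and two of the three relevant documents were found (recall 2/3).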
The second part of the book, comprising Chapters 4, 5, and 6, is the centerpiece. This part focuses specifically on machine learning in the context of hypertext. Chapter 4 addresses similarity and clustering, including partitioning approaches, geometric embedding approaches, and generative models and probabilistic approaches. This chapter also includes a section on visualization and embedding, with emphasis on self-organizing maps, multidimensional scaling, and latent semantic indexing.
Chapter 5 concentrates on supervised learning, whereas semi-supervised learning is treated in Chapter 6. Under the heading of supervised learning, a variety of topics are discussed in detail, including nearest neighbor algorithms, feature selection algorithms, Bayesian networks, and maximum entropy learning algorithms. Chapter 6 features semi-supervised learning algorithms for various needs, such as labeling hypertext graphs.
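Of the supervised methods listed above, nearest neighbor classification is the simplest to sketch. The example below is an illustrative sketch under stated assumptions, not the book's formulation: documents are assumed to be already reduced to numeric feature vectors, Euclidean distance is used as the measure of closeness, and the name `knn_classify` and the default `k=3` are hypothetical choices.

```python
import math
from collections import Counter


def knn_classify(query, labeled_points, k=3):
    """Return the majority label among the k training points nearest to query.

    labeled_points is a list of (feature_vector, label) pairs.
    """
    # Rank training points by Euclidean distance to the query vector.
    ranked = sorted(labeled_points, key=lambda item: math.dist(query, item[0]))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

A call such as `knn_classify((0.2, 0.1), training)` assigns the query the label held by most of its three nearest neighbors in `training`. No model is fitted in advance: all the work happens at query time, which is why the choice of distance measure and of `k` matters so much for this family of methods.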
SIMILAR VOLUMES
Based on empirical research with queer students, staff, and faculty at a typical southern university in the United States, this paper reports qualitative feedback gathered from 21 gay, lesbian, bisexual, transgender, and questioning individuals about the use of the Internet in their eve
Mathematics is a stumbling block in school for many children, yet the same children seem to acquire considerable mathematical knowledge without systematic teaching in everyday life. How can this discrepancy in performance be understood?
The most trivial slips of the tongue or pen, Freud believed, can reveal our secret ambitions, worries, and fantasies. **The Psychopathology of Everyday Life** ranks among his most enjoyable works. Starting with the story of how he once forgot the name of an Italian painter--and how a young acquainta