𝔖 Scriptorium
✦   LIBER   ✦

Graph Algorithms for Data Science: With examples in Neo4j

✍ Scribed by Tomaz Bratanic


Publisher: Manning Publications Co.
Year: 2024
Tongue: English
Leaves: 352
Category: Library

✦ Synopsis


Graph Algorithms for Data Science teaches you how to construct graphs from both structured and unstructured data. You'll learn how the flexible Cypher query language can be used to manipulate graph structures and extract insights from them. Graph Algorithms for Data Science is a hands-on guide to working with graph-based data in applications. It's filled with projects that demonstrate the ins and outs of graphs. You'll gain practical skills by analyzing Twitter, building graphs with NLP techniques, and much more. These graph algorithms are explained in clear, jargon-free text and illustrations that make them easy to apply to your own projects.

✦ Table of Contents


inside front cover
Graph Algorithms for Data Science
Copyright
contents
front matter
foreword
preface
acknowledgments
about this book
Who should read this book
How this book is organized
About the code
liveBook discussion forum
about the author
about the cover illustration
Part 1 Introduction to graphs
1 Graphs and network science: An introduction
1.1 Understanding data through relationships
1.2 How to spot a graph-shaped problem
1.2.1 Self-referencing relationships
1.2.2 Pathfinding networks
1.2.3 Bipartite graphs
1.2.4 Complex networks
Summary
2 Representing network structure: Designing your first graph model
2.1 Graph terminology
2.1.1 Directed vs. undirected graph
2.1.2 Weighted vs. unweighted graphs
2.1.3 Bipartite vs. monopartite graphs
2.1.4 Multigraph vs. simple graph
2.1.5 A complete graph
2.2 Network representations
2.2.1 Labeled-property graph model
2.3 Designing your first labeled-property graph model
2.3.1 Follower network
2.3.2 User-tweet network
2.3.3 Retweet network
2.3.4 Representing graph schema
2.4 Extracting knowledge from text
2.4.1 Links
2.4.2 Hashtags
2.4.3 Mentions
2.4.4 Final Twitter social network schema
Summary
Part 2 Network analysis
3 Your first steps with Cypher query language
3.1 Cypher query language clauses
3.1.1 CREATE clause
3.1.2 MATCH clause
3.1.3 WITH clause
3.1.4 SET clause
3.1.5 REMOVE clause
3.1.6 DELETE clause
3.1.7 MERGE clause
3.2 Importing CSV files with Cypher
3.2.1 Clean up the database
3.2.2 Twitter graph model
3.2.3 Unique constraints
3.2.4 LOAD CSV clause
3.2.5 Importing the Twitter social network
3.3 Solutions to exercises
Summary
4 Exploratory graph analysis
4.1 Exploring the Twitter network
4.2 Aggregating data with Cypher query language
4.2.1 Time aggregations
4.3 Filtering graph patterns
4.4 Counting subqueries
4.5 Multiple aggregations in sequence
4.6 Solutions to exercises
Summary
5 Introduction to social network analysis
5.1 Follower network
5.1.1 Node degree distribution
5.2 Introduction to the Neo4j Graph Data Science library
5.2.1 Graph catalog and native projection
5.3 Network characterization
5.3.1 Weakly connected component algorithm
5.3.2 Strongly connected components algorithm
5.3.3 Local clustering coefficient
5.4 Identifying central nodes
5.4.1 PageRank algorithm
5.4.2 Personalized PageRank algorithm
5.4.3 Dropping the named graph
5.5 Solutions to exercises
Summary
6 Projecting monopartite networks
6.1 Translating an indirect multihop path into a direct relationship
6.1.1 Cypher projection
6.2 Retweet network characterization
6.2.1 Degree centrality
6.2.2 Weakly connected components
6.3 Identifying the most influential content creators
6.3.1 Excluding self-loops
6.3.2 Weighted PageRank variant
6.3.3 Dropping the projected in-memory graph
6.4 Solutions to exercises
Summary
7 Inferring co-occurrence networks based on bipartite networks
7.1 Extracting hashtags from tweets
7.2 Constructing the co-occurrence network
7.2.1 Jaccard similarity coefficient
7.2.2 Node similarity algorithm
7.3 Characterization of the co-occurrence network
7.3.1 Node degree centrality
7.3.2 Weakly connected components
7.4 Community detection with the label propagation algorithm
7.5 Identifying community representatives with PageRank
7.5.1 Dropping the projected in-memory graphs
7.6 Solutions to exercises
Summary
8 Constructing a nearest neighbor similarity network
8.1 Feature extraction
8.1.1 Motifs and graphlets
8.1.2 Betweenness centrality
8.1.3 Closeness centrality
8.2 Constructing the nearest neighbor graph
8.2.1 Evaluating features
8.2.2 Inferring the similarity network
8.3 User segmentation with the community detection algorithm
8.4 Solutions to exercises
Summary
Part 3 Graph machine learning
9 Node embeddings and classification
9.1 Node embedding models
9.1.1 Homophily vs. structural roles approach
9.1.2 Inductive vs. transductive embedding models
9.2 Node classification task
9.2.1 Defining a connection to a Neo4j database
9.2.2 Importing a Twitch dataset
9.3 The node2vec algorithm
9.3.1 The word2vec algorithm
9.3.2 Random walks
9.3.3 Calculate node2vec embeddings
9.3.4 Evaluating node embeddings
9.3.5 Training a classification model
9.3.6 Evaluating predictions
9.4 Solutions to exercises
Summary
10 Link prediction
10.1 Link prediction workflow
10.2 Dataset split
10.2.1 Time-based split
10.2.2 Random split
10.2.3 Negative samples
10.3 Network feature engineering
10.3.1 Network distance
10.3.2 Preferential attachment
10.3.3 Common neighbors
10.3.4 Adamic-Adar index
10.3.5 Clustering coefficient of common neighbors
10.4 Link prediction classification model
10.4.1 Missing values
10.4.2 Training the model
10.4.3 Evaluating the model
10.5 Solutions to exercises
Summary
11 Knowledge graph completion
11.1 Knowledge graph embedding model
11.1.1 Triple
11.1.2 TransE
11.1.3 TransE limitations
11.2 Knowledge graph completion
11.2.1 Hetionet
11.2.2 Dataset split
11.2.3 Train a PairRE model
11.2.4 Drug application predictions
11.2.5 Explaining predictions
11.3 Solutions to exercises
Summary
12 Constructing a graph using natural language processing techniques
12.1 Coreference resolution
12.2 Named entity recognition
12.2.1 Entity linking
12.3 Relation extraction
12.4 Implementation of information extraction pipeline
12.4.1 SpaCy
12.4.2 Coreference resolution
12.4.3 End-to-end relation extraction
12.4.4 Entity linking
12.4.5 External data enrichment
12.5 Solutions to exercises
Summary
Appendix. The Neo4j environment
A.1 Cypher query language
A.2 Neo4j installation
A.2.1 Neo4j Desktop installation
A.2.2 Neo4j Docker installation
A.2.3 Neo4j Aura
A.3 Neo4j Browser configuration
references
index


📜 SIMILAR VOLUMES


Graph Algorithms for Data Science: With
✍ Tomaž Bratanic 📂 Library 📅 2024 🏛 Manning Publications 🌐 English

Practical methods for analyzing your data with graphs, revealing hidden connections and new insights. Graphs are the natural way to represent and understand connected data. This book explores the most important algorithms and techniques for graphs in data science, with concrete advice on implemen

Graph Algorithms: Practical Examples in
✍ Mark Needham, Amy E. Hodler 📂 Library 📅 2019 🏛 O’Reilly Media 🌐 English

Learn how graph algorithms can help you leverage relationships within your data to develop intelligent solutions and enhance your machine learning models. With this practical guide, developers and data scientists will discover how graph analytics deliver value, whether they’re used for building dynam

Graph Data Science with Neo4j: Learn how
✍ Estelle Scifo πŸ“‚ Library πŸ› Packt Publishing 🌐 English

<p><span>Supercharge your data with the limitless potential of Neo4j 5, the premier graph database for cutting-edge machine learning</span></p><p><span>Purchase of the print or Kindle book includes a free PDF eBook</span></p><h4><span>Key Features</span></h4><ul><li><span><span>Extract meaningful in

Graph Data Science with Neo4j: Learn how
✍ Estelle Scifo 📂 Library 📅 2023 🏛 Packt Publishing 🌐 English

Supercharge your data with the limitless potential of Neo4j 5, the premier graph database for cutting-edge machine learning Key Features: • Extract meaningful information from graph data with Neo4j's latest version 5 • Use Graph Algorithms into a regular Machine Learning pipeline in Python • L

Graph Data Science with Python and Neo4j
✍ Timothy Eastridge 📂 Library 📅 2024 🏛 Orange Education Pvt Ltd, AVA™ 🌐 English

Graph Data Science with Python and Neo4j is your ultimate guide to unleashing the potential of graph data science by blending Python's robust capabilities with Neo4j's innovative graph database technology. From fundamental concepts to advanced analytics and machine learning techniques, you'll learn