Distributed Systems
Theory and Applications

1. Auflage Februar 2023
560 Seiten, Hardcover
Wiley & Sons Ltd
Distributed Systems
Comprehensive textbook resource on distributed systems--integrates foundational topics with advanced topics of contemporary importance within the field
Distributed Systems: Theory and Applications is organized around three layers of abstractions: networks, middleware tools, and application framework. It presents data consistency models suited for requirements of innovative distributed shared memory applications. The book also focuses on distributed processing of big data, representation of distributed knowledge and management of distributed intelligence via distributed agents. To aid in understanding how these concepts apply to real-world situations, the work presents a case study on building a P2P Integrated E-Learning system. Downloadable lecture slides are included to help professors and instructors convey key concepts to their students.
Additional topics discussed in Distributed Systems: Theory and Applications include:
* Network issues and high-level communication tools
* Software tools for implementations of distributed middleware.
* Data sharing across distributed components through publish and subscribe-based message diffusion, gossip protocol, P2P architecture and distributed shared memory.
* Consensus, distributed coordination, and advanced middleware for building large distributed applications
* Distributed data and knowledge management
* Autonomy in distributed systems, multi-agent architecture
* Trust in distributed systems, distributed ledger, Blockchain and related technologies.
Researchers, industry professionals, and students in the fields of science, technology, and medicine will be able to use Distributed Systems: Theory and Applications as a comprehensive textbook resource for understanding distributed systems, the specifics behind the modern elements which relate to them, and their practical applications.
Acknowledgments xxvii
Acronyms xxix
1 Introduction 1
1.1 Advantages of distributed systems 2
1.2 Defining Distributed Systems 4
1.3 Challenges of a Distributed System 7
1.4 Goals of distributed system 9
1.4.1 Single System View 10
1.4.2 Hiding Distributions 10
1.4.3 Degrees and Distribution of Hiding 13
1.4.4 Interoperability 14
1.4.5 Dynamic reconfiguration 15
1.5 Architectural Organization 16
1.6 Organization of the book 17
2 The Internet 21
2.1 Origin and Organization 22
2.1.1 ISPs and the Topology of the Internet 24
2.2 Addressing the Nodes 25
2.3 Network Connection Protocol 29
2.3.1 IP Protocol 31
2.3.2 Transmission Control Protocol 32
2.3.3 User Datagram Protocol 33
2.4 Dynamic Host Control Protocol 33
2.5 Domain Name Service 35
2.5.1 Reverse DNS Lookup 39
2.5.2 Client Server Architecture 44
2.6 Content Distribution Network 47
2.7 Conclusion 50
3 Process to Process Communication 53
3.1 Communication Types and Interfaces 54
3.1.1 Sequential type 55
3.1.2 Declarative type 57
3.1.3 Shared states 58
3.1.4 Message passing 59
3.1.5 Communication interfaces 59
3.2 Socket programming 61
3.2.1 Socket data structures 62
3.2.2 Socket calls 64
3.3 Remote Procedure Call 70
3.3.1 XML RPC 77
3.4 Remote Method Invocation 80
3.5 Conclusion 86
4 Microservices, Conterization and MPI 93
4.1 Microservice Architecture 94
4.2 REST Requests and APIs 97
4.2.1 Weather Data Using REST API 99
4.3 Cross Platform Applications 101
4.4 Message Passing Interface 114
4.4.1 Process Communication Models 115
4.4.2 Programming with MPI 119
4.5 Conclusion 126
5 Clock Synchronization and Event Ordering 133
5.1 The Notion of Clock Time 134
5.2 External Clock Based Mechanisms 136
5.2.1 Cristian's Algorithm 137
5.2.2 Berkeley Clock Protocol 138
5.2.3 Network Time Protocol 139
5.3 Events and Temporal Ordering 143
5.3.1 Causal Dependency 145
5.4 Definition of logical clock 146
5.5 Causal Ordering of Messages 155
5.6 Multicast Message Ordering 157
5.6.1 Implementing FIFO multicast 162
5.6.2 Implementing Causal Ordering 164
5.6.3 Implementing Total Ordering 166
5.6.4 Reliable multicast 167
5.7 Interval events 169
5.7.1 Conceptual neighborhood 172
5.7.2 Spatial Events 174
5.8 Conclusion 176
6 Global States and Termination Detection 185
6.1 Cuts and Global States 186
6.1.1 Global states 192
6.1.2 Recording of global states 196
6.1.3 Problem in recording global state 201
6.2 Liveness and Safety 204
6.3 Termination Detection 209
6.3.1 Snapshot Based Termination Detection 210
6.3.2 Ring Method 213
6.3.3 Tree method 217
6.3.4 Weight Throwing Method 221
6.4 Conclusion 225
7 Leader Election 231
7.1 Impossibility Result 232
7.2 Bully Algorithm 235
7.3 Ring based algorithms 236
7.3.1 Circulate IDs all the way 237
7.3.2 As far as an ID can go 240
7.4 Hirschberg and Sinclair algorithm 241
7.5 Distributed Spanning Tree Algorithm 245
7.5.1 Single Initiator Spanning Tree 246
7.5.2 Multiple Initiators Spanning Tree 251
7.5.3 Minimum Spanning Tree 259
7.6 Leader election in trees 260
7.6.1 Overview of the algorithm 260
7.6.2 Activation Stage 261
7.6.3 Saturation Stage 262
7.6.4 Resolution Stage 264
7.6.5 Resolution Stage 264
7.6.6 Two Nodes Enter SATURATED State 266
7.7 Leased Leader Election 269
7.8 Conclusion 272
8 Mutual Exclusion 281
8.1 System Model 282
8.2 Coordinator-based Solution 285
8.3 Assertion-Based Solutions 286
8.3.1 Lamport's algorithm 286
8.3.2 Improvement to Lamport's Algorithm 290
8.3.3 Quorum based algorithms 291
8.4 Token based solutions 301
8.4.1 Suzuki and Kasami's algorithm 301
8.4.2 Singhal's Heuristically-Aided Algorithm 304
8.4.3 Raymond's tree-based algorithm 312
8.5 Conclusion 313
9 Agreements and Consensus 321
9.1 System Model 322
9.1.1 System Model 322
9.1.2 Failures in Distributed System 324
9.1.3 Problem Definition 325
9.1.4 Agreement Problem and its Equivalence 327
9.2 Byzantine General Problem (BGP) 330
9.2.1 BGP Solution Using Oral Messages 335
9.2.2 Phase-King Algorithm 340
9.3 Commit Protocols 341
9.3.1 Two-Phase Commit Protocol 343
9.3.2 Three-Phase Commit 349
9.4 Consensus 351
9.4.1 Consensus in Synchronous Systems 351
9.4.2 Consensus in Asynchronous Systems 354
9.4.3 Paxos Algorithm 354
9.4.4 Raft Algorithm 358
9.4.5 Leader Election 362
9.5 Conclusion 364
10 Gossip Protocols 373
10.1 Direct Mail 374
10.2 Generic Gossip Protocol 377
10.3 Anti Entropy 379
10.3.1 Push-Based Anti-Entropy 379
10.3.2 Pull-Based Anti-Entropy 381
10.3.3 Hybrid Anti-Entropy 383
10.3.4 Control and Propagation in Anti-Entropy 384
10.4 Rumor Mongering Gossip 386
10.4.1 Analysis of Rumor Mongering 389
10.4.2 Fault Tolerance 392
10.5 Implementation Issues 393
10.5.1 Network Related Issues 394
10.6 Applications of Gossip 395
10.6.1 Peer Sampling 395
10.6.2 Failure Detectors 399
10.6.3 Distributed Social Networking 402
10.7 Gossip in IoT Communication 404
10.7.1 Context-aware Gossip 405
10.7.2 Flow-aware Gossip 406
10.8 Conclusion 413
11 Message Di
usion Using Publish and Subscribe 421
11.1 Publish and Subscribe Paradigm 422
11.1.1 Broker Network 424
11.2 Filters and Notifications 426
11.2.1 Subscription and Advertisement 428
11.2.2 Covering Relation 429
11.2.3 Merging Filters 432
11.2.4 Algorithms 434
11.3 Notification Service 438
11.3.1 Siena 439
11.3.2 Rebeca 440
11.3.3 Routing of Notification 441
11.4 MQTT 443
11.5 Advanced Message Queuing Protocol 445
11.6 E
ects of Technology on Performance 448
11.7 Conclusions 453
12 Peer-to-Peer Systems 461
12.1 Definition of P2P 462
12.2 Representative P2P Models 464
12.2.1 The Most Challenging Problem 466
12.3 Chord Overlay 468
12.4 Pastry 479
12.5 CAN 486
12.6 Kademlia 489
12.7 Conclusion 494
13 Distributed Shared Memory 501
13.1 Multicore and S-DSM 503
13.1.1 Coherency by Delegation to a Central Server 505
13.2 Manycore Systems and S-DSM 507
13.3 Programming Abstractions 507
13.3.1 MapReduce 508
13.3.2 OpenMP 510
13.3.3 Merging Publish and Subscribe with DSM 512
13.4 Memory Consistency Models 516
13.4.1 Sequential consistency 519
13.4.2 Linearizability or Atomic consistency 522
13.4.3 Relaxed Consistency Models 523
13.4.4 Comparison of Memory Models 531
13.5 DSM Access Algorithms 532
13.5.1 Central Sever Algorithm 532
13.5.2 Migration Algorithm 535
13.5.3 Read Replication Algorithm 537
13.5.4 Full Replication Algorithm 538
13.6 Conclusion 540
14 Distributed Data Management 551
14.1 Distributed Storage Systems 552
14.1.1 RAID 553
14.1.2 Storage Area Networks 553
14.1.3 Cloud Storage 554
14.2 Distributed File Systems 556
14.3 Distributed Index 559
14.4 NoSQL Databases 560
14.4.1 Key-Value and Document databases 561
MapReduce algorithm 563
14.4.2 Wide Column databases 565
14.4.3 Graph databases 567
Pregel Algorithm 569
14.5 Distributed Data Analytics 572
14.5.1 Distributed Clustering Algorithms 574
Distributed K-Means Clustering Algorithm 576
14.5.2 Stream Clustering 579
BIRCH Algorithm 580
14.6 Conclusion 583
15 Distributed Knowledge Management 589
15.1 Distributed Knowledge 590
15.2 Distributed Knowledge Representation 592
15.2.1 Resource Description Framework (RDF) 593
15.2.2 Web Ontology Language (OWL) 600
15.3 Linked Data 601
15.3.1 Friend of a Friend 602
15.3.2 DBpedia 603
15.4 Querying Distributed Knowledge 605
15.4.1 SPARQL Query Language 606
15.4.2 SPARQL Query Semantics 607
15.4.3 SPARQL Query Processing 610
15.4.4 Distributed SPARQL Query Processing 612
15.4.5 Federated and Peer-to-Peer SPARQL query processing 615
15.5 Data Integration in Distributed Sensor Networks 622
15.5.1 Semantic Data Integration 624
15.5.2 Data Integration in Constrained Systems 626
15.6 Conclusion 631
16 Distributed Intelligence 639
16.1 Agents and Multi-Agent Systems 640
16.1.1 Agent Embodiment 643
16.1.2 Mobile Agents 644
16.1.3 Multi-agent Systems 645
16.2 Communication in Agent based Systems 647
16.2.1 Agent Communication Protocols 648
16.2.2 Interaction Protocols 650
Request Interaction Protocol 651
16.3 Agent Middleware 652
16.3.1 FIPA Reference model 652
16.3.2 FIPA Compliant Middleware 654
JADE: Java Agent Development Environment 654
MobileC 655
16.3.3 Agent Migration 655
16.4 Agent Coordination 658
16.4.1 Planning 660
Distributed Planning Paradigms 661
Distributed Plan Representation and Execution 663
16.4.2 Task Allocation 665
Contract-Net Protocol 666
Allocation of Multiple Tasks 668
16.4.3 Coordinating through the environment 669
16.4.4 Coordination without Communication 674
16.5 Conclusion 674
17 Distributed Ledger 681
17.1 Cryptographic Techniques 682
17.2 Distributed Ledger Systems 685
17.2.1 Properties of Distributed Ledger Systems 688
17.2.2 A Framework for Distributed Ledger Systems 689
17.3 Blockchain 691
17.3.1 Distributed Consensus in Blockchain 693
17.3.2 Forking 695
17.3.3 Distributed Asset Tracking 697
17.4 Other Techniques for Distributed Consensus 698
17.4.1 Byzantine Fault Tolerance and Proof of Work 699
17.4.2 Alternative Proofs 700
17.5 Non-linear Data-structures 701
17.5.1 Tangle 702
17.5.2 Hashgraph 705
17.6 Scripts and Smart Contracts 711
17.7 Distributed Ledgers for Cyber-physical systems 716
17.7.1 Layered architecture 717
17.7.2 Smart Contract in Cyber-physical Systems 720
17.8 Conclusion 721
18 Case Study 729
18.1 Collaborative E-Learning Systems 731
18.2 P2P E-Learning System 733
18.2.1 Web Conferencing vs. P2P-IPS 735
18.3 P2P Shared Whiteboard 738
18.3.1 Repainting Shared Whiteboard 739
18.3.2 Consistency of Board View at Peers 739
18.4 P2P Live Streaming 743
18.4.1 Peer Joining 744
18.4.2 Peer Leaving 748
18.4.3 Handling "Ask Doubt" 749
18.5 P2P-IPS for Stored Contents 750
18.5.1 De Bruijn Graphs for DHT Implentation 750
18.5.2 Node Information Structure 755
18.5.3 Leaving of Peers 758
18.6 Searching, Sharing, and Indexing 760
18.6.1 Pre-processing of files 761
18.6.2 File Indexing 761
18.6.3 File Searching 762
18.7 Annotations and Discussion Forum 763
18.7.1 Annotation Format 763
18.7.2 Storing Annotations 764
18.7.3 Audio and Video Annotation 764
18.7.4 PDF Annotation 765
18.7.5 Posts, Comments and Announcements 765
18.7.6 Synchronization of Posts and Comments 766
18.8 Simulation Results 768
18.8.1 Live Streaming and Shared Whiteboard 769
18.8.2 De Bruijn Overlay 771
18.9 Conclusion 774
Bibliography 775
Index 780
Hiranmay Ghosh, PhD, is a former Adviser and Principal Scientist of TCS Research. He received his PhD degree from IIT-Delhi and his B.Tech. degree from Calcutta University. He is a Senior Member of IEEE, Life Member of IUPRAI, and a Member of ACM. He authored the Wiley title Computational Models for Cognitive Vision (2020) and co-authored of the CRC Press title Multimedia Ontology: Representation and Applications (2015).