Module 5 -- Parallel Database Systems -- Distributed Systems


Index

  1. Parallel Architectures
  2. Parallel Query Processing

Parallel Architectures

https://www.geeksforgeeks.org/design-of-parallel-databases-dbms/

Parallel Architectures in Parallel Database Systems

Parallel database systems use multiple processors or machines to execute database operations concurrently, improving performance for large-scale data processing. The architecture defines how the processors, memory, and storage are organized. There are three main types of parallel architectures: shared-memory, shared-disk, and shared-nothing.


Key Considerations


Parallel Query Processing

(This was covered previously in the Module 3 PDF.)

https://www.geeksforgeeks.org/parallelism-in-query-in-dbms/

Parallel Query Processing in Parallel Database Systems

Parallel query processing divides a database query into smaller tasks that execute concurrently across multiple processors or nodes, which shortens execution time for queries over large data sets. It is a key feature of parallel database systems and builds on the architectures discussed above (shared-memory, shared-disk, shared-nothing). The sections that follow cover the data partitioning techniques it relies on.
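The idea of splitting one query into concurrent tasks can be sketched in a few lines of Python. This is a toy model, not how any particular DBMS implements it: the names `partial_sum` and `parallel_sum` and the sample data are hypothetical, and each in-memory list stands in for the data held by one node. It evaluates something like `SELECT SUM(amount)` by computing a partial sum per partition and then merging the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(rows):
    """Local task: aggregate only the rows of one partition."""
    return sum(row["amount"] for row in rows)


def parallel_sum(partitions, max_workers=4):
    """Run one partial aggregate per partition, then merge the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(partial_sum, partitions))
    # Merge step: combining partial sums gives the global sum.
    return sum(partials)


# Hypothetical data: three partitions standing in for three nodes.
partitions = [
    [{"amount": 10}, {"amount": 20}],
    [{"amount": 5}],
    [{"amount": 7}, {"amount": 8}],
]
total = parallel_sum(partitions)
print(total)
```

Threads are used here only to show the split, process, and merge structure; in CPython they do not give true CPU parallelism for this kind of work, whereas a real parallel database runs the partial aggregates on separate processors or machines.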


Hash Partitioning in Parallel Query Processing

Hash partitioning is a data partitioning technique used in parallel query processing to distribute data across multiple nodes or processors, enabling parallel execution of queries. It’s particularly effective in shared-nothing architectures, where each node processes its own subset of data independently.
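A minimal sketch of hash partitioning, assuming rows are plain dictionaries and nodes are numbered from 0. The function names and the choice of `zlib.crc32` as the hash function are illustrative assumptions; a real system would use its own hash function.

```python
import zlib


def node_for(key_value, num_nodes):
    """Map a partitioning-key value to a node number with a stable hash."""
    digest = zlib.crc32(str(key_value).encode("utf-8"))
    return digest % num_nodes


def hash_partition(rows, key, num_nodes):
    """Place each row on the node chosen by hashing its partitioning key."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        partitions[node_for(row[key], num_nodes)].append(row)
    return partitions


# Hypothetical table with customer_id as the partitioning key.
rows = [{"customer_id": i, "amount": i * 10} for i in range(12)]
parts = hash_partition(rows, "customer_id", 3)
for node, part in enumerate(parts):
    print(node, [r["customer_id"] for r in part])
```

Because equal key values always hash to the same node, a lookup or an equi-join on the partitioning key only needs to touch the node that `node_for` returns for that value.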


Range Partitioning in Parallel Query Processing

Range partitioning is another data partitioning technique used in parallel query processing to distribute data across nodes or processors, enabling parallel execution. Unlike hash partitioning, which uses a hash function, range partitioning divides data based on predefined ranges of a partitioning key. This makes it well suited to range queries and ordered scans on the partitioning key, because only the partitions whose ranges overlap the query predicate need to be read.
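A small sketch of range partitioning, assuming the ranges are described by a sorted list of split points. The function names, the `order_id` key, and the boundary values are hypothetical. With boundaries `[100, 200]`, partition 0 holds keys below 100, partition 1 holds keys from 100 up to but not including 200, and partition 2 holds keys of 200 and above.

```python
from bisect import bisect_right


def range_partition(rows, key, boundaries):
    """Assign each row to the partition whose key range contains its key."""
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        partitions[bisect_right(boundaries, row[key])].append(row)
    return partitions


def partitions_for_range(boundaries, low, high):
    """Partition numbers that can hold keys in the closed interval [low, high]."""
    first = bisect_right(boundaries, low)
    last = bisect_right(boundaries, high)
    return list(range(first, last + 1))


boundaries = [100, 200]
rows = [{"order_id": k} for k in (5, 99, 100, 150, 199, 200, 350)]
parts = range_partition(rows, "order_id", boundaries)
print([[r["order_id"] for r in p] for p in parts])
print(partitions_for_range(boundaries, 120, 180))
```

`partitions_for_range` illustrates partition pruning: a query such as `order_id BETWEEN 120 AND 180` only needs to read the partitions it returns, not the whole table.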


Schema Partitioning in Parallel Query Processing

Schema partitioning (sometimes referred to as vertical partitioning in the context of parallel query processing) divides a database table based on its schema: the columns (attributes) of the table are split across different nodes or processors. It is less common than horizontal partitioning methods such as hash or range partitioning, but it can be useful when queries typically read only a few columns of a wide table.
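A minimal sketch of vertical partitioning, assuming each column fragment keeps a copy of the primary key so that full rows can be rebuilt by joining the fragments on that key. The function names, the `id` key, and the sample columns are hypothetical.

```python
def vertical_partition(rows, key, column_groups):
    """Split a table into column fragments; every fragment keeps the key."""
    fragments = []
    for columns in column_groups:
        fragment = [
            {key: row[key], **{col: row[col] for col in columns}}
            for row in rows
        ]
        fragments.append(fragment)
    return fragments


def reconstruct(fragments, key):
    """Rebuild full rows by joining the fragments on the shared key."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


rows = [
    {"id": 1, "name": "Ada", "email": "ada@example.com", "salary": 100},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "salary": 90},
]
# Fragment 0 holds contact columns, fragment 1 holds payroll columns.
fragments = vertical_partition(rows, "id", [["name", "email"], ["salary"]])
print(fragments[1])
```

A query that needs only `salary` can read fragment 1 alone, while a query that needs columns from both fragments pays for the join in `reconstruct`.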


Round-Robin Partitioning in Parallel Query Processing

Round-robin partitioning is a simple data partitioning technique used in parallel query processing to distribute data across nodes or processors. Unlike hash or range partitioning, which use a partitioning key to determine data placement, round-robin partitioning distributes data in a cyclic, sequential manner, focusing on achieving an even distribution.
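Round-robin placement can be sketched as follows, assuming nodes are numbered from 0 and rows arrive in a fixed order. The function name and sample data are hypothetical.

```python
def round_robin_partition(rows, num_nodes):
    """Send the i-th row to node i mod num_nodes, cycling through the nodes."""
    partitions = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        partitions[i % num_nodes].append(row)
    return partitions


rows = list(range(10))  # stand-ins for ten rows
parts = round_robin_partition(rows, 3)
print(parts)
print([len(p) for p in parts])
```

Partition sizes differ by at most one row, which is the even distribution this method aims for. The trade-off is that placement ignores the row's contents, so a lookup by key value cannot be directed to a single node and has to check every partition.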


Quick Recap of Partitioning Types




Key Considerations