Partitioning in Databases

Baidik Chandra

## List of Papers

  1. Relax and Let the Database do the Partitioning Online   

  2. Adaptive Virtual Partitioning for OLAP Query Processing in a Database Cluster   

  3. Adaptive partitioning and indexing for in situ query processing

Introduction

The structure and importance of columns in a relational database table can vary based on their relevance to different queries. In a typical relational database, a table is structured as a collection of rows and columns, with each row representing a single record and each column representing a specific attribute or field. However, not all columns of a table are equally important or frequently accessed in every query. For example, some columns may be required only for specific reports or analysis, while others may be required for every query.

Partitioning is a crucial technique in managing large tables or indexes in relational databases. The primary need for partitioning arises when a table or index becomes too large to be efficiently processed or maintained by a single server or storage device. Some key reasons why partitioning is necessary include improving query performance, optimizing storage, improving data availability, simplifying maintenance tasks, and improving scalability.

Most importantly, partitioning improves query performance and optimizes storage in large relational databases. It can improve query performance by dividing the table into smaller, more manageable pieces and allowing queries to target only the relevant partitions. It can optimize storage by allowing different partitions of a table to be stored on different physical devices or servers, which balances storage utilization across devices and limits the impact of any single storage device failure.

Partitioning can also improve data availability by providing fault tolerance and redundancy. By storing data across multiple partitions on different devices or servers, partitioning can help ensure that data is still available even if one device or server fails. It can simplify maintenance tasks such as backups, restores, and schema changes by dividing a large table into smaller partitions. These maintenance tasks can then be performed on individual partitions, rather than the entire table, reducing the time and resources needed for these tasks.

Finally, partitioning can improve scalability by allowing different partitions of a table to be processed in parallel by multiple servers. It also contributes to improving throughput and reducing response times for high-volume workloads. Overall, partitioning is a crucial technique for managing large tables or indexes in DBMS.

There are also some potential issues and challenges to consider. Here are some of the most common issues with partitioning:

  1. Complexity: Partitioning can add complexity to the database design and administration, which can make it more difficult to manage and troubleshoot issues.

  2. Maintenance: Partitioning can increase the time and resources required for maintenance tasks such as backups, restores, and schema changes. This can be especially challenging if partitions are distributed across multiple physical devices or servers.

  3. Data skew: If the partitioning scheme is not well-designed, it can result in data skew, where some partitions are much larger or more frequently accessed than others. This can lead to performance issues, as some partitions may become bottlenecks for queries.

  4. Query optimization: Partitioning can make query optimization more challenging, as the DBMS needs to analyze each query and determine which partitions need to be accessed. This can result in longer query planning times and more complex query plans.

  5. Partition key changes: Changing the partition key can be a difficult and time-consuming process, as it often requires data to be moved between partitions. This can also impact performance if the partition key is frequently changed.

  6. Data consistency: Ensuring data consistency across partitions can be challenging, especially for distributed transactions that span multiple partitions. This requires careful planning and coordination to ensure that all partitions are properly synchronized and transactions are committed in a consistent and reliable manner.

Background

Vertical partitioning and horizontal partitioning are the two foundational forms of partitioning.

Vertical partitioning 

Vertical partitioning works by dividing the columns of a table into separate partitions, each containing a subset of the columns. These partitions can be stored on different physical devices or database servers, allowing for more efficient access and retrieval of data.

The process of vertical partitioning begins with analyzing the table's schema and identifying the columns that are frequently accessed or essential to the application. These columns are grouped together into a separate partition, while the remaining columns are grouped into another partition or partitions. Partitions are defined based on criteria such as data type, frequency of use, importance, and security.

Once the partitions are created, queries can be optimized to access only the necessary columns. This can significantly improve query performance and reduce the amount of data that needs to be retrieved from the database. For example, if a query only requires the customer's name and address, only the partition containing these columns needs to be accessed, while the remaining columns can be ignored.

Vertical partitioning can be especially useful for databases with large tables that contain many columns that are not frequently used. By splitting these tables into smaller partitions, storage requirements can be reduced, and query performance can be improved.

However, vertical partitioning can also introduce challenges. For example, joining multiple partitions to generate a result set can be slower and more complex than querying a single partition. Additionally, schema changes to a table may require changes to multiple partitions, making them more difficult to manage.

From a systems perspective, vertical partitioning involves separating the columns of a table into separate partitions that can be stored on different physical devices or database servers. This allows for more efficient storage and retrieval of data, as only the necessary columns need to be accessed for each query. To implement vertical partitioning, the database management system (DBMS) needs to support the ability to create and manage multiple partitions for a table. This can be achieved through the use of specialized software, such as a partition manager or a database cluster.
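
The column-group idea above can be sketched in a few lines. This is a minimal illustration with made-up column groups and rows, not the mechanism of any particular DBMS; a real system would place each group in its own physical storage.

```python
# A minimal sketch of vertical partitioning over in-memory rows (dicts).
# The column groups and sample table are illustrative only.

HOT_COLS = {"id", "name", "address"}       # frequently accessed columns
COLD_COLS = {"id", "notes", "photo_blob"}  # rarely accessed; "id" kept in both as the row key

def split_vertically(rows):
    """Split each row into a hot and a cold partition, keyed by id."""
    hot, cold = {}, {}
    for row in rows:
        hot[row["id"]] = {c: row[c] for c in HOT_COLS}
        cold[row["id"]] = {c: row[c] for c in COLD_COLS}
    return hot, cold

def project(partition, row_id, cols):
    """Answer a projection query from a single partition only."""
    return {c: partition[row_id][c] for c in cols}

rows = [
    {"id": 1, "name": "Ada", "address": "1 Main St", "notes": "", "photo_blob": b""},
    {"id": 2, "name": "Bob", "address": "2 Elm St", "notes": "vip", "photo_blob": b""},
]
hot, cold = split_vertically(rows)
# A query needing only name and address never touches the cold partition:
print(project(hot, 2, ["name", "address"]))
```

A query touching only the hot columns reads a fraction of each row's bytes, which is the storage saving the text describes.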

Horizontal partitioning

Horizontal partitioning, also known as sharding, begins by selecting a partitioning key, which is a column or combination of columns that will be used to distribute the rows among the partitions. The partitioning key can be based on a range of values, a hash function, or a list of values.

Once the partitioning key is selected, the table is divided into a predetermined number of partitions based on the selected criterion. For example, if the partitioning key is based on a range of values, each partition will contain a subset of rows that fall within a specific range of values.

Horizontal partitioning can be especially useful for databases that contain large tables with a high volume of data. By dividing the data into smaller partitions, the database can scale horizontally to accommodate more users and increased data volumes.

The key benefits of horizontal partitioning are improved performance and increased availability and fault tolerance. Queries execute more quickly because the data is distributed across multiple servers, allowing parallel processing, and each query scans less data because it only needs to access the partitions containing the relevant rows. Availability improves because if one partition or server fails, the remaining partitions or servers can continue to operate, keeping the database accessible to users.

However, like vertical partitioning, horizontal partitioning can introduce challenges of its own. For example, managing the distribution of data across multiple servers can be complex and require significant coordination. Additionally, queries that require data from multiple partitions may be slower because data must be combined from multiple sources.

From a systems perspective, horizontal partitioning involves dividing the rows of a table into separate partitions that can be stored on different physical devices or database servers. This allows for more efficient storage and retrieval of data, as each partition can be processed in parallel by separate servers, improving query performance. As with vertical partitioning, the DBMS must support creating and managing multiple partitions for a table, typically through specialized software such as a partition manager or a database cluster.
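
As a toy illustration of the systems view, the sketch below models shards as in-memory lists and routes rows by hashing a made-up `customer_id` key; in a real deployment each shard would live on a separate server.

```python
# A minimal sketch of hash-based horizontal partitioning (sharding).
# Shard count, key name, and data are illustrative only.

N_SHARDS = 4
shards = [[] for _ in range(N_SHARDS)]

def shard_for(key):
    return hash(key) % N_SHARDS

def insert(row):
    shards[shard_for(row["customer_id"])].append(row)

def point_query(customer_id):
    # Only the owning shard is scanned.
    return [r for r in shards[shard_for(customer_id)]
            if r["customer_id"] == customer_id]

def fanout_query(pred):
    # Queries without the partitioning key must touch every shard.
    return [r for s in shards for r in s if pred(r)]

for cid in range(10):
    insert({"customer_id": cid, "total": cid * 10})

print(point_query(7))
print(len(fanout_query(lambda r: r["total"] >= 50)))
```

The contrast between `point_query` and `fanout_query` shows why the choice of partitioning key matters: only queries that filter on it can avoid touching every partition.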

Summary of the foundational forms

The partition manager is responsible for dividing the rows of a table into separate partitions based on the selected criteria, such as a range of values or a hash function. The partition manager also ensures that each partition is stored on a separate physical device or server, providing better fault tolerance and availability.

Once the partitions are created, the DBMS should be able to query only the necessary partitions for each given query. This can be achieved through the use of specialized query optimization techniques that analyze the query and determine which partitions need to be accessed. The DBMS also needs to support the ability to perform distributed transactions across multiple partitions, ensuring consistency and integrity of the data.

In addition to query optimization and distributed transactions, the DBMS should support the ability to manage partitioning metadata and configuration, such as the partitioning key and the number of partitions. This requires careful planning and coordination to ensure that the partitioning scheme is optimized for query performance and data availability.

Because the data is divided across multiple partitions, backup and recovery procedures need to be designed to handle each partition separately. This requires careful planning and coordination to ensure that all partitions are properly backed up and can be recovered in case of a system failure or error.

In summary, vertical partitioning helps to improve query performance and reduce resource usage, while horizontal partitioning helps to improve scalability and fault tolerance.

Some other types of partitioning are:

  1. Round-robin partitioning: In this method, data is distributed across different partitions in a round-robin fashion, with each partition receiving an equal amount of data. This method is simple to implement and can be useful for load balancing.

  2. Hash partitioning: This method involves distributing data across different partitions based on the value of a hash function applied to a key value. This method can be used to distribute data evenly across partitions and can improve query performance.

  3. List partitioning: In this method, data is partitioned based on a specified list of values. Each partition contains data that matches one or more of the specified values.

  4. Range partitioning: In this method, data is partitioned based on a range of values. Each partition contains data that falls within a specific range of values.
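
The routing rules behind these methods are easy to state in code. The sketch below is illustrative only; the partition boundaries and region lists are invented, and hash partitioning follows the same pattern with a hash function in place of the lookup.

```python
# Illustrative routing rules for round-robin, list, and range partitioning.
import bisect
import itertools

rr_counter = itertools.count()
def route_round_robin(n_parts):
    """Assign rows to partitions in rotation, regardless of content."""
    return next(rr_counter) % n_parts

REGION_LISTS = {0: {"EU", "UK"}, 1: {"US", "CA"}, 2: {"APAC"}}
def route_list(region):
    """Assign a row to the partition whose value list contains its key."""
    for part, values in REGION_LISTS.items():
        if region in values:
            return part
    raise ValueError(f"no partition for region {region!r}")

RANGE_BOUNDS = [1000, 2000, 3000]  # partition 0: <1000, partition 1: <2000, ...
def route_range(order_id):
    """Assign a row to the partition whose value range covers its key."""
    return bisect.bisect_right(RANGE_BOUNDS, order_id)

print(route_list("US"), route_range(1500), route_round_robin(3))
```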

Papers

The three papers have been chosen to show a cross-section of interesting work in the community on various forms of partitioning.

## Overview

The three papers present solutions to different aspects of improving the performance of data querying and processing. Their common goal is to achieve efficient and cost-effective data access and querying.

AutoStore and O2P focus on solving the One-Dimensional Partitioning Problem and propose a self-tuning data store that automatically partitions data at regular intervals based on the current workload. The second paper proposes an adaptive virtual partitioning solution that adjusts partition sizes dynamically to achieve intra-query parallelism in On-Line Analytical Processing (OLAP) query processing. The third paper proposes a granular indexing approach using logical partitioning and lightweight per-partition indexing. This brings the benefits of indexing to in situ query processing while maintaining low index building costs and a small memory footprint. These approaches offer promising solutions to efficient data access and querying, particularly in scenarios where data movement is costly or impractical.

Relax and Let the Database do the Partitioning Online by Jindal and Dittrich addresses the problem of defining appropriate vertical and horizontal partitions for improving performance. The classical offline algorithms are unsuitable due to their high runtime complexity. The paper proposes AutoStore, a self-tuning data store that automatically partitions data at regular intervals based on the current workload.

The paper also proposes an efficient online algorithm called O2P (One-dimensional Online Partitioning) that solves the One-Dimensional Partitioning Problem (1DPP) in an online setting. O2P does not produce the optimal partitioning solution, but it uses several techniques to come up with a greedy solution. This solution is dynamically adapted to workload changes and does not lose much on partitioning quality. It also eliminates the need for human intervention in the partitioning process, making it more adaptable to changing workloads and eliminating the requirement for a skilled DBA.

Adaptive Virtual Partitioning for OLAP Query Processing in a Database Cluster by Lima, Valduriez and Mattoso aims to address heavy-weight and ad-hoc OLAP queries that require a lot of storage capacity and processing power. They propose a cost-effective alternative to a tightly-coupled multiprocessor: a database cluster. They suggest a solution to improve OLAP query processing using a data parallel processing technique called adaptive virtual partitioning, which adjusts partition sizes dynamically without requiring any prior knowledge of the database or DBMS. To test their solution, they developed a Java prototype on a 32-node cluster system and conducted experiments using typical TPC-H benchmark queries. The results showed that their solution provided linear or sometimes super-linear speedup, outperforming traditional virtual partitioning by a factor greater than 10 in many cases.

Adaptive partitioning and indexing for in situ query processing by Olma et al. proposes a novel logical partitioning scheme for raw data files that allows for fine-grained indexing decisions at the level of each partition. This enables lightweight per-partition indexing, which leads to near-optimal data access.

Solutions

Relax and Let the Database do the Partitioning Online

The paper proposes AutoStore, a self-tuning data store that automatically partitions data at regular intervals based on the current workload, improving the performance of business intelligence applications. Numerous advanced offline algorithms exist for recommending appropriate vertical and horizontal partitions, but their runtime complexity is too high, making them unsuitable for online applications where there is limited time to decide on a new partitioning.

Therefore, new algorithms must be developed. These algorithms should automatically decide on data partitioning while the database is running and be able to make decisions to repartition the data. Additionally, any automatic repartitioning must not block or stall access to the database or its objects, which is more likely in archival disk-based databases.

To solve the One-Dimensional Partitioning Problem (1DPP) associated with vertical or horizontal partitioning, the paper proposes the efficient O2P algorithm, which is faster than specialized affinity-based algorithms while still maintaining good partitioning quality. AutoStore, which uses O2P, is part of the OctopusDB project, which aims to create a One Size Fits All Database System.

AutoStore is an online partitioned database store that uses a sliding query window to capture the workload pattern and make confident partitioning decisions. The partitioning unit clusterer is responsible for re-clustering the affinity matrix and updating only the affinities between referenced partitioning units. The partitioning analyzer takes a snapshot of the query window as input, enumerates and analyzes the partitioning candidates, and emits the best partitioning as output. The partitioning optimizer decides whether or not to transform the current partitioning scheme based on the expected costs and benefits of partitioning. AutoStore uses a cost model for full table and index scan operations over one-dimensional partitioned tables and a benefit model to make partitioning decisions.
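
The affinity-matrix bookkeeping of the partitioning unit clusterer can be illustrated with a toy sketch: attributes referenced by the same query gain pairwise affinity. The attribute names and the query window below are invented for the example, not the paper's data structures.

```python
# A toy affinity update in the spirit of AutoStore's partitioning unit
# clusterer: co-accessed attributes accumulate affinity, which a clusterer
# can later use to group columns into partitions.
from collections import defaultdict
from itertools import combinations

affinity = defaultdict(int)

def observe(query_attrs):
    """Bump pairwise affinity for attributes referenced by the same query."""
    for a, b in combinations(sorted(query_attrs), 2):
        affinity[(a, b)] += 1

# A hypothetical sliding query window of three queries:
query_window = [{"name", "address"}, {"name", "address", "balance"}, {"balance"}]
for q in query_window:
    observe(q)

print(affinity[("address", "name")])  # co-accessed in two of the three queries
```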

However, exhaustively enumerating and analyzing all partitioning candidates has exponential complexity, which is not desirable in an online setting.

This paper presents an online algorithm called O2P (One-dimensional Online Partitioning) that solves 1DPP in an online setting. It employs several techniques to optimize the partitioning process, including Partitioning Unit Pruning, where unused partitioning units are immediately pruned into a separate partition. O2P also uses Greedy Split Lines to determine the best split lines one at a time instead of enumerating over all possible combinations. This approach considers the unset split lines and picks the one with the lowest workload execution cost. Furthermore, O2P uses Dynamic Programming to maintain the best split line in each partition and only reevaluates split lines in the most recently split partition.

By utilizing these techniques, O2P ensures efficient partitioning analysis while maintaining uninterrupted query processing. In summary, O2P first finds the best split line and its corresponding cost in the left and right parts of the last partition and in all previous partitions. If no valid split line is found, O2P terminates. Otherwise, it compares these three split lines, chooses the one with the lowest cost, and repeats the process. O2P therefore uses several techniques to arrive at a greedy solution that dynamically adapts to workload changes. Even though O2P does not produce the optimal partitioning solution, it loses little in partitioning quality.

Adaptive Virtual Partitioning for OLAP Query Processing in a Database Cluster

Simple Virtual Partitioning (SVP) achieves intra-query parallelism in OLAP query processing by adding predicates to incoming queries. This limits the amount of data processed by each DBMS instance and forces each instance to process a different subset of data items. SVP targets queries that contain an aggregation function and access a large table, and the virtual partitioning attribute is chosen as an attribute with a large range of values. Each node receives the same query but with different predicate values, allowing great flexibility for node allocation during query processing. SVP is simple, but its efficiency depends on factors such as the presence of a clustered index on the partitioning attribute and a uniform value distribution for that attribute.
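
The query rewrite behind SVP can be sketched as splitting the partitioning attribute's value range into one interval per node and appending the corresponding predicate. The SQL text, attribute, and bounds below are illustrative (the base query is loosely modeled on TPC-H Q1), not taken from the paper.

```python
# A sketch of Simple Virtual Partitioning: the same query is sent to every
# node, each with a different range predicate on the partitioning attribute.

def svp_subqueries(sql, attr, lo, hi, n_nodes):
    """Split [lo, hi) into n_nodes equal ranges and append a predicate each."""
    step = (hi - lo) // n_nodes
    subqueries = []
    for i in range(n_nodes):
        a = lo + i * step
        b = hi if i == n_nodes - 1 else a + step  # last range absorbs the remainder
        subqueries.append(f"{sql} AND {attr} >= {a} AND {attr} < {b}")
    return subqueries

base = "SELECT SUM(l_extendedprice) FROM lineitem WHERE l_shipdate < DATE '1998-09-02'"
for q in svp_subqueries(base, "l_orderkey", 1, 6000001, 4):
    print(q)
```

Each node then runs its sub-query locally, and a coordinator combines the partial aggregates into the final result.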

To overcome the problems with SVP, the authors propose fine-grained virtual partitioning (FGVP). FGVP uses an initial number of virtual partitions greater than the number of participating nodes, so that each cluster node processes a set of small, lightweight sub-queries. FGVP is more robust, as it is less vulnerable to the weaknesses of SVP, allows for dynamic load balancing, and avoids the use of temporary disk-based structures in some join processing techniques. The size of each virtual partition is directly related to the number of sub-queries, and having too many sub-queries can degrade performance. In FGVP, estimates based on database statistics and query processing time are used to determine the size of each virtual partition.

The fundamental problem in FGVP is determining the sizes of the virtual partitions for each sub-query, which amounts to finding value intervals for each sub-query. The cost of processing a query consists of several stages, such as generating a query execution plan, initialization, processing, and releasing resources. Different stages are affected differently by the amount of data accessed. For example, the time required for generating a query execution plan depends more on the complexity of the query statement than on the number of bytes read from disk, while the cost of releasing resources depends on the amount of data processed and the operations performed. The performance of FGVP can therefore be poor if there are too many small sub-queries or a small number of large sub-queries.

Several methods can address this performance issue. One approach is to compute partition sizes before query execution based on information such as table cardinality and attribute value distribution. However, this approach is not applicable to black-box DBMSs, as it would require developing specific components for each DBMS.

Instead, the authors use a more dynamic approach for Adaptive Virtual Partitioning (AVP). In this approach, each node receives an interval of values to work with, and the algorithm performs the following steps: start with a very small partition size, increase the partition size while the execution time increases proportionally less than the partition size, and stop increasing when a stable size is found. If performance deterioration is detected, the algorithm decreases the size and starts again. The partition tuner component interacts with the query executor component to implement AVP. The query executor component executes one or more sub-queries, demands a partition size from the partition tuner component, and provides feedback until the entire node-assigned interval has been processed.
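
The grow-while-sublinear, shrink-on-deterioration loop described above can be sketched as follows. The `run_subquery` callback, growth factor, and cost model are all invented stand-ins for issuing real sub-queries; this is an illustration of the tuning idea, not the paper's implementation.

```python
# A sketch of the AVP partition-size tuning loop: grow the virtual partition
# while execution time grows sublinearly with size, shrink on deterioration.

def tune_partition_size(run_subquery, initial_size, total, growth=2.0):
    """Process `total` rows in sub-queries, adapting the partition size."""
    size = initial_size
    done = 0
    prev_rate = None  # cost per row; lower is better
    while done < total:
        n = min(size, total - done)
        elapsed = run_subquery(done, n)
        rate = elapsed / n
        if prev_rate is None or rate <= prev_rate:
            prev_rate = rate
            size = int(size * growth)            # still scaling well: grow
        else:
            size = max(initial_size, size // 2)  # deterioration: shrink
        done += n
    return size

# A fake cost model: per-row cost jumps once a sub-query exceeds 800 rows
# (e.g. it no longer fits the buffer pool).
def fake_run(offset, n):
    return n * (0.001 if n <= 800 else 0.01)

final = tune_partition_size(fake_run, 100, 5000)
print(final)  # settles around the 800-row sweet spot
```

The interplay mirrors the paper's partition tuner and query executor: the executor reports feedback per sub-query, and the tuner adjusts the next partition size until the full node-assigned interval is processed.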

Adaptive partitioning and indexing for in situ query processing

In situ methods typically require significant computational resources to index and query the data. This paper addresses the challenge by introducing a granular way of indexing, which brings the benefits of indexing to in situ query processing. The approach has several advantages, including low index building cost, a small memory footprint, and improved query processing efficiency. Moreover, the partitioning and indexing decisions are refined on the fly using an online randomized algorithm, which further enhances the efficiency of the system.

In Adaptive partitioning and indexing for in situ query processing, the authors implement the Slalom query engine using C++. This engine utilizes the Volcano iterator model and consists of several modules, including the Partitioning and Indexing Managers and the Structure Refiner, which are all attached to the Query Executor. Slalom populates binary caches at a partition granularity and utilizes secondary indexes stored in a cache-friendly in-memory B+ tree implementation. The Statistics Store is also implemented as a separate daemon process that gathers and persists query statistics. To reduce the cost of raw data access, Slalom uses vectorized parsers, binary caches, and positional maps (PMs).

One of the key advantages of the above approach is that it offers the benefits of in situ methods while maintaining low index building costs and a small memory footprint. In situ methods involve processing data in place, as it is generated, rather than moving it to a separate system for processing. This approach can be beneficial in scenarios where data movement is costly or impractical.

The CSV parser utilizes SIMD instructions to scan vectors of 256 bytes from input files and identify delimiters. PMs are used to store delta distances for each tuple and field to reduce memory footprint, and the Partition Manager maps partitions to their corresponding PM portions. The Structure Refiner manages memory usage and decides which structures to drop on a partition basis to operate under limited resources.
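
The delta-distance idea behind positional maps can be illustrated on a single CSV line: instead of absolute byte offsets, only the distance from the previous field's start is stored, and absolute offsets are reconstructed on demand. The line and field layout below are invented for the example.

```python
# A toy positional map: per field, store the delta distance from the previous
# field's start rather than an absolute offset, shrinking the footprint.

def build_positional_map(line, delim=","):
    """Record each field's start position as a delta from the previous one."""
    deltas, prev, start = [], 0, 0
    for i, ch in enumerate(line):
        if ch == delim:
            deltas.append(start - prev)
            prev, start = start, i + 1
    deltas.append(start - prev)  # final field
    return deltas

def field_offset(deltas, field_idx):
    """Reconstruct the absolute byte offset of a field from the deltas."""
    return sum(deltas[: field_idx + 1])

line = "42,Alice,NYC,2023"
pm = build_positional_map(line)
print(pm, field_offset(pm, 2))  # deltas, then the offset of the third field
```

A real positional map stores such deltas compactly per tuple so the parser can jump straight to a requested field without re-tokenizing the whole line.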

Slalom aims to reduce the cost of raw data access and optimize memory usage while still providing efficient query processing capabilities. The authors also exploit specialized hardware, such as GPUs and CRC checksum units, to reduce update recognition costs and minimize changes to partitioning and indexing. This approach reduces the query execution overhead in the presence of updates.

Algorithms

Relax and Let the Database do the Partitioning Online

The paper introduces an online algorithm called O2P (One-dimensional Online Partitioning) that solves the One-Dimensional Partitioning Problem (1DPP). The O2P algorithm is used by AutoStore, a self-tuning data store that automatically partitions data based on the current workload to gradually adapt partitions to fit the observed query workload without requiring human intervention.

The O2P algorithm uses several techniques to come up with a greedy solution that is dynamically adapted to workload changes and does not lose much on partitioning quality. These techniques include Partitioning Unit Pruning, Greedy Split Lines, and Dynamic Programming.

The algorithm treats partitioning as a general 1DPP problem, with subproblems VPP and HPP handled equivalently by rotating the table and changing partitioning units. AutoStore uses a cost model for full table and index scan operations over one-dimensional partitioned tables and a benefit model to make partitioning decisions.
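
The greedy split-line idea can be made concrete with a toy cost model: instead of enumerating all 2^(n-1) combinations of split lines, set one split line at a time, keeping the choice that lowers workload cost most. The attribute widths, workload, and cost function below are invented for illustration and are much simpler than AutoStore's cost model.

```python
# A toy greedy split-line search in the spirit of O2P's Greedy Split Lines.
# Each query pays the width of every partition it touches, weighted by its
# frequency; splitting stops when no split line improves the cost.

attr_sizes = [8, 64, 4, 4]                      # widths of attributes a0..a3
workload = [({0}, 10), ({1}, 5), ({2, 3}, 8)]   # (attrs accessed, frequency)

def cost(splits):
    """Workload cost of the vertical partitioning induced by `splits`."""
    bounds = sorted(splits) + [len(attr_sizes)]
    parts, lo = [], 0
    for hi in bounds:
        parts.append(set(range(lo, hi)))
        lo = hi
    total = 0
    for attrs, freq in workload:
        for p in parts:
            if attrs & p:  # query touches this partition: pay its full width
                total += freq * sum(attr_sizes[i] for i in p)
    return total

def o2p_greedy():
    splits = set()
    best = cost(splits)
    while True:
        candidates = [s for s in range(1, len(attr_sizes)) if s not in splits]
        scored = [(cost(splits | {s}), s) for s in candidates]
        if not scored or min(scored)[0] >= best:
            return splits, best
        best, s = min(scored)
        splits.add(s)

splits, best = o2p_greedy()
print(sorted(splits), best)
```

Here the greedy search isolates the wide, frequently scanned attribute a1 without ever enumerating all split-line combinations, which is the point of the technique.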

Adaptive Virtual Partitioning for OLAP Query Processing in a Database Cluster

The query handling process in the system proposed involves several components and stages. It starts with the Client Proxy receiving a query from a client application, which is then received by the Cluster Query Processor (CQP). The CQP creates a Global Query Task (GQT) object that acquires metadata from the Catalog and determines partitioning attributes. The Global Result Collector (GRC) handles the additional steps required for virtual partitioning after sub-queries are completed to produce accurate final results.

The GQT creates a new GRC object and sends the Query Execution Plan (QEP) to the Node Query Processor (NQP) of each participating node. Each NQP processes the received QEP locally and sends the local results to the GRC. After all NQPs have finished their tasks, GQT waits for the final result, which is sent by GRC. The results are then sent to CQP and delivered to the Client Proxy, which returns them to the client application.

Each node has three main components: the Node Query Processor (NQP), the DBGateway (DBG), and the DBMS instance with the local database. Upon receiving a QEP, the NQP creates a Local Query Task (LQT) object responsible for executing the client query. LQT creates a PartitionTuner object and a Local Result Collector (LRC) object in separate threads. LRC performs the same tasks as GRC, but at a node level to avoid bottlenecks.

LQT creates a Query Executor (QE) that prepares the final SQL queries to be submitted by DBGateway to the DBMS. QE communicates with PartitionTuner, which dynamically calculates and adjusts partition sizes using the adaptive algorithm described in Section 2. DBGateway provides a pool of DBMS connections, saving time during query execution, and making other middleware components DBMS-independent.

After receiving a query, DBG submits it to the DBMS, and the result is passed to the corresponding QE, which then sends it to its associated LRC. Once all virtual partitions have been processed, QE sends a message to LQT indicating its work is done. Finally, LQT receives the final local result from LRC and sends it to NQP, which sends the result to the Global Result Collector and notifies the Global Query Task that its work is finished.

Adaptive partitioning and indexing for in situ query processing

Slalom is a query engine that evaluates its current state along with workload statistics during query execution. Initially, Slalom lacks both data and query workload information, and the first query accesses the data file without any assistance from auxiliary structures. Slalom constructs a partition map, accesses the required data, and places it in a cache. For subsequent queries, Slalom gathers statistics about the data distribution of the accessed attributes and the average query selectivity to determine whether logical partitioning would enhance performance.

Once a partition has reached its stable state, Slalom divides the partition into subsets. Starting from the initial state A, Slalom executes some queries and, in state B, has built a binary cache and a partition map for the accessed attributes. It logically partitions the file into two chunks, with the first chunk (partition 1) declared to be in a stable state. Slalom checks the stable partitions for the presence of indexes, and if no index exists, it uses a randomized algorithm to determine whether to construct one.

In state C, Slalom has executed additional queries and indexed partition 1 based on the query access pattern. Partition 2 of state B is further split into several partitions, of which partition 2 is declared stable and an index is built on it. Slalom evaluates its current state along with workload statistics and updates its auxiliary structures accordingly during query execution.
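
One common shape for such a randomized build-or-not decision is the ski-rental style shown below: after each scan of a stable partition, build an index with probability proportional to the scan cost relative to the build cost, so the expected scanning done before indexing stays within a constant factor of the build cost. This is a generic sketch of that idea, not Slalom's actual algorithm; the costs and seed are invented.

```python
# A sketch of a randomized per-partition index-building decision.
# With probability scan_cost / build_cost per access, commit to building.
import random

def maybe_build_index(scan_cost, build_cost, rng):
    """Return True if this access should trigger index construction."""
    return rng.random() < scan_cost / build_cost

rng = random.Random(42)  # fixed seed for a reproducible example
accesses_before_build = 0
while not maybe_build_index(scan_cost=1.0, build_cost=10.0, rng=rng):
    accesses_before_build += 1
print(accesses_before_build)
```

The appeal of a randomized rule is that partitions queried often tend to get indexed quickly, while rarely touched partitions are unlikely to ever pay the build cost.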

Experiments

Relax and Let the Database do the Partitioning Online

Experimental results on TPC-H datasets show that AutoStore outperforms row and column layouts by up to a factor of 2. In contrast to previous approaches, AutoStore removes the need for human intervention in the partitioning process, making it more adaptable to changing workloads and eliminating the requirement for a skilled DBA.

The authors tested the O2P algorithm on multiple datasets and workloads, comparing its performance to two algorithms proposed by Navathe and Hankins, hereafter referred to as NV/HC. They found that O2P outperformed NV/HC on both the Star Schema Benchmark and the TPC-H dataset. O2P performed best on the Lineitem table, completing the task in just 42 iterations compared to NV/HC's 32,768 iterations.

The study also evaluated the actual running time of different O2P variants while varying the read-only workload, and found that O2P outperformed NV/HC on the Lineitem table by up to two orders of magnitude. The quality of partitioning produced by O2P was also analyzed and found to be high. The scalability of O2P was tested by varying the workload size and was found to scale linearly with it. The authors concluded that O2P was the best algorithm for future use.

The authors evaluate the query execution performance of AutoStore in comparison to No and Full Vertical Partitioning, using a main-memory implementation of AutoStore in Java and a universal relation denormalized from a variant of the TPC-H schema. They choose a vertical partition with certain attributes and map them to integer values with the same domain cardinality, with a scale factor of 1. They vary the fraction of data accessed and the OLTP/OLAP read access patterns, with a step size of 0.01%. They find that AutoStore automatically adapts to the changing workload and matches or improves on Full Vertical Partitioning performance. Considering only O2P, they show the performance of AutoStore when varying the query window size, finding that larger query windows become slower because the partitioning analyzer has to estimate the costs of more queries.

The results show that O2P is faster than earlier approaches by more than two orders of magnitude, and still produces good quality partitioning results. Additionally, the results show that over changing workloads AutoStore outperforms existing stores.

Adaptive Virtual Partitioning for OLAP Query Processing in a Database Cluster

The authors analyzed the performance of two different partitioning strategies, SVP and AVP, on five queries (Q1–Q5), comparing their performance across different numbers of nodes.

For Q1, both strategies were evaluated, and AVP was found to perform better than SVP in all configurations. For Q2, AVP was also found to perform better than SVP with 1, 2, and 4 nodes. However, with 8 nodes, SVP performed better than AVP, and with 32 nodes, SVP outperformed AVP by a factor of 1.84.

For Q3, SVP performed better than AVP with one node, but AVP outperformed SVP with 2 to 32 nodes; with 32 nodes, AVP was 28.72 times faster than SVP. For Q4, SVP outperformed AVP with 1 and 2 nodes by factors of 3.09 and 1.56, respectively, but with 4 to 32 nodes AVP performed better. With 32 nodes, AVP outperformed SVP by a factor of 22.10.

Finally, the authors found that AVP outperformed SVP in all configurations for Q5.

Adaptive partitioning and indexing for in situ query processing

To evaluate Slalom, the authors integrated their partitioning and indexing tuner into it. They used synthetic and real-life workloads to compare the query latency of Slalom with other approaches, including a traditional DBMS, a state-of-the-art in situ query processing engine, and adaptive indexing (cracking).

The results of the experiments showed that, even when excluding the data loading cost, Slalom offered the fastest cumulative query latency. In particular, Slalom outperformed state-of-the-art disk-based approaches by one order of magnitude, state-of-the-art in-memory approaches by 3.7x (with a 2.45x smaller memory footprint), and adaptive indexing by 19% (with a 1.93x smaller memory footprint).

Finally, the authors examined the performance of Slalom in the presence of both in-place and append-like updates. The results showed that Slalom was able to handle updates efficiently, thanks to its use of specialized hardware and refined partitioning and indexing decisions. Overall, the proposed approach offers a promising solution for efficient data access and querying, particularly in scenarios where data movement is costly or impractical.

Another point that matters, especially given the Slalom paper's reliance on specialized hardware such as GPUs, is the trade-off between traditional CPUs and newer GPUs for databases. GPUs have a clear advantage for a certain range of database queries simply because they offer enormous memory bandwidth and computational throughput. However, databases are read/write-heavy systems, and graphics cards are not especially well suited to that purpose. They excel at numerical operations and so could provide some benefit when processing complex requests, but a faster disk structure is far more beneficial.

Also, data centers typically use servers rather than personal computers, and servers traditionally did not support large GPUs, although this has started to change in recent years. Additionally, the trend is towards reducing power consumption, which is hard to reconcile with adding hundreds or thousands of power-hungry GPUs to already energy-intensive data centers.

Strengths

Relax and Let the Database do the Partitioning Online

The paper has several notable strengths: adaptability, physical data independence, a high-quality solution to the One-Dimensional Partitioning Problem, and better performance than other data layouts.

First, AutoStore is able to adapt to changes in workload without human intervention, meaning that it can self-tune its data layout and partitioning. This is a significant benefit, as it removes the need for a skilled database administrator to manage partitioning and tuning, saving the organization time and resources.

Secondly, the O2P algorithm used by AutoStore is able to solve the One-Dimensional Partitioning Problem (1DPP) and make confident partitioning decisions that are dynamically adapted to workload changes. This results in a greedy solution that is of high quality and adaptable to changing workloads.
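
The greedy search at the heart of O2P's approach to the 1DPP can be illustrated with a simplified sketch. This is not the authors' implementation: the cost model below (partition width plus a fixed `SEEK_COST` penalty per partition a query touches) and the brute-force enumeration of candidate splits are assumptions made for illustration, whereas O2P prunes and amortizes this search to stay fast online.

```python
from itertools import combinations

SEEK_COST = 1  # hypothetical fixed penalty per partition a query must touch

def workload_cost(partitions, queries):
    # scan cost (partition width) plus a seek penalty for every
    # partition that a query references
    return sum(sum(len(p) + SEEK_COST for p in partitions if p & q)
               for q in queries)

def greedy_vertical_partition(attributes, queries):
    """Greedily split one partition at a time, keeping a split only if it
    lowers the workload cost (a simplified, brute-force stand-in for O2P)."""
    partitions = [set(attributes)]            # start with the full relation
    improved = True
    while improved:
        improved = False
        best_cost, best = workload_cost(partitions, queries), None
        for i, part in enumerate(partitions):
            for r in range(1, len(part)):
                for left in combinations(sorted(part), r):
                    left = set(left)
                    cand = partitions[:i] + [left, part - left] + partitions[i + 1:]
                    cost = workload_cost(cand, queries)
                    if cost < best_cost:
                        best_cost, best = cost, cand
        if best is not None:
            partitions, improved = best, True
    return partitions
```

For a workload that reads attributes `{a, b}` together and `{c}` alone, the sketch keeps the co-accessed attributes in one partition and isolates the rest, which is the intuition behind workload-driven vertical partitioning.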

Thirdly, AutoStore provides physical data independence, which means that applications can be developed without knowledge of the underlying physical data layout.

Finally, experimental results demonstrate that AutoStore outperforms row and column layouts by up to a factor of 2 and outperforms NV/HC on both the Star Schema Benchmark and the TPC-H dataset. Overall, these strengths make AutoStore a promising solution for organizations seeking a self-tuning, adaptable data store that provides physical data independence and high-performance partitioning.

Adaptive Virtual Partitioning for OLAP Query Processing in a Database Cluster

The AVP paper presents several strengths of the system, which include significant gains in flexibility, efficiency, and robustness.

Firstly, the AVP (Adaptive Virtual Partitioning) method used by the system adjusts partition sizes dynamically, which enhances its efficiency and flexibility compared to other partitioning schemes. This means that the system can automatically adapt to changes in data size and distribution, resulting in improved performance and scalability.
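
The dynamic sizing described above can be sketched as follows. Each node processes its share of the data as a sequence of virtual partitions, i.e., the same query with an added range predicate on the partitioning key, and grows or shrinks the partition according to how long each sub-query takes. The function names, the doubling/halving policy, and the `target_time` threshold are illustrative assumptions, not the paper's exact algorithm.

```python
import time

def avp_run(node_query, key_min, key_max, initial_size, target_time):
    """Process [key_min, key_max) as a sequence of virtual partitions,
    adapting the partition size to the observed execution time."""
    size, lo, results = initial_size, key_min, []
    while lo < key_max:
        hi = min(lo + size, key_max)
        start = time.monotonic()
        results.append(node_query(lo, hi))   # query with key >= lo AND key < hi
        elapsed = time.monotonic() - start
        if elapsed < target_time / 2:
            size *= 2                        # DBMS answers quickly: grow
        elif elapsed > target_time:
            size = max(1, size // 2)         # sub-query too slow: shrink
        lo = hi
    return results
```

Starting small and shrinking on slow sub-queries also limits the damage when the underlying DBMS decides to fall back to a full scan for a large range.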

Secondly, the paper highlights the strength of the FGVP (fine-grained virtual partitioning) approach over the SVP method in handling join queries. FGVP is more robust and efficient than SVP, especially for complex queries involving large data sets. This makes the system more attractive to users who require high-performance query processing.

Lastly, the paper provides a comprehensive overview of the system architecture and query processing stages, which enables users to understand the underlying mechanisms of the system. This description also allows for better evaluation and comparison of the system with other similar database management systems.

Overall, the strengths of the system described in the paper demonstrate its potential as an efficient and scalable database management system, particularly for handling large and complex queries. The system's ability to adapt to changing data sizes and distribution, robustness in handling joins, and comprehensive documentation of its architecture and processing stages make it suitable for use in a variety of applications.

Adaptive partitioning and indexing for in situ query processing

The SLALOM paper offers several strengths: efficient data access and querying with low index-building costs, an online randomized algorithm for refining partitioning and indexing decisions, and the integration of specialized hardware to minimize query execution overhead. It delivers faster query latency than traditional DBMSs and state-of-the-art in situ query processing engines, and it can augment the design of other systems by enabling data skipping and indexed accesses while constantly adapting to queries.

Firstly, the approach enables efficient data access and querying with low index building costs and a small memory footprint. This is achieved through a logical partitioning scheme for raw data files that allows for fine-grained indexing decisions at the level of each partition. By doing so, lightweight per-partition indexing is enabled, which leads to near-optimal data access.
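
The combination of data skipping and lazy per-partition indexing can be sketched for a single key column as follows. Min/max synopses let non-matching partitions be skipped outright, and a hash index is built only once a partition has proven hot. The access threshold of 3 is an arbitrary stand-in for Slalom's randomized, cost-based decision.

```python
class Partition:
    """One logical partition of a raw file, holding values of a key column."""
    def __init__(self, rows):
        self.rows = rows
        self.min, self.max = min(rows), max(rows)
        self.index = None        # built lazily, only for hot partitions
        self.accesses = 0

    def lookup(self, key):
        if key < self.min or key > self.max:
            return []            # data skipping: partition cannot match
        self.accesses += 1
        if self.index is None and self.accesses > 3:
            self.index = {}      # partition is hot: pay the indexing cost once
            for pos, value in enumerate(self.rows):
                self.index.setdefault(value, []).append(pos)
        if self.index is not None:
            return self.index.get(key, [])
        return [pos for pos, value in enumerate(self.rows) if value == key]
```

Because the index is confined to one partition, a cold or rarely queried partition never pays the indexing cost, which is what keeps the overall memory footprint small.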

Secondly, the online randomized algorithm used for refining partitioning and indexing decisions is another strength of the approach. The algorithm enables the system to adapt to changing data access patterns and workload statistics, which further enhances the efficiency of the system.

Thirdly, the approach integrates specialized hardware to reduce update recognition costs and minimize changes to partitioning and indexing. The use of GPUs and CRC checksum units reduces the query execution overhead in the presence of updates, enabling efficient handling of updates.
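
The role of checksums in update recognition can be illustrated with zlib's software CRC32; the system itself leans on hardware checksum support, which this sketch does not attempt to model. Only partitions whose checksum changed, plus any appended partitions, need their partitioning and indexing decisions revisited.

```python
import zlib

def partition_checksums(data: bytes, part_size: int):
    """CRC32 over each logical partition of a raw file's bytes."""
    return [zlib.crc32(data[i:i + part_size])
            for i in range(0, len(data), part_size)]

def changed_partitions(old_sums, new_data, part_size):
    """Indices of partitions whose contents changed (in place or appended),
    so only their per-partition indexes need rebuilding."""
    new_sums = partition_checksums(new_data, part_size)
    changed = [i for i, (old, new) in enumerate(zip(old_sums, new_sums))
               if old != new]
    changed.extend(range(len(old_sums), len(new_sums)))  # appended partitions
    return changed
```

This is why both in-place and append-like updates are cheap to recognize: a single pass of checksumming localizes the change to a few partitions instead of invalidating all indexing work on the file.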

Fourthly, the experimental results showed that the approach offered faster query latency than traditional DBMSs, state-of-the-art in situ query processing engines, and adaptive indexing approaches. Even when excluding the data loading cost, the proposed approach, implemented in the Slalom query engine, outperformed other approaches by 3.7x (with a 2.45x smaller memory footprint).

Lastly, the proposed approach can augment the design of other systems by enabling data skipping and indexed accesses while constantly adapting its indexing and partitioning schemes to queries. This makes it a promising solution for efficient data access and querying, particularly in scenarios where data movement is costly or impractical.

Weaknesses

Relax and Let the Database do the Partitioning Online

While AutoStore offers many strengths, there are a few weaknesses to consider.

One major weakness is that it does not produce optimal partitioning solutions. The O2P algorithm uses several techniques to come up with a greedy solution that is dynamically adapted to workload changes, but it may not always produce the most efficient partitioning scheme possible.

Another potential weakness is the overhead introduced by partitioning changes. Repartitioning can be a time-consuming process, and if the database or access to entire tables is blocked or stalled during this process, it could significantly impact the performance of the system. Therefore, it is important to carefully consider the frequency and impact of partitioning changes in order to minimize overhead.

Additionally, while AutoStore is designed to be adaptable to changes in the query workload, it may not be able to respond quickly enough to sudden or unexpected spikes in traffic. In these cases, the system may not have enough time to adjust partitioning to meet the demands of the workload, which could result in slower query response times or other performance issues.

Finally, it is worth noting that while AutoStore's physical data independence and DBA-oblivious tuning features offer many benefits, they also remove some level of control from the database administrator. Some DBAs may prefer to have more control over the partitioning scheme and tuning process, and may not feel comfortable relying solely on an automated system to make these decisions.

Adaptive Virtual Partitioning for OLAP Query Processing in a Database Cluster

The AVP paper has some notable weaknesses that should be considered.

Firstly, while the technical details of the proposed solution are discussed in depth, there is a lack of focus on the practical application of the solution. This could be problematic as the solution may be difficult to implement or use in real-world scenarios. Therefore, the authors may need to provide more information about the practical aspects of their proposed solution.

Secondly, the experiments carried out in this paper were performed using a Java prototype. This means that the results obtained may not be directly applicable to other programming languages or environments. This is an important limitation to note, as the performance of the proposed solution may vary significantly depending on the programming language or environment used. Thus, further experiments should be conducted to evaluate the solution's effectiveness across different environments.

Finally, the paper does not include a comparison of the proposed solution with other database clustering techniques. Without a comparison, it is challenging to determine the relative strengths and weaknesses of the proposed solution compared to other clustering techniques. This is an essential aspect to consider when evaluating the feasibility and potential impact of the proposed solution. Therefore, the authors should provide a thorough comparison of their solution with other clustering techniques to help readers gain a better understanding of its advantages and limitations.

Adaptive partitioning and indexing for in situ query processing

The SLALOM paper also has some potential weaknesses that need to be considered.

One weakness of the approach is its limited evaluation using real-life workloads. While the authors use both synthetic and real-life workloads to evaluate their approach, the real-life workloads are not sufficiently diverse to cover all possible scenarios. This means that the performance of the approach in other contexts may be different and may require further testing and evaluation.

Another weakness of the approach is its dependency on specialized hardware, which may not be available in all environments. The authors exploit specialized hardware, such as GPUs and CRC checksum units, to reduce update recognition costs and minimize changes to partitioning and indexing. While this approach can improve performance, it may also limit the applicability of the approach in environments that lack such specialized hardware. This means that the approach may not be suitable for all scenarios and may require adaptations to work effectively in different contexts.

In conclusion, the proposed approach offers several benefits for in situ query processing, such as low index building cost, a small memory footprint, and improved query processing efficiency. However, there are also some potential weaknesses that need to be considered. The limited evaluation of the approach using real-life workloads and its dependency on specialized hardware may limit its applicability in certain contexts and require further testing and evaluation.

Conclusion

Partitioning in databases is a widely studied topic that aims to improve the performance, scalability, and manageability of large-scale data processing systems. This survey paper has presented an overview of the main partitioning techniques used in databases, including horizontal and vertical partitioning, along with their advantages, limitations, and use cases. Thereafter, three papers were examined for their contributions to partitioning in databases, focusing on how they address specific performance and scalability challenges.

Overall, the survey highlights the importance of partitioning in databases and its role in optimizing the storage, retrieval, and processing of large-scale datasets. While partitioning can significantly improve the performance and scalability of data processing systems, it requires careful consideration of various factors, such as data distribution, workload patterns, and hardware resources. Moreover, the effectiveness of partitioning techniques may vary depending on the specific use case and system architecture, and therefore thorough analysis and experimentation are necessary to identify the optimal partitioning strategy. The approaches surveyed present fairly established solutions to the issues highlighted, exceeding the benchmarks of their time in each case. However, a few weaknesses come to light, some of which have been addressed in later works.

The majority of a database's tasks involve organizing access to memory and disk, which are not easily parallelizable. Many operations share a common memory pool, so the use of a GPU is typically unnecessary. However, certain tasks, such as sorting, analytics, compression, hashing, cryptography, and parsing, may benefit from offloading to a GPU to alleviate strain on the CPU. For certain types of machine learning tasks on databases, GPUs are useful, but more general tasks benefit more from CPUs.

In conclusion, partitioning in databases remains a vibrant area of research and practice, with numerous challenges and opportunities for innovation and improvement. We hope that this survey paper provides a useful reference for researchers, practitioners, and database users interested in the field of partitioning and its applications in large-scale data processing.