November 9, 2011

Partitioning

We distinguish between two partitioning approaches: vertical and horizontal partitioning, whereas a combination of both approaches is also possible.

Vertical partitioning refers to rearranging individual database columns. It is achieved by splitting columns of a database table in two or more column sets. Each of the column sets can be distributed on individual databases servers. This can also be used to build up database columns with different ordering to achieve better search performance while guaranteeing high-availability of data. Key to success of vertical partitioning is a thorough understanding of the application’s data access patterns. Attributes that are accessed in the same query should rely in the same partition since locating and joining additional may degrade overall performance. In contrast, horizontal partitioning addresses long database tables and how to divide them into smaller pieces of data. As a result, each piece of the database table contains a subset of the complete data within the table.

Splitting data in equivalent long horizontal partitions is used to support search operations and better scalability. For example, a scan of the request history results in a full table scan. Without any partitioning a single thread needs to access all individual history entries and checks the selection predicate. When using a naïve round robin horizontal partitioning across ten partitions, the total table scan can be performed in parallel by ten simultaneously processing threads reducing response time by approx. 1/9 compared to the single threaded full table scan. This example shows that the resulting partitions depend on the incorporated partitioning strategy. For example, rather than using round-robin as partitioning strategy attribute ranges can be used, e.g. inquirers are portioned in groups of 1,000 with the help of their user id or requested product id.

Please also see our podcast on this technology concept.

No comments:

Post a Comment