A brief introduction to the concepts of OLTP, OLAP, and column storage


Organized from the second half of Chapter 3 of Designing Data-Intensive Applications.


OLTP (online transaction processing)

In the early days of business data processing, a write to the database typically corresponded to a commercial transaction: a sale, an order, or some other exchange of money. This is where the term "transaction" comes from; it refers to a group of reads and writes that forms a logical unit.

Even though databases are now used for many kinds of data, the access pattern is often similar: an application looks up a small number of records by some key, using an index, or inserts or updates records based on the user's input. Because these applications are interactive, this access pattern became known as online transaction processing (OLTP).


OLAP (online analytic processing)

Databases are also increasingly being used for data analytics. An analytic query needs to scan over a huge number of records, usually reading only a few columns per record, and calculates aggregate statistics (such as count, sum, or average) rather than returning the raw data to the user.

For example, calculating the average sales of each store. Such queries are usually written by business analysts and feed into reports that help a company's management make better decisions. To distinguish this pattern from transaction processing, it is called online analytical processing (OLAP).


OLTP vs. OLAP

| Property | Transaction processing systems (OLTP) | Analytic systems (OLAP) |
| --- | --- | --- |
| Main read pattern | Key-based queries; each query returns a small number of records | Aggregation over a large number of records |
| Main write pattern | Random-access, low-latency writes | Bulk import (ETL) or event stream |
| Primarily used by | End users, via a web application | In-house analysts, for decision support |
| What the data represents | Latest state of the data (current point in time) | History of events that happened over time |
| Dataset size | Gigabytes to terabytes | Terabytes to petabytes |

At first, the same databases were used for both transaction processing and analytic queries. Over time, however, large companies stopped running analytics on their OLTP systems and instead ran the analysis on a separate database. This separate database became known as the data warehouse.


Data warehouse

A large enterprise typically has dozens of different transaction processing systems. Because these OLTP systems are critical to business operations, administrators are usually reluctant to let analysts run ad hoc analytic queries directly on them: such queries are expensive, since they scan large parts of the dataset, and they can harm the performance of transactions that are executing concurrently.

It is therefore common practice to use a separate database, the data warehouse, which contains a read-only copy of the data from all the company's OLTP systems. Data is periodically extracted from the OLTP databases, transformed into an analysis-friendly schema, cleaned up, and then loaded into the data warehouse.

The process of getting data into the data warehouse is known as Extract-Transform-Load (ETL):

[Figure: ETL from the OLTP systems into the data warehouse]

The big advantage of using a separate data warehouse, rather than querying the OLTP systems directly for analytics, is that the data warehouse can be optimized for analytic access patterns.
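As a rough sketch of a single ETL step (all record shapes and names here are hypothetical; a real pipeline would read from the OLTP database and use the warehouse's bulk loader):

from datetime import date

# Extract: rows as they might look in a hypothetical OLTP orders table.
oltp_orders = [
    {"order_id": 1, "product": "Fresh fruit", "qty": 3, "ts": "2024-05-01T10:15:00"},
    {"order_id": 2, "product": "Candy", "qty": 1, "ts": "2024-05-01T11:02:00"},
]

def transform(order):
    """Convert an OLTP row into an analysis-friendly fact row."""
    day = date.fromisoformat(order["ts"][:10])
    return {
        "date_key": day.toordinal(),  # surrogate key into a date dimension
        "product": order["product"],
        "quantity": order["qty"],
    }

def load(fact_rows, warehouse):
    """Append the transformed rows to the (in-memory stand-in for a) fact table."""
    warehouse.extend(fact_rows)

fact_sales = []  # stand-in for the warehouse's fact table
load([transform(o) for o in oltp_orders], fact_sales)
print(fact_sales)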


Difference Between OLTP Database and Data Warehouse

Since SQL is generally a good fit for analytic queries, there are many graphical data analysis tools that generate SQL queries, visualize the results, and let analysts explore the data through operations such as drill-down, slicing, and dicing. For this reason, the most common data model in data warehouses is relational.

Note:

  • Drilling down, slicing, and dicing are all commonly used data analysis techniques in the field of big data.

  • Drill Down refers to the method of digging deeper into data by expanding data details layer by layer.

    • For example, in a sales data report, we can drill down to understand the sales situation of a certain region, and then further understand the sales situation of a certain sales point in the region, and so on, until the most detailed data.
  • Slicing refers to cutting data according to a certain dimension in order to better understand the distribution of data.

    • For example, in a customer satisfaction survey data, we can slice the data according to different regions to understand the customer satisfaction conditions in different regions.
  • Dicing refers to cutting data along multiple dimensions at once, in order to get a more complete picture of how the data is distributed.

    • For example, in a sales data report, we can dice the data according to different regions and time to understand the sales situation in different regions and time periods.

In short, drilling down, slicing and dicing are all very important data analysis techniques in the field of big data, which can help us better understand the distribution and trends of data, so as to make more accurate decisions.
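To make these terms concrete, here is a rough sketch on a small in-memory dataset (the fields and values are made up):

from collections import defaultdict

# Hypothetical sales records: (region, store, month, amount)
sales = [
    ("North", "Store A", "2024-01", 120),
    ("North", "Store B", "2024-01", 80),
    ("South", "Store C", "2024-01", 200),
    ("South", "Store C", "2024-02", 150),
]

# Slice: fix a single dimension (region == "South").
south = [s for s in sales if s[0] == "South"]

# Dice: fix several dimensions at once (region and month).
south_jan = [s for s in sales if s[0] == "South" and s[2] == "2024-01"]

# Drill down: totals per region, then expand one region into per-store totals.
per_region = defaultdict(int)
for region, store, month, amount in sales:
    per_region[region] += amount

per_store_north = defaultdict(int)
for region, store, month, amount in sales:
    if region == "North":
        per_store_north[store] += amount

print(dict(per_region))       # {'North': 200, 'South': 350}
print(dict(per_store_north))  # {'Store A': 120, 'Store B': 80}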

Although data warehouses and relational OLTP databases look similar because both provide a SQL query interface, their internal storage and query engines can be implemented completely differently, since they are optimized for very different query patterns.


Star and Snowflake Analysis Schemas

On the transaction processing side, a wide variety of data models is used depending on the application's needs: relational databases, document databases, graph databases, and so on.

For analytics there is much less diversity of data models; most data warehouses use the star analysis schema.

Star analysis schema:

  • A star analysis schema is a data warehouse design pattern that uses a central fact table (Fact Table) and surrounding dimension tables (Dimension Table) to store and analyze data.
  • The central fact table contains all the fact data, and the dimension tables contain dimension information related to the fact data.
  • The advantage of this schema is that it supports fast multi-dimensional analysis; the disadvantage is that performance problems may appear when processing very large amounts of data.

Here we take a retail data warehouse as an example:

[Figure: example of a star schema for a retail data warehouse, with fact_sales at the center]

At the center of the schema is the so-called fact table, in this case fact_sales. Each row of the fact table represents an event that occurred at a particular time; here, each row represents a customer's purchase of a product.

If we were analyzing website traffic instead of retail, each row might represent a page view or a user's click.

Typically, facts are captured as individual events, which allows maximum flexibility for later analysis, but it also means the fact table can become extremely large.

Some of the columns in the fact table are attributes, such as the price at which the product was sold and the cost of buying it from the supplier. Other columns are foreign keys to other tables, called dimension tables. Since each row in the fact table represents an event, the dimensions typically represent the who, what, where, when, how, and why of the event.

In this example, one of the dimensions is the product sold (dim_product), and each row in the fact_sales table uses a foreign key to represent the product sold in that particular transaction.

Dates and times are often represented using dimensions, which encode information about dates such as public holidays so that queries can compare sales between holiday and non-holiday days.

The name "star schema" comes from that when table relationships are visualized, the fact table sits in the middle, surrounded by a series of dimension tables that are joined like the light of a star.


Snowflake analysis schema:

  • The snowflake analysis schema is also a data warehouse design pattern, which is similar to the star analysis schema, but uses more hierarchical relationships in the dimension tables.
  • The advantage of this schema is that it can better model complex analysis requirements; the disadvantage is that query performance may suffer.

The snowflake analysis schema is a variant of the star analysis schema in which dimensions are further broken down into sub-dimensions.

For example, there might be separate tables for brands and for product categories, and each row in the dim_product table would then reference its brand and category by foreign key, instead of storing them as strings directly in dim_product.
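Continuing the sketch above (again with made-up keys), the snowflake variant would normalize dim_product roughly like this:

# Brands and categories get their own tables...
dim_brand = {1: {"name": "OrchardCo"}, 2: {"name": "SweetCo"}}
dim_category = {10: {"name": "Fresh fruit"}, 11: {"name": "Candy"}}

# ...and dim_product references them by foreign key instead of storing strings.
dim_product = {
    30: {"description": "Bananas", "brand_key": 1, "category_key": 10},
    69: {"description": "Chocolate bar", "brand_key": 2, "category_key": 11},
}

# An extra lookup (join) is now needed to get the category name of product 30.
print(dim_category[dim_product[30]["category_key"]]["name"])  # Fresh fruit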

The snowflake analysis schema is more normalized than the star analysis schema, but the star schema is usually preferred, mainly because it is simpler for analysts to work with.

In a typical data warehouse, tables are often very wide: the fact table may have several hundred columns, and the dimension tables can also be very wide, since they include all the metadata that may be relevant to analysis.


columnar storage

Although fact tables often have over 100 columns, a typical data warehouse query only accesses 4 or 5 of them at a time. When the fact table holds petabytes of data, storing and querying it efficiently becomes a challenge.

Dimension tables are usually much smaller, with only a few million rows.

Using the retail schema above as an example, we can write a SQL query to analyze whether people are more inclined to buy fresh fruit or candy, depending on the day of the week:

SELECT
  dim_date.weekday,
  dim_product.category,
  SUM(fact_sales.quantity) AS quantity_sold
FROM fact_sales
  JOIN dim_date ON fact_sales.date_key = dim_date.date_key
  JOIN dim_product ON fact_sales.product_sk = dim_product.product_sk
WHERE
  dim_date.year = 2013 AND
  dim_product.category IN ('Fresh fruit', 'Candy')
GROUP BY
  dim_date.weekday, dim_product.category;

How can we execute this query efficiently?

  • In most OLTP databases, storage is laid out in a row-oriented fashion: all values ​​in a row of a table are stored next to each other.
  • Document databases are similar: the entire document is usually stored as a contiguous sequence of bytes.

To process a query like this one, you may have indexes on fact_sales.date_key and fact_sales.product_sk, which tell the storage engine where to find all the sales for a particular date or for a particular product.

However, a row-oriented storage engine still needs to load all of those rows (each with over 100 attributes) from disk into memory, parse them, and filter out the rows that do not meet the query's conditions. That can take a long time.

The idea behind columnar storage is simple: instead of storing all the values ​​from one row together, store all the values ​​from each column together.

If each column is stored in a separate file, the query only needs to read and parse those columns used in the query, which can save a lot of work.

[Figure: storing relational data by column, rather than by row]

The column-oriented layout relies on each column file containing the rows in the same order. So, to reassemble a complete row, you take the 23rd entry from each individual column file and put them together to form the 23rd row of the table.
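A small sketch of the idea, storing each column as a separate list (standing in for a per-column file) and rebuilding a row by position:

# Row-oriented view of three rows...
rows = [
    {"date_key": 140102, "product_sk": 69, "quantity": 1},
    {"date_key": 140102, "product_sk": 74, "quantity": 3},
    {"date_key": 140103, "product_sk": 31, "quantity": 1},
]

# ...and the same data laid out by column (each list would be its own file).
columns = {
    "date_key": [r["date_key"] for r in rows],
    "product_sk": [r["product_sk"] for r in rows],
    "quantity": [r["quantity"] for r in rows],
}

# A query touching only two columns reads only those two lists.
total_on_140102 = sum(
    q for d, q in zip(columns["date_key"], columns["quantity"]) if d == 140102
)

# Reassembling row k means taking the k-th entry from every column file.
def row(k):
    return {name: values[k] for name, values in columns.items()}

print(total_on_140102)  # 4
print(row(2))           # {'date_key': 140103, 'product_sk': 31, 'quantity': 1}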


column compression

In addition to loading only the columns needed for the query from disk, we can further reduce the demand for disk throughput by compressing the data. Fortunately, columnar storage is often well suited for compression.

The core idea of column compression is to exploit repetition in the data to reduce storage space. Common column compression techniques include dictionary encoding, bitmap encoding, and run-length encoding.

  • Taking dictionary encoding as an example: suppose a table contains city names and the corresponding populations, and the city names repeat many times. With dictionary encoding, the distinct city names are stored once in a dictionary table, and the original column stores only small codes referring to the dictionary entries. This greatly reduces the storage space for the city names, and also allows faster matching at query time (see the sketch after this list).
  • Another example is the bitmap compression technology. Suppose there is a data table containing user IDs and corresponding purchase records, where the purchase records have only two states: purchased and unpurchased. Using the bitmap compression technology, the purchased and non-purchased can be represented by 1 and 0 respectively, and then the purchase records of all users are stored in a bitmap. This can greatly reduce the storage space, and it can also perform bit operations faster at query time.
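A minimal sketch of dictionary encoding as described above (with made-up city data):

# Original column with many repeated city names.
cities = ["Beijing", "Shanghai", "Beijing", "Beijing", "Shanghai", "Shenzhen"]

# Build the dictionary: each distinct value gets a small integer code.
dictionary = {}
codes = []
for city in cities:
    code = dictionary.setdefault(city, len(dictionary))
    codes.append(code)

print(dictionary)  # {'Beijing': 0, 'Shanghai': 1, 'Shenzhen': 2}
print(codes)       # [0, 1, 0, 0, 1, 2]

# Decoding reverses the mapping.
reverse = {code: city for city, code in dictionary.items()}
print([reverse[c] for c in codes] == cities)  # True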

Here we take bitmap encoding as an example:

[Figure: compressed, bitmap-indexed storage of a single column]

Usually, the number of distinct values in a column is much smaller than the number of rows.

Example: A retailer may have billions of sales transactions, but only 100,000 distinct products

Now we can take a column with n distinct values ​​and convert it into n separate bitmaps:

  • Each distinct value corresponds to a bitmap, and each row corresponds to a bit. The bit is 1 if the row has that value, 0 otherwise.

These bitmap indexes are well suited for a variety of queries commonly found in data warehouses. For example:

WHERE name IN ('大忽悠', '小朋友')

Load the two bitmaps for name = '大忽悠' and name = '小朋友', and compute the bitwise OR of the two bitmaps:

[Figure: bitwise OR of the two bitmaps]

WHERE name = '大忽悠' AND school = '清华'

Load the bitmap for name = '大忽悠' and the bitmap for school = '清华', and compute the bitwise AND. This works because the columns contain the rows in the same order, so the kth bit in one column's bitmap corresponds to the same row as the kth bit in another column's bitmap.

[Figure: bitwise AND of the two bitmaps]
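Both operations can be sketched in a few lines, building one bitmap per distinct value (a list with one bit per row) and combining them bitwise; the rows here are made up:

# One column of the table, one entry per row.
names = ["大忽悠", "小朋友", "张三", "大忽悠", "小朋友"]
schools = ["清华", "北大", "清华", "清华", "清华"]

def bitmaps(column):
    """One bitmap per distinct value; bit k is 1 if row k has that value."""
    return {v: [1 if x == v else 0 for x in column] for v in set(column)}

name_bm, school_bm = bitmaps(names), bitmaps(schools)

# WHERE name IN ('大忽悠', '小朋友'): bitwise OR of the two name bitmaps.
in_query = [a | b for a, b in zip(name_bm["大忽悠"], name_bm["小朋友"])]

# WHERE name = '大忽悠' AND school = '清华': bitwise AND across two columns.
and_query = [a & b for a, b in zip(name_bm["大忽悠"], school_bm["清华"])]

print(in_query)   # [1, 1, 0, 1, 1]
print(and_query)  # [1, 0, 0, 1, 0]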
If n is very small (for example, a country column might have around 200 different values), these bitmaps can store one bit per row.

However, if n is larger, most bitmaps will have a lot of zeros (we say they are sparse). In this case, the bitmap can additionally be run-length encoded (a lossless data compression technique), as shown below. This can make the encoding of columns very compact.
[Figure: run-length encoding of sparse bitmaps]
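A rough sketch of run-length encoding a sparse bitmap, storing runs of identical bits instead of the individual bits (toy data):

from itertools import groupby

bitmap = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

def rle_encode(bits):
    """Return (value, run_length) pairs for consecutive runs of equal bits."""
    return [(value, len(list(run))) for value, run in groupby(bits)]

def rle_decode(runs):
    return [value for value, length in runs for _ in range(length)]

runs = rle_encode(bitmap)
print(runs)                        # [(0, 9), (1, 1), (0, 8)]
print(rle_decode(runs) == bitmap)  # True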
There are also various compression schemes for different kinds of data.


Columnar storage and column families

Cassandra and HBase have a concept of column families which they inherit from Bigtable.

However, calling them column-oriented is very misleading:

  • Within each column family, they store all the columns of a row together, along with the row key, and they do not use column compression.

Therefore, the Bigtable model is still primarily row-oriented.


Memory Bandwidth and Vectorization

For data warehouse queries that need to scan millions of rows, a huge bottleneck is the bandwidth to fetch data from disk to memory.

However, this is not the only bottleneck. Developers of analytical databases also need to efficiently utilize memory-to-CPU cache bandwidth, avoid branch mispredictions and idle waits in the CPU instruction processing pipeline, and use single instruction multiple data (SIMD) on modern CPUs.

What are branch mispredictions and idle waits:

  • When the CPU is executing an instruction, it processes the instructions one by one through the instruction processing pipeline. Branch mispredictions and idle waits are common problems in instruction processing pipelines.

  • A branch misprediction happens when the CPU guesses the outcome of a branch instruction so that it can keep its pipeline full. If the guess is correct, execution continues at full speed; if it is wrong, the speculatively executed instructions must be discarded and execution restarted from the correct path, which stalls the pipeline and degrades performance.

    • For example, suppose a program contains a loop whose exit condition is almost always false. The CPU predicts that the loop will keep going; on the iteration where the loop finally exits, the prediction is wrong, the pipeline has to be flushed, and performance drops.
  • When the processor executes instructions, it may encounter some dependencies, and it needs to wait for the previous instructions to be executed before continuing to execute. This wastes time and resources, causing idle waiting.

    • For example, suppose a program needs to read some data from memory and then write that data back to memory. If the CPU is idle while waiting for the memory read to complete, then it cannot execute other instructions, which causes delays and performance degradation in the instruction processing pipeline.
  • To solve these problems, CPUs usually use techniques such as branch target caching and out-of-order execution to reduce the impact of branch mispredictions and idle waits.

Why Branch Target Caching and Out-of-Order Execution Improve Processor Performance and Efficiency:

  • The branch target cache is a cache mechanism that stores the target addresses of branch statements. When the processor encounters a branch statement, it first checks the branch target cache to determine the target address of the branch. If the target address is already in the cache, the processor can jump directly to the target address without speculation. This reduces the impact of branch mispredictions.
    • For example, suppose a program has an if statement that jumps to a different code block based on the condition. If the processor needs to predict the outcome of a branch every time, time and resources are wasted. However, if the branch target cache is used, the processor can jump directly to the target address, avoiding the effects of misprediction.
  • Out-of-order execution is an instruction execution technique that allows the processor to continue executing subsequent instructions while waiting for previous instructions to complete. This reduces the impact of idle waiting.
    • For example, suppose there are a series of instructions in the program, there is a dependency relationship between instruction 1 and instruction 2, and instruction 2 can only be executed after instruction 1 is executed. If the processor uses out-of-order execution technology, it can execute the following instructions 3 and 4 while waiting for instruction 1 to execute. This can reduce the waiting time and improve the efficiency of the processor.
  • To sum up, branch target caching and out-of-order execution are two technologies commonly used in modern processors. They can reduce the impact of branch prediction errors and idle waiting, and improve the performance and efficiency of the processor.

What is a Single Instruction Multiple Data (SIMD) instruction and why it speeds up operations:

  • Single Instruction Multiple Data (SIMD) instructions are a set of computer instructions that can perform the same operation on multiple pieces of data at the same time. This instruction set can process multiple data in one clock cycle, thus speeding up calculations.
  • In traditional computer instructions, each instruction can only process one piece of data. If the same operation needs to be performed on multiple data, the same instruction needs to be executed multiple times. In SIMD instructions, multiple data can be processed at one time, thereby reducing the number of executions of instructions and improving computing efficiency.
    • For example, suppose you need to add every element in a vector. In traditional instructions, loops are used to process each element in turn, while in SIMD instructions, the entire vector can be processed at once, which greatly speeds up the calculation.
  • Therefore, SIMD instructions are widely used in many fields that require high-performance computing, such as image processing, digital signal processing, and scientific computing.
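As a loose illustration (not CPU-level code), NumPy's array operations run whole arrays through compiled kernels that can use SIMD instructions under the hood, in contrast to an element-by-element Python loop:

import numpy as np

rng = np.random.default_rng(0)
prices = rng.random(1_000_000)
quantities = rng.integers(1, 5, size=1_000_000).astype(np.float64)

# Element at a time: one multiply and one add per Python-level loop iteration.
def revenue_loop(p, q):
    total = 0.0
    for i in range(len(p)):
        total += p[i] * q[i]
    return total

# Data-parallel: the whole multiply-and-sum runs in a vectorized native kernel.
def revenue_vectorized(p, q):
    return float(np.dot(p, q))

print(revenue_loop(prices, quantities))
print(revenue_vectorized(prices, quantities))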

In addition to reducing the amount of data that needs to be loaded from disk, the columnar storage layout also makes efficient use of CPU cycles.

For example, a query engine can take a whole block of compressed column data that fits in the CPU's L1 cache and iterate over it in a tight loop (that is, with no function calls).

Such a loop executes much faster than code that needs many function calls and conditional checks for every record it processes. Column compression also lets more rows of a column fit into the limited L1 cache at once. Moreover, the bitwise AND and OR operators described earlier can be designed to operate directly on such chunks of compressed column data; this technique is known as vectorized processing.
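A loose sketch of operating on a whole block at once: each bitmap is packed into a single Python integer, so one & or | combines many rows per operation (a real engine would use fixed-size machine words and SIMD, not arbitrary-precision integers):

def pack(bits):
    """Pack a list of 0/1 bits into one integer, where bit k represents row k."""
    word = 0
    for k, b in enumerate(bits):
        word |= b << k
    return word

def unpack(word, n_rows):
    return [(word >> k) & 1 for k in range(n_rows)]

product_is_fruit = pack([1, 0, 0, 1, 0, 1, 1, 0])
date_is_monday = pack([1, 1, 0, 0, 0, 1, 0, 0])

# One bitwise AND combines all eight rows at once, with no per-row loop.
both = product_is_fruit & date_is_monday
print(unpack(both, 8))  # [1, 0, 0, 0, 0, 1, 0, 0]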


Sort order in columnar storage

In columnar storage, the order in which rows are stored is not critical. Storing them in insertion order is easiest, since inserting a new row requires only appending to each column file.

However, we can also choose to arrange the data in a certain order, as we do with SSTables, and use that as an indexing mechanism.

SSTable (Sorted String Table) is a data structure used to store key-value pairs. It sorts the key-value pairs by key and stores them on disk for quick lookup and access.

Note that it would not make sense to sort each column independently, because then we would no longer know which entries in different columns belong to the same row. We can only reconstruct a complete row because we know that the kth entry in one column and the kth entry in another column belong to the same row.

Instead, the data must be sorted an entire row at a time, even though it is stored column by column.

The database administrator can choose the columns by which the table should be sorted, based on their knowledge of common queries.

  • For example, if queries usually target a date range, such as "last month", the date column can be made the first sort key. Then the query only needs to scan the rows from roughly the last month, which is much faster than scanning all rows.

Rows with the same value in the first sort column can be further sorted by the second sort column.

  • For example, if the date column is the first sort key, then product_sk might be the second sort key, so that all sales data for the same product on the same day are stored adjacent. This will help queries that need to group or filter sales by product within a certain date range.

Another benefit of sorted order is that it helps compress columns. If the primary sort column does not have many distinct values, then after sorting it will contain long sequences in which the same value repeats many times in a row. A simple run-length encoding can compress such a column down to a few kilobytes, even if the table has billions of rows.

Compression is strongest on the first sort key. The second and third sort keys will be more jumbled, so they will not have such long runs of repeated values. Columns further down the sort priority appear in essentially random order, so they may not compress as well. But sorting the first few columns is still a win overall.
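A tiny demonstration of why sorting helps run-length encoding (the values are made up):

from itertools import groupby

def run_lengths(values):
    return [(v, len(list(run))) for v, run in groupby(values)]

dates = [140103, 140101, 140102, 140101, 140103, 140102, 140101, 140102]

print(run_lengths(dates))          # 8 runs of length 1: nothing to compress
print(run_lengths(sorted(dates)))  # [(140101, 3), (140102, 3), (140103, 2)]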


several different sort orders

Since different queries benefit from different sort orders, why not store the same data in several different ways?

The data needs to be replicated to multiple machines anyway, so that a single failure does not lose it. You might as well store that redundant data sorted in different ways, and route each query to the version that best fits its access pattern.

Having multiple sort orders in a columnar store is somewhat similar to having multiple secondary indexes in a row-oriented store. But the big difference is that row-oriented storage keeps each row in one place (either in a heap file or in a clustered index), and the secondary index only contains pointers to matching rows. In columnar storage, you usually don't have any pointers to the data anywhere else, only the columns containing the values.


write to columnar storage

The optimizations above all make sense in a data warehouse, where the workload consists mostly of large read-only queries run by analysts. Columnar storage, compression, and sorting all contribute to faster reads for these queries. However, they have the disadvantage of being more difficult to write.

In-place update methods using B-trees are not possible with compressed columns. If you want to insert a row in the middle of a sorted table, you will most likely have to rewrite all column files. Since rows are identified by positions in columns, inserts must update all columns consistently.

In the data warehouse scenario, an append-only, log-structured style of writing is a better fit, which naturally brings us to the LSM-tree.

  • LSM tree (Log-Structured Merge Tree) is a data structure used to implement key-value storage. Its design inspiration comes from the log structure (Log-Structured) and merge tree (Merge Tree) in traditional database systems.

  • The basic idea of the LSM tree is to keep the data in several layers of sorted structures, each of which is an ordered key-value structure. When data is written, it is first appended to the top layer, an in-memory structure called the memtable (MemTable). When the memtable reaches a certain size, its contents are written out to the next layer on disk (a disk table, or DiskTable), and the memtable is cleared so that new data can continue to be written.

  • When reading data, the LSM tree will first search from the memory table. If the data is not found in the memory table, it will search from the disk table. Since the structure of each layer is ordered, this feature can be used for optimization when searching for data, for example, algorithms such as binary search can be used.

  • When the number of disk tables increases, in order to ensure read and write performance, it is necessary to periodically merge multiple disk tables into a larger disk table. This process is called Merge. The purpose of the merge operation is to merge multiple disk tables into a larger disk table, deduplicate and sort at the same time, so that the performance of data query is better.

  • The advantage of the LSM tree is that it can support high-throughput write operations and can still guarantee read performance even when the amount of data is very large. The disadvantage is that the merge operation needs to be performed periodically, which will affect the performance of the system and may lead to data inconsistency in some cases.

All writes first go to an in-memory store where they are added to an ordered structure and ready to be written to disk. It doesn't matter whether the in-memory storage is row- or column-oriented. When enough writes have accumulated, they are merged with the column files on disk and written to new files in batches. This is basically what Vertica does.
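A very rough sketch of this write path: a sorted in-memory buffer that is merged into per-column files once it grows large enough (the names are hypothetical, and the "files" are just lists here):

import bisect

MEMTABLE_LIMIT = 4                      # flush threshold (tiny, for illustration)
memtable = []                           # rows kept sorted by date_key in memory
column_files = {"date_key": [], "product_sk": [], "quantity": []}  # "on-disk" columns

def write(row):
    """Add to the sorted in-memory buffer; flush once it is large enough."""
    bisect.insort(memtable, row, key=lambda r: r["date_key"])  # key= needs Python 3.10+
    if len(memtable) >= MEMTABLE_LIMIT:
        flush()

def flush():
    """Merge buffered rows with the existing column files, keeping sorted order."""
    global memtable
    existing = [dict(zip(column_files, vals)) for vals in zip(*column_files.values())]
    merged = sorted(existing + memtable, key=lambda r: r["date_key"])
    for name in column_files:
        column_files[name] = [r[name] for r in merged]
    memtable = []

for i, pk in enumerate([69, 74, 31, 68, 69]):
    write({"date_key": 140105 - i, "product_sk": pk, "quantity": 1})

print(column_files["date_key"])  # [140102, 140103, 140104, 140105]
print(len(memtable))             # 1 row still buffered in memory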

Query operations need to check the column data on disk and the most recent writes in memory, and combine the results of the two. However, the query optimizer hides this detail from the user. From an analyst's perspective, data modified through insert, update, or delete operations is immediately reflected in subsequent queries.


Aggregation: Data Cubes and Materialized Views

Another aspect of data warehouses worth mentioning is materialized aggregates.

  • Data warehouse queries often involve an aggregate function, such as COUNT, SUM, AVG, MIN, or MAX in SQL.
  • If the same aggregate is used by many different queries, it might be too wasteful to go through the raw data each time. Why not cache the most frequently used counts or sums for some queries?

One way to create this cache is a Materialized View.

  • In a relational data model, such a cache is often defined like a standard (virtual) view:
    • A table-like object whose contents are the results of some queries.
    • The difference is that materialized views are actual copies of query results that are written to disk, while virtual views are just a shortcut for writing queries.
    • When reading from a virtual view, the SQL engine expands it into the view's underlying query before processing the expanded query.
  • Virtual views and materialized views are two kinds of views in a database. A view is a table-like object derived from one or more tables; the two differ mainly in how their data is stored and in how queries against them perform.

  • A virtual view (also called a "query view") is simply the result set of a query statement: a virtual table that stores no data itself and is computed dynamically at query time. A virtual view can wrap complex SQL that selects, filters, and joins data from one or more tables and returns the result set to the user as if it were a table. Its advantages are that it saves storage space, is easy to manage and maintain, and always reflects the latest underlying data when queried.

  • A materialized view (also known as a "snapshot view") is a precomputed and stored on-disk view that is actually a table containing data. A materialized view selects, filters, joins data from one or more tables and stores the results in the table. When a query requests access to a materialized view, it does not have to recompute the data, but retrieves the data directly from the materialized view. The advantage of materialized view is that it can improve query performance, reduce response time and avoid frequent queries.

  • The choice of virtual view and materialized view depends on the specific application scenario. Virtual views are suitable for situations where data volume is small, queries are frequent, and query performance requirements are not high; materialized views are suitable for situations where data volumes are large, queries are complex, and query performance requirements are high. The disadvantage of materialized view is that it takes up storage space, and the cost of data update and maintenance is high, so it needs to be used with caution.

When the underlying data changes, the content of the materialized view may become outdated, so the materialized view needs to be updated to maintain its correctness and consistency. This update operation usually increases the cost of writing, so materialized views are not often used in OLTP databases.

By contrast, in a read-heavy data warehouse, using materialized views can make more sense. In this case, due to frequent and complex queries, using materialized views can improve query performance and reduce query costs. Materialized views allow pre-computation and pre-aggregation on the result set of the query, thereby reducing the calculation and aggregation workload required for the actual query and improving query performance. In addition, because the materialized view is pre-calculated and pre-aggregated, for some query requests, the materialized view can directly return the result, thus avoiding the cost of real-time calculation and aggregation.
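A conceptual sketch of the idea (not any particular database's materialized-view feature): an aggregate is precomputed and stored, queries read the stored result, and a refresh step recomputes it when the underlying data changes:

from collections import defaultdict

fact_sales = [
    {"date_key": 140101, "product_sk": 30, "quantity": 2},
    {"date_key": 140101, "product_sk": 69, "quantity": 1},
    {"date_key": 140102, "product_sk": 30, "quantity": 3},
]

materialized_qty_by_product = {}  # the stored, precomputed query result

def refresh():
    """Recompute the aggregate from the base table (like refreshing a materialized view)."""
    totals = defaultdict(int)
    for row in fact_sales:
        totals[row["product_sk"]] += row["quantity"]
    materialized_qty_by_product.clear()
    materialized_qty_by_product.update(totals)

refresh()
print(materialized_qty_by_product[30])  # read the precomputed result: 5

# A write to the base table makes the stored result stale until the next refresh.
fact_sales.append({"date_key": 140103, "product_sk": 30, "quantity": 4})
refresh()
print(materialized_qty_by_product[30])  # 9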

It should be noted that the use of materialized views also has some limitations and considerations.

  • Since a materialized view is a denormalized copy of data, its contents need to be consistent and correct with the underlying data.
  • In addition, because the update operation of the materialized view may increase the writing cost, it is necessary to weigh the pros and cons and choose the appropriate technical solution according to the specific scenario and requirements.

"Non-normalized copy" means that the data contained in the materialized view does not conform to the normalized design principles in the database, that is, there is redundant and repeated data.

  • This redundant and repeated data is deliberately introduced to improve query performance and reduce query costs, because the materialized view will pre-calculate and store some complex query results, so that the results can be quickly obtained when querying.
  • This means that the data in the materialized view may differ from the underlying data because updates to the materialized view may lag behind updates to the underlying data, or because updates to the underlying data are not reflected in the materialized view in a timely manner.

A common special case of a materialized view is known as a data cube or OLAP cube: a grid of aggregates grouped by different dimensions, as shown below.

[Figure: two dimensions of a data cube, aggregated by summation]

As shown in the figure above, each fact now has foreign keys to only two dimension tables: date and product.

  • You can now plot a two-dimensional table with dates on one axis and products on the other. Each cell contains an aggregation (eg SUM) of an attribute (eg net_price) with all facts for that date-product combination.
  • You can then apply the same aggregation along each row or column and get an aggregation with one dimension reduced (sales by product, regardless of date, or sales by date, regardless of product).
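A small sketch of this two-dimensional case (with made-up numbers), building the date-by-product grid of sums plus the totals along each dimension:

from collections import defaultdict

facts = [  # (date_key, product_sk, quantity)
    (140101, 30, 2), (140101, 69, 1),
    (140102, 30, 3), (140102, 69, 1), (140102, 69, 2),
]

cube = defaultdict(int)        # (date, product) -> SUM(quantity): one cell per combination
by_date = defaultdict(int)     # totals with the product dimension aggregated away
by_product = defaultdict(int)  # totals with the date dimension aggregated away

for date_key, product_sk, quantity in facts:
    cube[(date_key, product_sk)] += quantity
    by_date[date_key] += quantity
    by_product[product_sk] += quantity

print(cube[(140102, 69)])  # 3: one cell of the grid
print(by_product[69])      # 4: sales of product 69 regardless of date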

In general, facts tend to have more than two dimensions. Suppose we have five dimensions: date, product, store, promotion and customer.

It's hard to imagine what a five-dimensional hypercube would look like, but the principle is the same: each cell contains sales for a specific date-product-store-promotion-customer combination. These values ​​can be summed and aggregated across each dimension.

The advantage of materializing a data cube is that it can make certain queries very fast because they are effectively precomputed.

  • For example, if you want to know the total sales for each store, you can just look at the grand totals in the appropriate dimensions, without scanning millions of rows of raw data.

The disadvantage of a data cube is that it lacks the flexibility of querying the raw data.

  • For example, there is no way to calculate what percentage of sales are from items that cost more than $100 because price is not one of the dimensions.
  • Therefore, most data warehouses try to keep as much raw data as possible and use aggregated data (such as data cubes) only as a performance boost for certain queries.

summary

In general, storage engines are classified into the following two categories:

  • Storage engine optimized for transaction processing (OLTP) and storage engine optimized for online analytics (OLAP).

There is a big difference between the access patterns for these two use cases:

  • OLTP systems are typically end-user oriented, which means that the system may receive a high volume of requests. To handle the load, applications typically access only a small number of records per query. Applications request records with some kind of key, and the storage engine uses indexes to find the data for the requested key. Hard disk seek time is often the bottleneck here.
  • Data warehouses and similar analytic systems are less well known, because they are primarily used by business analysts rather than end users. They handle a much lower volume of queries than OLTP systems, but each query is typically very demanding, requiring millions of records to be scanned in a short time. Disk bandwidth (rather than seek time) is usually the bottleneck here, and column-oriented storage is an increasingly popular solution for this kind of workload.

On the OLTP side, we can see two mainstream storage engines:

  • Log-structured school of thought: only allows appending to files and deleting obsolete files, but does not update files that have already been written.
    • Bitcask, SSTables, LSM trees, LevelDB, Cassandra, HBase, Lucene, etc. all fall into this category.
  • In-place update school: Think of the hard disk as a set of fixed-size pages that can be overwritten.
    • B-trees are the poster child for this idea, used in all major relational databases and many non-relational databases.

Log-structured storage engines are a comparatively recent development. Their key idea is to systematically turn random-access writes into sequential writes on disk, which enables higher write throughput thanks to the performance characteristics of hard drives and SSDs.
