forked from StarRocks/starrocks
Merge pull request #2 from wangsimo0/master
Update doc
Showing 11 changed files with 117 additions and 121 deletions.
# Introduction

## What is StarRocks

* StarRocks is a high-performance, MySQL-compatible, distributed relational columnar database. It has been tested and refined in industry use across multiple data analysis scenarios.

* StarRocks combines the strengths of relational Online Analytical Processing (OLAP) databases and distributed storage systems. Through architectural upgrades and functional optimization, it has developed into an enterprise-level product.

* StarRocks is committed to accommodating multiple data analysis scenarios for enterprise users. It supports multiple data warehouse schemas (flat tables, pre-aggregations, star or snowflake schemas) and multiple data import methods (batch and streaming), and it allows direct access to data in Hive, MySQL, and Elasticsearch without importing it.

* StarRocks is compatible with the MySQL protocol. Users can connect to StarRocks with the MySQL client and common Business Intelligence (BI) tools for data analysis.

* StarRocks uses a distributed architecture that divides tables horizontally and stores them in multiple replicas. Clusters are highly scalable and therefore support 1) data analysis at the 10 PB level, 2) Massively Parallel Processing (MPP), and 3) data replication and elastic fault tolerance.

* Leveraging a relational model, strong data typing, and a columnar storage engine, StarRocks reduces read-write amplification through encoding and compression techniques. Using vectorized query execution, it exploits the parallel computing power of multi-core CPUs, which significantly improves query performance.
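
To make the columnar-storage point in the last bullet concrete, here is a minimal Python sketch of why storing each column contiguously helps encoding. Run-length encoding is used purely as an illustration: the text above does not say which encodings StarRocks applies, and the names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    """Run-length encode a column into [(value, run_length), ...]."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# Row-oriented layout: values from different columns are interleaved.
rows = [
    ("2024-01-01", "CN", 10),
    ("2024-01-01", "CN", 12),
    ("2024-01-01", "US", 7),
    ("2024-01-02", "US", 9),
]

# Column-oriented layout: each column is stored contiguously, so repeated
# values sit next to each other and collapse into short runs.
dates, countries, amounts = (list(col) for col in zip(*rows))

encoded_dates = rle_encode(dates)
encoded_countries = rle_encode(countries)
```

In this toy data the four date values shrink to two runs, and decoding restores the column exactly. A query that reads only `amounts` also never touches the other two columns, which is the other half of the benefit.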

## Main features

StarRocks's architecture combines ideas from MPP databases and distributed systems, and offers the following advantages:

### Simple architecture

StarRocks does not rely on any external systems. The simple architecture makes it easy to deploy, maintain, and scale out. Administrators only need to focus on the StarRocks system itself, without having to learn and manage other external systems.

### Native vectorized SQL engine

StarRocks uses vectorized execution to make full use of the CPU's parallel computing power, achieving sub-second query responses in multi-dimensional analysis.

### Query optimization

StarRocks optimizes complex queries with a cost-based optimizer (CBO). A better execution plan greatly improves data analysis efficiency.
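
The idea behind cost-based optimization can be sketched in a few lines of Python. This is a toy model, not StarRocks's optimizer: the table statistics, the selectivities, the cost formula (total rows produced by intermediate joins), and the function names are all assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical table statistics (row counts) an optimizer might collect.
ROW_COUNTS = {"orders": 1_000_000, "customers": 10_000, "regions": 10}

# Hypothetical join selectivities: the fraction of the cross product that
# survives the join predicate. Pairs not listed have no predicate.
SELECTIVITY = {
    frozenset({"orders", "customers"}): 1 / 10_000,
    frozenset({"customers", "regions"}): 1 / 10,
}


def join_selectivity(joined, new_table):
    """Combined selectivity of joining new_table to the tables joined so far."""
    selectivity = 1.0
    for table in joined:
        selectivity *= SELECTIVITY.get(frozenset({table, new_table}), 1.0)
    return selectivity


def plan_cost(order):
    """Cost of a left-deep join order: total rows produced by each join step."""
    rows = ROW_COUNTS[order[0]]
    joined = [order[0]]
    cost = 0.0
    for table in order[1:]:
        rows = rows * ROW_COUNTS[table] * join_selectivity(joined, table)
        cost += rows
        joined.append(table)
    return cost


def best_plan(tables):
    """Enumerate every left-deep join order and keep the cheapest one."""
    return min(permutations(tables), key=plan_cost)


written_order = ("orders", "customers", "regions")
chosen_order = best_plan(written_order)
```

Under these made-up statistics the cheapest plans join the two small tables first and bring in `orders` last, which costs roughly half as much as the order in which the query was written. Real optimizers use far richer statistics and prune the search space instead of enumerating every order.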

### Query federation

StarRocks allows direct access to data in Hive, MySQL, and Elasticsearch without importing it.

### Efficient updates

StarRocks's update model performs upsert and delete operations by primary key and keeps queries efficient while updates run concurrently.
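
The semantics of upsert and delete by primary key can be illustrated with a small Python sketch. It models only the behavior a reader sees (one latest row per key) and says nothing about how StarRocks stores or indexes the data; the class `PrimaryKeyTable` and its methods are hypothetical.

```python
class PrimaryKeyTable:
    """Toy table keyed by primary key; not StarRocks's storage format."""

    def __init__(self, key_column):
        self.key_column = key_column
        self._rows = {}  # primary key value -> latest version of the row

    def upsert(self, row):
        """Insert the row, or replace the existing row with the same key."""
        self._rows[row[self.key_column]] = dict(row)

    def delete(self, key):
        """Remove the row with this key, if it exists."""
        self._rows.pop(key, None)

    def scan(self):
        """Return rows in key order; readers see exactly one version per key."""
        return [self._rows[key] for key in sorted(self._rows)]


orders = PrimaryKeyTable(key_column="order_id")
orders.upsert({"order_id": 1, "status": "created"})
orders.upsert({"order_id": 2, "status": "created"})
orders.upsert({"order_id": 1, "status": "paid"})  # same key: replaces the row
orders.delete(2)
result = orders.scan()
```

Because each key maps to a single current row, a query does not have to merge multiple versions of the same record at read time, which is one plausible reading of "efficient queries during updates".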

### Intelligent materialized view

StarRocks supports intelligent materialized views. Users can create materialized views that generate pre-aggregated tables to speed up aggregate queries. A materialized view runs its aggregation automatically when data is imported, which keeps it consistent with the original table. Users do not need to name a materialized view in their queries; StarRocks automatically selects the most suitable materialized view to satisfy each query.
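
The two behaviors described above, maintaining the pre-aggregation at load time and routing a query to it transparently, can be sketched as follows. This is a simplified Python model under stated assumptions: a single hard-coded rollup, a single aggregate (`SUM`), and hypothetical names (`SalesTable`, `total_amount`). It is not how StarRocks implements materialized views.

```python
from collections import defaultdict


class SalesTable:
    """Toy base table with one pre-aggregated rollup (hypothetical names)."""

    def __init__(self):
        self.rows = []                        # detail rows: (city, product, amount)
        self.sum_by_city = defaultdict(int)   # rollup: SUM(amount) GROUP BY city

    def load(self, batch):
        """Load a batch and update the rollup in the same step so both stay consistent."""
        for city, product, amount in batch:
            self.rows.append((city, product, amount))
            self.sum_by_city[city] += amount

    def total_amount(self, group_by):
        """Answer SUM(amount) GROUP BY <group_by>, using the rollup when it matches."""
        if group_by == "city":
            return dict(self.sum_by_city), "materialized view"
        column = {"city": 0, "product": 1}[group_by]
        totals = defaultdict(int)
        for row in self.rows:
            totals[row[column]] += row[2]
        return dict(totals), "base table"


sales = SalesTable()
sales.load([("Paris", "tea", 5), ("Paris", "coffee", 7), ("Oslo", "tea", 3)])
by_city, city_source = sales.total_amount("city")
by_product, product_source = sales.total_amount("product")
```

The caller asks the same question either way and never names the rollup; the table decides whether the pre-aggregated result can answer it.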

### Standard SQL

StarRocks supports standard SQL syntax, including aggregation, JOIN, sorting, window functions, and custom functions, so users can perform data analysis with standard SQL. In addition, StarRocks is compatible with the MySQL protocol. Users can access StarRocks from existing client tools and BI software and analyze data with simple drag-and-drop operations.

### Unified batch and streaming

StarRocks supports both batch and streaming data import. Supported data sources include Kafka, HDFS, and local files; supported data formats include ORC, Parquet, and CSV. StarRocks can consume Kafka data in real time while avoiding data loss or duplication during import. It can also import data in batches from local or remote (HDFS) data sources.
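
The text above does not describe how loss and duplication are avoided. One common approach for log-based sources such as Kafka is to record the next expected offset together with the loaded data, so that replayed messages are skipped and gaps are detected. The Python sketch below shows that general technique only; the class `OffsetTrackingLoader` is hypothetical and is not a description of StarRocks internals.

```python
class OffsetTrackingLoader:
    """Toy loader that tracks the next expected offset alongside the loaded rows."""

    def __init__(self):
        self.rows = []
        self.next_offset = 0  # advanced together with rows in this toy model

    def load_batch(self, messages):
        """Load (offset, value) pairs and return how many rows were newly loaded."""
        loaded = 0
        for offset, value in sorted(messages):
            if offset < self.next_offset:
                continue  # already loaded: skipping it prevents duplication
            if offset > self.next_offset:
                raise ValueError(f"gap before offset {offset}: data would be lost")
            self.rows.append(value)
            self.next_offset = offset + 1
            loaded += 1
        return loaded


loader = OffsetTrackingLoader()
first = loader.load_batch([(0, "a"), (1, "b")])
# A retry after a failure may redeliver offset 1 along with new data.
second = loader.load_batch([(1, "b"), (2, "c")])
```

After the redelivery, `loader.rows` still contains each value once, and a batch that skips ahead past `next_offset` is rejected instead of silently dropping messages.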

### High availability, high scalability

StarRocks supports multi-replica data storage and multi-instance deployment. The cluster is self-healing and supports elastic recovery.

StarRocks adopts a distributed architecture that allows storage capacity and computing power to scale horizontally. A StarRocks cluster can be expanded to hundreds of nodes and support up to 10 PB of data.

## Use cases

StarRocks can meet a variety of analysis needs, including OLAP analysis, customized reports, real-time data analysis, and ad hoc data analysis. Specific business scenarios include:

* OLAP analysis
* Real-time data analysis
* High-concurrency queries
* Unified analysis
# Basic Concepts of StarRocks

* FE: The StarRocks frontend node is responsible for metadata management, client connection management, query planning, query scheduling, and so on.
* BE: The StarRocks backend node is responsible for data storage, computation, compaction, replica management, and so on.
* Broker: A transit service that connects to external data sources such as HDFS and object storage, and assists with import and export.
* Tablet: A logical shard of a StarRocks table and the basic unit of replica management. Each table is divided into multiple tablets, which are stored on different BE nodes according to the partitioning and bucketing mechanisms.
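
How rows map to tablets and tablets map to BE nodes can be sketched in Python. The details here are assumptions chosen for illustration: CRC32 as the bucketing hash, a tablet identified by a (partition, bucket) pair, and round-robin replica placement. The functions `bucket_for`, `tablet_for`, and `place_replicas` are hypothetical, and StarRocks's actual hash function and placement policy may differ.

```python
import zlib


def bucket_for(key, num_buckets):
    """Map a bucketing-key value to a bucket using a stable hash (CRC32 here)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets


def tablet_for(partition, key, num_buckets):
    """Identify a tablet by its partition and bucket number."""
    return (partition, bucket_for(key, num_buckets))


def place_replicas(tablets, backends, replication_num):
    """Assign each tablet's replicas to distinct BE nodes, round-robin."""
    if replication_num > len(backends):
        raise ValueError("need at least as many BE nodes as replicas")
    placement = {}
    for i, tablet in enumerate(sorted(tablets)):
        placement[tablet] = [
            backends[(i + r) % len(backends)] for r in range(replication_num)
        ]
    return placement


NUM_BUCKETS = 4
tablets = {("p202401", bucket) for bucket in range(NUM_BUCKETS)}
placement = place_replicas(tablets, ["be1", "be2", "be3"], replication_num=2)
row_tablet = tablet_for("p202401", key="user_42", num_buckets=NUM_BUCKETS)
```

Every row with the same bucketing key lands in the same tablet, and each tablet has replicas on two different BE nodes, so losing one node leaves a copy of every tablet available.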