diff --git a/_search-plugins/caching/index.md b/_search-plugins/caching/index.md
new file mode 100644
index 0000000000..4d0173fdc7
--- /dev/null
+++ b/_search-plugins/caching/index.md
@@ -0,0 +1,32 @@
+---
+layout: default
+title: Caching
+parent: Improving search performance
+has_children: true
+nav_order: 100
+---
+
+# Caching
+
+OpenSearch relies heavily on different on-heap cache types to accelerate data retrieval, providing significant improvement in search latencies. However, cache size is limited by the amount of memory available on a node. If you are processing a larger dataset that can potentially be cached, the cache size limit causes a lot of cache evictions and misses. The increasing number of evictions impacts performance because OpenSearch needs to process the query again, causing high resource consumption.
+
+Prior to version 2.13, OpenSearch supported the following on-heap cache types:
+
+- **Request cache**: Caches the local results on each shard. This allows frequently used (and potentially resource-heavy) search requests to return results almost instantly.
+- **Query cache**: The shard-level query cache caches common data from similar queries. The query cache is more granular than the request cache and can cache data that is reused in different queries.
+- **Field data cache**: The field data cache contains field data and global ordinals, which are both used to support aggregations on certain field types.
+
+## Additional cache stores
+**Introduced 2.13**
+{: .label .label-purple }
+
+This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/OpenSearch/issues/10024).
+{: .warning}
+
+In addition to existing OpenSearch custom on-heap cache stores, cache plugins provide the following cache stores:
+
+- **Disk cache**: This cache stores the precomputed result of a query on disk. You can use a disk cache to cache much larger datasets, provided that the disk latencies are acceptable.
+- **Tiered cache**: This is a multi-level cache, in which each tier has its own characteristics and performance levels. For example, a tiered cache can contain on-heap and disk tiers. By combining different tiers, you can achieve a balance between cache performance and size. To learn more, see [Tiered cache]({{site.url}}{{site.baseurl}}/search-plugins/caching/tiered-cache/).
+
+In OpenSearch 2.13, the request cache is integrated with cache plugins. You can use a tiered or disk cache as a request-level cache.
+{: .note}
\ No newline at end of file
diff --git a/_search-plugins/caching/tiered-cache.md b/_search-plugins/caching/tiered-cache.md
new file mode 100644
index 0000000000..3842ebe5a9
--- /dev/null
+++ b/_search-plugins/caching/tiered-cache.md
@@ -0,0 +1,82 @@
+---
+layout: default
+title: Tiered cache
+parent: Caching
+grand_parent: Improving search performance
+nav_order: 10
+---
+
+# Tiered cache
+
+This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/OpenSearch/issues/10024).
+{: .warning}
+
+A tiered cache is a multi-level cache, in which each tier has its own characteristics and performance levels. By combining different tiers, you can achieve a balance between cache performance and size.
+
+## Types of tiered caches
+
+OpenSearch 2.13 provides an implementation of _tiered spillover cache_. This implementation spills the evicted items from upper to lower tiers. The upper tier is smaller in size but offers better latency, like the on-heap tier.
The lower tier is larger in size but is slower in terms of latency compared to the upper tier. A disk cache is an example of a lower tier. OpenSearch 2.13 offers on-heap and disk tiers.
+
+## Enabling a tiered cache
+
+To enable a tiered cache, configure the following setting:
+
+```yaml
+opensearch.experimental.feature.pluggable.caching.enabled: true
+```
+{% include copy.html %}
+
+For more information about ways to enable experimental features, see [Experimental feature flags]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).
+
+## Installing required plugins
+
+A tiered cache provides a way to plug in any disk or on-heap tier implementation. You can install the plugins you intend to use in the tiered cache. As of OpenSearch 2.13, the available cache plugin is the `cache-ehcache` plugin. This plugin provides a disk cache implementation to use within a tiered cache as a disk tier.
+
+A tiered cache will fail to initialize if the `cache-ehcache` plugin is not installed or disk cache properties are not set.
+{: .warning}
+
+## Tiered cache settings
+
+In OpenSearch 2.13, a request cache can use a tiered cache. To begin, configure the following settings in the `opensearch.yml` file.
+
+### Cache store name
+
+Set the cache store name to `tiered_spillover` to use the OpenSearch-provided tiered spillover cache implementation:
+
+```yaml
+indices.request.cache.store.name: tiered_spillover
+```
+{% include copy.html %}
+
+### Setting on-heap and disk store tiers
+
+The `opensearch_onheap` store is the built-in on-heap cache available in OpenSearch. The `ehcache_disk` store is the disk cache implementation from [Ehcache](https://www.ehcache.org/).
This requires installing the `cache-ehcache` plugin:
+
+```yaml
+indices.request.cache.tiered_spillover.onheap.store.name: opensearch_onheap
+indices.request.cache.tiered_spillover.disk.store.name: ehcache_disk
+```
+{% include copy.html %}
+
+For more information about installing non-bundled plugins, see [Additional plugins]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/#additional-plugins).
+
+### Configuring on-heap and disk stores
+
+The following table lists the cache store settings for the `opensearch_onheap` store.
+
+Setting | Default | Description
+:--- | :--- | :---
+`indices.request.cache.opensearch_onheap.size` | 1% of the heap | Defines the size of the on-heap cache. Optional.
+`indices.request.cache.opensearch_onheap.expire` | `MAX_VALUE` (disabled) | Specifies a time-to-live (TTL) for the cached results. Optional.
+
+The following table lists the disk cache store settings for the `ehcache_disk` store.
+
+Setting | Default | Description
+:--- | :--- | :---
+`indices.request.cache.ehcache_disk.max_size_in_bytes` | `1073741824` (1 GB) | Defines the size of the disk cache. Optional.
+`indices.request.cache.ehcache_disk.storage.path` | `""` | Defines the storage path for the disk cache. Required.
+`indices.request.cache.ehcache_disk.expire_after_access` | `MAX_VALUE` (disabled) | Specifies a time-to-live (TTL) for the cached results. Optional.
+`indices.request.cache.ehcache_disk.alias` | `ehcacheDiskCache#INDICES_REQUEST_CACHE` (the default shown is for the request cache) | Specifies an alias for the disk cache. Optional.
+`indices.request.cache.ehcache_disk.segments` | `16` | Defines the number of segments the disk cache is separated into. Used for concurrency. Optional.
+`indices.request.cache.ehcache_disk.concurrency` | `1` | Defines the number of distinct write queues created for the disk store, where a group of segments share a write queue. Optional.
+
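+Taken together, the settings described on this page might be combined in `opensearch.yml` roughly as follows. This is a minimal sketch rather than a recommended configuration: the storage path is a placeholder assumption that you would replace with a directory on your own node, and the disk cache size is simply the documented default written out explicitly.
+
+```yaml
+# Experimental feature flag for pluggable caching
+opensearch.experimental.feature.pluggable.caching.enabled: true
+
+# Use the tiered spillover implementation for the request cache
+indices.request.cache.store.name: tiered_spillover
+
+# Upper tier: built-in on-heap store; lower tier: Ehcache disk store
+indices.request.cache.tiered_spillover.onheap.store.name: opensearch_onheap
+indices.request.cache.tiered_spillover.disk.store.name: ehcache_disk
+
+# Required. Placeholder path (an assumption); point it at a real directory
+indices.request.cache.ehcache_disk.storage.path: /path/to/disk/cache
+
+# Optional. Shown here with its documented default (1 GB)
+indices.request.cache.ehcache_disk.max_size_in_bytes: 1073741824
+```
+{% include copy.html %}
+
+All other settings in the preceding tables are optional and fall back to their listed defaults when omitted.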