Unified search layer #310

Open
wants to merge 3 commits into base: develop

Conversation

op-hunter

The functions searchBaseLayer and searchBaseLayerST have almost the same code; this pull request unifies them into a single function.
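For context, a minimal standalone sketch of the general idea, not the actual diff (the names TinyGraph, neighbors_of, and is_st are made up for illustration): the two near-duplicate routines collapse into one that takes a flag and only touches the per-node lock on the multi-threaded path.

```cpp
#include <iostream>
#include <mutex>
#include <vector>

// Toy stand-in for the graph structure: each node has a neighbor list and a
// per-node mutex, mirroring link_list_locks_ in hnswlib.
class TinyGraph {
public:
    explicit TinyGraph(size_t n) : neighbors_(n), link_list_locks_(n) {}

    void add_edge(size_t a, size_t b) {
        neighbors_[a].push_back(b);
        neighbors_[b].push_back(a);
    }

    // One routine for both cases: when is_st is true (single-threaded,
    // read-only search) the lock is skipped; otherwise the node's
    // adjacency list is read under its lock.
    std::vector<size_t> neighbors_of(size_t node, bool is_st) {
        if (is_st) {
            return neighbors_[node];
        }
        std::lock_guard<std::mutex> guard(link_list_locks_[node]);
        return neighbors_[node];
    }

private:
    std::vector<std::vector<size_t>> neighbors_;
    std::vector<std::mutex> link_list_locks_;
};

int main() {
    TinyGraph g(3);
    g.add_edge(0, 1);
    g.add_edge(1, 2);
    for (size_t v : g.neighbors_of(1, /*is_st=*/true))  std::cout << v << ' ';
    std::cout << '\n';
    for (size_t v : g.neighbors_of(1, /*is_st=*/false)) std::cout << v << ' ';
    std::cout << '\n';
}
```

The runtime flag keeps a single code path, at the cost of a branch (and a possible lock) inside the search loop, which is what the performance discussion below is about.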

cmli added 3 commits April 27, 2021 11:23
Signed-off-by: cmli <chengming.li@zilliz.com>
Signed-off-by: cmli <chengming.li@zilliz.com>
Signed-off-by: cmli <chengming.li@zilliz.com>
@op-hunter
Author

@yurymalkov Can you help me with this PR? The error "ImportError: cannot import name 'NoReturn'" occurs.

@yurymalkov
Member

Thanks for the PR! It seems like a python 3.6 error. I can look into it.

On the substance of the PR, I wonder if you've benchmarked the performance of the single-threaded version (i.e. checked that performance is the same or better)?
The main reason the two functions were separated is to avoid locks/branching in the search code (which doesn't really need them unless there are concurrent updates).

@op-hunter
Author

op-hunter commented May 7, 2021

I did a simple benchmark on my machine with 32 GB of memory.
Index parameters: efConstruction = 200, M = 48, efSearch = 128; nq = 10000.
Dataset: SIFT-1M, dimension = 128.
It seems there is not much difference.
Test with multiple threads:

| branch | master | unified search layer |
| --- | --- | --- |
| build cost | 293890 | 295734 |
| query cost | 2140 | 2053 |

Test with a single thread:

| branch | master | unified search layer |
| --- | --- | --- |
| build cost | 1854470 | 1849888 |
| query cost | 12939 | 12951 |

Note that all times are in milliseconds.
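For anyone who wants to reproduce a comparison like this, a rough sketch using hnswlib's public C++ API (random vectors stand in for SIFT-1M here to keep it self-contained, and the element count is scaled down, so absolute numbers will differ from the table above):

```cpp
#include <chrono>
#include <iostream>
#include <random>
#include <vector>
#include "hnswlib/hnswlib.h"  // adjust the include path to your checkout

int main() {
    const size_t dim = 128, n = 100000, nq = 10000;  // scaled-down element count
    const size_t M = 48, ef_construction = 200, ef_search = 128, k = 10;

    std::mt19937 rng(42);
    std::uniform_real_distribution<float> dis(0.f, 1.f);
    std::vector<float> data(n * dim), queries(nq * dim);
    for (auto& x : data) x = dis(rng);
    for (auto& x : queries) x = dis(rng);

    hnswlib::L2Space space(dim);
    hnswlib::HierarchicalNSW<float> index(&space, n, M, ef_construction);

    auto t0 = std::chrono::steady_clock::now();
    for (size_t i = 0; i < n; i++)
        index.addPoint(data.data() + i * dim, i);      // single-threaded build
    auto t1 = std::chrono::steady_clock::now();

    index.setEf(ef_search);
    for (size_t i = 0; i < nq; i++)
        index.searchKnn(queries.data() + i * dim, k);  // single-threaded queries
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::milliseconds;
    std::cout << "build cost (ms): "
              << std::chrono::duration_cast<ms>(t1 - t0).count() << "\n"
              << "query cost (ms): "
              << std::chrono::duration_cast<ms>(t2 - t1).count() << "\n";
}
```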

@yurymalkov
Member

Thank you for the test! I usually test with a low dimension (say d = 4), so that bottlenecks outside of the distance function computation become evident.
I'll run my usual tests on the change in the following week and will get back to you.

@yurymalkov
Member

Hey @op-hunter ,
I've finally checked the performance for d = 4, and it seems there is a slowdown of more than 10% for large ef values (e.g. searching 1M vectors runs at 26.5k queries/s in a single thread for ef = 128, but only 20.8k/s with the proposed change).

I think this can be solved by using templates. I wonder if there is any way to make lock creation conditional?
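One possible shape of the template approach, sketched with made-up names (MaybeLock, read_neighbors) rather than code from this PR: specialize a small guard on a compile-time flag so that the single-threaded instantiation creates no lock object at all.

```cpp
#include <mutex>
#include <vector>

// Hypothetical sketch of compile-time conditional lock creation.
template <bool is_st>
struct MaybeLock;

// Multi-threaded instantiation: a real scoped lock on the node's mutex.
template <>
struct MaybeLock<false> {
    explicit MaybeLock(std::mutex& m) : guard_(m) {}
    std::lock_guard<std::mutex> guard_;
};

// Single-threaded instantiation: an empty object, so no lock is ever created.
template <>
struct MaybeLock<true> {
    explicit MaybeLock(std::mutex&) {}
};

// The unified search step becomes a template; both variants share the body.
template <bool is_st>
std::vector<int> read_neighbors(std::vector<std::vector<int>>& adj,
                                std::vector<std::mutex>& link_list_locks,
                                size_t node) {
    MaybeLock<is_st> lk(link_list_locks[node]);  // no-op when is_st == true
    return adj[node];
}

int main() {
    std::vector<std::vector<int>> adj{{1}, {0, 2}, {1}};
    std::vector<std::mutex> locks(adj.size());
    auto a = read_neighbors<true>(adj, locks, 1);   // single-threaded path
    auto b = read_neighbors<false>(adj, locks, 1);  // locked path
    return (a == b) ? 0 : 1;                        // both read {0, 2}
}
```

With the flag resolved at compile time, the single-threaded build has neither the lock nor the branch in the inner loop.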

@yurymalkov
Member

Maybe even pointers are fine. Not sure if there is a clean alternative.

@op-hunter
Author

@yurymalkov How about using std::defer_lock to delay the lock operation?
line 198:

```cpp
std::unique_lock<std::mutex> lk(link_list_locks_[current_node_id], std::defer_lock);
if (!is_st) lk.lock();
```

@yurymalkov
Member

Hm. I can check that.
