Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update SQL Server Memory to use the new VECTOR type #796

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

marcominerva
Copy link
Contributor

Motivation and Context (Why the change? What's the scenario?)

Things are moving fast in the Vector Support for SQL Azure. Now that the official VECTOR type has been introduced (https://devblogs.microsoft.com/azure-sql/eap-for-vector-support-refresh-introducing-vector-type/), I have updated SqlServerMemoryDb to use it.

Changes should be straightforward. However, I don't have access to the new data type yet, so for the moment I have opened this PR as Draft. As soon as I can make some tests, I finalize it.

/cc @dluc

@dluc
Copy link
Collaborator

dluc commented Oct 1, 2024

@marcominerva how should we handle this PR, in terms of testing it before merge? Is there a timeline for the underlying SQL feature to be GA?

@dluc dluc added the waiting for author Waiting for author to reply or address comments label Oct 1, 2024
@marcominerva
Copy link
Contributor Author

At this moment, this feature is in Early Adopter Preview. I think the next step will be the Public Preview, but the timeline hasn't been shared yet.

The PR with the initial support (using VARBINARY column and functions like JSON_ARRAY_TO_VECTOR) has already been merged: #722. Now that the native VECTOR type has been official annunced (https://devblogs.microsoft.com/azure-sql/eap-for-vector-support-refresh-introducing-vector-type), the old PR should be replaced by the new implementation (available in this PR), because the old features based on VARBINARY will be removed before the Public Preview.

I got access to the native VECTOR type in the last days and, since then, I have made some tests. Tomorrow I'll update this PR with some minor changes.

We can flag this new feature as Experimental, until at least the Public Preview. What do you think?

@dluc
Copy link
Collaborator

dluc commented Oct 2, 2024

I'd only want to be sure that if we merge it won't cause errors to users already using the code in the main branch.
I don't use this storage type so I can't get sufficient confidence without someone taking the time for testing, so perhaps I'll wait for you to test and for the feature to be public.

It's fine to wait, no rush. We can avoid the experimental flag, since the entire repo is experimental, and I wouldn't want to give the wrong impression (e.g. if X is experimental, the rest is not?)

Introduce detailed documentation for `useNativeVectorSearch`
parameter in `KernelMemoryBuilderExtensions` and `DependencyInjection`
classes. Update `SqlServerConfig` with remarks on EAP status and a link to a blog post for more information.
Corrected the formatting of the `useNativeVectorSearch` parameter's
description in the `WithSqlServerMemoryDb` and `AddSqlServerAsMemoryDb`
methods.
@marcominerva
Copy link
Contributor Author

No problem, I'll continue to test the feature, I'm actively working with VECTOR type on SQL Azure. As of now, all the functional tests are green and I have made some manual tests to verify that everything works as expected and got the same results of the classic approach (that hasn't been interested by this change).

So, I have marked the PR as Ready for review, but I agree that it's absolutely OK to wait. The only thing to keep in mind is that the current implementation of native Vector search that has already been merged in the main branch via #722 will stop working as soon as the feature will enter the Public Preview stage.

Moreover, as VECTOR type is currently supported only on Azure SQL Database and Managed instance, I have added some other comments to clarify the situation, for example:

/// <summary>
/// Kernel Memory Builder extension method to add SQL Server memory connector.
/// </summary>
/// <param name="builder">KM builder instance</param>
/// <param name="connectionString">SQL Server connection string</param>
/// <param name="useNativeVectorSearch">Whether to use native vector search or not. Currently, the native Vector search is in Early Access Preview (EAP) and is available on Azure SQL Database and Managed Instance only.</param>
/// <param name="vectorSize">When <paramref name="useNativeVectorSearch"/> is <see langword="true"/>, it is the vector size used by the VECTOR SQL Server type.</param>
public static IKernelMemoryBuilder WithSqlServerMemoryDb(
    this IKernelMemoryBuilder builder,
    string connectionString,
    bool useNativeVectorSearch = false,

@marcominerva marcominerva marked this pull request as ready for review October 2, 2024 08:27
@dluc
Copy link
Collaborator

dluc commented Oct 16, 2024

@marcominerva what if we had 3 query providers, so we can merge now and support everyone?

  1. query provider using VARBINARY column
  2. query provider using VECTOR type
  3. DefaultQueryProvider

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting for author Waiting for author to reply or address comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants