Azure SQL Database has taken a significant step forward by introducing native support for vectors, unlocking advanced capabilities for applications that rely on semantic search, AI, and machine learning. By integrating vector search into Azure SQL, developers can now store, search, and analyze vector data directly alongside traditional SQL data, offering a unified solution for complex data analysis and enhanced search experiences.
Vectors in Azure SQL Database
Vectors are numerical representations of objects like text, images, or audio. They are essential for applications involving semantic search, recommendation systems, and more. These vectors are typically generated by machine learning models, capturing the semantic meaning of the data they represent.
The new vector functionality in Azure SQL Database allows you to store and manage these vectors within a familiar SQL environment. This eliminates the need for separate vector databases, streamlining your application architecture and simplifying your data management processes.

Key Benefits of Native Vector Support in Azure SQL
- Unified Data Management: Store and query both traditional and vector data in a single database, reducing complexity and maintenance overhead.
- Advanced Search Capabilities: Perform similarity searches alongside standard SQL queries, leveraging Azure SQL's sophisticated query optimizer and powerful enterprise features.
- Optimized Performance: Vectors are stored in a compact binary format, allowing for efficient distance calculations and optimized performance on vector-related operations.
Embeddings: The Foundation of Vector Search
At the heart of vector search are embeddings: dense vector representations of objects, generated by deep learning models. These embeddings capture the semantic similarities between related concepts, enabling tasks such as semantic search, natural language processing, and recommendation systems.
For example, word embeddings can cluster related words like "computer," "software," and "machine," while distant clusters might represent words with entirely different meanings, such as "lion," "cat," and "dog." These embeddings are particularly powerful in applications where context and meaning are more important than exact keyword matches.
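As a toy illustration of that clustering, cosine similarity scores vectors that point in similar directions close to 1. The 3-dimensional vectors below are made up purely for readability; real embedding models emit hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings; the values are invented for this example.
computer = [0.9, 0.8, 0.1]
software = [0.85, 0.75, 0.2]
lion = [0.1, 0.2, 0.9]

print(cosine_similarity(computer, software))  # high: related concepts
print(cosine_similarity(computer, lion))      # lower: unrelated concepts
```

Related words end up with a noticeably higher similarity than unrelated ones, which is exactly the property vector search exploits.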
Azure OpenAI makes it easy to generate embeddings by providing pre-trained machine learning models accessible through REST endpoints. Once generated, these embeddings can be stored directly in an Azure SQL Database, allowing you to perform vector search queries to find similar data points.

You can explore how vector embeddings work by visiting this amazing website: Transformer Explainer. It offers an excellent interactive experience to help you better understand how Generative AI operates in general.
Vector Search Use Cases

Vector search is a powerful technique used to find vectors in a dataset that are similar to a given query vector. This capability is essential in various applications, including:
- Semantic Search: Rank search results based on their relevance to the user's query.
- Recommendation Systems: Suggest related items based on similarity in vector space.
- Clustering: Group similar items together based on vector similarity.
- Anomaly Detection: Identify outliers in data by finding vectors that differ significantly from the norm.
- Classification: Classify items based on the similarity of their vectors to predefined categories.
For instance, consider a semantic search application where a user queries for "healthy breakfast options." A vector search would compare the vector representation of the query with vectors representing product reviews, finding the most contextually relevant items, even if the exact keywords don't match.
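A minimal Python sketch of that comparison, using invented toy vectors in place of real embeddings, could look like this:

```python
import math

def cosine_distance(a, b):
    # Cosine distance = 1 - cosine similarity; smaller means more similar.
    dot = sum(x * y for x, y in zip(a, b))
    return 1 - dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical review embeddings keyed by review text (toy 3-d vectors).
reviews = {
    "Great oatmeal, keeps me full all morning": [0.9, 0.7, 0.1],
    "My cat loves this food": [0.1, 0.2, 0.9],
    "Tasty granola, perfect start to the day": [0.85, 0.75, 0.15],
}

query_vector = [0.88, 0.72, 0.12]  # pretend embedding of "healthy breakfast options"

# Rank every review by cosine distance to the query; smallest distance first.
ranked = sorted(reviews, key=lambda r: cosine_distance(query_vector, reviews[r]))
print(ranked[0])
```

The breakfast-related reviews rank ahead of the cat-food one even though neither contains the words "healthy breakfast options", which is the essence of semantic search.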
Key Features of Native Vector Support in Azure SQL
Azure SQL's native vector support introduces several new functions to operate on vectors, which are stored in a binary format to optimize performance. Here are the key functions:
- JSON_ARRAY_TO_VECTOR: Converts a JSON array into a vector, enabling you to store embeddings in a compact format.
- ISVECTOR: Checks whether a binary value is a valid vector, ensuring data integrity.
- VECTOR_TO_JSON_ARRAY: Converts a binary vector back into a human-readable JSON array, making it easier to work with the data.
- VECTOR_DISTANCE: Calculates the distance between two vectors using a chosen distance metric, such as cosine or Euclidean distance.
These functions enable powerful operations for creating, storing, and querying vector data in Azure SQL Database.
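To build intuition for the JSON-array/binary round trip these functions perform, here is a client-side Python sketch. The actual binary layout Azure SQL uses internally is not documented here, so single-precision floats are just an illustrative assumption:

```python
import json
import struct

def json_array_to_vector(json_text):
    # Parse a JSON array of numbers and pack it as 4-byte floats.
    values = json.loads(json_text)
    return struct.pack(f"{len(values)}f", *values)

def vector_to_json_array(blob):
    # Unpack 4-byte floats and render them back as a JSON array string.
    values = struct.unpack(f"{len(blob) // 4}f", blob)
    return json.dumps([round(v, 6) for v in values])

binary = json_array_to_vector("[-0.0223275, -0.0375578, 0.0066827]")
print(len(binary))  # 3 floats * 4 bytes = 12 bytes
print(vector_to_json_array(binary))
```

The point of the compact binary form is the same in both places: distance calculations run over raw floats instead of repeatedly reparsing JSON text.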
Example: Vector Search in Action
Let's walk through an example of using Azure SQL Database to store and query vector embeddings. Imagine you have a table of customer reviews, and you want to find reviews that are contextually related to a user's search query.
- Storing Embeddings as Vectors:
After generating embeddings using Azure OpenAI, you can store these vectors in a VARBINARY(8000) column in your SQL table:
ALTER TABLE [dbo].[FineFoodReviews] ADD [VectorBinary] VARBINARY(8000);
UPDATE [dbo].[FineFoodReviews]
SET [VectorBinary] = JSON_ARRAY_TO_VECTOR([vector]);

This allows you to store the embeddings efficiently, ready for vector search operations.
- Performing Similarity Searches:
To find reviews that are similar to a user's query, you can convert the query into a vector and calculate the cosine distance between the query vector and the stored embeddings:
DECLARE @e VARBINARY(8000);
EXEC dbo.GET_EMBEDDINGS @model = '<yourmodeldeploymentname>', @text = 'healthy breakfast options', @embedding = @e OUTPUT;
SELECT TOP(10) ProductId,
Summary,
Text,
VECTOR_DISTANCE('cosine', @e, VectorBinary) AS Distance
FROM dbo.FineFoodReviews
ORDER BY Distance;

This query returns the top reviews that are contextually related to the user's search, even if the exact words don't match.
- Hybrid Search with Filters:
You can enhance vector search by combining it with traditional keyword filters to improve relevance and performance. For example, you could filter reviews based on criteria like user identity, review score, or the presence of specific keywords, and then apply vector search to rank the results by relevance:
-- Comprehensive query with multiple filters.
SELECT TOP(10)
f.Id,
f.ProductId,
f.UserId,
f.Score,
f.Summary,
f.Text,
VECTOR_DISTANCE('cosine', @e, VectorBinary) AS Distance,
CASE
WHEN LEN(f.Text) > 100 THEN 'Detailed Review'
ELSE 'Short Review'
END AS ReviewLength,
CASE
WHEN f.Score >= 4 THEN 'High Score'
WHEN f.Score BETWEEN 2 AND 3 THEN 'Medium Score'
ELSE 'Low Score'
END AS ScoreCategory
FROM FineFoodReviews f
WHERE
f.UserId NOT LIKE 'Anonymous%' -- Exclude anonymous users
AND f.Score >= 2 -- Score threshold filter
AND LEN(f.Text) > 50 -- Text length filter for detailed reviews
AND (f.Text LIKE '%gluten%' OR f.Text LIKE '%dairy%') -- Keyword filter
ORDER BY
Distance, -- Order by cosine distance
f.Score DESC, -- Secondary order by review score
ReviewLength DESC; -- Tertiary order by review length

This query combines semantic search with traditional filters, balancing relevance and computational efficiency.
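The same filter-then-rank pattern can be sketched in plain Python with made-up rows, to show why running cheap predicate filters before the distance calculation pays off:

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 1 - dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical rows mirroring the FineFoodReviews columns used above.
rows = [
    {"id": 1, "user": "Alice", "score": 5,
     "text": "Great gluten-free bread, soft and tasty, my whole family loved it",
     "vec": [0.9, 0.1]},
    {"id": 2, "user": "Anonymous123", "score": 5,
     "text": "Excellent gluten-free pasta, cooks perfectly every single time",
     "vec": [0.8, 0.2]},
    {"id": 3, "user": "Bob", "score": 1, "text": "Bad", "vec": [0.1, 0.9]},
    {"id": 4, "user": "Carol", "score": 4,
     "text": "The dairy-free cheese melts surprisingly well on homemade pizza",
     "vec": [0.85, 0.15]},
]
query_vec = [0.9, 0.12]  # pretend embedding of the user's query

# Step 1: cheap keyword/metadata filters shrink the candidate set...
candidates = [
    r for r in rows
    if not r["user"].startswith("Anonymous")
    and r["score"] >= 2
    and len(r["text"]) > 50
    and ("gluten" in r["text"] or "dairy" in r["text"])
]
# Step 2: ...then the costlier vector distance ranks only what remains.
candidates.sort(key=lambda r: cosine_distance(query_vec, r["vec"]))
print([r["id"] for r in candidates])
```

Only the rows that survive the inexpensive WHERE-style predicates ever pay for a distance computation, which is the trade-off the hybrid SQL query above makes as well.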
Leveraging REST Services for Embedding Generation
Azure OpenAI provides REST endpoints for generating embeddings, which can be consumed directly from Azure SQL Database using the sp_invoke_external_rest_endpoint system stored procedure. This integration enables seamless interaction between your data and AI models, allowing you to build intelligent applications that combine the power of machine learning with the familiarity of SQL.
Hereās a stored procedure example that retrieves embeddings from a deployed Azure OpenAI model and stores them in the database:
CREATE PROCEDURE [dbo].[GET_EMBEDDINGS]
(
@model VARCHAR(MAX),
@text NVARCHAR(MAX),
@embedding VARBINARY(8000) OUTPUT
)
AS
BEGIN
DECLARE @retval INT, @response NVARCHAR(MAX);
DECLARE @url VARCHAR(MAX);
DECLARE @payload NVARCHAR(MAX) = JSON_OBJECT('input': @text);
SET @url = 'https://<resourcename>.openai.azure.com/openai/deployments/' + @model + '/embeddings?api-version=2023-03-15-preview';
EXEC dbo.sp_invoke_external_rest_endpoint
@url = @url,
@method = 'POST',
@payload = @payload,
@headers = '{"Content-Type":"application/json", "api-key":"<openAIkey>"}',
@response = @response OUTPUT;
DECLARE @jsonArray NVARCHAR(MAX) = JSON_QUERY(@response, '$.result.data[0].embedding');
SET @embedding = JSON_ARRAY_TO_VECTOR(@jsonArray);
END
GO

This stored procedure retrieves embeddings from the Azure OpenAI model and converts them into a binary format for storage in the database, making them available for similarity search and other operations.
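For reference, extracting the embedding from the response outside of SQL follows the same path as the JSON_QUERY call above. The response shape below is a hand-written sample (values made up) mirroring how sp_invoke_external_rest_endpoint wraps the service reply under a result property:

```python
import json

# Sample response envelope; the embedding values are invented for illustration.
sample_response = """
{
  "response": {"status": {"http": {"code": 200}}},
  "result": {
    "data": [
      {"index": 0, "embedding": [-0.0223275, -0.0375578, 0.0066827]}
    ],
    "model": "text-embedding-3-small"
  }
}
"""

# Equivalent of JSON_QUERY(@response, '$.result.data[0].embedding')
payload = json.loads(sample_response)
embedding = payload["result"]["data"][0]["embedding"]
print(len(embedding), embedding[0])
```

In the stored procedure, that extracted array is then handed straight to JSON_ARRAY_TO_VECTOR to produce the binary value.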
Let's Implement an Experiment with Native Vector Support in Azure SQL
Azure SQL Database provides a seamless way to store and manage vector data despite not having a dedicated vector data type. Vectors are essentially lists of numbers, and with column-store indexes they can be stored efficiently in a table: each vector occupies a row, with individual elements as columns or as a serialized array. This approach ensures efficient storage and retrieval, making Azure SQL suitable for large-scale vector data management.
I used the Global News Dataset from Kaggle in my experiment.
First, you must create the columns to hold the vector information. In my case, I created two columns: title_vector for the news title and content_vector for the news content. For this, I wrote a small Python script, but you can also do it directly from SQL using a cursor. It's important to know that you don't need to pay for a separate vector database, since you can save the vector information inside Azure SQL.
from litellm import embedding
import pyodbc  # or another SQL connection library
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Set up Azure OpenAI credentials from environment variables
os.environ['AZURE_API_KEY'] = os.getenv('AZURE_API_KEY')
os.environ['AZURE_API_BASE'] = os.getenv('AZURE_API_BASE')
os.environ['AZURE_API_VERSION'] = os.getenv('AZURE_API_VERSION')

# Connect to your Azure SQL database
conn = pyodbc.connect(f'DRIVER={{ODBC Driver 17 for SQL Server}};'
                      f'SERVER={os.getenv("DB_SERVER")};'
                      f'DATABASE={os.getenv("DB_DATABASE")};'
                      f'UID={os.getenv("DB_UID")};'
                      f'PWD={os.getenv("DB_PWD")}')

def get_embeddings(text):
    # Truncate the text because of the text-embedding-3-small input limit
    # (8191 tokens); cutting at 8191 characters is a rough approximation
    truncated_text = text[:8191]
    response = embedding(
        model="azure/text-embedding-3-small",
        input=truncated_text,
        api_key=os.getenv('AZURE_API_KEY'),
        api_base=os.getenv('AZURE_API_BASE'),
        api_version=os.getenv('AZURE_API_VERSION')
    )
    return response['data'][0]['embedding']

def update_database(article_id, title_vector, content_vector):
    cursor = conn.cursor()
    # Convert vectors to strings
    title_vector_str = str(title_vector)
    content_vector_str = str(content_vector)
    # Update the SQL query to use the string representations
    cursor.execute("""
        UPDATE newsvector
        SET title_vector = ?, content_vector = ?
        WHERE article_id = ?
    """, (title_vector_str, content_vector_str, article_id))
    conn.commit()

def embed_and_update():
    cursor = conn.cursor()
    cursor.execute("SELECT article_id, title, full_content FROM newsvector "
                   "WHERE title_vector IS NULL AND full_content IS NOT NULL "
                   "AND title IS NOT NULL ORDER BY published ASC")
    for row in cursor.fetchall():
        article_id, title, full_content = row
        print(f"Embedding article {article_id} - {title}")
        title_vector = get_embeddings(title)
        content_vector = get_embeddings(full_content)
        update_database(article_id, title_vector, content_vector)

embed_and_update()

These two columns will contain something like this: [-0.02232750505208969, -0.03755787014961243, -0.0066827102564275265…]
Second, you must create a procedure in the Azure SQL database to transform the query into a vector embedding.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[GET_EMBEDDINGS]
(
@model VARCHAR(MAX),
@text NVARCHAR(MAX),
@embedding VARBINARY(8000) OUTPUT
)
AS
BEGIN
DECLARE @retval INT, @response NVARCHAR(MAX);
DECLARE @url VARCHAR(MAX);
DECLARE @payload NVARCHAR(MAX) = JSON_OBJECT('input': @text);
-- Set the @url variable with proper concatenation before the EXEC statement
SET @url = 'https://<Your App>.openai.azure.com/openai/deployments/' + @model + '/embeddings?api-version=2024-02-15-preview';
EXEC dbo.sp_invoke_external_rest_endpoint
@url = @url,
@method = 'POST',
@payload = @payload,
@headers = '{"Content-Type":"application/json", "api-key":"<Your Azure OpenAI API Key>"}',
@response = @response OUTPUT;
-- Use JSON_QUERY to extract the embedding array directly
DECLARE @jsonArray NVARCHAR(MAX) = JSON_QUERY(@response, '$.result.data[0].embedding');
SET @embedding = JSON_ARRAY_TO_VECTOR(@jsonArray);
END
GO

I also created another procedure to search the dataset directly using Native Vector Support in Azure SQL.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[SearchNewsVector]
@inputText NVARCHAR(MAX)
AS
BEGIN
-- Query the SimilarNewsContentArticles table using the response
IF OBJECT_ID('dbo.result', 'U') IS NOT NULL
DROP TABLE dbo.result;
--Assuming you have a stored procedure to get embeddings for a given text
DECLARE @e VARBINARY(8000);
EXEC dbo.GET_EMBEDDINGS @model = 'text-embedding-3-small', @text = @inputText, @embedding = @e OUTPUT;
SELECT TOP(10)
[article_id]
,[source_id]
,[source_name]
,[author]
,[title]
,[description]
,[url]
,[url_to_image]
,[content]
,[category]
,[full_content]
,[title_vector]
,[content_vector]
,[published]
,VECTOR_DISTANCE('cosine', @e, JSON_ARRAY_TO_VECTOR([content_vector])) AS cosine_distance -- content_vector is stored as a JSON array string
into result
FROM newsvector
ORDER BY cosine_distance;
END

Finally, you can start querying your table using prompts instead of keywords. This is awesome!
Check out the app I developed with Native Vector Support in Azure SQL, designed to help you craft prompts and evaluate their results against my newsvector dataset. To explore the app, click here.

As always, I also created this GitHub repository with everything I did.

Signing Up for the Azure SQL Database Native Vector Support Private Preview
You can sign up for the private preview at this link.

This article, published by Davide Mauri and Pooja Kamath at the Microsoft Build 2024 event, provides all the details.

Conclusion
The integration of Azure OpenAI with native vector support in Azure SQL Database unlocks new possibilities for applications that require advanced search capabilities and data analysis. By storing and querying vector embeddings alongside traditional SQL data, you can build powerful solutions that combine the best of both worlds: semantic understanding with the reliability and performance of Azure SQL.
This innovation simplifies application development, enhances data insights, and paves the way for the next generation of intelligent applications.
That’s it for today!
Sources
Azure SQL DB Vector Functions Private Preview | Data Exposed (youtube.com)
Announcing EAP for Vector Support in Azure SQL Database – Azure SQL Devs' Corner (microsoft.com)