Session Explorer in Action: Personalizing Conference Experiences with AI Using Vector Support in Azure SQL, LangChain, and Streamlit

Conferences can be overwhelming, with countless sessions, various speakers, and a wealth of information to digest. What if we could simplify this experience using AI, turning data into a personalized assistant? Enter Session Explorer, a cutting-edge solution built using Vector Support in Azure SQL, LangChain, and Streamlit. This AI-powered application transforms the way attendees interact with conference data, making it intuitive, efficient, and engaging.

The Inspiration Behind Session Explorer

Conferences like the PASS Data Community Summit bring together diverse topics, speakers, and attendees. While the variety is exciting, navigating this wealth of information can be daunting. The Session Explorer was designed to:

  • Help attendees quickly find relevant sessions.
  • Provide insights about specific speakers or topics.
  • Create a seamless, user-friendly interface powered by AI.

By leveraging Vector Support in Azure SQL for intelligent data retrieval, LangChain for conversational AI, and Streamlit for a dynamic user interface, the app makes conference exploration smarter and simpler.

Key Technologies Powering Session Explorer

1. Vector Support in Azure SQL

Azure SQL’s vector capabilities enable efficient semantic searches by transforming text into embeddings. These embeddings are compared using the vector_distance function, allowing the system to find similar sessions based on user queries.

Here’s how it works:

SQL
DECLARE @qv vector(1536);
EXEC web.get_embedding 'Data-driven insights', @qv OUTPUT;

SELECT TOP(5) 
    se.id AS session_id, 
    vector_distance('cosine', se.embeddings, @qv) AS distance
FROM 
    web.sessions se
ORDER BY
    distance;

This query takes a user-provided text, converts it to an embedding using OpenAI, and retrieves the most relevant sessions.
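The cosine comparison that `vector_distance('cosine', …)` performs can be illustrated in plain Python. This is a minimal sketch of the underlying math, with tiny toy vectors standing in for the 1536-dimensional embeddings; the function name and the sample sessions are illustrative, not part of the app:

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 0.0 for identical directions, 1.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (the real ones have 1536 dimensions):
query = [0.9, 0.1, 0.0]
sessions = {
    "Data-driven insights with Azure SQL": [1.0, 0.0, 0.0],
    "Career growth for DBAs": [0.0, 0.0, 1.0],
}

# Rank sessions by distance, as the ORDER BY in the SQL query does:
ranked = sorted(sessions, key=lambda t: cosine_distance(query, sessions[t]))
print(ranked[0])  # the semantically closest session comes first
```

Sessions whose embeddings point in nearly the same direction as the query embedding get the smallest distance, which is why `ORDER BY distance` surfaces the most relevant results first.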

If you want to know more about Vector Support in Azure SQL, see my post below.

2. LangChain

LangChain integrates the retrieval-augmented generation (RAG) approach to provide additional context to the language model. Using LCEL (LangChain Expression Language), it dynamically injects session data into prompts for personalized responses.

Python
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "ai",
            """ 
            You are a system assistant who helps users find sessions to watch at the conference, based on the sessions that are provided to you.
            Sessions will be provided in an assistant message in the format of `title|abstract|speakers|start-time|end-time|room|track|session_type|topics|level|Session URL`. You can use only the provided session list to answer the user's question.
            If the user asks about a speaker, respond with the sessions that the speaker is participating in.
            If the user asks a question that is not related to the provided sessions, respond that you're unable to assist because the information is not available in the database.
            Your answer must include the session title, a very short summary of the abstract, the speakers, the start time, the end time, the room, the track, the session type, the topics, and the level. At the end, insert the session URL so it can be opened in a new window.
            """,
        ),
        (
            "human", """
            The sessions available at the conference are the following: 
            {sessions}                
            """
        ),
        (
            "human", 
            "{question}"
        ),
    ]
)

retriever = RunnableLambda(get_similar_sessions).bind() 

rag_chain = {"sessions": retriever, "question": RunnablePassthrough()} | prompt | llm

3. Streamlit

Streamlit creates an intuitive web interface, making the AI capabilities accessible to users. With a few lines of code, we crafted an interactive app where users can ask questions and receive detailed session recommendations.

Python
import streamlit as st
import os
import sys
from dotenv import load_dotenv
import logging
from utilities import get_similar_sessions
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser
from langchain.schema.runnable import Runnable
from langchain.schema.runnable.config import RunnableConfig
from langchain_core.runnables import RunnableLambda
from langchain_core.runnables import RunnablePassthrough
import getpass

logging.basicConfig(level=logging.INFO)

# Add the parent directory to the path
sys.path.append(os.path.abspath('..'))

# Load environment variables from the .env file
load_dotenv()

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

MODELO = ""

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "ai",
            """ 
            You are a system assistant who helps users find sessions to watch at the conference, based on the sessions that are provided to you.
            Sessions will be provided in an assistant message in the format of `title|abstract|speakers|start-time|end-time|room|track|session_type|topics|level|Session URL`. You can use only the provided session list to answer the user's question.
            If the user asks about a speaker, respond with the sessions that the speaker is participating in.
            If the user asks a question that is not related to the provided sessions, respond that you're unable to assist because the information is not available in the database.
            Your answer must include the session title, a very short summary of the abstract, the speakers, the start time, the end time, the room, the track, the session type, the topics, and the level. At the end, insert the session URL so it can be opened in a new window.
            """,
        ),
        (
            "human", """
            The sessions available at the conference are the following: 
            {sessions}                
            """
        ),
        (
            "human", 
            "{question}"
        ),
    ]
)

def configure_app():
    """Configures the appearance and layout of the Streamlit app."""
    st.set_page_config(
        page_title="Chat",
        page_icon="💬",
        layout="wide",
        initial_sidebar_state="expanded",
    )
    st.header('Session Explorer: Your Guide to the Data Pass Summit 2024')

    st.write("""Ask something like 'What are the sessions about Azure SQL and AI?' or 'What are the sessions by Davide Mauri?'""")
        
def sidebar_inputs():
    """Displays the sidebar menu with the model selection and app information."""
    with st.sidebar:        
        
        st.image("https://passdatacommunitysummit.com/assets/images/pass-2024-logo-lock-up--dark--with-icon.svg")   
        
        # Model selection to define which OpenAI model the chat will use
        modelo = st.selectbox("Select the model:", ('gpt-4o-mini','gpt-4o' ))
                         
        ""

        "You can find more information about this app in my blog post: [Session Explorer in Action: Personalizing Conference Experiences with AI Using Vector Support in Azure SQL, LangChain, and Streamlit](https://lawrence.eti.br/2024/11/20/session-explorer-in-action-personalizing-conference-experiences-with-ai-using-vector-support-in-azure-sql-langchain-and-streamlit/)"
        ""
        "The PASS Data Community Summit is an annual conference designed for data professionals to connect, share insights, and learn from peers and industry leaders. It focuses on a wide range of topics, including analytics, architecture, database management, development, and professional growth, across multiple platforms such as Microsoft, AWS, Google, PostgreSQL, and others."        
        ""
        "Official website: [PASS Data Community Summit](https://passdatacommunitysummit.com/)"
        ""	
        ""
        ""
        "Created by [Lawrence Teixeira](https://www.linkedin.com/in/lawrenceteixeira/)"

    return modelo        
        
def main():    
    """Main application function, where all the other functions are called."""
        
    configure_app()
            
    global MODELO

    modelo = sidebar_inputs()
    
    MODELO = modelo

    llm = ChatOpenAI(
        model=MODELO,
        temperature=0,
        max_tokens=16383,
        timeout=None,
        max_retries=2,
        streaming=False,
    )

    if "messages" not in st.session_state:
        st.session_state["messages"] = [{"role": "assistant", "content": "Hi! 😊 How are you? 💬 Feel free to ask anything about the sessions at PASS Data Community Summit 2024!"}]

    for msg in st.session_state.messages:
        if msg["role"] != "system":
            st.chat_message(msg["role"]).write(msg["content"])

    if prompt_entrada := st.chat_input("Type your message here"):
        st.session_state.messages.append({"role": "user", "content": prompt_entrada})
        st.chat_message("user").write(prompt_entrada)
        
        with st.spinner('Searching...'): 
            
            retriever = RunnableLambda(get_similar_sessions).bind() 

            rag_chain = {"sessions": retriever, "question": RunnablePassthrough()} | prompt | llm

            response_chat = rag_chain.invoke(prompt_entrada)            
            
            response = response_chat.content
        
        if response:
            result = str(response)
        else:
            result = ":)"

        msg = { "role": "assistant",
                "content": result
        }

        st.session_state.messages.append(msg)
                
        st.chat_message("assistant").write(msg["content"])    

if __name__ == "__main__":
    main()

How Session Explorer Works

Step 1: User Query

Users input a question, such as:
“Which sessions are led by John Doe?” or “What sessions discuss AI in data management?”

Step 2: Intelligent Retrieval

The app uses the get_similar_sessions function, querying Azure SQL with the user’s input. The SQL stored procedure returns a list of sessions ranked by relevance.

Python
results = cursor.execute("EXEC web.find_sessions @text=?", (search_text,)).fetchall()
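Before being injected into the prompt, the rows returned by the stored procedure have to be flattened into the pipe-delimited lines the system message expects. A minimal sketch of that step (the field order follows the prompt template; the helper name and sample row are illustrative):

```python
def format_sessions(rows):
    """Flatten result rows into the `title|abstract|...|Session URL` lines the prompt expects."""
    return "\n".join("|".join(str(field) for field in row) for field_count in [None] for row in rows)

# Hypothetical row as it might come back from web.find_sessions:
rows = [
    ("Advancing Data Science with AI", "A talk on applied AI.", "Dr. Smith",
     "10:00", "11:30", "Auditorium A", "Data Science", "General", "AI", "300",
     "https://example.com/session/1"),
]
print(format_sessions(rows))
```

Each 11-field row becomes one line with 10 pipe separators, matching the `title|abstract|speakers|…|Session URL` format the assistant message describes.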

Step 3: Conversational AI

LangChain takes the retrieved data, formats it into a response template, and interacts with the language model to generate human-readable answers.

Step 4: Seamless Delivery

The personalized session details are displayed on the Streamlit interface, complete with clickable URLs to explore further.

Personalizing the Conference Experience

Here’s an example of a typical interaction:

  • User: “Show me all sessions by Dr. Smith.”
  • Session Explorer:
  • Title: “Advancing Data Science with AI”
  • Speakers: Dr. Smith, Jane Doe
  • Time: 10:00 AM – 11:30 AM
  • Room: Auditorium A
  • Track: Data Science
  • URL: Session Details

This level of personalization not only saves time but also enhances the conference experience by connecting attendees with sessions that truly matter to them.

If you want to try deploying the app yourself, visit my GitHub repository.

Introducing the Session Explorer App for PASS Data Community Summit 2024

The Session Explorer App is your go-to tool for navigating the wealth of knowledge at the PASS Data Community Summit 2024. Built to streamline your conference experience, this app leverages the latest advancements in AI to help you discover sessions, speakers, and topics tailored to your interests.

What Can the App Do?

The app enables you to:

  • Explore Sessions: Ask specific questions about the available sessions, such as “What sessions discuss AI in data management?” or “Show me sessions by Dr. Smith.”
  • Discover Speakers: Find all sessions led by a particular speaker or group of speakers.
  • Plan Your Schedule: Get session details, including title, abstract, speakers, start and end times, room location, track, session type, and difficulty level.
  • Direct Access to Information: Each session includes a clickable link to its detailed page, making it easy to add it to your schedule.

How It Works

Using the app is simple:

  1. Click on this link: https://sessionschat.fly.dev/
  2. Ask a Question: Type your query into the chat interface, like “Which sessions are about Azure SQL?”
  3. Get Intelligent Responses: The app uses AI to understand your question and retrieve the most relevant session information from the database.
  4. View Results: Receive detailed session summaries, including timing, location, and a direct link to learn more.

Why Use the Session Explorer App?

The app transforms the way you navigate the conference, ensuring you never miss a session relevant to your goals. Whether you’re interested in AI, analytics, architecture, or professional development, the Session Explorer helps you focus on what matters most.

Experience the power of AI in personalizing your conference journey with the Session Explorer App. Dive into the sessions that spark your curiosity and make the most of the PASS Data Community Summit 2024!

Final Thoughts

The Session Explorer is more than just a chatbot; it’s a tool that redefines how we interact with conference data. Combining the power of Vector Support in Azure SQL with LangChain and Streamlit delivers a personalized, AI-driven experience that attendees will love.

Whether you’re a data enthusiast, a tech professional, or an AI developer, the Session Explorer showcases the immense potential of AI in enhancing user experiences. Ready to explore it in action? Try it for yourself at the PASS Data Community Summit 2024 app!

That’s it for today!

Sources

https://devblogs.microsoft.com/azure-sql/build-a-chatbot-on-your-own-data-in-1-hour-with-azure-sql-langchain-and-chainlit/

https://github.com/Azure-Samples/azure-sql-db-rag-langchain-chainlit

Cost-Effective Text Embedding: Leveraging Ollama Local Models with Azure SQL Databases

Embedding text using a local model can provide significant cost advantages and flexibility over cloud-based services. In this blog post, we explore how to set up and use a local model for text embedding and how this approach can be integrated with Azure SQL databases for advanced querying capabilities.

Cost Comparison: OpenAI’s text-embedding-ada-002 Pay-per-Use Model vs. Local Model Setup Cost

When choosing between a paid service and setting up a local model for text embedding, it’s crucial to consider the cost implications based on the scale of your data and the frequency of usage. Below is a detailed comparison of the costs of using a paid model versus establishing a local one.

Pay Model Cost Estimate:

OpenAI text-embedding-ada-002:

Using a paid model like OpenAI’s Ada V2 for embedding 1 terabyte of OCR texts would cost around $25,000. This estimation is based on converting every 4 characters into one token, which might vary depending on the content and structure of the OCR texts.
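The $25,000 figure can be reproduced with simple arithmetic, assuming 1 character ≈ 1 byte, the post's heuristic of 4 characters per token, and ada-002's list price of $0.0001 per 1K tokens at the time of writing (the price is an assumption and may change):

```python
# Rough cost estimate for embedding 1 TB of OCR text with text-embedding-ada-002
chars = 1_000_000_000_000          # 1 TB of text, assuming ~1 byte per character
tokens = chars / 4                 # heuristic from the post: ~4 characters per token
price_per_1k_tokens = 0.0001       # USD, assumed ada-002 list price
cost = tokens / 1000 * price_per_1k_tokens
print(f"Estimated cost: ${cost:,.0f}")  # ~$25,000
```

The real token count varies with the content and structure of the OCR texts, so this is an order-of-magnitude estimate rather than a quote.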

Local Model Cost Estimate:

Setup Costs:

The initial investment for setting up a local model can range from $4,050 to $12,750, depending on the selection of components, from mid-range to high-end. This one-time cost can be amortized over many uses and datasets, potentially offering a more cost-effective solution in the long run, especially for large data volumes.

Overall Financial Implications

While the upfront cost for a local model might seem high, it becomes significantly more economical with increased data volumes and repeated use. In contrast, the cost of using a pay model like OpenAI’s text-embedding-ada-002 scales linearly with data volume, leading to potentially high ongoing expenses.

Considering these factors, the local model offers a cost advantage and greater control over data processing and security, making it an attractive option for organizations handling large quantities of sensitive data.

Why I Decided to Use a Local Model

Cost and data volume considerations primarily drove the decision to use a local model for text embedding. With over 20 terabytes of data, including 1 terabyte of OCR text to embed, the estimated cost of using a commercial text-embedding model like OpenAI’s text-embedding-ada-002 would be around USD 25,000. By setting up a local model, we can process our data at a fraction of this cost, reducing expenses by 49% to 84%.

Exploring Local Models: Testing BGE-M3, MXBAI-EMBED-LARGE, NOMIC-EMBED-TEXT, and OpenAI’s text-embedding-ada-002

I encountered some intriguing results in my recent tests with local embedding models BGE-M3 and NOMIC-EMBED-TEXT. Both models showed an accuracy below 0.80 when benchmarked against OpenAI’s “Text-embedding-ada-002.” This comparison has sparked a valuable discussion about the capabilities and limitations of different embedding technologies.

How to Choose the Best Model for Your Needs?

When considering open-source embedding models like NOMIC-EMBED-TEXT, BGE-M3, and MXBAI-EMBED-LARGE, you should weigh the specific strengths that make each one suitable for different machine learning tasks.

1. NOMIC-EMBED-TEXT: This model is specifically designed for handling long-context text, making it suitable for tasks that involve processing extensive documents or content that benefits from understanding broader contexts. It achieves this by training on full Wikipedia articles and various large-scale question-answering datasets, which helps it capture long-range dependencies.

2. BGE-M3: Part of the BGE (BAAI General Embedding) series from the Beijing Academy of Artificial Intelligence, this model is adapted for sentence similarity tasks. It’s built to handle multilingual content effectively, which makes it a versatile choice for applications requiring understanding or comparing sentences across different languages.

3. MXBAI-EMBED-LARGE: This model is noted for its feature extraction capabilities, making it particularly useful for tasks that require distilling complex data into simpler, meaningful representations. Its training involves diverse datasets, enhancing its generalization across text types and contexts.

Each model brings unique capabilities, such as handling longer texts or providing robust multilingual support. When choosing among these models, consider the specific needs of your project, such as the length of text you need to process, the importance of multilingual capabilities, and the type of machine learning tasks you aim to perform (e.g., text similarity, feature extraction). Testing them with specific data is crucial to determine which model performs best in your context.

In our analysis, we compared the results and identified the best open-source model to use relative to OpenAI’s text-embedding-ada-002.

We executed this query using the keyword ‘Microsoft’ to search the vector table and compare the content of Wikipedia articles.

SQL
declare @v nvarchar(max)
select @v = content_vector from dbo.wikipedia_articles_embeddings where title = 'Microsoft'
select w.title, w.text from 
(select top (10) id, title, text, dot_product
from [$vector].find_similar$wikipedia_articles_embeddings$content_vector(@v, 1, 0.25) 
order by dot_product desc) w
order by w.title
go

We utilized the KMeans compute node for text similarity analysis, focusing on a single-cluster search. For a detailed, step-by-step guide on creating this dataset, please refer to the article linked at the end of this post.
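The idea behind the cluster-based approximate search can be sketched in a few lines: instead of scanning every vector, the query is first compared against the cluster centroids, and the full similarity computation runs only inside the closest cluster(s). This is a hand-rolled illustration of the principle with toy 2-D data, not the actual `[$vector].find_similar$…` implementation:

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def approximate_search(query, centroids, clusters, probe=1):
    """Search only the `probe` clusters whose centroids are most similar to the query."""
    # Rank centroids by similarity to the query vector
    nearest = sorted(range(len(centroids)),
                     key=lambda i: cosine_sim(query, centroids[i]),
                     reverse=True)[:probe]
    # Exhaustive search restricted to the selected clusters only
    candidates = [item for i in nearest for item in clusters[i]]
    return sorted(candidates, key=lambda item: cosine_sim(query, item[1]), reverse=True)

# Toy 2-D data split into two clusters (real embeddings have hundreds of dimensions)
clusters = {
    0: [("Microsoft", [1.0, 0.1]), ("Microsoft Windows", [0.9, 0.2])],
    1: [("Banana", [0.0, 1.0])],
}
centroids = [[0.95, 0.15], [0.0, 1.0]]

results = approximate_search([1.0, 0.0], centroids, clusters, probe=1)
print([title for title, _ in results])
```

Probing more clusters (4, 8, 16) trades speed for recall, which is exactly what the cluster-count parameter in the queries below varies.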

Here is the results overview:

To calculate the percentage of similarity of each model with “Text-embedding-ada-002”, we’ll determine how many keywords match between “Text-embedding-ada-002” and the other models, then express this as a percentage of the total keywords in “Text-embedding-ada-002”. Here’s the updated table with the percentages:

Here is the comparison table:

  1. Text-embedding-ada-002 Keywords Total: 10 (100% is based on these keywords).
  2. Matching Keywords:
       – BGE-M3: Matches 7 out of 10 keywords of Text-embedding-ada-002.
       – NOMIC-EMBED-TEXT: Matches 3 out of 10 keywords of Text-embedding-ada-002.
       – MXBAI-EMBED-LARGE: Matches 1 out of 10 keywords of Text-embedding-ada-002.

This table illustrates that the BGE-M3 model is similar to “Text-embedding-ada-002,” with 70% of the keywords matching. It is followed by “NOMIC-EMBED-TEXT” at 30% and “MXBAI-EMBED-LARGE,” with the least similarity at 10%.
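The percentage calculation itself is a simple set overlap. A sketch with placeholder keyword names standing in for the actual results (only the match counts from the table above are real):

```python
def similarity_pct(reference, candidate):
    """Share of the reference model's keywords that the candidate model also returned."""
    matched = set(reference) & set(candidate)
    return 100 * len(matched) / len(reference)

# Placeholder keyword sets mirroring the counts in the table above:
reference = [f"kw{i}" for i in range(10)]                   # 10 ada-002 keywords
candidate = reference[:7] + ["extra1", "extra2", "extra3"]  # BGE-M3: 7 of 10 match
print(similarity_pct(reference, candidate))  # 70.0
```

Applying the same function to the other two models' overlaps (3 and 1 matching keywords) yields the 30% and 10% figures above.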

How does it perform when doing an approximate search with 1, 4, 8, and 16 clusters?

We execute this query within the Azure database to perform this test across each database and model we use:

SQL
create table #trab (linha varchar(200) null)

insert into #trab (linha) values ('Model: mxbai-embed-large')

declare @v nvarchar(max)
select @v = content_vector from dbo.wikipedia_articles_embeddings where title = 'Microsoft'

insert into #trab (linha) values ('')
insert into #trab (linha) values ('Search with 1 cluster')

insert into #trab (linha)
select w.title from 
(select top (10) id, title, text, dot_product
from [$vector].find_similar$wikipedia_articles_embeddings$content_vector(@v, 1, 0.25) 
order by dot_product desc) w
order by w.title
go

declare @v nvarchar(max)
select @v = content_vector from dbo.wikipedia_articles_embeddings where title = 'Microsoft'

insert into #trab (linha) values ('')
insert into #trab (linha) values ('Search with 4 clusters')

insert into #trab (linha)
select w.title from 
(select top (10) id, title, text, dot_product
from [$vector].find_similar$wikipedia_articles_embeddings$content_vector(@v, 4, 0.25) 
order by dot_product desc) w
order by w.title
go

declare @v nvarchar(max)
select @v = content_vector from dbo.wikipedia_articles_embeddings where title = 'Microsoft'

insert into #trab (linha) values ('')
insert into #trab (linha) values ('Search with 8 clusters')

insert into #trab (linha)
select w.title from 
(select top (10) id, title, text, dot_product
from [$vector].find_similar$wikipedia_articles_embeddings$content_vector(@v, 8, 0.25) 
order by dot_product desc) w
order by w.title
go

declare @v nvarchar(max)
select @v = content_vector from dbo.wikipedia_articles_embeddings where title = 'Microsoft'

insert into #trab (linha) values ('')
insert into #trab (linha) values ('Search with 16 clusters')

insert into #trab (linha)
select w.title from 
(select top (10) id, title, text, dot_product
from [$vector].find_similar$wikipedia_articles_embeddings$content_vector(@v, 16, 0.25) 
order by dot_product desc) w
order by w.title
go

select * from #trab

drop table #trab

Here is the results overview:

Based on the previous detailed list, here are the calculations for the percentage of similarity:

1. Total Distinct Keywords in Text-embedding-ada-002: 10 (100% based on these keywords)

2. Keywords in each Cluster Search:

   – BGE-M3: 5 keywords (Microsoft, Microsoft Office, Microsoft Windows, Microsoft Word, MSN)

   – NOMIC-EMBED-TEXT: 4 keywords (Microsoft, MSN, Nokia, Outlook.com)

   – MXBAI-EMBED-LARGE: 2 keywords (Microsoft, Nokia)

Here’s the updated table with the percentage similarity for searches with 1, 4, 8, and 16 clusters:

This table shows the similarity percentages for each model across different cluster configurations compared to the “text-embedding-ada-002” model. Each model retains a consistent similarity percentage across all cluster numbers, indicating that the cluster configuration did not affect the keywords searched for in these cases.

To execute the Python code that embeds the vectors, you first have to install Ollama.

How Do You Set Up a Local Model Using Ollama?

To run an Ollama model with your GPU, you can use the official Docker image provided by Ollama. The Docker image supports Nvidia GPUs and can be installed using the NVIDIA Container Toolkit. Here are the steps to get started:

  1. Install Docker: Download and install Docker Desktop or Docker Engine, depending on your operating system.
  2. Pull the Ollama Docker Image: docker pull ollama/ollama.
  3. Run the Ollama Docker Image: Execute a Docker run command to set up the Ollama container, configured for either CPU or Nvidia GPU environments: docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama.
  4. Pull and Run a Model: Choose a preferred model from the Ollama library, such as nomic-embed-text or mxbai-embed-large, and run it with docker exec -it ollama ollama run nomic-embed-text or docker exec -it ollama ollama run mxbai-embed-large.
  5. Access and Use the Model: Start interacting with your model through the Ollama API at the local address provided (typically http://localhost:11434).

Please note that the above commands assume you have already installed Docker on your system. If you haven’t installed Docker yet, you can download it from the official Docker website.

You can also download and install Ollama on Windows:

How Do You Convert Text into Embeddings with a Local Model Using Ollama?

After setting up your local model with Ollama, you can use the following Python script to convert text into embeddings:

Python
# Importing necessary libraries and modules
import os
import pyodbc  # SQL connection library for Microsoft databases
import requests  # For making HTTP requests
from dotenv import load_dotenv  # To load environment variables from a .env file
import numpy as np  # Library for numerical operations
from sklearn.preprocessing import normalize  # For normalizing data
import json  # For handling JSON data
from db.utils import NpEncoder  # Custom JSON encoder for numpy data types

# Load environment variables from a .env file located in the same directory
load_dotenv()

# This is the connection string for connecting to the Azure SQL database we are getting from the environment variables
#MSSQL='Driver={ODBC Driver 17 for SQL Server};Server=localhost;Database=<DATABASE NAME>;Uid=<USER>;Pwd=<PASSWORD>;Encrypt=No;Connection Timeout=30;'

# Retrieve the database connection string from environment variables
dbconnectstring = os.getenv('MSSQL')

# Establish a connection to the Azure SQL database using the connection string
conn = pyodbc.connect(dbconnectstring)

def get_embedding(text, model):
    # Prepare the input text by truncating it or preprocessing if needed
    truncated_text = text

    # Make an HTTP POST request to a local server API to get embeddings for the input text
    res = requests.post(url='http://localhost:11434/api/embeddings',
                        json={
                            'model': model, 
                            'prompt': truncated_text
                        }
    )
    
    # Extract the embedding from the JSON response
    embeddings = res.json()['embedding']
    
    # Convert the embedding list to a numpy array
    embeddings = np.array(embeddings)    
    
    # Normalize the embeddings array to unit length
    nc = normalize([embeddings])
        
    # Convert the numpy array back to JSON string using a custom encoder that handles numpy types
    return json.dumps(nc[0], cls=NpEncoder )

def update_database(id, title_vector, content_vector):
    # Obtain a new cursor from the database connection
    cursor = conn.cursor()

    # Convert numpy array embeddings to string representations for storing in SQL
    title_vector_str = str(title_vector)
    content_vector_str = str(content_vector)

    # SQL query to update the embeddings in the database
    cursor.execute("""
        UPDATE wikipedia_articles_embeddings
        SET title_vector = ?, content_vector = ?
        WHERE id = ?
    """, (title_vector_str, content_vector_str, id))
    conn.commit()  # Commit the transaction to the database

def embed_and_update(model):
    # Get a cursor from the database connection
    cursor = conn.cursor()
    
    # Retrieve articles from the database that need their embeddings updated
    cursor.execute("select id, title, text from wikipedia_articles_embeddings where title_vector = '' or content_vector = '' order by id desc")
    
    for row in cursor.fetchall():
        id, title, text = row
        
        # Get embeddings for title and text
        title_vector = get_embedding(title, model)
        content_vector = get_embedding(text, model)
        
        # Print the progress with length of the generated embeddings
        print(f"Embedding article {id} - {title}", "len:", len(title_vector), len(content_vector))
        
        # Update the database with new embeddings
        update_database(id, title_vector, content_vector)

# Call the function to update embeddings using the 'nomic-embed-text' model
embed_and_update('nomic-embed-text')

# To use another model, uncomment and call the function with the different model name
# embed_and_update('mxbai-embed-large')

I’ve also created a GitHub repository with these codes; you can access it at this link.
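The script above imports a custom `NpEncoder` from `db.utils`, which is not shown in the post. A typical implementation that matches how the script uses it (this is an assumed sketch, not the repository's exact code) looks like this:

```python
import json
import numpy as np

class NpEncoder(json.JSONEncoder):
    """JSON encoder that converts numpy scalars and arrays to native Python types."""
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)

# Without the encoder, json.dumps would raise TypeError on a numpy array:
vector = np.array([0.25, 0.5, 0.25])
print(json.dumps(vector, cls=NpEncoder))
```

This lets `get_embedding` serialize the normalized numpy vector directly into the JSON string that gets stored in the database.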

Download the pre-calculated embeddings using OpenAI’s text-embedding-ada-002

The pre-calculated embeddings generated with OpenAI’s text-embedding-ada-002, for both the title and the body of a selection of Wikipedia articles, are made available by OpenAI here:

https://cdn.openai.com/API/examples/data/vector_database_wikipedia_articles_embedded.zip

Once you have successfully embedded your text, I recommend exploring two of my blog posts that detail how to create a vector database for prompting and searching. These posts provide step-by-step guidance on utilizing Azure SQL alongside cosine similarity and KMeans algorithms for efficient and effective data retrieval.

Azure SQL Database now has native vector support

You can sign up for the private preview at this link.

This article, published by Davide Mauri and Pooja Kamath during this week’s Microsoft Build event, provides all the information.

Announcing EAP for Vector Support in Azure SQL Database – Azure SQL Devs’ Corner (microsoft.com)

Conclusion

Embedding text locally using models like Ollama presents a cost-effective, scalable solution for handling large volumes of data. By integrating these embeddings into Azure SQL databases, organizations can leverage generative AI to enhance their querying capabilities, making extracting meaningful insights from vast datasets easier. The outlined process ensures significant cost savings and enhances data security and processing efficiency.

This approach is a technical exercise and a strategic asset that can drive better decision-making and innovation across various data-intensive applications.

That’s it for today!

Sources

GitHub – Azure-Samples/azure-sql-db-vectors-kmeans: Use KMeans clustering to speed up vector search in Azure SQL DB

Vector Similarity Search with Azure SQL database and OpenAI | by Davide Mauri | Microsoft Azure | Medium

Ollama

How to Install and Run Ollama with Docker: A Beginner’s Guide – Collabnix

Leveraging KMeans Compute Node for Text Similarity Analysis through Vector Search in Azure SQL – Tech News & Insights (lawrence.eti.br)

Navigating Vector Operations in Azure SQL for Better Data Insights: A Guide How to Use Generative AI to Prompt Queries in Datasets – Tech News & Insights (lawrence.eti.br)

GitHub – LawrenceTeixeira/embedyourlocalmodel