Leveraging KMeans Compute Node for Text Similarity Analysis through Vector Search in Azure SQL

In the ever-evolving landscape of data management and retrieval, the ability to efficiently search through high-dimensional vector data has become a cornerstone for many modern applications, including recommendation systems, image recognition, and natural language processing tasks. Azure SQL Database (DB), in combination with KMeans clustering, is at the forefront of this revolution, offering an innovative solution that significantly enhances vector search capabilities.

What is KMeans?

KMeans is a widely used clustering algorithm in machine learning and data mining. It’s a method for partitioning an N-dimensional dataset into K distinct, non-overlapping clusters. Each cluster is defined by its centroid, which is the mean of the points in the cluster. The algorithm aims to minimize the variance within each cluster, effectively grouping the data points into clusters based on their similarity.

Let’s walk through an example to understand how it works. Import the required modules, then follow the rest of the code to see how K-Means clustering is implemented from scratch.

Python
#Loading the required modules
 
import numpy as np
from scipy.spatial.distance import cdist 
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
 
#Defining our function 
def kmeans(x, k, no_of_iterations):
    #Randomly choosing k data points as the initial centroids
    idx = np.random.choice(len(x), k, replace=False)
    centroids = x[idx, :] #Step 1

    #Finding the distance between centroids and all the data points
    distances = cdist(x, centroids, 'euclidean') #Step 2

    #Assigning each point to the centroid with the minimum distance
    points = np.argmin(distances, axis=1) #Step 3

    #Repeating the above steps for a defined number of iterations
    #Step 4
    for _ in range(no_of_iterations):
        centroids = []
        for cluster in range(k):
            #Updating centroids by taking the mean of the cluster they belong to
            temp_cent = x[points == cluster].mean(axis=0)
            centroids.append(temp_cent)

        centroids = np.vstack(centroids) #Updated centroids

        distances = cdist(x, centroids, 'euclidean')
        points = np.argmin(distances, axis=1)

    return points
 
 
#Load Data
data = load_digits().data
pca = PCA(2)
  
#Transform the data
df = pca.fit_transform(data)
 
#Applying our function
label = kmeans(df, 10, 1000)

#Visualize the results

u_labels = np.unique(label)
for i in u_labels:
    plt.scatter(df[label == i, 0], df[label == i, 1], label=i)
plt.legend()
plt.show()

How does Voronoi Cell-based Vector Search Optimization work?

Vector Search Optimization via Voronoi Cells is an advanced technique to enhance the efficiency and accuracy of searching for similar vectors in a high-dimensional space. This method is particularly relevant in the context of Approximate Nearest Neighbor (ANN) searches, which aim to quickly find vectors close to a given query vector without exhaustively comparing the query vector against every other vector in the dataset.

Understanding Voronoi Cells

To grasp the concept of vector search optimization via Voronoi Cells, it’s essential to understand what Voronoi diagrams are. A Voronoi diagram partitions a plane into regions based on the distance to points in a specific subset of the plane. Each region (Voronoi cell) is defined so that any point within the region is closer to its corresponding “seed” point than to any other. These seed points are typically referred to as centroids in the context of vector search.

Application in Vector Search

Voronoi Cells can efficiently partition the high-dimensional space into distinct regions in vector search. Each area represents a cluster of vectors closer to its centroid than any other centroid. This approach is based on the assumption that vectors within the same Voronoi cell are more likely to be similar to each other than to vectors in different cells.

The Process

  1. Centroid Initialization: Like KMeans clustering, the process begins by selecting a set of initial centroids in the high-dimensional space.
  2. Voronoi Partitioning: The space is partitioned into Voronoi cells, each associated with one centroid. This partitioning is done such that every vector in the dataset is assigned to the cell of the closest centroid.
  3. Indexing and Search Optimization: Once the high-dimensional space is partitioned, an inverted file index or a similar data structure can be created to map each centroid to the list of vectors (or pointers to them) within its corresponding Voronoi cell. During a search query, instead of comparing the query vector against all vectors in the dataset, the search can be limited to vectors within the most relevant Voronoi cells, significantly reducing the search space and time.
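The three steps above can be sketched in a few lines of NumPy. This is a minimal, self-contained illustration (not the Azure SQL implementation): the centroids come from a few Lloyd iterations, the inverted file index is a plain dict, and the search probes only the cells whose centroids are closest to the query.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 8))  # 500 vectors in 8-D space

# Step 1: centroid initialization (random points from the dataset)
k = 16
centroids = data[rng.choice(len(data), k, replace=False)]

# A few Lloyd (KMeans) iterations to settle the centroids
for _ in range(10):
    assign = np.argmin(np.linalg.norm(data[:, None] - centroids[None], axis=2), axis=1)
    for c in range(k):
        members = data[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)

# Step 2: Voronoi partitioning -> inverted file index (centroid id -> vector ids)
assign = np.argmin(np.linalg.norm(data[:, None] - centroids[None], axis=2), axis=1)
ivf = {c: np.flatnonzero(assign == c) for c in range(k)}

# Step 3: search only the n_probe most relevant cells
def ann_search(query, n_probe=3):
    nearest_cells = np.argsort(np.linalg.norm(centroids - query, axis=1))[:n_probe]
    candidates = np.concatenate([ivf[c] for c in nearest_cells])
    dists = np.linalg.norm(data[candidates] - query, axis=1)
    return candidates[np.argmin(dists)]

query = rng.normal(size=8)
approx = ann_search(query, n_probe=3)
exact = np.argmin(np.linalg.norm(data - query, axis=1))
```

Probing all k cells makes the search exact; probing fewer trades a little recall for a much smaller search space.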

Advantages

  • Efficiency: By reducing the search space to a few relevant Voronoi cells, the algorithm can achieve faster search times than brute-force searches.
  • Scalability: This method scales better with large datasets, as the overhead of partitioning the space and indexing is compensated by the speedup in query times.
  • Flexibility: The approach can be adapted to various data types and dimensionalities by adjusting the centroid selection and cell partitioning methods.

Introducing the Project of Azure SQL DB and KMeans Compute Node

Azure SQL DB has long been recognized for its robustness, scalability, and security as a cloud database service. By integrating KMeans clustering—a method used to partition data into k distinct clusters based on similarity—the capability of Azure SQL DB is expanded to include advanced vector search operations.

The KMeans Compute Node is a specialized component that handles the compute-intensive task of clustering high-dimensional data. This integration optimizes the performance of vector searches and simplifies the management and deployment of these solutions.

How It Works

  1. Data Storage: Vector data is stored in Azure SQL DB, leveraging its high availability and scalable storage solutions. This setup ensures that data management adheres to best practices regarding security and compliance.
  2. Vector Clustering: The KMeans Compute Node performs clustering on the vector data. This process groups vectors into clusters based on similarity, significantly reducing the search space for query operations.
  3. Search Optimization: Approximate Nearest Neighbor (ANN) searches can be executed more efficiently with vectors organized into clusters. Queries are directed towards relevant clusters rather than the entire dataset, enhancing search speed and accuracy.
  4. Seamless Integration: The entire process is streamlined through Azure Container Apps, which host the KMeans Compute Node. This setup provides a scalable, serverless environment that dynamically adjusts resources based on demand.

Advantages of the Azure SQL DB and KMeans Approach

  • Performance: By reducing the complexity of vector searches, response times are significantly improved, allowing for real-time search capabilities even in large datasets.
  • Scalability: The solution effortlessly scales with your data, thanks to Azure’s cloud infrastructure. This ensures that growing data volumes do not compromise search efficiency.
  • Cost-Effectiveness: Azure SQL DB offers a cost-efficient storage solution, while using Azure Container Apps for the KMeans Compute Node optimizes resource utilization, reducing overall expenses.
  • Simplicity: The integration simplifies the architecture of vector search systems, making it easier to deploy, manage, and maintain these solutions.

Use Cases and Applications

The Azure SQL DB and KMeans Compute Node solution is versatile, supporting a wide range of applications:

1-Analysis of Similarity in Court Decisions
  • Legal Research and Precedents: By analyzing the similarities in court decisions, legal professionals can efficiently find relevant precedents to support their cases. This application can significantly speed up legal research, ensuring lawyers can access comprehensive and pertinent case law that aligns closely with their current matters.
2-Personalized Medicine and Genomic Data Analysis
  • Drug Response Prediction: Leveraging vector search to analyze genomic data allows researchers to predict how patients might respond to different treatments. By clustering patients based on genetic markers, medical professionals can tailor treatments to individual genetic profiles, advancing the field of personalized medicine.
3-Market Trend Analysis
  • Consumer Behavior Clustering: Businesses can cluster consumer behavior data to identify emerging market trends and tailor their marketing strategies accordingly. Vector search can help analyze high-dimensional data, such as purchase history and online behavior, to segment consumers into groups with similar preferences and behaviors.
4-Cybersecurity Threat Detection
  • Anomaly Detection in Network Traffic: Vector search can monitor network traffic, identifying unusual patterns that may indicate cybersecurity threats. By clustering network events, it’s possible to quickly isolate and investigate anomalies, enhancing an organization’s ability to respond to potential security breaches.
5-Educational Content and Learning Style Personalization
  • Matching Educational Materials to Learning Styles: By clustering educational content and student profiles, educational platforms can personalize learning experiences. Vector search can identify the most suitable materials and teaching methods for different learning styles, improving student engagement and outcomes.
6-Environmental Monitoring and Conservation Efforts
  • Species Distribution Modeling: Vector search can analyze environmental data to model the distribution of various species across different habitats. This information is crucial for conservation planning, helping identify critical areas for biodiversity conservation.
7-Supply Chain Optimization
  • Predictive Maintenance and Inventory Management: In supply chain management, vector search can cluster equipment performance data to predict maintenance needs and optimize inventory levels. This application ensures that operations run smoothly, with minimal downtime and efficient use of resources.
8-Creative Industries and Content Creation
  • Similarity Analysis in Music and Art: Artists and creators can use vector search to analyze patterns and themes in music, art, and literature. This approach can help understand influences, trends, and the evolution of styles over time, providing valuable insights for new creations.

    Architecture of the project

    The project’s architecture is straightforward as it comprises a single container that exposes a REST API to build and rebuild the index and search for similar vectors. The container is deployed to Azure Container Apps and uses Azure SQL DB to store the vectors and the clusters.

    The idea is that compute-intensive operations, like calculating KMeans, can be offloaded to a dedicated container that is easy to deploy, quick to start, and offers serverless scaling for the best performance/cost ratio.

    Once the container runs, it is entirely independent of the database and can work without affecting database performance. Even better, if more scalability is needed, data can be partitioned across multiple container instances to achieve parallelism.

    Once the model has been trained, the identified clusters and centroids – and thus the IVF index – are saved back to the SQL DB so that they can be used to perform ANN search on the vector column without the need for the container to remain active. The container can be stopped entirely as SQL DB is completely autonomous now.

    The data is stored back in SQL DB using the following tables:

    • [$vector].[kmeans]: stores information about created indexes
    • [$vector].[<table_name>$<column_name>$clusters_centroids]: stores the centroids
    • [$vector].[<table_name>$<column_name>$clusters]: the IVF structure, associating each centroid to the list of vectors assigned to it

    To make searching even more accessible, a function is also created:

    • [$vector].[find_similar$<table_name>$<column_name>](<vector>, <probe cells count>, <similarity threshold>): the function to perform ANN search

    The function calculates the dot product, which is equivalent to cosine similarity when the vectors are normalized to unit length.
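    A quick NumPy check of that equivalence (a standalone illustration, not part of the project code): once two vectors are scaled to unit length, their dot product equals their cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = rng.normal(size=1536), rng.normal(size=1536)  # same dimensionality as the embeddings used later

# Cosine similarity of the raw vectors
cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Dot product of the normalized vectors
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
dot = a_unit @ b_unit
```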

    Also, the function:

    • [$vector].[find_cluster$<table_name>$<column_name>](<vector>): find the cluster of a given vector

    is provided as it is needed to insert new vectors into the IVF index.
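    What find_cluster does can be sketched with a simplified Python stand-in (not the actual SQL implementation): inserting a new vector into the IVF index means finding its nearest centroid and appending the vector's id to that centroid's list.

```python
import numpy as np

# Toy index: 4 centroids in 3-D and an inverted file (centroid id -> vector ids)
centroids = np.array([[0., 0., 0.],
                      [10., 0., 0.],
                      [0., 10., 0.],
                      [0., 0., 10.]])
ivf = {0: [7], 1: [3, 12], 2: [], 3: [5]}

def find_cluster(vector):
    # Nearest centroid by Euclidean distance
    return int(np.argmin(np.linalg.norm(centroids - vector, axis=1)))

def insert_vector(vector_id, vector):
    cluster = find_cluster(vector)
    ivf[cluster].append(vector_id)  # the new vector is now searchable
    return cluster

new_cluster = insert_vector(42, np.array([9.0, 1.0, 0.5]))
```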

    Implementation

    The project is divided into two GitHub repositories: one with the Python source code for the KMeans compute node, created by Davide Mauri, Principal Product Manager for Azure SQL DB at Microsoft, and the other with the example app I created to test the project.

    1. Azure SQL DB Vector – KMeans Compute Node

    The KMeans model from scikit-learn is executed within a container and exposed as a REST endpoint. The APIs exposed by the container are:

    • Server Status: GET /
    • Build Index: POST /kmeans/build
    • Rebuild Index: POST /kmeans/rebuild

    Both the Build and Rebuild APIs are asynchronous. The Server Status API can be used to check the status of the build process.

    Build Index

    To build an index from scratch, the Build API expects the following payload:

    JSON
    {
      "table": {
        "schema": <schema name>,
        "name": <table name>
      },
      "column": {
        "id": <id column name>,
        "vector": <vector column name>
      },
      "vector": {
        "dimensions": <dimensions>
      }
    }

    Using the sample news dataset, the payload would be:

    JSON
    POST /kmeans/build
    {
        "table": {
            "schema": "dbo",
            "name": "news"
        },
        "column": {
            "id": "article_id",
            "vector": "content_vector"
        },
        "vector": {
            "dimensions": 1536
        }
    }

    The API would verify that the request is correct and then start the build process asynchronously, returning the ID assigned to the index being created:

    JSON
    {
      "id": 1,
      "status": {
        "status": {
          "current": "initializing",
          "last": "idle"
        },
        "index_id": "1"
      }
    }

    The API will return an error if an index on the same table and vector column already exists. If you want to force the creation of a new index over the existing one, you can use the force option:

    POST /kmeans/build?force=true
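    Putting the Build call together in Python could look like the sketch below. It uses only the standard library; the base URL and call site are placeholders, so the request is wrapped in a function rather than executed here.

```python
import json
import urllib.request

def build_index(base_url, payload, force=False):
    """POST the payload to the KMeans compute node's Build API."""
    url = f"{base_url}/kmeans/build" + ("?force=true" if force else "")
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # returns the index id and status
        return json.load(resp)

# Payload for the news table shown above
payload = {
    "table": {"schema": "dbo", "name": "news"},
    "column": {"id": "article_id", "vector": "content_vector"},
    "vector": {"dimensions": 1536},
}

# Hypothetical call (the host name is a placeholder):
# build_index("https://<YOUR APP>.azurecontainerapps.io", payload, force=True)
```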

    Rebuild Index

    If you need to rebuild an existing index, you can use the Rebuild API. The API doesn’t need a payload, as it will use the existing index definition. Just like the build process, the rebuild process is also asynchronous. The index to be rebuilt is specified via the URL path:

    POST /kmeans/rebuild/<index id>

    For example, to rebuild the index with id 1:

    POST /kmeans/rebuild/1

    Query API Status

    The status of the build process can be checked using the Server Status API:

    GET /

    And you’ll get the current status and the last status report:

    JSON
    {
      "server": {
        "status": {
          "current": "building",
          "last": "initializing"
        },
        "index_id": 1
      },
      "version": "0.0.1"
    }

    Checking the previous status is helpful to understand whether an error occurred during the build process.

    You can also check the index build status by querying the [$vector].[kmeans] table.

    2. Leveraging KMeans Compute Node for Text Similarity Analysis through Vector Search in Azure SQL

    Search for similar vectors

    Once you have built the index, you can search for similar vectors. Using the sample dataset, you can search for the 10 news articles most similar to ‘How Generative AI Is Transforming Today’s And Tomorrow’s Software Development Life Cycle.’ using the find_similar function created as part of the index build process. For example:

    SQL
    /*
        This SQL code is used to search for similar news articles based on a given input using vector embeddings.
        It makes use of an external REST endpoint to retrieve the embeddings for the input text.
        The code then calls the 'find_similar$news$content_vector' function to find the top 10 similar news articles.
        The similarity is calculated based on the dot product of the embeddings.
        The result is ordered by the dot product in descending order.
    */
    
    declare @response nvarchar(max);
    declare @payload nvarchar(max) = json_object('input': 'How Generative AI Is Transforming Today’s And Tomorrow’s Software Development Life Cycle.');
    
    exec sp_invoke_external_rest_endpoint
        @url = 'https://<YOUR APP>.openai.azure.com/openai/deployments/embeddings/embeddings?api-version=2023-03-15-preview',
        @credential = [https://<YOUR APP>.openai.azure.com],
        @payload = @payload,
        @response = @response output;
    
    select top 10 r.published, r.category, r.author, r.title, r.content, r.dot_product
    from [$vector].[find_similar$news$content_vector](json_query(@response, '$.result.data[0].embedding'), 50, 0.80) AS r
    order by dot_product desc

    The find_similar function takes three parameters:

    • the vector to search for
    • the number of clusters to search in
    • the similarity threshold

    The similarity threshold filters out vectors that are not similar enough to the query vector; the higher the threshold, the more similar the returned vectors will be. The number of clusters to search in controls the accuracy/speed trade-off: searching more clusters yields more accurate results, while searching fewer clusters makes the search faster, at the risk of missing some similar vectors.

    Explore the latest app I’ve created, tailored to help you craft prompts and assess the results using my updated news dataset. Click here to start discovering the app’s features. In my app, I search all 50 clusters and use a similarity threshold of 80%.

    It’s important to understand that you can also search for multiple articles simultaneously and get similar results.

    This post connects to another one where I discuss ‘Navigating Vector Operations in Azure SQL for Better Data Insights: A Guide on Using Generative AI to Prompt Queries in Datasets’, which takes a similar approach but uses cosine similarity instead.

    Conclusion

    Integrating Azure SQL DB with KMeans Compute Node represents a significant advancement in vector search, providing an efficient, scalable, and cost-effective solution. This innovative approach to managing and querying high-dimensional data stands as a beacon for businesses wrestling with the intricacies of big data. By leveraging such cutting-edge technologies, organizations are better positioned to unlock their data’s full potential, uncovering insights previously obscured by the sheer volume and complexity of the information. This, in turn, allows for delivering superior services and products more closely aligned with user needs and preferences.

    Moreover, adopting Azure’s robust infrastructure and the strategic application of the KMeans clustering algorithm underscores a broader shift towards more intelligent, data-driven decision-making processes. As companies strive to remain competitive in an increasingly data-centric world, the ability to swiftly and accurately sift through vast datasets to find relevant information becomes paramount. Azure SQL DB and KMeans Compute Node facilitate this, enabling businesses to improve operational efficiencies, innovate, and personalize their offerings, enhancing customer satisfaction and engagement.

    Looking ahead, the convergence of Azure SQL DB and KMeans Compute Node is setting the stage for a new era in data management and retrieval. As this technology continues to evolve and mature, it promises to open up even more possibilities for deep analytical insights and real-time data interaction. This revolution in vector search is not just about managing data more effectively; it’s about reimagining what’s possible with big data, paving the way for future innovations that will continue to transform industries and redefine user experiences. With Azure at the forefront, the future of data management is bright, marked by an era of unparalleled efficiency, scalability, and insight.

    That’s it for today!

    Sources

    Azure-Samples/azure-sql-db-vectors-kmeans: Use KMeans clustering to speed up vector search in Azure SQL DB (github.com)

    K-Means Clustering From Scratch in Python [Algorithm Explained] – AskPython

    Interactive Data Analysis: Chat with Your Data in Azure SQL Database Using Vanna AI

    In an era where data is the new gold, the ability to effectively mine, understand, and utilize this valuable resource determines the success of businesses. Traditional data analysis methods often create a bottleneck due to their complexity and the need for specialized skills. This is where the groundbreaking integration of Vanna AI with Azure SQL Database heralds a new dawn. Inspired by the pivotal study “AI SQL Accuracy: Testing different LLMs + context strategies to maximize SQL generation accuracy,” this article explores how Vanna AI is not just an innovation but a revolution in data analytics. It simplifies complex data queries into conversational language, making data analysis accessible to all, irrespective of their technical prowess.

    Understanding Vanna AI: The Next Frontier in Data Analytics

    Vanna AI emerges as a pivotal innovation in the rapidly evolving landscape of artificial intelligence and data management. But what exactly is Vanna AI, and why is it becoming a game-changer in data analytics? Let’s delve into the essence of Vanna AI and its transformative impact.

    What is Vanna AI?

    Vanna AI is an advanced AI-driven tool designed to bridge the gap between complex data analysis and user-friendly interaction. At its core, Vanna AI is a sophisticated application of Large Language Models (LLMs) optimized for interacting with databases. It leverages the power of AI to translate natural language queries into precise SQL commands, effectively allowing users to “converse” with their databases.

    Key Features and Capabilities

    1. Natural Language Processing (NLP): Vanna AI excels at understanding and processing human language, enabling users to ask questions in plain English and receive accurate data insights.
    2. Contextual Awareness: One of the standout features of Vanna AI is its ability to understand a specific database’s structure and nuances contextually. This includes schema definitions, documentation, and historical queries, significantly enhancing the accuracy of SQL generation.
    3. Adaptability Across Databases: Vanna AI is not limited to a single type of database. Its versatility allows it to be integrated with various database platforms, including Azure SQL Database, enhancing its applicability across different business environments.
    4. Ease of Use: By simplifying the process of data querying, Vanna AI democratizes data analysis, making it accessible to non-technical users, such as business analysts, marketing professionals, and decision-makers.

    How Vanna works

    Vanna works in two easy steps – train a RAG “model” on your data and then ask questions that will return SQL queries that can be set up to run on your database automatically.

    1. vn.train(...): Train a RAG “model” on your data. These methods add to the reference corpus below.
    2. vn.ask(...): Ask questions. This will use the reference corpus to generate SQL queries that can be run on your database.
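    In code, the two steps look roughly like the sketch below. It follows Vanna's documented train/ask pattern, but the model name, API key, and training snippets are placeholders, so the calls are wrapped in a function rather than run here.

```python
def ask_adventure_works(question):
    # Requires: pip install vanna  (import kept local so the sketch stays inert)
    from vanna.remote import VannaDefault

    vn = VannaDefault(model="my-model", api_key="MY-VANNA-API-KEY")  # placeholders

    # Step 1: train a RAG "model" -- each call adds to the reference corpus
    vn.train(ddl="CREATE TABLE SalesLT.Customer (CustomerID int, FirstName nvarchar(50))")
    vn.train(documentation="AdventureWorksLT stores customers, products and sales orders.")
    vn.train(sql="SELECT TOP 10 FirstName FROM SalesLT.Customer;")

    # Step 2: ask a question -- the corpus grounds the generated SQL
    return vn.ask(question)
```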

    Empowering SQL Generation with AI

    The challenge in traditional data analysis has been the necessity of SQL expertise. Vanna AI disrupts this norm by enabling users to frame queries in plain language and translate them into SQL. This approach democratizes data access and accelerates decision-making by providing quicker insights.

    The research compared the efficacy of various Large Language Models (LLMs) like Google Bison, GPT 3.5, GPT 4 Turbo, and Llama 2 in generating SQL. While GPT 4 excelled in overall performance, the study highlighted that other LLMs could achieve comparable accuracy with the proper context.

    Presenting the Practical Application I Developed for Your Evaluation.

    As a testament to Vanna AI’s practical application, I created an example app for the Microsoft Adventure Works database that you can test yourself to see how it works; it is available at this URL. This application exemplifies how AI can transform data interaction. It allows users to converse with the Adventure Works database in natural language, simplifying complex data queries and making data analysis more approachable and efficient.

    Exploring the AdventureWorksLT Schema: An Overview of Database Relationships and Structure

    Here is a concise introduction to the Adventure Work database. This will help you better understand the database structures and tables, enabling you to make more effective inquiries in the test application I developed.

    In the dbo schema, there is an ErrorLog table designed to capture error information, with fields such as ErrorTime, UserName, and ErrorMessage. In the SalesLT schema, the CustomerAddress table bridges customers to addresses, suggesting a many-to-many relationship: one customer can have multiple addresses, and one address can be associated with multiple customers.

    The SalesLT schema is more complex and includes several interconnected tables:

    • Product: Contains product details, such as name, product number, color, and size.
    • ProductCategory: Organizes products into hierarchical categories.
    • ProductModel: Defines models for products, which could include multiple products under a single model.
    • ProductModelProductDescription: Links product models to their descriptions, indicating a many-to-many relationship between models and descriptions, facilitated by a culture identifier.
    • ProductDescription: Stores descriptions for products in different languages (indicated by the Culture field).
    • Address: Holds address information and is related to customers through the CustomerAddress table.
    • Customer: Holds customer information such as name, contact details, and password hashes for customer accounts.
    • SalesOrderHeader: Captures the header information of sales orders, including details like order date, due date, and total due amount.
    • SalesOrderDetail: Provides line item details for each sales order, such as quantity and price.

    The schema includes primary keys (PK) to uniquely identify each entry in a table, foreign keys (FK) to establish relationships between tables, and indexes (U1, U2) to improve query performance on the database.

    Explore the Source Code of the App I Developed

    To develop an app yourself using the Azure SQL database, click this link to access my GitHub repository containing all the source code.

    Conclusion

    As we stand at the cusp of a data revolution, Vanna AI’s integration with Azure SQL Database and its practical embodiment in applications like the app I created, for example, for the Microsoft Adventure Works database, represents more than technological advancement; they signify a paradigm shift in data interaction and analysis. This evolution marks the transition from data being experts’ exclusive domain to becoming a universal language understood and utilized across various business sectors. The journey of data analytics, powered by AI and made user-friendly through Vanna AI, is not just about technological transformation; it’s about empowering organizations and individuals with the tools to unlock the true potential of their data. Stay connected with the evolving world of Vanna AI and discover how this revolutionary tool can redefine your approach to data, paving the way for a more informed, efficient, and data-driven future.

    That’s it for today!

    Vanna.AI – Personalized AI SQL Agent

    Navigating the Future of AI with Embedchain’s RAG Framework: The Power of Embedchain’s Vector Database

    Imagine you’re an adventurer exploring an unknown land full of mysteries and opportunities. That’s similar to navigating the evolving landscape of artificial intelligence (AI). Now imagine you have a magical guidebook called Embedchain, offering detailed maps and tools to make your journey smoother and more exciting. Embedchain is an innovative open-source retrieval-augmented generation (RAG) framework that is like a Swiss Army knife for AI enthusiasts. It’s designed to help you quickly create and deploy AI applications, whether you’re a seasoned explorer (developer) or just starting out. It’s about making the complex world of AI as simple and enjoyable as building a castle out of toy blocks.

    First, let’s explain what RAG is

    Retrieval-augmented generation (RAG) is a technique used in natural language processing that combines the powers of both retrieval (searching for relevant information) and generation (creating coherent text). It’s designed to improve the quality and relevance of the generated text in models like chatbots or question-answering systems.

    Here’s how RAG works

    1. Retrieval: When the model receives a prompt or a question, it first searches a large dataset or database to find relevant documents or text snippets. This is similar to how you might use a search engine to find information on a topic.
    2. Augmentation: The retrieved texts are then fed into a generative model. This model, often a large language model like GPT-4, PaLM 2, Claude, LLaMA, or BERT, uses the information from the retrieved texts to better understand the context and nuances of the topic.
    3. Generation: Finally, the model generates a response or completes the text, incorporating the relevant information retrieved. The model can provide more accurate, informative, and contextually relevant answers by grounding its responses in real-world information.
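    The three stages can be sketched with a toy pipeline: a keyword-overlap retriever standing in for a real vector index, and a stub standing in for the LLM.

```python
# Toy corpus standing in for a document database
corpus = {
    "doc1": "RAG combines retrieval of relevant documents with text generation.",
    "doc2": "KMeans partitions vectors into k clusters around centroids.",
    "doc3": "Azure SQL Database is a managed cloud relational database.",
}

def retrieve(question, top_k=1):
    # Step 1: score documents by word overlap with the question
    # (a stand-in for embedding-based similarity search)
    q_words = set(question.lower().split())
    scored = sorted(corpus,
                    key=lambda d: len(q_words & set(corpus[d].lower().split())),
                    reverse=True)
    return scored[:top_k]

def augment(question, doc_ids):
    # Step 2: prepend the retrieved text to the prompt
    context = " ".join(corpus[d] for d in doc_ids)
    return f"Context: {context}\nQuestion: {question}"

def generate(prompt):
    # Step 3: a real system would call an LLM here; we just echo the grounding
    return "Answer grounded in: " + prompt.split("\n")[0]

docs = retrieve("How does retrieval augmented generation work?")
answer = generate(augment("How does retrieval augmented generation work?", docs))
```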

    Benefits of RAG

    • Improved Accuracy: The model can provide more factual and up-to-date information by basing its responses on retrieved documents.
    • Contextual Understanding: RAG helps models understand the context better by providing background information.
    • Versatility: It’s useful for various applications, from chatbots and customer service to content creation.

    Challenges

    • Quality of Sources: The output is only as good as the retrieved documents. The final output will suffer if the retrieval step fetches irrelevant or poor-quality information.
    • Complexity: Implementing RAG can be technically complex and resource-intensive, requiring powerful models and large, well-curated datasets.

    What is Embedchain?

    Embedchain is a bit like a bright and friendly robot that’s great at organizing things. Imagine you have a massive pile of Lego blocks in all shapes and sizes but want to build a specific model. Embedchain is the friend who sorts through the pile, finds exactly what you need, and hands it to you when needed. It does this for AI by handling various types of unstructured data, breaking them into manageable chunks, generating relevant ’embeddings’ (think of these as intelligent labels that help the computer understand the data), and then storing them in a ‘vector database’ for easy retrieval. For example, if you’re building an AI to help students learn history, Embedchain can take historical texts, understand the crucial parts, and help the AI use this information to answer student questions accurately.
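    The pipeline Embedchain automates can be pictured with a toy version (an illustration of the idea, not Embedchain's actual code): split text into chunks, turn each chunk into a crude 'embedding', store the embeddings in an in-memory 'vector database', and retrieve the closest chunk for a question.

```python
from collections import Counter
import math

def chunk(text, size=8):
    # Break the text into chunks of `size` words
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    # A crude "embedding": a word-count vector (real systems use neural embeddings)
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

# "Vector database": chunk text -> embedding
history = ("The Roman Empire reached its greatest extent under Trajan. "
           "The empire later split into western and eastern halves.")
vector_db = {c: embed(c) for c in chunk(history)}

def query(question):
    # Retrieve the stored chunk whose embedding is closest to the question
    q = embed(question)
    return max(vector_db, key=lambda c: cosine(q, vector_db[c]))
```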

    Key Features of Embedchain:

    1. Data Processing: It automatically recognizes the data type, processes it, and creates embeddings for critical parts of the data.
    2. Data Storage: Users can choose where to store processed data in a vector database.
    3. Diverse APIs: Embedchain offers APIs that enable users to extract contextual information, find precise answers, or engage in interactive chat conversations.
    4. User-Friendly for Varied Expertise Levels: It is designed for a wide range of users, from AI professionals to beginners, offering ease of use and extensive customization options.
    5. Simplified RAG Pipeline Management: The framework handles the complexities of developing an RAG pipeline, such as integrating and indexing data from diverse sources, determining optimal data chunking methods, and implementing efficient data storage.
    6. Tailored Application Development: Users can tailor the system to meet specific needs, whether for simple projects or complex AI applications.

    Who is Embedchain for?

    Embedchain is like a universal toolset that’s helpful for a wide range of people. Whether you’re a data scientist, a machine learning engineer, a college student, an independent developer, or someone who loves tinkering with technology, Embedchain has something for you. It’s designed to be user-friendly, allowing beginners to build sophisticated AI applications with just a few lines of code. At the same time, it’s also highly customizable, letting experts tweak and fine-tune various aspects to fit their exact needs. Think of it as a set of building blocks that can be as simple or complex as you want them to be. For instance, a high school student might use Embedchain to create a simple chatbot for a school project, while a professional developer might use it to build a complex AI-powered system for analyzing scientific data.

    Why Use Embedchain?

    Using Embedchain is like having a personal assistant who’s good at jigsaw puzzles. Developing an AI involves combining many different data and processes, which can be complicated. Embedchain simplifies this by handling the tough stuff for you. It automatically recognizes and processes data, creates embeddings, and decides where to store this information. When your AI needs to answer a question or make a decision, Embedchain quickly finds the relevant information and helps the AI understand it. This means you can focus on the creative and essential parts of building your AI, like deciding what it should do and how it should interact with people. For example, if you’re creating an AI to provide cooking tips, Embedchain can help you understand and use a vast collection of recipes, cooking techniques, and flavor profiles so it can give you the best advice whether you’re making breakfast or planning a gourmet dinner.

    How does Embedchain work?

    The image outlines a workflow for an AI-powered application using Embedchain’s vector database system. Here’s how it works, explained in a simplified way:

    Understanding the Flow:

    1. OpenAI API: This is the central hub where everything starts. It connects to two key components:

      • gpt-3.5-turbo: This is likely a model for generating responses or completing tasks based on user input.
      • text-embedding-ada-002: This component is responsible for turning text into numerical representations, called embeddings, which the computer can understand and process.
    2. Compute chunk embedding: This process involves breaking down large pieces of text into smaller, more manageable parts, called chunks. Each chunk is then transformed into an embedding by the text-embedding model.

    3. Vector Database: Think of this like a big, smart library where all the chunk embeddings are stored. It’s organized in such a way that it’s easy to find and retrieve the chunks later when needed.

    4. Database Interface: This acts as the librarian, helping users to upload their customized data (in chunks) into the Vector Database and retrieve them when needed.

    5. Query Interface: This is where users interact with the system. They ask questions, and the Query Interface translates those questions into embeddings, much like it does with the data chunks.

    6. Compute question embedding: When a user asks a question, the Query Interface calculates the embedding for this question to understand what’s being asked.

    7. Ask for chunks: Once the question is understood, the system looks for relevant chunks in the Vector Database that might contain the answer.

    8. Ask for responses: The relevant chunks are then passed to the gpt-3.5-turbo model, which uses them to generate a precise and informative response.

    9. Users: There are two main interactions for users:

      • Asking questions: Users can ask questions to get information or responses from the AI.
      • Uploading customized data: Users can add their own data to the Vector Database, which can then be used by the AI to generate responses.
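    The nine-step flow above can be condensed into a self-contained sketch. The in-memory `VectorStore`, the word-based `chunk` splitter, and the bag-of-words `embed` are all illustrative stand-ins: in a real Embedchain deployment, chunking and embedding are delegated to the configured embedding model (e.g., text-embedding-ada-002) and the vector database backend.

```python
from collections import Counter
from math import sqrt

def chunk(text, size=8):
    """Step 2: break a long text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy embedding; stands in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """In-memory stand-in for the vector database (steps 3-4)."""
    def __init__(self):
        self.items = []  # (chunk, embedding) pairs

    def add(self, text):
        """Upload customized data: chunk it, embed each chunk, store both."""
        for c in chunk(text):
            self.items.append((c, embed(c)))

    def ask_for_chunks(self, question, k=2):
        """Steps 6-7: embed the question, return the closest chunks."""
        q = embed(question)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [c for c, _ in ranked[:k]]

store = VectorStore()
store.add("Elon Musk is the CEO of Tesla. Tesla builds electric cars. "
          "SpaceX builds rockets and is also led by Musk.")
print(store.ask_for_chunks("Who is the CEO of Tesla?"))
```

    In the full flow, the chunks returned here would be passed to the chat model (step 8) to generate the final response.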

    The Role of Embedchain


    Embedchain is the framework that facilitates this entire process. The underlying structure allows all these components to work together smoothly. Embedchain’s vector database is crucial, as it efficiently stores and retrieves the data embeddings. This enables the AI to provide fast and relevant responses to user queries, drawing on a wealth of organized information. The result is an intelligent system that can interact with users conversationally, providing them with information or assistance based on a vast and easily accessible knowledge database.

    Let’s say you’re making a scrapbook, but instead of pictures and stickers, you’re using bits of information. Embedchain helps you by cutting and organizing these bits and then pasting them in the right places. For AI, this means taking data (like text, images, or sound), breaking it into pieces, understanding what each piece means, and then storing it in a way that’s easy to find later. When someone asks your AI a question, Embedchain quickly flips through the scrapbook to find the correct information and helps the AI understand it to give a good answer. For instance, if you’ve built an AI to help travelers find the perfect vacation spot, Embedchain can help it understand and remember details about thousands of destinations, from the best local dishes to the most exciting activities, to give personalized recommendations.

    How to install it?

    Installing Embedchain is like downloading a new app on your phone. You go to the place where it’s available, in this case, a website called GitHub, and follow some simple instructions to get it on your computer. There’s some technical stuff involved, like using a ‘command prompt’ to tell your computer what to do, but the instructions are clear and easy to follow. Once you’ve installed Embedchain, it’s like having a new superpower for your computer, letting it understand and use AI in all sorts of exciting ways.

    Embedchain Installation Process


    The installation process for Embedchain is straightforward and can be completed in a few simple steps. Here’s a step-by-step guide to help you get started:

    Step 1: Install the Python Package

    1. Open a Terminal: Start by opening your terminal or command prompt.
    2. Install Embedchain: Use Python’s package manager, pip, to install Embedchain. Enter the following command:
    pip install embedchain

    Step 2: Choose Your Model Type

    With Embedchain, you have the option to use either open-source models or paid models.

    Option 1: Open Source Models

    • Open-source LLMs (Large Language Models) like Mistral, Llama, etc., are free to use and run locally on your machine.

    Option 2: Paid Models

    • This includes paid LLMs like GPT-4, Claude, etc. These models cost money and are accessible via an API.

    Step 3: Set Up the Environment

    For Open Source Models (e.g., Mistral)

    1. Obtain a Hugging Face Token: If you’re using a model hosted on Hugging Face (like Mistral), you’ll need a Hugging Face token. You can create one for free on their website.
    2. Set the Environment Variable: Replace "hf_xxxx" with your actual Hugging Face token in the following command and run it:
    import os
    os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "hf_xxxx"

    Or Paid Models (e.g., GPT-4)

    1. Obtain an OpenAI API Key: If you’re using a paid model from OpenAI, you’ll need an OpenAI API key.
    2. Set the Environment Variable: Replace "sk-xxxx" with your actual OpenAI API key in the following command and run it:
    import os
    os.environ["OPENAI_API_KEY"] = "sk-xxxx"

    Step 4: Create and Run Your Application

    1. Import Embedchain: Import the App class from the embedchain package.
    2. Initialize the App: Create an instance of the App class.
    3. Add Data: Add URLs or other data sources to your application using the add method.
    4. Query: Use the query method to ask questions or get information from your data.

    Example Code Snippet:

    Python
    import os
    from embedchain import App
    
    # replace this with your OpenAI key
    os.environ["OPENAI_API_KEY"] = "sk-xxxx"
    
    app = App()
    
    app.add("https://www.forbes.com/profile/elon-musk")
    app.add("https://en.wikipedia.org/wiki/Elon_Musk")
    
    app.query("What is the net worth of Elon Musk today?")
    # Answer: The net worth of Elon Musk today is $258.7 billion.

    This basic guide should help you get Embedchain installed and running on your system. Remember to replace tokens and URLs with your specific data and credentials.

    Cookbook for using Azure Open AI and OpenAI with Embedchain

    1-Open AI

    Step-1: Install Embedchain package

    !pip install embedchain

    Step-2: Set OpenAI environment variables

    You can find this env variable on your OpenAI dashboard.

    import os
    from embedchain import App
    
    os.environ["OPENAI_API_KEY"] = "sk-xxx"

    Step-3 Create Embedchain app and define your config

    app = App.from_config(config={
        "llm": {
            "provider": "openai",
            "config": {
                "model": "gpt-3.5-turbo",
                "temperature": 0.5,
                "max_tokens": 1000,
                "top_p": 1,
                "stream": False
            }
        },
        "embedder": {
            "provider": "openai",
            "config": {
                "model": "text-embedding-ada-002"
            }
        }
    })

    Step-4: Add data sources to your app

    app.add("https://www.forbes.com/profile/elon-musk")
    app.add("https://en.wikipedia.org/wiki/Elon_Musk")

    Step-5: All set. Now start asking questions related to your data

    while True:
        question = input("Enter question: ")
        if question in ['q', 'exit', 'quit']:
            break
        answer = app.query(question)
        print(answer)

    2-Azure Open AI

    Step-1: Install Embedchain package

    !pip install embedchain

    Step-2: Set Azure OpenAI-related environment variables

    You can find these env variables on your Azure OpenAI dashboard.

    import os
    from embedchain import App
    
    os.environ["OPENAI_API_TYPE"] = "azure"
    os.environ["OPENAI_API_BASE"] = "https://xxx.openai.azure.com/"
    os.environ["OPENAI_API_KEY"] = "xxx"
    os.environ["OPENAI_API_VERSION"] = "xxx"

    Step-3: Define your LLM and embedding model config

    config = """
    llm:
      provider: azure_openai
      model: gpt-35-turbo
      config:
        deployment_name: ec_openai_azure
        temperature: 0.5
        max_tokens: 1000
        top_p: 1
        stream: false
    
    embedder:
      provider: azure_openai
      config:
        model: text-embedding-ada-002
        deployment_name: ec_embeddings_ada_002
    """
    
    # Write the multi-line string to a YAML file
    with open('azure_openai.yaml', 'w') as file:
        file.write(config)

    Step-4 Create Embedchain app based on the config

    app = App.from_config(config_path="azure_openai.yaml")

    Step-5: Add data sources to your app

    app.add("https://www.forbes.com/profile/elon-musk")
    app.add("https://en.wikipedia.org/wiki/Elon_Musk")

    Step-6: All set. Now start asking questions related to your data

    while True:
        question = input("Enter question: ")
        if question in ['q', 'exit', 'quit']:
            break
        answer = app.query(question)
        print(answer)

    Choosing the Right Model

    Embedchain supports open-source and paid models, giving users flexibility based on their requirements and resources. Here’s an overview of the models supported by Embedchain and their benefits:

    Open Source Models

    1. Mistral:
      • Hosted on Hugging Face.
      • It is free to use and runs primarily on your local machine.
      • Benefits: Ideal for users with privacy concerns or limited budgets. Suitable for experimentation and learning.
    2. Llama:
      • Another open source LLM.
      • Benefits: Offers a balance between performance and cost-effectiveness. Suitable for projects where cost is a concern.
    3. GPT4All:
      • A free-to-use, locally running model.
      • Benefits: Privacy-aware, does not require a GPU or internet. Good for local development and privacy-focused applications.
    4. JinaChat:
      • Requires setting up a JINACHAT_API_KEY.
      • Benefits: Provides flexibility and local control over the language model.

    Paid Models

    1. GPT-4 (from OpenAI):
      • Accessible via an API.
      • Benefits: State-of-the-art model offering high-quality responses. Ideal for complex and commercial applications.
    2. Claude (from Anthropic):
      • Requires setting up the ANTHROPIC_API_KEY.
      • Benefits: Offers advanced AI capabilities for sophisticated applications.
    3. Azure OpenAI:
      • Provides access to OpenAI models through Azure’s cloud services.
      • Benefits: Combines the power of OpenAI models with the reliability and scalability of Azure’s cloud infrastructure.
    4. Cohere:
      • Access through COHERE_API_KEY.
      • Benefits: Known for its natural language understanding capabilities, it is suitable for various applications, including content generation and analysis.
    5. Together:
      • Accessed via the TOGETHER_API_KEY.
      • Benefits: Offers specialized language models for specific use cases.

    Benefits of Open Source vs. Paid Models

    • Cost-Effectiveness: Open source models are generally free, making them accessible for users with limited budgets or experimenting.
    • Privacy and Security: Open source models can be run locally, providing better control over data privacy.
    • State-of-the-Art Performance: Paid models like GPT-4 often deliver more advanced capabilities and higher accuracy, suitable for professional and commercial applications.
    • Scalability: Paid models, especially those offered through cloud services like Azure OpenAI, provide scalability for handling large volumes of requests or data.
    • Support and Reliability: Paid models often come with professional support, regular updates, and reliability guarantees, which are crucial for business-critical applications.

    Choosing between open-source and paid models depends on your specific needs, budget, and project scale. Embedchain’s support for various models ensures flexibility and adaptability for various use cases.

    Use Cases of Embedchain

    1. Chatbots

    • Application Areas:
      • Customer Service: Automating responses to common inquiries and providing round-the-clock support.
      • Education: Personalized tutoring and learning assistance.
      • E-commerce: Assisting in product discovery, making recommendations, and facilitating transactions.
      • Content Management: Helping in writing, summarizing, and content organization.
      • Data Analysis: Extracting insights from large datasets.
      • Language Translation: Offering real-time support in multiple languages.
      • Mental Health: Providing preliminary support and conversational engagement.
      • Entertainment: Engaging users through games, quizzes, and humorous interactions​​.

    2. Question Answering

    • Versatile Applications:
      • Educational Aid: Enhancing learning experiences and helping with homework.
      • Customer Support: Efficiently addressing and resolving customer queries.
      • Research Assistance: Supporting academic and professional research.
      • Healthcare Information: Providing basic medical knowledge.
      • Technical Support: Resolving technology-related questions.
      • Legal Information: Offering essential legal advice and information.
      • Business Insights: Delivering market analysis and strategic business advice.
      • Language Learning: Aiding in understanding and translating various languages.
      • Travel Guidance: Providing travel and hospitality information.
      • Content Development: Assisting authors and creators in research and idea generation​​.
    • Enhanced Information Retrieval and Discovery:
      • Information Retrieval: Improving search accuracy in databases and websites.
      • E-commerce: Enhancing product discovery in online shopping platforms.
      • Customer Support: Empowering chatbots for more effective responses.
      • Content Discovery: Aiding in finding relevant media content.
      • Knowledge Management: Streamlining document and data retrieval in enterprises.
      • Healthcare: Facilitating medical research and literature searches.
      • Legal Research: Assisting in legal document and case law searches.
      • Academic Research: Aiding in academic paper discovery.
      • Language Processing: Enabling multilingual search capabilities​​.

    Each of these use cases demonstrates the versatility and wide-ranging applications of Embedchain, highlighting its capability to enhance various domains with advanced AI-driven functionalities.

    Configuration and Customization in Embedchain

    Embedchain offers various configuration and customization options across its components, ensuring flexibility and adaptability for diverse use cases. Here’s an organized overview:

    Components Configuration

    1. Data Source:
      • Embedchain supports a variety of data sources, enabling the loading of unstructured data through a user-friendly interface. Supported data sources include:
        • PDF, CSV, JSON files
        • Text, MDX, DOCX files
        • HTML web pages, YouTube channels, and videos
        • Docs websites, Notion, Sitemaps, XML files
        • Q&A pairs, OpenAPI, Gmail, GitHub repositories
        • PostgreSQL, MySQL databases
        • Slack, Discord, Discourse, Substack
        • Beehiiv, Dropbox, Images, and custom sources​​.
    2. Large Language Models (LLMs):
      • Embedchain integrates various popular LLMs, simplifying the process of incorporating them into your application. Supported LLMs include:
        • OpenAI (requiring OPENAI_API_KEY)
        • Google AI, Azure OpenAI, Anthropic, Cohere
        • Together, Ollama, GPT4All, JinaChat
        • Hugging Face, Llama2, Vertex AI​​.
    3. Embedding Models:
      • Embedchain supports several embedding models from providers such as:
        • OpenAI, GoogleAI, Azure OpenAI
        • GPT4All, Hugging Face, Vertex AI​​.
    4. Vector Databases:
      • The integration of vector databases is streamlined in Embedchain. You can configure them within the YAML configuration file. Supported databases include:
        • ChromaDB, Elasticsearch, OpenSearch
        • Zilliz, LanceDB, Pinecone, Qdrant
        • Weaviate (requiring WEAVIATE_ENDPOINT and WEAVIATE_API_KEY)​​.
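    As with the LLM and embedder sections in the Azure OpenAI cookbook above, the vector database can be selected in the same YAML configuration. The snippet below is only a sketch for ChromaDB; the exact keys (`collection_name`, `dir`) are assumptions based on common Embedchain examples and should be verified against the Embedchain documentation for your version.

```yaml
vectordb:
  provider: chroma          # or elasticsearch, pinecone, qdrant, weaviate, ...
  config:
    collection_name: my-collection   # hypothetical name for illustration
    dir: db                          # local directory for the Chroma store
```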

    Deployment of Embedchain

    Embedchain simplifies the deployment process of RAG applications, allowing them to be hosted on various cloud platforms. This flexibility ensures that users can select a hosting service that best suits their needs and preferences. The various cloud providers supported by Embedchain for deployment are:

    1. Fly.io: A platform known for its simplicity and ease of use, suitable for applications requiring global distribution.
    2. Modal.com: Offers scalable computing for large-scale applications.
    3. Render.com: Known for its developer-friendly features, it provides static sites, web services, and private services.
    4. Streamlit.io: A popular choice for machine learning and data science applications, enabling easy creation of interactive web apps.
    5. Gradio.app: Ideal for creating sharable machine learning demos and web applications.
    6. Huggingface.co: A platform specializing in natural language processing and machine learning models, particularly those involving LLMs.
    7. Embedchain.ai: The native platform for Embedchain, likely offering the most integrated and streamlined experience for deploying Embedchain applications.

    Each platform offers unique features and benefits, catering to various application requirements, from small-scale projects to large, enterprise-level deployments​​.

    Practical Applications and Examples

    Embedchain offers a versatile set of tools that can be utilized to create various types of chatbots, each tailored for specific applications and platforms. Here are some practical examples and applications:

    1. Full Stack Chatbot:
      • Application: Integrate a chatbot within a full-stack application.
      • Use Case: Ideal for web applications that require interactive user engagement.
    2. Custom GPT Creation:
      • Application: Build a tailored GPT chatbot suited to your specific needs.
      • Use Case: Useful for creating specialized chatbots for customer service or personalized assistance.
    3. Slack Integration Bot:
      • Application: Enhance your Slack workspace with a specialized bot.
      • Use Case: Integrating AI functionalities into Slack for improved workplace communication and automation.
    4. Discord Community Bot:
      • Application: Create an engaging bot for your Discord server.
      • Use Case: Enhancing community interaction on Discord servers with automated responses or interactive features.
    5. Telegram Assistant Bot:
      • Application: Develop a handy assistant for Telegram users.
      • Use Case: Providing assistance, automation, and engagement in Telegram channels or groups.
    6. WhatsApp Helper Bot:
      • Application: Design a WhatsApp bot for efficient communication.
      • Use Case: Automate responses and provide information services on WhatsApp.
    7. Poe Bot for Unique Interactions:
      • Application: Explore advanced bot interactions with Poe Bot.
      • Use Case: Creating bots with unique, advanced interaction capabilities, possibly for gaming, storytelling, or engaging user experiences.

    These examples demonstrate Embedchain’s adaptability in creating chatbots for different platforms and purposes, ranging from simple automation to complex, interactive applications​​.

    Access the Notebooks examples featuring LLMs, Embedding Models, and Vector DBs with Embedchain by clicking this link.

    Conclusion

    Embedchain is a beacon of guidance and empowerment in AI’s vast and ever-changing landscape. It’s akin to having a compass and a map while navigating uncharted territories. This remarkable tool demystifies the complexities of AI, making it approachable and accessible to everyone, from curious novices to seasoned experts. Whether you’re taking your first steps into this exciting field or an experienced traveler looking to push the boundaries further, Embedchain offers the resources, support, and flexibility you need to bring your visionary AI projects to life.

    Embedchain isn’t just a tool; it’s a companion on your journey through the world of AI. It’s there to handle the heavy lifting, allowing you to focus on your projects’ creative and impactful aspects. With its user-friendly nature and adaptable framework, Embedchain ensures that the future of AI isn’t just a realm for the few but an accessible, enriching, and empowering experience for all. It’s your ally in unlocking the full potential of AI, helping you turn your imaginative ideas into real-world solutions and innovations.

    That’s it for Today!

    Sources

    https://embedchain.ai/

    https://embedchain.ai/blog/introducing-embedchain

    https://gptpluginz.com/embedchain-ai/

    embedchain/embedchain: The Open Source RAG framework (github.com)

    Introducing the New Google Gemini API: A Comparative Analysis with ChatGPT in the AI Revolution

    Google’s recent announcement of the Gemini API marks a transformative leap in artificial intelligence technology. This cutting-edge API, developed by Google DeepMind, is a testament to Google’s commitment to advancing AI and making it accessible and beneficial for everyone. This blog post will explore the multifaceted features, potential applications, and impact of the Google Gemini API, as revealed in Google’s official blogs and announcements.

    What is the Google Gemini?

    Google Gemini is a highly advanced, multimodal artificial intelligence model developed by Google. It represents a significant step forward in AI capabilities, especially in understanding and processing a wide range of data types.

    Extract from the Google Gemini official website

    Gemini’s Position in the AI Landscape

    Gemini is a direct competitor to OpenAI’s GPT-3 and GPT-4 models. It differentiates itself through its native multimodal capabilities and its focus on seamlessly processing and combining different types of information​​. Its launch was met with significant anticipation and speculation, and it is seen as a crucial development in the AI arms race between major tech companies​.

    Below is a comparison of text and multimodal capabilities provided by Google, comparing Gemini Ultra, which has not yet been officially launched, with OpenAI’s GPT-4.


    Key Features of Gemini

    1. Multimodal Capabilities: Gemini’s groundbreaking design allows it to process and comprehend various data types seamlessly, from text and images to audio and video, facilitating sophisticated multimodal reasoning and advanced coding capabilities.
    2. Three Distinct Models: The Gemini API offers three versions – Ultra, Pro, and Nano, each optimized for different scales and types of tasks, ranging from complex data center operations to efficient on-device applications.
    3. State-of-the-Art Performance: Gemini models have demonstrated superior performance on numerous academic benchmarks, surpassing human expertise in specific tasks and showcasing their advanced reasoning and problem-solving abilities.
    4. Diverse Application Spectrum: The versatility of Gemini allows for its integration across a wide array of sectors, including healthcare, finance, and technology, enhancing functionalities like predictive analytics, fraud detection, and personalized user experiences.
    5. Developer and Enterprise Accessibility: The Gemini Pro is now available for developers and enterprises, with various features such as function calling, semantic retrieval, and chat functionality. Additionally, Google AI Studio and Vertex AI support the integration of Gemini into multiple applications.

    The New Google Gemini API

    The Gemini API represents a significant stride in AI development, introducing Google’s most capable and comprehensive AI model to date. This API is the product of extensive collaborative efforts, blending advanced machine learning and artificial intelligence capabilities to create a multimodal system. Unlike previous AI models, Gemini is designed to understand, operate, and integrate various types of information, including text, code, audio, images, and video, showcasing a new level of sophistication in AI technology.

    Benefits for Developers and Creatives:

    Gemini’s versatility unlocks a plethora of possibilities for developers and creatives alike. Imagine:

    • Building AI-powered applications: Gemini can power chatbots, virtual assistants, and personalized learning platforms.
    • Boosting your creative workflow: Generate song lyrics, script ideas, or even marketing copy with Gemini’s innovative capabilities.
    • Simplifying coding tasks: Let Gemini handle repetitive coding tasks or write entire code snippets based on your instructions.
    • Unlocking new research avenues: Gemini’s multimodal abilities open doors for exploring the intersection of language, code, and other modalities in AI research.

    How to use the Google Gemini API?

    Using the Google Gemini API involves several steps and can be applied to various programming languages and platforms. Here’s a comprehensive guide based on the information from Google AI for Developers:

    Setting Up Your Project

    1. Obtain an API Key: First, create an API key in Google AI Studio or MakerSuite. It is crucial to secure your API key and not check it into your version control system; instead, pass it to your app at runtime before initializing the model.
    2. Initialize the Generative Model: Import and initialize the Generative Model in your project. This involves specifying the model name (e.g., gemini-pro-vision for multimodal input) and accessing your API key.

    Follow a quickstart with Python at Google Colab.

    Implementing Use Cases

    The Gemini API allows you to implement different use cases:

    1. Text-Only Input: Use the gemini-pro model with the generateContent method for text-only prompts.
    2. Multimodal Input (Text and Image): Use the gemini-pro-vision model. Make sure to review the image requirements for input.
    3. Multi-Turn Conversations (Chat): Use the gemini-pro model and initialize the chat by calling startChat(). Use sendMessage() to send new user messages.
    4. Streaming for Faster Interactions: Implement streaming with the generateContentStream method to handle partial results for faster interactions.

    Gemini Pro

    Python
    """
    At the command line, only need to run once to install the package via pip:
    
    $ pip install google-generativeai
    """
    
    import google.generativeai as genai
    
    genai.configure(api_key="YOUR_API_KEY")
    
    # Set up the model
    generation_config = {
      "temperature": 0.9,
      "top_p": 1,
      "top_k": 1,
      "max_output_tokens": 2048,
    }
    
    safety_settings = [
      {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      }
    ]
    
    model = genai.GenerativeModel(model_name="gemini-pro",
                                  generation_config=generation_config,
                                  safety_settings=safety_settings)
    
    prompt_parts = [
      "Write 10 paragraphs about the Gemini functionalities:",
    ]
    
    response = model.generate_content(prompt_parts)
    print(response.text)

    Gemini Pro Vision

    Python
    """
    At the command line, only need to run once to install the package via pip:
    
    $ pip install google-generativeai
    """
    
    from pathlib import Path
    import google.generativeai as genai
    
    genai.configure(api_key="YOUR_API_KEY")
    
    # Set up the model
    generation_config = {
      "temperature": 0.4,
      "top_p": 1,
      "top_k": 32,
      "max_output_tokens": 4096,
    }
    
    safety_settings = [
      {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      }
    ]
    
    model = genai.GenerativeModel(model_name="gemini-pro-vision",
                                  generation_config=generation_config,
                                  safety_settings=safety_settings)
    
    # Validate that an image is present
    if not (img := Path("image0.jpeg")).exists():
      raise FileNotFoundError(f"Could not find image: {img}")
    
    image_parts = [
      {
        "mime_type": "image/jpeg",
        "data": Path("image0.jpeg").read_bytes()
      },
    ]
    
    prompt_parts = [
      image_parts[0],
      "\nTell me about this image, what colors do we have here? How many people do we have here?",
    ]
    
    response = model.generate_content(prompt_parts)
    print(response.text)

    Implementing in Various Languages

    The Gemini API supports several programming languages, each with its specific implementation details:

    • Python, Go, Node.js, Web, Swift, Android, cURL: Each language requires specific code structures and methods for initializing the model, sending prompts, and handling responses. Examples include setting up the Generative Model, defining prompts, and processing the generated content.

    Further Reading and Resources

    • The Gemini API documentation and API reference on Google AI for Developers provide detailed information, including safety settings, guides on large language models, and embedding techniques.
    • For specific language implementations and more advanced use cases like token counting, refer to the respective quickstart guides available on Google AI for Developers.

    By following these steps and referring to the detailed documentation, you can effectively utilize the Google Gemini API for various applications ranging from simple text generation to more complex multimodal interactions.

    Gemini vs. ChatGPT: The Ultimate Multimodal Mind Showdown

    The world of large language models (LLMs) is heating up, and two titans stand at the forefront: Google’s Gemini and OpenAI’s ChatGPT. Both boast impressive capabilities, but which one reigns supreme? Let’s dive into a head-to-head comparison.

    Google Gemini API – Pricing

    Free for Everyone Plan:

    • Rate Limits: 60 QPM (queries per minute)
    • Price (input): Free
    • Price (output): Free
    • Input/output data used to improve our products: Yes

    Pay-as-you-go Plan: (coming soon to Google AI Studio)

      • Rate Limits: Starts at 60 QPM
      • Price (input): $0.00025 / 1K characters, $0.0025 / image
      • Price (output): $0.0005 / 1K characters
      • Input/output data used to improve our products: No

      Source: Gemini API Pricing  |  Google AI for Developers
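
      As a quick sanity check on the rates above, here is a small cost estimator. The per-character and per-image prices are hard-coded from the pay-as-you-go table (they may change, so the official pricing page remains authoritative), and the function name is my own:

```python
# Rough cost estimator for the Gemini pay-as-you-go rates listed above.
# Rates are hard-coded from the pricing table; verify them against the
# official pricing page before relying on these numbers.

GEMINI_INPUT_PER_1K_CHARS = 0.00025   # USD per 1K input characters
GEMINI_OUTPUT_PER_1K_CHARS = 0.0005   # USD per 1K output characters
GEMINI_PER_IMAGE = 0.0025             # USD per input image

def estimate_gemini_cost(input_chars: int, output_chars: int, images: int = 0) -> float:
    """Return the estimated cost of one request, in USD."""
    return (input_chars / 1000 * GEMINI_INPUT_PER_1K_CHARS
            + output_chars / 1000 * GEMINI_OUTPUT_PER_1K_CHARS
            + images * GEMINI_PER_IMAGE)

# Example: a 2,000-character prompt, a 4,000-character answer, one image
print(f"${estimate_gemini_cost(2000, 4000, images=1):.4f}")
```

      At these rates even a multimodal request costs a fraction of a cent, which is why the free tier's 60 QPM limit is usually the binding constraint rather than price.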

      OpenAI ChatGPT API – Pricing

      GPT-4 Turbo

      With a 128K context window, fresher knowledge, and the broadest set of capabilities, GPT-4 Turbo is more capable than GPT-4 and is offered at a lower price.

      Learn about GPT-4 Turbo

      Model                        Input               Output
      gpt-4-1106-preview           $0.01 / 1K tokens   $0.03 / 1K tokens
      gpt-4-1106-vision-preview    $0.01 / 1K tokens   $0.03 / 1K tokens

      GPT-4

      With broad general knowledge and domain expertise, GPT-4 can follow complex instructions in natural language and solve difficult problems accurately.

      Learn about GPT-4

      Model        Input               Output
      gpt-4        $0.03 / 1K tokens   $0.06 / 1K tokens
      gpt-4-32k    $0.06 / 1K tokens   $0.12 / 1K tokens

      GPT-3.5 Turbo

      GPT-3.5 Turbo models are capable and cost-effective.

      gpt-3.5-turbo: the family’s flagship model, supporting a 16K context window optimized for dialog.

      gpt-3.5-turbo-instruct: an instruct model that supports only a 4K context window.

      Learn about GPT-3.5 Turbo

      Model                     Input                 Output
      gpt-3.5-turbo-1106        $0.0010 / 1K tokens   $0.0020 / 1K tokens
      gpt-3.5-turbo-instruct    $0.0015 / 1K tokens   $0.0020 / 1K tokens

      Source: Pricing (openai.com)
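
      To make the tiers directly comparable, the token rates from the tables above can be dropped into a small calculator. The price dictionary is hand-copied from those tables and will drift as OpenAI updates its pricing:

```python
# Token-cost comparison across the OpenAI tiers listed in the tables above.
# Prices are USD per 1K tokens, hard-coded from the pricing tables; check
# the official pricing page before using them for real budgeting.

OPENAI_PRICES = {
    "gpt-4-1106-preview":     (0.01,   0.03),
    "gpt-4":                  (0.03,   0.06),
    "gpt-4-32k":              (0.06,   0.12),
    "gpt-3.5-turbo-1106":     (0.0010, 0.0020),
    "gpt-3.5-turbo-instruct": (0.0015, 0.0020),
}

def openai_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request against the given model."""
    price_in, price_out = OPENAI_PRICES[model]
    return input_tokens / 1000 * price_in + output_tokens / 1000 * price_out

# Example: the same 1,000-in / 1,000-out request priced on every tier
for model in OPENAI_PRICES:
    print(f"{model}: ${openai_cost(model, 1000, 1000):.4f}")
```

      Note the two APIs also meter differently: Gemini prices per 1K characters while OpenAI prices per 1K tokens, so the numbers are not directly interchangeable.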

      Strengths of Gemini:

      • Multimodality: Gemini shines in its ability to handle text, code, images, and even audio. This opens doors for applications like generating image captions or translating spoken language.
      • Function Calling: Gemini integrates seamlessly into workflows thanks to its function-calling feature, allowing developers to execute specific tasks within their code.
      • Embeddings and Retrieval: Gemini’s understanding of word relationships and semantic retrieval leads to more accurate information retrieval and question answering.
      • Custom Knowledge: Gemini allows fine-tuning with your own data, making it a powerful tool for specialized tasks.
      • Multiple Outputs: Gemini goes beyond text generation, offering creative formats like poems, scripts, and musical pieces.
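
      The embeddings-and-retrieval point boils down to comparing vectors. A minimal sketch of semantic retrieval via cosine similarity is below; in a real pipeline the vectors would come from an embedding call (e.g. genai.embed_content), but here three toy 3-dimensional vectors stand in for them:

```python
# Semantic retrieval sketch: rank "documents" by cosine similarity to a
# query. The embeddings below are made-up toy vectors, not real model
# output; only the ranking logic is the point.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "document" embeddings (assumed already computed by an embedding model)
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.6, 0.4, 0.1],
    "stocks": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]  # toy embedding of a cat-related query

best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
print(best)  # → cats
```

      The same nearest-neighbor idea scales to thousands of documents with a vector index instead of a linear scan.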

      Strengths of ChatGPT:

      • Accessibility: ChatGPT is widely available through various platforms and APIs, offering free and paid options, while Gemini is currently in limited access.
      • Creative Writing: ChatGPT excels in creative writing tasks, producing engaging stories, poems, and scripts.
      • Large Community: ChatGPT has a well-established user community that offers extensive resources and tutorials.

      An experiment comparing the Gemini and ChatGPT APIs using the Sparse Priming Representations (SPR) technique

      I conducted an experiment using the OpenAI (ChatGPT) and Google Gemini APIs, applying the Sparse Priming Representations (SPR) prompt-engineering technique to compress and decompress a text. Click here to access the experimental code I created in Google Colab.

      The outcome was interesting; both APIs responded very well to the test. In the table below, we can observe a contextual difference, but both APIs were able to perform the task satisfactorily.
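
      The shape of the experiment can be sketched as two prompts, one for compression and one for decompression. The system-prompt wording below is illustrative rather than the exact text from my Colab notebook, and the actual API call (e.g. via model.generate_content) is left out:

```python
# Hedged sketch of the SPR experiment: build the compression and
# decompression prompts as plain strings. Sending them to either API
# is omitted; only the prompt construction is shown.

SPR_COMPRESS = (
    "You are an SPR (Sparse Priming Representation) writer. Render the "
    "input as a distilled list of succinct statements, assertions, "
    "associations and concepts, capturing as much meaning as possible "
    "in as few words as possible."
)

SPR_DECOMPRESS = (
    "You are an SPR decompressor. Fully unpack the following sparse "
    "representation back into articulate, fully-formed prose."
)

def build_prompt(system_instruction: str, text: str) -> str:
    """Combine an SPR instruction with the user's text into one prompt."""
    return f"{system_instruction}\n\nINPUT:\n{text}"

compressed_request = build_prompt(SPR_COMPRESS, "A long article to distill...")
print(compressed_request.splitlines()[-1])
```

      Running the same pair of prompts against both APIs is what makes the comparison apples-to-apples: only the model behind the call changes.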

      If you want to learn more about Sparse Priming Representations (SPR), I’ve written an entire post discussing it.

      Conclusion

      In the rapidly evolving landscape of artificial intelligence, the Google Gemini API represents a significant milestone. Its introduction heralds a new era where AI transcends traditional boundaries, offering multimodal capabilities far beyond the text-centric focus of models like ChatGPT. Google Gemini’s ability to process and integrate diverse data types — from images to audio and video — not only sets it apart but also showcases the future direction of AI technology.

      While ChatGPT excels in textual creativity and enjoys widespread accessibility and community support, Gemini’s native multimodal functionality and advanced features like function calling and semantic retrieval position it as a more versatile and comprehensive tool. This distinction is crucial in an AI landscape where the needs range from simple text generation to complex, multimodal interactions and specialized tasks.

      As we embrace this new phase of AI development, it’s clear that both ChatGPT and Google Gemini have unique strengths and applications. The choice between them hinges on specific needs and project requirements. Gemini’s launch is not just a technological breakthrough; it’s a testament to the ever-expanding possibilities of AI, promising to revolutionize various sectors and redefine our interaction with technology. With such advancements, the future of AI seems boundless, limited only by our imagination and the ethical considerations of its application.

      That’s it for today!

      Sources:

      https://tech.co/news/gemini-vs-chatgpt

      https://mockey.ai/blog/google-gemini-vs-chatgpt/

      https://www.pcguide.com/ai/compare/google-gemini-vs-openai-gpt-4/

      https://gptpluginz.com/google-gemini/

      https://www.augustman.com/sg/gear/tech/google-gemini-vs-chatgpt-core-differences-of-the-ai-model-chatbots/

      https://whatsthebigdata.com/gemini-vs-chatgpt-how-does-googles-latest-ai-compare/

      https://www.washingtonpost.com/technology/2023/12/06/google-gemini-chatgpt-alternatives/

      Google Gemini Vs OpenAI ChatGPT: What’s Better? (businessinsider.com)