Chatting with your Enterprise data privately and securely through the use of Azure Cognitive Search and Azure Open AI

In an age where data is power, businesses are constantly looking for ways to leverage their vast enterprise data stores. One promising avenue lies in the intersection of AI and search technologies, specifically through the use of Azure Cognitive Search and Azure Open AI. These tools provide powerful ways to converse with enterprise data privately and securely.

Enterprise data can take various forms, from structured database datasets to unstructured documents, emails, and files. Some examples are data about the company’s benefits, internal policies, job descriptions, roles, and much more.

What is Azure Cognitive Search?

Azure Cognitive Search is a cloud-based service provided by Microsoft Azure that enables developers to build sophisticated search experiences into custom applications. It integrates with other Azure Cognitive Services to enable AI-driven content understanding through capabilities such as natural language processing, entity recognition, image analysis, and more.

Here are some of the key benefits of Azure Cognitive Search:

  1. Fully Managed: Azure Cognitive Search is fully managed, meaning you don’t have to worry about infrastructure setup, maintenance, or scaling. You just need to focus on the development of your application.
  2. Rich Search Experiences: It allows for the creation of rich search experiences, including auto-complete, geospatial search, filtering, and faceting.
  3. AI-Enhanced Search Capabilities: When combined with other Azure Cognitive Services, Azure Cognitive Search can provide advanced search features. For example, it can extract key phrases, detect languages, identify entities, and more. It can even index and search unstructured data, like text within documents or images.
  4. Scalability and Performance: Azure Cognitive Search can automatically scale to handle large volumes of data and high query loads. It provides fast, efficient search across large datasets.
  5. Data Integration: It can pull in data from a variety of sources, including Azure SQL Database, Azure Cosmos DB, Azure Blob Storage, and more.
  6. Security: Azure Cognitive Search supports data encryption at rest and in transit. It also integrates with Azure Active Directory for identity and access management.
  7. Developer Friendly: It provides a simple, RESTful API and integrates with popular programming languages and development frameworks. This makes it easier for developers to embed search functionality into applications.
  8. Indexing: The service provides robust indexing capabilities, allowing you to index data from a variety of sources and formats. This allows for a more comprehensive search experience for end-users.

In summary, Azure Cognitive Search can provide powerful, intelligent search capabilities for your applications, allowing users to find the information they need quickly and easily.

What is Azure Open AI?

Azure OpenAI Service is a platform that provides REST API access to OpenAI’s powerful language models, including GPT-3, GPT-4, Codex, and Embeddings. It can be used for tasks such as content generation, summarization, semantic search, and natural language-to-code translation.

The security and safety of enterprise data is a top priority for Azure OpenAI. Here are some key points on how it ensures safety:

  • The Azure OpenAI Service is fully controlled by Microsoft and does not interact with any services operated by OpenAI. Your prompts (inputs) and completions (outputs), your embeddings, and your training data are not available to other customers, OpenAI, or used to improve OpenAI models, any Microsoft or 3rd party products or services, or to automatically improve Azure OpenAI models for your use in your resource. Your fine-tuned Azure OpenAI models are available exclusively for your use.
  • The service processes different types of data including prompts and generated content, augmented data included with prompts, and training & validation data.
  • When generating completions, images, or embeddings, the service evaluates the prompt and completion data in real-time to check for harmful content types. The models are stateless, meaning no prompts or generations are stored in the model, and prompts and generations are not used to train, retrain, or improve the base models.
  • With the “on your data” feature, the service retrieves relevant data from a configured data store and augments the prompt to produce generations that are grounded with your data. The data remains stored in the data source and location you designate. No data is copied into the Azure OpenAI service.
  • Training data uploaded for fine-tuning is stored in the Azure OpenAI resource in the customer’s Azure tenant. It can be double encrypted at rest and can be deleted by the customer at any time. This data is not used to train, retrain, or improve any Microsoft or 3rd party base models.
  • Azure OpenAI includes both content filtering and abuse monitoring features to reduce the risk of harmful use of the service. To detect and mitigate abuse, Azure OpenAI stores all prompts and generated content securely for up to thirty (30) days.
  • The data store where prompts and completions are stored is logically separated by customer resources. Prompts and generated content are stored in the Azure region where the customer’s Azure OpenAI service resource is deployed, within the Azure OpenAI service boundary. Human reviewers can only access the data when it has been flagged by the abuse monitoring system.
  • Customers who meet additional Limited Access eligibility criteria and attest to specific use cases can apply to modify the Azure OpenAI content management features. Suppose Microsoft approves a customer’s request to change abuse monitoring. In that case, Microsoft does not store any prompts and completions associated with the approved Azure subscription for which abuse monitoring is configured.

In conclusion, Azure OpenAI takes numerous measures to ensure that your enterprise data is kept secure and confidential while using its service.

Revolutionize your Enterprise Data with ChatGPT: step by step how to create your own Enterprise Chat

This sample demonstrates a few approaches for creating ChatGPT-like experiences over your own data using the Retrieval Augmented Generation pattern. It uses Azure Open AI Service to access the ChatGPT model (gpt-35-turbo), and Azure Cognitive Search for data indexing and retrieval.

The repo includes sample data so it’s ready to try end-to-end. In this sample application, we use a fictitious company called Contoso Electronics, and the experience allows its employees to ask questions about the benefits, internal policies, as well as job descriptions, and roles.


  • Chat and Q&A interfaces
  • Explores various options to help users evaluate the trustworthiness of responses with citations, tracking of source content, etc.
  • Shows possible approaches for data preparation, prompt construction, and orchestration of interaction between model (ChatGPT) and retriever (Cognitive Search)
  • Settings directly in the UX to tweak the behavior and experiment with options
Chat screen

Getting Started

IMPORTANT: In order to deploy and run this example, you’ll need an Azure subscription with access enabled for the Azure OpenAI service. You can request access here. You can also visit here to get some free Azure credits to get you started.

AZURE RESOURCE COSTS by default this sample will create Azure App Service and Azure Cognitive Search resources that have a monthly cost, as well as Form Recognizer resource that has cost per document page. You can switch them to free versions of each of them if you want to avoid this cost by changing the parameters file under the infra folder (though there are some limits to consider; for example, you can have up to 1 free Cognitive Search resource per subscription, and the free Form Recognizer resource only analyzes the first 2 pages of each document.)


To Run Locally

  • Azure Developer CLI
  • Python 3+
    • Important: Python and the pip package manager must be in the path in Windows for the setup scripts to work.
    • Important: Ensure you can run python --version from the console. On Ubuntu, you might need to run sudo apt install python-is-python3 to link python to python3.
  • Node.js
  • Git
  • Powershell 7+ (pwsh) – For Windows users only.
    • Important: Ensure you can run pwsh.exe from a PowerShell command. If this fails, you likely need to upgrade PowerShell.

NOTE: Your Azure Account must have Microsoft.Authorization/roleAssignments/write permissions, such as User Access Administrator or Owner.


Project Initialization

  1. Create a new folder and switch to it in the terminal
  2. Run azd auth login
  3. Run azd init -t azure-search-openai-demo
    • For the target location, the regions that currently support the models used in this sample are East US or South Central US. For an up-to-date list of regions and models, check here
    • note that this command will initialize a git repository and you do not need to clone this repository

Starting from scratch:

Execute the following command, if you don’t have any pre-existing Azure services and want to start from a fresh deployment.

  1. Run azd up – This will provision Azure resources and deploy this sample to those resources, including building the search index based on the files found in the ./data folder.
  2. After the application has been successfully deployed you will see a URL printed to the console. Click that URL to interact with the application in your browser.

For detailed information click here on my GitHub and follow a video from Microsoft talking about the example solution.

You can look at the Chat App that I’ve developed, which I will make available for you to test for a few days.

Firstly, it’s important to understand that you have the ability to replace the PDF files within the “./data” directory with your own business data.

If you wish to examine these files first to gain insights into the types of questions you can make in the chat to test, please click here.

Regrettably, the demo app had to be deactivated due to Azure expenses. If you’d like it to be reactivated, please click here to contact me. Thank you.

You’re able to query any content found within the enterprise PDF files located in the “./data” directory. The chat will respond with citations from the respective PDFs, and you have the option to click through and verify the information directly from the source PDF.


The vast universe of enterprise data, spanning from structured database datasets to unstructured documents, emails, and files, holds a wealth of insights that can drive an organization’s growth and success. Azure Cognitive Search and Azure OpenAI serve as powerful tools that make this data readily accessible, private, and secure. By leveraging these technologies, businesses can tap into the full potential of their internal data, from understanding the intricacies of their benefits and policies to defining roles and job descriptions more effectively. With a future powered by AI and machine learning, the conversations we can have with our data are only just beginning. This is more than just a technological shift; it’s a new era of informed decision-making, driven by data that’s within our reach. This solution provides an array of opportunities to assist businesses in leveraging their corporate data and disseminating it amongst their employees. This method simplifies comprehension, fostering organizational growth and enhancing the company culture. Should you require additional details on this topic, please do not hesitate to reach out to me.

That’s it for today!