Azure OpenAI Service for Analytics
Unlock AI-powered insights: RAG pipelines, NL-to-SQL, embeddings, and GPT-driven BI
Overview
Azure OpenAI Service brings GPT-4o, o-series reasoning models, and state-of-the-art embedding models directly into your Azure data stack — governed by the same security, compliance, and data residency guarantees as the rest of your Azure environment. This two-day training is built for analytics professionals who want to put AI to work on their data. You will build Retrieval-Augmented Generation (RAG) pipelines that answer questions over enterprise datasets, design natural language-to-SQL interfaces for non-technical users, and create semantic search layers on top of data catalogs and document stores. Every pattern is deployed within Azure, using Azure AI Search as the vector store and your existing data lake or SQL databases as the knowledge source — no external APIs, no data leaving your tenant.
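The grounded-answer step described above can be sketched as follows. This is a minimal illustration, not the course's exact solution: it assumes the context chunks have already been retrieved from Azure AI Search, and the deployment name in the commented call is a placeholder.

```python
# Sketch: assemble a grounded Chat Completions request for Azure OpenAI.
# Assumes `chunks` were already retrieved from Azure AI Search.

def build_grounded_messages(question: str, chunks: list[str]) -> list[dict]:
    """Build a message list that grounds the model in retrieved context."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    system = (
        "Answer only from the numbered context passages below. "
        "Cite passage numbers, and say 'not found' if the context is insufficient.\n\n"
        + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# With the openai Python SDK, this would be sent roughly as:
#   from openai import AzureOpenAI
#   client = AzureOpenAI(azure_endpoint=..., api_key=..., api_version="2024-06-01")
#   client.chat.completions.create(
#       model="gpt-4o",  # your deployment name
#       messages=build_grounded_messages(question, chunks),
#   )
```

Keeping the retrieval and the prompt assembly separate like this makes each half independently testable, which matters once the pipeline moves toward production.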
What you'll learn
- Deploy and configure Azure OpenAI models (GPT-4o, text-embedding-3-large) within a governed Azure environment
- Build end-to-end RAG pipelines that retrieve relevant context from ADLS Gen2, Azure SQL, or Cosmos DB and generate grounded answers
- Design a natural language-to-SQL interface that translates business questions into T-SQL or Spark SQL queries against your data warehouse
- Implement semantic search on top of enterprise data catalogs using Azure AI Search and text-embedding-3-large vectors
- Fine-tune GPT-4o mini on domain-specific datasets for specialised analytics tasks such as financial classification or clinical coding
- Apply content safety filters, evaluate model outputs with built-in metrics, and trace LLM calls for production reliability
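As a small taste of the embedding work covered above, cosine similarity between two vectors is just the dot product over the product of their magnitudes. The vectors below are toy 4-dimensional stand-ins for real text-embedding-3-large outputs (which are 3072-dimensional):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy vectors standing in for embeddings of a report chunk and a user query:
v_report = [0.1, 0.8, 0.3, 0.0]
v_query = [0.1, 0.7, 0.4, 0.1]
print(round(cosine_similarity(v_report, v_query), 3))  # close to 1.0 = similar
```

In practice the vectors come from the embeddings API and the comparison is done inside Azure AI Search, but the ranking principle is exactly this.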
Programme
Day 1 — Foundations, embeddings & semantic search
- Azure OpenAI Service architecture: deployments, quota management, global vs regional routing, and responsible AI controls
- Working with GPT-4o via the Chat Completions and Responses APIs: prompting strategies for analytics tasks
- Embedding models deep-dive: text-embedding-3-large, vector dimensions, cosine similarity, and when embeddings outperform keyword search
- Azure AI Search as a vector store: indexing ADLS Gen2 documents, chunking strategies, and hybrid search with BM25 + vectors
- Building a semantic search interface over an enterprise data catalog: enriching metadata with AI-generated summaries
- Hands-on: index a corpus of financial reports in Azure AI Search and build a Q&A endpoint powered by GPT-4o
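The chunking step in the indexing exercise above can be sketched as a fixed-size window with overlap, so that content spanning a boundary still appears whole in one chunk. The sizes here are illustrative; production chunkers typically split on sentence or token boundaries instead:

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks, each overlapping the
    previous one by `overlap` characters."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```

Each chunk is then embedded and pushed into the Azure AI Search index alongside its source metadata.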
Day 2 — RAG pipelines, NL-to-SQL & production patterns
- RAG architecture patterns: naive RAG vs advanced RAG with re-ranking, query expansion, and multi-step retrieval
- Natural language-to-SQL: designing system prompts that translate business questions into T-SQL or Synapse SQL queries with schema-aware context injection
- Grounding LLM responses in structured data: combining Azure SQL query results with GPT-generated narrative for automated BI commentary
- Fine-tuning GPT-4o mini: preparing JSONL training data, running fine-tuning jobs, and evaluating domain-specific model improvements
- Production readiness: content safety evaluation, metric-based output scoring, LLM call tracing with Azure Monitor, and cost governance
- Hands-on: build a complete NL-to-SQL assistant that accepts plain-English questions and returns answers grounded in your Azure Synapse data warehouse
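The schema-aware context injection used in the NL-to-SQL assistant above can be sketched like this. The table definitions and prompt wording are illustrative only, not the course's exact solution:

```python
def build_nl2sql_prompt(schema: dict[str, list[str]], dialect: str = "T-SQL") -> str:
    """Render a system prompt that constrains the model to a known schema."""
    ddl = "\n".join(
        f"TABLE {table} ({', '.join(columns)})" for table, columns in schema.items()
    )
    return (
        f"You translate business questions into {dialect} queries.\n"
        f"Use only these tables and columns:\n{ddl}\n"
        "Return a single SELECT statement and nothing else. Never modify data."
    )

# Illustrative warehouse schema:
schema = {
    "fact_sales": ["order_id", "customer_id", "order_date", "amount"],
    "dim_customer": ["customer_id", "region", "segment"],
}
system_prompt = build_nl2sql_prompt(schema)
# `system_prompt` becomes the system message; the user's plain-English
# question goes in the user message of the Chat Completions call.
```

Injecting only the tables relevant to the question (rather than the whole warehouse schema) keeps the prompt short and measurably improves query accuracy, which is one of the design trade-offs explored in the hands-on.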
Who is this for?
- Data engineers who want to add AI-driven data retrieval and transformation to existing pipelines
- Analytics engineers and BI developers building natural-language interfaces on top of their data warehouse
- Data scientists transitioning from ML model development to LLM-based application patterns
- Solution architects designing AI layers that sit on top of Fabric, Synapse, or Azure SQL
Prerequisites
- Python programming experience (pandas, requests, or similar)
- Basic Azure familiarity (portal, storage accounts, Entra ID)
- SQL knowledge — understanding of SELECT, JOIN, and aggregate queries